2011-10-19 06:09:19

by David Rientjes

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Wed, 19 Oct 2011, Norbert Preining wrote:

> Hi everyone
>
> (please Cc),
>
> I am currently running 3.1.0-rc10, and I am having a hard time with
> the wlan network here at the university.
>
> For quite some time, like 10min, it is fine, then suddently the
> iwlagn driver gives up on me and connection is dropped.
>
> In the log file I see:
> [ 172.137011] iwlagn 0000:06:00.0: Tx aggregation enabled on ra = 00:24:c4:ab:bd:ef tid = 0
> [ 821.841016] iwlagn 0000:06:00.0: Tx aggregation enabled on ra = 00:24:c4:ab:bd:ef tid = 6
> [ 1095.580735] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
> [ 1095.780076] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
> [ 1095.980101] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
> [ 1096.180117] wlan0: direct probe to 00:24:c4:ab:bd:e0 timed out
> [ 1105.255464] wlan0: deauthenticating from 00:24:c4:ab:bd:ef by local choice (reason=2)
> [ 1105.255519] iwlagn 0000:06:00.0: Stopping AGG while state not ON or starting
> [ 1105.265581] cfg80211: Calling CRDA for country: JP
> [ 1105.271476] wlan0: authenticate with 00:24:c4:ab:bd:e0 (try 1)
> [ 1105.468105] wlan0: authenticate with 00:24:c4:ab:bd:e0 (try 2)
> [ 1105.668110] wlan0: authenticate with 00:24:c4:ab:bd:e0 (try 3)
> [ 1105.868090] wlan0: authentication with 00:24:c4:ab:bd:e0 timed out
> [ 1113.667890] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
> [ 1113.864116] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
> [ 1114.064095] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
> [ 1114.264109] wlan0: direct probe to 00:24:c4:ab:bd:e0 timed out
>
> Somewhere around 1100 the connection is gone and never comes back again.
>
> I tried removing the driver module from the kernel and reinserting it,
> tried to turn on and off the hardware swithc (rfkill), all without
> no success, the wlan connection remains dead until I reboot.
>
> I am not sure exactely when it started, I guess somewhere in the
> 3.1 cycle, before I was permanently working wiht wlan, now I always
> plug in the cable.
>
> If there is any way to track down this, or any suggestions how I can
> debug it, please let me know.
>
> Hardware: Sony VGN-Z11, Intel(R) WiFi Link 5100 AGN, REV=0x54
> L1 Enabled; Disabling L0S
> device EEPROM VER=0x11e, CALIB=0x4
> Device SKU: 0Xf0
> Tunable channels: 13 802.11bg, 24 802.11a channels
> loaded firmware version 8.83.5.1 build 33692 (EXP)
>
>
> On the other hand, the same laptop with the very same configuration
> works very nicely in my flat's wlan, which is some dirt cheap Japanese
> only wlan router.
>

There have been recent issues in 3.1-rc9 reported with iwlagn, see the
thread at https://lkml.org/lkml/2011/10/15/107 even though you have
different hardware. Adding Wey-Yi Guy from Intel and the linux-wireless
mailing list to the cc.


2011-10-26 04:01:28

by Wey-Yi Guy

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

Norbert,

On Tue, 2011-10-25 at 19:50 -0700, Norbert Preining wrote:
> Hi David,
>
> On So, 23 Okt 2011, David Rientjes wrote:
> > It'd be interesting to see if this problem still exists in linux-next
> > since some wireless changes went into that recently. Please try the tree
> > at git://github.com/sfrothwell/linux-next.git
>
> I didn't try that by now, but I tried current linux git where
> a lot of things went into, I am running on top of 7670c7010c.
> With that I get:
> - two different kernel bugs, the first one actually appearing several
> times
> - after a short time no package goes out, but the addapter still believes
> to be associated.
>
> Here is the dmesg output of the two kernel bugs and the following
> hard reset.
>
> BTW, that happened at the *critical* access point at the university.
> At home I was before running hte same kernel without problem.
> BUt it might be a suspend/resume problem, too, as I suspended the
> computer and woke it up here.
>
>
so this is different problem, right? it looks different compare to what
you described before which is cause by queue stopped.

Thanks
Wey

> [ 3279.808204] ------------[ cut here ]------------
> [ 3279.808246] WARNING: at include/net/mac80211.h:3570 rate_control_send_low+0xa5/0x165 [mac80211]()
> [ 3279.808254] Hardware name: VGN-Z11VN_B
> [ 3279.808258] Modules linked in: rfcomm bnep vboxpci vboxnetadp vboxnetflt vboxdrv snd_hrtimer binfmt_misc dm_crypt dm_mod isofs btrfs zlib_deflate crc32c libcrc32c vfat fat fuse loop uinput snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm arc4 snd_seq_dummy snd_seq_oss iwlwifi snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq mac80211 btusb snd_timer bluetooth snd_seq_device firewire_ohci snd firewire_core sony_laptop crc16 soundcore mxm_wmi cfg80211 snd_page_alloc crc_itu_t rfkill joydev tpm_infineon
> [ 3279.808379] Pid: 4956, comm: kworker/u:11 Tainted: G W 3.1.0+ #40
> [ 3279.808385] Call Trace:
> [ 3279.808400] [<ffffffff810376c1>] warn_slowpath_common+0x83/0x9b
> [ 3279.808411] [<ffffffff810376f3>] warn_slowpath_null+0x1a/0x1c
> [ 3279.808438] [<ffffffffa0158149>] rate_control_send_low+0xa5/0x165 [mac80211]
> [ 3279.808459] [<ffffffffa01b01c2>] rs_get_rate+0x146/0x254 [iwlwifi]
> [ 3279.808486] [<ffffffffa01585b1>] rate_control_get_rate+0x86/0x14c [mac80211]
> [ 3279.808516] [<ffffffffa015f632>] ieee80211_tx_h_rate_ctrl+0x1cb/0x3e4 [mac80211]
> [ 3279.808546] [<ffffffffa015fac9>] invoke_tx_handlers+0x69/0xf5 [mac80211]
> [ 3279.808575] [<ffffffffa015fcd2>] ieee80211_tx+0x7f/0xaf [mac80211]
> [ 3279.808605] [<ffffffffa0160172>] ieee80211_xmit+0x89/0x97 [mac80211]
> [ 3279.808634] [<ffffffffa0160d10>] ieee80211_tx_skb+0x57/0x5f [mac80211]
> [ 3279.808659] [<ffffffffa0152098>] ieee80211_send_nullfunc+0x5f/0x64 [mac80211]
> [ 3279.808684] [<ffffffffa01520ff>] ieee80211_mgd_probe_ap_send+0x62/0xe6 [mac80211]
> [ 3279.808709] [<ffffffffa01522fb>] ? ieee80211_recalc_ps+0x10/0x250 [mac80211]
> [ 3279.808734] [<ffffffffa0152960>] ieee80211_mgd_probe_ap+0xc9/0xd9 [mac80211]
> [ 3279.808759] [<ffffffffa0152a6b>] ieee80211_beacon_connection_loss_work+0xe2/0xed [mac80211]
> [ 3279.808785] [<ffffffffa0152989>] ? ieee80211_sta_monitor_work+0x19/0x19 [mac80211]
> [ 3279.808797] [<ffffffff8104d9cc>] process_one_work+0x17b/0x2bd
> [ 3279.808807] [<ffffffff8104c011>] ? need_to_create_worker+0x12/0x26
> [ 3279.808818] [<ffffffff8104eac7>] worker_thread+0xdb/0x15f
> [ 3279.808829] [<ffffffff8104e9ec>] ? manage_workers.isra.24+0x171/0x171
> [ 3279.808839] [<ffffffff81051ef2>] kthread+0x84/0x8c
> [ 3279.808851] [<ffffffff81401394>] kernel_thread_helper+0x4/0x10
> [ 3279.808862] [<ffffffff81051e6e>] ? kthread_worker_fn+0x148/0x148
> [ 3279.808872] [<ffffffff81401390>] ? gs_change+0xb/0xb
> [ 3279.808879] ---[ end trace b7620f873182e7e3 ]---
> [ 3284.284925] wlan0: deauthenticating from 00:24:c4:ab:bd:ef by local choice (reason=2)
> [ 3284.344356] cfg80211: Calling CRDA to update world regulatory domain
> [ 3284.351319] wlan0: authenticate with 00:24:c4:ab:bd:ef (try 1)
> [ 3284.445310] wlan0: authenticated
> [ 3284.447486] wlan0: associate with 00:24:c4:ab:bd:ef (try 1)
> [ 3284.449486] wlan0: RX ReassocResp from 00:24:c4:ab:bd:ef (capab=0x1 status=0 aid=2)
> [ 3284.449489] wlan0: associated
> [ 3314.515741] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:24:c4:ab:bd:ef tid = 0
> [ 3314.515762] ------------[ cut here ]------------
> [ 3314.515800] WARNING: at drivers/net/wireless/iwlwifi/iwl-trans-pcie.c:1101 iwl_trans_pcie_tx+0x180/0x661 [iwlwifi]()
> [ 3314.515808] Hardware name: VGN-Z11VN_B
> [ 3314.515813] Modules linked in: rfcomm bnep vboxpci vboxnetadp vboxnetflt vboxdrv snd_hrtimer binfmt_misc dm_crypt dm_mod isofs btrfs zlib_deflate crc32c libcrc32c vfat fat fuse loop uinput snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm arc4 snd_seq_dummy snd_seq_oss iwlwifi snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq mac80211 btusb snd_timer bluetooth snd_seq_device firewire_ohci snd firewire_core sony_laptop crc16 soundcore mxm_wmi cfg80211 snd_page_alloc crc_itu_t rfkill joydev tpm_infineon
> [ 3314.515937] Pid: 4954, comm: kworker/u:9 Tainted: G W 3.1.0+ #40
> [ 3314.515943] Call Trace:
> [ 3314.515948] <IRQ> [<ffffffff810376c1>] warn_slowpath_common+0x83/0x9b
> [ 3314.515973] [<ffffffff810376f3>] warn_slowpath_null+0x1a/0x1c
> [ 3314.515996] [<ffffffffa01cdb0e>] iwl_trans_pcie_tx+0x180/0x661 [iwlwifi]
> [ 3314.516009] [<ffffffff810d14b3>] ? kmem_cache_alloc+0x44/0xb9
> [ 3314.516030] [<ffffffffa01b59f8>] iwlagn_tx_skb+0x862/0x901 [iwlwifi]
> [ 3314.516048] [<ffffffffa01abf8c>] iwlagn_mac_tx+0x131/0x1a2 [iwlwifi]
> [ 3314.516069] [<ffffffffa016abd6>] ? ieee80211_tx_h_fragment+0x16/0x22c [mac80211]
> [ 3314.516098] [<ffffffffa015ef24>] __ieee80211_tx+0x176/0x1cf [mac80211]
> [ 3314.516117] [<ffffffffa016ab48>] ? ieee80211_tx_h_calculate_duration+0x4c/0x65 [mac80211]
> [ 3314.516147] [<ffffffffa015fcea>] ieee80211_tx+0x97/0xaf [mac80211]
> [ 3314.516178] [<ffffffffa0160be6>] ieee80211_tx_pending+0xf0/0x1c3 [mac80211]
> [ 3314.516191] [<ffffffff8103cdd2>] tasklet_action+0x77/0xc2
> [ 3314.516201] [<ffffffff8103cf13>] __do_softirq+0xbc/0x1a5
> [ 3314.516212] [<ffffffff8103cb4f>] ? test_ti_thread_flag.constprop.16+0x9/0x16
> [ 3314.516225] [<ffffffff8140148c>] call_softirq+0x1c/0x30
> [ 3314.516230] <EOI> [<ffffffff8100371a>] do_softirq+0x38/0x6e
> [ 3314.516248] [<ffffffff8103cbd9>] _local_bh_enable_ip.isra.12+0x7d/0xa0
> [ 3314.516259] [<ffffffff8103cc0a>] local_bh_enable_ip+0xe/0x10
> [ 3314.516268] [<ffffffff813ff657>] _raw_spin_unlock_bh+0x23/0x25
> [ 3314.516291] [<ffffffffa014ec70>] ieee80211_agg_tx_operational+0x99/0xa4 [mac80211]
> [ 3314.516316] [<ffffffffa014f7bf>] ieee80211_process_addba_resp+0xb8/0xf1 [mac80211]
> [ 3314.516329] [<ffffffff81064040>] ? arch_read_unlock+0xe/0xe
> [ 3314.516355] [<ffffffffa0156a38>] ieee80211_iface_work+0x130/0x2b5 [mac80211]
> [ 3314.516381] [<ffffffffa0156908>] ? ieee80211_teardown_sdata+0xcc/0xcc [mac80211]
> [ 3314.516394] [<ffffffff8104d9cc>] process_one_work+0x17b/0x2bd
> [ 3314.516404] [<ffffffff8104c011>] ? need_to_create_worker+0x12/0x26
> [ 3314.516415] [<ffffffff8104eac7>] worker_thread+0xdb/0x15f
> [ 3314.516426] [<ffffffff8104e9ec>] ? manage_workers.isra.24+0x171/0x171
> [ 3314.516436] [<ffffffff81051ef2>] kthread+0x84/0x8c
> [ 3314.516447] [<ffffffff81401394>] kernel_thread_helper+0x4/0x10
> [ 3314.516458] [<ffffffff81051e6e>] ? kthread_worker_fn+0x148/0x148
> [ 3314.516467] [<ffffffff81401390>] ? gs_change+0xb/0xb
> [ 3314.516474] ---[ end trace b7620f873182e7e4 ]---
> [ 3324.128094] iwlwifi 0000:06:00.0: Queue 11 stuck for 10000 ms.
> [ 3324.128105] iwlwifi 0000:06:00.0: Current read_ptr 14 write_ptr 31
> [ 3324.128113] iwlwifi 0000:06:00.0: On demand firmware reload
> [ 3324.128548] ieee80211 phy0: Hardware restart was requested
> [ 3324.128657] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
> [ 3324.131680] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
>
>
> Best wishes
>
> Norbert
> ------------------------------------------------------------------------
> Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
> JAIST, Japan TeX Live & Debian Developer
> DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
> ------------------------------------------------------------------------
> PODE HOLE (n.)
> A hole drilled in chipboard lavatory walls by homosexuals for any one
> of a number of purposes.
> --- Douglas Adams, The Meaning of Liff



2011-10-21 07:26:22

by Johannes Berg

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Fri, 2011-10-21 at 10:23 +0900, Norbert Preining wrote:

> > > [17419.285141] iwlagn 0000:06:00.0: Encounter TX_STATUS_FAIL_PASSIVE_NO_RX, am I on 5.2G band? (0)
> > > [17419.286583] wlan0: deauthenticating from 00:24:c4:ab:bd:ef by local choice (reason=2)
> > > [17419.300296] cfg80211: Calling CRDA for country: JP
> > > [17419.309278] wlan0: authenticate with 00:24:c4:ab:bd:ef (try 1)
> > > [17419.310105] wlan0: authenticated
> > > [17419.317900] wlan0: associate with 00:24:c4:ab:bd:ef (try 1)
> >
> > It seems a bit like the 5 GHz AP (:e0) just fails, and wpa_supplicant
> > selects a 2.4 GHz AP (:ef) now? Pure conjecture, I guess the AP could
> > also be on the same channel -- did you have scan information for these
> > APs?
>
> BSS 00:24:c4:ab:bd:e0 (on wlan0)
> freq: 2412

> And the one I am on:
> BSS 00:24:c4:ab:bd:ef (on wlan0) -- associated
> freq: 5200

> No idea if that helps you.

Yes, thanks. So it's actually the other way around I guess.

I'm wondering if this too is related to the channel mess in mac80211.

johannes


2011-10-19 07:30:00

by Wey-Yi Guy

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Tue, 2011-10-18 at 23:48 -0700, Norbert Preining wrote:
> On Di, 18 Okt 2011, Guy, Wey-Yi wrote:
> > > >> loaded firmware version 8.83.5.1 build 33692 (EXP)
> > > >>
> >
> > looks like you are using experimental version of uCode, but by saying
> > that, I don't think i will make much differences
> >
> > so could you dump the tx queue to see if it is the similar problem what
> > Norbert point out
>
> Not sure what you mean Wey, should *I* Norbert use the default firmware
> and try to reproduce it? Ir should Pekka try to reproduce it with
> normal firmware? Or with both?
>
just got a idea, could you try to revert
commit#68b993118f715cc631b62b6a50574e4701fe9ace

not sure you can cleanly revert it, please let me know if you got
problem


Thanks
Wey
>
> Norbert
> ------------------------------------------------------------------------
> Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
> JAIST, Japan TeX Liot sure yu ve & Debian Developer
> DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
> ------------------------------------------------------------------------
> WROOT (n.)
> A short little berk who thinks that by pulling on his pipe and gazing
> shrewdly at you he will give the impression that he is infinitely wise
> and 5 ft 11 in.
> --- Douglas Adams, The Meaning of Liff



2011-10-20 02:55:34

by Wey-Yi Guy

[permalink] [raw]
Subject: RE: iwlagn is getting very shaky

Hmm, I am sure I have the attachment, but here I attach again.

Also, the patch is simple. Only few lines

>From 78e955b90a8e259d180bfb962ef4558645d74a91 Mon Sep 17 00:00:00 2001
From: Wey-Yi Guy <[email protected]>
Date: Wed, 19 Oct 2011 18:27:41 -0700
Subject: [PATCH 1/1] iwlagn: stop queue testing

Signed-off-by: Wey-Yi Guy <[email protected]>
---
drivers/net/wireless/iwlwifi/iwl-agn-lib.c | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/drivers/net/wireless/iwlwifi/iwl-agn-lib.c b/drivers/net/wireless/iwlwifi/iwl-agn-lib.c
index 3bee0f1..d543686 100644
--- a/drivers/net/wireless/iwlwifi/iwl-agn-lib.c
+++ b/drivers/net/wireless/iwlwifi/iwl-agn-lib.c
@@ -191,8 +191,14 @@ static void iwlagn_set_tx_status(struct iwl_priv *priv,
if (status == TX_STATUS_FAIL_PASSIVE_NO_RX &&
iwl_is_associated_ctx(ctx) && ctx->vif &&
ctx->vif->type == NL80211_IFTYPE_STATION) {
+#if 0
ctx->last_tx_rejected = true;
iwl_stop_queue(priv, &priv->txq[txq_id]);
+#endif
+IWL_ERR(priv,
+ "Encounter TX_STATUS_FAIL_PASSIVE_NO_RX, am I on 5.2G band? (%d)\n",
+ txq_id);
+
}

IWL_DEBUG_TX_REPLY(priv, "TXQ %d status %s (0x%08x) rate_n_flags "
--
1.7.0.4

Thanks
Wey
-----Original Message-----
From: Norbert Preining [mailto:[email protected]]
Sent: Wednesday, October 19, 2011 7:51 PM
To: Guy, Wey-Yi W
Cc: David Rientjes; [email protected]; [email protected]; [email protected]; [email protected]; Pekka Enberg
Subject: Re: iwlagn is getting very shaky

On Mi, 19 Okt 2011, Guy, Wey-Yi wrote:
> > Do you mean the patch you send with a strange date:
...

> > It does not apply at all, I cannot even find
> > rejected
> > in iwl-agn-tx.c.
> >
> > On which tree is that based, can you provide a patch against main
> > git branch of Linus.
> >
> > Thanks.
> >
> Let's try this and see if apply. btw, are you on 5GHz?

Which one??? Nothing attached, nothing received, and the patch you sent before does not apply .... not even with manual intervention ... at least I am not able.

5GHz: I don't know by now, not at university by now.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
`Ford, you're turning into a penguin. Stop it.'
--- Arthur experiences the improbability drive at work.
--- Douglas Adams, The Hitchhikers Guide to the Galaxy


Attachments:
0001-iwlagn-stop-queue-testing.patch (1.11 kB)
0001-iwlagn-stop-queue-testing.patch

2011-10-19 06:38:53

by Wey-Yi Guy

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

hi all,

On Tue, 2011-10-18 at 23:25 -0700, Norbert Preining wrote:
> Hi David, hi all
>
> On Di, 18 Okt 2011, David Rientjes wrote:
> > There have been recent issues in 3.1-rc9 reported with iwlagn, see the
> > thread at https://lkml.org/lkml/2011/10/15/107 even though you have
>
> Interesting. I read through the thread and activated the debugfs
> option.
>
> I could get my hardware back by
> echo 1 > /sys/kernel/debug/ieee80211/phy0/iwlagn/debug/force_reset
>
> [ 2761.352629] ieee80211 phy0: Hardware restart was requested
> [ 2761.352714] iwlagn 0000:06:00.0: L1 Enabled; Disabling L0S
> [ 2761.355763] iwlagn 0000:06:00.0: Radio type=0x1-0x2-0x0
> [ 2779.484308] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
> [ 2779.684128] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
> [ 2779.884087] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
> [ 2780.084079] wlan0: direct probe to 00:24:c4:ab:bd:e0 timed out
> [ 2788.051381] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
> [ 2788.248079] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
> [ 2788.448083] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
> [ 2788.648140] wlan0: direct probe to 00:24:c4:ab:bd:e0 timed out
> [ 2796.614710] wlan0: authenticate with 00:24:c4:ab:bd:ef (try 1)
> [ 2796.615623] wlan0: authenticated
> [ 2796.618046] wlan0: associate with 00:24:c4:ab:bd:ef (try 1)
> [ 2796.622748] wlan0: RX AssocResp from 00:24:c4:ab:bd:ef (capab=0x1 status=0 aid=1)
> [ 2796.622751] wlan0: associated
> [ 2871.224192] e1000e: eth0 NIC Link is Down
>
> I unplugged the cable and could ping the world still, nice....
>
> After a short time I got:
> [ 2895.575964] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
> [ 2895.772067] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
> [ 2895.972101] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
> [ 2896.172054] wlan0: direct probe to 00:24:c4:ab:bd:e0 timed out
> [ 2905.316968] wlan0: deauthenticating from 00:24:c4:ab:bd:ef by local choice (reason=2)
> [ 2905.356316] cfg80211: Calling CRDA to update world regulatory domain
> [ 2905.361965] wlan0: authenticate with 00:24:c4:ab:bd:e0 (try 1)
> [ 2905.560063] wlan0: authenticate with 00:24:c4:ab:bd:e0 (try 2)
> [ 2905.760091] wlan0: authenticate with 00:24:c4:ab:bd:e0 (try 3)
> [ 2905.960077] wlan0: authentication with 00:24:c4:ab:bd:e0 timed out
> [ 2913.908984] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
> [ 2914.108116] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
> [ 2914.308116] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
> [ 2914.508103] wlan0: direct probe to 00:24:c4:ab:bd:e0 timed out
> [ 2922.473062] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
> [ 2922.672109] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
> [ 2922.872106] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
> [ 2923.072103] wlan0: direct probe to 00:24:c4:ab:bd:e0 timed out
>
> And at this time the tx_queue showed me:
> -----------------------------------------------------------
> hwq 00: read=91 write=91 stop=1 swq_id=0x00 (ac 0/hwq 0)s.

> stop-count: 1

it is very interesting, for sure there is a bug here which cause NIC
stop working, if you look at the tx queue, hwq 0 is stop, which mean
nothing go out. I am not sure how we get into this? yes, most likely
force_reset will fix it by reload the firmware and reset all the queues

Could you help me how to repro this problem?

Thanks
Wey


> hwq 01: read=0 write=0 stop=0 swq_id=0x05 (ac 1/hwq 1)
> stop-count: 0
> hwq 02: read=127 write=127 stop=0 swq_id=0x0a (ac 2/hwq 2)
> stop-count: 0
> hwq 03: read=0 write=0 stop=0 swq_id=0x0f (ac 3/hwq 3)
> stop-count: 0
> hwq 04: read=13 write=13 stop=0 swq_id=0x00 (ac 0/hwq 0)
> hwq 05: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
> hwq 06: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
> hwq 07: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
> hwq 08: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
> hwq 09: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
> hwq 10: read=0 write=0 stop=0 swq_id=0x2a (ac 2/hwq 10)
> hwq 11: read=0 write=0 stop=0 swq_id=0x2c (ac 0/hwq 11)
> hwq 12: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
> hwq 13: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
> hwq 14: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
> hwq 15: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
> hwq 16: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
> hwq 17: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
> hwq 18: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
> hwq 19: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
> -------------------------------------------------
>
> Hope that helps. Anyone let me know if you need more testing.
>
> Once more, be reminded the the firmware of the iwlagn is from
> an experimental build that should solve the AGN stopped working
> problem.
>
> Best wishes
>
> Norbert
> ------------------------------------------------------------------------
> Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
> JAIST, Japan TeX Live & Debian Developer
> DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
> ------------------------------------------------------------------------
> SCRAMOGE (vb.)
> To cut oneself whilst licking envelopes.
> --- Douglas Adams, The Meaning of Liff



2011-10-21 01:24:13

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Do, 20 Okt 2011, Johannes Berg wrote:
> > [17409.616748] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
..
> > [17419.285141] iwlagn 0000:06:00.0: Encounter TX_STATUS_FAIL_PASSIVE_NO_RX, am I on 5.2G band? (0)
> > [17419.286583] wlan0: deauthenticating from 00:24:c4:ab:bd:ef by local choice (reason=2)
> > [17419.300296] cfg80211: Calling CRDA for country: JP
> > [17419.309278] wlan0: authenticate with 00:24:c4:ab:bd:ef (try 1)
> > [17419.310105] wlan0: authenticated
> > [17419.317900] wlan0: associate with 00:24:c4:ab:bd:ef (try 1)
>
> It seems a bit like the 5 GHz AP (:e0) just fails, and wpa_supplicant
> selects a 2.4 GHz AP (:ef) now? Pure conjecture, I guess the AP could
> also be on the same channel -- did you have scan information for these
> APs?

BSS 00:24:c4:ab:bd:e0 (on wlan0)
TSF: 8728607601135 usec (101d, 00:36:47)
freq: 2412
beacon interval: 102
capability: ESS ShortPreamble ShortSlotTime (0x0421)
signal: -62.00 dBm
last seen: 6012 ms ago
SSID: XXXXXX
Supported rates: 1.0* 2.0* 5.5* 6.0 9.0 11.0* 12.0 18.0
DS Parameter set: channel 1
TIM: DTIM Count 0 DTIM Period 1 Bitmap Control 0x0 Bitmap[0] 0x0
Country: JP Environment: Indoor/Outdoor
Channels [1 - 13] @ 20 dBm
ERP: Use_Protection
HT capabilities:
Capabilities: 0x182c
HT20
SM Power Save disabled
RX HT20 SGI
No RX STBC
Max AMSDU length: 7935 bytes
DSSS/CCK HT40
Maximum RX AMPDU length 65535 bytes (exponent: 0x003)
Minimum RX AMPDU time spacing: 8 usec (0x06)
HT RX MCS rate indexes supported: 0-15
HT TX MCS rate indexes are undefined
Extended supported rates: 24.0 36.0 48.0 54.0
HT operation:
* primary channel: 1
* secondary channel offset: no secondary
* STA channel width: 20 MHz
* RIFS: 0
* HT protection: non-HT mixed
* non-GF present: 1
* OBSS non-GF present: 0
* dual beacon: 0
* dual CTS protection: 0
* STBC beacon: 0
* L-SIG TXOP Prot: 0
* PCO active: 0
* PCO phase: 0
WMM: * Parameter version 1
* u-APSD
* BE: CW 15-1023, AIFSN 3
* BK: CW 15-1023, AIFSN 7
* VI: CW 7-15, AIFSN 2, TXOP 3008 usec
* VO: acm CW 3-7, AIFSN 2, TXOP 1504 usec


And the one I am on:
BSS 00:24:c4:ab:bd:ef (on wlan0) -- associated
TSF: 8728558719143 usec (101d, 00:35:58)
freq: 5200
beacon interval: 102
capability: ESS (0x0001)
signal: -69.00 dBm
last seen: 96 ms ago
Information elements from Probe Response frame:
SSID: XXXXXX
Supported rates: 6.0* 9.0 12.0* 18.0 24.0* 36.0 48.0 54.0
Country: JP Environment: Indoor/Outdoor
Channels [36 - 64] @ 22 dBm
HT capabilities:
Capabilities: 0x186e
HT20/HT40
SM Power Save disabled
RX HT20 SGI
RX HT40 SGI
No RX STBC
Max AMSDU length: 7935 bytes
DSSS/CCK HT40
Maximum RX AMPDU length 65535 bytes (exponent: 0x003)
Minimum RX AMPDU time spacing: 8 usec (0x06)
HT RX MCS rate indexes supported: 0-15
HT TX MCS rate indexes are undefined
HT operation:
* primary channel: 40
* secondary channel offset: below
* STA channel width: any
* RIFS: 1
* HT protection: no
* non-GF present: 1
* OBSS non-GF present: 0
* dual beacon: 0
* dual CTS protection: 0
* STBC beacon: 0
* L-SIG TXOP Prot: 0
* PCO active: 0
* PCO phase: 0
WMM: * Parameter version 1
* u-APSD
* BE: CW 15-1023, AIFSN 3
* BK: CW 15-1023, AIFSN 7
* VI: CW 7-15, AIFSN 2, TXOP 3008 usec
* VO: acm CW 3-7, AIFSN 2, TXOP 1504 usec


No idea if that helps you.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
SCREGGAN (n. banking)
The crossed-out bit caused by people putting the wrong year on their
cheques all through January.
--- Douglas Adams, The Meaning of Liff

2011-10-21 01:24:45

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Do, 20 Okt 2011, wwguy wrote:
> Great, this is just a test patch, we will figure out what is the correct
> way to handle this and fix the issue for good :-)

Thanks. If you need more testing just let me know.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
HODDLESDEN (n.)
An 'injured' footballer's limb back into the game which draws applause
but doesn't fool anybody.
--- Douglas Adams, The Meaning of Liff

2011-10-19 07:06:07

by Wey-Yi Guy

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Tue, 2011-10-18 at 23:48 -0700, Norbert Preining wrote:
> On Di, 18 Okt 2011, Guy, Wey-Yi wrote:
> > > >> loaded firmware version 8.83.5.1 build 33692 (EXP)
> > > >>
> >
> > looks like you are using experimental version of uCode, but by saying
> > that, I don't think i will make much differences
> >
> > so could you dump the tx queue to see if it is the similar problem what
> > Norbert point out
>
> Not sure what you mean Wey, should *I* Norbert use the default firmware
> and try to reproduce it? Ir should Pekka try to reproduce it with
> normal firmware? Or with both?
>
Sorry, I was thinking ask Pekka to use the officialfirmware. But more
important, I will like to find out is Pekka has the same issue as you
have which is tx queue stopped.

Thanks
Wey

> Best wishes
>
> Norbert
> ------------------------------------------------------------------------
> Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
> JAIST, Japan TeX Live & Debian Developer
> DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
> ------------------------------------------------------------------------
> WROOT (n.)
> A short little berk who thinks that by pulling on his pipe and gazing
> shrewdly at you he will give the impression that he is infinitely wise
> and 5 ft 11 in.
> --- Douglas Adams, The Meaning of Liff



2011-10-20 02:22:55

by Wey-Yi Guy

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Wed, 2011-10-19 at 19:02 -0700, Norbert Preining wrote:
> On Mi, 19 Okt 2011, wwguy wrote:
> > instead of having you to revert the patch I ask you to do, could you
> > please just apply the patch I attach here and see if it make any
> > differences? it is also for debugging purpose, so please check the dmesg
> > log.
>
> Do you mean the patch you send with a strange date:
> Signed-off-by: Wey-Yi Guy <[email protected]>
> ---
> drivers/net/wireless/iwlwifi/iwl-agn-tx.c | 5 +++++
> 1 files changed, 5 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/net/wireless/iwlwifi/iwl-agn-tx.c
> b/drivers/net/wireless/iwlwifi/iwl-agn-tx.c
> index 863c43f..5a87071 100644
> --- a/drivers/net/wireless/iwlwifi/iwl-agn-tx.c
> +++ b/drivers/net/wireless/iwlwifi/iwl-agn-tx.c
> @@ -812,8 +812,13 @@ int iwlagn_rx_reply_tx(struct iwl_priv *priv, struct
> iwl_rx_mem_buffer *rxb,
> if (status == TX_STATUS_FAIL_PASSIVE_NO_RX &&
> iwl_is_associated_ctx(ctx) && ctx->vif &&
> ctx->vif->type == NL80211_IFTYPE_STATION) {
> +#if 0
> ctx->last_tx_rejected = true;
> iwl_trans_stop_queue(trans(priv), txq_id);
> +#endif
> +IWL_ERR(priv,
> + "Encounter TX_STATUS_FAIL_PASSIVE_NO_RX, am I on 5.2G band? (%d)\n",
> + txq_id);
>
> IWL_DEBUG_TX_REPLY(priv,
> "TXQ %d status %s (0x%08x) "
> --
> 1.7.0.4
>
> It does not apply at all, I cannot even find
> rejected
> in iwl-agn-tx.c.
>
> On which tree is that based, can you provide a patch against main
> git branch of Linus.
>
> Thanks.
>
Let's try this and see if apply. btw, are you on 5GHz?

Thanks
Wey


Attachments:
0001-iwlagn-stop-queue-testing.patch (1.08 kB)

2011-10-20 02:02:35

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Mi, 19 Okt 2011, wwguy wrote:
> instead of having you to revert the patch I ask you to do, could you
> please just apply the patch I attach here and see if it make any
> differences? it is also for debugging purpose, so please check the dmesg
> log.

Do you mean the patch you send with a strange date:
Signed-off-by: Wey-Yi Guy <[email protected]>
---
drivers/net/wireless/iwlwifi/iwl-agn-tx.c | 5 +++++
1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/net/wireless/iwlwifi/iwl-agn-tx.c
b/drivers/net/wireless/iwlwifi/iwl-agn-tx.c
index 863c43f..5a87071 100644
--- a/drivers/net/wireless/iwlwifi/iwl-agn-tx.c
+++ b/drivers/net/wireless/iwlwifi/iwl-agn-tx.c
@@ -812,8 +812,13 @@ int iwlagn_rx_reply_tx(struct iwl_priv *priv, struct
iwl_rx_mem_buffer *rxb,
if (status == TX_STATUS_FAIL_PASSIVE_NO_RX &&
iwl_is_associated_ctx(ctx) && ctx->vif &&
ctx->vif->type == NL80211_IFTYPE_STATION) {
+#if 0
ctx->last_tx_rejected = true;
iwl_trans_stop_queue(trans(priv), txq_id);
+#endif
+IWL_ERR(priv,
+ "Encounter TX_STATUS_FAIL_PASSIVE_NO_RX, am I on 5.2G band? (%d)\n",
+ txq_id);

IWL_DEBUG_TX_REPLY(priv,
"TXQ %d status %s (0x%08x) "
--
1.7.0.4

It does not apply at all, I cannot even find
rejected
in iwl-agn-tx.c.

On which tree is that based, can you provide a patch against main
git branch of Linus.

Thanks.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
BROMPTON
A bromton is that which is said to have been committed when you are
convinced you are about to blow off with a resounding trumpeting noise
in a public place and all that actually slips out is a tiny 'pfpt'.
--- Douglas Adams, The Meaning of Liff

2011-10-19 06:48:34

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Di, 18 Okt 2011, Guy, Wey-Yi wrote:
> > >> loaded firmware version 8.83.5.1 build 33692 (EXP)
> > >>
>
> looks like you are using experimental version of uCode, but by saying
> that, I don't think i will make much differences
>
> so could you dump the tx queue to see if it is the similar problem what
> Norbert point out

Not sure what you mean Wey, should *I* Norbert use the default firmware
and try to reproduce it? Ir should Pekka try to reproduce it with
normal firmware? Or with both?

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
WROOT (n.)
A short little berk who thinks that by pulling on his pipe and gazing
shrewdly at you he will give the impression that he is infinitely wise
and 5 ft 11 in.
--- Douglas Adams, The Meaning of Liff

2011-10-19 07:19:05

by Pekka Enberg

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

Hi Wey,

On Wed, Oct 19, 2011 at 9:15 AM, Guy, Wey-Yi <[email protected]> wrote:
> Sorry, I was thinking ask Pekka to use the officialfirmware. But more
> important, I will like to find out is Pekka has the same issue as you
> have which is tx queue stopped.

I'm happy to help but I have no idea what you want me to do :-). When
my internet connection dies the next time, what do you want me to
capture for you? What's official firmware? I'm using Fedora 15 with
3.1-rc9 built from sources here. Do you want to see my lspci, .config,
dmesg, something else?

As for the problems, I'm using iPhone 3G internet via WiFi sharing and
connection seems to die few times a day or if I take the phone far
enough from the laptop. I usually turn off the AP for a while and use
NetworkManager on the laptop to disable/enable networking to make it
come back up.

I haven't tested anything but 3.1-rcX kernels on this machine but I
can do git bisect too if it comes to that.

Pekka

2011-10-20 09:12:45

by Pekka Enberg

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Wed, Oct 19, 2011 at 9:44 AM, Guy, Wey-Yi <[email protected]> wrote:
>> As for the problems, I'm using iPhone 3G internet via WiFi sharing and
>> connection seems to die few times a day or if I take the phone far
>> enough from the laptop. I usually turn off the AP for a while and use
>> NetworkManager on the laptop to disable/enable networking to make it
>> come back up.
>>
>> I haven't tested anything but 3.1-rcX kernels on this machine but I
>> can do git bisect too if it comes to that.
>>
> I am just try to understand if you encounter the same problem as
> Norbert. if you can dump the tx_queue
> in /sys/kernel/debug/ieee80211/phX/iwlagn/debug folder and send me the
> log, thsat will be great.

Here's a dump of a very typical issue I'm seeing. I left the office
with my phone and once I came back, my laptop refused to reconnect to
the AP until I restarted wireless networking from NetworkManager. I've
sometimes seen the same symptoms even with laptop and the phone being
physically close to each other.

/sys/kernel/debug/ieee80211/phy0/iwlagn/debug/wd_timeout
cat: /sys/kernel/debug/ieee80211/phy0/iwlagn/debug/wd_timeout: Invalid argument
/sys/kernel/debug/ieee80211/phy0/iwlagn/debug/rxon_filter_flags
0x0004
/sys/kernel/debug/ieee80211/phy0/iwlagn/debug/rxon_flags
0x8005
/sys/kernel/debug/ieee80211/phy0/iwlagn/debug/reply_tx_error
Statistics_TX_Error:
DELAY: 0
FEW_BYTES: 0
BT_PRIO: 0
QUIET_PERIOD: 0
CALC_TTAK: 0
INTERNAL_CROSSED_RETRY: 0
SHORT_LIMIT: 0
LONG_LIMIT: 24
FIFO_UNDERRUN: 0
DRAIN_FLOW: 0
RFKILL_FLUSH: 0
LIFE_EXPIRE: 0
DEST_PS: 0
HOST_ABORTED: 0
BT_RETRY: 0
STA_INVALID: 0
FRAG_DROPPED: 0
TID_DISABLE: 0
FIFO_FLUSHED: 0
INSUFFICIENT_CF_POLL: 0
PASSIVE_NO_RX: 1
NO_BEACON_ON_RADAR: 0
UNKNOWN: 0

Statistics_Agg_TX_Error:
UNDERRUN_MSK: 0
BT_PRIO_MSK: 0
FEW_BYTES_MSK: 0
ABORT_MSK: 0
LAST_SENT_TTL_MSK: 0
LAST_SENT_TRY_CNT_MSK: 0
LAST_SENT_BT_KILL_MSK: 0
SCD_QUERY_MSK: 0
TEST_BAD_CRC32_MSK: 0
RESPONSE_MSK: 0
DUMP_TX_MSK: 0
DELAY_TX_MSK: 0
UNKNOWN: 0
/sys/kernel/debug/ieee80211/phy0/iwlagn/debug/ucode_bt_stats
Statistics Flag(0x30002):
Operational Frequency: 2.4 GHz
TGj Narrow Band: disabled
Statistics_BT:
current accumulative
hi_priority_tx_req_cnt: 0 0
hi_priority_tx_denied_cnt: 0 0
lo_priority_tx_req_cnt: 0 0
lo_priority_tx_denied_cnt: 0 0
hi_priority_rx_req_cnt: 0 0
hi_priority_rx_denied_cnt: 0 0
lo_priority_rx_req_cnt: 0 0
lo_priority_rx_denied_cnt: 0 0
(rx)num_bt_kills: 0 0
/sys/kernel/debug/ieee80211/phy0/iwlagn/debug/ucode_tracing
ucode trace timer is Off
non_wraps_count: 0
wraps_once_count: 0
wraps_more_count: 0
/sys/kernel/debug/ieee80211/phy0/iwlagn/debug/chain_noise
active_chains: 3
chain_noise_a: 330
chain_noise_b: 335
chain_noise_c: 0
chain_signal_a: 575
chain_signal_b: 674
chain_signal_c: 0
beacon_count: 16
disconn_array: 0 0 1
delta_gain_code: 4 0 0
radio_write: 1
state: 3
/sys/kernel/debug/ieee80211/phy0/iwlagn/debug/sensitivity
auto_corr_ofdm: 80
auto_corr_ofdm_mrc: 128
auto_corr_ofdm_x1: 105
auto_corr_ofdm_mrc_x1: 192
auto_corr_cck: 125
auto_corr_cck_mrc: 160
last_bad_plcp_cnt_ofdm: 0
last_fa_cnt_ofdm: 0
last_bad_plcp_cnt_cck: 0
last_fa_cnt_cck: 0
nrg_curr_state: 0
nrg_prev_state: 0
nrg_value: 0 0 0 0 0 0 0 0 0 0
nrg_silence_rssi: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
nrg_silence_ref: 0
nrg_energy_idx: 0
nrg_silence_idx: 0
nrg_th_cck: 97
nrg_auto_corr_silence_diff: 0
num_in_cck_no_fa: 0
nrg_th_ofdm: 100
/sys/kernel/debug/ieee80211/phy0/iwlagn/debug/protection_mode
use rts/cts for aggregation
/sys/kernel/debug/ieee80211/phy0/iwlagn/debug/txfifo_flush
cat: /sys/kernel/debug/ieee80211/phy0/iwlagn/debug/txfifo_flush:
Invalid argument
/sys/kernel/debug/ieee80211/phy0/iwlagn/debug/ucode_general_stats
Statistics Flag(0x30002):
Operational Frequency: 2.4 GHz
TGj Narrow Band: disabled
Statistics_General: current cumulative delta max
temperature: 32
temperature_m: 3437
ttl_timestamp: 1351982035
burst_check: 0 888 32 33
burst_count: 0 0 0 0
wait_for_silence_timeout_count: 0 101 1 1
sleep_time: 0 0 0 0
slots_out: 0 0 0 0
slots_idle: 0 0 0 0
tx_on_a: 0 0 0 0
tx_on_b: 0 0 0 0
exec_time: 0 0 0 0
probe_time: 0 233 1 1
rx_enable_counter: 0 0 0 0
/sys/kernel/debug/ieee80211/phy0/iwlagn/debug/ucode_tx_stats
Statistics Flag(0x30002):
Operational Frequency: 2.4 GHz
TGj Narrow Band: disabled
Statistics_Tx: current cumulative delta max
preamble: 0 69043 5 156
rx_detected_cnt: 0 1175 1 34
bt_prio_defer_cnt: 0 0 0 0
bt_prio_kill_cnt: 0 0 0 0
few_bytes_cnt: 0 0 0 0
cts_timeout: 0 0 0 0
ack_timeout: 0 2410 2 45
expected_ack_cnt: 0 37195 5 81
actual_ack_cnt: 0 34785 3 73
dump_msdu_cnt: 0 64 2 3
abort_nxt_frame_mismatch: 0 0 0 0
abort_missing_nxt_frame: 0 0 0 0
cts_timeout_collision: 0 0 0 0
ack_ba_timeout_collision: 0 146 1 4
agg ba_timeout: 0 0 0 0
agg ba_resched_frames: 0 0 0 0
agg scd_query_agg_frame: 0 0 0 0
agg scd_query_no_agg: 0 0 0 0
agg scd_query_agg: 0 0 0 0
agg scd_query_mismatch: 0 0 0 0
agg frame_not_ready: 0 0 0 0
agg underrun: 0 0 0 0
agg bt_prio_kill: 0 0 0 0
agg rx_ba_rsp_cnt: 0 0 0 0
tx power: (1/2 dB step)
antenna A: 0x1C
antenna B: 0x1C
/sys/kernel/debug/ieee80211/phy0/iwlagn/debug/ucode_rx_stats
Statistics Flag(0x30002):
Operational Frequency: 2.4 GHz
TGj Narrow Band: disabled
Statistics_Rx - OFDM: current cumulative delta max
ina_cnt: 4 1739356 3 2107
fina_cnt: 2 1407532 2 773
plcp_err: 2 1241567 2 773
crc32_err: 0 7362 1 85
overrun_err: 0 0 0 0
early_overrun_err: 0 0 0 0
crc32_good: 0 172221 4 283
false_alarm_cnt: 2 354918 1 2139
fina_sync_err_cnt: 0 1490 1 30
sfd_timeout: 0 0 0 0
fina_timeout: 0 0 0 0
unresponded_rts: 0 0 0 0
rxe_frame_lmt_ovrun: 0 121 1 30
sent_ack_cnt: 0 30667 3 72
sent_cts_cnt: 0 0 0 0
sent_ba_rsp_cnt: 0 0 0 0
dsp_self_kill: 0 0 0 0
mh_format_err: 0 0 0 0
re_acq_main_rssi_sum: 0 0 0 0
Statistics_Rx - CCK: current cumulative delta max
ina_cnt: 1 1714491 1 1408
fina_cnt: 1 1713486 1 1407
plcp_err: 1 603100 1 1270
crc32_err: 0 3851 1 27
overrun_err: 0 0 0 0
early_overrun_err: 0 0 0 0
crc32_good: 0 650281 8 112
false_alarm_cnt: 0 0 0 0
fina_sync_err_cnt: 0 1060 1 3
sfd_timeout: 0 0 0 0
fina_timeout: 0 0 0 0
unresponded_rts: 0 0 0 0
rxe_frame_lmt_ovrun: 0 2 1 1
sent_ack_cnt: 0 52 1 5
sent_cts_cnt: 0 0 0 0
sent_ba_rsp_cnt: 0 0 0 0
dsp_self_kill: 0 0 0 0
mh_format_err: 0 0 0 0
re_acq_main_rssi_sum: 0 0 0 0
Statistics_Rx - GENERAL: current cumulative delta max
bogus_cts: 0 2 1 1
bogus_ack: 0 9 1 2
non_bssid_frames: 0 0 0 0
filtered_frames: 0 549185 1 83
non_channel_beacons: 0 0 0 0
channel_beacons: 0 87279 1 1
num_missed_bcon: 0 110 6 10
adc_rx_saturation_time: 0 0 0 0
ina_detect_search_tm: 0 77790045 65535 65535
beacon_silence_rssi_a: 0 32156769 3082 11822
beacon_silence_rssi_b: 0 63539976 6419 12593
beacon_silence_rssi_c: 0 0 0 0
interference_data_flag: 0 1236 1 1
channel_load: 0 988646230 2218 4167874
dsp_false_alarms: 0 163030 18 255
beacon_rssi_a: 0 127134 1 45
beacon_rssi_b: 0 91295 3 45
beacon_rssi_c: 0 0 0 0
beacon_energy_a: 0 215369 5 88
beacon_energy_b: 0 100030 2 91
beacon_energy_c: 0 113982 5 93
Statistics_Rx - OFDM_HT: current cumulative delta max
plcp_err: 0 0 0 0
overrun_err: 0 0 0 0
early_overrun_err: 0 0 0 0
crc32_good: 0 0 0 0
crc32_err: 0 0 0 0
mh_format_err: 0 0 0 0
agg_crc32_good: 0 115 2 30
agg_mpdu_cnt: 0 198 15 44
agg_cnt: 0 69 1 17
unsupport_mcs: 0 356 1 31
/sys/kernel/debug/ieee80211/phy0/iwlagn/debug/force_reset
Force reset method 0
number of reset request: 102
number of reset request success: 79
number of reset request reject: 23
reset duration: 3000
Force reset method 1
number of reset request: 0
number of reset request success: 0
number of reset request reject: 0
reset duration: 5000
/sys/kernel/debug/ieee80211/phy0/iwlagn/debug/plcp_delta
50
/sys/kernel/debug/ieee80211/phy0/iwlagn/debug/missed_beacon
5
/sys/kernel/debug/ieee80211/phy0/iwlagn/debug/fh_reg
FH register values:
FH_RSCSR_CHNL0_STTS_WPTR_REG: 0X0fff1e00
FH_RSCSR_CHNL0_RBDCB_BASE_REG: 0X00fff1f0
FH_RSCSR_CHNL0_WPTR: 0X000000d8
FH_MEM_RCSR_CHNL0_CONFIG_REG: 0X80819104
FH_MEM_RSSR_SHARED_CTRL_REG: 0X000000fc
FH_MEM_RSSR_RX_STATUS_REG: 0X07030000
FH_MEM_RSSR_RX_ENABLE_ERR_IRQ2DRV: 0X00000000
FH_TSSR_TX_STATUS_REG: 0X07ff0001
FH_TSSR_TX_ERROR_REG: 0X00000000
/sys/kernel/debug/ieee80211/phy0/iwlagn/debug/csr
cat: /sys/kernel/debug/ieee80211/phy0/iwlagn/debug/csr: Invalid argument
/sys/kernel/debug/ieee80211/phy0/iwlagn/debug/clear_traffic_statistics
cat: /sys/kernel/debug/ieee80211/phy0/iwlagn/debug/clear_traffic_statistics:
Invalid argument
/sys/kernel/debug/ieee80211/phy0/iwlagn/debug/clear_ucode_statistics
cat: /sys/kernel/debug/ieee80211/phy0/iwlagn/debug/clear_ucode_statistics:
Invalid argument
/sys/kernel/debug/ieee80211/phy0/iwlagn/debug/power_save_status
Power Save Status: PHY
/sys/kernel/debug/ieee80211/phy0/iwlagn/debug/tx_queue
hwq 00: read=37 write=37 stop=0 swq_id=0x00 (ac 0/hwq 0)
stop-count: 0
hwq 01: read=0 write=0 stop=0 swq_id=0x05 (ac 1/hwq 1)
stop-count: 0
hwq 02: read=194 write=194 stop=0 swq_id=0x0a (ac 2/hwq 2)
stop-count: 0
hwq 03: read=0 write=0 stop=0 swq_id=0x0f (ac 3/hwq 3)
stop-count: 0
hwq 04: read=0 write=0 stop=0 swq_id=0x13 (ac 3/hwq 4)
hwq 05: read=0 write=0 stop=0 swq_id=0x16 (ac 2/hwq 5)
hwq 06: read=0 write=0 stop=0 swq_id=0x19 (ac 1/hwq 6)
hwq 07: read=0 write=0 stop=0 swq_id=0x1c (ac 0/hwq 7)
hwq 08: read=0 write=0 stop=0 swq_id=0x22 (ac 2/hwq 8)
hwq 09: read=75 write=75 stop=0 swq_id=0x00 (ac 0/hwq 0)
hwq 10: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
hwq 11: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
hwq 12: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
hwq 13: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
hwq 14: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
hwq 15: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
hwq 16: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
hwq 17: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
hwq 18: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
hwq 19: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
/sys/kernel/debug/ieee80211/phy0/iwlagn/debug/rx_queue
read: 223
write: 221
free_count: 32
closed_rb_num: 223
/sys/kernel/debug/ieee80211/phy0/iwlagn/debug/traffic_log
Tx Queue
q[0]: read_ptr: 37, write_ptr: 37
q[1]: read_ptr: 0, write_ptr: 0
q[2]: read_ptr: 194, write_ptr: 194
q[3]: read_ptr: 0, write_ptr: 0
q[4]: read_ptr: 0, write_ptr: 0
q[5]: read_ptr: 0, write_ptr: 0
q[6]: read_ptr: 0, write_ptr: 0
q[7]: read_ptr: 0, write_ptr: 0
q[8]: read_ptr: 0, write_ptr: 0
q[9]: read_ptr: 75, write_ptr: 75
q[10]: read_ptr: 0, write_ptr: 0
q[11]: read_ptr: 0, write_ptr: 0
q[12]: read_ptr: 0, write_ptr: 0
q[13]: read_ptr: 0, write_ptr: 0
q[14]: read_ptr: 0, write_ptr: 0
q[15]: read_ptr: 0, write_ptr: 0
q[16]: read_ptr: 0, write_ptr: 0
q[17]: read_ptr: 0, write_ptr: 0
q[18]: read_ptr: 0, write_ptr: 0
q[19]: read_ptr: 0, write_ptr: 0
Rx Queue
read: 223, write: 221
/sys/kernel/debug/ieee80211/phy0/iwlagn/debug/tx_statistics
Management:
MANAGEMENT_ASSOC_REQ : 1
MANAGEMENT_ASSOC_RESP : 0
MANAGEMENT_REASSOC_REQ : 3
MANAGEMENT_REASSOC_RESP : 0
MANAGEMENT_PROBE_REQ : 0
MANAGEMENT_PROBE_RESP : 0
MANAGEMENT_BEACON : 0
MANAGEMENT_ATIM : 0
MANAGEMENT_DISASSOC : 0
MANAGEMENT_AUTH : 19
MANAGEMENT_DEAUTH : 4
MANAGEMENT_ACTION : 0
Control
CONTROL_BACK_REQ : 0
CONTROL_BACK : 0
CONTROL_PSPOLL : 0
CONTROL_RTS : 0
CONTROL_CTS : 0
CONTROL_ACK : 0
CONTROL_CFEND : 0
CONTROL_CFENDACK : 0
Data:
cnt: 32204
bytes: 5475621
/sys/kernel/debug/ieee80211/phy0/iwlagn/debug/rx_statistics
Management:
MANAGEMENT_ASSOC_REQ : 0
MANAGEMENT_ASSOC_RESP : 1
MANAGEMENT_REASSOC_REQ : 0
MANAGEMENT_REASSOC_RESP : 3
MANAGEMENT_PROBE_REQ : 0
MANAGEMENT_PROBE_RESP : 149
MANAGEMENT_BEACON : 106894
MANAGEMENT_ATIM : 0
MANAGEMENT_DISASSOC : 0
MANAGEMENT_AUTH : 4
MANAGEMENT_DEAUTH : 0
MANAGEMENT_ACTION : 0
Control:
CONTROL_BACK_REQ : 0
CONTROL_BACK : 0
CONTROL_PSPOLL : 0
CONTROL_RTS : 0
CONTROL_CTS : 0
CONTROL_ACK : 0
CONTROL_CFEND : 0
CONTROL_CFENDACK : 0
Data:
cnt: 30900
bytes: 18938054

2011-10-19 06:54:07

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

Hi David, hi all

On Di, 18 Okt 2011, David Rientjes wrote:
> There have been recent issues in 3.1-rc9 reported with iwlagn, see the
> thread at https://lkml.org/lkml/2011/10/15/107 even though you have

Interesting. I read through the thread and activated the debugfs
option.

I could get my hardware back by
echo 1 > /sys/kernel/debug/ieee80211/phy0/iwlagn/debug/force_reset

[ 2761.352629] ieee80211 phy0: Hardware restart was requested
[ 2761.352714] iwlagn 0000:06:00.0: L1 Enabled; Disabling L0S
[ 2761.355763] iwlagn 0000:06:00.0: Radio type=0x1-0x2-0x0
[ 2779.484308] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
[ 2779.684128] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
[ 2779.884087] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
[ 2780.084079] wlan0: direct probe to 00:24:c4:ab:bd:e0 timed out
[ 2788.051381] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
[ 2788.248079] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
[ 2788.448083] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
[ 2788.648140] wlan0: direct probe to 00:24:c4:ab:bd:e0 timed out
[ 2796.614710] wlan0: authenticate with 00:24:c4:ab:bd:ef (try 1)
[ 2796.615623] wlan0: authenticated
[ 2796.618046] wlan0: associate with 00:24:c4:ab:bd:ef (try 1)
[ 2796.622748] wlan0: RX AssocResp from 00:24:c4:ab:bd:ef (capab=0x1 status=0 aid=1)
[ 2796.622751] wlan0: associated
[ 2871.224192] e1000e: eth0 NIC Link is Down

I unplugged the cable and could ping the world still, nice....

After a short time I got:
[ 2895.575964] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
[ 2895.772067] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
[ 2895.972101] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
[ 2896.172054] wlan0: direct probe to 00:24:c4:ab:bd:e0 timed out
[ 2905.316968] wlan0: deauthenticating from 00:24:c4:ab:bd:ef by local choice (reason=2)
[ 2905.356316] cfg80211: Calling CRDA to update world regulatory domain
[ 2905.361965] wlan0: authenticate with 00:24:c4:ab:bd:e0 (try 1)
[ 2905.560063] wlan0: authenticate with 00:24:c4:ab:bd:e0 (try 2)
[ 2905.760091] wlan0: authenticate with 00:24:c4:ab:bd:e0 (try 3)
[ 2905.960077] wlan0: authentication with 00:24:c4:ab:bd:e0 timed out
[ 2913.908984] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
[ 2914.108116] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
[ 2914.308116] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
[ 2914.508103] wlan0: direct probe to 00:24:c4:ab:bd:e0 timed out
[ 2922.473062] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
[ 2922.672109] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
[ 2922.872106] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
[ 2923.072103] wlan0: direct probe to 00:24:c4:ab:bd:e0 timed out

And at this time the tx_queue showed me:
------------------------------------------------------------
hwq 00: read=91 write=91 stop=1 swq_id=0x00 (ac 0/hwq 0)
stop-count: 1
hwq 01: read=0 write=0 stop=0 swq_id=0x05 (ac 1/hwq 1)
stop-count: 0
hwq 02: read=127 write=127 stop=0 swq_id=0x0a (ac 2/hwq 2)
stop-count: 0
hwq 03: read=0 write=0 stop=0 swq_id=0x0f (ac 3/hwq 3)
stop-count: 0
hwq 04: read=13 write=13 stop=0 swq_id=0x00 (ac 0/hwq 0)
hwq 05: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
hwq 06: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
hwq 07: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
hwq 08: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
hwq 09: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
hwq 10: read=0 write=0 stop=0 swq_id=0x2a (ac 2/hwq 10)
hwq 11: read=0 write=0 stop=0 swq_id=0x2c (ac 0/hwq 11)
hwq 12: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
hwq 13: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
hwq 14: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
hwq 15: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
hwq 16: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
hwq 17: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
hwq 18: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
hwq 19: read=0 write=0 stop=0 swq_id=0x00 (ac 0/hwq 0)
-------------------------------------------------

Hope that helps. Anyone let me know if you need more testing.

Once more, be reminded the the firmware of the iwlagn is from
an experimental build that should solve the AGN stopped working
problem.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
SCRAMOGE (vb.)
To cut oneself whilst licking envelopes.
--- Douglas Adams, The Meaning of Liff

2011-10-19 12:49:52

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

Hi Wey,

On Di, 18 Okt 2011, Guy, Wey-Yi wrote:
> Could you please let me know the brand/model of the AP which having
> problem. Also the configuration of your NIC also help.

I will try to find out as many information as possible tomorrow
when I am back at university. I think the routers are hanging on
the ceiling, so I can somehow check what they are.

> also, when you say you don't have problem with other router, could you
> please be more clear, are those APs having the similar functions as the
> one you have in university, or different?

Well, at home I have a normal WLAN router, one where you plug on
the one side some upstream cable and then you have several downstream
cable (eth) and wlan.

On the university there is probably much more system since there are many
APs and one can easily wander around. I will try to find out as
much as possible.

Anything one can to wiht iwconfig, ethercap or similar tools?

> just got a idea, could you try to revert
> commit#68b993118f715cc631b62b6a50574e4701fe9ace

I try it tomorrow and let you know. I tried for now to revert it
directly without success, but I will try to do it by hand. If
I have problems I let you know. Tomorrow, now is break time ;-)

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
DES MOINES (pl.n.)
The two little lines which come down from your nose.
--- Douglas Adams, The Meaning of Liff

2011-10-20 03:14:27

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Mi, 19 Okt 2011, Guy, Wey-Yi W wrote:
> Hmm, I am sure I have the attachment, but here I attach again.
>
> Also, the patch is simple. Only few lines

No attachment again ... something is filtering here???

Please include it directly in the email

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
`How do you feel?' he asked him.
bits of me keep
passing out.' ....
`We're safe,' he said.
`Oh good,' said Arthur.
in one of the
spaceships of the Vogon Constructor Fleet.'
this is obviously some strange usage of
the word "safe" that I wasn't previously aware of.'
--- Arthur after his first ever teleport ride.
--- Douglas Adams, The Hitchhikers Guide to the Galaxy

2011-10-26 04:54:53

by Richard Yao

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

Dear Norbert,

Try doing "iwconfig wlan0 rts 0". As I said less than an hour ago in
my rather lengthy post to the Linux mailing list, that made things
better for me, but not perfect.

Speaking of which, that post described my findings on why I was having
wireless issues that I did not post several weeks ago because I didn't
think anyone would care. Since people do seem to care, I probably
should have CCed that email to everyone else that was on people's CC
lists, but I am new to mailing lists (and to Linux), so I did not
think of it until now.

Yours truly,
Richard Yao

On Wed, Oct 26, 2011 at 12:34 AM, Norbert Preining <[email protected]> wrote:
> On Di, 25 Okt 2011, wwguy wrote:
>> > But *I* cannot be sure about it ;-)
>> >
>> Are you seeing this without suspend/resume?
>
> Yes. I just shut down and started, and the same happens again. If you
> need the dmesg output let me know.
>
> Best wishes
>
> Norbert
> ------------------------------------------------------------------------
> Norbert Preining ? ? ? ? ? ?preining@{jaist.ac.jp, logic.at, debian.org}
> JAIST, Japan ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? TeX Live & Debian Developer
> DSA: 0x09C5B094 ? fp: 14DF 2E6C 0307 BE6D AD76 ?A9C0 D2BF 4AA3 09C5 B094
> ------------------------------------------------------------------------
> RUNCORN (n.)
> A peeble (q.v.) which is larger that a belper (q.v.)
> ? ? ? ? ? ? ? ? ? ? ? ?--- Douglas Adams, The Meaning of Liff
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at ?http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at ?http://www.tux.org/lkml/
>

2011-10-20 14:16:41

by Wey-Yi Guy

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

Hi Norbert,
On Wed, 2011-10-19 at 21:59 -0700, Norbert Preining wrote:
> Hi Wey,
>
> so I am now at the university, and it seems your last patch
> did wonder. Here a log after waking up from suspend at university...
>
Great, this is just a test patch, we will figure out what is the correct
way to handle this and fix the issue for good :-)

Thanks
Wey


2011-10-19 07:35:28

by Wey-Yi Guy

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Wed, 2011-10-19 at 00:19 -0700, Pekka Enberg wrote:
> Hi Wey,
>
> On Wed, Oct 19, 2011 at 9:15 AM, Guy, Wey-Yi <[email protected]> wrote:
> > Sorry, I was thinking ask Pekka to use the officialfirmware. But more
> > important, I will like to find out is Pekka has the same issue as you
> > have which is tx queue stopped.
>
> I'm happy to help but I have no idea what you want me to do :-). When
> my internet connection dies the next time, what do you want me to
> capture for you? What's official firmware? I'm using Fedora 15 with
> 3.1-rc9 built from sources here. Do you want to see my lspci, .config,
> dmesg, something else?

I don't think it is firmware problem, so please ignore :-)
>
> As for the problems, I'm using iPhone 3G internet via WiFi sharing and
> connection seems to die few times a day or if I take the phone far
> enough from the laptop. I usually turn off the AP for a while and use
> NetworkManager on the laptop to disable/enable networking to make it
> come back up.
>
> I haven't tested anything but 3.1-rcX kernels on this machine but I
> can do git bisect too if it comes to that.
>
I am just try to understand if you encounter the same problem as
Norbert. if you can dump the tx_queue
in /sys/kernel/debug/ieee80211/phX/iwlagn/debug folder and send me the
log, thsat will be great.


p.s. you will need to enable CONFIG_IWLWIFI_DEBUGFS option in
your .config

Thanks
Wey
> Pekka



2011-10-19 15:37:10

by Wey-Yi Guy

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

Hi Norbert,

On Wed, 2011-10-19 at 05:49 -0700, Norbert Preining wrote:
> Hi Wey,
>
> On Di, 18 Okt 2011, Guy, Wey-Yi wrote:
> > Could you please let me know the brand/model of the AP which having
> > problem. Also the configuration of your NIC also help.
>
> I will try to find out as many information as possible tomorrow
> when I am back at university. I think the routers are hanging on
> the ceiling, so I can somehow check what they are.
>
> > also, when you say you don't have problem with other router, could you
> > please be more clear, are those APs having the similar functions as the
> > one you have in university, or different?
>
> Well, at home I have a normal WLAN router, one where you plug on
> the one side some upstream cable and then you have several downstream
> cable (eth) and wlan.
>
> On the university there is probably much more system since there are many
> APs and one can easily wander around. I will try to find out as
> much as possible.
>
> Anything one can to wiht iwconfig, ethercap or similar tools?
>
> > just got a idea, could you try to revert
> > commit#68b993118f715cc631b62b6a50574e4701fe9ace
>
> I try it tomorrow and let you know. I tried for now to revert it
> directly without success, but I will try to do it by hand. If
> I have problems I let you know. Tomorrow, now is break time ;-)
>
Are you connect on 5.2GHz? looks to me there were only one place can
stop the queue which is for radar channel detection.

instead of having you to revert the patch I ask you to do, could you
please just apply the patch I attach here and see if it make any
differences? it is also for debugging purpose, so please check the dmesg
log.

Thanks
Wey


Attachments:
0001-iwlwifi-stop-queue-testing.patch (1.09 kB)

2011-10-26 04:03:38

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Di, 25 Okt 2011, wwguy wrote:
> so this is different problem, right? it looks different compare to what
> you described before which is cause by queue stopped.

I assume that this is a different regression concerning 3.1.0 released.

But *I* cannot be sure about it ;-)

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
ROCHESTER (n.)
One who is able to gain occupation of the armrest on both sides of
their cinema or aircraft seat.
--- Douglas Adams, The Meaning of Liff

2011-10-19 07:04:25

by Wey-Yi Guy

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Tue, 2011-10-18 at 23:46 -0700, Norbert Preining wrote:
> On Di, 18 Okt 2011, Guy, Wey-Yi wrote:
> > it is very interesting, for sure there is a bug here which cause NIC
> > stop working, if you look at the tx queue, hwq 0 is stop, which mean
> > nothing go out. I am not sure how we get into this? yes, most likely
>
> Yes, that is my obervation, too. Nothing goes out, so reassociation
> does not succeed.
>
> > Could you help me how to repro this problem?
>
> Not really, besides you come here to my university ;-)
> I have this problem currently only with the routers here at
> the university, not with other routers.
>
> Any other way I can provide you help?

Could you please let me know the brand/model of the AP which having
problem. Also the configuration of your NIC also help.

also, when you say you don't have problem with other router, could you
please be more clear, are those APs having the similar functions as the
one you have in university, or different?

Thanks
Wey

>
> Best wishes
>
> Norbert
> ------------------------------------------------------------------------
> Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
> JAIST, Japan TeX Live & Debian Developer
> DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
> ------------------------------------------------------------------------
> SKENFRITH (n.)
> The flakes of athlete's foot found inside socks.
> --- Douglas Adams, The Meaning of Liff



2011-10-20 02:51:28

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Mi, 19 Okt 2011, Guy, Wey-Yi wrote:
> > Do you mean the patch you send with a strange date:
...

> > It does not apply at all, I cannot even find
> > rejected
> > in iwl-agn-tx.c.
> >
> > On which tree is that based, can you provide a patch against main
> > git branch of Linus.
> >
> > Thanks.
> >
> Let's try this and see if apply. btw, are you on 5GHz?

Which one??? Nothing attached, nothing received, and the patch
you sent before does not apply .... not even with manual
intervention ... at least I am not able.

5GHz: I don't know by now, not at university by now.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
`Ford, you're turning into a penguin. Stop it.'
--- Arthur experiences the improbability drive at work.
--- Douglas Adams, The Hitchhikers Guide to the Galaxy

2011-10-20 05:30:43

by David Rientjes

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Thu, 20 Oct 2011, Norbert Preining wrote:

> No attachment again ... something is filtering here???
>
> Please include it directly in the email
>

Not sure why you're not receiving the inlined patch, you can see it at
https://lkml.org/lkml/2011/10/19/505

2011-10-20 04:59:34

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

Hi Wey,

so I am now at the university, and it seems your last patch
did wonder. Here a log after waking up from suspend at university...

[17394.694774] wlan0: associate with 00:24:c4:ab:bd:ef (try 1)
[17394.892077] wlan0: associate with 00:24:c4:ab:bd:ef (try 2)
[17394.896190] wlan0: RX AssocResp from 00:24:c4:ab:bd:ef (capab=0x1 status=0 aid=2)
[17394.896199] wlan0: associated
[17394.974620] iwlagn 0000:06:00.0: Encounter TX_STATUS_FAIL_PASSIVE_NO_RX, am I on 5.2G band? (2)
[17394.976930] iwlagn 0000:06:00.0: Encounter TX_STATUS_FAIL_PASSIVE_NO_RX, am I on 5.2G band? (2)
[17409.608674] iwlagn 0000:06:00.0: Tx aggregation enabled on ra = 00:24:c4:ab:bd:ef tid = 0
[17409.616748] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
[17409.816093] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
[17410.016083] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
[17410.216086] wlan0: direct probe to 00:24:c4:ab:bd:e0 timed out
[17419.285141] iwlagn 0000:06:00.0: Encounter TX_STATUS_FAIL_PASSIVE_NO_RX, am I on 5.2G band? (0)
[17419.285315] iwlagn 0000:06:00.0: Encounter TX_STATUS_FAIL_PASSIVE_NO_RX, am I on 5.2G band? (0)
[17419.286583] wlan0: deauthenticating from 00:24:c4:ab:bd:ef by local choice (reason=2)
[17419.300296] cfg80211: Calling CRDA for country: JP
[17419.309278] wlan0: authenticate with 00:24:c4:ab:bd:ef (try 1)
[17419.310105] wlan0: authenticated
[17419.317900] wlan0: associate with 00:24:c4:ab:bd:ef (try 1)
[17419.319837] wlan0: RX ReassocResp from 00:24:c4:ab:bd:ef (capab=0x1 status=0 aid=2)
[17419.319840] wlan0: associated
[17422.384922] iwlagn 0000:06:00.0: Tx aggregation enabled on ra = 00:24:c4:ab:bd:ef tid = 0
[17509.576705] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
[17509.776098] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
[17509.976086] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
[17509.980861] wlan0: direct probe responded
[17510.016116] wlan0: authenticate with 00:24:c4:ab:bd:e0 (try 1)
[17510.018753] wlan0: authenticated
[17510.042702] wlan0: associate with 00:24:c4:ab:bd:e0 (try 1)
[17510.050541] wlan0: RX ReassocResp from 00:24:c4:ab:bd:e0 (capab=0x421 status=0 aid=1)
[17510.050551] wlan0: associated
[17589.724188] iwlagn 0000:06:00.0: Tx aggregation enabled on ra = 00:24:c4:ab:bd:e0 tid = 0
[17590.839852] iwlagn 0000:06:00.0: Tx aggregation enabled on ra = 00:24:c4:ab:bd:e0 tid = 0

but the connection is up and working.

On Di, 18 Okt 2011, Guy, Wey-Yi wrote:
> Could you please let me know the brand/model of the AP which having
> problem. Also the configuration of your NIC also help.

THe AP is a big Cisco one, more I cannot see, with 3 antenna.

For the NIC, what do you want to know:
wlan0 IEEE 802.11abgn ESSID:"XXXXXXXXXX"
Mode:Managed Frequency:2.412 GHz Access Point: 00:24:C4:AB:BD:E0
Bit Rate=72.2 Mb/s Tx-Power=15 dBm
Retry long limit:7 RTS thr:off Fragment thr:off
Encryption key:off
Power Management:off
Link Quality=51/70 Signal level=-59 dBm
Rx invalid nwid:0 Rx invalid crypt:0 Rx invalid frag:0
Tx excessive retries:685 Invalid misc:59 Missed beacon:0

The last line looks interesting, excessive retires: 685, INvalid misc: 59


Note that the "Tx excessive retries" increase steaadily and radically,
downloading updates with aptitude I see afterwards:
Tx excessive retries:17922

The "Invalid misc" also increase, but slowly.

On Mi, 19 Okt 2011, wwguy wrote:
> Are you connect on 5.2GHz? looks to me there were only one place can
> stop the queue which is for radar channel detection.

No, it is channel 1
$ iwlist wlan0 channel
wlan0 32 channels in total; available frequencies :
Channel 01 : 2.412 GHz
Channel 02 : 2.417 GHz
Channel 03 : 2.422 GHz
Channel 04 : 2.427 GHz
Channel 05 : 2.432 GHz
Channel 06 : 2.437 GHz
Channel 07 : 2.442 GHz
Channel 08 : 2.447 GHz
Channel 09 : 2.452 GHz
Channel 10 : 2.457 GHz
Channel 11 : 2.462 GHz
Channel 12 : 2.467 GHz
Channel 13 : 2.472 GHz
Channel 36 : 5.18 GHz
Channel 40 : 5.2 GHz
Channel 44 : 5.22 GHz
Channel 48 : 5.24 GHz
Channel 52 : 5.26 GHz
Channel 56 : 5.28 GHz
Channel 60 : 5.3 GHz
Channel 64 : 5.32 GHz
Channel 100 : 5.5 GHz
Channel 104 : 5.52 GHz
Channel 108 : 5.54 GHz
Channel 112 : 5.56 GHz
Channel 116 : 5.58 GHz
Channel 120 : 5.6 GHz
Channel 124 : 5.62 GHz
Channel 128 : 5.64 GHz
Channel 132 : 5.66 GHz
Channel 136 : 5.68 GHz
Channel 140 : 5.7 GHz
Current Frequency:2.412 GHz (Channel 1)


Well, for the time being the patch is nice and helps, thanks!

If you want me to test more please let me know, but I am leaving
today for some business trip and will be back on Wednesday at
work here.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
GOLANT (adj.)
Blank, sly and faintly embarrassed. Pertaining to the expression seen
on the face of someone who has clearly forgotten your name.
--- Douglas Adams, The Meaning of Liff

2011-10-26 20:02:04

by Wey-Yi Guy

[permalink] [raw]
Subject: Re: [Ilw] Re: iwlagn is getting very shaky

On Tue, 2011-10-25 at 21:54 -0700, Richard Yao wrote:
> Dear Norbert,
>
> Try doing "iwconfig wlan0 rts 0". As I said less than an hour ago in
> my rather lengthy post to the Linux mailing list, that made things
> better for me, but not perfect.
>
> Speaking of which, that post described my findings on why I was having
> wireless issues that I did not post several weeks ago because I didn't
> think anyone would care. Since people do seem to care, I probably
> should have CCed that email to everyone else that was on people's CC
> lists, but I am new to mailing lists (and to Linux), so I did not
> think of it until now.
>
yes we care, sorry to miss your email thread. please provide more
information about your issue.

Thanks
Wey

>
> On Wed, Oct 26, 2011 at 12:34 AM, Norbert Preining <[email protected]> wrote:
> > On Di, 25 Okt 2011, wwguy wrote:
> >> > But *I* cannot be sure about it ;-)
> >> >
> >> Are you seeing this without suspend/resume?
> >
> > Yes. I just shut down and started, and the same happens again. If you
> > need the dmesg output let me know.
> >
> > Best wishes
> >
> > Norbert
> > ------------------------------------------------------------------------
> > Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
> > JAIST, Japan TeX Live & Debian Developer
> > DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
> > ------------------------------------------------------------------------
> > RUNCORN (n.)
> > A peeble (q.v.) which is larger that a belper (q.v.)
> > --- Douglas Adams, The Meaning of Liff
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to [email protected]
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at http://www.tux.org/lkml/
> >
>
> _______________________________________________
> ilw mailing list
> [email protected]
> http://linux.intel.com/mailman/listinfo/ilw



2011-10-20 13:50:35

by Johannes Berg

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Thu, 2011-10-20 at 22:45 +0900, Norbert Preining wrote:
> Hi Johannes,
>
> > did you have scan information for these APs?
>
> Unfortunately not, but I can get them when I am back on Wednesday, if you want.
>
> Just let me know what exactely.

Just "iw wlan0 scan" output or so. Basically I just want the BSSID/freq
to see which APs are on which band.

johannes


2011-10-19 06:45:39

by Wey-Yi Guy

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

hi,

On Tue, 2011-10-18 at 23:11 -0700, Pekka Enberg wrote:
> On Wed, Oct 19, 2011 at 9:09 AM, David Rientjes <[email protected]> wrote:
> > On Wed, 19 Oct 2011, Norbert Preining wrote:
> >
> >> Hi everyone
> >>
> >> (please Cc),
> >>
> >> I am currently running 3.1.0-rc10, and I am having a hard time with
> >> the wlan network here at the university.
> >>
> >> For quite some time, like 10min, it is fine, then suddently the
> >> iwlagn driver gives up on me and connection is dropped.
> >>
> >> In the log file I see:
> >> [ 172.137011] iwlagn 0000:06:00.0: Tx aggregation enabled on ra = 00:24:c4:ab:bd:ef tid = 0
> >> [ 821.841016] iwlagn 0000:06:00.0: Tx aggregation enabled on ra = 00:24:c4:ab:bd:ef tid = 6
> >> [ 1095.580735] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
> >> [ 1095.780076] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
> >> [ 1095.980101] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
> >> [ 1096.180117] wlan0: direct probe to 00:24:c4:ab:bd:e0 timed out
> >> [ 1105.255464] wlan0: deauthenticating from 00:24:c4:ab:bd:ef by local choice (reason=2)
> >> [ 1105.255519] iwlagn 0000:06:00.0: Stopping AGG while state not ON or starting
> >> [ 1105.265581] cfg80211: Calling CRDA for country: JP
> >> [ 1105.271476] wlan0: authenticate with 00:24:c4:ab:bd:e0 (try 1)
> >> [ 1105.468105] wlan0: authenticate with 00:24:c4:ab:bd:e0 (try 2)
> >> [ 1105.668110] wlan0: authenticate with 00:24:c4:ab:bd:e0 (try 3)
> >> [ 1105.868090] wlan0: authentication with 00:24:c4:ab:bd:e0 timed out
> >> [ 1113.667890] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
> >> [ 1113.864116] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
> >> [ 1114.064095] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
> >> [ 1114.264109] wlan0: direct probe to 00:24:c4:ab:bd:e0 timed out
> >>
> >> Somewhere around 1100 the connection is gone and never comes back again.
> >>
> >> I tried removing the driver module from the kernel and reinserting it,
> >> tried to turn on and off the hardware swithc (rfkill), all without
> >> no success, the wlan connection remains dead until I reboot.
> >>
> >> I am not sure exactely when it started, I guess somewhere in the
> >> 3.1 cycle, before I was permanently working wiht wlan, now I always
> >> plug in the cable.
> >>
> >> If there is any way to track down this, or any suggestions how I can
> >> debug it, please let me know.
> >>
> >> Hardware: Sony VGN-Z11, Intel(R) WiFi Link 5100 AGN, REV=0x54
> >> L1 Enabled; Disabling L0S
> >> device EEPROM VER=0x11e, CALIB=0x4
> >> Device SKU: 0Xf0
> >> Tunable channels: 13 802.11bg, 24 802.11a channels
> >> loaded firmware version 8.83.5.1 build 33692 (EXP)
> >>

looks like you are using experimental version of uCode, but by saying
that, I don't think i will make much differences

so could you dump the tx queue to see if it is the similar problem what
Norbert point out

thanks
Wey

> >>
> >> On the other hand, the same laptop with the very same configuration
> >> works very nicely in my flat's wlan, which is some dirt cheap Japanese
> >> only wlan router.
> >>
> >
> > There have been recent issues in 3.1-rc9 reported with iwlagn, see the
> > thread at https://lkml.org/lkml/2011/10/15/107 even though you have
> > different hardware. Adding Wey-Yi Guy from Intel and the linux-wireless
> > mailing list to the cc.
>
> I'm seeing very similar problems with 3.1-rc9 when my laptop is
> connected to an iPhone WiFi hotspot. Works just fine at home where I
> use Airport Express.
>
> Pekka



2011-10-26 02:50:24

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

Hi David,

On So, 23 Okt 2011, David Rientjes wrote:
> It'd be interesting to see if this problem still exists in linux-next
> since some wireless changes went into that recently. Please try the tree
> at git://github.com/sfrothwell/linux-next.git

I didn't try that by now, but I tried current linux git where
a lot of things went into, I am running on top of 7670c7010c.
With that I get:
- two different kernel bugs, the first one actually appearing several
times
- after a short time no package goes out, but the addapter still believes
to be associated.

Here is the dmesg output of the two kernel bugs and the following
hard reset.

BTW, that happened at the *critical* access point at the university.
At home I was before running hte same kernel without problem.
BUt it might be a suspend/resume problem, too, as I suspended the
computer and woke it up here.


[ 3279.808204] ------------[ cut here ]------------
[ 3279.808246] WARNING: at include/net/mac80211.h:3570 rate_control_send_low+0xa5/0x165 [mac80211]()
[ 3279.808254] Hardware name: VGN-Z11VN_B
[ 3279.808258] Modules linked in: rfcomm bnep vboxpci vboxnetadp vboxnetflt vboxdrv snd_hrtimer binfmt_misc dm_crypt dm_mod isofs btrfs zlib_deflate crc32c libcrc32c vfat fat fuse loop uinput snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm arc4 snd_seq_dummy snd_seq_oss iwlwifi snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq mac80211 btusb snd_timer bluetooth snd_seq_device firewire_ohci snd firewire_core sony_laptop crc16 soundcore mxm_wmi cfg80211 snd_page_alloc crc_itu_t rfkill joydev tpm_infineon
[ 3279.808379] Pid: 4956, comm: kworker/u:11 Tainted: G W 3.1.0+ #40
[ 3279.808385] Call Trace:
[ 3279.808400] [<ffffffff810376c1>] warn_slowpath_common+0x83/0x9b
[ 3279.808411] [<ffffffff810376f3>] warn_slowpath_null+0x1a/0x1c
[ 3279.808438] [<ffffffffa0158149>] rate_control_send_low+0xa5/0x165 [mac80211]
[ 3279.808459] [<ffffffffa01b01c2>] rs_get_rate+0x146/0x254 [iwlwifi]
[ 3279.808486] [<ffffffffa01585b1>] rate_control_get_rate+0x86/0x14c [mac80211]
[ 3279.808516] [<ffffffffa015f632>] ieee80211_tx_h_rate_ctrl+0x1cb/0x3e4 [mac80211]
[ 3279.808546] [<ffffffffa015fac9>] invoke_tx_handlers+0x69/0xf5 [mac80211]
[ 3279.808575] [<ffffffffa015fcd2>] ieee80211_tx+0x7f/0xaf [mac80211]
[ 3279.808605] [<ffffffffa0160172>] ieee80211_xmit+0x89/0x97 [mac80211]
[ 3279.808634] [<ffffffffa0160d10>] ieee80211_tx_skb+0x57/0x5f [mac80211]
[ 3279.808659] [<ffffffffa0152098>] ieee80211_send_nullfunc+0x5f/0x64 [mac80211]
[ 3279.808684] [<ffffffffa01520ff>] ieee80211_mgd_probe_ap_send+0x62/0xe6 [mac80211]
[ 3279.808709] [<ffffffffa01522fb>] ? ieee80211_recalc_ps+0x10/0x250 [mac80211]
[ 3279.808734] [<ffffffffa0152960>] ieee80211_mgd_probe_ap+0xc9/0xd9 [mac80211]
[ 3279.808759] [<ffffffffa0152a6b>] ieee80211_beacon_connection_loss_work+0xe2/0xed [mac80211]
[ 3279.808785] [<ffffffffa0152989>] ? ieee80211_sta_monitor_work+0x19/0x19 [mac80211]
[ 3279.808797] [<ffffffff8104d9cc>] process_one_work+0x17b/0x2bd
[ 3279.808807] [<ffffffff8104c011>] ? need_to_create_worker+0x12/0x26
[ 3279.808818] [<ffffffff8104eac7>] worker_thread+0xdb/0x15f
[ 3279.808829] [<ffffffff8104e9ec>] ? manage_workers.isra.24+0x171/0x171
[ 3279.808839] [<ffffffff81051ef2>] kthread+0x84/0x8c
[ 3279.808851] [<ffffffff81401394>] kernel_thread_helper+0x4/0x10
[ 3279.808862] [<ffffffff81051e6e>] ? kthread_worker_fn+0x148/0x148
[ 3279.808872] [<ffffffff81401390>] ? gs_change+0xb/0xb
[ 3279.808879] ---[ end trace b7620f873182e7e3 ]---
[ 3284.284925] wlan0: deauthenticating from 00:24:c4:ab:bd:ef by local choice (reason=2)
[ 3284.344356] cfg80211: Calling CRDA to update world regulatory domain
[ 3284.351319] wlan0: authenticate with 00:24:c4:ab:bd:ef (try 1)
[ 3284.445310] wlan0: authenticated
[ 3284.447486] wlan0: associate with 00:24:c4:ab:bd:ef (try 1)
[ 3284.449486] wlan0: RX ReassocResp from 00:24:c4:ab:bd:ef (capab=0x1 status=0 aid=2)
[ 3284.449489] wlan0: associated
[ 3314.515741] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:24:c4:ab:bd:ef tid = 0
[ 3314.515762] ------------[ cut here ]------------
[ 3314.515800] WARNING: at drivers/net/wireless/iwlwifi/iwl-trans-pcie.c:1101 iwl_trans_pcie_tx+0x180/0x661 [iwlwifi]()
[ 3314.515808] Hardware name: VGN-Z11VN_B
[ 3314.515813] Modules linked in: rfcomm bnep vboxpci vboxnetadp vboxnetflt vboxdrv snd_hrtimer binfmt_misc dm_crypt dm_mod isofs btrfs zlib_deflate crc32c libcrc32c vfat fat fuse loop uinput snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm arc4 snd_seq_dummy snd_seq_oss iwlwifi snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq mac80211 btusb snd_timer bluetooth snd_seq_device firewire_ohci snd firewire_core sony_laptop crc16 soundcore mxm_wmi cfg80211 snd_page_alloc crc_itu_t rfkill joydev tpm_infineon
[ 3314.515937] Pid: 4954, comm: kworker/u:9 Tainted: G W 3.1.0+ #40
[ 3314.515943] Call Trace:
[ 3314.515948] <IRQ> [<ffffffff810376c1>] warn_slowpath_common+0x83/0x9b
[ 3314.515973] [<ffffffff810376f3>] warn_slowpath_null+0x1a/0x1c
[ 3314.515996] [<ffffffffa01cdb0e>] iwl_trans_pcie_tx+0x180/0x661 [iwlwifi]
[ 3314.516009] [<ffffffff810d14b3>] ? kmem_cache_alloc+0x44/0xb9
[ 3314.516030] [<ffffffffa01b59f8>] iwlagn_tx_skb+0x862/0x901 [iwlwifi]
[ 3314.516048] [<ffffffffa01abf8c>] iwlagn_mac_tx+0x131/0x1a2 [iwlwifi]
[ 3314.516069] [<ffffffffa016abd6>] ? ieee80211_tx_h_fragment+0x16/0x22c [mac80211]
[ 3314.516098] [<ffffffffa015ef24>] __ieee80211_tx+0x176/0x1cf [mac80211]
[ 3314.516117] [<ffffffffa016ab48>] ? ieee80211_tx_h_calculate_duration+0x4c/0x65 [mac80211]
[ 3314.516147] [<ffffffffa015fcea>] ieee80211_tx+0x97/0xaf [mac80211]
[ 3314.516178] [<ffffffffa0160be6>] ieee80211_tx_pending+0xf0/0x1c3 [mac80211]
[ 3314.516191] [<ffffffff8103cdd2>] tasklet_action+0x77/0xc2
[ 3314.516201] [<ffffffff8103cf13>] __do_softirq+0xbc/0x1a5
[ 3314.516212] [<ffffffff8103cb4f>] ? test_ti_thread_flag.constprop.16+0x9/0x16
[ 3314.516225] [<ffffffff8140148c>] call_softirq+0x1c/0x30
[ 3314.516230] <EOI> [<ffffffff8100371a>] do_softirq+0x38/0x6e
[ 3314.516248] [<ffffffff8103cbd9>] _local_bh_enable_ip.isra.12+0x7d/0xa0
[ 3314.516259] [<ffffffff8103cc0a>] local_bh_enable_ip+0xe/0x10
[ 3314.516268] [<ffffffff813ff657>] _raw_spin_unlock_bh+0x23/0x25
[ 3314.516291] [<ffffffffa014ec70>] ieee80211_agg_tx_operational+0x99/0xa4 [mac80211]
[ 3314.516316] [<ffffffffa014f7bf>] ieee80211_process_addba_resp+0xb8/0xf1 [mac80211]
[ 3314.516329] [<ffffffff81064040>] ? arch_read_unlock+0xe/0xe
[ 3314.516355] [<ffffffffa0156a38>] ieee80211_iface_work+0x130/0x2b5 [mac80211]
[ 3314.516381] [<ffffffffa0156908>] ? ieee80211_teardown_sdata+0xcc/0xcc [mac80211]
[ 3314.516394] [<ffffffff8104d9cc>] process_one_work+0x17b/0x2bd
[ 3314.516404] [<ffffffff8104c011>] ? need_to_create_worker+0x12/0x26
[ 3314.516415] [<ffffffff8104eac7>] worker_thread+0xdb/0x15f
[ 3314.516426] [<ffffffff8104e9ec>] ? manage_workers.isra.24+0x171/0x171
[ 3314.516436] [<ffffffff81051ef2>] kthread+0x84/0x8c
[ 3314.516447] [<ffffffff81401394>] kernel_thread_helper+0x4/0x10
[ 3314.516458] [<ffffffff81051e6e>] ? kthread_worker_fn+0x148/0x148
[ 3314.516467] [<ffffffff81401390>] ? gs_change+0xb/0xb
[ 3314.516474] ---[ end trace b7620f873182e7e4 ]---
[ 3324.128094] iwlwifi 0000:06:00.0: Queue 11 stuck for 10000 ms.
[ 3324.128105] iwlwifi 0000:06:00.0: Current read_ptr 14 write_ptr 31
[ 3324.128113] iwlwifi 0000:06:00.0: On demand firmware reload
[ 3324.128548] ieee80211 phy0: Hardware restart was requested
[ 3324.128657] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
[ 3324.131680] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0


Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
PODE HOLE (n.)
A hole drilled in chipboard lavatory walls by homosexuals for any one
of a number of purposes.
--- Douglas Adams, The Meaning of Liff

2011-10-20 07:40:27

by Johannes Berg

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Thu, 2011-10-20 at 13:59 +0900, Norbert Preining wrote:

> so I am now at the university, and it seems your last patch
> did wonder. Here a log after waking up from suspend at university...

For some value of wonder ...

> [17394.896199] wlan0: associated
> [17394.974620] iwlagn 0000:06:00.0: Encounter TX_STATUS_FAIL_PASSIVE_NO_RX, am I on 5.2G band? (2)
> [17394.976930] iwlagn 0000:06:00.0: Encounter TX_STATUS_FAIL_PASSIVE_NO_RX, am I on 5.2G band? (2)
> [17409.608674] iwlagn 0000:06:00.0: Tx aggregation enabled on ra = 00:24:c4:ab:bd:ef tid = 0
> [17409.616748] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
> [17409.816093] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
> [17410.016083] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
> [17410.216086] wlan0: direct probe to 00:24:c4:ab:bd:e0 timed out
> [17419.285141] iwlagn 0000:06:00.0: Encounter TX_STATUS_FAIL_PASSIVE_NO_RX, am I on 5.2G band? (0)
> [17419.285315] iwlagn 0000:06:00.0: Encounter TX_STATUS_FAIL_PASSIVE_NO_RX, am I on 5.2G band? (0)
> [17419.286583] wlan0: deauthenticating from 00:24:c4:ab:bd:ef by local choice (reason=2)
> [17419.300296] cfg80211: Calling CRDA for country: JP
> [17419.309278] wlan0: authenticate with 00:24:c4:ab:bd:ef (try 1)
> [17419.310105] wlan0: authenticated
> [17419.317900] wlan0: associate with 00:24:c4:ab:bd:ef (try 1)

It seems a bit like the 5 GHz AP (:e0) just fails, and wpa_supplicant
selects a 2.4 GHz AP (:ef) now? Pure conjecture, I guess the AP could
also be on the same channel -- did you have scan information for these
APs?

johannes


2011-10-23 21:26:31

by David Rientjes

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Fri, 21 Oct 2011, Norbert Preining wrote:

> > Great, this is just a test patch, we will figure out what is the correct
> > way to handle this and fix the issue for good :-)
>
> Thanks. If you need more testing just let me know.
>

It'd be interesting to see if this problem still exists in linux-next
since some wireless changes went into that recently. Please try the tree
at git://github.com/sfrothwell/linux-next.git

2011-10-26 19:42:49

by Wey-Yi Guy

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Tue, 2011-10-25 at 21:34 -0700, Norbert Preining wrote:
> On Di, 25 Okt 2011, wwguy wrote:
> > > But *I* cannot be sure about it ;-)
> > >
> > Are you seeing this without suspend/resume?
>
> Yes. I just shut down and started, and the same happens again. If you
> need the dmesg output let me know.
>

please send us the log, but I will be late on response since I am on
business trip until the end of next week.

Thanks
Wey


2011-10-20 13:54:33

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky


> Just "iw wlan0 scan" output or so. Basically I just want the BSSID/freq
> to see which APs are on which band.

Ok, no prob, do it Wed.

Bye

N

2011-10-19 06:46:33

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Di, 18 Okt 2011, Guy, Wey-Yi wrote:
> it is very interesting, for sure there is a bug here which cause NIC
> stop working, if you look at the tx queue, hwq 0 is stop, which mean
> nothing go out. I am not sure how we get into this? yes, most likely

Yes, that is my obervation, too. Nothing goes out, so reassociation
does not succeed.

> Could you help me how to repro this problem?

Not really, besides you come here to my university ;-)
I have this problem currently only with the routers here at
the university, not with other routers.

Any other way I can provide you help?

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
SKENFRITH (n.)
The flakes of athlete's foot found inside socks.
--- Douglas Adams, The Meaning of Liff

2011-10-26 04:06:48

by Wey-Yi Guy

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Tue, 2011-10-25 at 21:03 -0700, Norbert Preining wrote:
> On Di, 25 Okt 2011, wwguy wrote:
> > so this is different problem, right? it looks different compare to what
> > you described before which is cause by queue stopped.
>
> I assume that this is a different regression concerning 3.1.0 released.
>
> But *I* cannot be sure about it ;-)
>
Are you seeing this without suspend/resume?

Thanks
Wey


2011-10-26 04:34:25

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Di, 25 Okt 2011, wwguy wrote:
> > But *I* cannot be sure about it ;-)
> >
> Are you seeing this without suspend/resume?

Yes. I just shut down and started, and the same happens again. If you
need the dmesg output let me know.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
RUNCORN (n.)
A peeble (q.v.) which is larger that a belper (q.v.)
--- Douglas Adams, The Meaning of Liff

2011-10-19 06:11:33

by Pekka Enberg

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Wed, Oct 19, 2011 at 9:09 AM, David Rientjes <[email protected]> wrote:
> On Wed, 19 Oct 2011, Norbert Preining wrote:
>
>> Hi everyone
>>
>> (please Cc),
>>
>> I am currently running 3.1.0-rc10, and I am having a hard time with
>> the wlan network here at the university.
>>
>> For quite some time, like 10min, it is fine, then suddently the
>> iwlagn driver gives up on me and connection is dropped.
>>
>> In the log file I see:
>> [ ?172.137011] iwlagn 0000:06:00.0: Tx aggregation enabled on ra = 00:24:c4:ab:bd:ef tid = 0
>> [ ?821.841016] iwlagn 0000:06:00.0: Tx aggregation enabled on ra = 00:24:c4:ab:bd:ef tid = 6
>> [ 1095.580735] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
>> [ 1095.780076] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
>> [ 1095.980101] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
>> [ 1096.180117] wlan0: direct probe to 00:24:c4:ab:bd:e0 timed out
>> [ 1105.255464] wlan0: deauthenticating from 00:24:c4:ab:bd:ef by local choice (reason=2)
>> [ 1105.255519] iwlagn 0000:06:00.0: Stopping AGG while state not ON or starting
>> [ 1105.265581] cfg80211: Calling CRDA for country: JP
>> [ 1105.271476] wlan0: authenticate with 00:24:c4:ab:bd:e0 (try 1)
>> [ 1105.468105] wlan0: authenticate with 00:24:c4:ab:bd:e0 (try 2)
>> [ 1105.668110] wlan0: authenticate with 00:24:c4:ab:bd:e0 (try 3)
>> [ 1105.868090] wlan0: authentication with 00:24:c4:ab:bd:e0 timed out
>> [ 1113.667890] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
>> [ 1113.864116] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
>> [ 1114.064095] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
>> [ 1114.264109] wlan0: direct probe to 00:24:c4:ab:bd:e0 timed out
>>
>> Somewhere around 1100 the connection is gone and never comes back again.
>>
>> I tried removing the driver module from the kernel and reinserting it,
>> tried to turn on and off the hardware swithc (rfkill), all without
>> no success, the wlan connection remains dead until I reboot.
>>
>> I am not sure exactely when it started, I guess somewhere in the
>> 3.1 cycle, before I was permanently working wiht wlan, now I always
>> plug in the cable.
>>
>> If there is any way to track down this, or any suggestions how I can
>> debug it, please let me know.
>>
>> Hardware: Sony VGN-Z11, Intel(R) WiFi Link 5100 AGN, REV=0x54
>> L1 Enabled; Disabling L0S
>> device EEPROM VER=0x11e, CALIB=0x4
>> Device SKU: 0Xf0
>> Tunable channels: 13 802.11bg, 24 802.11a channels
>> loaded firmware version 8.83.5.1 build 33692 (EXP)
>>
>>
>> On the other hand, the same laptop with the very same configuration
>> works very nicely in my flat's wlan, which is some dirt cheap Japanese
>> only wlan router.
>>
>
> There have been recent issues in 3.1-rc9 reported with iwlagn, see the
> thread at https://lkml.org/lkml/2011/10/15/107 even though you have
> different hardware. ?Adding Wey-Yi Guy from Intel and the linux-wireless
> mailing list to the cc.

I'm seeing very similar problems with 3.1-rc9 when my laptop is
connected to an iPhone WiFi hotspot. Works just fine at home where I
use Airport Express.

Pekka

2011-10-20 13:46:55

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

Hi Johannes,

> did you have scan information for these APs?

Unfortunately not, but I can get them when I am back on Wednesday, if you want.

Just let me know what exactely.

Best wishes

Norbert


2011-11-21 03:06:45

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

Hi Emmanuel, hi all,

sorry for the long silence. I was a bit travelling and off,


On Mi, 09 Nov 2011, Emmanuel Grumbach wrote:
> > On Di, 08 Nov 2011, Emmanuel Grumbach wrote:
> > > I would like to see what happens in terms or power.
> > > Please set the power debug flag: modprobe iwlwifi debug=0x100
> >
> > > It may be worth to disable Link Power save too. I would be glad if you
> > > could try this patch (attached and inlined), with the power debug flag
> > > set.

I have now disabled the debug option again. And rebuild some new kernels
(3.2.0-rc1 and top git).

And what I stil see is that the tx-queue gets stuck and does not
recover without rmmod iwlwif.

hwq 11: read=0 write=0 stop=1 swq_id=0x2e (ac 2/hwq 11)

But I still am using the patch that you sent to disable power save,
so it is not only this one.

If you think I shouuld turn on power debug again and see hope that
it happens again, let me know.

Furthermore, I have now seen hiccups on my home router, too, at
times. There is recovers normally by itself, but still.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
Program aborting:
Close all that you have worked on.
You ask far too much.
--- Windows Error Haiku

2011-11-24 23:03:10

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Do, 24 Nov 2011, Emmanuel Grumbach wrote:
> Please apply the patch attached on top of the one I previously sent.
> Let me know I you have trouble with patch.

Which patch? Nothing attached, I also checked in the mail archives
in case I had again problems with attached patches. Nothing.
Please resend.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
SAFFRON WALDEN (n.)
To spray the person you are talking to with half-chewed breadcrumbs or
small pieces of whitebait.
--- Douglas Adams, The Meaning of Liff

2011-11-27 10:44:36

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Sun, Nov 27, 2011 at 11:46, Johannes Berg <[email protected]> wrote:
> On Sun, 2011-11-27 at 11:43 +0200, Emmanuel Grumbach wrote:
>> Hi again
>>
>> Can you please try the attached patch ?
>> I would like to know if it helps.
>
> As in the patch that Nikolay sent to the list earlier today, this should
> also check STOPPING since WANT_STOP may get cleared as soon as it is
> processed.
>

Thanks.
Norbert, could you please test Nikolay's patch instead of mine ?

2011-11-21 23:25:05

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Mo, 21 Nov 2011, Emmanuel Grumbach wrote:
> iwlwifi: add debug information on queue stop / wake
>
> and run with debug flags 0x80000000 (bit 31)
> This will print a message each time a queue is stopped / started.

I have these patches in addition to current git:
--- a/drivers/net/wireless/iwlwifi/iwl-power.c
+++ b/drivers/net/wireless/iwlwifi/iwl-power.c
@@ -436,7 +436,7 @@ int iwl_power_update_mode(struct iwl_priv *priv, bool force)
/* initialize to default */
void iwl_power_initialize(struct iwl_priv *priv)
{
- priv->power_data.bus_pm = bus_get_pm_support(bus(priv));
+ priv->power_data.bus_pm = false;

priv->power_data.debug_sleep_level_override = -1;

diff --git a/drivers/net/wireless/iwlwifi/iwl-trans-pcie.c b/drivers/net/wireless/iwlwifi/iwl-trans-pcie.c
index da34110..2df3bac 100644
--- a/drivers/net/wireless/iwlwifi/iwl-trans-pcie.c
+++ b/drivers/net/wireless/iwlwifi/iwl-trans-pcie.c
@@ -1512,8 +1512,12 @@ static int iwl_trans_pcie_check_stuck_queue(struct iwl_trans *trans, int cnt)
if (time_after(jiffies, timeout)) {
IWL_ERR(trans, "Queue %d stuck for %u ms.\n", q->id,
hw_params(trans).wd_timeout);
- IWL_ERR(trans, "Current read_ptr %d write_ptr %d\n",
+ IWL_ERR(trans, "Current SW read_ptr %d write_ptr %d\n",
q->read_ptr, q->write_ptr);
+ IWL_ERR(trans, "Current HW read_ptr %d write_ptr %d\n",
+ iwl_read_prph(bus(trans), SCD_QUEUE_RDPTR(cnt))
+ & (TFD_QUEUE_SIZE_MAX - 1),
+ iwl_read_prph(bus(trans), SCD_QUEUE_WRPTR(cnt)));
return 1;
}


That is all I get. Is that what you mean?

Related to that this morning at home (not university!) I had the hanging
again, had to remove the module. Always after wake up from ram (s2ram).
There I had in the log:
Nov 22 07:55:32 mithrandir kernel: [ 4536.000032] iwlwifi 0000:06:00.0: Queue 11
stuck for 10000 ms.
Nov 22 07:55:32 mithrandir kernel: [ 4536.000044] iwlwifi 0000:06:00.0: Current
SW read_ptr 155 write_ptr 164
Nov 22 07:55:32 mithrandir kernel: [ 4536.000099] iwlwifi 0000:06:00.0: Current HW read_ptr 155 write_ptr 164
Nov 22 07:55:32 mithrandir kernel: [ 4536.000107] iwlwifi 0000:06:00.0: On demand firmware reload
Nov 22 07:55:32 mithrandir kernel: [ 4536.000546] ieee80211 phy1: Hardware restart was requested
Nov 22 07:55:32 mithrandir kernel: [ 4536.000648] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
Nov 22 07:55:32 mithrandir kernel: [ 4536.003755] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0

But somehow it didn't actually start working again.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
ULLINGSWICK (n.)
An over-developed epiglottis found in middle-aged coloraturas.
--- Douglas Adams, The Meaning of Liff

2011-11-22 07:27:38

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

HI Emmanuel,

On Di, 22 Nov 2011, Emmanuel Grumbach wrote:
> Please apply the patch attached. It has been submitted upstream but
> hasn't been merged in the tree you use apparently.

Does not apply against current git, as there is no iwl-mac80211.c:
[/usr/src/git-kernel/linux.git] patch -p1 --dry-run < ~/QUEUE_DEBUG.patch
patching file drivers/net/wireless/iwlwifi/iwl-agn-rx.c
patching file drivers/net/wireless/iwlwifi/iwl-agn-rxon.c
Hunk #1 succeeded at 116 (offset -1 lines).
Hunk #2 succeeded at 840 (offset -4 lines).
patching file drivers/net/wireless/iwlwifi/iwl-agn-tx.c
Hunk #1 succeeded at 800 (offset -13 lines).
patching file drivers/net/wireless/iwlwifi/iwl-debug.h
Hunk #1 succeeded at 151 (offset -15 lines).
Hunk #2 succeeded at 188 (offset -15 lines).
can't find file to patch at input line 78
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|diff --git a/drivers/net/wireless/iwlwifi/iwl-mac80211.c b/drivers/net/wireless/iwlwifi/iwl-mac80211.c
|index 073e827..05b1f0d 100644
|--- a/drivers/net/wireless/iwlwifi/iwl-mac80211.c
|+++ b/drivers/net/wireless/iwlwifi/iwl-mac80211.c
--------------------------
File to patch:


> Then please run with set the TXQUEUES debug flag: 0x8000000.

Is this a load options to iwlwifi module (debug=0x..... at module load time?)
or has to be set in /sys/kernel/debug/ieee80211/phy?/iwlwifi/debug/*
somewhere?

> This will give us a better understanding regarding what happens with
> the queues and how they are mapped.

You have seen my remark on RX packages and RX bytes?

> Do you use any SoftAP / P2P feature ? HW queues can be stalled

No. Fixed AP with fixed IP and no special adaptions.

> sometimes when using such configuration. Can it also be that the AP
> that we can't work with has any kind of power save feature (I know
> that this one is hard to answer though :-)) ?

There is AFAIR something that is set after 3hours or so ... not of
importance, but I will check the settings. But then, it was always
working before ;-)

> Forgot to reply on the WARN_ON you saw. Yes I know about it. We seem

Ok, fine. Thus I can forget about it. THanks.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
HARPENDEN (n.)
The coda to a phone conversion, consisting of about eight exchanges,
by which people try gracefully to get off the line.
--- Douglas Adams, The Meaning of Liff

2011-11-22 14:15:22

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Tue, Nov 22, 2011 at 14:43, Norbert Preining <[email protected]> wrote:
> On Di, 22 Nov 2011, Emmanuel Grumbach wrote:
>> Sorry, please try again with that one.
>
> Here we go:
>
> [ ? 48.355300] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
> [ ? 48.360704] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
> [ ? 48.402330] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 0 on FIFO 3
> [ ? 48.403914] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 1 on FIFO 2
> [ ? 48.404020] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 2 on FIFO 1
> [ ? 48.404020] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 3 on FIFO 0
> [ ? 48.404020] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 4 on FIFO 7
> [ ? 48.484466] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
> [ ? 48.488913] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
> [ ? 48.534317] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 0 on FIFO 3
> [ ? 48.535806] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 1 on FIFO 2
> [ ? 48.536024] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 2 on FIFO 1
> [ ? 48.536024] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 3 on FIFO 0
> [ ? 48.536024] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 4 on FIFO 7
> [ ? 56.273961] wlan0: authenticate with 00:0a:79:eb:56:10 (try 1)
> [ ? 56.276374] wlan0: authenticated
> [ ? 56.281516] wlan0: associate with 00:0a:79:eb:56:10 (try 1)
> [ ? 56.288627] wlan0: RX AssocResp from 00:0a:79:eb:56:10 (capab=0x411 status=0 aid=1)
> [ ? 56.288633] wlan0: associated
> [ ? 74.164324] iwlwifi 0000:06:00.0: U iwl_trans_pcie_tx_agg_alloc HW queue is empty
> [ ? 75.164059] iwlwifi 0000:06:00.0: U iwl_trans_pcie_tx_agg_disable HW queue is empty
> [ ? 75.165912] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Deactivate AC/CMD Queue 11 on FIFO 0
> [ ? 76.481181] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Deactivate BA Queue 11 on FIFO 1
> [ ? 76.482944] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0

This is completely broken. This log looks really bad. AGG are enabled
and disabled right away, messing the queues completely.
It looks like the AGG gets disabled in the between alloc and status.
Will send another patch later.

Thanks

2011-11-27 09:43:17

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

Hi again

Can you please try the attached patch ?
I would like to know if it helps.

Thanks !

On Sun, Nov 27, 2011 at 08:38, Emmanuel Grumbach <[email protected]> wrote:
>> Yeah, that was my impression too, but still, I wanted to mention it.
>>
>>> I am already on week-end here and will look at it more carefully
>>> later. First impression is that things are messy here. I guess we are
>>> getting in a bad failure path that wasn't checked...
>>
>> No problem, I can cope with it.
>>
>
> you can always disable 11n: modprobe 11n_disable=1.
>
> Well... Something is really weird here:
> mac80211 requests from us to:
> * start agg
> * stop agg
> * agg operational
>
> this is not a legal sequence
>
> BTW - I am under this impression that mac80211 hasn't been compiled
> with HT debug flag. Can you please set this compilation flag ?
>
> I also would like you to add this line in iwlagn_mac_ampdu_action:
>
>
> static int iwlagn_mac_ampdu_action(struct ieee80211_hw *hw,
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? struct ieee80211_vif *vif,
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? enum ieee80211_ampdu_mlme_action action,
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? struct ieee80211_sta *sta, u16 tid, u16 *ssn,
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? u8 buf_size)
> {
> ? ? ? ?struct iwl_priv *priv = hw->priv;
> ? ? ? ?int ret = -EINVAL;
> ? ? ? ?struct iwl_station_priv *sta_priv = (void *) sta->drv_priv;
>
> ? ? ? ?IWL_DEBUG_HT(priv, "A-MPDU action on addr %pM tid %d\n",
> ? ? ? ? ? ? ? ? ? ? sta->addr, tid);
>
> dump_stack(); ? <<<<========================================= new line
>
> ? ? ? ?if (!(priv->cfg->sku & EEPROM_SKU_CAP_11N_ENABLE))
> ? ? ? ? ? ? ? ?return -EACCES;
>
>
>
> This will give an insight at why is mac80211 calling the driver and
> better point me to the bug.
>
>
> Thanks !
>


Attachments:
addba_timer_fix.patch (563.00 B)

2011-11-04 00:56:41

by Richard Yao

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

Dear Wey,

My situation is very similar to Norbert's. At my house, I have no
issues at all, but at my university, I tend to have issues. In the
past, using 11n_disable=1 would cause the wireless interface to break
until I executed "modprobe -r iwlagn && modprobe iwlagn" as root. That
was around Linux 3.0-rc4. I haven't tried it since, but what seems to
help is "iwconfig wlan0 rts 0", but even that is not perfect. I just
gave "modprobe -r iwlagn && modprobe iwlagn 11n_disable=1" another try
and it makes no difference.

Issues with the campus Wi-Fi are not isolated to Linux users. I
suspect it has something to do with the 100 or so access points that I
see when I do "iwlist wlan0 scan" at my university and the 4 that I
see when I do that off campus. I don't want to imagine how many
devices are trying to use the available bandwidth. I imagine that
implementing "auto" for rts would improve both my and Norbert's
situations.

Yours truly,
Richard Yao

On Tue, Nov 1, 2011 at 11:21 PM, Guy, Wey-Yi <[email protected]> wrote:
> Hi Norbert,
>
> On Tue, 2011-11-01 at 20:13 -0700, Norbert Preining wrote:
>> Hi all,
>>
>> On Mi, 26 Okt 2011, Norbert Preining wrote:
>> > On Di, 25 Okt 2011, wwguy wrote:
>> > > so this is different problem, right? it looks different compare to what
>> > > you described before which is cause by queue stopped.
>> >
>> > I assume that this is a different regression concerning 3.1.0 released.
>>
>> Is there any progress on that? Currently linux git status is unusable
>> with respect to iwlwifi?
>>
>> Are there any changes planned before rc1?
>>
> after the firmware reloaded, is the traffic resume? or it is continuous
> without traffic?
>
> Looks like the different "queue stuck" problem you are seeing. you
> mention you only seeing this at university but not home. So what the
> differences are?
> what Band/channel you are using, also, if you disable 11n, are you still
> seeing the problem?
>
> you also mention it is might related to suspend/resume, could you please
> give a better description why you think it might be related
>
> Thanks
> Wey
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at ?http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at ?http://www.tux.org/lkml/
>

2011-11-04 04:58:27

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

Hi Wey,

On Di, 01 Nov 2011, Guy, Wey-Yi wrote:
> after the firmware reloaded, is the traffic resume? or it is continuous
> without traffic?

Ok, I compiled a new kernel from todays git, and I see that there
are kernel bugs, but after the hardware restart traffic gets through.

While suspending I have:
[ 6630.948551] WARNING: at drivers/pci/pci-driver.c:607 pci_has_legacy_pm_support.isra.8+0x53/0x59()
[ 6630.948557] Hardware name: VGN-Z11VN_B
[ 6630.948561] Modules linked in: sony_laptop rfcomm bnep snd_hrtimer binfmt_misc dm_crypt dm_mod isofs btrfs zlib_deflate crc32c libcrc32c vfat fat fuse loop uinput snd_hda_codec_realtek arc4 iwlwifi snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm firewire_ohci mac80211 snd_seq_dummy snd_seq_oss snd_seq_midi firewire_core cfg80211 snd_rawmidi snd_seq_midi_event btusb snd_seq snd_timer mxm_wmi bluetooth snd_seq_device crc16 snd rfkill joydev crc_itu_t tpm_infineon soundcore snd_page_alloc
[ 6630.948655] Pid: 11390, comm: kworker/u:8 Not tainted 3.1.0+ #42
[ 6630.948660] Call Trace:
[ 6630.948674] [<ffffffff8103843d>] warn_slowpath_common+0x83/0x9b
[ 6630.948683] [<ffffffff8103846f>] warn_slowpath_null+0x1a/0x1c
[ 6630.948692] [<ffffffff811aa983>] pci_has_legacy_pm_support.isra.8+0x53/0x59
[ 6630.948701] [<ffffffff811aaf94>] pci_pm_suspend+0x34/0x104
[ 6630.948711] [<ffffffff8127f50b>] pm_op+0x8b/0x149
[ 6630.948719] [<ffffffff8127f72f>] __device_suspend+0x106/0x196
[ 6630.948727] [<ffffffff8127f7de>] async_suspend+0x1f/0x5d
[ 6630.948737] [<ffffffff81058251>] async_run_entry_fn+0x9e/0x131
[ 6630.948746] [<ffffffff810581b3>] ? async_schedule+0x17/0x17
[ 6630.948756] [<ffffffff8104dd60>] process_one_work+0x17b/0x2bd
[ 6630.948764] [<ffffffff8104c3a5>] ? need_to_create_worker+0x12/0x26
[ 6630.948773] [<ffffffff8104ee5b>] worker_thread+0xdb/0x15f
[ 6630.948780] [<ffffffff8104ed80>] ? manage_workers.isra.24+0x171/0x171
[ 6630.948789] [<ffffffff81052541>] kthread+0x84/0x8c
[ 6630.948800] [<ffffffff81408a14>] kernel_thread_helper+0x4/0x10
[ 6630.948809] [<ffffffff810524bd>] ? kthread_worker_fn+0x148/0x148
[ 6630.948817] [<ffffffff81408a10>] ? gs_change+0xb/0xb
[ 6630.948824] ---[ end trace 9f0907b3f72ff4c6 ]---

(several of them)


Then I get some WARNINGS from iwl:
[ 6662.601340] WARNING: at drivers/net/wireless/iwlwifi/iwl-trans-pcie.c:1101 iwl_trans_pcie_tx+0x180/0x661 [iwlwifi]()
[ 6662.601343] Hardware name: VGN-Z11VN_B
[ 6662.601345] Modules linked in: sony_laptop rfcomm bnep snd_hrtimer binfmt_misc dm_crypt dm_mod isofs btrfs zlib_deflate crc32c libcrc32c vfat fat fuse loop uinput snd_hda_codec_realtek arc4 iwlwifi snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm firewire_ohci mac80211 snd_seq_dummy snd_seq_oss snd_seq_midi firewire_core cfg80211 snd_rawmidi snd_seq_midi_event btusb snd_seq snd_timer mxm_wmi bluetooth snd_seq_device crc16 snd rfkill joydev crc_itu_t tpm_infineon soundcore snd_page_alloc
[ 6662.601386] Pid: 11391, comm: kworker/u:9 Tainted: G W 3.1.0+ #42
[ 6662.601388] Call Trace:
[ 6662.601390] <IRQ> [<ffffffff8103843d>] warn_slowpath_common+0x83/0x9b
[ 6662.601399] [<ffffffff8103846f>] warn_slowpath_null+0x1a/0x1c
[ 6662.601407] [<ffffffffa0223b04>] iwl_trans_pcie_tx+0x180/0x661 [iwlwifi]
[ 6662.601412] [<ffffffff810d32a5>] ? kmem_cache_alloc+0x44/0xb9
[ 6662.601419] [<ffffffffa020b9f4>] iwlagn_tx_skb+0x862/0x901 [iwlwifi]
[ 6662.601425] [<ffffffffa0201f8c>] iwlagn_mac_tx+0x131/0x1a2 [iwlwifi]
[ 6662.601434] [<ffffffffa01d8bc6>] ? ieee80211_tx_h_fragment+0x16/0x22c [mac80211]
[ 6662.601443] [<ffffffffa01ccf14>] __ieee80211_tx+0x176/0x1cf [mac80211]
[ 6662.601449] [<ffffffffa01d8b38>] ? ieee80211_tx_h_calculate_duration+0x4c/0x65 [mac80211]
[ 6662.601459] [<ffffffffa01cdcda>] ieee80211_tx+0x97/0xaf [mac80211]
[ 6662.601469] [<ffffffffa01cebd6>] ieee80211_tx_pending+0xf0/0x1c3 [mac80211]
[ 6662.601473] [<ffffffff8103db22>] tasklet_action+0x77/0xc2
[ 6662.601477] [<ffffffff8103dc63>] __do_softirq+0xbc/0x1a5
[ 6662.601481] [<ffffffff81408b0c>] call_softirq+0x1c/0x30
[ 6662.601483] <EOI> [<ffffffff8100359e>] do_softirq+0x38/0x6e
[ 6662.601489] [<ffffffff8103d929>] _local_bh_enable_ip.isra.12+0x7d/0xa0
[ 6662.601492] [<ffffffff8103d95a>] local_bh_enable_ip+0xe/0x10
[ 6662.601496] [<ffffffff81406d22>] _raw_spin_unlock_bh+0x23/0x25
[ 6662.601503] [<ffffffffa01bcc64>] ieee80211_agg_tx_operational+0x99/0xa4 [mac80211]
[ 6662.601512] [<ffffffffa01bd7b2>] ieee80211_process_addba_resp+0xb8/0xf2 [mac80211]
[ 6662.601516] [<ffffffff81064740>] ? do_raw_spin_trylock+0xc/0x2a
[ 6662.601525] [<ffffffffa01c4a26>] ieee80211_iface_work+0x130/0x2b5 [mac80211]
[ 6662.601534] [<ffffffffa01c48f6>] ? ieee80211_teardown_sdata+0xcc/0xcc [mac80211]
[ 6662.601538] [<ffffffff8104dd60>] process_one_work+0x17b/0x2bd
[ 6662.601541] [<ffffffff8104c3a5>] ? need_to_create_worker+0x12/0x26
[ 6662.601545] [<ffffffff8104ee5b>] worker_thread+0xdb/0x15f
[ 6662.601548] [<ffffffff8104ed80>] ? manage_workers.isra.24+0x171/0x171
[ 6662.601552] [<ffffffff81052541>] kthread+0x84/0x8c
[ 6662.601556] [<ffffffff81408a14>] kernel_thread_helper+0x4/0x10
[ 6662.601559] [<ffffffff810524bd>] ? kthread_worker_fn+0x148/0x148
[ 6662.601563] [<ffffffff81408a10>] ? gs_change+0xb/0xb

Then I get the hanging queue:
[ 6672.360098] iwlwifi 0000:06:00.0: Queue 11 stuck for 10000 ms.
[ 6672.360109] iwlwifi 0000:06:00.0: Current read_ptr 251 write_ptr 0
[ 6672.360117] iwlwifi 0000:06:00.0: On demand firmware reload
[ 6672.360548] ieee80211 phy0: Hardware restart was requested
[ 6672.360643] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
[ 6672.363667] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0

After that network manager tries to connect to my univerity network,
and I get another
[ 6702.613501] WARNING: at include/net/mac80211.h:3570 rate_control_send_low+0xa5/0x165 [mac80211]()
[ 6702.613508] Hardware name: VGN-Z11VN_B
[ 6702.613513] Modules linked in: sony_laptop rfcomm bnep snd_hrtimer binfmt_misc dm_crypt dm_mod isofs btrfs zlib_deflate crc32c libcrc32c vfat fat fuse loop uinput snd_hda_codec_realtek arc4 iwlwifi snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm firewire_ohci mac80211 snd_seq_dummy snd_seq_oss snd_seq_midi firewire_core cfg80211 snd_rawmidi snd_seq_midi_event btusb snd_seq snd_timer mxm_wmi bluetooth snd_seq_device crc16 snd rfkill joydev crc_itu_t tpm_infineon soundcore snd_page_alloc
[ 6702.613654] Pid: 11391, comm: kworker/u:9 Tainted: G W 3.1.0+ #42
[ 6702.613661] Call Trace:
[ 6702.613678] [<ffffffff8103843d>] warn_slowpath_common+0x83/0x9b
[ 6702.613689] [<ffffffff8103846f>] warn_slowpath_null+0x1a/0x1c
[ 6702.613715] [<ffffffffa01c6139>] rate_control_send_low+0xa5/0x165 [mac80211]
[ 6702.613735] [<ffffffffa02061be>] rs_get_rate+0x146/0x254 [iwlwifi]
[ 6702.613763] [<ffffffffa01c65a1>] rate_control_get_rate+0x86/0x14c [mac80211]
[ 6702.613776] [<ffffffff81406ea7>] ? _raw_spin_unlock_irqrestore+0x25/0x30
] ieee80211_tx_h_rate_ctrl+0x1cb/0x3e4 [mac80211]
[ 6702.613834] [<ffffffffa01d13d7>] ? ieee80211_send_probe_req+0x50/0x58 [mac80211]
[ 6702.613863] [<ffffffffa01cdab9>] invoke_tx_handlers+0x69/0xf5 [mac80211]
[ 6702.613892] [<ffffffffa01cdcc2>] ieee80211_tx+0x7f/0xaf [mac80211]
[ 6702.613922] [<ffffffffa01ce162>] ieee80211_xmit+0x89/0x97 [mac80211]
[ 6702.613950] [<ffffffffa01ced00>] ieee80211_tx_skb+0x57/0x5f [mac80211]
[ 6702.613976] [<ffffffffa01c0088>] ieee80211_send_nullfunc+0x5f/0x64 [mac80211]
[ 6702.613999] [<ffffffffa01bc349>] ieee80211_offchannel_return+0x94/0x1a5 [mac80211]
[ 6702.614012] [<ffffffff8104494f>] ? mod_timer+0x90/0x99
[ 6702.614038] [<ffffffffa01c3de0>] ieee80211_work_work+0xf9c/0x105d [mac80211]
[ 6702.614049] [<ffffffff8100074b>] ? __unlazy_fpu.part.3+0x9/0x61
[ 6702.614059] [<ffffffff81000d06>] ? __switch_to+0xd3/0x200
[ 6702.614070] [<ffffffff8102b30c>] ? need_resched+0x23/0x2d
[ 6702.614096] [<ffffffffa01c2e44>] ? free_work+0x19/0x19 [mac80211]
[ 6702.614108] [<ffffffff8104dd60>] process_one_work+0x17b/0x2bd
[ 6702.614118] [<ffffffff8104c3a5>] ? need_to_create_worker+0x12/0x26
[ 6702.614129] [<ffffffff8104ee5b>] worker_thread+0xdb/0x15f
[ 6702.614138] [<ffffffff8104ed80>] ? manage_workers.isra.24+0x171/0x171
[ 6702.614149] [<ffffffff81052541>] kthread+0x84/0x8c
[ 6702.614160] [<ffffffff81408a14>] kernel_thread_helper+0x4/0x10
[ 6702.614172] [<ffffffff810524bd>] ? kthread_worker_fn+0x148/0x148
[ 6702.614181] [<ffffffff81408a10>] ? gs_change+0xb/0xb

Networkanager[2263]: <info> (wlan0): supplicant interface state: completed -> authenticating
[ 6702.812109] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
[ 6703.012130] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
[ 6703.212113] wlan0: direct probe to 00:24:c4:ab:bd:e0 timed out

and another few (3) WARNING in include/net/mac80211.h:3570 rate_control_send_low+0xa5/0x165

and the final hickup:
[ 6712.366859] wlan0: deauthenticating from 00:24:c4:ab:bd:ef by local choice (reason=2)
[ 6712.366964] iwlwifi 0000:06:00.0: Stopping AGG while state not ON or starting for 0 on 0 (0)
wpa_supplicant[2339]: Trying to authenticate with 00:24:c4:ab:bd:e0 (SSID='XXXXXXXX' freq=2412 MHz)
wpa_supplicant[2339]: CTRL-EVENT-DISCONNECTED bssid=00:24:c4:ab:bd:e0 reason=2
[ 6712.381957] cfg80211: Calling CRDA to update world regulatory domain

After that NM succeeds and the connections seems to be stable.

I have no idea if that helps you in any way?!

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
BALLYCUMBER
One of the six half-read books lying somewhere in your bed.
--- Douglas Adams, The Meaning of Liff

2011-11-25 12:31:16

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Fri, Nov 25, 2011 at 14:21, Norbert Preining <[email protected]> wrote:
> Hi Emmanuel,
>
> On Do, 24 Nov 2011, Emmanuel Grumbach wrote:
>> Please apply the patch attached on top of the one I previously sent.
>> Let me know I you have trouble with patch.
>
> The patch you sent was not against kernel git, but I moved the
> debug statement from iwl-mac80211.c to iwl-agn.c (where I guess is
> the right place) and applied the rest of the patch.

Thanks for doing the porting job :-)

>
>> Please also enable the MAC80211_HT_DEBUG compilation flag
>> I also the (HT + TXQUEUES) debug flag of the driver: modprobe iwlwifi
>> debug=0x84000000
>
> Also activated.
>

cool

> Here my very very small theory, but that is just guessing: Could it
> be related to the time being adjusted by ntp or something? I have the
> feeling it works for a bit, then hangs. In this little bit it might
> be that ntp updates the time of the laptop. That would explain why
> on a full normal reboot it most of the times work, but when the laptop
> sleeps for long time, or is turned of for long time, then it is
> getting problems. But this is surely just a stupid theory.
>

Well.. All this is really Layer 2 related, and even lower. It is a
kind of bad handshake between the driver / HW / mac80211. So I
wouldn't think that NTP may hurt here. Although one should never
underestimate the damage a piece of software can do....

> Here we go with the debug output
>
>
> [ 4592.437269] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
> [ 4592.441080] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
> [ 4592.482332] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 0
> [ 4592.482346] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 0 on FIFO 3
> [ 4592.482349] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 1
> [ 4592.482362] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 1 on FIFO 2
> [ 4592.482365] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 2
> [ 4592.482377] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 2 on FIFO 1
> [ 4592.482380] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 3
> [ 4592.482392] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 3 on FIFO 0
> [ 4592.482395] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 4
> [ 4592.482407] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 4 on FIFO 7
> [ 4592.482410] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 5
> [ 4592.482412] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 6
> [ 4592.482415] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 7
> [ 4592.482417] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 8
> [ 4592.482419] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 9
> [ 4592.482422] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 10
> [ 4600.139699] wlan0: authenticate with 00:0a:79:eb:56:10 (try 1)
> [ 4600.142190] wlan0: authenticated
> [ 4600.148081] wlan0: associate with 00:0a:79:eb:56:10 (try 1)
> [ 4600.153962] wlan0: RX AssocResp from 00:0a:79:eb:56:10 (capab=0x411 status=0 aid=2)
> [ 4600.153972] wlan0: associated
> [ 4601.280512] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action A-MPDU action on addr 00:0a:79:eb:56:10 tid 0
> [ 4601.280525] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action start Rx
> [ 4616.622380] iwlwifi 0000:06:00.0: I rs_tl_turn_on_agg_for_tid Starting Tx agg: STA: 00:0a:79:eb:56:10 tid: 0
> [ 4616.652083] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action A-MPDU action on addr 00:0a:79:eb:56:10 tid 0
> [ 4616.652087] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action start Tx
> [ 4616.652090] iwlwifi 0000:06:00.0: U iwlagn_tx_agg_start TX AGG request on ra = 00:0a:79:eb:56:10 tid = 0
> [ 4616.652238] iwlwifi 0000:06:00.0: U iwlagn_txq_ctx_activate_free Activate txq 11
> [ 4616.652241] iwlwifi 0000:06:00.0: U iwl_trans_pcie_tx_agg_alloc HW queue is empty
> [ 4616.652244] iwlwifi 0000:06:00.0: U iwl_start_tx_ba_trans_ready Telling mac start BA ready tid: 0
> [ 4617.652070] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action A-MPDU action on addr 00:0a:79:eb:56:10 tid 0
> [ 4617.652075] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action stop Tx
> [ 4617.652077] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action Stop Tx agg for tid 0
> [ 4617.652081] iwlwifi 0000:06:00.0: U iwl_trans_pcie_tx_agg_disable HW queue is empty
> [ 4617.652139] iwlwifi 0000:06:00.0: U iwl_trans_pcie_txq_agg_disable Disable AGG txq_id 11
> [ 4617.652192] iwlwifi 0000:06:00.0: U iwl_txq_ctx_deactivate Deactivate txq 11
> [ 4617.652210] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Deactivate AC/CMD Queue 11 on FIFO 0
> [ 4617.652213] iwlwifi 0000:06:00.0: U iwl_stop_tx_ba_trans_ready Telling mac stop BA ready tid: 0
> [ 4621.013744] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action A-MPDU action on addr 00:0a:79:eb:56:10 tid 0
> [ 4621.013760] iwlwifi 0000:06:00.0: U iwl_trans_pcie_tx_agg_setup Setup AGG for tid 0
> [ 4621.013916] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Deactivate BA Queue 11 on FIFO 1
> [ 4621.013925] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action priv->agg_tids_count = 1
> [ 4621.013937] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0


Thanks

I am already on week-end here and will look at it more carefully
later. First impression is that things are messy here. I guess we are
getting in a bad failure path that wasn't checked...


> [ 4621.013954] ------------[ cut here ]------------
> [ 4621.013991] WARNING: at drivers/net/wireless/iwlwifi/iwl-trans-pcie.c:1106 iwl_trans_pcie_tx+0x180/0x668 [iwlwifi]()
> [ 4621.013999] Hardware name: VGN-Z11VN_B
> [ 4621.014004] Modules linked in: usb_storage rfcomm bnep vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) snd_hrtimer binfmt_misc dm_crypt dm_mod isofs btrfs zlib_deflate crc32c libcrc32c vfat fat fuse loop uinput snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss arc4 snd_pcm iwlwifi mac80211 snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event btusb sony_laptop(O) bluetooth snd_seq snd_timer snd_seq_device snd soundcore cfg80211 firewire_ohci firewire_core mxm_wmi joydev rfkill tpm_infineon snd_page_alloc crc16 crc_itu_t
> [ 4621.014134] Pid: 6169, comm: kworker/u:44 Tainted: G ? ? ? ?W ?O 3.2.0-rc3+ #2
> [ 4621.014140] Call Trace:
> [ 4621.014145] ?<IRQ> ?[<ffffffff8103849d>] warn_slowpath_common+0x83/0x9b
> [ 4621.014171] ?[<ffffffff810384cf>] warn_slowpath_null+0x1a/0x1c
> [ 4621.014194] ?[<ffffffffa0204f4d>] iwl_trans_pcie_tx+0x180/0x668 [iwlwifi]
> [ 4621.014208] ?[<ffffffff810d3861>] ? kmem_cache_alloc+0x44/0xb9
> [ 4621.014229] ?[<ffffffffa01eca7d>] iwlagn_tx_skb+0x87b/0x91b [iwlwifi]
> [ 4621.014247] ?[<ffffffffa01e2fea>] iwlagn_mac_tx+0x131/0x1a2 [iwlwifi]
> [ 4621.014268] ?[<ffffffffa01c1cc6>] ? ieee80211_tx_h_fragment+0x16/0x22c [mac80211]
> [ 4621.014296] ?[<ffffffffa01b6000>] __ieee80211_tx+0x176/0x1cf [mac80211]
> [ 4621.014314] ?[<ffffffffa01c1c38>] ? ieee80211_tx_h_calculate_duration+0x4c/0x65 [mac80211]
> [ 4621.014344] ?[<ffffffffa01b6dc6>] ieee80211_tx+0x97/0xaf [mac80211]
> [ 4621.014374] ?[<ffffffffa01b7cc2>] ieee80211_tx_pending+0xf0/0x1c3 [mac80211]
> [ 4621.014386] ?[<ffffffff8103dc16>] tasklet_action+0x77/0xc2
> [ 4621.014395] ?[<ffffffff8103dd57>] __do_softirq+0xbc/0x1a5
> [ 4621.014408] ?[<ffffffff8140b24c>] call_softirq+0x1c/0x30
> [ 4621.014413] ?<EOI> ?[<ffffffff8100359e>] do_softirq+0x38/0x6e
> [ 4621.014430] ?[<ffffffff8103da1d>] _local_bh_enable_ip.isra.12+0x7d/0xa0
> [ 4621.014439] ?[<ffffffff8103da4e>] local_bh_enable_ip+0xe/0x10
> [ 4621.014448] ?[<ffffffff81409462>] _raw_spin_unlock_bh+0x23/0x25
> [ 4621.014471] ?[<ffffffffa01a5c6c>] ieee80211_agg_tx_operational+0x99/0xa4 [mac80211]
> [ 4621.014496] ?[<ffffffffa01a67bd>] ieee80211_process_addba_resp+0xb8/0xf3 [mac80211]
> [ 4621.014508] ?[<ffffffff81064840>] ? do_raw_spin_trylock+0x18/0x2a
> [ 4621.014534] ?[<ffffffffa01adada>] ieee80211_iface_work+0x130/0x2b5 [mac80211]
> [ 4621.014560] ?[<ffffffffa01ad9aa>] ? ieee80211_teardown_sdata+0xcc/0xcc [mac80211]
> [ 4621.014572] ?[<ffffffff8104de54>] process_one_work+0x17b/0x2bd
> [ 4621.014583] ?[<ffffffff8104c499>] ? need_to_create_worker+0x12/0x26
> [ 4621.014593] ?[<ffffffff8104ef4f>] worker_thread+0xdb/0x15f
> [ 4621.014602] ?[<ffffffff8104ee74>] ? manage_workers.isra.24+0x171/0x171
> [ 4621.014612] ?[<ffffffff81052639>] kthread+0x84/0x8c
> [ 4621.014622] ?[<ffffffff8140b154>] kernel_thread_helper+0x4/0x10
> [ 4621.014633] ?[<ffffffff810525b5>] ? kthread_worker_fn+0x148/0x148
> [ 4621.014642] ?[<ffffffff8140b150>] ? gs_change+0xb/0xb
> [ 4621.014649] ---[ end trace 81df68812429b16c ]---
> [ 4622.112804] iwlwifi 0000:06:00.0: I rs_tl_turn_on_agg_for_tid Starting Tx agg: STA: 00:0a:79:eb:56:10 tid: 0
> [ 4622.124104] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action A-MPDU action on addr 00:0a:79:eb:56:10 tid 0
> [ 4622.124116] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action start Tx
> [ 4622.124126] iwlwifi 0000:06:00.0: U iwlagn_tx_agg_start TX AGG request on ra = 00:0a:79:eb:56:10 tid = 0
> [ 4622.124304] iwlwifi 0000:06:00.0: U iwlagn_txq_ctx_activate_free Activate txq 11
> [ 4622.124315] iwlwifi 0000:06:00.0: U iwl_trans_pcie_tx_agg_alloc HW queue is NOT empty: 21 packets in HW queue
> [ 4630.040070] iwlwifi 0000:06:00.0: Queue 11 stuck for 10000 ms.
> [ 4630.040081] iwlwifi 0000:06:00.0: Current SW read_ptr 242 write_ptr 8
> [ 4630.040136] iwlwifi 0000:06:00.0: Current HW read_ptr 242 write_ptr 8
> [ 4630.040144] iwlwifi 0000:06:00.0: On demand firmware reload
> [ 4630.040587] ieee80211 phy0: Hardware restart was requested
> [ 4630.040698] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
> [ 4630.043703] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
> [ 4630.086602] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 0
> [ 4630.086624] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 0 on FIFO 3
> [ 4630.086634] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 1
> [ 4630.086653] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 1 on FIFO 2
> [ 4630.086662] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 2
> [ 4630.086681] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 2 on FIFO 1
> [ 4630.086690] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 3
> [ 4630.086709] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 3 on FIFO 0
> [ 4630.086717] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 4
> [ 4630.086736] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 4 on FIFO 7
> [ 4630.086745] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 5
> [ 4630.086752] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 6
> [ 4630.086759] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 7
> [ 4630.086767] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 8
> [ 4630.086774] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 9
> [ 4630.086781] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 10
> [ 4630.157921] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action A-MPDU action on addr 00:0a:79:eb:56:10 tid 0
> [ 4630.157933] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action stop Tx
> [ 4630.157942] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action Stop Tx agg for tid 0
> [ 4630.157951] iwlwifi 0000:06:00.0: U iwl_trans_pcie_tx_agg_disable AGG stop before setup done
> [ 4630.157992] iwlwifi 0000:06:00.0: U iwl_trans_pcie_txq_agg_disable Disable AGG txq_id 11
> [ 4630.158073] iwlwifi 0000:06:00.0: U iwl_txq_ctx_deactivate Deactivate txq 11
> [ 4630.158113] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Deactivate AC/CMD Queue 11 on FIFO 0
> [ 4630.158123] iwlwifi 0000:06:00.0: U iwl_stop_tx_ba_trans_ready Telling mac stop BA ready tid: 0
> [ 4630.158136] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action priv->agg_tids_count = 0
> [ 4630.158148] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action A-MPDU action on addr 00:0a:79:eb:56:10 tid 0
> [ 4630.158157] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action stop Rx
> [ 4630.196845] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action A-MPDU action on addr 00:0a:79:eb:56:10 tid 0
> [ 4630.196857] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action start Rx
> [ 4632.759183] iwlwifi 0000:06:00.0: I rs_tl_turn_on_agg_for_tid Starting Tx agg: STA: 00:0a:79:eb:56:10 tid: 0
> [ 4632.812124] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action A-MPDU action on addr 00:0a:79:eb:56:10 tid 0
> [ 4632.812136] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action start Tx
> [ 4632.812145] iwlwifi 0000:06:00.0: U iwlagn_tx_agg_start TX AGG request on ra = 00:0a:79:eb:56:10 tid = 0
> [ 4632.812326] iwlwifi 0000:06:00.0: U iwlagn_txq_ctx_activate_free Activate txq 11
> [ 4632.812337] iwlwifi 0000:06:00.0: U iwl_trans_pcie_tx_agg_alloc HW queue is NOT empty: 21 packets in HW queue
> [ 4634.916847] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action A-MPDU action on addr 00:0a:79:eb:56:10 tid 6
> [ 4634.916860] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action start Rx
> [ 4644.716439] iwlwifi 0000:06:00.0: I rs_tl_turn_on_agg_for_tid Starting Tx agg: STA: 00:0a:79:eb:56:10 tid: 6
> [ 4644.764120] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action A-MPDU action on addr 00:0a:79:eb:56:10 tid 6
> [ 4644.764132] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action start Tx
> [ 4644.764142] iwlwifi 0000:06:00.0: U iwlagn_tx_agg_start TX AGG request on ra = 00:0a:79:eb:56:10 tid = 6
> [ 4644.764306] iwlwifi 0000:06:00.0: U iwlagn_txq_ctx_activate_free Activate txq 12
> [ 4644.764315] iwlwifi 0000:06:00.0: U iwl_trans_pcie_tx_agg_alloc HW queue is empty
> [ 4644.764325] iwlwifi 0000:06:00.0: U iwl_start_tx_ba_trans_ready Telling mac start BA ready tid: 6
> [ 4644.766328] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action A-MPDU action on addr 00:0a:79:eb:56:10 tid 6
> [ 4644.766339] iwlwifi 0000:06:00.0: U iwl_trans_pcie_tx_agg_setup Setup AGG for tid 6
> [ 4644.766494] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate BA Queue 12 on FIFO 3
> [ 4644.766503] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action priv->agg_tids_count = 1
> [ 4644.766549] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 6
> [ 4703.713625] iwlwifi 0000:06:00.0: I iwlagn_rx_reply_compressed_ba agg frames sent:2, acked:2
>
>
> Best wishes
>
> Norbert
> ------------------------------------------------------------------------
> Norbert Preining ? ? ? ? ? ?preining@{jaist.ac.jp, logic.at, debian.org}
> JAIST, Japan ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? TeX Live & Debian Developer
> DSA: 0x09C5B094 ? fp: 14DF 2E6C 0307 BE6D AD76 ?A9C0 D2BF 4AA3 09C5 B094
> ------------------------------------------------------------------------
> BRISBANE
> A perfectly reasonable explanation (Such as the one offered by a
> person with a gurgling cough which has nothing to do with the fact
> that they smoke fifty cigarettes a day.)
> ? ? ? ? ? ? ? ? ? ? ? ?--- Douglas Adams, The Meaning of Liff
>

2011-11-08 00:31:06

by Wey-Yi Guy

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

Hi Norbert,

Could you please try with this patch?

Thanks
Wey

On Mon, 2011-11-07 at 04:38 -0800, Norbert Preining wrote:
> Hi Wey, hi all,
>
> On Fr, 04 Nov 2011, Norbert Preining wrote:
> > > after the firmware reloaded, is the traffic resume? or it is continuous
> > > without traffic?
> >
> > Ok, I compiled a new kernel from todays git, and I see that there
> > are kernel bugs, but after the hardware restart traffic gets through.
>
> Well, not always actually. I just had *at*home*, were it used to work
> without any flaws till 3.1, I saw that after resume it worked for a
> short time, then the queue got stuck, then not even forcing a reset via
> the /ss/kernel/debug/..../iwlwifi/debug/force_reset did work. Only
> an unload and reload of the module, followed by a few more hickups,
> restarted the wlan card:
>
> [ 6564.905884] PM: Finishing wakeup.
> [ 6564.905889] Restarting tasks ... done.
> [ 6564.924087] video LNXVIDEO:00: Restoring backlight state
> [ 6566.045358] e1000e 0000:00:19.0: irq 45 for MSI/MSI-X
> [ 6566.100108] e1000e 0000:00:19.0: irq 45 for MSI/MSI-X
> [ 6566.101277] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
> [ 6566.105210] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
> [ 6573.822081] wlan0: authenticate with 00:0a:79:eb:56:10 (try 1)
> [ 6573.829824] wlan0: authenticated
> [ 6573.835543] wlan0: associate with 00:0a:79:eb:56:10 (try 1)
> [ 6573.841527] wlan0: RX AssocResp from 00:0a:79:eb:56:10 (capab=0x411 status=0 aid=2)
> [ 6573.841537] wlan0: associated
>
> Here it worked for a bit, 20secs
>
> [ 6593.731625] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0
> [ 6603.704072] iwlwifi 0000:06:00.0: Queue 11 stuck for 10000 ms.
> [ 6603.704084] iwlwifi 0000:06:00.0: Current read_ptr 249 write_ptr 14
> [ 6603.704092] iwlwifi 0000:06:00.0: On demand firmware reload
> [ 6603.704566] ieee80211 phy0: Hardware restart was requested
> [ 6603.704680] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
> [ 6603.707700] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
>
> Here I forced the reset
>
> [ 6774.772656] iwlwifi 0000:06:00.0: On demand firmware reload
> [ 6774.773868] ieee80211 phy0: Hardware restart was requested
> [ 6774.773958] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
> [ 6774.777053] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
>
> unloading and reloading
>
> [ 6823.909282] wlan0: deauthenticating from 00:0a:79:eb:56:10 by local choice (reason=3)
> [ 6823.976417] cfg80211: Calling CRDA for country: JP
> [ 6824.037466] iwlwifi 0000:06:00.0: PCI INT A disabled
> [ 6830.344831] Intel(R) Wireless WiFi Link AGN driver for Linux, in-tree:d
> [ 6830.344835] Copyright(c) 2003-2011 Intel Corporation
> [ 6830.344941] iwlwifi 0000:06:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
> [ 6830.344974] iwlwifi 0000:06:00.0: setting latency timer to 64
> [ 6830.345174] iwlwifi 0000:06:00.0: pci_resource_len = 0x00002000
> [ 6830.345176] iwlwifi 0000:06:00.0: pci_resource_base = ffffc90000048000
> [ 6830.345179] iwlwifi 0000:06:00.0: HW Revision ID = 0x0
> [ 6830.345292] iwlwifi 0000:06:00.0: irq 46 for MSI/MSI-X
> [ 6830.345369] iwlwifi 0000:06:00.0: Detected Intel(R) WiFi Link 5100 AGN, REV=0x54
> [ 6830.345467] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
> [ 6830.367544] iwlwifi 0000:06:00.0: device EEPROM VER=0x11e, CALIB=0x4
> [ 6830.367547] iwlwifi 0000:06:00.0: Device SKU: 0Xf0
> [ 6830.367557] iwlwifi 0000:06:00.0: Tunable channels: 13 802.11bg, 24 802.11a channels
> [ 6830.425994] iwlwifi 0000:06:00.0: loaded firmware version 8.83.5.1 build 33692 (EXP)
> [ 6830.426214] Registered led device: phy1-led
> [ 6830.426519] ieee80211 phy1: Selected rate control algorithm 'iwl-agn-rs'
>
> more hickups
>
> [ 6830.518627] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
> [ 6830.522336] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
> [ 6830.636966] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
> [ 6830.640324] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
> [ 6838.327066] wlan0: authenticate with 00:0a:79:eb:56:10 (try 1)
> [ 6838.330447] wlan0: authenticated
> [ 6838.335890] wlan0: associate with 00:0a:79:eb:56:10 (try 1)
> [ 6838.341503] wlan0: RX AssocResp from 00:0a:79:eb:56:10 (capab=0x411 status=0 aid=2)
> [ 6838.341507] wlan0: associated
>
> finally got it working, one more final hickup
>
> [ 6860.602423] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0
>
> from now on working
>
> Would be great to fix that for 3.2
>
> Herzliche Grüße
>
> Norbert
> ------------------------------------------------------------------------
> Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
> JAIST, Japan TeX Live & Debian Developer
> DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
> ------------------------------------------------------------------------
> PEEBLES (pl.n.)
> Small, carefully rolled pellets of skegness (q.v.)
> --- Douglas Adams, The Meaning of Liff


Attachments:
[Internal-wifi-devel]_[PATCH]_iwlwifi:_check_the_HW_when_a_queue_is_stuck (4.26 kB)

2011-11-22 08:11:28

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Tue, Nov 22, 2011 at 09:54, Norbert Preining <[email protected]> wrote:
> Hi
>
>
> I wrote ...
>
>>> Does not apply against current git,
>
> ...
>
>>> can't find file to patch at input line 78
>>> Perhaps you used the wrong -p or --strip option?
>>> The text leading up to this was:
>>> --------------------------
>>> |diff --git a/drivers/net/wireless/iwlwifi/iwl-mac80211.c
>
> You answered ...
>
>> Glad it applied
>
> Me scratching my head!?

Sorry, please try again with that one.


Attachments:
QUEUE_DEBUG.patch (12.41 kB)

2011-11-21 03:03:25

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Mi, 16 Nov 2011, Richard Yao wrote:
> Would your university by any chance be using wireless network hardware
> from Aruba Networks?

No, I its Cisco.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
BOTLEY
The prominent stain on a man's trouser crotch seen on his return from
the lavatory. A botley proper is caused by an accident with the push
taps, and should not be confused with any stain caused by insufficient
waggling of the willy.
--- Douglas Adams, The Meaning of Liff

2011-11-08 04:32:43

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

Hi Wey,

On Mo, 07 Nov 2011, Guy, Wey-Yi wrote:
> Could you please try with this patch?

new tests with 3.2.0-rc1, plus the patch you send. Starting from
cold state, so no suspend but hard off.

In short, this is the output of your patch:
[ 215.872090] iwlwifi 0000:06:00.0: Queue 11 stuck for 10000 ms.
[ 215.872101] iwlwifi 0000:06:00.0: Current SW read_ptr 105 write_ptr 115
[ 215.872156] iwlwifi 0000:06:00.0: Current HW read_ptr 105 write_ptr 115
[ 215.872164] iwlwifi 0000:06:00.0: On demand firmware reload


And here is the rest. As before, first work for short time, then WARNING
and hang, needs full reset with force_reset, finally works.

[ 41.881266] Intel(R) Wireless WiFi Link AGN driver for Linux, in-tree:d
[ 41.882703] Copyright(c) 2003-2011 Intel Corporation
[ 41.884190] iwlwifi 0000:06:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[ 41.885578] iwlwifi 0000:06:00.0: setting latency timer to 64
[ 41.885652] iwlwifi 0000:06:00.0: pci_resource_len = 0x00002000
[ 41.887041] iwlwifi 0000:06:00.0: pci_resource_base = ffffc9001496c000
[ 41.888403] iwlwifi 0000:06:00.0: HW Revision ID = 0x0
[ 41.889885] iwlwifi 0000:06:00.0: irq 46 for MSI/MSI-X
[ 41.889963] iwlwifi 0000:06:00.0: Detected Intel(R) WiFi Link 5100 AGN, REV=0x54
[ 41.891401] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
[ 41.914539] iwlwifi 0000:06:00.0: device EEPROM VER=0x11e, CALIB=0x4
[ 41.915880] iwlwifi 0000:06:00.0: Device SKU: 0Xf0
[ 41.917237] iwlwifi 0000:06:00.0: Tunable channels: 13 802.11bg, 24 802.11a channels
[ 41.922968] iwlwifi 0000:06:00.0: loaded firmware version 8.83.5.1 build 33692 (EXP)
[ 41.924485] Registered led device: phy0-led
[ 41.969919] ieee80211 phy0: Selected rate control algorithm 'iwl-agn-rs'
[...]
[ 47.917692] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
[ 47.923032] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
[ 48.036928] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
[ 48.041410] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
[...]
[ 55.646391] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
[ 55.844660] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
[ 56.044092] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
[ 56.244096] wlan0: direct probe to 00:24:c4:ab:bd:e0 timed out
[ 64.277546] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
[ 64.476076] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
[ 64.676108] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
[ 64.876109] wlan0: direct probe to 00:24:c4:ab:bd:e0 timed out
[ 72.841928] wlan0: authenticate with 00:24:c4:ab:bd:ef (try 1)
[ 73.040056] wlan0: authenticate with 00:24:c4:ab:bd:ef (try 2)
[ 73.040718] wlan0: authenticated
[ 73.047334] wlan0: associate with 00:24:c4:ab:bd:ef (try 1)
[ 73.244070] wlan0: associate with 00:24:c4:ab:bd:ef (try 2)
[ 73.246336] wlan0: RX AssocResp from 00:24:c4:ab:bd:ef (capab=0x1 status=0 aid=3)
[ 73.246339] wlan0: associated

Up to here it probably worked, but I cannot guarantee, that was while
gnome3 was working its way through tons of initialization.

[ 81.628826] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:24:c4:ab:bd:ef tid = 0
[ 106.547117] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
[ 106.744017] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
[ 106.944087] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
[ 107.148023] wlan0: direct probe to 00:24:c4:ab:bd:e0 timed out
[ 118.254103] wlan0: deauthenticating from 00:24:c4:ab:bd:ef by local choice (reason=2)
[ 118.265841] cfg80211: Calling CRDA for country: JP
[ 118.276060] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
[ 118.476081] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
[ 118.676032] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
[ 118.876024] wlan0: direct probe to 00:24:c4:ab:bd:e0 timed out
[ 126.839944] wlan0: authenticate with 00:24:c4:ab:bd:ef (try 1)
[ 126.840763] wlan0: authenticated
[ 126.841851] wlan0: associate with 00:24:c4:ab:bd:ef (try 1)
[ 126.844961] wlan0: RX ReassocResp from 00:24:c4:ab:bd:ef (capab=0x1 status=0 aid=3)
[ 126.844965] wlan0: associated
[ 136.116943] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:24:c4:ab:bd:ef tid = 0
[ 146.550560] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
[ 146.748058] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
[ 146.948062] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
[ 147.148052] wlan0: direct probe to 00:24:c4:ab:bd:e0 timed out
[ 158.039627] wlan0: deauthenticating from 00:24:c4:ab:bd:ef by local choice (reason=2)
[ 158.081078] cfg80211: Calling CRDA to update world regulatory domain
[ 158.085453] wlan0: authenticate with 00:24:c4:ab:bd:ef (try 1)
[ 158.088031] wlan0: authenticated
[ 158.092327] wlan0: associate with 00:24:c4:ab:bd:ef (try 1)
[ 158.094338] wlan0: RX ReassocResp from 00:24:c4:ab:bd:ef (capab=0x1 status=0 aid=3)
[ 158.094341] wlan0: associated
[ 206.605041] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:24:c4:ab:bd:ef tid = 0
[ 206.605062] ------------[ cut here ]------------
[ 206.605101] WARNING: at drivers/net/wireless/iwlwifi/iwl-trans-pcie.c:1105 iwl_trans_pcie_tx+0x180/0x661 [iwlwifi]()
[ 206.605110] Hardware name: VGN-Z11VN_B
[ 206.605114] Modules linked in: rfcomm bnep vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) snd_hrtimer binfmt_misc dm_crypt dm_mod isofs btrfs zlib_deflate crc32c libcrc32c vfat fat fuse loop uinput snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss arc4 snd_pcm iwlwifi snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event mac80211 snd_seq sony_laptop(O) snd_timer snd_seq_device snd cfg80211 btusb bluetooth mxm_wmi firewire_ohci rfkill crc16 joydev soundcore snd_page_alloc firewire_core crc_itu_t tpm_infineon
[ 206.605244] Pid: 717, comm: kworker/u:4 Tainted: G O 3.2.0-rc1+ #44
[ 206.605250] Call Trace:
[ 206.605255] <IRQ> [<ffffffff81038475>] warn_slowpath_common+0x83/0x9b
[ 206.605281] [<ffffffff810384a7>] warn_slowpath_null+0x1a/0x1c
[ 206.605305] [<ffffffffa01fdaf7>] iwl_trans_pcie_tx+0x180/0x661 [iwlwifi]
[ 206.605328] [<ffffffffa01e72b2>] ? iwlagn_temperature+0x1a/0x1c [iwlwifi]
[ 206.605341] [<ffffffff810d381d>] ? kmem_cache_alloc+0x44/0xb9
[ 206.605362] [<ffffffffa01e5a1d>] iwlagn_tx_skb+0x87b/0x91b [iwlwifi]
[ 206.605376] [<ffffffff81408758>] ? _raw_spin_unlock_irqrestore+0x2e/0x30
[ 206.605393] [<ffffffffa01dbf8c>] iwlagn_mac_tx+0x131/0x1a2 [iwlwifi]
[ 206.605419] [<ffffffffa019cc56>] ? ieee80211_tx_h_fragment+0x16/0x22c [mac80211]
[ 206.605444] [<ffffffffa01feb13>] ? iwl_rx_queue_update_write_ptr+0x138/0x141 [iwlwifi]
[ 206.605474] [<ffffffffa0190f94>] __ieee80211_tx+0x176/0x1cf [mac80211]
[ 206.605494] [<ffffffffa019cbc8>] ? ieee80211_tx_h_calculate_duration+0x4c/0x65 [mac80211]
[ 206.605524] [<ffffffffa0191d5a>] ieee80211_tx+0x97/0xaf [mac80211]
[ 206.605556] [<ffffffffa0192c56>] ieee80211_tx_pending+0xf0/0x1c3 [mac80211]
[ 206.605568] [<ffffffff8103dbee>] tasklet_action+0x77/0xc2
[ 206.605578] [<ffffffff8103dd2f>] __do_softirq+0xbc/0x1a5
[ 206.605590] [<ffffffff8140a3cc>] call_softirq+0x1c/0x30
[ 206.605596] <EOI> [<ffffffff8100359e>] do_softirq+0x38/0x6e
[ 206.605613] [<ffffffff8103d9f5>] _local_bh_enable_ip.isra.12+0x7d/0xa0
[ 206.605623] [<ffffffff8103da26>] local_bh_enable_ip+0xe/0x10
[ 206.605632] [<ffffffff814085ca>] _raw_spin_unlock_bh+0x23/0x25
[ 206.605657] [<ffffffffa0180c64>] ieee80211_agg_tx_operational+0x99/0xa4 [mac80211]
[ 206.605683] [<ffffffffa01817b2>] ieee80211_process_addba_resp+0xb8/0xf2 [mac80211]
[ 206.605697] [<ffffffff81064840>] ? in_lock_functions+0x1a/0x1d
[ 206.605724] [<ffffffffa0188a82>] ieee80211_iface_work+0x130/0x2b5 [mac80211]
[ 206.605751] [<ffffffffa0188952>] ? ieee80211_teardown_sdata+0xcc/0xcc [mac80211]
[ 206.605764] [<ffffffff8104de2c>] process_one_work+0x17b/0x2bd
[ 206.605774] [<ffffffff8104c471>] ? need_to_create_worker+0x12/0x26
[ 206.605785] [<ffffffff8104ef27>] worker_thread+0xdb/0x15f
[ 206.605795] [<ffffffff8104ee4c>] ? manage_workers.isra.24+0x171/0x171
[ 206.605806] [<ffffffff8105260d>] kthread+0x84/0x8c
[ 206.605817] [<ffffffff8140a2d4>] kernel_thread_helper+0x4/0x10
[ 206.605829] [<ffffffff81052589>] ? kthread_worker_fn+0x148/0x148
[ 206.605839] [<ffffffff8140a2d0>] ? gs_change+0xb/0xb
[ 206.605847] ---[ end trace b36eae6e58c8ed31 ]---
[ 206.614587] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
[ 206.812155] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
[ 207.012096] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
[ 207.212105] wlan0: direct probe to 00:24:c4:ab:bd:e0 timed out

Here comes the output of the patch you sent:

[ 215.872090] iwlwifi 0000:06:00.0: Queue 11 stuck for 10000 ms.
[ 215.872101] iwlwifi 0000:06:00.0: Current SW read_ptr 105 write_ptr 115
[ 215.872156] iwlwifi 0000:06:00.0: Current HW read_ptr 105 write_ptr 115
[ 215.872164] iwlwifi 0000:06:00.0: On demand firmware reload

echo 1 onto /sys/kernel/debug/...../force_reset

[ 215.872600] ieee80211 phy0: Hardware restart was requested
[ 215.872815] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
[ 215.875847] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
[ 215.927777] wlan0: deauthenticating from 00:24:c4:ab:bd:ef by local choice (reason=2)
[ 215.960954] cfg80211: Calling CRDA to update world regulatory domain
[ 215.972261] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 1/3)
[ 216.172107] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 2/3)
[ 216.372070] wlan0: direct probe to 00:24:c4:ab:bd:e0 (try 3/3)
[ 216.572123] wlan0: direct probe to 00:24:c4:ab:bd:e0 timed out
[ 221.368107] usb 8-2: USB disconnect, device number 2
[ 224.604953] wlan0: authenticate with 00:24:c4:ab:bd:ef (try 1)
[ 224.605664] wlan0: authenticated
[ 224.606873] wlan0: associate with 00:24:c4:ab:bd:ef (try 1)
[ 224.608866] wlan0: RX ReassocResp from 00:24:c4:ab:bd:ef (capab=0x1 status=0 aid=3)
[ 224.608869] wlan0: associated
[ 329.901051] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:24:c4:ab:bd:ef tid = 6
[ 332.477994] iwlwifi 0000:06:00.0: On demand firmware reload
[ 332.478449] ieee80211 phy0: Hardware restart was requested
[ 332.478532] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
[ 332.481587] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
[ 416.866720] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:24:c4:ab:bd:ef tid = 0
[ 416.932995] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:24:c4:ab:bd:ef tid = 6

from here on it started to work.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
BOOK What to do if you find yourself stuck in a crack in the ground
underneath a giant boulder you can't move, with no hope of
rescue. Consider how lucky you are that life has been good to
you so far. Alternatively, if life hasn't been good to you so
far, which given your current circumstances seems more likely,
consider how lucky you are that it won't be troubling you much
longer.
--- Comforting advice for Ford and Arthur in this current
--- situation, Fit the Eighth.
--- Douglas Adams, The Hitchhikers Guide to the Galaxy

2011-11-25 12:37:27

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Fr, 25 Nov 2011, Emmanuel Grumbach wrote:
> Well.. All this is really Layer 2 related, and even lower. It is a

Yeah, that was my impression too, but still, I wanted to mention it.

> I am already on week-end here and will look at it more carefully
> later. First impression is that things are messy here. I guess we are
> getting in a bad failure path that wasn't checked...

No problem, I can cope with it.

Generally after unloading nad reloading the module 1-2 times it gets
stable. Don't ask me why. No hurry

Have a nice weekend

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
WEEM (n.)
The tools with which a dentist can inflict the greatest
pain. Formerly, which tool this was dependent upon the imagination and
skill of the individual dentist, though now, with technological
advances, weems can be bought specially.
--- Douglas Adams, The Meaning of Liff

2011-11-09 05:12:02

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

Hi Emmanuel,

On Di, 08 Nov 2011, Emmanuel Grumbach wrote:
> I would like to see what happens in terms or power.
> Please set the power debug flag: modprobe iwlwifi debug=0x100

> It may be worth to disable Link Power save too. I would be glad if you
> could try this patch (attached and inlined), with the power debug flag
> set.

With both done, I see things like that:
[ 690.597221] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
[ 690.601979] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
[ 690.644716] iwlwifi 0000:06:00.0: U iwl_power_sleep_cam_cmd Sleep command for CAM
[ 690.644721] iwlwifi 0000:06:00.0: U iwl_set_power Sending power/sleep command
[ 690.644724] iwlwifi 0000:06:00.0: U iwl_set_power Flags value = 0x00000000
[ 690.644726] iwlwifi 0000:06:00.0: U iwl_set_power Tx timeout = 0
[ 690.644728] iwlwifi 0000:06:00.0: U iwl_set_power Rx timeout = 0
[ 690.644731] iwlwifi 0000:06:00.0: U iwl_set_power Sleep interval vector = { 0 , 0 , 0 , 0 , 0 }
[ 690.645649] iwlwifi 0000:06:00.0: U iwl_power_sleep_cam_cmd Sleep command for CAM

Does this help you, or do you need more?

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
SNITTER (n.)
One of the rather unfunny newspaper clippings pinned to an office
wall, the humour of which is supposed to derive from the fact that the
headline contains a name similar to that of one of the occupants to
the office.
--- Douglas Adams, The Meaning of Liff

2011-11-08 08:24:20

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Tue, Nov 8, 2011 at 10:07, Emmanuel Grumbach <[email protected]> wrote:
>>
>> In short, this is the output of your patch:
>> [ ?215.872090] iwlwifi 0000:06:00.0: Queue 11 stuck for 10000 ms.
>> [ ?215.872101] iwlwifi 0000:06:00.0: Current SW read_ptr 105 write_ptr 115
>> [ ?215.872156] iwlwifi 0000:06:00.0: Current HW read_ptr 105 write_ptr 115
>> [ ?215.872164] iwlwifi 0000:06:00.0: On demand firmware reload
>>
>>
> Ok thanks for that. The HW and the SW agree then...
>
> I would like to see what happens in terms or power.
> Please set the power debug flag: modprobe iwlwifi debug=0x100
>
> Thanks.
>

It may be worth to disable Link Power save too. I would be glad if you
could try this patch (attached and inlined), with the power debug flag
set.


diff --git a/drivers/net/wireless/iwlwifi/iwl-power.c
b/drivers/net/wireless/iwlwifi/iwl-power.c
index 4eaab20..e5f309b 100644
--- a/drivers/net/wireless/iwlwifi/iwl-power.c
+++ b/drivers/net/wireless/iwlwifi/iwl-power.c
@@ -436,7 +436,7 @@ int iwl_power_update_mode(struct iwl_priv *priv, bool force)
/* initialize to default */
void iwl_power_initialize(struct iwl_priv *priv)
{
- priv->power_data.bus_pm = bus_get_pm_support(bus(priv));
+ priv->power_data.bus_pm = false;

priv->power_data.debug_sleep_level_override = -1;



Thanks for your patience !

2011-11-16 13:40:11

by Richard Yao

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

Dear Norbert,

Would your university by any chance be using wireless network hardware
from Aruba Networks?

A network engineer responsible for maintaining my university's network
emailed me yesterday saying that Aruba Networks has had engineers on
campus working on our wireless problems. They did a firmware update
last Thursday to address some issues, which appears to have resulted
in substantial improvements in the time it takes to connect to the
network, network bandwidth and connection reliability.

It seems that Aruba Networks has its own proprietary firmware, so we
have had to suffer for their duplication of effort. That is not to say
that the iwlagn driver is perfect (e.g. it is missing auto settings
for rts and frag), but this seems like this is a plausible explanation
for why you have issues at your university's wireless equipment while
you have no issues with your wireless router at home.

It is also probably a reason for organizations to insist on having
fully open source software in their hardware, which is something I
explained in an email to the person who decided to purchase hardware
from Aruba Networks last night. Speaking of which, it would be nice if
Intel open sourced iwlwifi-6000-4.ucode. That way improvements in it
would not be solely dependent on our friends at Intel, whose time to
work on these things is likely tied to decisions by Intel's
management.

Yours truly,
Richard Yao

On Wed, Nov 9, 2011 at 12:11 AM, Norbert Preining <[email protected]> wrote:
> Hi Emmanuel,
>
> On Di, 08 Nov 2011, Emmanuel Grumbach wrote:
>> I would like to see what happens in terms or power.
>> Please set the power debug flag: modprobe iwlwifi debug=0x100
>
>> It may be worth to disable Link Power save too. I would be glad if you
>> could try this patch (attached and inlined), with the power debug flag
>> set.
>
> With both done, I see things like that:
> [ ?690.597221] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
> [ ?690.601979] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
> [ ?690.644716] iwlwifi 0000:06:00.0: U iwl_power_sleep_cam_cmd Sleep command for CAM
> [ ?690.644721] iwlwifi 0000:06:00.0: U iwl_set_power Sending power/sleep command
> [ ?690.644724] iwlwifi 0000:06:00.0: U iwl_set_power Flags value = 0x00000000
> [ ?690.644726] iwlwifi 0000:06:00.0: U iwl_set_power Tx timeout = 0
> [ ?690.644728] iwlwifi 0000:06:00.0: U iwl_set_power Rx timeout = 0
> [ ?690.644731] iwlwifi 0000:06:00.0: U iwl_set_power Sleep interval vector = { 0 , 0 , 0 , 0 , 0 }
> [ ?690.645649] iwlwifi 0000:06:00.0: U iwl_power_sleep_cam_cmd Sleep command for CAM
>
> Does this help you, or do you need more?
>
> Best wishes
>
> Norbert
> ------------------------------------------------------------------------
> Norbert Preining ? ? ? ? ? ?preining@{jaist.ac.jp, logic.at, debian.org}
> JAIST, Japan ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? TeX Live & Debian Developer
> DSA: 0x09C5B094 ? fp: 14DF 2E6C 0307 BE6D AD76 ?A9C0 D2BF 4AA3 09C5 B094
> ------------------------------------------------------------------------
> SNITTER (n.)
> One of the rather unfunny newspaper clippings pinned to an office
> wall, the humour of which is supposed to derive from the fact that the
> headline contains a name similar to that of one of the occupants to
> the office.
> ? ? ? ? ? ? ? ? ? ? ? ?--- Douglas Adams, The Meaning of Liff
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at ?http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at ?http://www.tux.org/lkml/
>

2011-11-02 04:14:23

by Wey-Yi Guy

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

Hi Norbert,

On Tue, 2011-11-01 at 20:13 -0700, Norbert Preining wrote:
> Hi all,
>
> On Mi, 26 Okt 2011, Norbert Preining wrote:
> > On Di, 25 Okt 2011, wwguy wrote:
> > > so this is different problem, right? it looks different compare to what
> > > you described before which is cause by queue stopped.
> >
> > I assume that this is a different regression concerning 3.1.0 released.
>
> Is there any progress on that? Currently linux git status is unusable
> with respect to iwlwifi?
>
> Are there any changes planned before rc1?
>
after the firmware reloaded, is the traffic resume? or it is continuous
without traffic?

Looks like the different "queue stuck" problem you are seeing. you
mention you only seeing this at university but not home. So what the
differences are?
what Band/channel you are using, also, if you disable 11n, are you still
seeing the problem?

you also mention it is might related to suspend/resume, could you please
give a better description why you think it might be related

Thanks
Wey




2011-11-22 07:46:00

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Tue, Nov 22, 2011 at 09:27, Norbert Preining <[email protected]> wrote:
> HI Emmanuel,
>
> On Di, 22 Nov 2011, Emmanuel Grumbach wrote:
>> Please apply the patch attached. It has been submitted upstream but
>> hasn't been merged in the tree you use apparently.
>
> Does not apply against current git, as there is no iwl-mac80211.c:
> [/usr/src/git-kernel/linux.git] patch -p1 --dry-run < ~/QUEUE_DEBUG.patch
> patching file drivers/net/wireless/iwlwifi/iwl-agn-rx.c
> patching file drivers/net/wireless/iwlwifi/iwl-agn-rxon.c
> Hunk #1 succeeded at 116 (offset -1 lines).
> Hunk #2 succeeded at 840 (offset -4 lines).
> patching file drivers/net/wireless/iwlwifi/iwl-agn-tx.c
> Hunk #1 succeeded at 800 (offset -13 lines).
> patching file drivers/net/wireless/iwlwifi/iwl-debug.h
> Hunk #1 succeeded at 151 (offset -15 lines).
> Hunk #2 succeeded at 188 (offset -15 lines).
> can't find file to patch at input line 78
> Perhaps you used the wrong -p or --strip option?
> The text leading up to this was:
> --------------------------
> |diff --git a/drivers/net/wireless/iwlwifi/iwl-mac80211.c b/drivers/net/wireless/iwlwifi/iwl-mac80211.c
> |index 073e827..05b1f0d 100644
> |--- a/drivers/net/wireless/iwlwifi/iwl-mac80211.c
> |+++ b/drivers/net/wireless/iwlwifi/iwl-mac80211.c
> --------------------------
> File to patch:
>

Glad it applied
>
>> Then please run with set the TXQUEUES debug flag: 0x8000000.
>
> Is this a load options to iwlwifi module (debug=0x..... at module load time?)
> or has to be set in /sys/kernel/debug/ieee80211/phy?/iwlwifi/debug/*
> somewhere?
>

yes it is a module parameter: modprobe iwlwifi debug=0x80000000

>
>> Do you use any SoftAP / P2P feature ? HW queues can be stalled
>
> No. Fixed AP with fixed IP and no special adaptions.
>

Ok thanks

2011-11-25 04:50:04

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Fri, Nov 25, 2011 at 01:02, Norbert Preining <[email protected]> wrote:
> On Do, 24 Nov 2011, Emmanuel Grumbach wrote:
>> Please apply the patch attached on top of the one I previously sent.
>> Let me know I you have trouble with patch.
>
> Which patch? Nothing attached, I also checked in the mail archives
> in case I had again problems with attached patches. Nothing.
> Please resend.
>

Oh well... Sorry again....


Attachments:
DEBUG_QUEUE2.patch (3.38 kB)

2011-11-22 05:13:21

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Mo, 21 Nov 2011, Emmanuel Grumbach wrote:
> > If you think I shouuld turn on power debug again and see hope that
> > it happens again, let me know.
>
> No, I don't think it is needed. Thanks.

One more thing I realized now: While the connection is cut off
the kernel/network-manager belives I am connected:
# ifconfig wlan0
wlan0 Link encap:Ethernet HWaddr 00:16:ea:2c:6c:4a
inet addr:192.168.1.22 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:20368 errors:0 dropped:0 overruns:0 frame:0
TX packets:21606 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:11031455 (10.5 MiB) TX bytes:2927239 (2.7 MiB)
#

If I do during this time a ping of a remote host, interestingly:
- TX packgets and TX bytes go UP
- RX bytes go also up!
- RX packages stay the same, seems that the RX packets are not recognized

It seems with the changes now the TX is working, but RX is not.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
RAMSGATE (n.)
All institutional buildings must, by law, contain at least twenty
ramsgates. These are doors which open the opposite way to the one you
expect.
--- Douglas Adams, The Meaning of Liff

2011-11-22 06:48:21

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Tue, Nov 22, 2011 at 08:46, Emmanuel Grumbach <[email protected]> wrote:
> On Tue, Nov 22, 2011 at 07:13, Norbert Preining <[email protected]> wrote:
>> On Mo, 21 Nov 2011, Emmanuel Grumbach wrote:
>>> > If you think I shouuld turn on power debug again and see hope that
>>> > it happens again, let me know.
>>>
>>> No, I don't think it is needed. Thanks.
>>
>
> Hi Norbert,
>
> Please apply the patch attached. It has been submitted upstream but
> hasn't been merged in the tree you use apparently.
> Then please run with set the TXQUEUES debug flag: 0x8000000.
> This will give us a better understanding regarding what happens with
> the queues and how they are mapped.
> Do you use any SoftAP / P2P feature ? HW queues can be stalled
> sometimes when using such configuration. Can it also be that the AP
> that we can't work with has any kind of power save feature (I know
> that this one is hard to answer though :-)) ?
>
> Thanks !
>

Forgot to reply on the WARN_ON you saw. Yes I know about it. We seem
to have a race between mac80211 and the driver somewhere. I will look
at it when I will have time for that one. Hopefully before Armageddon.

2011-11-07 12:38:24

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

Hi Wey, hi all,

On Fr, 04 Nov 2011, Norbert Preining wrote:
> > after the firmware reloaded, is the traffic resume? or it is continuous
> > without traffic?
>
> Ok, I compiled a new kernel from todays git, and I see that there
> are kernel bugs, but after the hardware restart traffic gets through.

Well, not always actually. I just had *at*home*, were it used to work
without any flaws till 3.1, I saw that after resume it worked for a
short time, then the queue got stuck, then not even forcing a reset via
the /ss/kernel/debug/..../iwlwifi/debug/force_reset did work. Only
an unload and reload of the module, followed by a few more hickups,
restarted the wlan card:

[ 6564.905884] PM: Finishing wakeup.
[ 6564.905889] Restarting tasks ... done.
[ 6564.924087] video LNXVIDEO:00: Restoring backlight state
[ 6566.045358] e1000e 0000:00:19.0: irq 45 for MSI/MSI-X
[ 6566.100108] e1000e 0000:00:19.0: irq 45 for MSI/MSI-X
[ 6566.101277] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
[ 6566.105210] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
[ 6573.822081] wlan0: authenticate with 00:0a:79:eb:56:10 (try 1)
[ 6573.829824] wlan0: authenticated
[ 6573.835543] wlan0: associate with 00:0a:79:eb:56:10 (try 1)
[ 6573.841527] wlan0: RX AssocResp from 00:0a:79:eb:56:10 (capab=0x411 status=0 aid=2)
[ 6573.841537] wlan0: associated

Here it worked for a bit, 20secs

[ 6593.731625] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0
[ 6603.704072] iwlwifi 0000:06:00.0: Queue 11 stuck for 10000 ms.
[ 6603.704084] iwlwifi 0000:06:00.0: Current read_ptr 249 write_ptr 14
[ 6603.704092] iwlwifi 0000:06:00.0: On demand firmware reload
[ 6603.704566] ieee80211 phy0: Hardware restart was requested
[ 6603.704680] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
[ 6603.707700] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0

Here I forced the reset

[ 6774.772656] iwlwifi 0000:06:00.0: On demand firmware reload
[ 6774.773868] ieee80211 phy0: Hardware restart was requested
[ 6774.773958] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
[ 6774.777053] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0

unloading and reloading

[ 6823.909282] wlan0: deauthenticating from 00:0a:79:eb:56:10 by local choice (reason=3)
[ 6823.976417] cfg80211: Calling CRDA for country: JP
[ 6824.037466] iwlwifi 0000:06:00.0: PCI INT A disabled
[ 6830.344831] Intel(R) Wireless WiFi Link AGN driver for Linux, in-tree:d
[ 6830.344835] Copyright(c) 2003-2011 Intel Corporation
[ 6830.344941] iwlwifi 0000:06:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[ 6830.344974] iwlwifi 0000:06:00.0: setting latency timer to 64
[ 6830.345174] iwlwifi 0000:06:00.0: pci_resource_len = 0x00002000
[ 6830.345176] iwlwifi 0000:06:00.0: pci_resource_base = ffffc90000048000
[ 6830.345179] iwlwifi 0000:06:00.0: HW Revision ID = 0x0
[ 6830.345292] iwlwifi 0000:06:00.0: irq 46 for MSI/MSI-X
[ 6830.345369] iwlwifi 0000:06:00.0: Detected Intel(R) WiFi Link 5100 AGN, REV=0x54
[ 6830.345467] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
[ 6830.367544] iwlwifi 0000:06:00.0: device EEPROM VER=0x11e, CALIB=0x4
[ 6830.367547] iwlwifi 0000:06:00.0: Device SKU: 0Xf0
[ 6830.367557] iwlwifi 0000:06:00.0: Tunable channels: 13 802.11bg, 24 802.11a channels
[ 6830.425994] iwlwifi 0000:06:00.0: loaded firmware version 8.83.5.1 build 33692 (EXP)
[ 6830.426214] Registered led device: phy1-led
[ 6830.426519] ieee80211 phy1: Selected rate control algorithm 'iwl-agn-rs'

more hickups

[ 6830.518627] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
[ 6830.522336] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
[ 6830.636966] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
[ 6830.640324] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
[ 6838.327066] wlan0: authenticate with 00:0a:79:eb:56:10 (try 1)
[ 6838.330447] wlan0: authenticated
[ 6838.335890] wlan0: associate with 00:0a:79:eb:56:10 (try 1)
[ 6838.341503] wlan0: RX AssocResp from 00:0a:79:eb:56:10 (capab=0x411 status=0 aid=2)
[ 6838.341507] wlan0: associated

finally got it working, one more final hickup

[ 6860.602423] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0

from now on working

Would be great to fix that for 3.2

Herzliche Gr??e

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
PEEBLES (pl.n.)
Small, carefully rolled pellets of skegness (q.v.)
--- Douglas Adams, The Meaning of Liff

2011-11-22 12:44:09

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Di, 22 Nov 2011, Emmanuel Grumbach wrote:
> Sorry, please try again with that one.

Here we go:

[ 48.355300] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
[ 48.360704] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
[ 48.402330] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 0 on FIFO 3
[ 48.403914] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 1 on FIFO 2
[ 48.404020] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 2 on FIFO 1
[ 48.404020] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 3 on FIFO 0
[ 48.404020] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 4 on FIFO 7
[ 48.484466] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
[ 48.488913] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
[ 48.534317] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 0 on FIFO 3
[ 48.535806] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 1 on FIFO 2
[ 48.536024] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 2 on FIFO 1
[ 48.536024] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 3 on FIFO 0
[ 48.536024] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 4 on FIFO 7
[ 56.273961] wlan0: authenticate with 00:0a:79:eb:56:10 (try 1)
[ 56.276374] wlan0: authenticated
[ 56.281516] wlan0: associate with 00:0a:79:eb:56:10 (try 1)
[ 56.288627] wlan0: RX AssocResp from 00:0a:79:eb:56:10 (capab=0x411 status=0 aid=1)
[ 56.288633] wlan0: associated
[ 74.164324] iwlwifi 0000:06:00.0: U iwl_trans_pcie_tx_agg_alloc HW queue is empty
[ 75.164059] iwlwifi 0000:06:00.0: U iwl_trans_pcie_tx_agg_disable HW queue is empty
[ 75.165912] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Deactivate AC/CMD Queue 11 on FIFO 0
[ 76.481181] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Deactivate BA Queue 11 on FIFO 1
[ 76.482944] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0
[ 76.484769] ------------[ cut here ]------------
[ 76.486566] WARNING: at drivers/net/wireless/iwlwifi/iwl-trans-pcie.c:1106 iwl_trans_pcie_tx+0x180/0x668 [iwlwifi]()
[ 76.488353] Hardware name: VGN-Z11VN_B
[ 76.488718] Modules linked in: usb_storage rfcomm bnep bluetooth crc16 snd_hrtimer vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) binfmt_misc dm_crypt dm_mod btrfs zlib_deflate crc32c libcrc32c vfat fat fuse loop(+) uinput snd_hda_codec_realtek arc4 snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm mxm_wmi snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq firewire_ohci firewire_core crc_itu_t iwlwifi joydev snd_timer snd_seq_device mac80211 cfg80211 snd sony_laptop(O) rfkill tpm_infineon soundcore snd_page_alloc
[ 76.488718] Pid: 721, comm: kworker/u:5 Tainted: G D O 3.2.0-rc2+ #47
[ 76.488718] Call Trace:
[ 76.488718] <IRQ> [<ffffffff810384c5>] warn_slowpath_common+0x83/0x9b
[ 76.488718] [<ffffffff810384f7>] warn_slowpath_null+0x1a/0x1c
[ 76.488718] [<ffffffffa00e6d96>] iwl_trans_pcie_tx+0x180/0x668 [iwlwifi]
[ 76.488718] [<ffffffff810d38e5>] ? kmem_cache_alloc+0x44/0xb9
[ 76.488718] [<ffffffffa00cea1d>] iwlagn_tx_skb+0x87b/0x91b [iwlwifi]
[ 76.488718] [<ffffffffa00c4f8c>] iwlagn_mac_tx+0x131/0x1a2 [iwlwifi]
[ 76.488718] [<ffffffffa008fc56>] ? ieee80211_tx_h_fragment+0x16/0x22c [mac80211]
[ 76.488718] [<ffffffffa0083f94>] __ieee80211_tx+0x176/0x1cf [mac80211]
[ 76.488718] [<ffffffffa008fbc8>] ? ieee80211_tx_h_calculate_duration+0x4c/0x65 [mac80211]
[ 76.488718] [<ffffffffa0084d5a>] ieee80211_tx+0x97/0xaf [mac80211]
[ 76.488718] [<ffffffffa0085c56>] ieee80211_tx_pending+0xf0/0x1c3 [mac80211]
[ 76.488718] [<ffffffff8103d055>] ? __local_bh_enable+0x4a/0x83
[ 76.488718] [<ffffffff8103de1e>] ? __do_softirq+0x15b/0x1a5
[ 76.488718] [<ffffffff8103dc3e>] tasklet_action+0x77/0xc2
[ 76.488718] [<ffffffff8103dd7f>] __do_softirq+0xbc/0x1a5
[ 76.488718] [<ffffffff8140b00c>] call_softirq+0x1c/0x30
[ 76.488718] <EOI> [<ffffffff8100359e>] do_softirq+0x38/0x6e
[ 76.488718] [<ffffffff8103da45>] _local_bh_enable_ip.isra.12+0x7d/0xa0
[ 76.488718] [<ffffffff8103da76>] local_bh_enable_ip+0xe/0x10
[ 76.488718] [<ffffffff8140921a>] _raw_spin_unlock_bh+0x23/0x25
[ 76.488718] [<ffffffffa0073c64>] ieee80211_agg_tx_operational+0x99/0xa4 [mac80211]
[ 76.488718] [<ffffffffa00747b2>] ieee80211_process_addba_resp+0xb8/0xf2 [mac80211]
[ 76.488718] [<ffffffff81064840>] ? do_raw_spin_lock+0x16/0x22
[ 76.488718] [<ffffffffa007ba82>] ieee80211_iface_work+0x130/0x2b5 [mac80211]
[ 76.488718] [<ffffffffa007b952>] ? ieee80211_teardown_sdata+0xcc/0xcc [mac80211]
[ 76.488718] [<ffffffff8104de7c>] process_one_work+0x17b/0x2bd
[ 76.488718] [<ffffffff8104c4c1>] ? need_to_create_worker+0x12/0x26
[ 76.488718] [<ffffffff8104ef77>] worker_thread+0xdb/0x15f
[ 76.488718] [<ffffffff8104ee9c>] ? manage_workers.isra.24+0x171/0x171
[ 76.488718] [<ffffffff8105265d>] kthread+0x84/0x8c
[ 76.488718] [<ffffffff8140af14>] kernel_thread_helper+0x4/0x10
[ 76.488718] [<ffffffff810525d9>] ? kthread_worker_fn+0x148/0x148
[ 76.488718] [<ffffffff8140af10>] ? gs_change+0xb/0xb
[ 76.488718] ---[ end trace e59796b1a2564630 ]---
[ 81.272320] iwlwifi 0000:06:00.0: U iwl_trans_pcie_tx_agg_alloc HW queue is NOT empty: 5 packets in HW queue
[ 86.104091] iwlwifi 0000:06:00.0: Queue 11 stuck for 10000 ms.
[ 86.105749] iwlwifi 0000:06:00.0: Current SW read_ptr 76 write_ptr 81
[ 86.107435] iwlwifi 0000:06:00.0: Current HW read_ptr 76 write_ptr 81
[ 86.109073] iwlwifi 0000:06:00.0: On demand firmware reload
[ 86.111055] ieee80211 phy0: Hardware restart was requested
[ 86.112741] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
[ 86.117893] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
[ 86.158367] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 0 on FIFO 3
[ 86.159884] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 1 on FIFO 2
[ 86.160081] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 2 on FIFO 1
[ 86.160081] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 3 on FIFO 0
[ 86.160081] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 4 on FIFO 7
[ 86.205374] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Deactivate AC/CMD Queue 11 on FIFO 0
[ 114.816331] iwlwifi 0000:06:00.0: U iwl_trans_pcie_tx_agg_alloc HW queue is empty
[ 115.816030] iwlwifi 0000:06:00.0: U iwl_trans_pcie_tx_agg_disable HW queue is empty
[ 115.817499] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Deactivate AC/CMD Queue 11 on FIFO 0
[ 116.522124] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Deactivate BA Queue 11 on FIFO 3
[ 116.523518] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 6
[ 118.532313] iwlwifi 0000:06:00.0: U iwl_trans_pcie_tx_agg_alloc HW queue is NOT empty: 4 packets in HW queue


BTW; I went back now to something 3.1.0-rc10+++ (a few commits before 3.1.0)
and no warnings/erros/stuck queue.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
SCETHROG (n.)
One of those peculiar beards-without-moustaches worn by religious
Belgians and American scientists which help them look like trolls.
--- Douglas Adams, The Meaning of Liff

2011-11-09 07:15:50

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

> On Di, 08 Nov 2011, Emmanuel Grumbach wrote:
> > I would like to see what happens in terms or power.
> > Please set the power debug flag: modprobe iwlwifi debug=0x100
>
> > It may be worth to disable Link Power save too. I would be glad if you
> > could try this patch (attached and inlined), with the power debug flag
> > set.
>
> With both done, I see things like that:
> [ ?690.597221] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
> [ ?690.601979] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
> [ ?690.644716] iwlwifi 0000:06:00.0: U iwl_power_sleep_cam_cmd Sleep command for CAM
> [ ?690.644721] iwlwifi 0000:06:00.0: U iwl_set_power Sending power/sleep command
> [ ?690.644724] iwlwifi 0000:06:00.0: U iwl_set_power Flags value = 0x00000000
> [ ?690.644726] iwlwifi 0000:06:00.0: U iwl_set_power Tx timeout = 0
> [ ?690.644728] iwlwifi 0000:06:00.0: U iwl_set_power Rx timeout = 0
> [ ?690.644731] iwlwifi 0000:06:00.0: U iwl_set_power Sleep interval vector = { 0 , 0 , 0 , 0 , 0 }
> [ ?690.645649] iwlwifi 0000:06:00.0: U iwl_power_sleep_cam_cmd Sleep command for CAM
>
> Does this help you, or do you need more?
>

This look fine, thanks. I now would like to now if the bug reproduces
with these settings :-)
With these settings the NIC will stay up all the times, it can help.

Thanks !

2011-11-27 06:38:22

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

> Yeah, that was my impression too, but still, I wanted to mention it.
>
>> I am already on week-end here and will look at it more carefully
>> later. First impression is that things are messy here. I guess we are
>> getting in a bad failure path that wasn't checked...
>
> No problem, I can cope with it.
>

you can always disable 11n: modprobe 11n_disable=1.

Well... Something is really weird here:
mac80211 requests from us to:
* start agg
* stop agg
* agg operational

this is not a legal sequence

BTW - I am under this impression that mac80211 hasn't been compiled
with HT debug flag. Can you please set this compilation flag ?

I also would like you to add this line in iwlagn_mac_ampdu_action:


static int iwlagn_mac_ampdu_action(struct ieee80211_hw *hw,
struct ieee80211_vif *vif,
enum ieee80211_ampdu_mlme_action action,
struct ieee80211_sta *sta, u16 tid, u16 *ssn,
u8 buf_size)
{
struct iwl_priv *priv = hw->priv;
int ret = -EINVAL;
struct iwl_station_priv *sta_priv = (void *) sta->drv_priv;

IWL_DEBUG_HT(priv, "A-MPDU action on addr %pM tid %d\n",
sta->addr, tid);

dump_stack(); <<<<========================================= new line

if (!(priv->cfg->sku & EEPROM_SKU_CAP_11N_ENABLE))
return -EACCES;



This will give an insight at why is mac80211 calling the driver and
better point me to the bug.


Thanks !

2011-11-02 03:13:55

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

Hi all,

On Mi, 26 Okt 2011, Norbert Preining wrote:
> On Di, 25 Okt 2011, wwguy wrote:
> > so this is different problem, right? it looks different compare to what
> > you described before which is cause by queue stopped.
>
> I assume that this is a different regression concerning 3.1.0 released.

Is there any progress on that? Currently linux git status is unusable
with respect to iwlwifi?

Are there any changes planned before rc1?

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
BOSEMAN
One who spends all day loafing about near pedestrian crossing looking
as if he's about to cross.
--- Douglas Adams, The Meaning of Liff

2011-11-23 01:55:13

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Di, 22 Nov 2011, Emmanuel Grumbach wrote:
> This is completely broken. This log looks really bad. AGG are enabled
> and disabled right away, messing the queues completely.

I can easily reproduce that here, today after reboot I saw something more:
[ 82.894328] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Deactivate
AC/CMD Queue 11 on FIFO 0
[ 83.072243] iwlwifi 0000:06:00.0: U iwl_trans_pcie_tx_agg_alloc HW queue is NOT empty: 85 packets in HW queue
[ 288.039768] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Deactivate AC/CMD Queue 11 on FIFO 0
[ 288.039884] wlan0: deauthenticating from 00:0a:79:eb:56:10 by local choice (reason=3)
[ 288.040218] iwlwifi 0000:06:00.0: I iwl_stop_queue Stop hwq 2 ac 2 stop count 1. Queue is full
[ 288.052312] iwlwifi 0000:06:00.0: I iwl_wake_queue Wake hwq 2 ac 2. Packets reclaimed
[ 288.052542] iwlwifi 0000:06:00.0: I iwl_stop_queue Stop hwq 2 ac 2 stop count 1. Queue is full
[ 288.064074] iwlwifi 0000:06:00.0: I iwl_wake_queue Wake hwq 2 ac 2. Packets reclaimed
[ 288.064301] iwlwifi 0000:06:00.0: I iwl_stop_queue Stop hwq 2 ac 2 stop count 1. Queue is full
[ 288.074618] iwlwifi 0000:06:00.0: I iwl_wake_queue Wake hwq 2 ac 2. Packets reclaimed

> Will send another patch later.

Thanks.

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
FORSINAIN (n. archaic)
The right of the lord of the manor to molest dwarves on their
birthdays.
--- Douglas Adams, The Meaning of Liff

2011-11-22 06:46:13

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Tue, Nov 22, 2011 at 07:13, Norbert Preining <[email protected]> wrote:
> On Mo, 21 Nov 2011, Emmanuel Grumbach wrote:
>> > If you think I shouuld turn on power debug again and see hope that
>> > it happens again, let me know.
>>
>> No, I don't think it is needed. Thanks.
>

Hi Norbert,

Please apply the patch attached. It has been submitted upstream but
hasn't been merged in the tree you use apparently.
Then please run with set the TXQUEUES debug flag: 0x8000000.
This will give us a better understanding regarding what happens with
the queues and how they are mapped.
Do you use any SoftAP / P2P feature ? HW queues can be stalled
sometimes when using such configuration. Can it also be that the AP
that we can't work with has any kind of power save feature (I know
that this one is hard to answer though :-)) ?

Thanks !


Attachments:
QUEUE_DEBUG.patch (13.50 kB)

2011-11-25 12:21:54

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

Hi Emmanuel,

On Do, 24 Nov 2011, Emmanuel Grumbach wrote:
> Please apply the patch attached on top of the one I previously sent.
> Let me know I you have trouble with patch.

The patch you sent was not against kernel git, but I moved the
debug statement from iwl-mac80211.c to iwl-agn.c (where I guess is
the right place) and applied the rest of the patch.

> Please also enable the MAC80211_HT_DEBUG compilation flag
> I also the (HT + TXQUEUES) debug flag of the driver: modprobe iwlwifi
> debug=0x84000000

Also activated.

Here my very very small theory, but that is just guessing: Could it
be related to the time being adjusted by ntp or something? I have the
feeling it works for a bit, then hangs. In this little bit it might
be that ntp updates the time of the laptop. That would explain why
on a full normal reboot it most of the times work, but when the laptop
sleeps for long time, or is turned of for long time, then it is
getting problems. But this is surely just a stupid theory.

Here we go with the debug output


[ 4592.437269] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
[ 4592.441080] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
[ 4592.482332] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 0
[ 4592.482346] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 0 on FIFO 3
[ 4592.482349] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 1
[ 4592.482362] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 1 on FIFO 2
[ 4592.482365] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 2
[ 4592.482377] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 2 on FIFO 1
[ 4592.482380] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 3
[ 4592.482392] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 3 on FIFO 0
[ 4592.482395] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 4
[ 4592.482407] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 4 on FIFO 7
[ 4592.482410] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 5
[ 4592.482412] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 6
[ 4592.482415] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 7
[ 4592.482417] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 8
[ 4592.482419] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 9
[ 4592.482422] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 10
[ 4600.139699] wlan0: authenticate with 00:0a:79:eb:56:10 (try 1)
[ 4600.142190] wlan0: authenticated
[ 4600.148081] wlan0: associate with 00:0a:79:eb:56:10 (try 1)
[ 4600.153962] wlan0: RX AssocResp from 00:0a:79:eb:56:10 (capab=0x411 status=0 aid=2)
[ 4600.153972] wlan0: associated
[ 4601.280512] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action A-MPDU action on addr 00:0a:79:eb:56:10 tid 0
[ 4601.280525] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action start Rx
[ 4616.622380] iwlwifi 0000:06:00.0: I rs_tl_turn_on_agg_for_tid Starting Tx agg: STA: 00:0a:79:eb:56:10 tid: 0
[ 4616.652083] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action A-MPDU action on addr 00:0a:79:eb:56:10 tid 0
[ 4616.652087] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action start Tx
[ 4616.652090] iwlwifi 0000:06:00.0: U iwlagn_tx_agg_start TX AGG request on ra = 00:0a:79:eb:56:10 tid = 0
[ 4616.652238] iwlwifi 0000:06:00.0: U iwlagn_txq_ctx_activate_free Activate txq 11
[ 4616.652241] iwlwifi 0000:06:00.0: U iwl_trans_pcie_tx_agg_alloc HW queue is empty
[ 4616.652244] iwlwifi 0000:06:00.0: U iwl_start_tx_ba_trans_ready Telling mac start BA ready tid: 0
[ 4617.652070] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action A-MPDU action on addr 00:0a:79:eb:56:10 tid 0
[ 4617.652075] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action stop Tx
[ 4617.652077] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action Stop Tx agg for tid 0
[ 4617.652081] iwlwifi 0000:06:00.0: U iwl_trans_pcie_tx_agg_disable HW queue is empty
[ 4617.652139] iwlwifi 0000:06:00.0: U iwl_trans_pcie_txq_agg_disable Disable AGG txq_id 11
[ 4617.652192] iwlwifi 0000:06:00.0: U iwl_txq_ctx_deactivate Deactivate txq 11
[ 4617.652210] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Deactivate AC/CMD Queue 11 on FIFO 0
[ 4617.652213] iwlwifi 0000:06:00.0: U iwl_stop_tx_ba_trans_ready Telling mac stop BA ready tid: 0
[ 4621.013744] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action A-MPDU action on addr 00:0a:79:eb:56:10 tid 0
[ 4621.013760] iwlwifi 0000:06:00.0: U iwl_trans_pcie_tx_agg_setup Setup AGG for tid 0
[ 4621.013916] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Deactivate BA Queue 11 on FIFO 1
[ 4621.013925] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action priv->agg_tids_count = 1
[ 4621.013937] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0
[ 4621.013954] ------------[ cut here ]------------
[ 4621.013991] WARNING: at drivers/net/wireless/iwlwifi/iwl-trans-pcie.c:1106 iwl_trans_pcie_tx+0x180/0x668 [iwlwifi]()
[ 4621.013999] Hardware name: VGN-Z11VN_B
[ 4621.014004] Modules linked in: usb_storage rfcomm bnep vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) snd_hrtimer binfmt_misc dm_crypt dm_mod isofs btrfs zlib_deflate crc32c libcrc32c vfat fat fuse loop uinput snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss arc4 snd_pcm iwlwifi mac80211 snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event btusb sony_laptop(O) bluetooth snd_seq snd_timer snd_seq_device snd soundcore cfg80211 firewire_ohci firewire_core mxm_wmi joydev rfkill tpm_infineon snd_page_alloc crc16 crc_itu_t
[ 4621.014134] Pid: 6169, comm: kworker/u:44 Tainted: G W O 3.2.0-rc3+ #2
[ 4621.014140] Call Trace:
[ 4621.014145] <IRQ> [<ffffffff8103849d>] warn_slowpath_common+0x83/0x9b
[ 4621.014171] [<ffffffff810384cf>] warn_slowpath_null+0x1a/0x1c
[ 4621.014194] [<ffffffffa0204f4d>] iwl_trans_pcie_tx+0x180/0x668 [iwlwifi]
[ 4621.014208] [<ffffffff810d3861>] ? kmem_cache_alloc+0x44/0xb9
[ 4621.014229] [<ffffffffa01eca7d>] iwlagn_tx_skb+0x87b/0x91b [iwlwifi]
[ 4621.014247] [<ffffffffa01e2fea>] iwlagn_mac_tx+0x131/0x1a2 [iwlwifi]
[ 4621.014268] [<ffffffffa01c1cc6>] ? ieee80211_tx_h_fragment+0x16/0x22c [mac80211]
[ 4621.014296] [<ffffffffa01b6000>] __ieee80211_tx+0x176/0x1cf [mac80211]
[ 4621.014314] [<ffffffffa01c1c38>] ? ieee80211_tx_h_calculate_duration+0x4c/0x65 [mac80211]
[ 4621.014344] [<ffffffffa01b6dc6>] ieee80211_tx+0x97/0xaf [mac80211]
[ 4621.014374] [<ffffffffa01b7cc2>] ieee80211_tx_pending+0xf0/0x1c3 [mac80211]
[ 4621.014386] [<ffffffff8103dc16>] tasklet_action+0x77/0xc2
[ 4621.014395] [<ffffffff8103dd57>] __do_softirq+0xbc/0x1a5
[ 4621.014408] [<ffffffff8140b24c>] call_softirq+0x1c/0x30
[ 4621.014413] <EOI> [<ffffffff8100359e>] do_softirq+0x38/0x6e
[ 4621.014430] [<ffffffff8103da1d>] _local_bh_enable_ip.isra.12+0x7d/0xa0
[ 4621.014439] [<ffffffff8103da4e>] local_bh_enable_ip+0xe/0x10
[ 4621.014448] [<ffffffff81409462>] _raw_spin_unlock_bh+0x23/0x25
[ 4621.014471] [<ffffffffa01a5c6c>] ieee80211_agg_tx_operational+0x99/0xa4 [mac80211]
[ 4621.014496] [<ffffffffa01a67bd>] ieee80211_process_addba_resp+0xb8/0xf3 [mac80211]
[ 4621.014508] [<ffffffff81064840>] ? do_raw_spin_trylock+0x18/0x2a
[ 4621.014534] [<ffffffffa01adada>] ieee80211_iface_work+0x130/0x2b5 [mac80211]
[ 4621.014560] [<ffffffffa01ad9aa>] ? ieee80211_teardown_sdata+0xcc/0xcc [mac80211]
[ 4621.014572] [<ffffffff8104de54>] process_one_work+0x17b/0x2bd
[ 4621.014583] [<ffffffff8104c499>] ? need_to_create_worker+0x12/0x26
[ 4621.014593] [<ffffffff8104ef4f>] worker_thread+0xdb/0x15f
[ 4621.014602] [<ffffffff8104ee74>] ? manage_workers.isra.24+0x171/0x171
[ 4621.014612] [<ffffffff81052639>] kthread+0x84/0x8c
[ 4621.014622] [<ffffffff8140b154>] kernel_thread_helper+0x4/0x10
[ 4621.014633] [<ffffffff810525b5>] ? kthread_worker_fn+0x148/0x148
[ 4621.014642] [<ffffffff8140b150>] ? gs_change+0xb/0xb
[ 4621.014649] ---[ end trace 81df68812429b16c ]---
[ 4622.112804] iwlwifi 0000:06:00.0: I rs_tl_turn_on_agg_for_tid Starting Tx agg: STA: 00:0a:79:eb:56:10 tid: 0
[ 4622.124104] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action A-MPDU action on addr 00:0a:79:eb:56:10 tid 0
[ 4622.124116] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action start Tx
[ 4622.124126] iwlwifi 0000:06:00.0: U iwlagn_tx_agg_start TX AGG request on ra = 00:0a:79:eb:56:10 tid = 0
[ 4622.124304] iwlwifi 0000:06:00.0: U iwlagn_txq_ctx_activate_free Activate txq 11
[ 4622.124315] iwlwifi 0000:06:00.0: U iwl_trans_pcie_tx_agg_alloc HW queue is NOT empty: 21 packets in HW queue
[ 4630.040070] iwlwifi 0000:06:00.0: Queue 11 stuck for 10000 ms.
[ 4630.040081] iwlwifi 0000:06:00.0: Current SW read_ptr 242 write_ptr 8
[ 4630.040136] iwlwifi 0000:06:00.0: Current HW read_ptr 242 write_ptr 8
[ 4630.040144] iwlwifi 0000:06:00.0: On demand firmware reload
[ 4630.040587] ieee80211 phy0: Hardware restart was requested
[ 4630.040698] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
[ 4630.043703] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
[ 4630.086602] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 0
[ 4630.086624] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 0 on FIFO 3
[ 4630.086634] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 1
[ 4630.086653] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 1 on FIFO 2
[ 4630.086662] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 2
[ 4630.086681] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 2 on FIFO 1
[ 4630.086690] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 3
[ 4630.086709] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 3 on FIFO 0
[ 4630.086717] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 4
[ 4630.086736] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate AC/CMD Queue 4 on FIFO 7
[ 4630.086745] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 5
[ 4630.086752] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 6
[ 4630.086759] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 7
[ 4630.086767] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 8
[ 4630.086774] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 9
[ 4630.086781] iwlwifi 0000:06:00.0: U iwl_txq_ctx_activate Activate txq 10
[ 4630.157921] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action A-MPDU action on addr 00:0a:79:eb:56:10 tid 0
[ 4630.157933] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action stop Tx
[ 4630.157942] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action Stop Tx agg for tid 0
[ 4630.157951] iwlwifi 0000:06:00.0: U iwl_trans_pcie_tx_agg_disable AGG stop before setup done
[ 4630.157992] iwlwifi 0000:06:00.0: U iwl_trans_pcie_txq_agg_disable Disable AGG txq_id 11
[ 4630.158073] iwlwifi 0000:06:00.0: U iwl_txq_ctx_deactivate Deactivate txq 11
[ 4630.158113] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Deactivate AC/CMD Queue 11 on FIFO 0
[ 4630.158123] iwlwifi 0000:06:00.0: U iwl_stop_tx_ba_trans_ready Telling mac stop BA ready tid: 0
[ 4630.158136] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action priv->agg_tids_count = 0
[ 4630.158148] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action A-MPDU action on addr 00:0a:79:eb:56:10 tid 0
[ 4630.158157] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action stop Rx
[ 4630.196845] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action A-MPDU action on addr 00:0a:79:eb:56:10 tid 0
[ 4630.196857] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action start Rx
[ 4632.759183] iwlwifi 0000:06:00.0: I rs_tl_turn_on_agg_for_tid Starting Tx agg: STA: 00:0a:79:eb:56:10 tid: 0
[ 4632.812124] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action A-MPDU action on addr 00:0a:79:eb:56:10 tid 0
[ 4632.812136] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action start Tx
[ 4632.812145] iwlwifi 0000:06:00.0: U iwlagn_tx_agg_start TX AGG request on ra = 00:0a:79:eb:56:10 tid = 0
[ 4632.812326] iwlwifi 0000:06:00.0: U iwlagn_txq_ctx_activate_free Activate txq 11
[ 4632.812337] iwlwifi 0000:06:00.0: U iwl_trans_pcie_tx_agg_alloc HW queue is NOT empty: 21 packets in HW queue
[ 4634.916847] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action A-MPDU action on addr 00:0a:79:eb:56:10 tid 6
[ 4634.916860] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action start Rx
[ 4644.716439] iwlwifi 0000:06:00.0: I rs_tl_turn_on_agg_for_tid Starting Tx agg: STA: 00:0a:79:eb:56:10 tid: 6
[ 4644.764120] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action A-MPDU action on addr 00:0a:79:eb:56:10 tid 6
[ 4644.764132] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action start Tx
[ 4644.764142] iwlwifi 0000:06:00.0: U iwlagn_tx_agg_start TX AGG request on ra = 00:0a:79:eb:56:10 tid = 6
[ 4644.764306] iwlwifi 0000:06:00.0: U iwlagn_txq_ctx_activate_free Activate txq 12
[ 4644.764315] iwlwifi 0000:06:00.0: U iwl_trans_pcie_tx_agg_alloc HW queue is empty
[ 4644.764325] iwlwifi 0000:06:00.0: U iwl_start_tx_ba_trans_ready Telling mac start BA ready tid: 6
[ 4644.766328] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action A-MPDU action on addr 00:0a:79:eb:56:10 tid 6
[ 4644.766339] iwlwifi 0000:06:00.0: U iwl_trans_pcie_tx_agg_setup Setup AGG for tid 6
[ 4644.766494] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Activate BA Queue 12 on FIFO 3
[ 4644.766503] iwlwifi 0000:06:00.0: U iwlagn_mac_ampdu_action priv->agg_tids_count = 1
[ 4644.766549] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 6
[ 4703.713625] iwlwifi 0000:06:00.0: I iwlagn_rx_reply_compressed_ba agg frames sent:2, acked:2


Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
BRISBANE
A perfectly reasonable explanation (Such as the one offered by a
person with a gurgling cough which has nothing to do with the fact
that they smoke fifty cigarettes a day.)
--- Douglas Adams, The Meaning of Liff

2011-11-27 09:46:24

by Johannes Berg

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Sun, 2011-11-27 at 11:43 +0200, Emmanuel Grumbach wrote:
> Hi again
>
> Can you please try the attached patch ?
> I would like to know if it helps.

As in the patch that Nikolay sent to the list earlier today, this should
also check STOPPING since WANT_STOP may get cleared as soon as it is
processed.

johannes


2011-11-22 07:55:11

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

Hi


I wrote ...

>> Does not apply against current git,

...

>> can't find file to patch at input line 78
>> Perhaps you used the wrong -p or --strip option?
>> The text leading up to this was:
>> --------------------------
>> |diff --git a/drivers/net/wireless/iwlwifi/iwl-mac80211.c

You answered ...

> Glad it applied

Me scratching my head!?

Norbert

2011-11-21 13:59:46

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

>
> And what I stil see is that the tx-queue gets stuck and does not
> recover without rmmod iwlwif.
>
> hwq 11: read=0 write=0 stop=1 swq_id=0x2e (ac 2/hwq 11)
>
> But I still am using the patch that you sent to disable power save,
> so it is not only this one.
>
> If you think I shouuld turn on power debug again and see hope that
> it happens again, let me know.

No, I don't think it is needed. Thanks.

I do would like you to check that you have my patch:

iwlwifi: add debug information on queue stop / wake

and run with debug flags 0x80000000 (bit 31)
This will print a message each time a queue is stopped / started.

Thanks for your patience !

2011-11-22 00:24:17

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Di, 22 Nov 2011, Norbert Preining wrote:
> On Mo, 21 Nov 2011, Emmanuel Grumbach wrote:
> > iwlwifi: add debug information on queue stop / wake
> >
> > and run with debug flags 0x80000000 (bit 31)
> > This will print a message each time a queue is stopped / started.
>
> I have these patches in addition to current git:

BTW, I just rebuild my kernel from current git, where there were some
fixes from Johannes Berg for mac80211 etc:
Johannes Berg (7):
nl80211: fix HT capability attribute validation
cfg80211: allow setting TXQ parameters only in AP mode
cfg80211: fix cmp_ies
cfg80211: fix missing kernel-doc
mac80211: fix NULL dereference in radiotap code
mac80211: fix bug in ieee80211_build_probe_req
mac80211: fix race between connection monitor & suspend

But it did NOT help. Even after reboot I got a hanging queue, in addition
I got some more BUGs in ext3 code (ouch) and the usual in iwl-trans-pcie.c:1105:
[ 57.555697] wlan0: associated
[ 78.009471] iwlwifi 0000:06:00.0: Tx aggregation enabled on ra = 00:0a:79:eb:56:10 tid = 0
[ 78.009494] ------------[ cut here ]------------
[ 78.009536] WARNING: at drivers/net/wireless/iwlwifi/iwl-trans-pcie.c:1105 iwl_trans_pcie_tx+0x180/0x661 [iwlwifi]()
[ 78.009545] Hardware name: VGN-Z11VN_B
[ 78.009550] Modules linked in: rfcomm bnep bluetooth crc16 snd_hrtimer vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) binfmt_misc dm_crypt dm_mod btrfs zlib_deflate crc32c libcrc32c vfat fat fuse loop(+) uinput snd_hda_codec_realtek arc4 snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm mxm_wmi snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device iwlwifi firewire_ohci joydev snd firewire_core crc_itu_t mac80211 sony_laptop(O) cfg80211 soundcore snd_page_alloc rfkill tpm_infineon
[ 78.009674] Pid: 717, comm: kworker/u:3 Tainted: G D O 3.2.0-rc2+ #47
[ 78.009681] Call Trace:
[ 78.009686] <IRQ> [<ffffffff810384c5>] warn_slowpath_common+0x83/0x9b
[ 78.009712] [<ffffffff810384f7>] warn_slowpath_null+0x1a/0x1c
[ 78.009736] [<ffffffffa00f9ae3>] iwl_trans_pcie_tx+0x180/0x661 [iwlwifi]
[ 78.009750] [<ffffffff810d38e5>] ? kmem_cache_alloc+0x44/0xb9
[ 78.009772] [<ffffffffa00e1a1d>] iwlagn_tx_skb+0x87b/0x91b [iwlwifi]
[ 78.009791] [<ffffffffa00d7f8c>] iwlagn_mac_tx+0x131/0x1a2 [iwlwifi]
[ 78.009814] [<ffffffffa007bc56>] ? ieee80211_tx_h_fragment+0x16/0x22c [mac80211]
[ 78.009826] [<ffffffff811a3884>] ? is_swiotlb_buffer+0x29/0x38
[ 78.009856] [<ffffffffa006ff94>] __ieee80211_tx+0x176/0x1cf [mac80211]
[ 78.009876] [<ffffffffa007bbc8>] ? ieee80211_tx_h_calculate_duration+0x4c/0x65 [mac80211]
[ 78.009906] [<ffffffffa0070d5a>] ieee80211_tx+0x97/0xaf [mac80211]
[ 78.009938] [<ffffffffa0071c56>] ieee80211_tx_pending+0xf0/0x1c3 [mac80211]
[ 78.009950] [<ffffffff8103dc3e>] tasklet_action+0x77/0xc2
[ 78.009959] [<ffffffff8103dd7f>] __do_softirq+0xbc/0x1a5
[ 78.009972] [<ffffffff8140b00c>] call_softirq+0x1c/0x30
[ 78.009978] <EOI> [<ffffffff8100359e>] do_softirq+0x38/0x6e
[ 78.009995] [<ffffffff8103da45>] _local_bh_enable_ip.isra.12+0x7d/0xa0
[ 78.010004] [<ffffffff8103da76>] local_bh_enable_ip+0xe/0x10
[ 78.010014] [<ffffffff8140921a>] _raw_spin_unlock_bh+0x23/0x25
[ 78.010039] [<ffffffffa005fc64>] ieee80211_agg_tx_operational+0x99/0xa4 [mac80211]
[ 78.010065] [<ffffffffa00607b2>] ieee80211_process_addba_resp+0xb8/0xf2 [mac80211]
[ 78.010077] [<ffffffff81064840>] ? do_raw_spin_lock+0x16/0x22
[ 78.010104] [<ffffffffa0067a82>] ieee80211_iface_work+0x130/0x2b5 [mac80211]
[ 78.010132] [<ffffffffa0067952>] ? ieee80211_teardown_sdata+0xcc/0xcc [mac80211]
[ 78.010145] [<ffffffff8104de7c>] process_one_work+0x17b/0x2bd
[ 78.010154] [<ffffffff8104c4c1>] ? need_to_create_worker+0x12/0x26
[ 78.010165] [<ffffffff8104ef77>] worker_thread+0xdb/0x15f
[ 78.010174] [<ffffffff8104ee9c>] ? manage_workers.isra.24+0x171/0x171
[ 78.010184] [<ffffffff8105265d>] kthread+0x84/0x8c
[ 78.010195] [<ffffffff8140af14>] kernel_thread_helper+0x4/0x10
[ 78.010206] [<ffffffff810525d9>] ? kthread_worker_fn+0x148/0x148
[ 78.010215] [<ffffffff8140af10>] ? gs_change+0xb/0xb
[ 78.010222] ---[ end trace 1999568f54f9a3c3 ]---

Followed by a stuck queue:
[ 87.480094] iwlwifi 0000:06:00.0: Queue 11 stuck for 10000 ms.
[ 87.480106] iwlwifi 0000:06:00.0: Current SW read_ptr 21 write_ptr 72
[ 87.480161] iwlwifi 0000:06:00.0: Current HW read_ptr 21 write_ptr 72
[ 87.480169] iwlwifi 0000:06:00.0: On demand firmware reload
[ 87.480644] ieee80211 phy0: Hardware restart was requested
[ 87.480763] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
[ 87.483792] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0

at which point nothing works anymore. rmmod and modprobe iwlwifi
gets back the connection:
[ 417.209410] wlan0: deauthenticating from 00:0a:79:eb:56:10 by local choice (reason=3)
[ 417.220286] iwlwifi 0000:06:00.0: Stopping AGG while state not ON or starting for 0 on 0 (0)
[ 417.304463] cfg80211: Calling CRDA for country: JP
[ 417.372535] iwlwifi 0000:06:00.0: PCI INT A disabled
[ 421.388543] Intel(R) Wireless WiFi Link AGN driver for Linux, in-tree:d
[ 421.388547] Copyright(c) 2003-2011 Intel Corporation
[ 421.388720] iwlwifi 0000:06:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[ 421.388753] iwlwifi 0000:06:00.0: setting latency timer to 64
[ 421.389058] iwlwifi 0000:06:00.0: pci_resource_len = 0x00002000
[ 421.389061] iwlwifi 0000:06:00.0: pci_resource_base = ffffc900041b8000
[ 421.389063] iwlwifi 0000:06:00.0: HW Revision ID = 0x0
[ 421.389210] iwlwifi 0000:06:00.0: irq 46 for MSI/MSI-X
[ 421.389286] iwlwifi 0000:06:00.0: Detected Intel(R) WiFi Link 5100 AGN, REV=0x54
[ 421.389381] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
[ 421.411615] iwlwifi 0000:06:00.0: device EEPROM VER=0x11e, CALIB=0x4
[ 421.411619] iwlwifi 0000:06:00.0: Device SKU: 0Xf0
[ 421.411629] iwlwifi 0000:06:00.0: Tunable channels: 13 802.11bg, 24 802.11a channels
[ 421.417401] iwlwifi 0000:06:00.0: loaded firmware version 8.83.5.1 build 33692 (EXP)
[ 421.417563] Registered led device: phy1-led
[ 421.417859] ieee80211 phy1: Selected rate control algorithm 'iwl-agn-rs'
[ 421.431467] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
[ 421.435641] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
[ 421.553374] iwlwifi 0000:06:00.0: L1 Enabled; Disabling L0S
[ 421.556448] iwlwifi 0000:06:00.0: Radio type=0x1-0x2-0x0
[ 429.248438] wlan0: authenticate with 00:0a:79:eb:56:10 (try 1)
[ 429.255782] wlan0: authenticated
[ 429.257063] wlan0: associate with 00:0a:79:eb:56:10 (try 1)
[ 429.262801] wlan0: RX AssocResp from 00:0a:79:eb:56:10 (capab=0x411 status=0 aid=1)
[ 429.262806] wlan0: associated

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
HODNET (n.)
The wooden safety platform supported by scaffolding round a building
under construction from which the builders (at almost no personal
risk) can drop pieces of cement on passers-by.
--- Douglas Adams, The Meaning of Liff

2011-11-24 06:47:41

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On Wed, Nov 23, 2011 at 03:55, Norbert Preining <[email protected]> wrote:
> On Di, 22 Nov 2011, Emmanuel Grumbach wrote:
>> This is completely broken. This log looks really bad. AGG are enabled
>> and disabled right away, messing the queues completely.
>
> I can easily reproduce that here, today after reboot I saw something more:
> [ ? 82.894328] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Deactivate
> AC/CMD Queue 11 on FIFO 0
> [ ? 83.072243] iwlwifi 0000:06:00.0: U iwl_trans_pcie_tx_agg_alloc HW queue is NOT empty: 85 packets in HW queue
> [ ?288.039768] iwlwifi 0000:06:00.0: U iwl_trans_tx_queue_set_status Deactivate AC/CMD Queue 11 on FIFO 0
> [ ?288.039884] wlan0: deauthenticating from 00:0a:79:eb:56:10 by local choice (reason=3)
> [ ?288.040218] iwlwifi 0000:06:00.0: I iwl_stop_queue Stop hwq 2 ac 2 stop count 1. Queue is full
> [ ?288.052312] iwlwifi 0000:06:00.0: I iwl_wake_queue Wake hwq 2 ac 2. Packets reclaimed
> [ ?288.052542] iwlwifi 0000:06:00.0: I iwl_stop_queue Stop hwq 2 ac 2 stop count 1. Queue is full
> [ ?288.064074] iwlwifi 0000:06:00.0: I iwl_wake_queue Wake hwq 2 ac 2. Packets reclaimed
> [ ?288.064301] iwlwifi 0000:06:00.0: I iwl_stop_queue Stop hwq 2 ac 2 stop count 1. Queue is full
> [ ?288.074618] iwlwifi 0000:06:00.0: I iwl_wake_queue Wake hwq 2 ac 2. Packets reclaimed
>
>> Will send another patch later.
>

Please apply the patch attached on top of the one I previously sent.
Let me know I you have trouble with patch.
Please also enable the MAC80211_HT_DEBUG compilation flag
I also the (HT + TXQUEUES) debug flag of the driver: modprobe iwlwifi
debug=0x84000000

Thanks

2011-11-08 08:07:27

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

>
> In short, this is the output of your patch:
> [ ?215.872090] iwlwifi 0000:06:00.0: Queue 11 stuck for 10000 ms.
> [ ?215.872101] iwlwifi 0000:06:00.0: Current SW read_ptr 105 write_ptr 115
> [ ?215.872156] iwlwifi 0000:06:00.0: Current HW read_ptr 105 write_ptr 115
> [ ?215.872164] iwlwifi 0000:06:00.0: On demand firmware reload
>
>
Ok thanks for that. The HW and the SW agree then...

I would like to see what happens in terms or power.
Please set the power debug flag: modprobe iwlwifi debug=0x100

Thanks.

2011-11-28 03:56:50

by Norbert Preining

[permalink] [raw]
Subject: Re: iwlagn is getting very shaky

On So, 27 Nov 2011, Emmanuel Grumbach wrote:
> Norbert, could you please test Nikolay's patch instead of mine ?

Looks promising, I will do more testing on the way.

Are you still interested in the output from a kernel wiht all the
patches (the dump_stack() etc etc)? There are actually quite a lot
of stacks dumped now into my dmesg/syslog ;-)

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan TeX Live & Debian Developer
DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
VIRGINSTOW (n.)
A Durex machine which doesn't have the phrase 'So was the Titanic'
scrawled on it. The word has now fallen into disuse.
--- Douglas Adams, The Meaning of Liff