2009-11-03 23:38:53

by Andrew Lutomirski

Subject: iwlwifi connection troubles, maybe aggregation related

Hi all-

My laptop (Intel 5350) has trouble using the wireless networks here.
I'm at MIT, which has a bunch of Cisco 1250 AP's (dual-band, MIMO,
etc). Running Windows, everything works perfectly. On Linux
(2.6.31-rc5, but I've seen problems with other, older kernels as
well), it sometimes works, but I frequently find the network almost
completely unusable. I can associate and ping just fine, but, as soon
as I try to send any significant amount of data, I can no longer
transmit. I can still receive both broadcast and unicast frames, but
the network doesn't see anything I send. An older laptop (presumably
with 4965,

This seems to be correlated with a line like:

iwlagn 0000:03:00.0: iwl_tx_agg_start on ra = 00:21:d8:49:4a:52 tid = 0

appearing in dmesg.
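
To check whether the stall really lines up with aggregation starting, the log can be scanned for that message. This is a minimal sketch: the pattern is modeled only on the single line quoted above (other driver versions may word it differently), and the helper name find_agg_starts is made up for illustration.

```python
import re

# Pattern modeled on the one log line quoted in this thread; it is an
# assumption that other kernels print the message in the same format.
AGG_START = re.compile(
    r"iwl_tx_agg_start on ra = (?P<ra>(?:[0-9a-f]{2}:){5}[0-9a-f]{2})"
    r" tid = (?P<tid>\d+)",
    re.IGNORECASE,
)


def find_agg_starts(log_text):
    """Return a list of (receiver_address, tid) for each matching line."""
    return [
        (m.group("ra").lower(), int(m.group("tid")))
        for m in AGG_START.finditer(log_text)
    ]


sample = (
    "iwlagn 0000:03:00.0: iwl_tx_agg_start on ra = 00:21:d8:49:4a:52 tid = 0\n"
)
for ra, tid in find_agg_starts(sample):
    print(f"aggregation started: ra={ra} tid={tid}")
```

Feeding it the saved output of dmesg, together with the time the connection stopped transmitting, would show whether the two events coincide.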

Running "iw dev wlan0 disconnect" will make the connection start
working until I try to send data again (presumably because either NM
or wpa_supplicant will reassociate).

Turning on or off power management and fiddling with
no_sleep_autoadjust makes no difference. Setting tx_agg_tid_enable to
zero in debugfs while the connection was working seemed to make it a
little more reliable (it lasted long enough to do "git pull" but not
much longer).

After running "iw dev wlan0 disconnect" a few times, I start to get
errors like this:

[18078.209635] iwlagn 0000:03:00.0: SENSITIVITY_CMD failed
[18078.313461] iwlagn 0000:03:00.0: No space for Tx
[18078.313467] iwlagn 0000:03:00.0: Error sending SENSITIVITY_CMD: enqueue_hcmd failed: -28
[18078.313470] iwlagn 0000:03:00.0: SENSITIVITY_CMD failed
[18078.522409] iwlagn 0000:03:00.0: No space for Tx
[18078.522414] iwlagn 0000:03:00.0: Error sending SENSITIVITY_CMD: enqueue_hcmd failed: -28
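
The -28 in those errors looks like a negated errno value. Assuming the driver follows the usual kernel convention of returning -errno, it can be decoded as below; on Linux, 28 is ENOSPC, which would be consistent with the "No space for Tx" line printed just before it. The helper name describe_errno is made up for illustration, and Python reports the host's userspace errno table, which matches the kernel's values on Linux.

```python
import errno
import os


def describe_errno(ret):
    """Map a kernel-style return value (e.g. -28) to (name, message)."""
    code = -ret if ret < 0 else ret
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


name, message = describe_errno(-28)
print(name, "-", message)
```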

The driver doesn't recover until I do "echo 1 >
/sys/kernel/debug/ieee80211/phy0/reset". Oddly enough, after resetting
just now, I couldn't trigger the failure again, even though it was
100% reproducible before resetting.

Thanks,
Andy


2009-11-06 19:45:04

by Andrew Lutomirski

Subject: Re: iwlwifi connection troubles, maybe aggregation related

On Wed, Nov 4, 2009 at 4:30 AM, Johannes Berg <[email protected]> wrote:
> On Tue, 2009-11-03 at 18:38 -0500, Andrew Lutomirski wrote:
>
>> iwlagn 0000:03:00.0: iwl_tx_agg_start on ra = 00:21:d8:49:4a:52 tid = 0
>
>> Turning on or off power management and fiddling with
>> no_sleep_autoadjust makes no difference. Setting tx_agg_tid_enable to
>> zero in debugfs while the connection was working seemed to make it a
>> little more reliable (it lasted long enough to do "git pull" but not
>> much longer).
>>
>> After running "iw dev wlan0 disconnect" a few times, I start to get
>> errors like this:
>>
>> [18078.209635] iwlagn 0000:03:00.0: SENSITIVITY_CMD failed
>> [18078.313461] iwlagn 0000:03:00.0: No space for Tx
>> [18078.313467] iwlagn 0000:03:00.0: Error sending SENSITIVITY_CMD:
>> enqueue_hcmd failed: -28
>> [18078.313470] iwlagn 0000:03:00.0: SENSITIVITY_CMD failed
>> [18078.522409] iwlagn 0000:03:00.0: No space for Tx
>> [18078.522414] iwlagn 0000:03:00.0: Error sending SENSITIVITY_CMD:
>> enqueue_hcmd failed: -28
>
> Sounds like the firmware messes up ...
>
> Maybe as a first workaround you could modprobe iwlagn with
> 11n_disable=1. But I don't know at this point what the problem could be.

11n_disable50=1 seems to work. I'll reply again if it stops working.
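
For anyone trying the same workaround, it helps to confirm the option actually took effect after reloading the module. The sketch below assumes the parameter is exported read-only under /sys/module/iwlagn/parameters/, which is how module parameters normally appear when the driver gives them non-zero permissions; whether this driver version does so is an assumption. The helper name read_module_param is made up for illustration.

```python
from pathlib import Path


def read_module_param(module, param, sysfs_root="/sys/module"):
    """Return the parameter's current value as a string.

    Returns None when the file cannot be read, for example because the
    module is not loaded or the parameter is not exported in sysfs.
    """
    path = Path(sysfs_root) / module / "parameters" / param
    try:
        return path.read_text().strip()
    except OSError:
        return None


value = read_module_param("iwlagn", "11n_disable50")
if value is None:
    print("11n_disable50: not exported, or iwlagn is not loaded")
else:
    print("11n_disable50 =", value)
```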

Do the Intel folks have any ideas?

Thanks,
Andy

2009-11-17 22:03:54

by Andrew Lutomirski

Subject: Re: iwlwifi connection troubles, maybe aggregation related

Greg: I'm not sure your bug is the same as mine. I don't get oopses.

This bug has been here for a long time and it's getting old, so it's
now #2120 :)

http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=2120

--Andy

On Fri, Nov 6, 2009 at 2:45 PM, Andrew Lutomirski <[email protected]> wrote:
> On Wed, Nov 4, 2009 at 4:30 AM, Johannes Berg <[email protected]> wrote:
>> On Tue, 2009-11-03 at 18:38 -0500, Andrew Lutomirski wrote:
>>
>>> iwlagn 0000:03:00.0: iwl_tx_agg_start on ra = 00:21:d8:49:4a:52 tid = 0
>>
>>> Turning on or off power management and fiddling with
>>> no_sleep_autoadjust makes no difference. Setting tx_agg_tid_enable to
>>> zero in debugfs while the connection was working seemed to make it a
>>> little more reliable (it lasted long enough to do "git pull" but not
>>> much longer).
>>>
>>> After running "iw dev wlan0 disconnect" a few times, I start to get
>>> errors like this:
>>>
>>> [18078.209635] iwlagn 0000:03:00.0: SENSITIVITY_CMD failed
>>> [18078.313461] iwlagn 0000:03:00.0: No space for Tx
>>> [18078.313467] iwlagn 0000:03:00.0: Error sending SENSITIVITY_CMD:
>>> enqueue_hcmd failed: -28
>>> [18078.313470] iwlagn 0000:03:00.0: SENSITIVITY_CMD failed
>>> [18078.522409] iwlagn 0000:03:00.0: No space for Tx
>>> [18078.522414] iwlagn 0000:03:00.0: Error sending SENSITIVITY_CMD:
>>> enqueue_hcmd failed: -28
>>
>> Sounds like the firmware messes up ...
>>
>> Maybe as a first workaround you could modprobe iwlagn with
>> 11n_disable=1. But I don't know at this point what the problem could be.
>
> 11n_disable50=1 seems to work. I'll reply again if it stops working.
>
> Do the Intel folks have any ideas?
>
> Thanks,
> Andy
>

2009-11-04 09:31:35

by Johannes Berg

Subject: Re: iwlwifi connection troubles, maybe aggregation related

On Tue, 2009-11-03 at 18:38 -0500, Andrew Lutomirski wrote:

> iwlagn 0000:03:00.0: iwl_tx_agg_start on ra = 00:21:d8:49:4a:52 tid = 0

> Turning on or off power management and fiddling with
> no_sleep_autoadjust makes no difference. Setting tx_agg_tid_enable to
> zero in debugfs while the connection was working seemed to make it a
> little more reliable (it lasted long enough to do "git pull" but not
> much longer).
>
> After running "iw dev wlan0 disconnect" a few times, I start to get
> errors like this:
>
> [18078.209635] iwlagn 0000:03:00.0: SENSITIVITY_CMD failed
> [18078.313461] iwlagn 0000:03:00.0: No space for Tx
> [18078.313467] iwlagn 0000:03:00.0: Error sending SENSITIVITY_CMD:
> enqueue_hcmd failed: -28
> [18078.313470] iwlagn 0000:03:00.0: SENSITIVITY_CMD failed
> [18078.522409] iwlagn 0000:03:00.0: No space for Tx
> [18078.522414] iwlagn 0000:03:00.0: Error sending SENSITIVITY_CMD:
> enqueue_hcmd failed: -28

Sounds like the firmware messes up ...

Maybe as a first workaround you could modprobe iwlagn with
11n_disable=1. But I don't know at this point what the problem could be.

johannes



2009-11-04 11:40:13

by Greg Oliver

Subject: Re: iwlwifi connection troubles, maybe aggregation related

On Wed, Nov 4, 2009 at 4:59 AM, Greg Oliver <[email protected]> wrote:
> On Wed, Nov 4, 2009 at 3:30 AM, Johannes Berg <[email protected]> wrote:
>> On Tue, 2009-11-03 at 18:38 -0500, Andrew Lutomirski wrote:
>>
>>> iwlagn 0000:03:00.0: iwl_tx_agg_start on ra = 00:21:d8:49:4a:52 tid = 0
>>
>>> Turning on or off power management and fiddling with
>>> no_sleep_autoadjust makes no difference.  Setting tx_agg_tid_enable to
>>> zero in debugfs while the connection was working seemed to make it a
>>> little more reliable (it lasted long enough to do "git pull" but not
>>> much longer).
>>>
>>> After running "iw dev wlan0 disconnect" a few times, I start to get
>>> errors like this:
>>>
>>> [18078.209635] iwlagn 0000:03:00.0: SENSITIVITY_CMD failed
>>> [18078.313461] iwlagn 0000:03:00.0: No space for Tx
>>> [18078.313467] iwlagn 0000:03:00.0: Error sending SENSITIVITY_CMD:
>>> enqueue_hcmd failed: -28
>>> [18078.313470] iwlagn 0000:03:00.0: SENSITIVITY_CMD failed
>>> [18078.522409] iwlagn 0000:03:00.0: No space for Tx
>>> [18078.522414] iwlagn 0000:03:00.0: Error sending SENSITIVITY_CMD:
>>> enqueue_hcmd failed: -28
>>
>> Sounds like the firmware messes up ...
>>
>> Maybe as a first workaround you could modprobe iwlagn with
>> 11n_disable=1. But I don't know at this point what the problem could be.
>>
>> johannes
>>
>
> I am also still getting oops' with -rc5..
>
> [36759.355012] Pid: 0, comm: swapper Tainted: P
> 2.6.31-14-generic #48-Ubuntu
> [36759.355018] Call Trace:
> [36759.355022]  <IRQ>  [<ffffffff810e083c>] __alloc_pages_slowpath+0x4cc/0x4e0
> [36759.355047]  [<ffffffff810e099e>] __alloc_pages_nodemask+0x14e/0x150
> [36759.355059]  [<ffffffff81111dfa>] kmalloc_large_node+0x5a/0xb0
> [36759.355067]  [<ffffffff81115fa5>] __kmalloc_node_track_caller+0x135/0x180
> [36759.355109]  [<ffffffffa03c7977>] ? iwl_rx_allocate+0x197/0x2f0 [iwlcore]
> [36759.355121]  [<ffffffff8142e41b>] __alloc_skb+0x7b/0x180
> [36759.355146]  [<ffffffffa03c7977>] iwl_rx_allocate+0x197/0x2f0 [iwlcore]
> [36759.355171]  [<ffffffffa03c8de6>] iwl_rx_replenish_now+0x16/0x30 [iwlcore]
> [36759.355191]  [<ffffffffa03e4f18>] iwl_rx_handle+0x288/0x2f0 [iwlagn]
> [36759.355208]  [<ffffffffa03e5708>] iwl_irq_tasklet+0x138/0x4e0 [iwlagn]
> [36759.355220]  [<ffffffff810741e0>] ? delayed_work_timer_fn+0x0/0x40
> [36759.355228]  [<ffffffff81073d82>] ? insert_work+0x72/0xc0
> [36759.355239]  [<ffffffff81036419>] ? default_spin_lock_flags+0x9/0x10
> [36759.355248]  [<ffffffff81063ee0>] tasklet_action+0xd0/0xe0
> [36759.355257]  [<ffffffff8106549d>] __do_softirq+0xbd/0x200
> [36759.355266]  [<ffffffff810131ec>] call_softirq+0x1c/0x30
> [36759.355273]  [<ffffffff81014bc5>] do_softirq+0x55/0x90
> [36759.355281]  [<ffffffff81065205>] irq_exit+0x85/0x90
> [36759.355287]  [<ffffffff81014100>] do_IRQ+0x70/0xe0
> [36759.355296]  [<ffffffff81012a13>] ret_from_intr+0x0/0x11
> [36759.355301]  <EOI>  [<ffffffff812d7ed9>] ? acpi_idle_enter_bm+0x28b/0x2bf
> [36759.355317]  [<ffffffff812d7ed2>] ? acpi_idle_enter_bm+0x284/0x2bf
> [36759.355327]  [<ffffffff813fe40b>] ? cpuidle_idle_call+0x9b/0xf0
> [36759.355335]  [<ffffffff81010e12>] ? cpu_idle+0xb2/0x100
> [36759.355344]  [<ffffffff81514c56>] ? rest_init+0x66/0x70
> [36759.355355]  [<ffffffff8183a047>] ? start_kernel+0x352/0x35b
> [36759.355364]  [<ffffffff8183959a>] ? x86_64_start_reservations+0x125/0x129
> [36759.355372]  [<ffffffff81839698>] ? x86_64_start_kernel+0xfa/0x109
>
> Same oops' as with -rc4 it seems..  It does not hard lock anything -
> this is not during any suspend/resume..  This laptop is always on..
>
> [36894.090466] iwlagn 0000:04:00.0: Failed to allocate SKB buffer with
> GFP_ATOMIC. Only 0 free buffers remaining.
>
> -Greg
>

Disregard that - I forgot I was not using the -rc5 currently....

2009-11-04 01:47:27

by Greg Oliver

Subject: Re: iwlwifi connection troubles, maybe aggregation related

On Tue, Nov 3, 2009 at 5:38 PM, Andrew Lutomirski <[email protected]> wrote:
> Hi all-
>
> My laptop (Intel 5350) has trouble using the wireless networks here.
> I'm at MIT, which has a bunch of Cisco 1250 AP's (dual-band, MIMO,
> etc).  Running Windows, everything works perfectly.  On Linux
> (2.6.31-rc5, but I've seen problems with other, older kernels as
> well), it sometimes works, but I frequently find the network almost
> completely unusable.  I can associate and ping just fine, but, as soon
> as I try to send any significant amount of data, I can no longer
> transmit.  I can still receive both broadcast and unicast frames, but
> the network doesn't see anything I send.  An older laptop (presumably
> with 4965,
>
> This seems to be correlated with a line like:
>
> iwlagn 0000:03:00.0: iwl_tx_agg_start on ra = 00:21:d8:49:4a:52 tid = 0
>
> appearing in dmesg.
>
> Running "iw dev wlan0 disconnect" will make the connection start
> working until I try to send data again (presumably because either NM
> or wpa_supplicant will reassociate).
>
> Turning on or off power management and fiddling with
> no_sleep_autoadjust makes no difference.  Setting tx_agg_tid_enable to
> zero in debugfs while the connection was working seemed to make it a
> little more reliable (it lasted long enough to do "git pull" but not
> much longer).
>
> After running "iw dev wlan0 disconnect" a few times, I start to get
> errors like this:
>
> [18078.209635] iwlagn 0000:03:00.0: SENSITIVITY_CMD failed
> [18078.313461] iwlagn 0000:03:00.0: No space for Tx
> [18078.313467] iwlagn 0000:03:00.0: Error sending SENSITIVITY_CMD:
> enqueue_hcmd failed: -28
> [18078.313470] iwlagn 0000:03:00.0: SENSITIVITY_CMD failed
> [18078.522409] iwlagn 0000:03:00.0: No space for Tx
> [18078.522414] iwlagn 0000:03:00.0: Error sending SENSITIVITY_CMD:
> enqueue_hcmd failed: -28
>
> The driver doesn't recover until I do "echo 1 >
> /sys/kernel/debug/ieee80211/phy0/reset"  Oddly enough, after resetting
> just now, I couldn't trigger the failure again, even though it was
> 100% reproducible before resetting.

I too have been battling this issue with a 5300 for quite some time..
I *thought* it was fixed with the 2.6.32 compat-wireless series
(excluding the -rc4 which kernel oops'ed), only to find out that it
was because I was farther away from the AP, and negotiating lower
rates...

My card seems to work just fine through MCS13, about 60% at MCS14, and
almost always fails at MCS15. The Intel folks were very helpful in
trying to fix it, but it was never actually resolved. I've been
battling it for so long, though, that I was just prepared to wait for
the magic fix to show up one day :)
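
For reference, the three rates mentioned above differ only in coding rate: MCS13 through MCS15 are all two-stream 64-QAM. The sketch below tabulates the standard 802.11n equal-modulation MCS indices, assuming a 20 MHz channel and an 800 ns guard interval (the actual channel width and guard interval in use here are not known, so the Mb/s figures are illustrative). The helper name describe_mcs is made up for illustration.

```python
# 802.11n MCS 0-7: one spatial stream, 20 MHz, 800 ns guard interval.
# MCS 8-15 reuse the same modulation and coding on two streams, and so on.
SINGLE_STREAM = [
    ("BPSK", "1/2", 6.5),
    ("QPSK", "1/2", 13.0),
    ("QPSK", "3/4", 19.5),
    ("16-QAM", "1/2", 26.0),
    ("16-QAM", "3/4", 39.0),
    ("64-QAM", "2/3", 52.0),
    ("64-QAM", "3/4", 58.5),
    ("64-QAM", "5/6", 65.0),
]


def describe_mcs(index):
    """Return (streams, modulation, coding_rate, mbps) for MCS 0-31."""
    if not 0 <= index <= 31:
        raise ValueError("only MCS 0-31 are covered by this sketch")
    streams = index // 8 + 1
    modulation, coding, rate = SINGLE_STREAM[index % 8]
    return streams, modulation, coding, rate * streams


for mcs in (13, 14, 15):
    print(mcs, describe_mcs(mcs))
```

If the failures really track the MCS index, that would point at the highest coding rates on two streams rather than at two-stream operation as such, but that is only a guess from these three data points.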

My machine (like yours), once reliably connected, will perform 100%
until I reboot or unload the module. It may take 10 insmods to get a
reliable connection, though.

I guess what I'm saying is that I can help debug the issue as well...
It would be very nice to have this go away... I have an icon on my
desktop to reload the wireless subsystem if that tells you how many
times I have to do it until it loads up reliably...

-Greg

2009-11-04 10:59:15

by Greg Oliver

Subject: Re: iwlwifi connection troubles, maybe aggregation related

On Wed, Nov 4, 2009 at 3:30 AM, Johannes Berg <[email protected]> wrote:
> On Tue, 2009-11-03 at 18:38 -0500, Andrew Lutomirski wrote:
>
>> iwlagn 0000:03:00.0: iwl_tx_agg_start on ra = 00:21:d8:49:4a:52 tid = 0
>
>> Turning on or off power management and fiddling with
>> no_sleep_autoadjust makes no difference.  Setting tx_agg_tid_enable to
>> zero in debugfs while the connection was working seemed to make it a
>> little more reliable (it lasted long enough to do "git pull" but not
>> much longer).
>>
>> After running "iw dev wlan0 disconnect" a few times, I start to get
>> errors like this:
>>
>> [18078.209635] iwlagn 0000:03:00.0: SENSITIVITY_CMD failed
>> [18078.313461] iwlagn 0000:03:00.0: No space for Tx
>> [18078.313467] iwlagn 0000:03:00.0: Error sending SENSITIVITY_CMD:
>> enqueue_hcmd failed: -28
>> [18078.313470] iwlagn 0000:03:00.0: SENSITIVITY_CMD failed
>> [18078.522409] iwlagn 0000:03:00.0: No space for Tx
>> [18078.522414] iwlagn 0000:03:00.0: Error sending SENSITIVITY_CMD:
>> enqueue_hcmd failed: -28
>
> Sounds like the firmware messes up ...
>
> Maybe as a first workaround you could modprobe iwlagn with
> 11n_disable=1. But I don't know at this point what the problem could be.
>
> johannes
>

I am also still getting oopses with -rc5..

[36759.355012] Pid: 0, comm: swapper Tainted: P 2.6.31-14-generic #48-Ubuntu
[36759.355018] Call Trace:
[36759.355022] <IRQ> [<ffffffff810e083c>] __alloc_pages_slowpath+0x4cc/0x4e0
[36759.355047] [<ffffffff810e099e>] __alloc_pages_nodemask+0x14e/0x150
[36759.355059] [<ffffffff81111dfa>] kmalloc_large_node+0x5a/0xb0
[36759.355067] [<ffffffff81115fa5>] __kmalloc_node_track_caller+0x135/0x180
[36759.355109] [<ffffffffa03c7977>] ? iwl_rx_allocate+0x197/0x2f0 [iwlcore]
[36759.355121] [<ffffffff8142e41b>] __alloc_skb+0x7b/0x180
[36759.355146] [<ffffffffa03c7977>] iwl_rx_allocate+0x197/0x2f0 [iwlcore]
[36759.355171] [<ffffffffa03c8de6>] iwl_rx_replenish_now+0x16/0x30 [iwlcore]
[36759.355191] [<ffffffffa03e4f18>] iwl_rx_handle+0x288/0x2f0 [iwlagn]
[36759.355208] [<ffffffffa03e5708>] iwl_irq_tasklet+0x138/0x4e0 [iwlagn]
[36759.355220] [<ffffffff810741e0>] ? delayed_work_timer_fn+0x0/0x40
[36759.355228] [<ffffffff81073d82>] ? insert_work+0x72/0xc0
[36759.355239] [<ffffffff81036419>] ? default_spin_lock_flags+0x9/0x10
[36759.355248] [<ffffffff81063ee0>] tasklet_action+0xd0/0xe0
[36759.355257] [<ffffffff8106549d>] __do_softirq+0xbd/0x200
[36759.355266] [<ffffffff810131ec>] call_softirq+0x1c/0x30
[36759.355273] [<ffffffff81014bc5>] do_softirq+0x55/0x90
[36759.355281] [<ffffffff81065205>] irq_exit+0x85/0x90
[36759.355287] [<ffffffff81014100>] do_IRQ+0x70/0xe0
[36759.355296] [<ffffffff81012a13>] ret_from_intr+0x0/0x11
[36759.355301] <EOI> [<ffffffff812d7ed9>] ? acpi_idle_enter_bm+0x28b/0x2bf
[36759.355317] [<ffffffff812d7ed2>] ? acpi_idle_enter_bm+0x284/0x2bf
[36759.355327] [<ffffffff813fe40b>] ? cpuidle_idle_call+0x9b/0xf0
[36759.355335] [<ffffffff81010e12>] ? cpu_idle+0xb2/0x100
[36759.355344] [<ffffffff81514c56>] ? rest_init+0x66/0x70
[36759.355355] [<ffffffff8183a047>] ? start_kernel+0x352/0x35b
[36759.355364] [<ffffffff8183959a>] ? x86_64_start_reservations+0x125/0x129
[36759.355372] [<ffffffff81839698>] ? x86_64_start_kernel+0xfa/0x109

Same oopses as with -rc4 it seems.. It does not hard lock anything -
this is not during any suspend/resume.. This laptop is always on..

[36894.090466] iwlagn 0000:04:00.0: Failed to allocate SKB buffer with GFP_ATOMIC. Only 0 free buffers remaining.
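
To see which driver code is on the path in a trace like the one above, the frames that carry a module tag can be pulled out. This is a minimal sketch: the pattern is modeled on the frame format shown in this message, frames marked with "?" are skipped as unreliable, and the helper name module_frames is made up for illustration.

```python
import re

# Matches frames such as:
#   [<ffffffffa03c7977>] iwl_rx_allocate+0x197/0x2f0 [iwlcore]
# A "?" before the symbol marks a frame the unwinder is unsure about.
FRAME = re.compile(
    r"\[<[0-9a-f]+>\][ \t]+(?P<unreliable>\?[ \t]+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:[ \t]+\[(?P<module>\w+)\])?"
)


def module_frames(trace_text):
    """Return (symbol, module) for reliable frames inside a loadable module."""
    frames = []
    for m in FRAME.finditer(trace_text):
        if m.group("module") and not m.group("unreliable"):
            frames.append((m.group("symbol"), m.group("module")))
    return frames


excerpt = """\
[36759.355109] [<ffffffffa03c7977>] ? iwl_rx_allocate+0x197/0x2f0 [iwlcore]
[36759.355121] [<ffffffff8142e41b>] __alloc_skb+0x7b/0x180
[36759.355146] [<ffffffffa03c7977>] iwl_rx_allocate+0x197/0x2f0 [iwlcore]
[36759.355171] [<ffffffffa03c8de6>] iwl_rx_replenish_now+0x16/0x30 [iwlcore]
[36759.355191] [<ffffffffa03e4f18>] iwl_rx_handle+0x288/0x2f0 [iwlagn]
"""
for symbol, module in module_frames(excerpt):
    print(f"{module}: {symbol}")
```

On this trace the driver frames all sit in the receive path (iwl_rx_handle down to iwl_rx_allocate) and end in the page allocator, which, together with the GFP_ATOMIC message, suggests a failed atomic buffer allocation rather than a crash in the driver itself. That reading is an inference from the trace, not something confirmed here.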

-Greg