2009-04-27 00:01:01

by Niel Lambrechts

Subject: 2.6.30-rc3: iwlagn probe timeouts (regression)

Hi,

In earlier 2.6.30 (rc1 and 2) kernel tests I suffered the "beacon loss,
sending probe request" problem, during which my connection would
intermittently fail and reconnect. With the latest git kernel which
seems to include a patch (ad935687dbe7307f5abd9e3f610a965a287324a9) for
at least some of this, my card (Intel 5300AGN REV=0x24) completely fails
to associate and I get:

Apr 27 01:32:36 linux-7vph kernel: Registered led device: iwl-phy0::radio
Apr 27 01:32:36 linux-7vph kernel: Registered led device: iwl-phy0::assoc
Apr 27 01:32:36 linux-7vph kernel: Registered led device: iwl-phy0::RX
Apr 27 01:32:36 linux-7vph kernel: Registered led device: iwl-phy0::TX
Apr 27 01:32:36 linux-7vph kernel: ADDRCONF(NETDEV_UP): wlan0: link is not ready
Apr 27 01:32:40 linux-7vph sudo: niella : TTY=pts/2 ; PWD=/home/niella ; USER=root ; COMMAND=/usr/bin/tail -f /var/log/messages
Apr 27 01:33:16 linux-7vph kernel: wlan0: direct probe to AP 00:1d:92:1d:1e:8e try 1
Apr 27 01:33:16 linux-7vph kernel: wlan0: direct probe to AP 00:1d:92:1d:1e:8e try 2
Apr 27 01:33:16 linux-7vph kernel: wlan0: direct probe to AP 00:1d:92:1d:1e:8e try 3
Apr 27 01:33:16 linux-7vph kernel: wlan0: direct probe to AP 00:1d:92:1d:1e:8e timed out
Apr 27 01:33:29 linux-7vph kernel: wlan0: direct probe to AP 00:1d:92:1d:1e:8e try 1
Apr 27 01:33:29 linux-7vph kernel: wlan0: direct probe to AP 00:1d:92:1d:1e:8e try 1
Apr 27 01:33:29 linux-7vph kernel: wlan0: direct probe to AP 00:1d:92:1d:1e:8e try 2
Apr 27 01:33:29 linux-7vph kernel: wlan0: direct probe to AP 00:1d:92:1d:1e:8e try 3
Apr 27 01:33:29 linux-7vph kernel: wlan0: direct probe to AP 00:1d:92:1d:1e:8e timed out
Apr 27 01:33:42 linux-7vph kernel: wlan0: direct probe to AP 00:1d:92:1d:1e:8e try 1
Apr 27 01:33:42 linux-7vph kernel: wlan0: direct probe to AP 00:1d:92:1d:1e:8e try 1
Apr 27 01:33:42 linux-7vph kernel: wlan0: direct probe to AP 00:1d:92:1d:1e:8e try 2
Apr 27 01:33:42 linux-7vph kernel: wlan0: direct probe to AP 00:1d:92:1d:1e:8e try 3
Apr 27 01:33:43 linux-7vph kernel: wlan0: direct probe to AP 00:1d:92:1d:1e:8e timed out

I do not have any similar problems with 2.6.29 or 2.6.28 at the exact
same physical location.

I can try to bisect in a day or two if need be.

Regards,
Niel


2009-04-27 17:13:26

by Reinette Chatre

Subject: Re: 2.6.30-rc3: iwlagn probe timeouts (regression)

Hi Niel,

On Sun, 2009-04-26 at 17:00 -0700, Niel Lambrechts wrote:
> Hi,
>
> In earlier 2.6.30 (rc1 and 2) kernel tests I suffered the "beacon loss,
> sending probe request" problem, during which my connection would
> intermittently fail and reconnect. With the latest git kernel which
> seems to include a patch (ad935687dbe7307f5abd9e3f610a965a287324a9) for
> at least some of this, my card (Intel 5300AGN REV=0x24) completely fails
> to associate and I get:
>
> Apr 27 01:32:36 linux-7vph kernel: Registered led device: iwl-phy0::radio
> Apr 27 01:32:36 linux-7vph kernel: Registered led device: iwl-phy0::assoc
> Apr 27 01:32:36 linux-7vph kernel: Registered led device: iwl-phy0::RX
> Apr 27 01:32:36 linux-7vph kernel: Registered led device: iwl-phy0::TX
> Apr 27 01:32:36 linux-7vph kernel: ADDRCONF(NETDEV_UP): wlan0: link is
> not ready
> Apr 27 01:32:40 linux-7vph sudo: niella : TTY=pts/2 ; PWD=/home/niella
> ; USER=root ; COMMAND=/usr/bin/tail -f /var/log/messages
> Apr 27 01:33:16 linux-7vph kernel: wlan0: direct probe to AP
> 00:1d:92:1d:1e:8e try 1
> Apr 27 01:33:16 linux-7vph kernel: wlan0: direct probe to AP
> 00:1d:92:1d:1e:8e try 2
> Apr 27 01:33:16 linux-7vph kernel: wlan0: direct probe to AP
> 00:1d:92:1d:1e:8e try 3
> Apr 27 01:33:16 linux-7vph kernel: wlan0: direct probe to AP
> 00:1d:92:1d:1e:8e timed out
> Apr 27 01:33:29 linux-7vph kernel: wlan0: direct probe to AP
> 00:1d:92:1d:1e:8e try 1
> Apr 27 01:33:29 linux-7vph kernel: wlan0: direct probe to AP
> 00:1d:92:1d:1e:8e try 1
> Apr 27 01:33:29 linux-7vph kernel: wlan0: direct probe to AP
> 00:1d:92:1d:1e:8e try 2
> Apr 27 01:33:29 linux-7vph kernel: wlan0: direct probe to AP
> 00:1d:92:1d:1e:8e try 3
> Apr 27 01:33:29 linux-7vph kernel: wlan0: direct probe to AP
> 00:1d:92:1d:1e:8e timed out
> Apr 27 01:33:42 linux-7vph kernel: wlan0: direct probe to AP
> 00:1d:92:1d:1e:8e try 1
> Apr 27 01:33:42 linux-7vph kernel: wlan0: direct probe to AP
> 00:1d:92:1d:1e:8e try 1
> Apr 27 01:33:42 linux-7vph kernel: wlan0: direct probe to AP
> 00:1d:92:1d:1e:8e try 2
> Apr 27 01:33:42 linux-7vph kernel: wlan0: direct probe to AP
> 00:1d:92:1d:1e:8e try 3
> Apr 27 01:33:43 linux-7vph kernel: wlan0: direct probe to AP
> 00:1d:92:1d:1e:8e timed out
>
> I do not have any similar problems with 2.6.29 or 2.6.28 at the exact
> same physical location.
>
> I can try to bisect in a day or two if need be.

From what I can tell we did not send any iwlagn patches to 2.6.30 that
are related to this behavior. Please do try a bisect.

Thank you

Reinette



2009-04-29 08:28:44

by Johannes Berg

Subject: Re: 2.6.30-rc3: iwlagn probe timeouts (regression)

On Wed, 2009-04-29 at 01:20 +0200, Niel Lambrechts wrote:

> Thanks for the help, reverting the commit did indeed fix things for me -
> I tested that earlier this evening with the latest git kernel...

Ok, thanks.

> > Scratch that, try this patch instead. Sorry, stupid mistake! mac80211
> > never asks the driver to set a txpower level, and keeps the variable set
> > to 0, but the driver looks at it anyway. Bug on both accounts, I guess,
> > but mac80211 should set the variable and tell the driver anyway.
> >
> and so does your patch, although I had to patch by hand, as my version
> of the file still looked like this:
>
> local->hw.conf.long_frame_max_tx_count = 4;
> local->hw.conf.short_frame_max_tx_count = 7;
> local->hw.conf.radio_enabled = true;
> + local->user_power_level = -1;
>
> Can I more or less bargain that this fix will make it in time before the
> final 2.6.30 kernel is released?

Yes. Thanks for testing that too. I'm sorry that it was such a stupid
bug and you spent so much time hunting it.

johannes



2009-04-28 23:20:43

by Niel Lambrechts

Subject: Re: 2.6.30-rc3: iwlagn probe timeouts (regression)

On 04/28/2009 11:47 PM, Johannes Berg wrote:
> On Tue, 2009-04-28 at 23:42 +0200, Johannes Berg wrote:
>
>
>> Ok, that's confusing. It doesn't even change any code that is normally
>> executed, at least not significantly since local->user_power_level is
>> usually 0; checking
>> if (local->user_power_level)
>> vs. checking
>> if (local->user_power_level >= 0)
>> shouldn't make a difference in that case (although I admit that I forgot
>> a few cases in that commit, will fix).
>>
>> Can you please verify that the code behaves correctly if you revert just
>> this commit? Unless you're playing with "iwconfig wlan0 txpower .." I
>> don't see a reason for this to cause a problem.
>>

Hi Johannes,

Thanks for the help, reverting the commit did indeed fix things for me -
I tested that earlier this evening with the latest git kernel...
>
> Scratch that, try this patch instead. Sorry, stupid mistake! mac80211
> never asks the driver to set a txpower level, and keeps the variable set
> to 0, but the driver looks at it anyway. Bug on both accounts, I guess,
> but mac80211 should set the variable and tell the driver anyway.
>
and so does your patch, although I had to patch by hand, as my version
of the file still looked like this:

local->hw.conf.long_frame_max_tx_count = 4;
local->hw.conf.short_frame_max_tx_count = 7;
local->hw.conf.radio_enabled = true;
+ local->user_power_level = -1;

Can I more or less bargain that this fix will make it in time before the
final 2.6.30 kernel is released? (Sorry, I'm just not certain how the
rules work around test cycles that may be involved, if any, for such a
trivial issue.)

Regards,
Niel

2009-04-28 21:43:14

by Johannes Berg

Subject: Re: 2.6.30-rc3: iwlagn probe timeouts (regression)

On Tue, 2009-04-28 at 22:58 +0200, Niel Lambrechts wrote:

> >> Apr 27 01:33:43 linux-7vph kernel: wlan0: direct probe to AP
> >> 00:1d:92:1d:1e:8e timed out

(etc)

> Okidoki, I've managed to bisect it:
>
> commit 47afbaf5af9454a7a1a64591e20cbfcc27ca67a8
> Author: Johannes Berg <[email protected]>
> Date: Tue Apr 7 15:22:28 2009 +0200
>
> mac80211: correct wext transmit power handler

Ok, that's confusing. It doesn't even change any code that is normally
executed, at least not significantly since local->user_power_level is
usually 0; checking
if (local->user_power_level)
vs. checking
if (local->user_power_level >= 0)
shouldn't make a difference in that case (although I admit that I forgot
a few cases in that commit, will fix).

Can you please verify that the code behaves correctly if you revert just
this commit? Unless you're playing with "iwconfig wlan0 txpower .." I
don't see a reason for this to cause a problem.

johannes



2009-04-28 20:58:41

by Niel Lambrechts

Subject: Re: 2.6.30-rc3: iwlagn probe timeouts (regression)

On 04/27/2009 07:19 PM, reinette chatre wrote:
> Hi Niel,
>
> On Sun, 2009-04-26 at 17:00 -0700, Niel Lambrechts wrote:
>
>> Hi,
>>
>> In earlier 2.6.30 (rc1 and 2) kernel tests I suffered the "beacon loss,
>> sending probe request" problem, during which my connection would
>> intermittently fail and reconnect. With the latest git kernel which
>> seems to include a patch (ad935687dbe7307f5abd9e3f610a965a287324a9) for
>> at least some of this, my card (Intel 5300AGN REV=0x24) completely fails
>> to associate and I get:
>>
>> Apr 27 01:32:36 linux-7vph kernel: Registered led device: iwl-phy0::radio
>> Apr 27 01:32:36 linux-7vph kernel: Registered led device: iwl-phy0::assoc
>> Apr 27 01:32:36 linux-7vph kernel: Registered led device: iwl-phy0::RX
>> Apr 27 01:32:36 linux-7vph kernel: Registered led device: iwl-phy0::TX
>> Apr 27 01:32:36 linux-7vph kernel: ADDRCONF(NETDEV_UP): wlan0: link is
>> not ready
>> Apr 27 01:32:40 linux-7vph sudo: niella : TTY=pts/2 ; PWD=/home/niella
>> ; USER=root ; COMMAND=/usr/bin/tail -f /var/log/messages
>> Apr 27 01:33:16 linux-7vph kernel: wlan0: direct probe to AP
>> 00:1d:92:1d:1e:8e try 1
>> Apr 27 01:33:16 linux-7vph kernel: wlan0: direct probe to AP
>> 00:1d:92:1d:1e:8e try 2
>> Apr 27 01:33:16 linux-7vph kernel: wlan0: direct probe to AP
>> 00:1d:92:1d:1e:8e try 3
>> Apr 27 01:33:16 linux-7vph kernel: wlan0: direct probe to AP
>> 00:1d:92:1d:1e:8e timed out
>> Apr 27 01:33:29 linux-7vph kernel: wlan0: direct probe to AP
>> 00:1d:92:1d:1e:8e try 1
>> Apr 27 01:33:29 linux-7vph kernel: wlan0: direct probe to AP
>> 00:1d:92:1d:1e:8e try 1
>> Apr 27 01:33:29 linux-7vph kernel: wlan0: direct probe to AP
>> 00:1d:92:1d:1e:8e try 2
>> Apr 27 01:33:29 linux-7vph kernel: wlan0: direct probe to AP
>> 00:1d:92:1d:1e:8e try 3
>> Apr 27 01:33:29 linux-7vph kernel: wlan0: direct probe to AP
>> 00:1d:92:1d:1e:8e timed out
>> Apr 27 01:33:42 linux-7vph kernel: wlan0: direct probe to AP
>> 00:1d:92:1d:1e:8e try 1
>> Apr 27 01:33:42 linux-7vph kernel: wlan0: direct probe to AP
>> 00:1d:92:1d:1e:8e try 1
>> Apr 27 01:33:42 linux-7vph kernel: wlan0: direct probe to AP
>> 00:1d:92:1d:1e:8e try 2
>> Apr 27 01:33:42 linux-7vph kernel: wlan0: direct probe to AP
>> 00:1d:92:1d:1e:8e try 3
>> Apr 27 01:33:43 linux-7vph kernel: wlan0: direct probe to AP
>> 00:1d:92:1d:1e:8e timed out
>>
>> I do not have any similar problems with 2.6.29 or 2.6.28 at the exact
>> same physical location.
>>
>> I can try to bisect in a day or two if need be.
>>
>
> From what I can tell we did not send any iwlagn patches to 2.6.30 that
> are related to this behavior. Please do try a bisect.
>
Okidoki, I've managed to bisect it:

commit 47afbaf5af9454a7a1a64591e20cbfcc27ca67a8
Author: Johannes Berg <[email protected]>
Date: Tue Apr 7 15:22:28 2009 +0200

mac80211: correct wext transmit power handler
---------

Under normal working circumstances, I get a reported 30-42% signal
strength to my wireless router. With the problematic kernels I just keep
getting probe attempts followed by the timeout, as per my original post.

Regards,
Niel

2009-04-28 21:48:06

by Johannes Berg

Subject: Re: 2.6.30-rc3: iwlagn probe timeouts (regression)

On Tue, 2009-04-28 at 23:42 +0200, Johannes Berg wrote:

> Ok, that's confusing. It doesn't even change any code that is normally
> executed, at least not significantly since local->user_power_level is
> usually 0; checking
> if (local->user_power_level)
> vs. checking
> if (local->user_power_level >= 0)
> shouldn't make a difference in that case (although I admit that I forgot
> a few cases in that commit, will fix).
>
> Can you please verify that the code behaves correctly if you revert just
> this commit? Unless you're playing with "iwconfig wlan0 txpower .." I
> don't see a reason for this to cause a problem.

Scratch that, try this patch instead. Sorry, stupid mistake! mac80211
never asks the driver to set a txpower level, and keeps the variable set
to 0, but the driver looks at it anyway. Bug on both accounts, I guess,
but mac80211 should set the variable and tell the driver anyway.

johannes

---
net/mac80211/main.c | 1 +
1 file changed, 1 insertion(+)

--- wireless-testing.orig/net/mac80211/main.c 2009-04-28 23:43:49.000000000 +0200
+++ wireless-testing/net/mac80211/main.c 2009-04-28 23:47:22.000000000 +0200
@@ -764,6 +764,7 @@ struct ieee80211_hw *ieee80211_alloc_hw(
 	local->hw.conf.long_frame_max_tx_count = wiphy->retry_long;
 	local->hw.conf.short_frame_max_tx_count = wiphy->retry_short;
 	local->hw.conf.radio_enabled = true;
+	local->user_power_level = -1;

 	INIT_LIST_HEAD(&local->interfaces);
 	mutex_init(&local->iflist_mtx);
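
The effect of the one-line fix can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the mac80211 code: `fake_local`, `fake_alloc`, `power_old_check` and `power_new_check` are hypothetical names, and any channel limit passed in is a made-up value. It assumes the state starts zeroed and that a negative `user_power_level` means "no user setting".

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the per-device state. */
struct fake_local {
    int user_power_level; /* dBm requested by the user; negative = not set */
};

static int min_int(int a, int b)
{
    return a < b ? a : b;
}

/*
 * Model of allocation: the state starts zeroed. If init_unset is
 * nonzero, apply the fix from the patch and mark the level as unset.
 */
static struct fake_local fake_alloc(int init_unset)
{
    struct fake_local l;

    memset(&l, 0, sizeof(l));
    if (init_unset)
        l.user_power_level = -1;
    return l;
}

/* Older style of check: a level of 0 doubles as "not set". */
static int power_old_check(const struct fake_local *l, int chan_max)
{
    assert(chan_max >= 0);
    if (l->user_power_level)
        return min_int(chan_max, l->user_power_level);
    return chan_max;
}

/* Newer style of check: any level >= 0 counts as a user request. */
static int power_new_check(const struct fake_local *l, int chan_max)
{
    assert(chan_max >= 0);
    if (l->user_power_level >= 0)
        return min_int(chan_max, l->user_power_level);
    return chan_max;
}
```

With a zeroed structure, `power_old_check` falls through to the channel limit, while `power_new_check` treats the 0 as a real request and clamps the result to 0 dBm, which would be consistent with probes that never get an answer. Initialising the field to -1, as in the patch, makes `power_new_check` return the channel limit again.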