2009-05-01 12:00:08

by Maxim Levitsky

[permalink] [raw]
Subject: [BUG] scans can still hang with -EBUSY on iwl3945

Don't know if this is new regression, or just exposed race condition,
but currently using rfkill, and/or s2disk sometimes leaves the card in
state in which it can't scan.

Tested with and without hardware scanning.

It just returns -EBUSY till I reload the driver.

I use iwlwifi.git, iwl3945

Best regards,
Maxim Levitsky



2009-05-01 20:41:18

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [BUG] scans can still hang with -EBUSY on iwl3945

On Fri, 2009-05-01 at 08:58 -0700, reinette chatre wrote:
> Maxim,
>
> On Fri, 2009-05-01 at 05:00 -0700, Maxim Levitsky wrote:
> > Don't know if this is new regression, or just exposed race condition,
> > but currently using rfkill, and/or s2disk sometimes leaves the card in
> > state in which it can't scan.
>
> Could you please provide more details of this scenario?

OK, lets focus on my current setup:

* iwlwifi.git - very close to current head, yesterday HEAD
* disable_hw_scan=1 - this bug is present regardless of this setting,
yet this rules out firmware interactions during rfkill.

* Network Manager - ubuntu 9.04 unmodified.
* wpa_supplicant - close to git HEAD, with one local change:
It is known that on first attempt scans always fail, with same -EBUSY
iwlst scan shows 'Device is busy', supplicant complains about scan
requested (ret=-1), ... When this happens supplicant waits for 10
seconds, and I reduced that wait to 1 second.


When this happens, I can't make the card scan, scans always fail, and I
can only reload the module.

Best regards,
Maxim Levitsky


2009-05-13 08:30:24

by Johannes Berg

[permalink] [raw]
Subject: Re: [BUG] scans can still hang with -EBUSY on iwl3945

On Wed, 2009-05-13 at 03:50 +0300, Maxim Levitsky wrote:

> > Also you could stick some printks into
> > net/mac80211/scan.c:ieee80211_scan_work() at strategic places and see...
> > but if they never happen you haven't turned off hw scan.

> You know, you are right, it turns out that hw scan was active (due to
> all the experimentation, I forgot to turn it off...) and it is obvious
> that iwlwifi has rfkill related bugs there, I handle that later.

Ok.

> But without hw scan, something is still there, I sooneer or later will
> understand.

Ok, I can't say I can see any issue, but if you put printks into the
mac80211 scan code and reproduce it (I don't have rfkill) I'll be happy
to take a look.

johannes


Attachments:
signature.asc (801.00 B)
This is a digitally signed message part

2009-05-12 23:14:45

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [BUG] scans can still hang with -EBUSY on iwl3945

On Wed, 2009-05-13 at 01:12 +0200, Johannes Berg wrote:
> On Wed, 2009-05-13 at 02:10 +0300, Maxim Levitsky wrote:
>
> > > May 13 01:44:33 maxim-laptop kernel: [ 1093.991176] iwl3945 0000:06:00.0: Radio Frequency Kill Switch is On:
> > ^^^^^^^^^^ Kernel notices the rfkill state
> > > May 13 01:44:33 maxim-laptop kernel: [ 1093.991179] Kill switch must be turned off for wireless networking to work.
> > > May 13 01:44:35 maxim-laptop kernel: [ 1095.789455] ieee80211_ioctl_siwessid: called
> > ^^^^^^^^^^ Evil NM notices Rfkill too, and sends bogus bssid to driver
> > > May 13 01:44:35 maxim-laptop kernel: [ 1095.789462] ieee80211_sta_req_auth: called
> > > May 13 01:44:35 maxim-laptop kernel: [ 1095.789498] iwl3945 0000:06:00.0: Error sending REPLY_LEDS_CMD: enqueue_hcmd failed: -5
> > > May 13 01:44:35 maxim-laptop kernel: [ 1095.789530] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> > > May 13 01:44:35 maxim-laptop kernel: [ 1095.789536] iwl3945 0000:06:00.0: Error clearing ASSOC_MSK on current configuration (-5).
> > > May 13 01:44:35 maxim-laptop kernel: [ 1095.870202] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> > > May 13 01:44:35 maxim-laptop kernel: [ 1095.870210] iwl3945 0000:06:00.0: Error clearing ASSOC_MSK on current configuration (-5).
> > > May 13 01:44:35 maxim-laptop kernel: [ 1095.870223] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> > > May 13 01:44:35 maxim-laptop kernel: [ 1095.870227] iwl3945 0000:06:00.0: Error clearing ASSOC_MSK on current configuration (-5).
> > > May 13 01:44:35 maxim-laptop kernel: [ 1095.870234] wlan0: direct probe to AP 00:0e:2e:a4:12:3e try 1
> > > May 13 01:44:35 maxim-laptop kernel: [ 1095.960876] cfg80211: scan requested
> > ^^^^^^^^^^^ This results in a scan
> > > May 13 01:44:35 maxim-laptop kernel: [ 1095.960884] __ieee80211_start_scan: scan requested
> > > May 13 01:44:35 maxim-laptop kernel: [ 1096.042972] wlan0: deauthenticating by local choice (reason=3)
> > ^^^^^^^^^^^ And deauth....
> > > May 13 01:44:35 maxim-laptop kernel: [ 1096.331307] ieee80211_ioctl_siwap: called
> > ^^^^^^^^^^^ And NM sets ESSID on top of that....
> > > May 13 01:44:35 maxim-laptop kernel: [ 1096.331310] ieee80211_sta_req_auth: called
> > >
> >
> > Now I press rfkill again, but scan is hung
>
> So it's a bug in iwlwifi, it neither rejects nor executes a scan when in
> rfkill.
But I use disable_hw_scan=1 !

Iwlwifi shouldn't be called in this case (am I mistaken?)

Best regards,
Maxim Levitsky


>
> johannes


2009-05-14 21:15:16

by Johannes Berg

[permalink] [raw]
Subject: Re: [BUG] scans can still hang with -EBUSY on iwl3945

On Thu, 2009-05-14 at 14:14 -0700, reinette chatre wrote:

> I am not convinced. When mac80211 requests a scan from the driver
> checking if rfkill is enabled is the first thing it does and it will
> return an error in this case. See iwl_mac_hw_scan().
>
> Maxim, please ensure your driver is compiled with debug support
> (CONFIG_IWLWIFI_DEBUG) and load the driver with debug=0x2. We should see
> debug messages that will guide us here.

debug=0x803 would be more useful I think. Or maybe 0x802.

I wish I had the ftrace thing for mac80211 ready already... oh well.

johannes


Attachments:
signature.asc (801.00 B)
This is a digitally signed message part

2009-05-14 21:07:44

by Reinette Chatre

[permalink] [raw]
Subject: Re: [BUG] scans can still hang with -EBUSY on iwl3945

On Tue, 2009-05-12 at 16:12 -0700, Johannes Berg wrote:
> On Wed, 2009-05-13 at 02:10 +0300, Maxim Levitsky wrote:
>
> > > May 13 01:44:33 maxim-laptop kernel: [ 1093.991176] iwl3945 0000:06:00.0: Radio Frequency Kill Switch is On:
> > ^^^^^^^^^^ Kernel notices the rfkill state
> > > May 13 01:44:33 maxim-laptop kernel: [ 1093.991179] Kill switch must be turned off for wireless networking to work.
> > > May 13 01:44:35 maxim-laptop kernel: [ 1095.789455] ieee80211_ioctl_siwessid: called
> > ^^^^^^^^^^ Evil NM notices Rfkill too, and sends bogus bssid to driver
> > > May 13 01:44:35 maxim-laptop kernel: [ 1095.789462] ieee80211_sta_req_auth: called
> > > May 13 01:44:35 maxim-laptop kernel: [ 1095.789498] iwl3945 0000:06:00.0: Error sending REPLY_LEDS_CMD: enqueue_hcmd failed: -5
> > > May 13 01:44:35 maxim-laptop kernel: [ 1095.789530] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> > > May 13 01:44:35 maxim-laptop kernel: [ 1095.789536] iwl3945 0000:06:00.0: Error clearing ASSOC_MSK on current configuration (-5).
> > > May 13 01:44:35 maxim-laptop kernel: [ 1095.870202] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> > > May 13 01:44:35 maxim-laptop kernel: [ 1095.870210] iwl3945 0000:06:00.0: Error clearing ASSOC_MSK on current configuration (-5).
> > > May 13 01:44:35 maxim-laptop kernel: [ 1095.870223] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> > > May 13 01:44:35 maxim-laptop kernel: [ 1095.870227] iwl3945 0000:06:00.0: Error clearing ASSOC_MSK on current configuration (-5).
> > > May 13 01:44:35 maxim-laptop kernel: [ 1095.870234] wlan0: direct probe to AP 00:0e:2e:a4:12:3e try 1
> > > May 13 01:44:35 maxim-laptop kernel: [ 1095.960876] cfg80211: scan requested
> > ^^^^^^^^^^^ This results in a scan
> > > May 13 01:44:35 maxim-laptop kernel: [ 1095.960884] __ieee80211_start_scan: scan requested
> > > May 13 01:44:35 maxim-laptop kernel: [ 1096.042972] wlan0: deauthenticating by local choice (reason=3)
> > ^^^^^^^^^^^ And deauth....
> > > May 13 01:44:35 maxim-laptop kernel: [ 1096.331307] ieee80211_ioctl_siwap: called
> > ^^^^^^^^^^^ And NM sets ESSID on top of that....
> > > May 13 01:44:35 maxim-laptop kernel: [ 1096.331310] ieee80211_sta_req_auth: called
> > >
> >
> > Now I press rfkill again, but scan is hung
>
> So it's a bug in iwlwifi, it neither rejects nor executes a scan when in
> rfkill.

I am not convinced. When mac80211 requests a scan from the driver
checking if rfkill is enabled is the first thing it does and it will
return an error in this case. See iwl_mac_hw_scan().

Maxim, please ensure your driver is compiled with debug support
(CONFIG_IWLWIFI_DEBUG) and load the driver with debug=0x2. We should see
debug messages that will guide us here.

Reinette



2009-05-12 23:20:42

by Johannes Berg

[permalink] [raw]
Subject: Re: [BUG] scans can still hang with -EBUSY on iwl3945

On Wed, 2009-05-13 at 02:14 +0300, Maxim Levitsky wrote:

> > So it's a bug in iwlwifi, it neither rejects nor executes a scan when in
> > rfkill.
> But I use disable_hw_scan=1 !

Load the driver with debug=1 and see what it shows then. I don't think
there's a way for mac80211 to get stuck.

Also you could stick some printks into
net/mac80211/scan.c:ieee80211_scan_work() at strategic places and see...
but if they never happen you haven't turned off hw scan.

johannes


Attachments:
signature.asc (801.00 B)
This is a digitally signed message part

2009-05-13 00:50:39

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [BUG] scans can still hang with -EBUSY on iwl3945

On Wed, 2009-05-13 at 01:20 +0200, Johannes Berg wrote:
> On Wed, 2009-05-13 at 02:14 +0300, Maxim Levitsky wrote:
>
> > > So it's a bug in iwlwifi, it neither rejects nor executes a scan when in
> > > rfkill.
> > But I use disable_hw_scan=1 !
>
> Load the driver with debug=1 and see what it shows then. I don't think
> there's a way for mac80211 to get stuck.
>
> Also you could stick some printks into
> net/mac80211/scan.c:ieee80211_scan_work() at strategic places and see...
> but if they never happen you haven't turned off hw scan.
You know, you are right, it turns out that hw scan was active (due to
all the experimentation, I forgot to turn it off...) and it is obvious
that iwlwifi has rfkill related bugs there, I handle that later.

But without hw scan, something is still there, I sooneer or later will
understand.

Best regards, and sorry
Maxim Levitsky





2009-05-03 21:46:10

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [BUG] scans can still hang with -EBUSY on iwl3945

On Sun, 2009-05-03 at 20:07 +0200, Johannes Berg wrote:
> On Sat, 2009-05-02 at 04:32 +0300, Maxim Levitsky wrote:
>
> > > * iwlwifi.git - very close to current head, yesterday HEAD
> > > * disable_hw_scan=1 - this bug is present regardless of this setting,
> > > yet this rules out firmware interactions during rfkill.
>
> Are you absolutely sure this also happens with disable_hw_scan=1? That
> seems rather strange. Can you please also put a printk into
> net/wireless/scan.c:cfg80211_scan_done, and provide separate logs w/ and
> w/o disable_hw_scan=1.
>
> johannes

Yes, sure
For record I use iwlwifi.git commit
5a94b6d38100b7056a5a347e5c51359d924d305d


Logs attached.
This bug can be reliably reproduced, happens every 5~10 times when I
turn rfkill on/off, but today again this happened after resume from
disk.

Sorry for misleading MAC80211, I noticed that long ago, but this was for
debugging only.

Current diff attached too.




Attachments:
scanlog (5.48 kB)
test_patch.diff (2.33 kB)
Download all attachments

2009-05-12 23:10:29

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [BUG] scans can still hang with -EBUSY on iwl3945

On Tue, 2009-05-12 at 23:54 +0300, Maxim Levitsky wrote:
> On Mon, 2009-05-04 at 09:19 +0200, Johannes Berg wrote:
> > On Mon, 2009-05-04 at 00:46 +0300, Maxim Levitsky wrote:
> > > Yes, sure
> > > For record I use iwlwifi.git commit
> > > 5a94b6d38100b7056a5a347e5c51359d924d305d
> >
> > Ok.
> >
> > > Logs attached.
> > > This bug can be reliably reproduced, happens every 5~10 times when I
> > > turn rfkill on/off, but today again this happened after resume from
> > > disk.
> >
> > Ah, rfkill. Hmm, unfortunately I don't have a system with rfkill. I
> > looked at ieee80211_scan_work but can't see how it would abort the
> > scan... Could you put some debugging into that function? Especially
> > printing out the state when it enters, whether it calls
>
> Hi,
>
> I can't seem to reproduce that ether.
> Maybe this got fixed (I have seen few suspicios commits in
> wireless-testing... althought I think this is a wishfull thinking)
>
> Anyway, I I have put printks in ieee80211_scan_work.

Ahaa!
Look at that:


> May 13 01:44:33 maxim-laptop kernel: [ 1093.991176] iwl3945 0000:06:00.0: Radio Frequency Kill Switch is On:
^^^^^^^^^^ Kernel notices the rfkill state
> May 13 01:44:33 maxim-laptop kernel: [ 1093.991179] Kill switch must be turned off for wireless networking to work.
> May 13 01:44:35 maxim-laptop kernel: [ 1095.789455] ieee80211_ioctl_siwessid: called
^^^^^^^^^^ Evil NM notices Rfkill too, and sends bogus bssid to driver
> May 13 01:44:35 maxim-laptop kernel: [ 1095.789462] ieee80211_sta_req_auth: called
> May 13 01:44:35 maxim-laptop kernel: [ 1095.789498] iwl3945 0000:06:00.0: Error sending REPLY_LEDS_CMD: enqueue_hcmd failed: -5
> May 13 01:44:35 maxim-laptop kernel: [ 1095.789530] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> May 13 01:44:35 maxim-laptop kernel: [ 1095.789536] iwl3945 0000:06:00.0: Error clearing ASSOC_MSK on current configuration (-5).
> May 13 01:44:35 maxim-laptop kernel: [ 1095.870202] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> May 13 01:44:35 maxim-laptop kernel: [ 1095.870210] iwl3945 0000:06:00.0: Error clearing ASSOC_MSK on current configuration (-5).
> May 13 01:44:35 maxim-laptop kernel: [ 1095.870223] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> May 13 01:44:35 maxim-laptop kernel: [ 1095.870227] iwl3945 0000:06:00.0: Error clearing ASSOC_MSK on current configuration (-5).
> May 13 01:44:35 maxim-laptop kernel: [ 1095.870234] wlan0: direct probe to AP 00:0e:2e:a4:12:3e try 1
> May 13 01:44:35 maxim-laptop kernel: [ 1095.960876] cfg80211: scan requested
^^^^^^^^^^^ This results in a scan
> May 13 01:44:35 maxim-laptop kernel: [ 1095.960884] __ieee80211_start_scan: scan requested
> May 13 01:44:35 maxim-laptop kernel: [ 1096.042972] wlan0: deauthenticating by local choice (reason=3)
^^^^^^^^^^^ And deauth....
> May 13 01:44:35 maxim-laptop kernel: [ 1096.331307] ieee80211_ioctl_siwap: called
^^^^^^^^^^^ And NM sets ESSID on top of that....
> May 13 01:44:35 maxim-laptop kernel: [ 1096.331310] ieee80211_sta_req_auth: called
>

Now I press rfkill again, but scan is hung
>
> May 13 01:44:48 maxim-laptop kernel: [ 1108.639995] Registered led device: iwl-phy0::radio
> May 13 01:44:48 maxim-laptop kernel: [ 1108.640060] Registered led device: iwl-phy0::assoc
> May 13 01:44:48 maxim-laptop kernel: [ 1108.640093] Registered led device: iwl-phy0::RX
> May 13 01:44:48 maxim-laptop kernel: [ 1108.640124] Registered led device: iwl-phy0::TX
> May 13 01:44:48 maxim-laptop kernel: [ 1108.729516] wlan0: direct probe to AP 00:0e:2e:a4:12:3e try 1
> May 13 01:44:48 maxim-laptop kernel: [ 1108.733695] wlan0 direct probe responded
> May 13 01:44:48 maxim-laptop kernel: [ 1108.733704] wlan0: authenticate with AP 00:0e:2e:a4:12:3e
> May 13 01:44:48 maxim-laptop kernel: [ 1108.735529] wlan0: authenticated
> May 13 01:44:48 maxim-laptop kernel: [ 1108.735535] wlan0: associate with AP 00:0e:2e:a4:12:3e
> May 13 01:44:48 maxim-laptop kernel: [ 1108.737684] wlan0: RX AssocResp from 00:0e:2e:a4:12:3e (capab=0x401 status=1 aid=2)
> May 13 01:44:48 maxim-laptop kernel: [ 1108.737690] wlan0: AP denied association (code=1)
> May 13 01:44:48 maxim-laptop kernel: [ 1108.808376] cfg80211: scan requested
> May 13 01:44:48 maxim-laptop kernel: [ 1108.808383] cfg80211: can't scan, already scanning
> May 13 01:44:48 maxim-laptop kernel: [ 1108.932614] wlan0: associate with AP 00:0e:2e:a4:12:3e
> May 13 01:44:48 maxim-laptop kernel: [ 1108.934760] wlan0: RX AssocResp from 00:0e:2e:a4:12:3e (capab=0x401 status=1 aid=2)
> May 13 01:44:48 maxim-laptop kernel: [ 1108.934767] wlan0: AP denied association (code=1)
> May 13 01:44:48 maxim-laptop kernel: [ 1109.130125] wlan0: associate with AP 00:0e:2e:a4:12:3e
> May 13 01:44:48 maxim-laptop kernel: [ 1109.132231] wlan0: RX AssocResp from 00:0e:2e:a4:12:3e (capab=0x401 status=1 aid=2)
> May 13 01:44:48 maxim-laptop kernel: [ 1109.132237] wlan0: AP denied association (code=1)
> May 13 01:44:48 maxim-laptop kernel: [ 1109.332611] wlan0: association with AP 00:0e:2e:a4:12:3e timed out
> May 13 01:45:08 maxim-laptop kernel: [ 1128.585712] cfg80211: scan requested
> May 13 01:45:08 maxim-laptop kernel: [ 1128.585719] cfg80211: can't scan, already scanning
>
>
So it looks like a race between setting the essid, when in rfkill

Yes, it is, very easy to reproduce.
Just press rfkill to turn radio off, then 'iwconfig wlan0 essid zzzz'

Which is very wrong btw, NM shouldn't touch bssid/essid, as wpa_supplicant already does that for wext, and doesn't
(there is no need for nl80211)


Best regards,
Maxim Levitsky


2009-05-03 18:07:44

by Johannes Berg

[permalink] [raw]
Subject: Re: [BUG] scans can still hang with -EBUSY on iwl3945

On Sat, 2009-05-02 at 04:32 +0300, Maxim Levitsky wrote:

> > * iwlwifi.git - very close to current head, yesterday HEAD
> > * disable_hw_scan=1 - this bug is present regardless of this setting,
> > yet this rules out firmware interactions during rfkill.

Are you absolutely sure this also happens with disable_hw_scan=1? That
seems rather strange. Can you please also put a printk into
net/wireless/scan.c:cfg80211_scan_done, and provide separate logs w/ and
w/o disable_hw_scan=1.

johannes


Attachments:
signature.asc (836.00 B)
This is a digitally signed message part

2009-05-01 15:52:16

by Reinette Chatre

[permalink] [raw]
Subject: Re: [BUG] scans can still hang with -EBUSY on iwl3945

Maxim,

On Fri, 2009-05-01 at 05:00 -0700, Maxim Levitsky wrote:
> Don't know if this is new regression, or just exposed race condition,
> but currently using rfkill, and/or s2disk sometimes leaves the card in
> state in which it can't scan.

Could you please provide more details of this scenario?

>
> Tested with and without hardware scanning.
>
> It just returns -EBUSY till I reload the driver.
>
> I use iwlwifi.git, iwl3945

Reinette


2009-05-04 07:19:13

by Johannes Berg

[permalink] [raw]
Subject: Re: [BUG] scans can still hang with -EBUSY on iwl3945

On Mon, 2009-05-04 at 00:46 +0300, Maxim Levitsky wrote:
> Yes, sure
> For record I use iwlwifi.git commit
> 5a94b6d38100b7056a5a347e5c51359d924d305d

Ok.

> Logs attached.
> This bug can be reliably reproduced, happens every 5~10 times when I
> turn rfkill on/off, but today again this happened after resume from
> disk.

Ah, rfkill. Hmm, unfortunately I don't have a system with rfkill. I
looked at ieee80211_scan_work but can't see how it would abort the
scan... Could you put some debugging into that function? Especially
printing out the state when it enters, whether it calls
ieee80211_scan_completed and the next_delay? Also, it would probably
help if you were to enable mac80211 debugging in iwlwifi (modprobe with
debug=2). And let's focus on the disable_hw_scan=1 case for now.

johannes


Attachments:
signature.asc (836.00 B)
This is a digitally signed message part

2009-05-02 01:32:32

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [BUG] scans can still hang with -EBUSY on iwl3945

On Fri, 2009-05-01 at 23:13 +0300, Maxim Levitsky wrote:
> On Fri, 2009-05-01 at 08:58 -0700, reinette chatre wrote:
> > Maxim,
> >
> > On Fri, 2009-05-01 at 05:00 -0700, Maxim Levitsky wrote:
> > > Don't know if this is new regression, or just exposed race condition,
> > > but currently using rfkill, and/or s2disk sometimes leaves the card in
> > > state in which it can't scan.
> >
> > Could you please provide more details of this scenario?
>
> OK, lets focus on my current setup:
>
> * iwlwifi.git - very close to current head, yesterday HEAD
> * disable_hw_scan=1 - this bug is present regardless of this setting,
> yet this rules out firmware interactions during rfkill.
>
> * Network Manager - ubuntu 9.04 unmodified.
> * wpa_supplicant - close to git HEAD, with one local change:
> It is known that on first attempt scans always fail, with same -EBUSY
> iwlst scan shows 'Device is busy', supplicant complains about scan
> requested (ret=-1), ... When this happens supplicant waits for 10
> seconds, and I reduced that wait to 1 second.
>
>
> When this happens, I can't make the card scan, scans always fail, and I
> can only reload the module.
>
> Best regards,
> Maxim Levitsky
>


Meanwhile, here are results from debbuging:




> -5
> May 2 04:22:56 maxim-laptop kernel: [ 1174.672463] Registered led device: iwl-phy0::radio
> May 2 04:22:56 maxim-laptop kernel: [ 1174.672500] Registered led device: iwl-phy0::assoc
> May 2 04:22:56 maxim-laptop kernel: [ 1174.672538] Registered led device: iwl-phy0::RX
> May 2 04:22:56 maxim-laptop kernel: [ 1174.672570] Registered led device: iwl-phy0::TX
> May 2 04:22:56 maxim-laptop kernel: [ 1174.769294] wlan0: direct probe to AP 00:11:6b:29:b4:c0 try 1
> May 2 04:22:56 maxim-laptop kernel: [ 1174.858645] MAC80211: scan requested
> May 2 04:22:56 maxim-laptop kernel: [ 1174.858652] __ieee80211_start_scan: scan requested
> May 2 04:22:56 maxim-laptop kernel: [ 1174.859251] MAC80211: scan requested
> May 2 04:22:56 maxim-laptop kernel: [ 1174.859256] MAC80211: can't scan, already scanning
> May 2 04:22:56 maxim-laptop kernel: [ 1174.962564] wlan0: direct probe to AP 00:11:6b:29:b4:c0 try 2
> May 2 04:22:56 maxim-laptop kernel: [ 1175.170108] wlan0: direct probe to AP 00:11:6b:29:b4:c0 try 3
> May 2 04:22:56 maxim-laptop kernel: [ 1175.370109] wlan0: direct probe to AP 00:11:6b:29:b4:c0 timed out
> May 2 04:22:57 maxim-laptop kernel: [ 1175.859705] MAC80211: scan requested
> May 2 04:22:57 maxim-laptop kernel: [ 1175.859712] MAC80211: can't scan, already scanning
> May 2 04:22:57 maxim-laptop kernel: [ 1176.040128] __ieee80211_start_scan: scan requested
> May 2 04:22:57 maxim-laptop kernel: [ 1176.040137] __ieee80211_start_scan: scan requested OK
> May 2 04:22:58 maxim-laptop kernel: [ 1176.861043] MAC80211: scan requested
> May 2 04:22:58 maxim-laptop kernel: [ 1176.861050] MAC80211: can't scan, already scanning
> May 2 04:22:58 maxim-laptop kernel: [ 1177.109763] iwl3945 0000:06:00.0: Radio Frequency Kill Switch is On:
> May 2 04:22:58 maxim-laptop kernel: [ 1177.109768] Kill switch must be turned off for wireless networking to work.
> May 2 04:22:58 maxim-laptop kernel: [ 1177.200160] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> May 2 04:22:58 maxim-laptop kernel: [ 1177.200169] iwl3945 0000:06:00.0: Error setting new configuration (-5).
> May 2 04:22:58 maxim-laptop kernel: [ 1177.200843] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> May 2 04:22:58 maxim-laptop kernel: [ 1177.200850] iwl3945 0000:06:00.0: Error setting new configuration (-5).
> May 2 04:22:58 maxim-laptop kernel: [ 1177.200977] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> May 2 04:22:58 maxim-laptop kernel: [ 1177.200983] iwl3945 0000:06:00.0: Error setting new configuration (-5).
> May 2 04:22:58 maxim-laptop kernel: [ 1177.200992] wlan0: authenticate with AP 00:1b:9e:d8:77:02
> May 2 04:22:58 maxim-laptop kernel: [ 1177.400130] wlan0: authenticate with AP 00:1b:9e:d8:77:02
> May 2 04:22:58 maxim-laptop kernel: [ 1177.590135] wlan0: authenticate with AP 00:1b:9e:d8:77:02
> May 2 04:22:59 maxim-laptop kernel: [ 1177.790135] wlan0: authentication with AP 00:1b:9e:d8:77:02 timed out
> May 2 04:23:02 maxim-laptop kernel: [ 1180.776565] __ieee80211_start_scan: scan requested
> May 2 04:23:02 maxim-laptop kernel: [ 1180.776576] __ieee80211_start_scan: scan requested OK
> May 2 04:23:02 maxim-laptop kernel: [ 1180.800169] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> May 2 04:23:02 maxim-laptop kernel: [ 1180.800178] iwl3945 0000:06:00.0: Error setting new configuration (-5).
> May 2 04:23:02 maxim-laptop kernel: [ 1180.852615] wlan0: deauthenticating by local choice (reason=3)
> May 2 04:23:09 maxim-laptop kernel: [ 1187.692165] Registered led device: iwl-phy0::radio
> May 2 04:23:09 maxim-laptop kernel: [ 1187.692243] Registered led device: iwl-phy0::assoc
> May 2 04:23:09 maxim-laptop kernel: [ 1187.692277] Registered led device: iwl-phy0::RX
> May 2 04:23:09 maxim-laptop kernel: [ 1187.692309] Registered led device: iwl-phy0::TX
> May 2 04:23:09 maxim-laptop kernel: [ 1187.779065] wlan0: direct probe to AP 00:21:63:73:3e:cb try 1
> May 2 04:23:09 maxim-laptop kernel: [ 1187.866011] MAC80211: scan requested
> May 2 04:23:09 maxim-laptop kernel: [ 1187.866020] __ieee80211_start_scan: scan requested
> May 2 04:23:09 maxim-laptop kernel: [ 1187.866246] MAC80211: scan requested
> May 2 04:23:09 maxim-laptop kernel: [ 1187.866250] MAC80211: can't scan, already scanning
> May 2 04:23:09 maxim-laptop kernel: [ 1187.970132] wlan0: direct probe to AP 00:21:63:73:3e:cb try 2
> May 2 04:23:09 maxim-laptop kernel: [ 1187.975966] wlan0 direct probe responded
> May 2 04:23:09 maxim-laptop kernel: [ 1187.975972] wlan0: authenticate with AP 00:21:63:73:3e:cb
> May 2 04:23:09 maxim-laptop kernel: [ 1187.980907] wlan0: authenticated
> May 2 04:23:09 maxim-laptop kernel: [ 1187.980916] wlan0: associate with AP 00:21:63:73:3e:cb
> May 2 04:23:09 maxim-laptop kernel: [ 1187.985106] wlan0: RX AssocResp from 00:21:63:73:3e:cb (capab=0x1 status=0 aid=1)
> May 2 04:23:09 maxim-laptop kernel: [ 1187.985112] wlan0: associated
> May 2 04:23:09 maxim-laptop kernel: [ 1187.990358] wlan0: disassociating by local choice (reason=3)
> May 2 04:23:09 maxim-laptop kernel: [ 1188.012606] __ieee80211_start_scan: scan requested
> May 2 04:23:09 maxim-laptop kernel: [ 1188.012615] __ieee80211_start_scan: scan requested OK
> May 2 04:23:10 maxim-laptop kernel: [ 1189.164814] __ieee80211_start_scan: scan requested
> May 2 04:23:10 maxim-laptop kernel: [ 1189.164821] __ieee80211_start_scan: scan requested OK
> May 2 04:23:11 maxim-laptop kernel: [ 1190.314526] __ieee80211_start_scan: scan requested
> May 2 04:23:11 maxim-laptop kernel: [ 1190.314535] __ieee80211_start_scan: scan requested OK
> May 2 04:23:11 maxim-laptop kernel: [ 1190.598788] iwl3945 0000:06:00.0: Radio Frequency Kill Switch is On:
> May 2 04:23:11 maxim-laptop kernel: [ 1190.598793] Kill switch must be turned off for wireless networking to work.
> May 2 04:23:12 maxim-laptop kernel: [ 1190.640157] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> May 2 04:23:12 maxim-laptop kernel: [ 1190.640166] iwl3945 0000:06:00.0: Error setting new configuration (-5).
> May 2 04:23:12 maxim-laptop kernel: [ 1190.700126] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> May 2 04:23:12 maxim-laptop kernel: [ 1190.700132] iwl3945 0000:06:00.0: Error setting new configuration (-5).
> May 2 04:23:12 maxim-laptop kernel: [ 1190.760135] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> May 2 04:23:12 maxim-laptop kernel: [ 1190.760142] iwl3945 0000:06:00.0: Error setting new configuration (-5).
> May 2 04:23:12 maxim-laptop kernel: [ 1190.820135] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> May 2 04:23:12 maxim-laptop kernel: [ 1190.820142] iwl3945 0000:06:00.0: Error setting new configuration (-5).
> May 2 04:23:12 maxim-laptop kernel: [ 1190.880128] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> May 2 04:23:12 maxim-laptop kernel: [ 1190.880134] iwl3945 0000:06:00.0: Error setting new configuration (-5).
> May 2 04:23:12 maxim-laptop kernel: [ 1190.940134] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> May 2 04:23:12 maxim-laptop kernel: [ 1190.940141] iwl3945 0000:06:00.0: Error setting new configuration (-5).
> May 2 04:23:12 maxim-laptop kernel: [ 1191.000059] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> May 2 04:23:12 maxim-laptop kernel: [ 1191.000065] iwl3945 0000:06:00.0: Error setting new configuration (-5).
> May 2 04:23:12 maxim-laptop kernel: [ 1191.230193] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> May 2 04:23:12 maxim-laptop kernel: [ 1191.230201] iwl3945 0000:06:00.0: Error setting new configuration (-5).
> May 2 04:23:12 maxim-laptop kernel: [ 1191.460151] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> May 2 04:23:12 maxim-laptop kernel: [ 1191.460159] iwl3945 0000:06:00.0: Error setting new configuration (-5).
> May 2 04:23:15 maxim-laptop kernel: [ 1193.634947] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> May 2 04:23:15 maxim-laptop kernel: [ 1193.634955] iwl3945 0000:06:00.0: Error setting new configuration (-5).
> May 2 04:23:15 maxim-laptop kernel: [ 1193.634966] wlan0: authenticate with AP 00:21:63:73:3e:cb
> May 2 04:23:15 maxim-laptop kernel: [ 1193.722617] wlan0: deauthenticating by local choice (reason=3)
> May 2 04:23:22 maxim-laptop kernel: [ 1200.672558] Registered led device: iwl-phy0::radio
> May 2 04:23:22 maxim-laptop kernel: [ 1200.672637] Registered led device: iwl-phy0::assoc
> May 2 04:23:22 maxim-laptop kernel: [ 1200.672672] Registered led device: iwl-phy0::RX
> May 2 04:23:22 maxim-laptop kernel: [ 1200.672703] Registered led device: iwl-phy0::TX
> May 2 04:23:22 maxim-laptop kernel: [ 1200.768734] wlan0: direct probe to AP 00:21:63:73:3e:cb try 1
> May 2 04:23:22 maxim-laptop kernel: [ 1200.848756] MAC80211: scan requested
> May 2 04:23:22 maxim-laptop kernel: [ 1200.848766] __ieee80211_start_scan: scan requested
> May 2 04:23:22 maxim-laptop kernel: [ 1200.962608] wlan0: direct probe to AP 00:21:63:73:3e:cb try 2
> May 2 04:23:22 maxim-laptop kernel: [ 1201.160120] wlan0: direct probe to AP 00:21:63:73:3e:cb try 3
> May 2 04:23:22 maxim-laptop kernel: [ 1201.163983] wlan0 direct probe responded
> May 2 04:23:22 maxim-laptop kernel: [ 1201.163989] wlan0: authenticate with AP 00:21:63:73:3e:cb
> May 2 04:23:22 maxim-laptop kernel: [ 1201.179940] wlan0: authenticated
> May 2 04:23:22 maxim-laptop kernel: [ 1201.179945] wlan0: associate with AP 00:21:63:73:3e:cb
> May 2 04:23:22 maxim-laptop kernel: [ 1201.183360] wlan0: RX AssocResp from 00:21:63:73:3e:cb (capab=0x1 status=0 aid=1)
> May 2 04:23:22 maxim-laptop kernel: [ 1201.183365] wlan0: associated
> May 2 04:23:22 maxim-laptop kernel: [ 1201.185992] __ieee80211_start_scan: scan requested
> May 2 04:23:22 maxim-laptop kernel: [ 1201.186060] __ieee80211_start_scan: scan requested OK
> May 2 04:23:22 maxim-laptop kernel: [ 1201.186645] wlan0: disassociating by local choice (reason=3)
> May 2 04:23:23 maxim-laptop kernel: [ 1202.334386] __ieee80211_start_scan: scan requested
> May 2 04:23:23 maxim-laptop kernel: [ 1202.334394] __ieee80211_start_scan: scan requested OK
> May 2 04:23:24 maxim-laptop kernel: [ 1203.484504] __ieee80211_start_scan: scan requested
> May 2 04:23:24 maxim-laptop kernel: [ 1203.484512] __ieee80211_start_scan: scan requested OK
> May 2 04:23:28 maxim-laptop kernel: [ 1207.338286] MAC80211: scan requested
> May 2 04:23:28 maxim-laptop kernel: [ 1207.338295] __ieee80211_start_scan: scan requested
> May 2 04:23:28 maxim-laptop kernel: [ 1207.338301] __ieee80211_start_scan: scan requested OK
> May 2 04:23:29 maxim-laptop kernel: [ 1208.489879] wlan0: authenticate with AP 00:1b:9e:d8:77:02
> May 2 04:23:29 maxim-laptop kernel: [ 1208.491881] wlan0: authenticated
> May 2 04:23:29 maxim-laptop kernel: [ 1208.491886] wlan0: associate with AP 00:1b:9e:d8:77:02
> May 2 04:23:29 maxim-laptop kernel: [ 1208.494566] wlan0: RX AssocResp from 00:1b:9e:d8:77:02 (capab=0x411 status=0 aid=2)
> May 2 04:23:29 maxim-laptop kernel: [ 1208.494572] wlan0: associated
> May 2 04:23:32 maxim-laptop kernel: [ 1211.353262] iwl3945 0000:06:00.0: Radio Frequency Kill Switch is On:
> May 2 04:23:32 maxim-laptop kernel: [ 1211.353266] Kill switch must be turned off for wireless networking to work.
> May 2 04:23:34 maxim-laptop kernel: [ 1212.852045] iwl3945 0000:06:00.0: Error sending REPLY_LEDS_CMD: enqueue_hcmd failed: -5
> May 2 04:23:34 maxim-laptop kernel: [ 1212.922732] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> May 2 04:23:34 maxim-laptop kernel: [ 1212.922740] iwl3945 0000:06:00.0: Error clearing ASSOC_MSK on current configuration (-5).
> May 2 04:23:34 maxim-laptop kernel: [ 1212.922751] wlan0: authenticate with AP 00:1b:9e:d8:77:02
> May 2 04:23:34 maxim-laptop kernel: [ 1213.033608] MAC80211: scan requested
> May 2 04:23:34 maxim-laptop kernel: [ 1213.033617] __ieee80211_start_scan: scan requested
> May 2 04:23:34 maxim-laptop kernel: [ 1213.122740] wlan0: deauthenticating by local choice (reason=3)
> May 2 04:23:41 maxim-laptop kernel: [ 1219.682401] Registered led device: iwl-phy0::radio
> May 2 04:23:41 maxim-laptop kernel: [ 1219.682479] Registered led device: iwl-phy0::assoc
> May 2 04:23:41 maxim-laptop kernel: [ 1219.683652] Registered led device: iwl-phy0::RX
> May 2 04:23:41 maxim-laptop kernel: [ 1219.683691] Registered led device: iwl-phy0::TX
> May 2 04:23:41 maxim-laptop kernel: [ 1219.776806] wlan0: direct probe to AP 00:21:63:73:3e:cb try 1
> May 2 04:23:41 maxim-laptop kernel: [ 1219.895887] MAC80211: scan requested
> May 2 04:23:41 maxim-laptop kernel: [ 1219.895893] MAC80211: can't scan, already scanning
> May 2 04:23:41 maxim-laptop kernel: [ 1219.970133] wlan0: direct probe to AP 00:21:63:73:3e:cb try 2
> May 2 04:23:41 maxim-laptop kernel: [ 1220.170110] wlan0: direct probe to AP 00:21:63:73:3e:cb try 3
> May 2 04:23:41 maxim-laptop kernel: [ 1220.370129] wlan0: direct probe to AP 00:21:63:73:3e:cb timed out
> May 2 04:24:02 maxim-laptop kernel: [ 1240.632072] MAC80211: scan requested
> May 2 04:24:02 maxim-laptop kernel: [ 1240.632079] MAC80211: can't scan, already scanning
> May 2 04:24:32 maxim-laptop kernel: [ 1270.630784] MAC80211: scan requested
> May 2 04:24:32 maxim-laptop kernel: [ 1270.630792] MAC80211: can't scan, already scanning
> May 2 04:25:12 maxim-laptop kernel: [ 1310.631028] MAC80211: scan requested
> May 2 04:25:12 maxim-laptop kernel: [ 1310.631035] MAC80211: can't scan, already scanning
> May 2 04:26:02 maxim-laptop kernel: [ 1360.632820] MAC80211: scan requested
> May 2 04:26:02 maxim-laptop kernel: [ 1360.632827] MAC80211: can't scan, already scanning
> May 2 04:27:02 maxim-laptop kernel: [ 1420.630846] MAC80211: scan requested
> May 2 04:27:02 maxim-laptop kernel: [ 1420.630853] MAC80211: can't scan, already scanning
> May 2 04:28:02 maxim-laptop kernel: [ 1480.632698] MAC80211: scan requested
> May 2 04:28:02 maxim-laptop kernel: [ 1480.632705] MAC80211: can't scan, already scanning
>
>

Debug patch that created those messages attached

As you see __ieee80211_start_scan never returns.


Best regards,
Maxim Levitsky


Attachments:
test_patch.diff (1.43 kB)

2009-05-03 18:05:30

by Johannes Berg

[permalink] [raw]
Subject: Re: [BUG] scans can still hang with -EBUSY on iwl3945

I was taking a look at this, one small request:
> --- a/net/wireless/scan.c
> +++ b/net/wireless/scan.c
> @@ -585,11 +585,14 @@ int cfg80211_wext_siwscan(struct net_device
> *dev,
> return -ENETDOWN;
>
> rdev = cfg80211_get_dev_from_ifindex(dev->ifindex);
> +
> + printk("MAC80211: scan requested\n");

Please don't put "mac80211:" printks anywhere else than net/mac80211/,
that's very confusing here. This subsystem is called 'cfg80211' usually.

Anyway, what seems to happen is that mac80211 never responds to the scan
request -- can you describe what you're actually doing?

johannes


Attachments:
signature.asc (836.00 B)
This is a digitally signed message part

2009-05-12 20:54:09

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [BUG] scans can still hang with -EBUSY on iwl3945

On Mon, 2009-05-04 at 09:19 +0200, Johannes Berg wrote:
> On Mon, 2009-05-04 at 00:46 +0300, Maxim Levitsky wrote:
> > Yes, sure
> > For record I use iwlwifi.git commit
> > 5a94b6d38100b7056a5a347e5c51359d924d305d
>
> Ok.
>
> > Logs attached.
> > This bug can be reliably reproduced, happens every 5~10 times when I
> > turn rfkill on/off, but today again this happened after resume from
> > disk.
>
> Ah, rfkill. Hmm, unfortunately I don't have a system with rfkill. I
> looked at ieee80211_scan_work but can't see how it would abort the
> scan... Could you put some debugging into that function? Especially
> printing out the state when it enters, whether it calls

Hi,

I can't seem to reproduce that ether.
Maybe this got fixed (I have seen few suspicios commits in
wireless-testing... althought I think this is a wishfull thinking)

Anyway, I I have put printks in ieee80211_scan_work.


Best regards,
Maxim Levitsky


2009-05-12 23:12:50

by Johannes Berg

[permalink] [raw]
Subject: Re: [BUG] scans can still hang with -EBUSY on iwl3945

On Wed, 2009-05-13 at 02:10 +0300, Maxim Levitsky wrote:

> > May 13 01:44:33 maxim-laptop kernel: [ 1093.991176] iwl3945 0000:06:00.0: Radio Frequency Kill Switch is On:
> ^^^^^^^^^^ Kernel notices the rfkill state
> > May 13 01:44:33 maxim-laptop kernel: [ 1093.991179] Kill switch must be turned off for wireless networking to work.
> > May 13 01:44:35 maxim-laptop kernel: [ 1095.789455] ieee80211_ioctl_siwessid: called
> ^^^^^^^^^^ Evil NM notices Rfkill too, and sends bogus bssid to driver
> > May 13 01:44:35 maxim-laptop kernel: [ 1095.789462] ieee80211_sta_req_auth: called
> > May 13 01:44:35 maxim-laptop kernel: [ 1095.789498] iwl3945 0000:06:00.0: Error sending REPLY_LEDS_CMD: enqueue_hcmd failed: -5
> > May 13 01:44:35 maxim-laptop kernel: [ 1095.789530] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> > May 13 01:44:35 maxim-laptop kernel: [ 1095.789536] iwl3945 0000:06:00.0: Error clearing ASSOC_MSK on current configuration (-5).
> > May 13 01:44:35 maxim-laptop kernel: [ 1095.870202] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> > May 13 01:44:35 maxim-laptop kernel: [ 1095.870210] iwl3945 0000:06:00.0: Error clearing ASSOC_MSK on current configuration (-5).
> > May 13 01:44:35 maxim-laptop kernel: [ 1095.870223] iwl3945 0000:06:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> > May 13 01:44:35 maxim-laptop kernel: [ 1095.870227] iwl3945 0000:06:00.0: Error clearing ASSOC_MSK on current configuration (-5).
> > May 13 01:44:35 maxim-laptop kernel: [ 1095.870234] wlan0: direct probe to AP 00:0e:2e:a4:12:3e try 1
> > May 13 01:44:35 maxim-laptop kernel: [ 1095.960876] cfg80211: scan requested
> ^^^^^^^^^^^ This results in a scan
> > May 13 01:44:35 maxim-laptop kernel: [ 1095.960884] __ieee80211_start_scan: scan requested
> > May 13 01:44:35 maxim-laptop kernel: [ 1096.042972] wlan0: deauthenticating by local choice (reason=3)
> ^^^^^^^^^^^ And deauth....
> > May 13 01:44:35 maxim-laptop kernel: [ 1096.331307] ieee80211_ioctl_siwap: called
> ^^^^^^^^^^^ And NM sets ESSID on top of that....
> > May 13 01:44:35 maxim-laptop kernel: [ 1096.331310] ieee80211_sta_req_auth: called
> >
>
> Now I press rfkill again, but scan is hung

So it's a bug in iwlwifi, it neither rejects nor executes a scan when in
rfkill.

johannes


Attachments:
signature.asc (801.00 B)
This is a digitally signed message part