2008-03-01 18:21:35

by Johannes Berg

Subject: Re: bughost-1583&1499

Bill,

> I sent an early version of this report to Johannes and Reinette a week
> ago. Johannes said he would look at it after finishing the mesh stuff
> and he wanted Zhu to look at it too. This report contains some further
> analysis. This is a pernicious bug for new users of Fedora so I don't
> want it to fall on the floor.

Thanks for reminding me, I completely forgot to look at it again after
the mesh stuff.

> done: <----------- Add
> 	rcu_read_lock();
> 	list_for_each_entry_rcu(sdata, &local->interfaces, list) {
>
> 		/* No need to wake the master device. */
> 		if (sdata->dev == local->mdev)
> 			continue;
>
> 		if (sdata->vif.type == IEEE80211_IF_TYPE_STA) {
> 			if (sdata->u.sta.flags & IEEE80211_STA_ASSOCIATED)
> 				ieee80211_send_nullfunc(local, sdata, 0);
> 			ieee80211_sta_timer((unsigned long)sdata);
> 		}
>
> 		netif_wake_queue(sdata->dev);
> 	}
> 	rcu_read_unlock();
>
> done: <---------- Remove

I think this patch is wrong because it will result in sending a nullfunc
frame to the AP, which the firmware should already have done in the
hardware-scan case.

Can you try the patch below? I think it would be a better fix; the only
drawback is that we do two list iterations... I hope we'll never have
so many virtual interfaces on the list that it matters. But if this
patch works I can also refactor the function completely; I'm just trying
to understand whether the STA timer really is the problem.

johannes

---
net/mac80211/ieee80211_sta.c | 23 +++++++++++++----------
1 file changed, 13 insertions(+), 10 deletions(-)

--- everything.orig/net/mac80211/ieee80211_sta.c 2008-03-01 19:16:13.000000000 +0100
+++ everything/net/mac80211/ieee80211_sta.c 2008-03-01 19:18:58.000000000 +0100
@@ -3610,25 +3610,28 @@ void ieee80211_scan_completed(struct iee

 	rcu_read_lock();
 	list_for_each_entry_rcu(sdata, &local->interfaces, list) {
-
 		/* No need to wake the master device. */
 		if (sdata->dev == local->mdev)
 			continue;

-		if (sdata->vif.type == IEEE80211_IF_TYPE_STA) {
-			if (sdata->u.sta.flags & IEEE80211_STA_ASSOCIATED)
-				ieee80211_send_nullfunc(local, sdata, 0);
-			ieee80211_sta_timer((unsigned long)sdata);
-		}
-
-		if (sdata->vif.type == IEEE80211_IF_TYPE_MESH_POINT)
-			ieee80211_sta_timer((unsigned long)sdata);
+		/* Tell AP we're back */
+		if (sdata->vif.type == IEEE80211_IF_TYPE_STA &&
+		    sdata->u.sta.flags & IEEE80211_STA_ASSOCIATED)
+			ieee80211_send_nullfunc(local, sdata, 0);

 		netif_wake_queue(sdata->dev);
 	}
+
+ done:
+	/* Restart STA timer for both SW and HW scan cases */
+	list_for_each_entry_rcu(sdata, &local->interfaces, list) {
+		if (sdata->vif.type == IEEE80211_IF_TYPE_STA ||
+		    ieee80211_vif_is_mesh(&sdata->vif))
+			ieee80211_sta_timer((unsigned long)sdata);
+	}
+
 	rcu_read_unlock();

-done:
 	sdata = IEEE80211_DEV_TO_SUB_IF(dev);
 	if (sdata->vif.type == IEEE80211_IF_TYPE_IBSS) {
 		struct ieee80211_if_sta *ifsta = &sdata->u.sta;




2008-03-02 22:12:23

by Johannes Berg

Subject: Re: bughost-1583&1499


On Sat, 2008-03-01 at 17:35 -0500, Bill Moss wrote:
> I have pounded on your patch hard for both broadcast and hidden access
> points and it seems solid. I can't make it fail. After boot or iwl3945
> module reload, I run

Good, thanks, I'll respin without the extra loop and some cleanups. Or
actually, I should do that as a separate patch now that you've tested
it.

> For a broadcast access point, I rarely see the 'No scan results' output
> when running 'iwlist wlan0 scan'. For the hidden access point case, here
> are some results
>
> After module reload: 50 'No scan results' out of 50 scans
>
> After association/disassociation: 24 'No scan results' out of 50 scans
>
> Scanning while not associated is what NetworkManager does when it starts
> up and compares the scanned access point list with its stored list of
> acceptable access points. This is why I am interested in this bug. So
> far I am stumped. Any suggestions?

Hm, not really. You could create a monitor interface with iw
(git.sipsolutions.net), something like
# iw phydev wmaster0 interface add moni0 type monitor

and then capture on the master interface and see if there are any probe
responses. Also, can you test software scan for the second problem? If
it doesn't occur with software scan and you don't see any beacons/probe
responses on the monitor for those access points you're missing, then I
can't help you.

johannes



2008-03-01 22:37:04

by Bill Moss

Subject: Re: bughost-1583&1499

I have pounded on your patch hard for both broadcast and hidden access
points and it seems solid. I can't make it fail. After boot or iwl3945
module reload, I run

#!/bin/bash

ifconfig wlan0 up
iwconfig wlan0 mode Managed
iwconfig wlan0 key xxxxxxxxxxxxxxxxxxxxxxxxxx
iwconfig wlan0 essid mosswap
sleep 2
dhclient wlan0

which is essentially what ifup does in Fedora.

___________________________________________________________

Let me change horses to bughost 1499, part of which is the frequency
with which 'iwlist wlan0 scan' produces the 'No scan results' output
after a boot or a module reload and before any association is made. I
use this script for testing.

#!/bin/bash

ifconfig wlan0 up

iwlist wlan0 scan
for (( i = 0 ; i < 10; i++ ))
do
    sleep 5
    iwlist wlan0 scan
done

For a broadcast access point, I rarely see the 'No scan results' output
when running 'iwlist wlan0 scan'. For the hidden access point case, here
are some results

After module reload: 50 'No scan results' out of 50 scans

After association/disassociation: 24 'No scan results' out of 50 scans
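
Counts like the ones above could also be collected automatically rather
than by reading the output by eye. The following is only a minimal
sketch: it assumes that iwlist prints the string 'No scan results' for
an empty scan, as described above, and the function name
count_empty_scans and its arguments are hypothetical, not part of the
test script used for these numbers.

```bash
#!/bin/bash

# Hypothetical helper: run a scan command repeatedly and count how many
# runs print 'No scan results'.
# Usage: count_empty_scans <runs> <delay_seconds> <scan command...>
count_empty_scans() {
    local runs=$1 delay=$2
    shift 2
    local empty=0 i

    for (( i = 0; i < runs; i++ ))
    do
        # Assumption: an empty scan is reported with this exact string.
        if "$@" 2>&1 | grep -q 'No scan results'; then
            empty=$(( empty + 1 ))
        fi
        sleep "$delay"
    done

    echo "$empty 'No scan results' out of $runs scans"
}

# Example (needs root and a wireless interface named wlan0):
#   ifconfig wlan0 up
#   count_empty_scans 50 5 iwlist wlan0 scan
```

Passing the scan command as arguments keeps the counting logic separate
from the hardware, so the same function can be tried against a stub
command before running it on a real interface.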

Scanning while not associated is what NetworkManager does when it starts
up and compares the scanned access point list with its stored list of
acceptable access points. This is why I am interested in this bug. So
far I am stumped. Any suggestions?

Bill Moss


--
Bill Moss
Alumni Distinguished Professor
Mathematical Sciences
Clemson University