2009-02-21 07:33:38

by Lars Ericsson

[permalink] [raw]
Subject: SIOCGIWSCAN-race

Hi Johannes,

I have discovered and patched a race in the scanning function since a couple
of releases.
To day I checked the current Linux git and the problem is still there.

The problem is the sequence of events when the scan result is reported back.
The wireless_send_event(dev, SIOCGIWSCAN, &wrqu, NULL);
is called before ieee80211_hw_config(local);

ieee80211_hw_config(local) will trig the wpa_supplicant to select an AP.
That may happen before the ieee80211_hw_config() is executed since the
wpa_supplicant
generated actions is executed by an other thread (wpa_supplicant).

The result is that:
- wpa_supplicant setup for an association to an ap using correct channel.
- ieee80211_hw_config() reset the channel to the value before the SCAN
started.
- the association request will be sent out using the wrong channel.


Attached you will find the patch for 2.6.27.
It is not a perfect patch since the code is duplicated but it works :)


Regards
Lars


Attachments:
lae-mac80211-mlme-SIOCGIWSCAN-race.patch (1.20 kB)

2009-02-24 07:05:57

by Lars Ericsson

[permalink] [raw]
Subject: RE: SIOCGIWSCAN-race

> On Mon, 2009-02-23 at 16:11 -0500, John W. Linville wrote:
> > On Sat, Feb 21, 2009 at 08:33:06AM +0100, Lars Ericsson wrote:
> > > Hi Johannes,
> > >
> > > I have discovered and patched a race in the scanning function since a
couple
> > > of releases.
> > > To day I checked the current Linux git and the problem is still there.
> > >
> > > The problem is the sequence of events when the scan result is reported
back.
> > > The wireless_send_event(dev, SIOCGIWSCAN, &wrqu, NULL);
> > > is called before ieee80211_hw_config(local);
> > >
> > > ieee80211_hw_config(local) will trig the wpa_supplicant to select an
AP.
> > > That may happen before the ieee80211_hw_config() is executed since the
> > > wpa_supplicant
> > > generated actions is executed by an other thread (wpa_supplicant).
> > >
> > > The result is that:
> > > - wpa_supplicant setup for an association to an ap using correct
channel.
> > > - ieee80211_hw_config() reset the channel to the value before the SCAN
started.
> > > - the association request will be sent out using the wrong channel.
> > >
> > >
> > > Attached you will find the patch for 2.6.27.
> > > It is not a perfect patch since the code is duplicated but it works :)
> >
> > Looks like the patch would need to be reworked -- the code in 2.6.29
> > and later is different.
>
> and the new code with cfg80211 is even more different ;)
>

OK, I have not moved in to the 'cfg80211' world, yet.
Johannes; do you create the patch for this ?

> > Also, any reason we can't just move the
> > wireless_send_event() down to done:?
>
> seems fine

Perfect, is it to late for 2.6.29 ?

Lars

> > Still, this doesn't feel quite right. Shouldn't we be able to queue
> > the userland-driven channel change until after the scan completes?
>
> ?
> we do that, well, we set it here:
> local->sw_scanning = false;
> ieee80211_hw_config(local, IEEE80211_CONF_CHANGE_CHANNEL);
>
> but we claim to be on the user requested channel at all times, so that
> should be ok
>
> johannes
>
> > John
> >
> > diff --git a/net/mac80211/scan.c b/net/mac80211/scan.c
> > index f5c7c33..8c13a91 100644
> > --- a/net/mac80211/scan.c
> > +++ b/net/mac80211/scan.c
> > @@ -437,15 +437,6 @@ void ieee80211_scan_completed(struct
> ieee80211_hw *hw)
> > local->last_scan_completed = jiffies;
> > memset(&wrqu, 0, sizeof(wrqu));
> >
> > - /*
> > - * local->scan_sdata could have been NULLed by the interface
> > - * down code in case we were scanning on an interface that is
> > - * being taken down.
> > - */
> > - sdata = local->scan_sdata;
> > - if (sdata)
> > - wireless_send_event(sdata->dev, SIOCGIWSCAN,
> &wrqu, NULL);
> > -
> > if (local->hw_scanning) {
> > local->hw_scanning = false;
> > /*
> > @@ -486,6 +477,15 @@ void ieee80211_scan_completed(struct
> ieee80211_hw *hw)
> > rcu_read_unlock();
> >
> > done:
> > + /*
> > + * local->scan_sdata could have been NULLed by the interface
> > + * down code in case we were scanning on an interface that is
> > + * being taken down.
> > + */
> > + sdata = local->scan_sdata;
> > + if (sdata)
> > + wireless_send_event(sdata->dev, SIOCGIWSCAN,
> &wrqu, NULL);
> > +
> > ieee80211_mlme_notify_scan_completed(local);
> > ieee80211_mesh_notify_scan_completed(local);
> > }
> >
>


2009-02-24 02:10:14

by Johannes Berg

[permalink] [raw]
Subject: Re: SIOCGIWSCAN-race

On Mon, 2009-02-23 at 16:11 -0500, John W. Linville wrote:
> On Sat, Feb 21, 2009 at 08:33:06AM +0100, Lars Ericsson wrote:
> > Hi Johannes,
> >
> > I have discovered and patched a race in the scanning function since a couple
> > of releases.
> > To day I checked the current Linux git and the problem is still there.
> >
> > The problem is the sequence of events when the scan result is reported back.
> > The wireless_send_event(dev, SIOCGIWSCAN, &wrqu, NULL);
> > is called before ieee80211_hw_config(local);
> >
> > ieee80211_hw_config(local) will trig the wpa_supplicant to select an AP.
> > That may happen before the ieee80211_hw_config() is executed since the
> > wpa_supplicant
> > generated actions is executed by an other thread (wpa_supplicant).
> >
> > The result is that:
> > - wpa_supplicant setup for an association to an ap using correct channel.
> > - ieee80211_hw_config() reset the channel to the value before the SCAN
> > started.
> > - the association request will be sent out using the wrong channel.
> >
> >
> > Attached you will find the patch for 2.6.27.
> > It is not a perfect patch since the code is duplicated but it works :)
>
> Looks like the patch would need to be reworked -- the code in 2.6.29
> and later is different.

and the new code with cfg80211 is even more different ;)

> Also, any reason we can't just move the
> wireless_send_event() down to done:?

seems fine

> Still, this doesn't feel quite right. Shouldn't we be able to queue
> the userland-driven channel change until after the scan completes?

?
we do that, well, we set it here:
local->sw_scanning = false;
ieee80211_hw_config(local, IEEE80211_CONF_CHANGE_CHANNEL);

but we claim to be on the user requested channel at all times, so that
should be ok

johannes

> John
>
> diff --git a/net/mac80211/scan.c b/net/mac80211/scan.c
> index f5c7c33..8c13a91 100644
> --- a/net/mac80211/scan.c
> +++ b/net/mac80211/scan.c
> @@ -437,15 +437,6 @@ void ieee80211_scan_completed(struct ieee80211_hw *hw)
> local->last_scan_completed = jiffies;
> memset(&wrqu, 0, sizeof(wrqu));
>
> - /*
> - * local->scan_sdata could have been NULLed by the interface
> - * down code in case we were scanning on an interface that is
> - * being taken down.
> - */
> - sdata = local->scan_sdata;
> - if (sdata)
> - wireless_send_event(sdata->dev, SIOCGIWSCAN, &wrqu, NULL);
> -
> if (local->hw_scanning) {
> local->hw_scanning = false;
> /*
> @@ -486,6 +477,15 @@ void ieee80211_scan_completed(struct ieee80211_hw *hw)
> rcu_read_unlock();
>
> done:
> + /*
> + * local->scan_sdata could have been NULLed by the interface
> + * down code in case we were scanning on an interface that is
> + * being taken down.
> + */
> + sdata = local->scan_sdata;
> + if (sdata)
> + wireless_send_event(sdata->dev, SIOCGIWSCAN, &wrqu, NULL);
> +
> ieee80211_mlme_notify_scan_completed(local);
> ieee80211_mesh_notify_scan_completed(local);
> }
>


Attachments:
signature.asc (836.00 B)
This is a digitally signed message part

2009-02-23 21:15:20

by John W. Linville

[permalink] [raw]
Subject: Re: SIOCGIWSCAN-race

On Sat, Feb 21, 2009 at 08:33:06AM +0100, Lars Ericsson wrote:
> Hi Johannes,
>
> I have discovered and patched a race in the scanning function since a couple
> of releases.
> To day I checked the current Linux git and the problem is still there.
>
> The problem is the sequence of events when the scan result is reported back.
> The wireless_send_event(dev, SIOCGIWSCAN, &wrqu, NULL);
> is called before ieee80211_hw_config(local);
>
> ieee80211_hw_config(local) will trig the wpa_supplicant to select an AP.
> That may happen before the ieee80211_hw_config() is executed since the
> wpa_supplicant
> generated actions is executed by an other thread (wpa_supplicant).
>
> The result is that:
> - wpa_supplicant setup for an association to an ap using correct channel.
> - ieee80211_hw_config() reset the channel to the value before the SCAN
> started.
> - the association request will be sent out using the wrong channel.
>
>
> Attached you will find the patch for 2.6.27.
> It is not a perfect patch since the code is duplicated but it works :)

Looks like the patch would need to be reworked -- the code in 2.6.29
and later is different. Also, any reason we can't just move the
wireless_send_event() down to done:?

Still, this doesn't feel quite right. Shouldn't we be able to queue
the userland-driven channel change until after the scan completes?

John

diff --git a/net/mac80211/scan.c b/net/mac80211/scan.c
index f5c7c33..8c13a91 100644
--- a/net/mac80211/scan.c
+++ b/net/mac80211/scan.c
@@ -437,15 +437,6 @@ void ieee80211_scan_completed(struct ieee80211_hw *hw)
local->last_scan_completed = jiffies;
memset(&wrqu, 0, sizeof(wrqu));

- /*
- * local->scan_sdata could have been NULLed by the interface
- * down code in case we were scanning on an interface that is
- * being taken down.
- */
- sdata = local->scan_sdata;
- if (sdata)
- wireless_send_event(sdata->dev, SIOCGIWSCAN, &wrqu, NULL);
-
if (local->hw_scanning) {
local->hw_scanning = false;
/*
@@ -486,6 +477,15 @@ void ieee80211_scan_completed(struct ieee80211_hw *hw)
rcu_read_unlock();

done:
+ /*
+ * local->scan_sdata could have been NULLed by the interface
+ * down code in case we were scanning on an interface that is
+ * being taken down.
+ */
+ sdata = local->scan_sdata;
+ if (sdata)
+ wireless_send_event(sdata->dev, SIOCGIWSCAN, &wrqu, NULL);
+
ieee80211_mlme_notify_scan_completed(local);
ieee80211_mesh_notify_scan_completed(local);
}

--
John W. Linville Someday the world will need a hero, and you
[email protected] might be all we have. Be ready.