2008-08-12 06:39:02

by Holger Schurig

Subject: Pondering: how to improve mac80211 roaming ...

Hi !

Over the last few days I looked a bit at mac80211 and how it
works. While doing this, I noticed that mac80211 does virtually
no roaming. That is, it is very usable for a hot-spot setup
(office, home, Starbucks), but not for a place where you have 20
access points and move between them.

It looks like somebody (Jiri?) also noticed this, because
I found the following TODO in mlme.c:

	/* TODO: start monitoring current AP signal quality and number of
	 * missed beacons. Scan other channels every now and then and search
	 * for better APs. */

While pondering this, I had some thoughts which I wanted to
confirm with the wider community, so that I don't make a patch
that won't be accepted.


How to detect missed beacons?
-----------------------------
When I know the beacon period, I could setup a timer
with "beacon_period + beacon_period*0.5". In the timer function I
could then increase a missed beacon counter and act accordingly,
e.g. search for APs, roam etc.

But how would I determine the beacon period?


Is detection of missed beacons good enough?
-------------------------------------------
For me this approach seems like driving a car into the wall of a
house. Then crash-detection notifies me and I'd search for a new
way to drive.

Certainly it's better to act before the accident happens, so I'd
rather do it differently. With a fullmac driver I looked at the
RSSI and, when it fell below a certain threshold, started the
roaming.
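
A rough sketch of such an RSSI trigger, assuming one signal sample
per received beacon. The averaging weight (1/8) and the two
thresholds are illustrative assumptions, and struct rssi_monitor is
a hypothetical name, not something from a fullmac driver or from
mac80211. The second threshold adds hysteresis so that a signal
hovering around the limit does not fire the trigger repeatedly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct rssi_monitor {
	int avg_x8;	/* running average in dBm, scaled by 8 */
	int low_dbm;	/* fire when the average drops below this */
	int high_dbm;	/* re-arm once the average rises above this */
	bool armed;
	bool primed;	/* false until the first sample was seen */
};

static void rssi_monitor_init(struct rssi_monitor *m, int low_dbm,
			      int high_dbm)
{
	assert(m != NULL && low_dbm < high_dbm);
	m->avg_x8 = 0;
	m->low_dbm = low_dbm;
	m->high_dbm = high_dbm;
	m->armed = true;
	m->primed = false;
}

/*
 * Feed one sample. Returns true once when the smoothed signal falls
 * below low_dbm; stays quiet until it recovered above high_dbm.
 */
static bool rssi_monitor_update(struct rssi_monitor *m, int sample_dbm)
{
	int avg;

	if (!m->primed) {
		m->avg_x8 = sample_dbm * 8;
		m->primed = true;
	} else {
		/* avg += (sample - avg) / 8, kept in scaled form */
		m->avg_x8 += sample_dbm - m->avg_x8 / 8;
	}
	avg = m->avg_x8 / 8;

	if (m->armed && avg < m->low_dbm) {
		m->armed = false;
		return true;
	}
	if (!m->armed && avg > m->high_dbm)
		m->armed = true;
	return false;
}
```

A single bad sample does not start roaming here; the average has to
sink below the threshold first.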

In mac80211, I could do this in ieee80211_rx_bss_info(), in the
vicinity of these lines:

	bss->timestamp = beacon_timestamp;
	bss->last_update = jiffies;
	bss->signal = rx_status->signal;
	bss->noise = rx_status->noise;
	bss->qual = rx_status->qual;
	if (!beacon && !bss->probe_resp)
		bss->probe_resp = true;

	/*
	 * In STA mode, the remaining parameters should not be overridden
	 * by beacons because they're not necessarily accurate there.
	 */
	if (sdata->vif.type != IEEE80211_IF_TYPE_IBSS &&
	    bss->probe_resp && beacon) {
		ieee80211_rx_bss_put(local, bss);
		return;
	}


In-kernel or in-userspace? --- or hybrid?
-------------------------------------------
Some people say that in-userspace roaming is the way to go.
Maybe. But in-kernel, I have more information. In the mac80211
case, the code quoted above is executed whenever I receive a
beacon, so I can react very quickly to declining SNR. Userspace
would have to issue scan requests periodically and process them,
which is quite a bit of overhead.

But there is existing user-space code that does roaming, e.g. WPA
Supplicant and (probably, not sure) Network-Manager.

So I think I'd opt for a hybrid approach. Userspace uses cfg80211
to configure some roaming threshold in mac80211. mac80211 would
gain AP-is-about-to-fail detection and, if it detects this, it
would signal via cfg80211 (is this possible?) to userspace that
it should now roam.

Userspace then could, while still associated, scan for other APs
(e.g. first on preferred channels, like 1,6,11, then on all
channels) and if it finds something, trigger association to
another AP.
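
The "preferred channels first" ordering could look roughly like
this in a userspace roaming daemon. build_scan_order() is a
hypothetical helper, not part of wpa_supplicant or
NetworkManager; the caller supplies both channel lists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool contains(const int *set, size_t n, int value)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (set[i] == value)
			return true;
	return false;
}

/*
 * Fill 'order' with the preferred channels first, followed by the
 * remaining entries of 'all'. Preferred channels that are not in
 * 'all' are skipped. Returns the number of entries written.
 */
static size_t build_scan_order(const int *preferred, size_t n_pref,
			       const int *all, size_t n_all,
			       int *order, size_t cap)
{
	size_t i, n = 0;

	assert(preferred != NULL && all != NULL && order != NULL);
	for (i = 0; i < n_pref && n < cap; i++)
		if (contains(all, n_all, preferred[i]) &&
		    !contains(order, n, preferred[i]))
			order[n++] = preferred[i];
	for (i = 0; i < n_all && n < cap; i++)
		if (!contains(order, n, all[i]))
			order[n++] = all[i];
	return n;
}
```

With channels 1 to 13 and the preferred set {1, 6, 11}, the
resulting order starts with 1, 6, 11 and continues with 2, 3, 4, 5,
7 and so on, so a scan can be stopped early once a good AP shows up
on one of the common channels.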


For a WEP or non-encrypted environment, in-kernel roaming would
be possible; this would give mac80211 a behavior similar to what
common fullmac drivers exhibit. But my first goal would not be in
this area.


2008-08-13 12:47:43

by Dan Williams

Subject: Re: Pondering: how to improve mac80211 roaming ...

On Wed, 2008-08-13 at 14:26 +0200, Helmut Schaa wrote:
> On Tuesday, 12 August 2008 17:42:23, Jouni Malinen wrote:
> > On Tue, Aug 12, 2008 at 02:40:23PM +0200, Helmut Schaa wrote:
> > > JFYI I already started to rework the existing scan code in mac80211
> > > (software scan) to do something like background scanning:
> > >
> > > 1) notify current AP about leaving the channel
> > > 2) scan one channel
> >
> > It might be useful to leave this "one" as a parameter to allow easy
> > experiments with scanning more than one channel at a time to reduce
> > latency.
>
> Good point.
>
> > Ideally, this--along the interval for background scans--could
> > be something that is dynamically changed based on the expected traffic
> > pattern. Whenever there is lot of data traffic being (mostly
> > successfully) transmitted, it would be beneficial not to jump to other
> > channels as frequently or for as long a time. If there has not been any
> > data transmission for some time, it may be more acceptable to scan more
> > frequently and to remain away from the operational channel for longer
> > periods of time. Though, we should also keep in mind that background
> > scans are going to increase power consumption on otherwise inactive
> > situation, so setting a suitable policy for this can get quite complex.
>
> It might even be beneficial to cancel a currently active background scan once
> the TX queue is filling up and report the already gathered information to the
> user space.

If that's the case, we should also then have a response to the original
scan request (or broadcast netlink message with the original request's
cookie) that says "scan canceled" so that the requester can handle that.

dan


2008-08-12 12:39:33

by Helmut Schaa

Subject: Re: Pondering: how to improve mac80211 roaming ...

On Tuesday, 12 August 2008, Jouni Malinen wrote:
> On Tue, Aug 12, 2008 at 08:38:52AM +0200, Holger Schurig wrote:

> > In-kernel or in-userspace? --- or hybrid?
> > -------------------------------------------
>
> > So I think I'd opt to a hybrid approach. Userspace uses cfg80211
> > to configure some roaming threshold to mac80211. mac80211 would
> > gain AP-is-about-to-fail detection and, if it detects this, it
> > would signal via cfg80211 (is this possible?) to user-space that
> > it should now roam.
>
> I would also think that this is the most useful design. You could
> already do background scanning in the kernel (triggered by whatever) and
> just send SIOCSIWSCAN WE event to notify user space (e.g.,
> wpa_supplicant) of availability of new scan results. This is something
> that madwifi for example is already doing.

JFYI I already started to rework the existing scan code in mac80211
(software scan) to do something like background scanning:

1) notify current AP about leaving the channel
2) scan one channel
3) get back to the operating channel
4) notify current AP about being back
5) back to 1) if more channels need to be scanned
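
The five steps above can be sketched as a small state machine. This
is only an illustration in plain C, not the reworked mac80211 code;
the enum and function names are made up, and each state stands for
the action that is currently in progress.

```c
#include <assert.h>
#include <stddef.h>

enum bgscan_state {
	BGSCAN_IDLE,
	BGSCAN_NOTIFY_LEAVE,	/* 1) tell the AP we leave the channel */
	BGSCAN_SCAN_CHANNEL,	/* 2) scan one channel */
	BGSCAN_RETURN,		/* 3) switch back to the operating channel */
	BGSCAN_NOTIFY_BACK,	/* 4) tell the AP we are back */
};

struct bgscan {
	enum bgscan_state state;
	size_t next_chan;	/* index of the next channel to scan */
	size_t n_chans;		/* number of channels to scan in total */
};

static void bgscan_start(struct bgscan *s, size_t n_chans)
{
	assert(s != NULL);
	s->next_chan = 0;
	s->n_chans = n_chans;
	s->state = n_chans ? BGSCAN_NOTIFY_LEAVE : BGSCAN_IDLE;
}

/* The current action finished; move on and return the new state. */
static enum bgscan_state bgscan_step(struct bgscan *s)
{
	assert(s != NULL);
	switch (s->state) {
	case BGSCAN_IDLE:
		break;
	case BGSCAN_NOTIFY_LEAVE:
		s->state = BGSCAN_SCAN_CHANNEL;
		break;
	case BGSCAN_SCAN_CHANNEL:
		s->next_chan++;
		s->state = BGSCAN_RETURN;
		break;
	case BGSCAN_RETURN:
		s->state = BGSCAN_NOTIFY_BACK;
		break;
	case BGSCAN_NOTIFY_BACK:
		/* 5) back to 1) if more channels need to be scanned */
		s->state = s->next_chan < s->n_chans ?
			   BGSCAN_NOTIFY_LEAVE : BGSCAN_IDLE;
		break;
	}
	return s->state;
}
```

Between BGSCAN_NOTIFY_BACK and the next BGSCAN_NOTIFY_LEAVE the
interface is on the operating channel, which is where buffered
traffic would get its chance to flow.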

The code basically works, but it is not cleaned up yet and not yet triggered
by any event (I just used the current start scan event to initiate a
background scan).

Maybe it is reasonable to use background scanning if the interface is
currently associated with an AP and regular scanning in all other cases.

Helmut

2008-08-14 06:42:29

by Kalle Valo

Subject: Re: Pondering: how to improve mac80211 roaming ...

Holger Schurig writes:

> Hi !

Hello,

> The last days I looked a bit at mac80211 and how it works. While
> doing this, I detected that mac80211 does virtually no roaming.
> That is, it is very usable for a hot-spot setup (Office, Home,
> Starbucks), but not for a place where you have 20 Access-Points
> and move between them.

Yes, the current implementation is far from perfect. I will be working
on roaming as well and want to improve it.

[...]

> How to detect missed beacons?
> -----------------------------
> When I know the beacon period, I could setup a timer
> with "beacon_period + beacon_period*0.5". In the timer function I
> could then increase a missed beacon counter and act accordingly,
> e.g. search for APs, roam etc.
>
> But how would I determine the beacon period?

Jouni answered this one already.

> Is detection of missed beacons good enough?
> -------------------------------------------
> For me this approach seems like driving a car into the wall of a
> house. Then crash-detection notifies me and I'd search for a new
> way to drive.
>
> Certainly it's better to act before the accident happens, so I'd
> rather do it differently. With a fullmac driver I looked at the
> RSSI and, when it fell below a certain threshold, started the
> roaming.

Yes, we definitely need background scanning which is triggered, at
least, based on a RSSI threshold and possibly some other parameters,
for example number of failed transmissions.

> In-kernel or in-userspace? --- or hybrid?
> -------------------------------------------

This is the question.

> Some people say that in-userspace roaming is the way to go.

In my opinion we should move roaming logic to the user space as much
as possible.

> Maybe. But in-kernel, I have more information.

Yes, the kernel has more information. But the kernel can send an event to
user space whenever there is a need to background scan or roam.

> In the mac80211 case, the above quoted code is executed whenever I
> receive a beacon. So I can very quickly react to declining SNR.

It wouldn't be that much slower to send an event to userspace and let
userspace initiate scanning. This isn't in the hot path.

> Userspace would have to issue periodically scan requests and process
> them, quite a bit of overhead.

Do you mean the overhead of creating the scan results and sending them
to the user space? Is that really so high that we need to optimize it?
Remember that the interval between scans is measured in seconds, so it
doesn't happen very often.

> But there is existing user-space code that does roaming, e.g. WPA
> Supplicant and (probably, not sure) Network-Manager.
>
> So I think I'd opt to a hybrid approach. Userspace uses cfg80211
> to configure some roaming threshold to mac80211. mac80211 would
> gain AP-is-about-to-fail detection and, if it detects this, it
> would signal via cfg80211 (is this possible?) to user-space that
> it should now roam.

I think we need to support Wireless Extensions as well, because
cfg80211 is not widely used yet.

> Userspace then could, while still associated, scan for other APs
> (e.g. first on preferred channels, like 1,6,11, then on all
> channels) and if it finds something, trigger association to
> another AP.

This sounds just what I have been thinking.

> For a WEP or non-encrypted environment a in-kernel-roaming would
> be possible, this would bring a similar behavior to mac80211 that
> common fullmac drivers exhibit. But my first goal would not be in
> this area.

My view is that in-kernel roaming is not worth the effort. I would
prefer to keep the implementation simple and not complicate mac80211
unnecessarily. If we have the scan logic in one place, i.e. in user
space, implementation and testing are a lot simpler.

--
Kalle Valo

2008-08-13 14:18:20

by Holger Schurig

Subject: Re: Pondering: how to improve mac80211 roaming ...

> It might even be beneficial to cancel a currently active
> background scan once the TX queue is filling up and report the
> already gathered information to the user space.

This brings policy into the kernel. Why should TX data be more
important than a scan result?

If an app only gets partial scan results because of "too much tx
data", what can the app do about it? Nothing. It can only
trigger the scan again. But what if there is still "too much tx
data" then? The total time of n "partial scans" might end up
bigger than that of one "real scan", which would have left the
application happy.

2008-08-12 17:56:47

by Luis R. Rodriguez

Subject: Re: Pondering: how to improve mac80211 roaming ...

On Tue, Aug 12, 2008 at 8:42 AM, Jouni Malinen wrote:
> On Tue, Aug 12, 2008 at 02:40:23PM +0200, Helmut Schaa wrote:
>
>> JFYI I already started to rework the existing scan code in mac80211
>> (software scan) to do something like background scanning:
>>
>> 1) notify current AP about leaving the channel
>> 2) scan one channel
>
> It might be useful to leave this "one" as a parameter to allow easy
> experiments with scanning more than one channel at a time to reduce
> latency. Ideally, this--along the interval for background scans--could
> be something that is dynamically changed based on the expected traffic
> pattern. Whenever there is lot of data traffic being (mostly
> successfully) transmitted, it would be beneficial not to jump to other
> channels as frequently or for as long a time. If there has not been any
> data transmission for some time, it may be more acceptable to scan more
> frequently and to remain away from the operational channel for longer
> periods of time. Though, we should also keep in mind that background
> scans are going to increase power consumption on otherwise inactive
> situation, so setting a suitable policy for this can get quite complex.
>
> If we can somehow figure out whether there is periodic need for some
> traffic (e.g., VoIP), it would also be nice to be able to schedule the
> background scans to happen when there is no expected packet being sent
> or received at the time. Obviously, this gets much more complex, so this
> is not really something that would need to be included in the first
> version ;-). Anyway, it would be useful to keep in mind that there may
> be needs for making the background scan parameters change dynamically
> and there is need to be able to configure a suitable policy from user
> space.

If someone is picking this up please extend our TODO list for roaming
and mention it:

http://wireless.kernel.org/en/developers/todo-list#Roaming

Luis

2008-08-13 12:00:56

by Dan Williams

Subject: Re: Pondering: how to improve mac80211 roaming ...

On Wed, 2008-08-13 at 08:52 +0200, Holger Schurig wrote:
> > cfg80211 lacks a command for request new scans, so that could
> > also be an area that would benefit of improvements if the
> > current SIOCSIWSCAN WE ioctl does not provide all the
> > functionality needed for this (though, it may be more because
> > of SIOCSIWSCAN handler in mac80211 lacking support for many of
> > the options).
>
> If cfg80211 gets a scan command, it should probably get more
> than one command:
>
> * command "do scan" with lots of options, e.g. band, channels,
> ESSID, active/passive
>
> * a notification from mac80211->userspace "scan completed"
>
> * command "get current scan list"
> - this command should not return the equivalent of -EAGAIN
> - always return the whole list, even if one process asked
> a "do scan" with ESSID "MUH" and another process asked
> a "do scan" with ESSID "BLAH" one second later
> - filtering/sorting should be done in user-space

Yes; a scan list is a system resource, not per-process of course; the
driver should be adding the new results to its internal scan cache and
timing out old results as appropriate. And since the process that
requested the filtered scan obviously knows what filter options it was
requesting, it can certainly apply that filter itself.
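
To make the "scan cache with timeouts" idea concrete, here is a
small sketch in plain C. The fixed table size, the field names and
the replace-the-oldest policy are assumptions chosen for brevity,
not how any particular driver does it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_SIZE 32

struct bss_entry {
	uint8_t bssid[6];
	int signal_dbm;
	uint64_t last_seen_ms;
	bool used;
};

struct scan_cache {
	struct bss_entry e[CACHE_SIZE];
	uint64_t max_age_ms;	/* entries older than this are dropped */
};

/* Add a new result or refresh the entry of an already known BSS. */
static void cache_update(struct scan_cache *c, const uint8_t *bssid,
			 int signal_dbm, uint64_t now_ms)
{
	struct bss_entry *slot = NULL;
	size_t i;

	assert(c != NULL && bssid != NULL);
	for (i = 0; i < CACHE_SIZE; i++) {
		struct bss_entry *e = &c->e[i];

		if (e->used && memcmp(e->bssid, bssid, 6) == 0) {
			slot = e;	/* known BSS: refresh it */
			break;
		}
		/* Remember a free slot or, failing that, the oldest entry. */
		if (!slot || (slot->used &&
			      (!e->used ||
			       e->last_seen_ms < slot->last_seen_ms)))
			slot = e;
	}
	memcpy(slot->bssid, bssid, 6);
	slot->signal_dbm = signal_dbm;
	slot->last_seen_ms = now_ms;
	slot->used = true;
}

/* Drop stale entries; returns how many entries are still valid. */
static size_t cache_expire(struct scan_cache *c, uint64_t now_ms)
{
	size_t i, live = 0;

	assert(c != NULL);
	for (i = 0; i < CACHE_SIZE; i++) {
		struct bss_entry *e = &c->e[i];

		if (!e->used)
			continue;
		if (now_ms - e->last_seen_ms > c->max_age_ms)
			e->used = false;
		else
			live++;
	}
	return live;
}
```

Because the cache is shared, a filtered scan by one process simply
refreshes some entries; any filtering of the returned list is left
to the requester.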

That said, it would be useful to pass the requested scan options along
with the "scan completed" response. Say process A just scans "ch1 +
ssid foo", and process B wants full scans. If process B just listens
for scan events, and backs off a scan timer each time an event comes in,
it'll never ever get any results other than what process A requested.
If scan options were passed along with the "scan completed" event, then
process B could figure out that it needs more complete scan information
and trigger a scan by itself. Right now with WEXT, we can't do this
AFAIK.
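
As a sketch of what process B could do with such information: if
the "scan completed" event carried the options of the finished scan,
a simple coverage check would tell B whether it still has to scan
by itself. struct scan_opts and scan_covers() are hypothetical;
neither WEXT nor cfg80211 is claimed to offer them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct scan_opts {
	uint32_t chan_mask;	/* bit (n - 1) set: channel n was scanned */
	char ssid[33];		/* empty string: wildcard, all SSIDs */
};

/* True if a finished scan with options 'done' already covers 'want'. */
static bool scan_covers(const struct scan_opts *done,
			const struct scan_opts *want)
{
	assert(done != NULL && want != NULL);
	/* Every wanted channel must have been visited. */
	if ((done->chan_mask & want->chan_mask) != want->chan_mask)
		return false;
	/* A wildcard scan covers any SSID filter. */
	if (done->ssid[0] == '\0')
		return true;
	/* A filtered scan only covers a request for the same SSID. */
	return strcmp(done->ssid, want->ssid) == 0;
}
```

In the example above, A's scan ("ch1 + ssid foo") does not cover
B's full scan, so B would trigger its own scan instead of backing
off its timer.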

Dan


2008-08-14 08:26:08

by Jouni Malinen

Subject: Re: Pondering: how to improve mac80211 roaming ...

On Thu, Aug 14, 2008 at 10:05:27AM +0300, Kalle Valo wrote:

> What about PMKSA caching? Shouldn't user space need to provide PMKIDs
> (or the IEs) to mac80211 before association?

There are two possible designs for this. In wpa_supplicant terms,
ap_scan=2 would most likely mean that driver is responsible for
generating the RSN IE, including PMKID list, and wpa_supplicant is
providing the driver a list of currently available PMKs (PMKIDs for
them). In ap_scan=1, there is option for the driver to use the RSN IE
from wpa_supplicant (which is something that mac80211 is doing at the
moment) and in that case, yes, PMKIDs would indeed need to be updated
in the RSN IE just before association.

--
Jouni Malinen PGP id EFC895FA

2008-08-14 18:28:58

by Dan Williams

Subject: Re: Pondering: how to improve mac80211 roaming ...

On Thu, 2008-08-14 at 11:25 +0300, Jouni Malinen wrote:
> On Thu, Aug 14, 2008 at 10:05:27AM +0300, Kalle Valo wrote:
>
> > What about PMKSA caching? Shouldn't user space need to provide PMKIDs
> > (or the IEs) to mac80211 before association?
>
> There are two possible designs for this. In wpa_supplicant terms,
> ap_scan=2 would most likely mean that driver is responsible for
> generating the RSN IE, including PMKID list, and wpa_supplicant is
> providing the driver a list of currently available PMKs (PMKIDs for
> them). In ap_scan=1, there is option for the driver to use the RSN IE
> from wpa_supplicant (which is something that mac80211 is doing at the
> moment) and in that case, yes, PMKIDs would indeed need to be updated
> in the RSN IE just before association.

The problem with ap_scan is that it really has two uses: roaming (its
original meaning) and hidden SSID support (the acquired meaning because
of WEXT command ordering issues and lack of SSID scan support).

It's important to note that historically, mac80211 has completely failed
with ap_scan=2 and I'm not sure it's worth fixing that because effort
should be directed into cfg80211.

So if you're going to suggest that people should actually start using
roaming support and fixing it up, we need to figure out how to do
scanning better in the supplicant config too.

Dan



2008-08-13 06:52:20

by Holger Schurig

Subject: Re: Pondering: how to improve mac80211 roaming ...

> cfg80211 lacks a command for request new scans, so that could
> also be an area that would benefit of improvements if the
> current SIOCSIWSCAN WE ioctl does not provide all the
> functionality needed for this (though, it may be more because
> of SIOCSIWSCAN handler in mac80211 lacking support for many of
> the options).

If cfg80211 gets a scan command, it should probably get more
than one command:

* command "do scan" with lots of options, e.g. band, channels,
ESSID, active/passive

* a notification from mac80211->userspace "scan completed"

* command "get current scan list"
- this command should not return the equivalent of -EAGAIN
- always return the whole list, even if one process asked
a "do scan" with ESSID "MUH" and another process asked
a "do scan" with ESSID "BLAH" one second later
- filtering/sorting should be done in user-space

2008-08-13 14:28:28

by Holger Schurig

Subject: Re: Pondering: how to improve mac80211 roaming ...

> Yes; a scan list is a system resource, not per-process of
> course; the driver should be adding the new results to its
> internal scan cache and timing out old results as appropriate.

Agreed.


> That said, it would be useful to pass the requested scan
> options along with the "scan completed" response. Say process
> A just scans "ch1 + ssid foo", and process B wants full scans.
> If process B just listens for scan events, and backs off a
> scan timer each time an event comes in, it'll never ever get
> any results other than what process A requested. If scan
> options were passed along with the "scan completed" event,
> then process B could figure out that it needs more complete
> scan information and trigger a scan by itself. Right now with
> WEXT, we can't do this AFAIK.

That's one possible solution, but if I get you correctly, then
the kernel would need to keep a FIFO of the various scan
requests.

However, scanning is something that is restricted by the
hardware. We could simply define that "there can only be one scan
in progress at a time per hardware".

So if the hardware is busy, we can return an error. The client
then knows that it has to request the same scan later (ideally
after a "scan done" event has been seen).


Process A        Process B                mac80211 / Hardware
"do scan MUH"    -                        scan for MUH on channel 1
-                -                        scan for MUH on channel 2
-                -                        ...
-                -                        scan for MUH on channel 12
-                "scan BLAH" -> -EAGAIN   scan for MUH on channel 13
-                -                        "scan done" notification
"get result"     "do scan BLAH"           scan for BLAH on channel 1
-                -                        ...

So, if you do a "scan MUH" and you don't get an -EAGAIN, the
next "scan done" event is yours and you get "your" scan result
with "get result".

If you get an -EAGAIN, you cannot scan until the current scan is
done. This is marked by the next "scan done" event. You can then
decide to get the other process' scan result and do something
with it --- or you can issue your own "do scan" command and wait
until you get your "scan done".

I think this is both efficient and flexible.
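
The "one scan in progress at a time" rule boils down to very little
code. The sketch below is plain C with made-up names (struct
scan_hw, do_scan(), scan_done()); locking, which real kernel code
would need, is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct scan_hw {
	bool busy;	/* a scan is currently running */
	int owner;	/* id of the requester whose scan is running */
};

/* "do scan": 0 on success, -EAGAIN while another scan is running. */
static int do_scan(struct scan_hw *hw, int requester)
{
	assert(hw != NULL);
	if (hw->busy)
		return -EAGAIN;
	hw->busy = true;
	hw->owner = requester;
	return 0;
}

/*
 * Called when the hardware finished scanning. Returns the id of the
 * requester the "scan done" notification belongs to.
 */
static int scan_done(struct scan_hw *hw)
{
	assert(hw != NULL && hw->busy);
	hw->busy = false;
	return hw->owner;
}
```

No FIFO of pending requests is needed: a requester that got -EAGAIN
waits for the next "scan done" and then either uses the results or
calls do_scan() again.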

2008-08-13 06:41:44

by Holger Schurig

Subject: Re: Pondering: how to improve mac80211 roaming ...

> If someone is picking this up please extend our TODO list for
> roaming and mention it:
>
> http://wireless.kernel.org/en/developers/todo-list#Roaming

Nice tip. From this page, I saw that wpa_supplicant supports
802.11r (Fast BSS transition: establish security/QoS to an AP
before making the transition). I assume that mac80211 doesn't
support this yet?

From the Wikipedia page about this I also found 802.11k (Radio
Resource Management: the AP detects that the station is moving
away and informs the station, the station asks the AP for a list
of nearby APs and uses this to select another AP). Does anybody
know an AP with 802.11k support?

2008-08-13 06:51:00

by Jouni Malinen

Subject: Re: Pondering: how to improve mac80211 roaming ...

On Wed, Aug 13, 2008 at 08:41:37AM +0200, Holger Schurig wrote:

> Nice tip. From this page, I saw that wpa_supplicant supports
> 802.11r (Fast BSS transition: establish security/QoS to an AP
> before making the transition). I assume that mac80211 doesn't
> support this yet?

Yes, that would be the case for client mode with MLME in mac80211. I
would expect AP mode to be working with mac80211 + hostapd since MLME is
in hostapd in that case. I'm first hoping to get mac80211 +
wpa_supplicant with user space MLME working with 802.11r. It should be
possible to make this work with mac80211-based client MLME, too, if
desired.

--
Jouni Malinen PGP id EFC895FA

2008-08-14 07:05:43

by Kalle Valo

Subject: Re: Pondering: how to improve mac80211 roaming ...

Jouni Malinen writes:

> On Tue, Aug 12, 2008 at 08:38:52AM +0200, Holger Schurig wrote:
>
>> How to detect missed beacons?
>> -----------------------------
>> When I know the beacon period, I could setup a timer
>> with "beacon_period + beacon_period*0.5". In the timer function I
>> could then increase a missed beacon counter and act accordingly,
>> e.g. search for APs, roam etc.
>
> Some (most?) wlan hardware designs allow this to be done in
> firmware/hardware and that would likely be more efficient way of
> determining missed beacons. In other words, the design here should take
> into account the possibility of moving this functionality into the
> driver and only optionally (if no hw support) set up this to be done
> with host CPU and timers.

Yes, I have seen firmwares which can send an event if beacons are
lost. This feature is really needed.

>> But how would I determine the beacon period?
>
> struct ieee80211_sta_bss::beacon_int
>
> (in TUs, i.e., 1.024 ms)
>
>> Is detection of missed beacons good enough?
>> -------------------------------------------
>> Certainly it's better to act before the accident happens, so I'd
>> rather do it differently. With a fullmac driver I looked at the
>> RSSI and, when it fell below a certain threshold, started the
>> roaming.
>
> That is another feature that is available in many hw/firmware designs
> even for softmac drivers (i.e., interrupt if current BSS signal strength
> drops below a threshold value). This could trigger a background scan to
> figure out whether there are better APs available at the moment.
> Similarly, it would be useful to be able to trigger such background
> scans periodically even without a specific signal strength trigger.

I agree, we need this feature as well.

>> In-kernel or in-userspace? --- or hybrid?
>> -------------------------------------------
>
>> So I think I'd opt to a hybrid approach. Userspace uses cfg80211
>> to configure some roaming threshold to mac80211. mac80211 would
>> gain AP-is-about-to-fail detection and, if it detects this, it
>> would signal via cfg80211 (is this possible?) to user-space that
>> it should now roam.
>
> I would also think that this is the most useful design. You could
> already do background scanning in the kernel (triggered by whatever) and
> just send SIOCSIWSCAN WE event to notify user space (e.g.,
> wpa_supplicant) of availability of new scan results. This is something
> that madwifi for example is already doing.

Didn't think of it like this. Yes, that's possible as well. But in
some cases we might want to create more advanced scanning logic, for
example different scanning parameters based on if the display is on or
off and stuff like that. If the kernel issues the scan, making more
advanced logic based on policy is more difficult. With user space
roaming this would be simpler.

>> For a WEP or non-encrypted environment a in-kernel-roaming would
>> be possible, this would bring a similar behavior to mac80211 that
>> common fullmac drivers exhibit. But my first goal would not be in
>> this area.
>
> That should not be limited to just WEP or plaintext. If mac80211 reports
> association even to a new BSS, wpa_supplicant should be able to complete
> WPA/IEEE 802.1X handshake with the new AP after that.

What about PMKSA caching? Shouldn't user space need to provide PMKIDs
(or the IEs) to mac80211 before association?

I'm still in favor of handling roaming in user space.

--
Kalle Valo

2008-08-12 08:23:34

by Jouni Malinen

Subject: Re: Pondering: how to improve mac80211 roaming ...

On Tue, Aug 12, 2008 at 08:38:52AM +0200, Holger Schurig wrote:

> The last days I looked a bit at mac80211 and how it works. While
> doing this, I detected that mac80211 does virtually no roaming.

> It looks like somebody (Jiri?) also noticed this, because
> I found the following TODO in mlme.c:
>
> /* TODO: start monitoring current AP signal quality and number of
> * missed beacons. Scan other channels every now and then and search
> * for better APs. */

I think I actually wrote that a long time ago. The original design for
the client MLME code in the kernel was to be minimal and just support a
fixed location (i.e., just minimal roaming if the AP goes away for any
reason).

> How to detect missed beacons?
> -----------------------------
> When I know the beacon period, I could setup a timer
> with "beacon_period + beacon_period*0.5". In the timer function I
> could then increase a missed beacon counter and act accordingly,
> e.g. search for APs, roam etc.

Some (most?) wlan hardware designs allow this to be done in
firmware/hardware and that would likely be a more efficient way of
determining missed beacons. In other words, the design here should take
into account the possibility of moving this functionality into the
driver and only optionally (if there is no hw support) set this up to be
done with host CPU and timers.

> But how would I determine the beacon period?

struct ieee80211_sta_bss::beacon_int

(in TUs, i.e., 1.024 ms)
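
For reference, converting such a value for use with a timer is a
one-liner. The helper names below are made up for this sketch; only
the factor (1 TU = 1024 microseconds) is given.

```c
#include <assert.h>
#include <stdint.h>

#define TU_USEC 1024u	/* one time unit (TU) is 1024 microseconds */

/* Beacon period in microseconds for a beacon interval given in TUs. */
static uint32_t beacon_period_usec(uint16_t beacon_int_tu)
{
	assert(beacon_int_tu != 0);	/* an interval of 0 is not valid */
	return (uint32_t)beacon_int_tu * TU_USEC;
}

/* Same in milliseconds, rounded up so a timer never fires too early. */
static uint32_t beacon_period_msec_ceil(uint16_t beacon_int_tu)
{
	return (beacon_period_usec(beacon_int_tu) + 999u) / 1000u;
}
```

A beacon interval of 100 TUs therefore corresponds to 102400
microseconds, or 103 ms when rounded up.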

> Is detection of missed beacons good enough?
> -------------------------------------------
> Certainly it's better to act before the accident happens, so I'd
> rather do it differently. With a fullmac driver I looked at the
> RSSI and, when it fell below a certain threshold, started the
> roaming.

That is another feature that is available in many hw/firmware designs
even for softmac drivers (i.e., interrupt if current BSS signal strength
drops below a threshold value). This could trigger a background scan to
figure out whether there are better APs available at the moment.
Similarly, it would be useful to be able to trigger such background
scans periodically even without a specific signal strength trigger.

> In-kernel or in-userspace? --- or hybrid?
> -------------------------------------------

> So I think I'd opt to a hybrid approach. Userspace uses cfg80211
> to configure some roaming threshold to mac80211. mac80211 would
> gain AP-is-about-to-fail detection and, if it detects this, it
> would signal via cfg80211 (is this possible?) to user-space that
> it should now roam.

I would also think that this is the most useful design. You could
already do background scanning in the kernel (triggered by whatever) and
just send SIOCSIWSCAN WE event to notify user space (e.g.,
wpa_supplicant) of availability of new scan results. This is something
that madwifi for example is already doing.

As far as a notification for a current-signal-below-threshold event is
concerned, that would be a new notification and it would need to be
added somewhere. At this point, cfg80211/nl80211 does not provide a new
event mechanism, i.e., the WE netlink messages are still used.

cfg80211 lacks a command for requesting new scans, so that could also be
an area that would benefit from improvements if the current SIOCSIWSCAN WE
ioctl does not provide all the functionality needed for this (though, it
may be more because of the SIOCSIWSCAN handler in mac80211 lacking support
for many of the options).

> For a WEP or non-encrypted environment a in-kernel-roaming would
> be possible, this would bring a similar behavior to mac80211 that
> common fullmac drivers exhibit. But my first goal would not be in
> this area.

That should not be limited to just WEP or plaintext. If mac80211 reports
association even to a new BSS, wpa_supplicant should be able to complete
WPA/IEEE 802.1X handshake with the new AP after that.

--
Jouni Malinen PGP id EFC895FA

2008-08-13 12:26:59

by Helmut Schaa

Subject: Re: Pondering: how to improve mac80211 roaming ...

On Tuesday, 12 August 2008 17:42:23, Jouni Malinen wrote:
> On Tue, Aug 12, 2008 at 02:40:23PM +0200, Helmut Schaa wrote:
> > JFYI I already started to rework the existing scan code in mac80211
> > (software scan) to do something like background scanning:
> >
> > 1) notify current AP about leaving the channel
> > 2) scan one channel
>
> It might be useful to leave this "one" as a parameter to allow easy
> experiments with scanning more than one channel at a time to reduce
> latency.

Good point.

> Ideally, this--along the interval for background scans--could
> be something that is dynamically changed based on the expected traffic
> pattern. Whenever there is lot of data traffic being (mostly
> successfully) transmitted, it would be beneficial not to jump to other
> channels as frequently or for as long a time. If there has not been any
> data transmission for some time, it may be more acceptable to scan more
> frequently and to remain away from the operational channel for longer
> periods of time. Though, we should also keep in mind that background
> scans are going to increase power consumption on otherwise inactive
> situation, so setting a suitable policy for this can get quite complex.

It might even be beneficial to cancel a currently active background scan once
the TX queue is filling up and report the already gathered information to the
user space.

Helmut

2008-08-12 15:43:12

by Jouni Malinen

Subject: Re: Pondering: how to improve mac80211 roaming ...

On Tue, Aug 12, 2008 at 02:40:23PM +0200, Helmut Schaa wrote:

> JFYI I already started to rework the existing scan code in mac80211
> (software scan) to do something like background scanning:
>
> 1) notify current AP about leaving the channel
> 2) scan one channel

It might be useful to leave this "one" as a parameter to allow easy
experiments with scanning more than one channel at a time to reduce
latency. Ideally, this--along with the interval for background scans--could
be something that is dynamically changed based on the expected traffic
pattern. Whenever there is a lot of data traffic being (mostly
successfully) transmitted, it would be beneficial not to jump to other
channels as frequently or for as long a time. If there has not been any
data transmission for some time, it may be more acceptable to scan more
frequently and to remain away from the operational channel for longer
periods of time. Though, we should also keep in mind that background
scans are going to increase power consumption in otherwise inactive
situations, so setting a suitable policy for this can get quite complex.

If we can somehow figure out whether there is periodic need for some
traffic (e.g., VoIP), it would also be nice to be able to schedule the
background scans to happen when there is no expected packet being sent
or received at the time. Obviously, this gets much more complex, so this
is not really something that would need to be included in the first
version ;-). Anyway, it would be useful to keep in mind that the
background scan parameters may need to change dynamically and that
user space needs to be able to configure a suitable policy.
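
A minimal sketch of what such a traffic-dependent policy could look
like. Everything here is an assumption made for illustration: the
struct names, the bgscan_pick_policy() helper and all numbers are
hypothetical and do not exist in mac80211.

```c
#include <stdint.h>

/* Hypothetical policy knobs; none of these names exist in mac80211. */
struct bgscan_policy {
	unsigned int channels_per_hop;  /* channels scanned before returning */
	unsigned int interval_ms;       /* pause between two scan hops */
};

/* Hypothetical traffic summary over the last observation window. */
struct traffic_stats {
	uint32_t tx_ok;      /* frames transmitted successfully */
	uint32_t tx_failed;  /* frames that were not acknowledged */
	uint32_t idle_ms;    /* time since the last data frame */
};

#define BUSY_FRAMES  100   /* assumed meaning of "a lot of traffic" */
#define IDLE_MS     2000   /* assumed meaning of "no data for some time" */

static struct bgscan_policy bgscan_pick_policy(const struct traffic_stats *t)
{
	/* Default: one channel per hop, moderate interval. */
	struct bgscan_policy p = { .channels_per_hop = 1, .interval_ms = 5000 };
	uint32_t total = t->tx_ok + t->tx_failed;

	if (t->idle_ms >= IDLE_MS) {
		/* Link is quiet: stay away longer and scan more often. */
		p.channels_per_hop = 4;
		p.interval_ms = 1000;
	} else if (total >= BUSY_FRAMES && t->tx_ok >= 3 * (total / 4)) {
		/* Busy and mostly successful: leave the channel rarely. */
		p.channels_per_hop = 1;
		p.interval_ms = 30000;
	}
	return p;
}
```

The sketch ignores power consumption entirely; a real policy would
probably want user space to be able to override each of these values.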

--
Jouni Malinen PGP id EFC895FA

2008-08-12 09:56:30

by Holger Schurig

Subject: Re: Pondering: how to improve mac80211 roaming ...

> > When I know the beacon period, I could setup a timer
> > with "beacon_period + beacon_period*0.5". In the timer
> > function I could then increase a missed beacon counter and
> > act accordingly, e.g. search for APs, roam etc.
>
> Some (most?) wlan hardware designs allow this to be done in
> firmware/hardware and that would likely be more efficient way
> of determining missed beacons.

Ahh, I see. E.g. in wireless-testing, ath5k/hw.c:

    AR5K_REG_WRITE_BITS(ah, AR5K_RSSI_THR,
                        AR5K_RSSI_THR_BMISS,
                        state->bs_bmiss_threshold);

However, a grep for bs_bmiss_threshold doesn't show that it
ever gets set. The whole section is also marked with #if 0 ...
#endif.




Reporting events from the driver
--------------------------------
How would a driver report beacon miss to mac80211? A function,
incrementing some variable in some structure? How does this
variable get decremented?

Surely it would be set to zero at association time, but maybe the
value should degrade over time (e.g. it advanced to 6 while I was
at the edge of the AP area, but then I moved back into good
coverage).

So maybe the driver calls

    void mac80211_notify_missed_beacon(struct ieee80211_hw *hw,
                                       u32 count)

Then mac80211 could do the increment/decrement handling common to
all drivers. The other function:

To report a too-low SNR/RSSI, we could re-use
ieee80211_notify_mac(). We could also add a new parameter to
ieee80211_notify_mac() so that we can transport some data via it,
besides the event. This way we wouldn't need a new function
mac80211_notify_missed_beacon().
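
The increment/decrement handling described in this section could be
sketched roughly as follows. This is only an illustration under
assumptions: the bmiss_* names, the halving on each received beacon
and the limit of 7 are all made up, and plain C types stand in for
the kernel ones.

```c
#include <stdbool.h>
#include <stdint.h>

#define BMISS_ROAM_LIMIT 7   /* assumed threshold, not from any driver */

/* Hypothetical per-association state kept by the stack. */
struct bmiss_state {
	uint32_t missed;   /* decaying count of missed beacons */
};

/* Set the counter to zero at association time. */
static void bmiss_reset(struct bmiss_state *s)
{
	s->missed = 0;
}

/* Driver reported 'count' newly missed beacons; true means "start roaming". */
static bool bmiss_report(struct bmiss_state *s, uint32_t count)
{
	s->missed += count;
	return s->missed >= BMISS_ROAM_LIMIT;
}

/* A beacon was received: let the counter degrade by halving it. */
static void bmiss_beacon_seen(struct bmiss_state *s)
{
	s->missed /= 2;
}
```

With these assumed numbers, a station that reached 6 at the edge of
the coverage area drops back to 0 after three received beacons.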



Announcing capabilities
-----------------------
Somehow mac80211 needs to send the threshold to the drivers ---
or do it on its own. So we need some kind of capability bits.

So maybe I need:

IEEE80211_HW_BEACON_MISSED
IEEE80211_HW_LOW_SIGNAL

(ath5k supports an RSSI threshold, maybe other HW supports an SNR
threshold? And do we have the usual dB vs. dBm problem here?).

We could also forget about RSSI/SNR dB/dBm and just
use "sensitivity" from 0..n (e.g. 0..3), with:

0: don't try to roam
1: try to stay at current AP quite long
2: middle ground
3: roam quickly

IEEE80211_HW_SENS_THRESHOLD



If the driver doesn't support either of these, then mac80211
could do that on its own. I'd actually try to get this
functionality working first.
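
As a rough sketch of how the 0..3 sensitivity and the capability
bits could fit together: the flag names are the ones proposed above,
but their bit values, the roam_thresholds struct, the helper
functions and every number in the table are assumptions chosen only
for illustration (and the table simply assumes dBm).

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag names follow the proposal above; the bit values are made up. */
#define IEEE80211_HW_BEACON_MISSED  (1u << 0)
#define IEEE80211_HW_LOW_SIGNAL     (1u << 1)

/* Hypothetical thresholds derived from the 0..3 sensitivity setting. */
struct roam_thresholds {
	bool enabled;            /* false for sensitivity 0 */
	int rssi_dbm;            /* roam when the signal drops below this */
	unsigned int max_bmiss;  /* roam after this many missed beacons */
};

static struct roam_thresholds roam_thresholds_for(unsigned int sensitivity)
{
	/* Placeholder numbers, chosen only to be monotonic. */
	static const struct roam_thresholds table[] = {
		{ false,   0,  0 },  /* 0: don't try to roam */
		{ true,  -85, 10 },  /* 1: stay at current AP quite long */
		{ true,  -75,  7 },  /* 2: middle ground */
		{ true,  -65,  4 },  /* 3: roam quickly */
	};

	if (sensitivity > 3)
		sensitivity = 3;  /* clamp out-of-range values */
	return table[sensitivity];
}

/* True if the stack must count missed beacons itself. */
static bool need_sw_bmiss(uint32_t hw_flags)
{
	return !(hw_flags & IEEE80211_HW_BEACON_MISSED);
}

/* True if the stack must watch the signal level itself. */
static bool need_sw_signal(uint32_t hw_flags)
{
	return !(hw_flags & IEEE80211_HW_LOW_SIGNAL);
}
```

A driver that sets neither bit would get both checks done in
software, which matches the "working first" case mentioned above.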


> cfg80211 lacks a command for request new scans

But that is something that I currently don't want to do. A scan
command can be quite complicated, e.g. scan active (with probe
requests), scan passive etc etc etc. For me this is another
problem to tackle :-)

> > For a WEP or non-encrypted environment a in-kernel-roaming
> > would be possible, this would bring a similar behavior to
> > mac80211 that common fullmac drivers exhibit. But my first
> > goal would not be in this area.
>
> That should not be limited to just WEP or plaintext. If
> mac80211 reports association even to a new BSS, wpa_supplicant
> should be able to complete WPA/IEEE 802.1X handshake with the
> new AP after that.

Ah, good to know :-)

2008-08-13 14:26:14

by Helmut Schaa

Subject: Re: Pondering: how to improve mac80211 roaming ...

On Wednesday, 13 August 2008 16:18:13, Holger Schurig wrote:
> > It might even be beneficial to cancel a currently active
> > background scan once the TX queue is filling up and report the
> > already gathered information to the user space.
>
> This brings policy into the kernel. Why should TX data be more
> important than a scan result?

I was only referring to background scanning not triggered by user
space. If any user space application requests a scan, it must of course
not be canceled due to TX data.
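
A minimal sketch of that rule, assuming the stack knows who started
the scan and how full the TX queue is. The enum, the
scan_should_cancel() helper and the 75% fill level are hypothetical
and only meant to illustrate the distinction.

```c
#include <stdbool.h>

/* Hypothetical: who asked for the scan. */
enum scan_origin {
	SCAN_ORIGIN_BACKGROUND,  /* started by the stack on its own */
	SCAN_ORIGIN_USER,        /* requested from user space */
};

#define TXQ_CANCEL_PERCENT 75  /* assumed fill level that aborts a scan */

static bool scan_should_cancel(enum scan_origin origin,
			       unsigned int txq_len, unsigned int txq_max)
{
	/* Scans requested from user space are never canceled for TX data. */
	if (origin == SCAN_ORIGIN_USER)
		return false;
	/* Without a known queue size there is nothing to compare against. */
	if (txq_max == 0)
		return false;
	return txq_len * 100 >= txq_max * TXQ_CANCEL_PERCENT;
}
```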

2008-08-13 12:53:10

by Helmut Schaa

Subject: Re: Pondering: how to improve mac80211 roaming ...

On Wednesday, 13 August 2008 14:49:15, Dan Williams wrote:
> On Wed, 2008-08-13 at 14:26 +0200, Helmut Schaa wrote:
> > On Tuesday, 12 August 2008 17:42:23, Jouni Malinen wrote:
> > > On Tue, Aug 12, 2008 at 02:40:23PM +0200, Helmut Schaa wrote:
> > > > JFYI I already started to rework the existing scan code in mac80211
> > > > (software scan) to do something like background scanning:
> > > >
> > > > 1) notify current AP about leaving the channel
> > > > 2) scan one channel
> > >
> > > It might be useful to leave this "one" as a parameter to allow easy
> > > experiments with scanning more than one channel at a time to reduce
> > > latency.
> >
> > Good point.
> >
> > > Ideally, this--along the interval for background scans--could
> > > be something that is dynamically changed based on the expected traffic
> > > pattern. Whenever there is lot of data traffic being (mostly
> > > successfully) transmitted, it would be beneficial not to jump to other
> > > channels as frequently or for as long a time. If there has not been any
> > > data transmission for some time, it may be more acceptable to scan more
> > > frequently and to remain away from the operational channel for longer
> > > periods of time. Though, we should also keep in mind that background
> > > scans are going to increase power consumption on otherwise inactive
> > > situation, so setting a suitable policy for this can get quite complex.
> >
> > It might even be beneficial to cancel a currently active background scan
> > once the TX queue is filling up and report the already gathered
> > information to the user space.
>
> If that's the case, we should also then have a response to the original
> scan request (or broadcast netlink message with the original request's
> cookie) that says "scan canceled" so that the requester can handle that.

Right. Something like "The scan was canceled but here are the incomplete
results".
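
Such a notification could, for example, carry the original request's
cookie together with a status. The sketch below is purely
illustrative: the scan_report struct and both helpers are
hypothetical and do not describe an existing netlink message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum scan_status {
	SCAN_COMPLETED,  /* all requested channels were visited */
	SCAN_CANCELED,   /* aborted early, results are incomplete */
};

/* Hypothetical report sent back to whoever requested the scan. */
struct scan_report {
	uint64_t cookie;            /* cookie of the original request */
	enum scan_status status;
	size_t channels_requested;
	size_t channels_scanned;    /* results cover only these */
};

static struct scan_report scan_finish(uint64_t cookie, size_t requested,
				      size_t scanned, bool canceled)
{
	struct scan_report r = {
		.cookie = cookie,
		.status = canceled ? SCAN_CANCELED : SCAN_COMPLETED,
		.channels_requested = requested,
		.channels_scanned = scanned,
	};
	return r;
}

/* True for "the scan was canceled but here are the incomplete results". */
static bool scan_report_is_partial(const struct scan_report *r)
{
	return r->status == SCAN_CANCELED &&
	       r->channels_scanned < r->channels_requested;
}
```

The requester can then decide whether the incomplete results are
good enough or whether it wants to retry later.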

Helmut