2009-05-14 17:52:30

by Luis R. Rodriguez

Subject: Scan while TX/RX'ing a lot of data

I'm told Network Manager scans every 60 seconds. When TX'ing or RX'ing
a lot of data you will see a big dip in throughput and sometimes it
may cause issues with some connections. Jouni pointed out a possible
nice option here: split the scans per channel through time. Now with
nl80211 this is possible but right now Network Manager uses wext
through wpa_supplicant in many distributions and this won't change for
a bit (maybe by the next major distribution releases?). Since we're
stuck with wext for the current distribution releases I'd like to hear
feedback on a possible nice solution. Should we simply cancel scan?

Luis


2009-05-16 06:16:09

by Luis R. Rodriguez

Subject: Re: Scan while TX/RX'ing a lot of data

On Fri, May 15, 2009 at 11:12 PM, Luis R. Rodriguez <[email protected]> wrote:
> On Thu, May 14, 2009 at 2:26 PM, Luis R. Rodriguez <[email protected]> wrote:
>> On Thu, May 14, 2009 at 1:57 PM, Dan Williams <[email protected]> wrote:
>>> On Thu, 2009-05-14 at 12:07 -0700, Luis R. Rodriguez wrote:
>>>> On Thu, May 14, 2009 at 11:54 AM, Dan Williams <[email protected]> wrote:
>>
>>>> > Secondarily, scanning is a tradeoff between better roaming latency and
>>>> > continuous high throughput.  If you don't scan, you have no idea what's
>>>> > around, and when you move and the current AP becomes marginal, you
>>>> > *have* to take the hit no matter what, so you can scan and find a new AP
>>>> > to associate with.
>>>>
>>>> True however I'm inclined to believe that generally when you are
>>>> sending or receiving a lot of data you are stationary so roaming might
>>>> not be that urgent. For 802.11p (vehicle stuff) I suppose this may be
>>>> a bit different but I have yet to see this take off.
>>>
>>> I'll believe 11p when I see it somewhere :)  But anyway, yeah, there are
>>> tricks we can play here, and I'm happy to entertain methods of making
>>> the scanning hit less annoying.  I'm simply opposed to a checkbox in the
>>> UI for "never scan while connected" because that's a pure cop-out: we
>>> should be making things intelligent, not taking the half-assed approach
>>> like people (nobody here of course) have been whining about for a long
>>> time.  Scanning isn't something the user should have to care about or
>>> turn on or off; it's something that NetworkManager (in concert with
>>> usage information from the stack) should be able to handle
>>> automatically.
>>
>> Agreed.
>>
>>> Maybe if there was a way to figure out how full mac80211's QoS buckets
>>> were, that would help?  Or get a frame counts from each of the 4 QoS
>>> buckets individually?  That would allow NM to make more intelligent
>>> decisions about stuff.  Maybe a more detailed nl80211 stats interface
>>> would do the trick here.
>>
>> Would help for debugging too anyway, since QOS uses the new device
>> queues maybe that might already be available?
>>
>>>> > I would have thought that the periodic scanning would be more of an
>>>> > annoyance when doing VOIP, SSH, or other latency-sensitive tasks, but when
>>>> > just downloading a file, a few second drop in transfer rate gets lost in
>>>> > the bucket in the grand scheme of things.
>>>>
>>>> Good point, how about we use pm_qos for disabling scan when we are
>>>> sending a lot of data (whatever we define this to mean)? Then
>>>> applications can just write to this pm_qos stuff and you won't see
>>>> this.
>>>>
>>>> I forget when pm_qos was introduced though.
>>>
>>> Yeah, something like that, although it's sometimes hard to go from app
>>> -> interface;
>>
>> Yeah at least pm_qos is not device specific IIRC.
>>
>>> if you have a wired interface connected at the same time,
>>> but the thing you're sucking down data from is on the wifi interface,
>>> we'd need some way to figure that out.
>>
>> Point taken.
>>
>>> Hence the idea about exposing
>>> the packet counts of the WMM queues or something like that.
>>
>> I think this is a good idea, although it wouldn't solve anything for
>> people stuck on old userspace (my oldest target is distributions
>> releases circa 2.6.27). Not sure what is best for that case. The
>> kernel could be informed of the lack of userspace regulating this and
>> then take some reasonable action. What the reasonable action and the
>> terms for that remain unclear to me.
>
> OK so what if we just let old wext scan be just a dump of the
> current bss list provided we do selective scanning once per channel
> throughout a period of time.
>
> For new userspace we can obviously do whatever we like, but since we'd
> implement the above we could just let this automatic scanning be
> tweaked for roaming purposes with exposed knobs. That is -- make the
> code so that it can be tweaked to act like a regular good old scan or
> take breaks throughout.
>
> If you're already associated it makes little sense to be scanning all
> over. If you're not associated it makes sense to do the good old scan
> we're used to. Of course this would just be for mac80211 and cfg80211
> drivers, old wext will obviously still behave the same for old
> hardware.
>
> Meh?

Additionally we could add Kconfig for SMART_WEXT or whatever, and old
configs would behave the same. However users of old distributions
wanting new shiny drivers with new shiny benefits can enable it -- and
it would still work with old userspace.

Luis

2009-05-16 12:49:00

by Johannes Berg

Subject: Re: Scan while TX/RX'ing a lot of data

On Fri, 2009-05-15 at 19:15 -0400, Dan Williams wrote:

> Well, what does "background scan" mean? mac80211 will obviously accept
> beacons from APs on the same channel and happily add them to the scan
> list.

Only if powersave/beacon filtering isn't enabled. Actually, mac80211
does beacon filtering itself now, so I don't think this happens any more
(I did that on purpose so people don't really rely on the former
behaviour)

> But a background scan would imply that the card itself is taking
> over the decision about when to jump around and scan, basically the same
> thing NM is doing, right? When would the card/stack decide to
> background scan, and what channels would it background scan on? If it
> doesn't do more or less all passive channels, then it wouldn't be that
> useful for figuring out the site survey data.

I would think that mac80211, while asked to scan when associated, could
simply do something like calculating how long it can go off-channel
under the current QoS requirements, and then go off-channel for that,
wait for the next on-channel beacon and some traffic, scan the next
channel(s) again etc.
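
One way to picture the off-channel budget described above, as a rough sketch only. The function name, the millisecond parameters, and the cost model (a fixed dwell per channel plus one channel switch per hop) are assumptions made for illustration; none of them come from mac80211.

```c
#include <assert.h>

/*
 * Hypothetical helper: how many channels fit into one off-channel excursion
 * without staying away longer than the current traffic can tolerate.
 *
 * Assumed cost model: visiting n channels takes n dwell periods plus
 * n + 1 channel switches (n hops out, one hop back to the operating channel).
 */
static unsigned int channels_per_excursion(unsigned int max_away_ms,
                                           unsigned int dwell_ms,
                                           unsigned int switch_ms)
{
    assert(dwell_ms > 0);

    if (max_away_ms <= switch_ms)
        return 0; /* not even the hop back home fits */

    /* n * dwell + (n + 1) * switch <= max_away
     *   =>  n <= (max_away - switch) / (dwell + switch) */
    return (max_away_ms - switch_ms) / (dwell_ms + switch_ms);
}
```

With a 100 ms budget, 30 ms dwell and 5 ms per switch this yields two channels per excursion; a result of zero would mean the scan has to wait for a quieter moment.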

johannes



2009-05-19 09:06:46

by Johannes Berg

Subject: Re: Scan while TX/RX'ing a lot of data

On Mon, 2009-05-18 at 15:43 +0200, Helmut Schaa wrote:

> That's what I've tried with the patch at [1]. I used a fixed time schedule
> for switching back to the operating channel after each scanned channel.
> However, that could of course be done somewhat dynamic based on the current
> traffic characteristics.

Not sure there's a need -- we have plenty of time.

> The basic problem I had was that I couldn't check if the nullfunc frame
> indicating the new "powersave" state to the AP was already sent out (or
> ACKed by the AP). This resulted in lost frames sometimes: if the device's
> tx queue contained a lot of data frames the nullfunc frame was sent out
> _after_ the channel switch occurred.

Indeed.

> If anybody has a good idea on how to fix this issue I'm glad to provide an
> updated version of the bg-scan patch.

Hmmm.

> One solution would be to force all drivers to report the tx status. Another
> one would be to just wait until the device's queue is empty. At that point
> we know that all pending data frames were sent out and additionally the
> nullfunc frame was also sent out, so we can safely switch to the next
> to-be-scanned channel.

Some drivers now flush queues at channel switch time, but there's no way
to guarantee that the software queues are actually empty and the
nullfunc frame isn't queued up behind other traffic. Also, waiting for
any driver to report a status will lead to problems because the driver
might just have dropped the frame for various reasons like being out of
memory.

johannes



2009-05-14 18:54:09

by Dan Williams

Subject: Re: Scan while TX/RX'ing a lot of data

On Thu, 2009-05-14 at 10:52 -0700, Luis R. Rodriguez wrote:
> I'm told Network Manager scans every 60 seconds. When TX'ing or RX'ing
> a lot of data you will see a big dip in throughput and sometimes it
> may cause issues with some connections. Jouni pointed out a possible
> nice option here: split the scans per channel through time. Now with
> nl80211 this is possible but right now Network Manager uses wext
> through wpa_supplicant in many distributions and this won't change for
> a bit (maybe by the next major distribution releases?). Since we're
> stuck with wext for the current distribution releases I'd like to hear
> feedback on a possible nice solution. Should we simply cancel scan?

Libertas splits scans up into 3 parts with a short return to the
operating channel between each part. There's nothing that requires
cfg80211 for that to work...

Something I've tossed around for a while is counting traffic on the
device and if it's over a certain bitrate for a period of time, postpone
the scan for a while. But after a certain amount of time, there's going
to be a scan no matter what.
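
A minimal sketch of that postponement rule, assuming a once-per-second tick. The struct names, fields, and thresholds are hypothetical and are not taken from NetworkManager or any driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical policy knobs. */
struct scan_policy {
    uint64_t busy_bytes_per_sec;   /* at or above this rate the link counts as busy */
    unsigned int busy_window_sec;  /* busy for this long before a scan is postponed */
    unsigned int max_defer_sec;    /* after this much postponing, scan no matter what */
};

struct scan_state {
    unsigned int busy_sec;     /* consecutive busy seconds seen so far */
    unsigned int deferred_sec; /* seconds the pending scan has been postponed */
};

/*
 * Call once per second with the bytes moved in that second.
 * scan_due says whether the periodic scan timer has expired.
 * Returns true when the scan should be started now.
 */
static bool scan_tick(const struct scan_policy *p, struct scan_state *s,
                      uint64_t bytes_this_sec, bool scan_due)
{
    assert(p && s);

    if (bytes_this_sec >= p->busy_bytes_per_sec)
        s->busy_sec++;
    else
        s->busy_sec = 0;

    if (!scan_due)
        return false;

    /* Sustained traffic: postpone, but only up to the hard limit. */
    if (s->busy_sec >= p->busy_window_sec &&
        s->deferred_sec < p->max_defer_sec) {
        s->deferred_sec++;
        return false;
    }

    s->deferred_sec = 0;
    return true;
}
```

The hard limit is what keeps the AP list from going stale: however busy the link is, the scan runs once max_defer_sec has elapsed.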

The problem here is that at any time an application (say, wifi location
app) could ask for the list of access points. If you don't scan
periodically, all APs other than your associated AP (and others on the
same channel) will gradually drop off because their beacons aren't
received. Hard to wifi position or get area statistics if there's only
one AP in the list.

Secondarily, scanning is a tradeoff between better roaming latency and
continuous high throughput. If you don't scan, you have no idea what's
around, and when you move and the current AP becomes marginal, you
*have* to take the hit no matter what, so you can scan and find a new AP
to associate with.

I would have thought that the periodic scanning would be more of an
annoyance when doing VOIP, SSH, or other latency-sensitive tasks, but when
just downloading a file, a few second drop in transfer rate gets lost in
the bucket in the grand scheme of things.

Dan



2009-05-19 09:10:05

by Johannes Berg

Subject: Re: Scan while TX/RX'ing a lot of data

On Mon, 2009-05-18 at 11:06 -0700, Luis R. Rodriguez wrote:

> > No, I just think we should, if associated, spread out the scan a little
> > more, and not change anything in the userspace API at all, just change
> > the time it takes to do the scan and allow data to pass by going back to
> > the operational channel.
>
> If we enhance scanning in mac80211 while you're associated by relying
> on some metrics we're leaving userspace out of the decision. It would
> seem nicer to expose these metrics to userspace so it can take a
> better informed decision. The issue still remains with old wext unless
> we are OK with some defaults. Which is why I was suggesting a SMART_WEXT.
> But it seems we are OK with penalizing a userspace wext scan by
> delaying it quite a bit when associated at the expense of doing
> scattered scan.

We would also do that with a nl80211 scan, if it was requested for all
channels. I don't see how we are leaving userspace out of the decision
either. Yes, we're not asking userspace in which order to scan (actually
nl80211 order is used but we don't guarantee that) or how many channels
to scan between traffic etc.

However, all this information doesn't belong into the scan command. That
would mean applications would not only have to register their
requirements with the kernel but also with the scanning applications!
Which would be a completely hopeless approach! Yes, having userspace
influence policy on these things is great, but it also needs to have a
chance to do something sane!

> Hm, yeah I just noticed we don't rx flush on channel change... I do
> see this notice on ath_set_channel()
>
> /* XXX: do not flush receive queue here. We don't want
> * to flush data frames already in queue because of
> * changing channel. */
>
> I forget why we added that now.. Anyway I guess we can test something like this:
>
> diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c
> index bbbfdcd..a5db284 100644
> --- a/drivers/net/wireless/ath/ath9k/main.c
> +++ b/drivers/net/wireless/ath/ath9k/main.c
> @@ -261,6 +261,11 @@ int ath_set_channel(struct ath_softc *sc, struct ieee80211_hw *hw,
> ath9k_hw_set_interrupts(ah, 0);
> ath_drain_all_txq(sc, false);
> stopped = ath_stoprecv(sc);
> + ath_flushrecv(sc);
> +
> + spin_lock_bh(&sc->rx.rxflushlock);
> + ath_rx_tasklet(sc, 0);
> + spin_unlock_bh(&sc->rx.rxflushlock);
>
> /* XXX: do not flush receive queue here. We don't want
> * to flush data frames already in queue because of
>
>
>
> But not quite sure if this would resolve the race you mention.

I don't think so, no.

> > Then again I don't quite see why we discard frames received during a
> > scan even if they were for us.
>
> We do discard them?

Yes.

johannes



2009-05-14 19:07:08

by Johannes Berg

Subject: Re: Scan while TX/RX'ing a lot of data

On Thu, 2009-05-14 at 14:54 -0400, Dan Williams wrote:

> Libertas splits scans up into 3 parts with a short return to the
> operating channel between each part. There's nothing that requires
> cfg80211 for that to work...

Yeah that was my idea too, just return to the operating channel after
having scanned a channel or two and wait for the next beacon, and
possibly receive traffic if indicated in that beacon. It takes a bit of
synchronisation and isn't easy to implement, but it's definitely
possible.

> The problem here is that at any time an application (say, wifi location
> app) could ask for the list of access points. If you don't scan
> periodically, all APs other than your associated AP (and others on the
> same channel) will gradually drop off because their beacons aren't
> received. Hard to wifi position or get area statistics if there's only
> one AP in the list.

The other thing we should do is bump the AP list timeout, I think -- 10
seconds is very small. But then again we really need such apps to query
NM anyway.

> Secondarily, scanning is a tradeoff between better roaming latency and
> continuous high throughput. If you don't scan, you have no idea what's
> around, and when you move and the current AP becomes marginal, you
> *have* to take the hit no matter what, so you can scan and find a new AP
> to associate with.

Yeah, that too.

> I would have thought that the periodic scanning would be more of an
> annoyance when doing VOIP, SSH, or other latency-sensitive tasks, but when
> just downloading a file, a few second drop in transfer rate gets lost in
> the bucket in the grand scheme of things.

ssh isn't too bad, at least not after I fixed the timings... before I
got very annoyed with ar9170. OTOH with VoIP it would suck to have a
small hiccup every 2 minutes -- maybe then we should postpone
indefinitely and only do it if the signal strength fluctuates?

johannes



2009-05-20 13:54:19

by Johannes Berg

Subject: Re: Scan while TX/RX'ing a lot of data

On Wed, 2009-05-20 at 15:43 +0200, Helmut Schaa wrote:

> > > Hence, would it be possible to:
> > > 1) Stop all sub_if tx queues (afterwards no new data frames should
> > > appear in the mdev tx queue)
> > > 2) Queue the nullfunc frame
> > > 3) Flush the mdev's tx queue
> > > 4) Switch the channel
> > > ?
> >
> > I don't think that is sufficient,
>
> Maybe not sufficient, but it's at least an improvement.

True. The other thing is that it's not entirely trivial to flush the
mdev queues afaict.

> > unless the driver also flushes the
> > hardware queue at channel switch time.
>
> Yes, but that would be the drivers responsibility and could be
> fixed in a second step.
>
> > Which we may want to make explicit with a new callback or so?
>
> And make it mandatory?

Maybe not -- drivers that don't handle it wouldn't be worse off than
now, after all.

johannes



2009-05-20 11:45:28

by Johannes Berg

Subject: Re: Scan while TX/RX'ing a lot of data

On Wed, 2009-05-20 at 13:41 +0200, Helmut Schaa wrote:
> On Tuesday, 19 May 2009, Johannes Berg wrote:
> > On Mon, 2009-05-18 at 15:43 +0200, Helmut Schaa wrote:
>
> > > One solution would be to force all drivers to report the tx status. Another
> > > one would be to just wait until the device's queue is empty. At that point
> > > we know that all pending data frames were sent out and additionally the
> > > nullfunc frame was also sent out, so we can safely switch to the next
> > > to-be-scanned channel.
> >
> > Some drivers now flush queues at channel switch time, but there's no way
> > to guarantee that the software queues are actually empty and the
> > nullfunc frame isn't queued up behind other traffic. Also, waiting for
> > any driver to report a status will lead to problems because the driver
> > might just have dropped the frame for various reasons like being out of
> > memory.
>
> Hmm, the problem is basically (from a mac80211 perspective) that the
> mdev tx queue could still contain data frames when the nullfunc frame
> gets queued, right? And we do not assure that these frames are sent
> out before the channel switch.
>
> Hence, would it be possible to:
> 1) Stop all sub_if tx queues (afterwards no new data frames should
> appear in the mdev tx queue)
> 2) Queue the nullfunc frame
> 3) Flush the mdev's tx queue
> 4) Switch the channel
> ?

I don't think that is sufficient, unless the driver also flushes the
hardware queue at channel switch time. Which we may want to make
explicit with a new callback or so?
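
The ordering under discussion can be pictured with a toy model, sketched here under heavy assumptions. The types and functions below are invented for illustration and are not mac80211 interfaces. The model only covers the software queue, so it deliberately does not address the hardware-queue caveat raised above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_LEN 16 /* arbitrary size for the toy queue */

enum frame_kind { FRAME_DATA, FRAME_NULLFUNC };

/* Toy stand-in for the software TX queue; not a real kernel structure. */
struct txq_model {
    enum frame_kind frames[QUEUE_LEN];
    size_t len;
    bool stopped;        /* upper layers may no longer enqueue data frames */
    bool nullfunc_sent;  /* the "I'm away" frame has left the queue */
    int channel;
};

static bool model_enqueue(struct txq_model *q, enum frame_kind kind)
{
    if (q->len == QUEUE_LEN)
        return false;
    if (kind == FRAME_DATA && q->stopped)
        return false;
    q->frames[q->len++] = kind;
    return true;
}

/* Pretend to transmit everything that is queued, in order. */
static void model_flush(struct txq_model *q)
{
    for (size_t i = 0; i < q->len; i++)
        if (q->frames[i] == FRAME_NULLFUNC)
            q->nullfunc_sent = true;
    q->len = 0;
}

/* Steps 1-4 from the proposal, in order. Returns true if the nullfunc
 * frame went out before the channel changed. */
static bool model_leave_channel(struct txq_model *q, int scan_channel)
{
    assert(q);

    q->stopped = true;                      /* 1) stop the data queues */
    if (!model_enqueue(q, FRAME_NULLFUNC))  /* 2) queue the nullfunc frame */
        return false;
    model_flush(q);                         /* 3) flush pending frames */
    q->channel = scan_channel;              /* 4) switch the channel */
    return q->nullfunc_sent;
}
```

Because the queue is stopped first, no data frame can slip in behind the nullfunc frame, which is the property the four steps are after.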

johannes



2009-05-15 08:32:13

by Johannes Berg

Subject: Re: Scan while TX/RX'ing a lot of data

On Fri, 2009-05-15 at 10:11 +0200, Holger Schurig wrote:

[...]

good thoughts

> Also, I think that the background-scan stuff belongs more or less
> in the "let roaming not suck" department. I wonder where the
> GSOC people are now ...

That particular project wasn't accepted. We get to work on it
ourselves ;)

johannes



2009-05-15 08:12:24

by Holger Schurig

Subject: Re: Scan while TX/RX'ing a lot of data

On Thursday 14 May 2009 20:54:58 Dan Williams wrote:
> Libertas splits scans up into 3 parts with a short return to
> the operating channel between each part. There's nothing that
> requires cfg80211 for that to work...

Hey, it's up to 4 channels (MRVDRV_MAX_CHANNELS_PER_SCAN).

Yeah, I had to implement this "stuttering scan" because the
firmware that I had access to cannot send
power-save-null-packets to the AP. Without that, I could not
notify to the current AP that I'm away, and if the AP has data
for me, which I don't ack for more than 1000ms, then the AP
might have disassociated me in the mean-time.


So basically I changed the scanning into a state machine. I get a
list of channels to scan (e.g. "all channels", if the request
comes from WEXT). Then a scan worker calls the logic of the
state machine. The state machine does its work and either
re-schedules the workqueue or, if every channel has been
visited, emits the SIOCGIWSCAN event.

It's a bit more complicated, because one can force a full scan
(e.g. when initially associating).
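
A stripped-down sketch of such a stuttering-scan state machine, not the libertas code itself. CHANNELS_PER_PASS, the struct, and the callback are assumptions made for illustration; a real worker would re-queue itself on SCAN_RESCHEDULE and emit the scan-done event on SCAN_DONE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CHANNELS_PER_PASS 4 /* assumed, mirroring the "up to 4" mentioned above */

enum scan_result { SCAN_RESCHEDULE, SCAN_DONE };

struct stutter_scan {
    const int *channels; /* channels requested, e.g. "all channels" */
    size_t count;
    size_t next;         /* index of the next channel to visit */
    bool force_full;     /* e.g. when not yet associated */
};

/*
 * One invocation of the scan worker: visit a few channels, then tell the
 * caller whether to re-schedule or to report the finished scan.
 * scan_channel may be NULL, in which case channels are only counted.
 */
static enum scan_result stutter_scan_step(struct stutter_scan *s,
                                          void (*scan_channel)(int channel))
{
    assert(s && s->next <= s->count);

    size_t budget = s->force_full ? s->count : CHANNELS_PER_PASS;

    for (size_t i = 0; i < budget && s->next < s->count; i++) {
        if (scan_channel)
            scan_channel(s->channels[s->next]);
        s->next++;
    }

    return s->next == s->count ? SCAN_DONE : SCAN_RESCHEDULE;
}
```

Between two steps the device sits on its operating channel, so the AP never sees it gone for longer than one pass.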


> Something I've tossed around for a while is counting traffic
> on the device and if it's over a certain bitrate for a period
> of time, postpone the scan for a while. But after a certain
> amount of time, there's going to be a scan no matter what.

Traffic could for example modify the time for delays between
re-schedules of the scan-state-machine.


> Secondarily, scanning is a tradeoff between better roaming
> latency and continuous high throughput.

Kind of a QoS thingy?


> If you don't scan,
> you have no idea what's around, and when you move and the
> current AP becomes marginal, you *have* to take the hit no
> matter what, so you can scan and find a new AP to associate
> with.

TeX does its layout by minimizing (calculated) ugliness.
Maybe a background-scan decision can be made on (calculated)
urgency? E.g. if background scan is enabled at all, the
urgency of a background scan increases over time. A "huge" amount
of traffic decreases the urgency. And only when the urgency
reaches some level do we run the background scan for the next n
channels. Throw in the possibility of a full scan and a sane
user-space and this scheme might actually work.
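
The urgency idea could look roughly like this. It is a sketch with made-up weights; the three constants and the function name are hypothetical tuning knobs, not values from any driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tuning constants. */
#define URGENCY_PER_SEC    10   /* urgency gained per second without a scan */
#define KBYTES_PER_POINT   100  /* traffic that cancels one point of urgency */
#define URGENCY_THRESHOLD  100  /* scan the next n channels at this level */

/*
 * Update the running urgency score for one interval.
 * Returns true when the next chunk of the background scan should run;
 * the score is reset in that case.
 */
static bool urgency_update(long *urgency, unsigned int elapsed_sec,
                           unsigned long traffic_kbytes)
{
    assert(urgency);

    *urgency += (long)elapsed_sec * URGENCY_PER_SEC;
    *urgency -= (long)(traffic_kbytes / KBYTES_PER_POINT);
    if (*urgency < 0)
        *urgency = 0; /* heavy traffic cannot bank "negative" urgency */

    if (*urgency < URGENCY_THRESHOLD)
        return false;

    *urgency = 0;
    return true;
}
```

Clamping at zero is a design choice: a long download delays the scan while it lasts, but does not keep delaying it after the traffic stops.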



Also, I think that the background-scan stuff belongs more or less
in the "let roaming not suck" department. I wonder where the
GSOC people are now ...

2009-05-16 12:57:41

by Johannes Berg

Subject: Re: Scan while TX/RX'ing a lot of data

On Fri, 2009-05-15 at 23:12 -0700, Luis R. Rodriguez wrote:

> > I think this is a good idea, although it wouldn't solve anything for
> > people stuck on old userspace (my oldest target is distributions
> > releases circa 2.6.27). Not sure what is best for that case. The
> > kernel could be informed of the lack of userspace regulating this and
> > then take some reasonable action. What the reasonable action and the
> > terms for that remain unclear to me.
>
> OK so what if we just let old wext scan be just a dump of the
> current bss list provided we do selective scanning once per channel
> throughout a period of time.

Hmm, no. I don't think we want to scan automatically. On the other hand,
a scan that the user triggered could be spread out over more time like I
just wrote in the other mail.

> For new userspace we can obviously do whatever we like, but since we'd
> implement the above we could just let this automatic scanning be
> tweaked for roaming purposes with exposed knobs. That is -- make the
> code so that it can be tweaked to act like a regular good old scan or
> take breaks throughout.
>
> If you're already associated it makes little sense to be scanning all
> over. If you're not associated it makes sense to do the good old scan
> we're used to. Of course this would just be for mac80211 and cfg80211
> drivers, old wext will obviously still behave the same for old
> hardware.

Ick.

[other mail]

> Additionally we could add Kconfig for SMART_WEXT or whatever, and old
> configs would behave the same. However users of old distributions
> wanting new shiny drivers with new shiny benefits can enable it -- and
> it would still work with old userspace.

How complicated do you want to make it?

No, I just think we should, if associated, spread out the scan a little
more, and not change anything in the userspace API at all, just change
the time it takes to do the scan and allow data to pass by going back to
the operational channel.

Now, I'm not saying this is easy, it can be almost arbitrarily tricky,
but I still think we should do it. One thing for example could be if
we're scanning the operational channel then the only thing we do is send
probe requests, nothing more.

The other thing to notice is that there's a race between RX and channel
switch that I pointed out before -- and not all hardware is capable of
closing that race. Atheros hardware for example doesn't seem to contain
the channel the frame was received on in the RX information, so that the
driver fills that in based on the current operating channel, which means
that in order to report that correctly you have to flush RX...

Then again I don't quite see why we discard frames received during a
scan even if they were for us.

Anyway, bottom line is that I don't think we should change the APIs in
any way, we should instead make the scan smarter by spreading it out if
we're active. This even applies to scanning while in an IBSS -- stop
beaconing, then go scan etc.

johannes



2009-05-18 17:52:57

by Luis R. Rodriguez

Subject: Re: Scan while TX/RX'ing a lot of data

On Mon, May 18, 2009 at 5:38 AM, John W. Linville
<[email protected]> wrote:
> On Sat, May 16, 2009 at 02:57:37PM +0200, Johannes Berg wrote:
>
>> Anyway, bottom line is that I don't think we should change the APIs in
>> any way, we should instead make the scan smarter by spreading it out if
>> we're active. This even applies to scanning while in an IBSS -- stop
>> beaconing, then go scan etc.
>
> ACK

OK FYI -- I think Senthil is up for taking this on.

Luis

2009-05-14 20:53:54

by Johannes Berg

Subject: Re: Scan while TX/RX'ing a lot of data

On Thu, 2009-05-14 at 16:50 -0400, Dan Williams wrote:

> If there was a reliable mechanism to figure out whether there was a
> certain QoS level of traffic flowing through the card, this would be
> easier to do automatically. AFAIK all the APIs these days are
> socket-based, and that doesn't help us get from app -> interface without
> a lot of intermediate steps. How does mac80211 figure out what to put
> into each of the 4 buckets for wifi QoS / WMM?

Well, it depends what you need to do. You can use
setsockopt(SO_PRIORITY) or, like ping -Q, use DSCP.

johannes



2009-05-18 18:06:55

by Luis R. Rodriguez

Subject: Re: Scan while TX/RX'ing a lot of data

On Sat, May 16, 2009 at 5:57 AM, Johannes Berg
<[email protected]> wrote:
> On Fri, 2009-05-15 at 23:12 -0700, Luis R. Rodriguez wrote:
>
>> > I think this is a good idea, although it wouldn't solve anything for
>> > people stuck on old userspace (my oldest target is distributions
>> > releases circa 2.6.27). Not sure what is best for that case. The
>> > kernel could be informed of the lack of userspace regulating this and
>> > then take some reasonable action. What the reasonable action and the
>> > terms for that remain unclear to me.
>>
>> OK so what if we just let old wext scan be just a dump of the
>> current bss list provided we do selective scanning once per channel
>> throughout a period of time.
>
> Hmm, no. I don't think we want to scan automatically. On the other hand,
> a scan that the user triggered could be spread out over more time like I
> just wrote in the other mail.

The other mail talks about things to do for new stuff. Keep in mind
I'm talking here about a way to make whatever we do with the new stuff
usable with the old stuff, while still not changing the
expected behavior dramatically. Granted I was making the assumption
automatic scanning was a good idea which it seems it may not be. But
more on this below.

>> For new userspace we can obviously do whatever we like, but since we'd
>> implement the above we could just let this automatic scanning be
>> tweaked for roaming purposes with exposed knobs. That is -- make the
>> code so that it can be tweaked to act like a regular good old scan or
>> take breaks throughout.
>>
>> If you're already associated it makes little sense to be scanning all
>> over. If you're not associated it makes sense to do the good old scan
>> we're used to. Of course this would just be for mac80211 and cfg80211
>> drivers, old wext will obviously still behave the same for old
>> hardware.
>
> Ick.
>
> [other mail]
>
>> Additionally we could add Kconfig for SMART_WEXT or whatever, and old
>> configs would behave the same. However users of old distributions
>> wanting new shiny drivers with new shiny benefits can enable it -- and
>> it would still work with old userspace.
>
> How complicated do you want to make it?

Implementation of scattered "soft scanning", I guess you can call it,
is a bit complicated in itself. I was trying to figure out a way to
keep the old behavior intact while still allowing old userspace to
gain the benefits of whatever it is that we do come up with for the
new stuff.

> No, I just think we should, if associated, spread out the scan a little
> more, and not change anything in the userspace API at all, just change
> the time it takes to do the scan and allow data to pass by going back to
> the operational channel.

If we enhance scanning in mac80211 while you're associated by relying
on some metrics we're leaving userspace out of the decision. It would
seem nicer to expose these metrics to userspace so it can take a
better informed decision. The issue still remains with old wext unless
we are OK with some defaults. Which is why I was suggesting a SMART_WEXT.
But it seems we are OK with penalizing a userspace wext scan by
delaying it quite a bit when associated at the expense of doing
scattered scan.

> Now, I'm not saying this is easy, it can be almost arbitrarily tricky,
> but I still think we should do it.

OK!

> One thing for example could be if
> we're scanning the operational channel then the only thing we do is send
> probe requests, nothing more.
>
> The other thing to notice is that there's a race between RX and channel
> switch that I pointed out before -- and not all hardware is capable of
> closing that race. Atheros hardware for example doesn't seem to contain
> the channel the frame was received on in the RX information, so that the
> driver fills that in based on the current operating channel, which means
> that in order to report that correctly you have to flush RX...

Hm, yeah I just noticed we don't rx flush on channel change... I do
see this notice on ath_set_channel()

/* XXX: do not flush receive queue here. We don't want
* to flush data frames already in queue because of
* changing channel. */

I forget why we added that now.. Anyway I guess we can test something like this:

diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c
index bbbfdcd..a5db284 100644
--- a/drivers/net/wireless/ath/ath9k/main.c
+++ b/drivers/net/wireless/ath/ath9k/main.c
@@ -261,6 +261,11 @@ int ath_set_channel(struct ath_softc *sc, struct ieee80211_hw *hw,
ath9k_hw_set_interrupts(ah, 0);
ath_drain_all_txq(sc, false);
stopped = ath_stoprecv(sc);
+ ath_flushrecv(sc);
+
+ spin_lock_bh(&sc->rx.rxflushlock);
+ ath_rx_tasklet(sc, 0);
+ spin_unlock_bh(&sc->rx.rxflushlock);

/* XXX: do not flush receive queue here. We don't want
* to flush data frames already in queue because of



But not quite sure if this would resolve the race you mention.

> Then again I don't quite see why we discard frames received during a
> scan even if they were for us.

We do discard them?

> Anyway, bottom line is that I don't think we should change the APIs in
> any way, we should instead make the scan smarter by spreading it out if
> we're active. This even applies to scanning while in an IBSS -- stop
> beaconing, then go scan etc.

OK sounds like a plan.

Luis

2009-05-14 20:56:19

by Dan Williams

Subject: Re: Scan while TX/RX'ing a lot of data

On Thu, 2009-05-14 at 12:07 -0700, Luis R. Rodriguez wrote:
> On Thu, May 14, 2009 at 11:54 AM, Dan Williams <[email protected]> wrote:
> > On Thu, 2009-05-14 at 10:52 -0700, Luis R. Rodriguez wrote:
> >> I'm told Network Manager scans every 60 seconds. When TX'ing or RX'ing
> >> a lot of data you will see a big dip in throughput and sometimes it
> >> may cause issues with some connections. Jouni pointed out a possible
> >> nice option here: split the scans per channel through time. Now with
> >> nl80211 this is possible but right now Network Manager uses wext
> >> through wpa_supplicant in many distributions and this won't change for
> >> a bit (maybe by the next major distribution releases?). Since we're
> >> stuck with wext for the current distribution releases I'd like to hear
> >> feedback on a possible nice solution. Should we simply cancel scan?
> >
> > Libertas
>
> libertas_tf (the mac80211 driver) ? Or the fullmac one?

The fullmac one. The firmware initially didn't implement nullfunc
support so every scan would make the AP kick you off, much like what
happened before kvalo's patch from March that fixed the mac80211 TX
filter for nullfunc.

Splitting the scan up (the scan isn't really that latency-sensitive
anyway) worked fairly well.
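
To make the idea concrete, here is a rough sketch of how a channel list could be cut into parts. This is not the libertas code; `scan_chunk`, `next_scan_chunk` and `CHANNELS_PER_PART` are names and values made up for illustration, and the return to the operating channel between parts is left to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed limit: channels visited before returning to the operating channel. */
#define CHANNELS_PER_PART 4

struct scan_chunk {
	size_t first;	/* index of the first channel in this part */
	size_t count;	/* number of channels in this part */
};

/*
 * Compute the next part of a split scan.
 * 'done' is the number of channels already scanned, 'total' the length
 * of the channel list. Returns 1 and fills 'out' if there is another
 * part to scan, 0 once every channel has been visited.
 */
static int next_scan_chunk(size_t done, size_t total, struct scan_chunk *out)
{
	assert(out != NULL);

	if (done >= total)
		return 0;

	out->first = done;
	out->count = total - done;
	if (out->count > CHANNELS_PER_PART)
		out->count = CHANNELS_PER_PART;
	return 1;
}
```

With 11 channels and the assumed limit of 4 this yields three parts (4, 4 and 3 channels).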

> > splits scans up into 3 parts with a short return to the
> > operating channel between each part. There's nothing that requires
> > cfg80211 for that to work...
>
> We can do tricks in drivers but I'd like to see this handled in mac80211.

Right, could be handled in mac80211 too.

> > Something I've tossed around for a while is counting traffic on the
> > device and if it's over a certain bitrate for a period of time, postpone
> > the scan for a while. But after a certain amount of time, there's going
> > to be a scan no matter what.
>
> Sure, makes sense. I take it the effort to make this more intelligent
> is part of the roaming intelligence we need to enhance.
>
> > The problem here is that at any time an application (say, wifi location
> > app) could ask for the list of access points. If you don't scan
> > periodically, all APs other than your associated AP (and others on the
> > same channel) will gradually drop off because their beacons aren't
> > received. Hard to wifi position or get area statistics if there's only
> > one AP in the list.
>
> Makes sense.
>
> > Secondarily, scanning is a tradeoff between better roaming latency and
> > continuous high throughput. If you don't scan, you have no idea what's
> > around, and when you move and the current AP becomes marginal, you
> > *have* to take the hit no matter what, so you can scan and find a new AP
> > to associate with.
>
> True however I'm inclined to believe that generally when you are
> sending or receiving a lot of data you are stationary so roaming might
> not be that urgent. For 802.11p (vehicle stuff) I suppose this may be
> a bit different but I have yet to see this take off.

I'll believe 11p when I see it somewhere :) But anyway, yeah, there are
tricks we can play here, and I'm happy to entertain methods of making
the scanning hit less annoying. I'm simply opposed to a checkbox in the
UI for "never scan while connected" because that's a pure cop-out: we
should be making things intelligent, not taking the half-assed approach
like people (nobody here of course) have been whining about for a long
time. Scanning isn't something the user should have to care about or
turn on or off; it's something that NetworkManager (in concert with
usage information from the stack) should be able to handle
automatically.

Maybe if there was a way to figure out how full mac80211's QoS buckets
were, that would help? Or get frame counts from each of the 4 QoS
buckets individually? That would allow NM to make more intelligent
decisions about stuff. Maybe a more detailed nl80211 stats interface
would do the trick here.

> > I would have thought that the periodic scanning would be more of an
> > annoyance when doing VOIP or SSH or other latency-sensitive tasks, but when
> > just downloading a file, a few second drop in transfer rate gets lost in
> > the bucket in the grand scheme of things.
>
> Good point, how about we use pm_qos for disabling scan when we are
> sending a lot of data (whatever we define this to mean)? Then
> applications can just write to this pm_qos stuff and you won't see
> this.
>
> I forget when pm_qos was introduced though.

Yeah, something like that, although it's sometimes hard to go from app
-> interface; if you have a wired interface connected at the same time,
but the thing you're sucking down data from is on the wifi interface,
we'd need some way to figure that out. Hence the idea about exposing
the packet counts of the WMM queues or something like that.

Dan



2009-05-18 13:43:11

by Helmut Schaa

Subject: Re: Scan while TX/RX'ing a lot of data

Am Samstag, 16. Mai 2009 schrieb Johannes Berg:
> Anyway, bottom line is that I don't think we should change the APIs in
> any way, we should instead make the scan smarter by spreading it out if
> we're active.

That's what I've tried with the patch at [1]. I used a fixed time schedule
for switching back to the operating channel after each scanned channel.
However, that could of course be made somewhat dynamic, based on the current
traffic characteristics.

The basic problem I had was that I couldn't check if the nullfunc frame
indicating the new "powersave" state to the AP was already sent out (or
ACKed by the AP). This resulted in lost frames sometimes: if the device's
tx queue contained a lot of data frames the nullfunc frame was sent out
_after_ the channel switch occurred.

If anybody has a good idea on how to fix this issue I'm glad to provide an
updated version of the bg-scan patch.

One solution would be to force all drivers to report the tx status. Another
one would be to just wait until the device's queue is empty. At that point
we know that all pending data frames were sent out and additionally the
nullfunc frame was also sent out, so we can safely switch to the next
to-be-scanned channel.

Comments?

Helmut

[1] http://marc.info/?l=linux-wireless&m=122226702331135&w=2

2009-05-14 19:07:54

by Luis R. Rodriguez

Subject: Re: Scan while TX/RX'ing a lot of data

On Thu, May 14, 2009 at 11:54 AM, Dan Williams <[email protected]> wrote:
> On Thu, 2009-05-14 at 10:52 -0700, Luis R. Rodriguez wrote:
>> I'm told Network Manager scans every 60 seconds. When TX'ing or RX'ing
>> a lot of data you will see a big dip in throughput and sometimes it
>> may cause issues with some connections. Jouni pointed out a possible
>> nice option here: split the scans per channel through time. Now with
>> nl80211 this is possible but right now Network Manager uses wext
>> through wpa_supplicant in many distributions and this won't change for
>> a bit (maybe by the next major distribution releases?). Since we're
>> stuck with wext for the current distribution releases I'd like to hear
>> feedback on a possible nice solution. Should we simply cancel scan?
>
> Libertas

libertas_tf (the mac80211 driver) ? Or the fullmac one?

> splits scans up into 3 parts with a short return to the
> operating channel between each part.  There's nothing that requires
> cfg80211 for that to work...

We can do tricks in drivers but I'd like to see this handled in mac80211.

> Something I've tossed around for a while is counting traffic on the
> device and if it's over a certain bitrate for a period of time, postpone
> the scan for a while.  But after a certain amount of time, there's going
> to be a scan no matter what.

Sure, makes sense. I take it the effort to make this more intelligent
is part of the roaming intelligence we need to enhance.
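
Something along these lines, as a rough sketch only. The thresholds and the name `scan_should_run` are assumptions for illustration, and "a lot of data" is simply taken to mean a recent average bitrate above a fixed limit.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed tunables, for illustration only. */
#define BUSY_BITRATE_BPS (10u * 1000u * 1000u)	/* what counts as "a lot of data" */
#define MAX_POSTPONE_MS  (5u * 60u * 1000u)	/* after this, scan no matter what */

/*
 * Decide whether a periodic scan should run now.
 * bitrate_bps:   recent average TX+RX rate on the device
 * since_last_ms: time since the last completed scan
 * interval_ms:   normal scan period (e.g. 60 seconds)
 * Returns 1 to scan now, 0 to wait or postpone.
 */
static int scan_should_run(uint32_t bitrate_bps, uint32_t since_last_ms,
			   uint32_t interval_ms)
{
	if (since_last_ms < interval_ms)
		return 0;			/* not due yet */
	if (since_last_ms >= MAX_POSTPONE_MS)
		return 1;			/* postponed long enough */
	return bitrate_bps < BUSY_BITRATE_BPS;	/* postpone while busy */
}
```

The hard deadline is what keeps the AP list from going stale when the link stays busy for a long time.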

> The problem here is that at any time an application (say, wifi location
> app) could ask for the list of access points.  If you don't scan
> periodically, all APs other than your associated AP (and others on the
> same channel) will gradually drop off because their beacons are
> received.  Hard to wifi position or get area statistics if there's only
> one AP in the list.

Makes sense.

> Secondarily, scanning is a tradeoff between better roaming latency and
> continuous high throughput.  If you don't scan, you have no idea what's
> around, and when you move and the current AP becomes marginal, you
> *have* to take the hit no matter what, so you can scan and find a new AP
> to associate with.

True however I'm inclined to believe that generally when you are
sending or receiving a lot of data you are stationary so roaming might
not be that urgent. For 802.11p (vehicle stuff) I suppose this may be
a bit different but I have yet to see this take off.

> I would have thought that the periodic scanning would be more of an
> annoyance when doing VOIP or SSH or other latency-sensitive tasks, but when
> just downloading a file, a few second drop in transfer rate gets lost in
> the bucket in the grand scheme of things.

Good point, how about we use pm_qos for disabling scan when we are
sending a lot of data (whatever we define this to mean)? Then
applications can just write to this pm_qos stuff and you won't see
this.

I forget when pm_qos was introduced though.

Luis

2009-05-14 22:17:47

by Luis R. Rodriguez

Subject: Re: Scan while TX/RX'ing a lot of data

On Thu, May 14, 2009 at 2:26 PM, Luis R. Rodriguez <[email protected]> wrote:
> On Thu, May 14, 2009 at 1:57 PM, Dan Williams <[email protected]> wrote:

>> Maybe if there was a way to figure out how full mac80211's QoS buckets
>> were, that would help?  Or get frame counts from each of the 4 QoS
>> buckets individually?  That would allow NM to make more intelligent
>> decisions about stuff.  Maybe a more detailed nl80211 stats interface
>> would do the trick here.
>
> Would help for debugging too anyway, since QOS uses the new device
> queues maybe that might already be available?

I should note the netdevice queues we use are _in_ mac80211 and do not
actually represent each _device_ hardware queue. We use them for
queuing frames for the master interface to then send to the driver. Still
-- it seems this may be helpful. This would help with TX, we would
need something else for RX.

Luis

2009-05-18 15:35:35

by Dan Williams

Subject: Re: Scan while TX/RX'ing a lot of data

On Sat, 2009-05-16 at 14:48 +0200, Johannes Berg wrote:
> On Fri, 2009-05-15 at 19:15 -0400, Dan Williams wrote:
>
> > Well, what does "background scan" mean? mac80211 will obviously accept
> > beacons from APs on the same channel and happily add them to the scan
> > list.
>
> Only if powersave/beacon filtering isn't enabled. Actually, mac80211
> does beacon filtering itself now, so I don't think this happens any more
> (I did that on purpose so people don't really rely on the former
> behaviour)
>
> > But a background scan would imply that the card itself is taking
> > over the decision about when to jump around and scan, basically the same
> > thing NM is doing, right? When would the card/stack decide to
> > background scan, and what channels would it background scan on? If it
> > doesn't do more or less all passive channels, then it wouldn't be that
> > useful for figuring out the site survey data.
>
> I would think that mac80211, while asked to scan when associated, could
> simply do something like calculating how long it can go off-channel
> under the current QoS requirements, and then go off-channel for that,
> wait for the next on-channel beacon and some traffic, scan the next
> channel(s) again etc.

Yeah, that sounds like it would work.
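
As a back-of-the-envelope sketch of the "how long can we go off-channel" part: assume the QoS requirement boils down to a single time budget per excursion, that each scanned channel costs one channel switch plus a dwell time, and that coming back costs one more switch. The function name and the cost model are assumptions, not anything mac80211 does today.

```c
#include <assert.h>

/*
 * How many channels fit into one off-channel excursion?
 * budget_ms: longest time we may stay away from the operating channel
 * dwell_ms:  time spent listening on each scanned channel
 * switch_ms: cost of one channel switch
 * An excursion over n channels is modelled as
 *   n * (switch_ms + dwell_ms) + switch_ms (the switch back home).
 */
static unsigned int channels_per_excursion(unsigned int budget_ms,
					   unsigned int dwell_ms,
					   unsigned int switch_ms)
{
	assert(dwell_ms + switch_ms > 0);

	if (budget_ms <= switch_ms)
		return 0;	/* not even the switch back fits */
	return (budget_ms - switch_ms) / (dwell_ms + switch_ms);
}
```

A result of 0 would mean the scan has to wait until the budget grows, e.g. once the latency-sensitive traffic stops.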

Dan


2009-05-18 12:45:16

by John W. Linville

Subject: Re: Scan while TX/RX'ing a lot of data

On Sat, May 16, 2009 at 02:57:37PM +0200, Johannes Berg wrote:

> Anyway, bottom line is that I don't think we should change the APIs in
> any way, we should instead make the scan smarter by spreading it out if
> we're active. This even applies to scanning while in an IBSS -- stop
> beaconing, then go scan etc.

ACK

--
John W. Linville Someday the world will need a hero, and you
[email protected] might be all we have. Be ready.

2009-05-14 21:27:00

by Luis R. Rodriguez

Subject: Re: Scan while TX/RX'ing a lot of data

On Thu, May 14, 2009 at 1:57 PM, Dan Williams <[email protected]> wrote:
> On Thu, 2009-05-14 at 12:07 -0700, Luis R. Rodriguez wrote:
>> On Thu, May 14, 2009 at 11:54 AM, Dan Williams <[email protected]> wrote:

>> > Secondarily, scanning is a tradeoff between better roaming latency and
>> > continuous high throughput.  If you don't scan, you have no idea what's
>> > around, and when you move and the current AP becomes marginal, you
>> > *have* to take the hit no matter what, so you can scan and find a new AP
>> > to associate with.
>>
>> True however I'm inclined to believe that generally when you are
>> sending or receiving a lot of data you are stationary so roaming might
>> not be that urgent. For 802.11p (vehicle stuff) I suppose this may be
>> a bit different but I have yet to see this take off.
>
> I'll believe 11p when I see it somewhere :)  But anyway, yeah, there are
> tricks we can play here, and I'm happy to entertain methods of making
> the scanning hit less annoying.  I'm simply opposed to a checkbox in the
> UI for "never scan while connected" because that's a pure cop-out: we
> should be making things intelligent, not taking the half-assed approach
> like people (nobody here of course) have been whining about for a long
> time.  Scanning isn't something the user should have to care about or
> turn on or off; it's something that NetworkManager (in concert with
> usage information from the stack) should be able to handle
> automatically.

Agreed.

> Maybe if there was a way to figure out how full mac80211's QoS buckets
> were, that would help?  Or get frame counts from each of the 4 QoS
> buckets individually?  That would allow NM to make more intelligent
> decisions about stuff.  Maybe a more detailed nl80211 stats interface
> would do the trick here.

Would help for debugging too anyway, since QOS uses the new device
queues maybe that might already be available?

>> > I would have thought that the periodic scanning would be more of an
>> > annoyance when doing VOIP or SSH or other latency-sensitive tasks, but when
>> > just downloading a file, a few second drop in transfer rate gets lost in
>> > the bucket in the grand scheme of things.
>>
>> Good point, how about we use pm_qos for disabling scan when we are
>> sending a lot of data (whatever we define this to mean)? Then
>> applications can just write to this pm_qos stuff and you won't see
>> this.
>>
>> I forget when pm_qos was introduced though.
>
> Yeah, something like that, although it's sometimes hard to go from app
> -> interface;

Yeah at least pm_qos is not device specific IIRC.

> if you have a wired interface connected at the same time,
> but the thing you're sucking down data from is on the wifi interface,
> we'd need some way to figure that out.

Point taken.

> Hence the idea about exposing
> the packet counts of the WMM queues or something like that.

I think this is a good idea, although it wouldn't solve anything for
people stuck on old userspace (my oldest target is distribution
releases circa 2.6.27). Not sure what is best for that case. The
kernel could be informed of the lack of userspace regulating this and
then take some reasonable action. What the reasonable action and the
terms for it would be remains unclear to me.

Luis

2009-05-14 20:49:45

by Dan Williams

Subject: Re: Scan while TX/RX'ing a lot of data

On Thu, 2009-05-14 at 21:06 +0200, Johannes Berg wrote:
> On Thu, 2009-05-14 at 14:54 -0400, Dan Williams wrote:
>
> > Libertas splits scans up into 3 parts with a short return to the
> > operating channel between each part. There's nothing that requires
> > cfg80211 for that to work...
>
> Yeah that was my idea too, just return to the operating channel after
> having scanned a channel or two and wait for the next beacon, and
> possibly receive traffic if indicated in that beacon. It takes a bit of
> synchronisation and isn't easy to implement, but it's definitely
> possible.
>
> > The problem here is that at any time an application (say, wifi location
> > app) could ask for the list of access points. If you don't scan
> > periodically, all APs other than your associated AP (and others on the
> > same channel) will gradually drop off because their beacons aren't
> > received. Hard to wifi position or get area statistics if there's only
> > one AP in the list.
>
> The other thing we should do is bump the AP list timeout, I think -- 10
> seconds is very small. But then again we really need such apps to query
> NM anyway.
>
> > Secondarily, scanning is a tradeoff between better roaming latency and
> > continuous high throughput. If you don't scan, you have no idea what's
> > around, and when you move and the current AP becomes marginal, you
> > *have* to take the hit no matter what, so you can scan and find a new AP
> > to associate with.
>
> Yeah, that too.
>
> > I would have thought that the periodic scanning would be more of an
> > annoyance when doing VOIP or SSH or other latency-sensitive tasks, but when
> > just downloading a file, a few second drop in transfer rate gets lost in
> > the bucket in the grand scheme of things.
>
> ssh isn't too bad, at least not after I fixed the timings... before I
> got very annoyed with ar9170. OTOH with VoIP it would suck to have a
> small hiccup every 2 minutes -- maybe then we should postpone
> indefinitely and only do it if the signal strength fluctuates?

If there was a reliable mechanism to figure out whether there was a
certain QoS level of traffic flowing through the card, this would be
easier to do automatically. AFAIK all the APIs these days are
socket-based, and that doesn't help us get from app -> interface without
a lot of intermediate steps. How does mac80211 figure out what to put
into each of the 4 buckets for wifi QoS / WMM?

Dan


2009-05-20 13:43:06

by Helmut Schaa

Subject: Re: Scan while TX/RX'ing a lot of data

Am Mittwoch, 20. Mai 2009 schrieb Johannes Berg:
> On Wed, 2009-05-20 at 13:41 +0200, Helmut Schaa wrote:
> > Am Dienstag, 19. Mai 2009 schrieb Johannes Berg:
> > > On Mon, 2009-05-18 at 15:43 +0200, Helmut Schaa wrote:
> >
> > > > One solution would be to force all drivers to report the tx status. Another
> > > > one would be to just wait until the device's queue is empty. At that point
> > > > we know that all pending data frames were sent out and additionally the
> > > > nullfunc frame was also sent out, so we can safely switch to the next
> > > > to-be-scanned channel.
> > >
> > > Some drivers now flush queues at channel switch time, but there's no way
> > > to guarantee that the software queues are actually empty and the
> > > nullfunc frame isn't queued up behind other traffic. Also, waiting for
> > > any driver to report a status will lead to problems because the driver
> > > might just have dropped the frame for various reasons like being out of
> > > memory.
> >
> > Hmm, the problem is basically (from a mac80211 perspective) that the
> > mdev tx queue could still contain data frames when the nullfunc frame
> > gets queued, right? And we do not assure that these frames are sent
> > out before the channel switch.
> >
> > Hence, would it be possible to:
> > 1) Stop all sub_if tx queues (afterwards no new data frames should
> > appear in the mdev tx queue)
> > 2) Queue the nullfunc frame
> > 3) Flush the mdev's tx queue
> > 4) Switch the channel
> > ?
>
> I don't think that is sufficient,

Maybe not sufficient, but it's at least an improvement.

> unless the driver also flushes the
> hardware queue at channel switch time.

Yes, but that would be the driver's responsibility and could be
fixed in a second step.

> Which we may want to make explicit with a new callback or so?

And make it mandatory?

Helmut

2009-05-16 06:12:28

by Luis R. Rodriguez

Subject: Re: Scan while TX/RX'ing a lot of data

On Thu, May 14, 2009 at 2:26 PM, Luis R. Rodriguez <[email protected]> wrote:
> On Thu, May 14, 2009 at 1:57 PM, Dan Williams <[email protected]> wrote:
>> On Thu, 2009-05-14 at 12:07 -0700, Luis R. Rodriguez wrote:
>>> On Thu, May 14, 2009 at 11:54 AM, Dan Williams <[email protected]> wrote:
>
>>> > Secondarily, scanning is a tradeoff between better roaming latency and
>>> > continuous high throughput.  If you don't scan, you have no idea what's
>>> > around, and when you move and the current AP becomes marginal, you
>>> > *have* to take the hit no matter what, so you can scan and find a new AP
>>> > to associate with.
>>>
>>> True however I'm inclined to believe that generally when you are
>>> sending or receiving a lot of data you are stationary so roaming might
>>> not be that urgent. For 802.11p (vehicle stuff) I suppose this may be
>>> a bit different but I have yet to see this take off.
>>
>> I'll believe 11p when I see it somewhere :)  But anyway, yeah, there are
>> tricks we can play here, and I'm happy to entertain methods of making
>> the scanning hit less annoying.  I'm simply opposed to a checkbox in the
>> UI for "never scan while connected" because that's a pure cop-out: we
>> should be making things intelligent, not taking the half-assed approach
>> like people (nobody here of course) have been whining about for a long
>> time.  Scanning isn't something the user should have to care about or
>> turn on or off; it's something that NetworkManager (in concert with
>> usage information from the stack) should be able to handle
>> automatically.
>
> Agreed.
>
>> Maybe if there was a way to figure out how full mac80211's QoS buckets
>> were, that would help?  Or get frame counts from each of the 4 QoS
>> buckets individually?  That would allow NM to make more intelligent
>> decisions about stuff.  Maybe a more detailed nl80211 stats interface
>> would do the trick here.
>
> Would help for debugging too anyway, since QOS uses the new device
> queues maybe that might already be available?
>
>>> > I would have thought that the periodic scanning would be more of an
>>> > annoyance when doing VOIP or SSH or other latency-sensitive tasks, but when
>>> > just downloading a file, a few second drop in transfer rate gets lost in
>>> > the bucket in the grand scheme of things.
>>>
>>> Good point, how about we use pm_qos for disabling scan when we are
>>> sending a lot of data (whatever we define this to mean)? Then
>>> applications can just write to this pm_qos stuff and you won't see
>>> this.
>>>
>>> I forget when pm_qos was introduced though.
>>
>> Yeah, something like that, although it's sometimes hard to go from app
>> -> interface;
>
> Yeah at least pm_qos is not device specific IIRC.
>
>> if you have a wired interface connected at the same time,
>> but the thing you're sucking down data from is on the wifi interface,
>> we'd need some way to figure that out.
>
> Point taken.
>
>> Hence the idea about exposing
>> the packet counts of the WMM queues or something like that.
>
> I think this is a good idea, although it wouldn't solve anything for
> people stuck on old userspace (my oldest target is distribution
> releases circa 2.6.27). Not sure what is best for that case. The
> kernel could be informed of the lack of userspace regulating this and
> then take some reasonable action. What the reasonable action and the
> terms for it would be remains unclear to me.

OK so what if we just let the old wext scan be a dump of the
current bss list, provided we do selective scanning once per channel
over a period of time?
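
Roughly what I have in mind for the "dump" part, as a sketch only: the `bss_entry` layout, the `bss_dump` name and the age limit are all made up here and are not the cfg80211 structures. A scan request just copies out whatever was seen recently enough, while the per-channel scanning keeps the timestamps fresh in the background.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache entry, not the real cfg80211 BSS structure. */
struct bss_entry {
	uint8_t  bssid[6];
	int      channel;
	uint32_t last_seen_ms;	/* when a beacon/probe response was last heard */
};

/*
 * Copy every entry heard within max_age_ms of now_ms into 'out'.
 * Returns the number of entries copied (at most out_len).
 * Unsigned subtraction keeps the age check correct across a
 * wrap of the millisecond counter.
 */
static size_t bss_dump(const struct bss_entry *cache, size_t n,
		       uint32_t now_ms, uint32_t max_age_ms,
		       struct bss_entry *out, size_t out_len)
{
	size_t copied = 0;

	for (size_t i = 0; i < n && copied < out_len; i++) {
		if (now_ms - cache[i].last_seen_ms <= max_age_ms)
			out[copied++] = cache[i];
	}
	return copied;
}
```

The age limit has to be at least as long as one full pass over all channels, otherwise entries expire before their channel is visited again.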

For new userspace we can obviously do whatever we like, but since we'd
implement the above we could just let this automatic scanning be
tweaked for roaming purposes with exposed knobs. That is -- make the
code so that it can be tweaked to act like a regular good old scan or
take breaks throughout.

If you're already associated it makes little sense to be scanning all
over. If you're not associated it makes sense to do the good old scan
we're used to. Of course this would just be for mac80211 and cfg80211
drivers, old wext will obviously still behave the same for old
hardware.

Meh?

Luis

2009-05-20 11:41:41

by Helmut Schaa

Subject: Re: Scan while TX/RX'ing a lot of data

Am Dienstag, 19. Mai 2009 schrieb Johannes Berg:
> On Mon, 2009-05-18 at 15:43 +0200, Helmut Schaa wrote:

> > One solution would be to force all drivers to report the tx status. Another
> > one would be to just wait until the device's queue is empty. At that point
> > we know that all pending data frames were sent out and additionally the
> > nullfunc frame was also sent out, so we can safely switch to the next
> > to-be-scanned channel.
>
> Some drivers now flush queues at channel switch time, but there's no way
> to guarantee that the software queues are actually empty and the
> nullfunc frame isn't queued up behind other traffic. Also, waiting for
> any driver to report a status will lead to problems because the driver
> might just have dropped the frame for various reasons like being out of
> memory.

Hmm, the problem is basically (from a mac80211 perspective) that the
mdev tx queue could still contain data frames when the nullfunc frame
gets queued, right? And we do not assure that these frames are sent
out before the channel switch.

Hence, would it be possible to:
1) Stop all sub_if tx queues (afterwards no new data frames should
appear in the mdev tx queue)
2) Queue the nullfunc frame
3) Flush the mdev's tx queue
4) Switch the channel
?
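
To illustrate the ordering I mean, here is a small self-contained mock. It is not mac80211 code: `mock_dev` and all functions are invented for this sketch, and it only models a single software queue.

```c
#include <assert.h>
#include <stdbool.h>

enum frame_type { FRAME_DATA, FRAME_NULLFUNC };

#define TXQ_LEN 16

/* Hypothetical stand-in for the mdev tx queue plus channel state. */
struct mock_dev {
	enum frame_type txq[TXQ_LEN];
	int  queued;		/* frames waiting in the software queue */
	bool queues_stopped;	/* true once the sub_if queues are stopped */
	bool nullfunc_sent;	/* set when the nullfunc frame left the queue */
	int  channel;
};

static bool mock_enqueue(struct mock_dev *d, enum frame_type t)
{
	if (d->queued >= TXQ_LEN)
		return false;
	d->txq[d->queued++] = t;
	return true;
}

/* Data from the interfaces is refused while the queues are stopped. */
static bool mock_tx_data(struct mock_dev *d)
{
	if (d->queues_stopped)
		return false;
	return mock_enqueue(d, FRAME_DATA);
}

/* "Send" everything that is queued, in order. */
static void mock_flush(struct mock_dev *d)
{
	for (int i = 0; i < d->queued; i++)
		if (d->txq[i] == FRAME_NULLFUNC)
			d->nullfunc_sent = true;
	d->queued = 0;
}

static bool scan_switch_channel(struct mock_dev *d, int new_channel)
{
	d->queues_stopped = true;			/* 1) stop the queues */
	if (!mock_enqueue(d, FRAME_NULLFUNC)) {		/* 2) queue nullfunc */
		d->queues_stopped = false;
		return false;
	}
	mock_flush(d);					/* 3) flush the tx queue */
	d->channel = new_channel;			/* 4) switch the channel */
	return true;
}
```

Because the queues are stopped first, the nullfunc frame is guaranteed to be the last frame in the software queue when it is flushed. What the hardware queue does at channel switch time is outside this sketch.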

Helmut

2009-05-15 23:15:04

by Dan Williams

Subject: Re: Scan while TX/RX'ing a lot of data

On Fri, 2009-05-15 at 10:11 +0200, Holger Schurig wrote:
> On Thursday 14 May 2009 20:54:58 Dan Williams wrote:
> > Libertas splits scans up into 3 parts with a short return to
> > the operating channel between each part. There's nothing that
> > requires cfg80211 for that to work...
>
> Hey, it's up-to 4 channels (MRVDRV_MAX_CHANNELS_PER_SCAN).
>
> Yeah, I had to implement this "stuttering scan" because the
> firmware that I had access to cannot send
> power-save-null-packets to the AP. Without that, I could not
> notify to the current AP that I'm away, and if the AP has data
> for me, which I don't ack for more than 1000ms, then the AP
> might have disassociated me in the mean-time.
>
>
> So basically I changed the scanning into a state-machine. I get a
> list of channels to scan (e.g. "all channels", if the request
> comes from WEXT). Then a scan-worker calls the logic of the
> state-machine. The state-machine does its work and either
> re-schedules the workqueue or, if every channel has been
> visited, emits the SIOCGIWSCAN event.
>
> It's a bit more complicated, because one can force a full scan
> (e.g. when initially associating).
>
>
> > Something I've tossed around for a while is counting traffic
> > on the device and if it's over a certain bitrate for a period
> > of time, postpone the scan for a while. But after a certain
> > amount of time, there's going to be a scan no matter what.
>
> Traffic could for example modify the time for delays between
> re-schedules of the scan-state-machine.
>
>
> > Secondarily, scanning is a tradeoff between better roaming
> > latency and continuous high throughput.
>
> Kind of a QoS thingy?
>
>
> > If you don't scan,
> > you have no idea what's around, and when you move and the
> > current AP becomes marginal, you *have* to take the hit no
> > matter what, so you can scan and find a new AP to associate
> > with.
>
> TeX does its layout by minimizing (calculated) ugliness.
> Maybe a background-scan decision can be made based on (calculated)
> urgency? E.g. if background scan is enabled at all, the
> urgency of background scan increases in time. "Huge" amount

Well, what does "background scan" mean? mac80211 will obviously accept
beacons from APs on the same channel and happily add them to the scan
list. But a background scan would imply that the card itself is taking
over the decision about when to jump around and scan, basically the same
thing NM is doing, right? When would the card/stack decide to
background scan, and what channels would it background scan on? If it
doesn't do more or less all passive channels, then it wouldn't be that
useful for figuring out the site survey data.

Dan