2012-06-25 16:05:16

by Johannes Berg

Subject: mac80211 auth/assoc in multi-channel scenarios

Hi all,

Some of us have discussed this before, but I wanted to get some wider
dissemination ...

With multi-channel, one problem that we get is managing the channel
timing for authentication/association. We'll have bound the channel
context (once we have that) before we even send a frame of course, but
setting up the timing as to when to use the channel is very hard, in
particular for connections to P2P GOs that might potentially be asleep.

I see a few obvious choices, open for other suggestions.

1) bring back something like the TX sync API, where the driver sleeps
until the frame can be transmitted; this is done before every
transmit
1a) do the same, but asynchronously (somewhat ugly)

2) sleep in sta_state in the driver; this is problematic because while
we know that the auth frame will be sent right after the AP station
is created, we don't see the retransmits properly

3) do it in the driver in tx(); this is problematic because it may mean
delaying the frames quite a bit, and mac80211 might then try to
retransmit too early etc

4) introduce a sleeping mgmt-tx driver call to send such frames; this is
ugly in mac80211 code flows with the TX functions

5) do remain-on-channel from mac80211, but this doesn't address P2P GO
synchronisation requirements

I haven't been able to come up with any other ways, and here we really
don't have much of a choice -- #1 is the only thing that seems sane to
me. Thoughts?

johannes



2012-06-27 15:25:38

by Johannes Berg

Subject: Re: mac80211 auth/assoc in multi-channel scenarios

On Wed, 2012-06-27 at 18:20 +0300, Arik Nemtsov wrote:

> > It'll depend on the driver I guess? If you're going to set a flag "give
> > this vif priority now" on the first invocation, then you probably
> > wouldn't care about the timing. It looks like our implementation would
> > actually start some "give me the channel" operation and when the
> > firmware says "ok you have the channel now" we'd return and rely on
> > getting the TX right after.
> >
> > Wrt. the race, this isn't going to be something that happens within less
> > than a millisecond or so, it's going to give us maybe 50ms at least of
> > channel time (WFA mandates replying within 30ms), so that doesn't seem
> > like a problem?
>
> It's probably not an issue. Just seems simpler to just give the skb to
> the driver so it can do whatever it wants before/during/after. Bypass
> op_tx completely for these.

Yeah, that was one of the other options I considered in my original
email ... but this call must be able to sleep, and the TX processing in
mac80211 cannot sleep, so unless we bypass all processing (which seems
wrong) that would be rather difficult to implement.

johannes


2012-06-27 11:17:53

by Johannes Berg

Subject: Re: mac80211 auth/assoc in multi-channel scenarios

On Tue, 2012-06-26 at 14:35 +0300, Arik Nemtsov wrote:

> Can you give a bit more info on the Tx-sync approach, for the uninitiated?
> I'm also thinking that maybe we could somehow treat the sleeping-GO as
> a special case (maybe with some special code and a HW flag). Right now
> I'm not sure the wl12xx FW even supports it.

I basically had a simpler version in mind, something like this:
http://p.sipsolutions.net/cd928e926a941ac7.txt

johannes


2012-06-27 07:00:17

by Johannes Berg

Subject: Re: mac80211 auth/assoc in multi-channel scenarios

On Tue, 2012-06-26 at 21:18 +0300, Arik Nemtsov wrote:

> > And now this is just why it won't work even if we start from a clean
> > slate -- I haven't even talked about the backward compatibility aspect,
> > running an old supplicant on a driver that expects a ROC. Remember that
> > the driver need not give the virtual interface *any* channel time on the
> > right channel before needed, so if there's something going on on another
> > channel with multi-channel, that vif would never be able to authenticate
> > with an old supplicant.
>
> Well that's a given, if we're introducing new features into hostap to
> support multi channel.

No, not really -- I'm going to want to rely on this mechanism even for
the single-channel case in our driver, everything else would be more
like a collection of hacks :-)

> > I could also mention how this is a stupid userspace API, you're now
> > requiring to call one thing before another call is valid, but then the
> > other call is only very briefly valid? If we really wanted ROC, we
> > should embed the time for it into the auth/assoc request, I'd say.
>
> I think all these examples are because of our different definitions
> for ROC. If ROC is a recommendation, then we just start the ROC before
> starting the connection, and end it after the last EAPOL.
> If channel management is implemented in SW, what you're saying is a
> must. But the FW can abstract these details. Maybe we should do this
> similar to SW/HW scan in mac80211?

Well, I understand your point, but I don't think it's that simple. If,
as you say, we define this to be a recommendation, then we will require
device support for this. You may have that, but with what we're
currently seeing from our firmware I don't think it's possible to
implement it as a recommendation. Even if I could get it somehow
implemented in our device and/or the driver, we'd still be making more
assumptions than I'd want to make for this.

> > Thinking about that though -- what about connect() calls? Whole new can of
> > worms ...
>
> Not sure what you mean here.

nl80211 connect call, where the driver is managing both auth/assoc,
sometimes even BSS selection.


> > Yes, but it would need this patch and a change to hostapd to add the
> > station when it sends the first auth frame:
> > http://johannes.sipsolutions.net/patches/kernel/all/LATEST/005-nl80211-full-sta-state.patch
>
> Ah this is useful on another level as well. Today we have a hack in
> the driver to identify the auth-frame reply and add the STA to FW
> temporarily :)

Yes, I know. It's also useful to address a few other races, it just
needs a hostapd implementation. Want to work on it? :-)


johannes


2012-06-26 12:56:04

by Johannes Berg

Subject: Re: mac80211 auth/assoc in multi-channel scenarios

On Tue, 2012-06-26 at 14:18 +0300, Eliad Peller wrote:

> > Yes, we should keep that in mind, but I have a feeling that we should
> > treat it separately. It's quite different as mac80211 is doing a lot
> > less -- for example, mac80211 is managing retries, comeback timeouts,
> > etc. in the managed case.
> >
> i guess it makes sense. i just don't like it being managed in multiple
> places (userspace for ap, kernel for sta).

I don't really understand why? We already manage managed/AP *very*
differently, this just extends that. We'll never unify them, and trying
to unify small bits of it seems bad to me.

> > I'm pretty much happy with that for the AP case, but given how much
> > mac80211 really manages for auth/assoc I don't think it makes sense in
> > the managed case.
> >
> i think that's true for open networks, but for encrypted ones the
> EAPOLs are not handled by mac80211, which pretty much complicates the
> flow.

Yeah, but I think the only thing you could possibly do in the EAPOL case
is reserve some time right after association, before you go into PS.
Either it completes quickly (including DHCP maybe even!) or you have
quite a bit of latency anyway and will need to go into PS again before
it all completes.


> >> also, when waiting for EAPOLs after association you might have to
> >> reserve the channel for a pretty long time anyway (since you still
> >> can't enter ps, and some APs don't send the first EAPOL immediately).
> >
> > Yeah, but you might only want to wait a little bit and then stop doing
> > ROC and actually enter PS mode if the AP is really slow, etc. That can
> > pretty much be managed from the STA state though since once you're
> > associated you're past the bit where mac80211 is doing a lot?
> >
> you can't enter ps while you are not authorized (it's probably not
> forbidden by the spec, but i'm pretty sure some APs won't like it).

You'll have to enter PS after you associate but before you're authorized
if the AP is slow enough. If the AP is fast, then you don't want to
because it'll increase the connection latency significantly.
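
The trade-off described here (stay awake briefly after association, then
enter PS anyway if the AP is slow) can be sketched as a small decision
function. This is a toy Python model, not mac80211 code: the function
name and the grace period parameter are assumptions, and the thread does
not say what value the grace period should have.

```python
def should_enter_ps(ms_since_assoc: int, authorized: bool, grace_ms: int) -> bool:
    """Decide whether a vif may enter powersave right now.

    grace_ms is a hypothetical tunable: how long to stay awake after
    association while waiting for the EAPOL exchange to finish.
    """
    if authorized:
        # Handshake finished, so nothing is gained by staying awake for it.
        return True
    # Not authorized yet: stay awake for a fast AP, give up on a slow one.
    return ms_since_assoc >= grace_ms
```

With a fast AP the station is authorized before the grace period ends and
no latency is added; with a slow AP the station goes into PS and the rest
of the handshake runs with PS latency.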

> managing it from the STA state doesn't make much sense to me (why add
> another management point?), and if you're going to handle it from
> userspace, well... just do it from the beginning :)

That's what I'm saying though -- I don't think you can manage this from
userspace, not authentication/association anyway. And if you really want
to delay going into PS (keep in mind multi-vif/multi-channel!) until
after authorization, you already can since you're told when
authorization finishes.

johannes


2012-06-26 11:35:28

by Arik Nemtsov

Subject: Re: mac80211 auth/assoc in multi-channel scenarios

On Tue, Jun 26, 2012 at 2:18 PM, Eliad Peller <[email protected]> wrote:
> On Tue, Jun 26, 2012 at 1:41 PM, Johannes Berg
> <[email protected]> wrote:
>> On Tue, 2012-06-26 at 13:09 +0300, Eliad Peller wrote:
>>
>>> > With multi-channel, one problem that we get is managing the channel
>>> > timing for authentication/association. We'll have bound the channel
>>> > context (once we have that) before we even send a frame of course, but
>>> > setting up the timing as to when to use the channel is very hard, in
>>> > particular for connections to P2P GOs that might potentially be asleep.
>>> >
>>> the other scenario we should have in mind here is the AP side - making
>>> sure we'll be able to auth/assoc a new station.
>>
>> Yes, we should keep that in mind, but I have a feeling that we should
>> treat it separately. It's quite different as mac80211 is doing a lot
>> less -- for example, mac80211 is managing retries, comeback timeouts,
>> etc. in the managed case.
>>
> i guess it makes sense. i just don't like it being managed in multiple
> places (userspace for ap, kernel for sta).

I also think managing this in multiple places can get messy. I
slightly prefer Eliad's approach with userspace giving the ROCs, since
it sees the bigger picture.

But all userspace changes are reflected in the STA flags in kernel, so
we can do STA + AP from kernel mode as well.

Can you give a bit more info on the Tx-sync approach, for the uninitiated?
I'm also thinking that maybe we could somehow treat the sleeping-GO as
a special case (maybe with some special code and a HW flag). Right now
I'm not sure the wl12xx FW even supports it.

Arik

2012-06-28 09:33:18

by Johannes Berg

Subject: Re: mac80211 auth/assoc in multi-channel scenarios

On Wed, 2012-06-27 at 20:05 +0300, Arik Nemtsov wrote:

> > Yeah, that was one of the other options I considered in my original
> > email ... but this call must be able to sleep, and the TX processing in
> > mac80211 cannot sleep, so unless we bypass all processing (which seems
> > wrong) that would be rather difficult to implement.
>
> Good point. But another race to consider is the
> multi-channel/multi-vif scenario, where the driver already has a lot
> of packets queued up, so the time will expire before we get a chance
> to Tx. Again this is not a problem in practice (since it's the VO ac).

If the vif is on a different channel, the driver really should assign it
different queues and then there shouldn't be much queued up, right?
Also, if you really have >100ms latency on your queues, you have a major
problem anyway I'd think?

> How about somehow requiring a multi channel driver to give always
> Tx-ack? That will mean we can abandon the retry timers, and rely on
> the driver to give an answer within a reasonable time.

That doesn't mean we can abandon the retry timers though? Then again,
maybe it does, we could start the timer only when we get the status
information I guess?

I'm not sure I'd want to *require* this, but it sounds like a good thing
we could do to address this possible race for drivers that do support
reliable status reports?
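
The difference between starting the retry timer at submission and
starting it when the TX status arrives can be shown with a small Python
model. It is only a sketch: the function and parameter names are
assumptions, and "queue delay" stands for however long the frame sits in
the driver before it actually goes on the air.

```python
def response_window_ms(retry_delay_ms: int, queue_delay_ms: int,
                       start_on_status: bool) -> int:
    """Time the peer has to answer before the retry timer fires.

    start_on_status=False: the timer starts when the frame is handed to
    the driver, so time spent queued eats into the peer's window.
    start_on_status=True: the timer starts when the driver reports TX
    status, so the peer always gets the full retry delay.
    """
    if start_on_status:
        return retry_delay_ms
    return max(0, retry_delay_ms - queue_delay_ms)
```

For example, with a 100ms retry delay and 80ms of queueing, the peer has
only 20ms to answer if the timer starts at submission, but the full 100ms
if it starts on TX status. This only helps for drivers that report status
reliably.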

johannes


2012-06-26 10:09:53

by Eliad Peller

Subject: Re: mac80211 auth/assoc in multi-channel scenarios

On Mon, Jun 25, 2012 at 7:05 PM, Johannes Berg
<[email protected]> wrote:
> Hi all,
>
> Some of us have discussed this before, but I wanted to get some wider
> dissemination ...
>
> With multi-channel, one problem that we get is managing the channel
> timing for authentication/association. We'll have bound the channel
> context (once we have that) before we even send a frame of course, but
> setting up the timing as to when to use the channel is very hard, in
> particular for connections to P2P GOs that might potentially be asleep.
>
the other scenario we should have in mind here is the AP side - making
sure we'll be able to auth/assoc a new station.

> 5) do remain-on-channel from mac80211, but this doesn't address P2P GO
> synchronisation requirements
>
what are the unfulfilled requirements here?

> I haven't been able to come up with any other ways, and here we really
> don't have much of a choice -- #1 is the only thing that seems sane to
> me. Thoughts?

our current approach is making the kernel part as simple as possible,
and let userspace handle the channel management.
before authentication (or after getting auth request in the ap case)
the supplicant requests the kernel to "give priority" to a specific
vif. it's up to the low-level driver to decide on the right way to do
it (internally we implement it with ROC on the specified vif).

the con here is that this might result in the channel being reserved
for a pretty long time (retransmits, etc.), but that's why we refer to it
as setting "priority" rather than actual channel reservation. if the
fw has to do other things (e.g. beacon on each beacon interval) it
will do it, even though we asked it to ROC on a different channel.

i think the "tx sync" (method 1) is good for sta mode, but it's less
suitable for ap mode.
also, when waiting for EAPOLs after association you might have to
reserve the channel for a pretty long time anyway (since you still
can't enter ps, and some APs don't send the first EAPOL immediately).

Eliad.

2012-06-26 13:10:36

by Johannes Berg

Subject: Re: mac80211 auth/assoc in multi-channel scenarios

On Tue, 2012-06-26 at 14:35 +0300, Arik Nemtsov wrote:

> >> Yes, we should keep that in mind, but I have a feeling that we should
> >> treat it separately. It's quite different as mac80211 is doing a lot
> >> less -- for example, mac80211 is managing retries, comeback timeouts,
> >> etc. in the managed case.
> >>
> > i guess it makes sense. i just don't like it being managed in multiple
> > places (userspace for ap, kernel for sta).
>
> I also think managing this in multiple places can get messy. I
> slightly prefer Eliad's approach with userspace giving the ROCs, since
> it sees the bigger picture.

Yeah but I don't think the bigger picture actually helps much in this
case. The details matter much more.

Imagine userspace asking to ROC and then to authenticate when the ROC is
accepted. Then the first question is -- how long should the ROC period
be? mac80211 will transmit an auth frame, but if it doesn't get an
answer it will retransmit up to 2 times with 100ms delay. That's
*highly* implementation dependent, I don't think any other (full-MAC)
driver would do exactly the same thing. So this means you'd need about
300ms ROC time.

OTOH, WFA mandates replies within 30ms. So it makes much more sense to
have ~35ms ROC time and try a new one if it fails for some reason,
right?
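
To make the arithmetic concrete, here is a small Python timing model of
the two choices. It is a sketch only: the retry count, the 100ms delay
and the 30ms WFA limit are the figures quoted above, while the 5ms margin
(to arrive at ~35ms) and all function names are assumptions.

```python
# Timing model only. The retry count, retry delay and WFA limit are the
# figures quoted in the text; the margin and all names are assumptions.
AUTH_RETRANSMITS = 2      # "retransmit up to 2 times"
RETRY_DELAY_MS = 100      # "with 100ms delay"
WFA_REPLY_LIMIT_MS = 30   # "WFA mandates replies within 30ms"
ASSUMED_MARGIN_MS = 5     # hypothetical slack, giving the ~35ms figure


def single_long_roc_ms(retransmits: int = AUTH_RETRANSMITS,
                       retry_delay_ms: int = RETRY_DELAY_MS) -> int:
    """One ROC long enough for the first attempt plus every retransmit,
    counting one retry delay of waiting per transmission."""
    return (1 + retransmits) * retry_delay_ms


def short_roc_ms(reply_limit_ms: int = WFA_REPLY_LIMIT_MS,
                 margin_ms: int = ASSUMED_MARGIN_MS) -> int:
    """One short ROC covering a single frame and its reply."""
    return reply_limit_ms + margin_ms


def worst_case_short_rocs_ms(retransmits: int = AUTH_RETRANSMITS) -> int:
    """Total channel time if every attempt needs its own short ROC."""
    return (1 + retransmits) * short_roc_ms()
```

With these figures a single long ROC costs 300ms, a short one costs 35ms,
and even three short ROCs in a row add up to only 105ms of channel time.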

The next question is whether to use a new ROC for assoc or not. How much
time is left on the ROC? Should you cancel it (could be inefficient) and
then start a new one? How can you make that decision?

And now this is just why it won't work even if we start from a clean
slate -- I haven't even talked about the backward compatibility aspect,
running an old supplicant on a driver that expects a ROC. Remember that
the driver need not give the virtual interface *any* channel time on the
right channel before needed, so if there's something going on on another
channel with multi-channel, that vif would never be able to authenticate
with an old supplicant.

I could also mention how this is a stupid userspace API, you're now
requiring to call one thing before another call is valid, but then the
other call is only very briefly valid? If we really wanted ROC, we
should embed the time for it into the auth/assoc request, I'd say.

Thinking about that though -- what about connect() calls? Whole new can of
worms ...

So I really don't see how ROC from userspace makes any sense in this
case or is even workable.

> But all userspace changes are reflected in the STA flags in kernel, so
> we can do STA + AP from kernel mode as well.

Yes, but it would need this patch and a change to hostapd to add the
station when it sends the first auth frame:
http://johannes.sipsolutions.net/patches/kernel/all/LATEST/005-nl80211-full-sta-state.patch

I haven't yet had a chance to look at this again in more detail, in
particular we need a matching hostapd implementation of course.
This would require the assumption though that when hostapd adds a
station it will also send an auth response and the client will request
assoc, but that seems like a reasonable assumption (and hostapd would
have to make assumptions about the client just as mac80211 would.)

This would probably make it possible to implement such "give it airtime"
behaviour entirely within the sta_state() callback in the driver, which
isn't really possible in the managed case.

> Can you give a bit more info on the Tx-sync approach, for the uninitiated?
> I'm also thinking that maybe we could somehow treat the sleeping-GO as
> a special case (maybe with some special code and a HW flag). Right now
> I'm not sure the wl12xx FW even supports it.

Well so the sleeping-GO case is basically that the GO could be asleep at
any time (even opportunistic PS), so you would want to receive a beacon
from it before you send an auth frame.

The TX-sync name goes back to the tx_sync API I had at some point to
handle this case, now we handle it a bit differently (but not very well)
so I removed that because the implementation details were pretty ugly,
but for the future we want something better :-) This was removed in
commit 177958e9679c23537411066cc41b205635dacb14, you can look there for
the old code.

Basically this was letting the driver know that mac80211 was going to TX
a management frame for the auth/assoc sequence, and the callback could
sleep so the driver had a chance to sync up with the GO or reserve
airtime or whatever else.
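
The shape of such a callback can be illustrated with a toy Python model:
a "prepare" step that is allowed to block until the GO is known to be
awake, followed by a transmit step that must not block. None of this is
the removed tx_sync code or any real mac80211 API; the class, the method
names and the timeout are assumptions made for the illustration.

```python
import threading


class ToyDriver:
    """Hypothetical driver with a sleeping 'prepare' hook before mgmt TX."""

    def __init__(self):
        self.go_awake = threading.Event()   # set once a GO beacon is seen
        self.sent = []

    def beacon_received(self):
        # Called from the (simulated) RX path when the GO's beacon arrives.
        self.go_awake.set()

    def prepare_mgmt_tx(self, timeout_s: float) -> bool:
        # May sleep: block until the GO is awake or the timeout expires.
        return self.go_awake.wait(timeout_s)

    def tx(self, frame):
        # Must not sleep: just queue the frame.
        self.sent.append(frame)


def send_auth(driver: ToyDriver, frame, timeout_s: float = 1.0) -> bool:
    """Stack-side flow: let the driver sync up first, then transmit."""
    if not driver.prepare_mgmt_tx(timeout_s):
        return False          # GO never showed up; caller may retry later
    driver.tx(frame)
    return True
```

The point of the split is that only the prepare step sleeps, so the
normal TX processing does not have to change.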

johannes


2012-06-27 11:27:22

by Arik Nemtsov

Subject: Re: mac80211 auth/assoc in multi-channel scenarios

On Wed, Jun 27, 2012 at 2:17 PM, Johannes Berg
<[email protected]> wrote:
> On Tue, 2012-06-26 at 14:35 +0300, Arik Nemtsov wrote:
>
>> Can you give a bit more info on the Tx-sync approach, for the uninitiated?
>> I'm also thinking that maybe we could somehow treat the sleeping-GO as
>> a special case (maybe with some special code and a HW flag). Right now
>> I'm not sure the wl12xx FW even supports it.
>
> I basically had a simpler version in mind, something like this:
> http://p.sipsolutions.net/cd928e926a941ac7.txt

If the low level driver has to prepare for each such Tx, that's fine.
But does this somehow rely on the timing of the Tx? Relying on op_tx
to come right after this seems racy.

2012-06-27 17:05:50

by Arik Nemtsov

Subject: Re: mac80211 auth/assoc in multi-channel scenarios

On Wed, Jun 27, 2012 at 6:25 PM, Johannes Berg
<[email protected]> wrote:
> On Wed, 2012-06-27 at 18:20 +0300, Arik Nemtsov wrote:
>
>> > It'll depend on the driver I guess? If you're going to set a flag "give
>> > this vif priority now" on the first invocation, then you probably
>> > wouldn't care about the timing. It looks like our implementation would
>> > actually start some "give me the channel" operation and when the
>> > firmware says "ok you have the channel now" we'd return and rely on
>> > getting the TX right after.
>> >
>> > Wrt. the race, this isn't going to be something that happens within less
>> > than a millisecond or so, it's going to give us maybe 50ms at least of
>> > channel time (WFA mandates replying within 30ms), so that doesn't seem
>> > like a problem?
>>
>> It's probably not an issue. Just seems simpler to just give the skb to
>> the driver so it can do whatever it wants before/during/after. Bypass
>> op_tx completely for these.
>
> Yeah, that was one of the other options I considered in my original
> email ... but this call must be able to sleep, and the TX processing in
> mac80211 cannot sleep, so unless we bypass all processing (which seems
> wrong) that would be rather difficult to implement.

Good point. But another race to consider is the
multi-channel/multi-vif scenario, where the driver already has a lot
of packets queued up, so the time will expire before we get a chance
to Tx. Again this is not a problem in practice (since it's the VO ac).

How about somehow requiring a multi channel driver to give always
Tx-ack? That will mean we can abandon the retry timers, and rely on
the driver to give an answer within a reasonable time.

2012-06-27 15:20:59

by Arik Nemtsov

Subject: Re: mac80211 auth/assoc in multi-channel scenarios

On Wed, Jun 27, 2012 at 5:38 PM, Johannes Berg
<[email protected]> wrote:
> On Wed, 2012-06-27 at 14:27 +0300, Arik Nemtsov wrote:
>> On Wed, Jun 27, 2012 at 2:17 PM, Johannes Berg
>> <[email protected]> wrote:
>> > On Tue, 2012-06-26 at 14:35 +0300, Arik Nemtsov wrote:
>> >
>> >> Can you give a bit more info on the Tx-sync approach, for the uninitiated?
>> >> I'm also thinking that maybe we could somehow treat the sleeping-GO as
>> >> a special case (maybe with some special code and a HW flag). Right now
>> >> I'm not sure the wl12xx FW even supports it.
>> >
>> > I basically had a simpler version in mind, something like this:
>> > http://p.sipsolutions.net/cd928e926a941ac7.txt
>>
>> If the low level driver has to prepare for each such Tx, that's fine.
>
> If it doesn't, then it can de-duplicate itself I guess? And it can also
> abort everything (if needed) when the bssid is cleared (failure) or when
> we go into associated (success), so I didn't add an explicit cancel
> operation.

Ok.

>
>> But does this somehow rely on the timing of the Tx? Relying on op_tx
>> to come right after this seems racy.
>
> It'll depend on the driver I guess? If you're going to set a flag "give
> this vif priority now" on the first invocation, then you probably
> wouldn't care about the timing. It looks like our implementation would
> actually start some "give me the channel" operation and when the
> firmware says "ok you have the channel now" we'd return and rely on
> getting the TX right after.
>
> Wrt. the race, this isn't going to be something that happens within less
> than a millisecond or so, it's going to give us maybe 50ms at least of
> channel time (WFA mandates replying within 30ms), so that doesn't seem
> like a problem?

It's probably not an issue. Just seems simpler to just give the skb to
the driver so it can do whatever it wants before/during/after. Bypass
op_tx completely for these.

2012-06-26 11:18:44

by Eliad Peller

Subject: Re: mac80211 auth/assoc in multi-channel scenarios

On Tue, Jun 26, 2012 at 1:41 PM, Johannes Berg
<[email protected]> wrote:
> On Tue, 2012-06-26 at 13:09 +0300, Eliad Peller wrote:
>
>> > With multi-channel, one problem that we get is managing the channel
>> > timing for authentication/association. We'll have bound the channel
>> > context (once we have that) before we even send a frame of course, but
>> > setting up the timing as to when to use the channel is very hard, in
>> > particular for connections to P2P GOs that might potentially be asleep.
>> >
>> the other scenario we should have in mind here is the AP side - making
>> sure we'll be able to auth/assoc a new station.
>
> Yes, we should keep that in mind, but I have a feeling that we should
> treat it separately. It's quite different as mac80211 is doing a lot
> less -- for example, mac80211 is managing retries, comeback timeouts,
> etc. in the managed case.
>
i guess it makes sense. i just don't like it being managed in multiple
places (userspace for ap, kernel for sta).

>> > I haven't been able to come up with any other ways, and here we really
>> > don't have much of a choice -- #1 is the only thing that seems sane to
>> > me. Thoughts?
>>
>> our current approach is making the kernel part as simple as possible,
>> and let userspace handle the channel management.
>> before authentication (or after getting auth request in the ap case)
>> the supplicant requests the kernel to "give priority" to a specific
>> vif. it's up to the low-level driver to decide on the right way to do
>> it (internally we implement it with ROC on the specified vif).
>
> I'm pretty much happy with that for the AP case, but given how much
> mac80211 really manages for auth/assoc I don't think it makes sense in
> the managed case.
>
i think that's true for open networks, but for encrypted ones the
EAPOLs are not handled by mac80211, which pretty much complicates the
flow.

>> the con here is that this might result in the channel being reserved
>> for a pretty long time (retransmits, etc.), but that's why we refer to it
>> as setting "priority" rather than actual channel reservation. if the
>> fw has to do other things (e.g. beacon on each beacon interval) it
>> will do it, even though we asked it to ROC on a different channel.
>
> That seems highly implementation-dependent :-)
> The point with that though is that I think it shouldn't be necessary to
> have a firmware implementation that does this.
>
well, i'm not really sure whether our fw is doing it, but that was the plan :)

>> also, when waiting for EAPOLs after association you might have to
>> reserve the channel for a pretty long time anyway (since you still
>> can't enter ps, and some APs don't send the first EAPOL immediately).
>
> Yeah, but you might only want to wait a little bit and then stop doing
> ROC and actually enter PS mode if the AP is really slow, etc. That can
> pretty much be managed from the STA state though since once you're
> associated you're past the bit where mac80211 is doing a lot?
>
you can't enter ps while you are not authorized (it's probably not
forbidden by the spec, but i'm pretty sure some APs won't like it).
managing it from the STA state doesn't make much sense to me (why add
another management point?), and if you're going to handle it from
userspace, well... just do it from the beginning :)

Eliad.

2012-06-28 10:02:01

by Arik Nemtsov

Subject: Re: mac80211 auth/assoc in multi-channel scenarios

On Thu, Jun 28, 2012 at 12:33 PM, Johannes Berg
<[email protected]> wrote:
> On Wed, 2012-06-27 at 20:05 +0300, Arik Nemtsov wrote:
>
>> > Yeah, that was one of the other options I considered in my original
>> > email ... but this call must be able to sleep, and the TX processing in
>> > mac80211 cannot sleep, so unless we bypass all processing (which seems
>> > wrong) that would be rather difficult to implement.
>>
>> Good point. But another race to consider is the
>> multi-channel/multi-vif scenario, where the driver already has a lot
>> of packets queued up, so the time will expire before we get a chance
>> to Tx. Again this is not a problem in practice (since it's the VO ac).
>
> If the vif is on a different channel, the driver really should assign it
> different queues and then there shouldn't be much queued up, right?
> Also, if you really have >100ms latency on your queues, you have a major
> problem anyway I'd think?

Yea. It's all only theoretical :)

>
>> How about somehow requiring a multi channel driver to give always
>> Tx-ack? That will mean we can abandon the retry timers, and rely on
>> the driver to give an answer within a reasonable time.
>
> That doesn't mean we can abandon the retry timers though? Then again,
> maybe it does, we could start the timer only when we get the status
> information I guess?

I'm not really sure why the assoc timers are there now. If it's
because we want to be sure we gave the peer a chance to respond, well
the Tx-ack already gives us that. Waiting any longer won't help.
Or does waiting in general before the next Tx help with something?
(clear temporary congestion etc).

>
> I'm not sure I'd want to *require* this, but it sounds like a good thing
> we could do to address this possible race for drivers that do support
> reliable status reports?

Yea. And then, if all current multi-channel drivers support Tx-ack, we
can skip implementing tx_sync.

2012-06-26 10:41:49

by Johannes Berg

Subject: Re: mac80211 auth/assoc in multi-channel scenarios

On Tue, 2012-06-26 at 13:09 +0300, Eliad Peller wrote:

> > With multi-channel, one problem that we get is managing the channel
> > timing for authentication/association. We'll have bound the channel
> > context (once we have that) before we even send a frame of course, but
> > setting up the timing as to when to use the channel is very hard, in
> > particular for connections to P2P GOs that might potentially be asleep.
> >
> the other scenario we should have in mind here is the AP side - making
> sure we'll be able to auth/assoc a new station.

Yes, we should keep that in mind, but I have a feeling that we should
treat it separately. It's quite different as mac80211 is doing a lot
less -- for example, mac80211 is managing retries, comeback timeouts,
etc. in the managed case.

> > 5) do remain-on-channel from mac80211, but this doesn't address P2P GO
> > synchronisation requirements
> >
> what are the unfulfilled requirements here?

It would be a more special ROC because the driver would be expected to
only start it when the P2P GO is available. This isn't the case for any
other ROC, and my gut feeling is that requiring that in this special ROC
would "pollute" the ROC API and a lot of people would forget about it
etc.

> > I haven't been able to come up with any other ways, and here we really
> > don't have much of a choice -- #1 is the only thing that seems sane to
> > me. Thoughts?
>
> our current approach is making the kernel part as simple as possible,
> and let userspace handle the channel management.
> before authentication (or after getting auth request in the ap case)
> the supplicant requests the kernel to "give priority" to a specific
> vif. it's up to the low-level driver to decide on the right way to do
> it (internally we implement it with ROC on the specified vif).

I'm pretty much happy with that for the AP case, but given how much
mac80211 really manages for auth/assoc I don't think it makes sense in
the managed case.

> the con here is that this might result in the channel being reserved
> for a pretty long time (retransmits, etc.), but that's why we refer to it
> as setting "priority" rather than actual channel reservation. if the
> fw has to do other things (e.g. beacon on each beacon interval) it
> will do it, even though we asked it to ROC on a different channel.

That seems highly implementation-dependent :-)
The point with that though is that I think it shouldn't be necessary to
have a firmware implementation that does this.

> i think the "tx sync" (method 1) is good for sta mode, but it's less
> suitable for ap mode.

Totally agree.

> also, when waiting for EAPOLs after association you might have to
> reserve the channel for a pretty long time anyway (since you still
> can't enter ps, and some APs don't send the first EAPOL immediately).

Yeah, but you might only want to wait a little bit and then stop doing
ROC and actually enter PS mode if the AP is really slow, etc. That can
pretty much be managed from the STA state though since once you're
associated you're past the bit where mac80211 is doing a lot?

johannes


2012-06-27 14:38:12

by Johannes Berg

Subject: Re: mac80211 auth/assoc in multi-channel scenarios

On Wed, 2012-06-27 at 14:27 +0300, Arik Nemtsov wrote:
> On Wed, Jun 27, 2012 at 2:17 PM, Johannes Berg
> <[email protected]> wrote:
> > On Tue, 2012-06-26 at 14:35 +0300, Arik Nemtsov wrote:
> >
> >> Can you give a bit more info on the Tx-sync approach, for the uninitiated?
> >> I'm also thinking that maybe we could somehow treat the sleeping-GO as
> >> a special case (maybe with some special code and a HW flag). Right now
> >> I'm not sure the wl12xx FW even supports it.
> >
> > I basically had a simpler version in mind, something like this:
> > http://p.sipsolutions.net/cd928e926a941ac7.txt
>
> If the low level driver has to prepare for each such Tx, that's fine.

If it doesn't, then it can de-duplicate itself I guess? And it can also
abort everything (if needed) when the bssid is cleared (failure) or when
we go into associated (success), so I didn't add an explicit cancel
operation.

> But does this somehow rely on the timing of the Tx? Relying on op_tx
> to come right after this seems racy.

It'll depend on the driver I guess? If you're going to set a flag "give
this vif priority now" on the first invocation, then you probably
wouldn't care about the timing. It looks like our implementation would
actually start some "give me the channel" operation and when the
firmware says "ok you have the channel now" we'd return and rely on
getting the TX right after.

Wrt. the race, this isn't going to be something that happens within less
than a millisecond or so, it's going to give us maybe 50ms at least of
channel time (WFA mandates replying within 30ms), so that doesn't seem
like a problem?

johannes


2012-06-28 12:00:18

by Arik Nemtsov

Subject: Re: mac80211 auth/assoc in multi-channel scenarios

On Thu, Jun 28, 2012 at 1:10 PM, Johannes Berg
<[email protected]> wrote:
> On Thu, 2012-06-28 at 13:01 +0300, Arik Nemtsov wrote:
>
>> >> How about somehow requiring a multi channel driver to always give
>> >> Tx-ack? That will mean we can abandon the retry timers, and rely on
>> >> the driver to give an answer within a reasonable time.
>> >
>> > That doesn't mean we can abandon the retry timers though? Then again,
>> > maybe it does, we could start the timer only when we get the status
>> > information I guess?
>>
>> I'm not really sure why the assoc timers are there now. If it's
>> because we want to be sure we gave the peer a chance to respond, well
>> the Tx-ack already gives us that. Waiting any longer won't help.
>> Or does waiting in general before the next Tx help with something?
>> (clear temporary congestion etc).
>
> Well the only timer there is, is retrying the frame. Arguably that isn't
> needed as the frame has been retried on the air already multiple times
> by the hardware, but sometimes temporary channel conditions exist so
> waiting 100ms until the next retry can make sense.
>
> If we actually receive a response, we cancel all timers right away.
>
>> > I'm not sure I'd want to *require* this, but it sounds like a good thing
>> > we could do to address this possible race for drivers that do support
>> > reliable status reports?
>>
>> Yea. And then, if all current multi-channel drivers support Tx-ack, we
>> can skip implementing tx_sync.
>
> We could, but I'm not sure I'd go there? It would mean that the driver
> has to buffer the frame, schedule a work to do whatever it needs to sync
> up, and then at the end of the work TX the frame. Since this only comes
> in from contexts in mac80211 that can sleep, from a whole system
> complexity POV it seems much simpler to give the driver a chance to do
> whatever it needs to do before the TX even happens?

Yea I guess the callbacks don't hurt anyone.

2012-06-26 18:18:56

by Arik Nemtsov

Subject: Re: mac80211 auth/assoc in multi-channel scenarios

On Tue, Jun 26, 2012 at 4:10 PM, Johannes Berg
<[email protected]> wrote:
> On Tue, 2012-06-26 at 14:35 +0300, Arik Nemtsov wrote:
>
>> >> Yes, we should keep that in mind, but I have a feeling that we should
>> >> treat it separately. It's quite different as mac80211 is doing a lot
>> >> less -- for example, mac80211 is managing retries, comeback timeouts,
>> >> etc. in the managed case.
>> >>
>> > i guess it makes sense. i just don't like it being managed in multiple
>> > places (userspace for ap, kernel for sta).
>>
>> I also think managing this in multiple places can get messy. I
>> slightly prefer Eliad's approach with userspace giving the ROCs, since
>> it sees the bigger picture.
>
> Yeah but I don't think the bigger picture actually helps much in this
> case. The details matter much more.
>
> Imagine userspace asking to ROC and then to authenticate when the ROC is
> accepted. Then the first question is -- how long should the ROC period
> be? mac80211 will transmit an auth frame, but if it doesn't get an
> answer it will retransmit up to 2 times with 100ms delay. That's
> *highly* implementation dependent, I don't think any other (full-MAC)
> driver would do exactly the same thing. So this means you'd need about
> 300ms ROC time.
>
> OTOH, WFA mandates replies within 30ms. So it makes much more sense to
> have ~35ms ROC time and try a new one if it fails for some reason,
> right?
>
> The next question is whether to use a new ROC for assoc or not. How much
> time is left on the ROC? Should you cancel it (could be inefficient) and
> then start a new one? How can you make that decision?
>
> And now this is just why it won't work even if we start from a clean
> slate -- I haven't even talked about the backward compatibility aspect,
> running an old supplicant on a driver that expects a ROC. Remember that
> the driver need not give the virtual interface *any* channel time on the
> right channel before needed, so if there's something going on on another
> channel with multi-channel, that vif would never be able to authenticate
> with an old supplicant.

Well that's a given, if we're introducing new features into hostap to
support multi channel.
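
The ROC-duration arithmetic quoted above can be written out as a small
sketch. The retry count, the 100ms delay and the 30ms reply deadline are
the figures from the mail; the 5ms margin is only an assumption chosen so
that the result matches the "~35ms" mentioned there, and the function
names are made up for illustration.

```c
#include <assert.h>

/* Figures quoted in the discussion. */
enum {
    AUTH_MAX_TRIES  = 3,    /* first attempt plus up to 2 retransmits */
    AUTH_TIMEOUT_MS = 100,  /* delay before each retransmit */
    WFA_REPLY_MS    = 30,   /* reply deadline attributed to WFA */
    ROC_MARGIN_MS   = 5     /* assumption: slack giving the "~35ms" figure */
};

/* One long ROC that has to cover every attempt and its wait. */
unsigned int roc_ms_covering_all_tries(unsigned int tries,
                                       unsigned int timeout_ms)
{
    assert(tries > 0);
    return tries * timeout_ms;
}

/* One short ROC per attempt, sized to the reply deadline plus slack. */
unsigned int roc_ms_per_try(unsigned int reply_ms, unsigned int margin_ms)
{
    return reply_ms + margin_ms;
}
```

With these numbers the single long ROC comes out at 300ms, while a
per-attempt ROC is 35ms and is only repeated if the attempt fails.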

>
> I could also mention how this is a stupid userspace API, you're now
> requiring to call one thing before another call is valid, but then the
> other call is only very briefly valid? If we really wanted ROC, we
> should embed the time for it into the auth/assoc request, I'd say.

I think all these examples are because of our different definitions
for ROC. If ROC is a recommendation, then we just start the ROC before
starting the connection, and end it after the last EAPOL.
If channel management is implemented in SW, what you're saying is a
must. But the FW can abstract these details. Maybe we should do this
similar to SW/HW scan in mac80211?

>
> Thinking about that though -- what about connect() calls? Whole new can of
> worms ...

Not sure what you mean here.

>
> So I really don't see how ROC from userspace makes any sense in this
> case or is even workable.
>
>> But all userspace changes are reflected in the STA flags in kernel, so
>> we can do STA + AP from kernel mode as well.
>
> Yes, but it would need this patch and a change to hostapd to add the
> station when it sends the first auth frame:
> http://johannes.sipsolutions.net/patches/kernel/all/LATEST/005-nl80211-full-sta-state.patch

Ah this is useful on another level as well. Today we have a hack in
the driver to identify the auth-frame reply and add the STA to FW
temporarily :)

>
> I haven't yet had a chance to look at this again in more detail, in
> particular we need a matching hostapd implementation of course.
> This would require the assumption though that when hostapd adds a
> station it will also send an auth response and the client will request
> assoc, but that seems like a reasonable assumption (and hostapd would
> have to make assumptions about the client just as mac80211 would.)
>
> This would probably make it possible to implement such "give it airtime"
> behaviour entirely within the sta_state() callback in the driver, which
> isn't really possible in the managed case.
>
>> Can you give a bit more info on the Tx-sync approach, for the uninitiated?
>> I'm also thinking that maybe we could somehow treat the sleeping-GO as
>> a special case (maybe with some special code and a HW flag). Right now
>> I'm not sure the wl12xx FW even supports it.
>
> Well so the sleeping-GO case is basically that the GO could be asleep at
> any time (even opportunistic PS), so you would want to receive a beacon
> from it before you send an auth frame.
>
> The TX-sync name goes back to the tx_sync API I had at some point to
> handle this case, now we handle it a bit differently (but not very well)
> so I removed that because the implementation details were pretty ugly,
> but for the future we want something better :-) This was removed in
> commit 177958e9679c23537411066cc41b205635dacb14, you can look there for
> the old code.
>
> Basically this was letting the driver know that mac80211 was going to TX
> a management frame for the auth/assoc sequence, and the callback could
> sleep so the driver had a chance to sync up with the GO or reserve
> airtime or whatever else.

Thanks

Arik
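
For readers who want the tx-sync idea in code form, here is a minimal
userspace sketch of the flow described above: a callback that may sleep
is invoked before every management frame of the auth/assoc sequence,
retransmits included. It is not the removed tx_sync code and not the real
struct ieee80211_ops; the names drv_ops, prepare_mgmt_tx, tx_mgmt and the
fake driver are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical driver hook table, for illustration only. */
struct drv_ops {
    /* May sleep: sync up with the GO or reserve airtime before a TX. */
    void (*prepare_mgmt_tx)(void *priv);
    /* Sends the frame; returns true if a response was received. */
    bool (*tx_mgmt)(void *priv);
};

/*
 * Stack side: call the hook before every attempt, so the driver also
 * sees the retransmits. Returns the attempt that succeeded, or 0.
 */
int send_auth(const struct drv_ops *ops, void *priv, int max_tries)
{
    assert(ops != NULL && ops->tx_mgmt != NULL);
    for (int attempt = 1; attempt <= max_tries; attempt++) {
        if (ops->prepare_mgmt_tx)
            ops->prepare_mgmt_tx(priv);
        if (ops->tx_mgmt(priv))
            return attempt;
    }
    return 0;
}

/* Fake driver: counts calls and "gets a reply" on a chosen attempt. */
struct fake_drv { int prepared; int sent; int reply_on_try; };

static void fake_prepare(void *priv)
{
    ((struct fake_drv *)priv)->prepared++;
}

static bool fake_tx(void *priv)
{
    struct fake_drv *d = priv;
    return ++d->sent == d->reply_on_try;
}

/* Runs one exchange and returns how often the hook was called. */
int demo_prepare_calls(int reply_on_try, int max_tries)
{
    const struct drv_ops ops = { fake_prepare, fake_tx };
    struct fake_drv d = { 0, 0, reply_on_try };

    send_auth(&ops, &d, max_tries);
    return d.prepared;
}
```

The point of the shape is that the driver needs no frame buffering of its
own: it blocks in the hook until the channel is usable and the TX follows.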

2012-06-28 10:11:00

by Johannes Berg

Subject: Re: mac80211 auth/assoc in multi-channel scenarios

On Thu, 2012-06-28 at 13:01 +0300, Arik Nemtsov wrote:

> >> How about somehow requiring a multi channel driver to always give
> >> Tx-ack? That will mean we can abandon the retry timers, and rely on
> >> the driver to give an answer within a reasonable time.
> >
> > That doesn't mean we can abandon the retry timers though? Then again,
> > maybe it does, we could start the timer only when we get the status
> > information I guess?
>
> I'm not really sure why the assoc timers are there now. If it's
> because we want to be sure we gave the peer a chance to respond, well
> the Tx-ack already gives us that. Waiting any longer won't help.
> Or does waiting in general before the next Tx help with something?
> (clear temporary congestion etc).

Well the only timer there is, is retrying the frame. Arguably that isn't
needed as the frame has been retried on the air already multiple times
by the hardware, but sometimes temporary channel conditions exist so
waiting 100ms until the next retry can make sense.

If we actually receive a response, we cancel all timers right away.

> > I'm not sure I'd want to *require* this, but it sounds like a good thing
> > we could do to address this possible race for drivers that do support
> > reliable status reports?
>
> Yea. And then, if all current multi-channel drivers support Tx-ack, we
> can skip implementing tx_sync.

We could, but I'm not sure I'd go there? It would mean that the driver
has to buffer the frame, schedule a work to do whatever it needs to sync
up, and then at the end of the work TX the frame. Since this only comes
in from contexts in mac80211 that can sleep, from a whole system
complexity POV it seems much simpler to give the driver a chance to do
whatever it needs to do before the TX even happens?

johannes
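
The "start the timer only when we get the status information" variant
discussed in this mail could look roughly like the state machine below.
This is only a sketch under assumptions: the state names, struct auth_ctx
and the event letters are invented here, and real mac80211 code is
organised differently.

```c
#include <assert.h>
#include <stdbool.h>

enum auth_state { AUTH_WAIT_STATUS, AUTH_WAIT_REPLY, AUTH_DONE, AUTH_FAILED };

struct auth_ctx {
    enum auth_state state;
    int tries;         /* transmissions so far */
    int max_tries;
    bool timer_armed;  /* retry timer pending */
};

void auth_start(struct auth_ctx *c, int max_tries)
{
    assert(max_tries > 0);
    c->state = AUTH_WAIT_STATUS;
    c->tries = 1;
    c->max_tries = max_tries;
    c->timer_armed = false;
}

/* Driver reported TX status: only now does the retry wait begin. */
void auth_tx_status(struct auth_ctx *c)
{
    if (c->state != AUTH_WAIT_STATUS)
        return;
    c->state = AUTH_WAIT_REPLY;
    c->timer_armed = true;
}

/* A response arrived: cancel the timer right away. */
void auth_rx_reply(struct auth_ctx *c)
{
    if (c->state == AUTH_DONE || c->state == AUTH_FAILED)
        return;
    c->timer_armed = false;
    c->state = AUTH_DONE;
}

/* Retry timer fired: retransmit, or give up after max_tries. */
void auth_timer_expired(struct auth_ctx *c)
{
    if (c->state != AUTH_WAIT_REPLY || !c->timer_armed)
        return;
    c->timer_armed = false;
    if (c->tries >= c->max_tries) {
        c->state = AUTH_FAILED;
        return;
    }
    c->tries++;
    c->state = AUTH_WAIT_STATUS;
}

/* Replay scripted events: 's' = TX status, 't' = timer, 'r' = reply. */
enum auth_state auth_replay(const char *events, int max_tries)
{
    struct auth_ctx c;

    auth_start(&c, max_tries);
    for (; *events; events++) {
        switch (*events) {
        case 's': auth_tx_status(&c); break;
        case 't': auth_timer_expired(&c); break;
        case 'r': auth_rx_reply(&c); break;
        }
    }
    return c.state;
}
```

In this shape a frame that sits in the driver for a long time cannot
trigger an early retransmit, because the timer does not exist until the
status report comes back.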