2011-09-15 22:36:35

by Christian Lamparter

Subject: Re: Latency and connection problems with a carl9170-based AP

On Thursday, September 15, 2011 11:37:58 PM Harshal Chhaya wrote:
> On Wed, Sep 14, 2011 at 12:32 PM, Christian Lamparter
> <[email protected]> wrote:
> > On Wednesday, September 14, 2011 01:19:59 PM Harshal Chhaya wrote:
> >> Most of the disconnects seem to be caused by beacons that update the
> >> TIM IE but not the overall length. The result is a corrupted RSN IE
> >> (e.g. the IE length says 20 bytes but the IE is only 19 bytes in size)
> >> which causes the clients to disconnect. This problem lasts for only
> >> one beacon (i.e. the next beacon has the right size) but it is enough
> >> to cause the clients to disconnect. Is there a way to fix this?
> > Now that is really interesting. Do you know if the TIM IE is generated
> > properly by ieee80211_beacon_add_tim in net/mac80211/tx.c?
>
> I don't know if TIM IE is valid but a change in size of this IE causes
> the problem. Do you need more details or packet captures or something
> else to understand the problem better?
There's a 6 ms window between the TBTT event and beacon xmit. Most x86
[which have a USB port] and all AMD64 are fast enough to react in time.
Do you think you can check if your embedded system is fast enough?

As an alternative approach, we could reorder the TIM IE within the beacon
and put it at the end. This reduces the delta between the old and the
new beacon and should prevent the corruption.
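
To illustrate why the position of the TIM IE matters, here is a minimal
userspace sketch. It assumes the beacon upload only has to rewrite the bytes
that differ from the previous beacon; beacon_delta is a hypothetical helper
for this mail, not a carl9170 or mac80211 function.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count the bytes that differ between the old and the new beacon.
 * Bytes beyond the end of the shorter buffer count as changed too.
 * Hypothetical helper for illustration only.
 */
size_t beacon_delta(const uint8_t *old_buf, size_t old_len,
		    const uint8_t *new_buf, size_t new_len)
{
	size_t common = old_len < new_len ? old_len : new_len;
	size_t longer = old_len < new_len ? new_len : old_len;
	size_t changed = longer - common;	/* grown or shrunk part */
	size_t i;

	for (i = 0; i < common; i++)
		if (old_buf[i] != new_buf[i])
			changed++;

	return changed;
}
```

When the TIM IE sits in front of other IEs, a one-byte change in its size
shifts everything behind it, so most of the following bytes count as changed.
With the TIM IE at the end, only the TIM bytes themselves differ.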
> >> Another problem when power save is enabled is the large and
> >> unpredictable latency. I understand how enabling power save can
> >> increase latency but my ping times go from 3-4 ms without power save
> >> to a wide range of 3 ms - 3 s after I enable power save. I am trying
> >> smaller beacon intervals to reduce this latency but even at a beacon
> >> interval of 25ms, I get ping times of up to ~400 ms. How do I reduce
> >> this wide variation in latency?
> > What's the listen interval of your stations?
> > Maybe max_listen_interval=1 in hostapd.conf helps.
>
> The listen interval on the clients is 1. It's as if the beacon doesn't
> have the right TIM bit set or the device is missing beacons. I will
> narrow down my test set to a single client and capture the packets to
> understand the interaction better.
ok, understood.
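
For going through the captures, a small check like the following could flag
beacons in which an IE claims more bytes than the frame actually carries
(the "length says 20 bytes but the IE is only 19 bytes" case). This is only a
sketch: check_ies is a hypothetical name, and it assumes the caller passes
just the IE area, i.e. the frame body after the fixed beacon fields.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Walk the information elements (id, length, payload) of a beacon.
 * Returns 0 if every IE fits inside the buffer, or -1 if an IE header
 * is truncated or a length field points past the end of the buffer.
 */
int check_ies(const uint8_t *ies, size_t len)
{
	size_t pos = 0;

	while (pos < len) {
		if (len - pos < 2)
			return -1;	/* truncated IE header */
		if ((size_t)ies[pos + 1] > len - pos - 2)
			return -1;	/* payload runs past the frame */
		pos += 2 + (size_t)ies[pos + 1];
	}

	return 0;
}
```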

Regards,
Chr


2011-09-19 02:35:25

by Harshal Chhaya

Subject: Re: Latency and connection problems with a carl9170-based AP

On Thu, Sep 15, 2011 at 5:36 PM, Christian Lamparter
<[email protected]> wrote:
> On Thursday, September 15, 2011 11:37:58 PM Harshal Chhaya wrote:
>> On Wed, Sep 14, 2011 at 12:32 PM, Christian Lamparter
>> <[email protected]> wrote:
>> > On Wednesday, September 14, 2011 01:19:59 PM Harshal Chhaya wrote:
>> >> Most of the disconnects seem to be caused by beacons that update the
>> >> TIM IE but not the overall length. The result is a corrupted RSN IE
>> >> (e.g. the IE length says 20 bytes but the IE is only 19 bytes in size)
>> >> which causes the clients to disconnect. This problem lasts for only
>> >> one beacon (i.e. the next beacon has the right size) but it is enough
>> >> to cause the clients to disconnect. Is there a way to fix this?
>> > Now that is really interesting. Do you know if the TIM IE is generated
>> > properly by ieee80211_beacon_add_tim in net/mac80211/tx.c?
>>
>> I don't know if TIM IE is valid but a change in size of this IE causes
>> the problem. Do you need more details or packet captures or something
>> else to understand the problem better?
> There's a 6 ms window between the TBTT event and beacon xmit. Most x86
> [which have a USB port] and all AMD64 are fast enough to react in time.
> Do you think you can check if your embedded system is fast enough?

I did not know about the 6 ms constraint. Thanks for that detail. Where
do I add debug prints (i.e., which function in which file) to confirm
that my platform meets the 6 ms requirement?

Also, does the host wait for the TBTT event before processing the next
beacon? I am not clear on the beacon xmit path and would very much
appreciate some details on who (carl9170/mac80211/hostapd) is
responsible for which part of the beacon creation and transmission so
I can debug appropriately.

> As an alternative approach, we could reorder the TIM IE within the beacon
> and put it at the end. This reduces the delta between the old and the
> new beacon and should prevent the corruption.

I can try this too if you can tell me which function controls this.


>> >> Another problem when power save is enabled is the large and
>> >> unpredictable latency. I understand how enabling power save can
>> >> increase latency but my ping times go from 3-4 ms without power save
>> >> to a wide range of 3 ms - 3 s after I enable power save. I am trying
>> >> smaller beacon intervals to reduce this latency but even at a beacon
>> >> interval of 25ms, I get ping times of up to ~400 ms. How do I reduce
>> >> this wide variation in latency?
>> > What's the listen interval of your stations?
>> > Maybe max_listen_interval=1 in hostapd.conf helps.
>>
>> The listen interval on the clients is 1. It's as if the beacon doesn't
>> have the right TIM bit set or the device is missing beacons. I will
>> narrow down my test set to a single client and capture the packets to
>> understand the interaction better.
> ok, understood.
>
> Regards,
> Chr
>


Thanks,
- Harshal

2011-09-19 13:06:24

by Christian Lamparter

Subject: Re: Latency and connection problems with a carl9170-based AP

On Monday, September 19, 2011 04:32:14 AM Harshal Chhaya wrote:
> On Thu, Sep 15, 2011 at 5:36 PM, Christian Lamparter
> <[email protected]> wrote:
> > On Thursday, September 15, 2011 11:37:58 PM Harshal Chhaya wrote:
> >> On Wed, Sep 14, 2011 at 12:32 PM, Christian Lamparter
> >> <[email protected]> wrote:
> >> > On Wednesday, September 14, 2011 01:19:59 PM Harshal Chhaya wrote:
> >> >> Most of the disconnects seem to be caused by beacons that update the
> >> >> TIM IE but not the overall length. The result is a corrupted RSN IE
> >> >> (e.g. the IE length says 20 bytes but the IE is only 19 bytes in size)
> >> >> which causes the clients to disconnect. This problem lasts for only
> >> >> one beacon (i.e. the next beacon has the right size) but it is enough
> >> >> to cause the clients to disconnect. Is there a way to fix this?
> >> > Now that is really interesting. Do you know if the TIM IE is generated
> >> > properly by ieee80211_beacon_add_tim in net/mac80211/tx.c?
> >>
> >> I don't know if TIM IE is valid but a change in size of this IE causes
> >> the problem. Do you need more details or packet captures or something
> >> else to understand the problem better?
> > There's a 6 ms window between the TBTT event and beacon xmit. Most x86
> > [which have a USB port] and all AMD64 are fast enough to react in time.
> > Do you think you can check if your embedded system is fast enough?
>
> I did not know about the 6 ms constraint. Thanks for that detail. Where
> do I add debug prints (i.e., which function in which file) to confirm
> that my platform meets the 6 ms requirement?
Is the driver/fw really that hard to understand? After all, you are lucky
to have the HW specs. I would put it into the firmware:

start: first line in handle_pretbtt in wlan.c
stop: right after set(AR9170_MAC_REG_BCN_CTRL, AR9170_BCN_CTRL_READY);
in cmd.c

BTW: I got the name wrong, it's not TBTT but the PRETBTT event which
is triggered 6 ms before the TBTT.
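
Once both timestamps are available, the comparison itself is simple. The
sketch below is standalone C, not firmware code. It assumes a free-running
32-bit microsecond counter is read at the start and stop points, and that one
KUS equals 1024 us; elapsed_us and beacon_in_time are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: one KUS is 1024 us, like an 802.11 time unit. */
#define KUS_TO_US(kus)	((uint32_t)(kus) * 1024u)

/*
 * Elapsed microseconds between two readings of a free-running 32-bit
 * counter. Unsigned subtraction stays correct across a single wrap.
 */
uint32_t elapsed_us(uint32_t start, uint32_t stop)
{
	return stop - start;
}

/* True if the beacon update finished within the PRETBTT lead time. */
bool beacon_in_time(uint32_t start, uint32_t stop, uint32_t lead_kus)
{
	return elapsed_us(start, stop) <= KUS_TO_US(lead_kus);
}
```

With a lead time of 6 KUS, any update that takes longer than 6144 us would
count as late under these assumptions.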

> Also, does the host wait for the TBTT event before processing the next
> beacon? I am not clear on the beacon xmit path and would very much
> appreciate some details on who (carl9170/mac80211/hostapd) is
> responsible for which part of the beacon creation and transmission so
> I can debug appropriately.
After the PRETBTT event is triggered, the firmware/driver/mac80211 has
6ms to update the beacon. Of course, you can play around with different
delays, just update CARL9170_PRETBTT_KUS in hw.h and recompile the
fw and driver.

> > As an alternative approach, we could reorder the TIM IE within the beacon
> > and put it at the end. This reduces the delta between the old and the
> > new beacon and should prevent the corruption.
>
> I can try this too if you can tell me which function controls this.
in net/mac80211/tx.c look for the NL80211_IFTYPE_AP case
and move:

	if (beacon->tail)
		memcpy(skb_put(skb, beacon->tail_len),
		       beacon->tail, beacon->tail_len);

in front of

	if (local->tim_in_locked_section) {
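
The resulting layout is head, then tail, then TIM. A minimal userspace sketch
of that ordering, using plain buffers instead of skbs, could look like this.
Both function names are hypothetical and nothing here is mac80211 API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Append n bytes from src to dst at offset *pos if they fit in cap bytes.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int append(uint8_t *dst, size_t cap, size_t *pos,
		  const uint8_t *src, size_t n)
{
	if (n > cap - *pos)
		return -1;
	memcpy(dst + *pos, src, n);
	*pos += n;
	return 0;
}

/*
 * Build head | tail | TIM, i.e. the tail is copied before the TIM IE.
 * Returns the beacon length, or 0 if out does not have enough room.
 */
size_t build_beacon_tim_last(uint8_t *out, size_t cap,
			     const uint8_t *head, size_t head_len,
			     const uint8_t *tail, size_t tail_len,
			     const uint8_t *tim, size_t tim_len)
{
	size_t pos = 0;

	if (append(out, cap, &pos, head, head_len) ||
	    append(out, cap, &pos, tail, tail_len) ||
	    append(out, cap, &pos, tim, tim_len))
		return 0;

	return pos;
}
```

Because head and tail do not change between beacons, a TIM update in this
layout only touches the last bytes of the frame.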

Regards,
Chr