I got another one to lay out before you for comment:
Setup:
- XG-601 card (ISL3880) with p54pci driver in AP mode
- iPod Touch 2G (firmware 2.1.1) as the only client
The client does an http request and waits for an answer. In this case,
the web server answers, but the answer seemingly is never sent over
the air.
Please see the attached pcap:
- frame 7: http request
- frame 13: iPod goes to sleep
- frame 23: AP beacon indicates traffic for iPod (AID 1 in TIM)
- frame 25: iPod wakes up
- in between: MISSING DATA ???
- frame 27: AP beacon with no traffic indicated ???
- frame 29: iPod goes to sleep again
- subsequent frames: repetitions of this, until the TCP connection is closed
My understanding is that the AP would not indicate any traffic without
actually having some ready to send? Wrong assumption?
Both the iPod and the monitoring device seem to agree the http reply
is not there. This would be a good reason for the application-level
timeouts I'm seeing.
Best wishes,
Stefan.
See also: previous conversation about app-level timeouts and powersave:
http://article.gmane.org/gmane.linux.kernel.wireless.general/24278
On Monday 24 November 2008 14:37:54 Johannes Berg wrote:
> On Fri, 2008-11-21 at 15:12 +0100, Stefan Steuerwald wrote:
>
> > - frame 7: http request
> > - frame 13: iPod goes to sleep
> > - frame 23: AP beacon indicates traffic for iPod (AID 1 in TIM)
>
> in 24 too, the ipod probably didn't see the beacon in frame 23 even
> though 23 was a dtim beacon (which is a bit odd, but maybe the ipod
> doesn't care about mcast at this point)
>
> > - frame 25: iPod wakes up
>
> 26: ack from the AP
>
> > - in between: MISSING DATA ???
> > - frame 27: AP beacon with no traffic indicated ???
> > - frame 29: iPod goes to sleep again
> > - subsequent frames: repetitions of this, until the TCP connection is closed
> >
> > My understanding is that the AP would not indicate any traffic without
> > actually having some ready to send? Wrong assumption?
>
> Indeed. Christian, is it possible that the p54 device is actually
> filtering these frames? I'm pretty sure mac80211 behaves correctly, and
> it unsetting the TIM bit means that it must no longer have traffic
> buffered.
As far as I know it works like this:
If a frame with a the PS-Bit in the FC set is received, the firmware
will mark the source mac / aid as "sleeping". And every frame from
this moment on for this device will be buffered.
To remove the "mark" again, the driver has to call p54_sta_unlock.
And the firmware will send out all buffered frames.
Or if we only need for one frame (e.g: probe resp) we have a tx_flag (_NO_CANCEL)
If for some reason the "mark" doesn't get removed the firmware will eventually timeout
the stuck frame and sets a the P54_TX_PSM_CANCELLED flag in tx_status.
And we pass this on to the mac with IEEE80211_TX_STAT_TX_FILTERED.
one thing: p54 reports the tx_status through the rx-ring-buffer.
so I hope there's no rx/tx race here since everything is in the same "boat" here.
based on that:
I made two different patches to address the problem.
one fiddles with mac80211 only (set-and-clear.diff).
It assumes that if a station comes out of PS, we have to set
the CLEAR_PS_FILT on the same time we clear the WLAN_STA_PS.
the other one is only for p54 (p54-sta-flags.diff)... Doesn't do very much,
it just checks if the CLEAR_PS_FILT is set and then sets the
NO_CANCEL flag on that frame, so the firmware won't buffer it.
Of course you can test both patches on the same time, if it doesn't help.
And finally of course, I could be totally wrong and this is all nothing but useless gibberish.
Regards,
Chr
BTW: I couldn't test the patches, so it may OOps
On Fri, 2008-11-21 at 15:12 +0100, Stefan Steuerwald wrote:
> - frame 7: http request
> - frame 13: iPod goes to sleep
> - frame 23: AP beacon indicates traffic for iPod (AID 1 in TIM)
in 24 too, the ipod probably didn't see the beacon in frame 23 even
though 23 was a dtim beacon (which is a bit odd, but maybe the ipod
doesn't care about mcast at this point)
> - frame 25: iPod wakes up
26: ack from the AP
> - in between: MISSING DATA ???
> - frame 27: AP beacon with no traffic indicated ???
> - frame 29: iPod goes to sleep again
> - subsequent frames: repetitions of this, until the TCP connection is closed
>
> My understanding is that the AP would not indicate any traffic without
> actually having some ready to send? Wrong assumption?
Indeed. Christian, is it possible that the p54 device is actually
filtering these frames? I'm pretty sure mac80211 behaves correctly, and
it unsetting the TIM bit means that it must no longer have traffic
buffered.
johannes
On Mon, 2008-11-24 at 14:36 +0100, Stefan Steuerwald wrote:
> I see "dropped TX filtered frame" messages in the syslog when at the
> same time I see wireless packet sequences as described below. Please
> see the attached logs.
Ah, very good, thank you.
> Nov 24 13:49:11 alix kernel: wlan0: STA 00:22:41:91:8e:96 aid 1 send
> PS frame since STA not sleeping anymore
> Nov 24 13:49:11 alix kernel: phy0: dropped TX filtered frame,
> queue_len=0 PS=0 @135594
That confirms my suspicion that something is going on with p54.
Christian, can you review the comment in net/mac80211/main.c around line
400, the one about races, wrt. p54?
> which may explain why I don't see any data packet on the air. Also
> attached is hostapd log.
Indeed.
> Another thing is the fact that we see 24(!) consecutive beacons with
> TIM bit set until the iPod finally wakes up. It seems to be lying to
> the AP, as it advertises a listen_interval of only 10. May this cause
> the above?
That is another issue, yes, but probably not causing it, we'd be seeing
"expired buffered frame" messages if it was, I think.
johannes
- Compiled verbose and powersave debugging into the kernel
- Using hostapd 0.6.6 with listen_interval fix
- Moved my monitoring device to the machine acting as AP so timestamps
are synced between the various logs
I see "dropped TX filtered frame" messages in the syslog when at the
same time I see wireless packet sequences as described below. Please
see the attached logs.
Capture:
- frame 1158: my client does a http request
- frame 1168: iPod goes to sleep
- frame 1170: many beacons (with TIM bit set after some time)
- frame 1199: iPod wakes up (presumably answering the "traffic for you" beacons)
- ??? MISSING DATA ???
- frame 1201: beacons with no TIM bit set (so data got sent or dropped?)
- frame 1204: iPod goes to sleep again
In the syslog we find at the corresponding time:
Nov 24 13:49:11 alix kernel: wlan0: STA 00:22:41:91:8e:96 aid 1 send
PS frame since STA not sleeping anymore
Nov 24 13:49:11 alix kernel: phy0: dropped TX filtered frame,
queue_len=0 PS=0 @135594
Nov 24 13:49:11 alix kernel: wlan0: STA 00:22:41:91:8e:96 aid 1 enters
power save mode
which may explain why I don't see any data packet on the air. Also
attached is hostapd log.
This repeats several times until the final "http OK" which only
succeeds because I have grossly extended the timeout of my app.
However, the delay of several seconds between http request and server
response is intended and normal.
I have numerous occurrences of this sequence, easy to reproduce.
Another thing is the fact that we see 24(!) consecutive beacons with
TIM bit set until the iPod finally wakes up. It seems to be lying to
the AP, as it advertises a listen_interval of only 10. May this cause
the above?
In case better debug output is needed or you need me to try out
something, please shout.
Best regards,
Stefan.
2008/11/21 Stefan Steuerwald <[email protected]>:
> I got another one to lay out before you for comment:
>
> Setup:
> - XG-601 card (ISL3880) with p54pci driver in AP mode
> - iPod Touch 2G (firmware 2.1.1) as the only client
>
> The client does an http request and waits for an answer. In this case,
> the web server answers, but the answer seemingly is never sent over
> the air.
>
> Please see the attached pcap:
> - frame 7: http request
> - frame 13: iPod goes to sleep
> - frame 23: AP beacon indicates traffic for iPod (AID 1 in TIM)
> - frame 25: iPod wakes up
> - in between: MISSING DATA ???
> - frame 27: AP beacon with no traffic indicated ???
> - frame 29: iPod goes to sleep again
> - subsequent frames: repetitions of this, until the TCP connection is closed
>
> My understanding is that the AP would not indicate any traffic without
> actually having some ready to send? Wrong assumption?
>
> Both the iPod and the monitoring device seem to agree the http reply
> is not there. This would be a good reason for the application-level
> timeouts I'm seeing.
>
> Best wishes,
> Stefan.
>
>
> See also: previous conversation about app-level timeouts and powersave:
> http://article.gmane.org/gmane.linux.kernel.wireless.general/24278
>
Thanks again Christian and Johannes,
from a first quick check
- set-and-clear.diff doesn't seem to change anything
- both patches together freeze my system
I don't have a serial console on this embedded thing, so I don't know
its death poem yet. Let me set up my debug environment properly and
report back to you.
Stefan.
2008/11/24 Christian Lamparter <[email protected]>:
> On Monday 24 November 2008 14:37:54 Johannes Berg wrote:
>> On Fri, 2008-11-21 at 15:12 +0100, Stefan Steuerwald wrote:
>>
>> > - frame 7: http request
>> > - frame 13: iPod goes to sleep
>> > - frame 23: AP beacon indicates traffic for iPod (AID 1 in TIM)
>>
>> in 24 too, the ipod probably didn't see the beacon in frame 23 even
>> though 23 was a dtim beacon (which is a bit odd, but maybe the ipod
>> doesn't care about mcast at this point)
>>
>> > - frame 25: iPod wakes up
>>
>> 26: ack from the AP
>>
>> > - in between: MISSING DATA ???
>> > - frame 27: AP beacon with no traffic indicated ???
>> > - frame 29: iPod goes to sleep again
>> > - subsequent frames: repetitions of this, until the TCP connection is closed
>> >
>> > My understanding is that the AP would not indicate any traffic without
>> > actually having some ready to send? Wrong assumption?
>>
>> Indeed. Christian, is it possible that the p54 device is actually
>> filtering these frames? I'm pretty sure mac80211 behaves correctly, and
>> it unsetting the TIM bit means that it must no longer have traffic
>> buffered.
> As far as I know it works like this:
> If a frame with a the PS-Bit in the FC set is received, the firmware
> will mark the source mac / aid as "sleeping". And every frame from
> this moment on for this device will be buffered.
>
> To remove the "mark" again, the driver has to call p54_sta_unlock.
> And the firmware will send out all buffered frames.
> Or if we only need for one frame (e.g: probe resp) we have a tx_flag (_NO_CANCEL)
>
> If for some reason the "mark" doesn't get removed the firmware will eventually timeout
> the stuck frame and sets a the P54_TX_PSM_CANCELLED flag in tx_status.
> And we pass this on to the mac with IEEE80211_TX_STAT_TX_FILTERED.
>
> one thing: p54 reports the tx_status through the rx-ring-buffer.
> so I hope there's no rx/tx race here since everything is in the same "boat" here.
>
> based on that:
> I made two different patches to address the problem.
>
> one fiddles with mac80211 only (set-and-clear.diff).
> It assumes that if a station comes out of PS, we have to set
> the CLEAR_PS_FILT on the same time we clear the WLAN_STA_PS.
>
> the other one is only for p54 (p54-sta-flags.diff)... Doesn't do very much,
> it just checks if the CLEAR_PS_FILT is set and then sets the
> NO_CANCEL flag on that frame, so the firmware won't buffer it.
>
> Of course you can test both patches on the same time, if it doesn't help.
>
> And finally of course, I could be totally wrong and this is all nothing but useless gibberish.
>
> Regards,
> Chr
>
> BTW: I couldn't test the patches, so it may OOps
>