Hi,
On 20 October 2012 19:32, Thomas Pedersen <[email protected]> wrote:
> Hi list,
>
> We're seeing some issue with reported mactime by the ath9k_htc (AR9271)
> card. Here is a mesh node sampling peer beacons:
>
[snip]
> The number after "at" is rx_status->mactime, "now" is drv_get_tsf(), and
> offset (ie. stack delay) is mactime - tsf. RX_FLAG_MACTIME_MPDU is on for
> all frames, and mesh sync is not adjusting the TSF.
>
> How can we possibly receive frames from the future!?
>
1) Are you correctly merging the TSF32 values? Are you comparing TSF32
to TSF64, are you converting 32->64, etc? TSF32 wraps pretty quickly
and there's some kooky logic needed to make sure your TSF64 calculated
value is correct.
2) Are all your mesh nodes synchronised? Why don't you print out the
timestamp inside the beacon frame as well, and compare whether they're
all in reasonable sync?
Adrian
On Thu, Oct 25, 2012 at 1:42 PM, Adrian Chadd <[email protected]> wrote:
> On 24 October 2012 18:54, Adrian Chadd <[email protected]> wrote:
>
>> Yup, found it. The firmware TSF code was treating the TSF like it was
>> an older pre-11n NIC and only dealing with the low handful of bits.
>> Thomas has verified my test image works.
>>
>> I'll go and see about pushing this into the tree so Sujith's next
>> ath9k_htc drop will include this fix.
>>
>> Thanks for the debugging info Thomas!
>
> Thomas, is the TSF in the RX status descriptor actually 32 bits, or is
> this a printing error?
Looking at the printk again, I'm pretty sure I'm just an idiot and the
printed value is being truncated to 32 bits. Will test and verify the
TSF wraps correctly again.
Thomas
On Wed, Oct 24, 2012 at 02:26:05PM -0700, Adrian Chadd wrote:
> On 24 October 2012 14:22, Adrian Chadd <[email protected]> wrote:
> > Hi,
> >
> > can you please repeat this experiment but dump the TSF32/TSF64 fields
> > there as hexadecimal, rather than decimal?
> >
> > I'd like to see if this is an example of "bad TSF32->TSF64 promotion".
>
> Also, we should check to see whether the MAC is paying attention to
> any beacon frames
How? For mesh mode support we currently just reuse the WDS AP firmware vif
opmode.
> and updating _its_ timestamp using that beacon frame timer. That's
> one of the mesh-and-sta-and-ap-at-the-same-time problems - in
> STA mode, the TSF gets fudged based on the AP TSF. In other modes this
> doesn't necessarily hold.
OK, but the TSF for AP shouldn't be automatically updated on other
beacons, right?
Thanks again,
Thomas
On 24 October 2012 16:33, Adrian Chadd <[email protected]> wrote:
> So hm, now you've made me open the ath9k_htc firmware source. I hate you.
>
> Let me look.
Yup, found it. The firmware TSF code was treating the TSF like it was
an older pre-11n NIC and only dealing with the low handful of bits.
Thomas has verified my test image works.
I'll go and see about pushing this into the tree so Sujith's next
ath9k_htc drop will include this fix.
Thanks for the debugging info Thomas!
Adrian
Hi,
can you please repeat this experiment but dump the TSF32/TSF64 fields
there as hexadecimal, rather than decimal?
I'd like to see if this is an example of "bad TSF32->TSF64 promotion".
adrian
On Thu, Oct 25, 2012 at 03:27:39PM -0700, Adrian Chadd wrote:
> On 25 October 2012 15:25, Thomas Pedersen <[email protected]> wrote:
>
> >> Thomas, is the TSF in the RX status descriptor actually 32 bits, or is
> >> this a printing error?
> >
> > Looking at the printk again, I'm pretty sure I'm just an idiot and the
> > printed value is being truncated to 32 bits. Will test and verify the
> > TSF wraps correctly again.
Yep, the TSF actually wraps to 64 bits correctly :|
[ 4394.156056] got beacon with ts ABA184D3 from 00:03:7f:10:4e:0a at FEF0D969, now: FEF0DC84. offset == -795
[ 4395.723180] got beacon with ts 1020E8597 from 00:03:7f:10:4e:1a at FF08C20E, now: FF08C526. offset == -792
[ 4397.771811] got beacon with ts 1022DC649 from 00:03:7f:10:4e:1a at FF2802BC, now: FF280671. offset == -949
[ 4406.446193] got beacon with ts AC5D0568 from 00:03:7f:10:4e:0a at FFAC5A30, now: FFAC5D68. offset == -824
[ 4412.109937] got beacon with ts 103088637 from 00:03:7f:10:4e:1a at 10002C292, now: 10002C5F0. offset == -862
[ 4416.205946] got beacon with ts 10347041B from 00:03:7f:10:4e:1a at 100414070, now: 100414376. offset == -774
[ 4418.253827] got beacon with ts 103664224 from 00:03:7f:10:4e:1a at 100607E74, now: 1006081C7. offset == -851
[ 4418.735826] got beacon with ts AD188414 from 00:03:7f:10:4e:0a at 10067D90F, now: 10067DC4B. offset == -828
Thanks,
Thomas
On 25 October 2012 15:25, Thomas Pedersen <[email protected]> wrote:
>> Thomas, is the TSF in the RX status descriptor actually 32 bits, or is
>> this a printing error?
>
> Looking at the printk again, I'm pretty sure I'm just an idiot and the
> printed value is being truncated to 32 bits. Will test and verify the
> TSF wraps correctly again.
Thanks.
There's a few other TSF tweaks that could go in after this - to make
things more correct around the wrap boundary.
Adrian
On Wed, Oct 24, 2012 at 02:22:51PM -0700, Adrian Chadd wrote:
> Hi,
>
> can you please repeat this experiment but dump the TSF32/TSF64 fields
> there as hexadecimal, rather than decimal?
Sure.
> I'd like to see if this is an example of "bad TSF32->TSF64 promotion".
Doesn't look like it:
[ 147.074120] got beacon with ts 8CA003E9 from 00:03:7f:10:4e:1a at 477E063, now: 4746393. offset == 228560
[ 147.086876] got beacon with ts 4A381D8 from 00:03:7f:10:4e:24 at 477E5D6, now: 474956B. offset == 217195
[ 147.124375] got beacon with ts 4A38B4C from 00:03:7f:10:4d:d7 at 475A460, now: 47527D4. offset == 31884
[ 148.122624] got beacon with ts 4B32FAB from 00:03:7f:10:4e:0c at 487DF1F, now: 48462A1. offset == 228478
[ 148.152501] got beacon with ts 4B321F4 from 00:03:7f:10:4e:19 at 484D3A3, now: 484D750. offset == -941
[ 149.003626] got beacon with ts 4556180 from 64:70:02:0a:4e:a8 at 491CF79, now: 491D38B. offset == -1042
[ 150.159007] got beacon with ts 4D26D99 from 00:03:7f:10:4e:09 at 4A3EE9A, now: 4A37231. offset == 31849
[ 150.480503] got beacon with ts 8D1D045B from 00:03:7f:10:4e:11 at 4AFD5EC, now: 4A859D7. offset == 490517
[ 151.053502] got beacon with ts 474A653 from 64:70:02:0a:4e:a8 at 4B19427, now: 4B117C9. offset == 31838
[ 151.171133] got beacon with ts 8CDE845F from 00:03:7f:10:4e:1a at 4B2DFC6, now: 4B2E32B. offset == -869
[ 151.183751] got beacon with ts 4E20D1E from 00:03:7f:10:4e:24 at 4B2EF59, now: 4B3147D. offset == -9508
[ 151.224256] got beacon with ts 4E217D8 from 00:03:7f:10:4d:d7 at 4B3AF3C, now: 4B3B2A9. offset == -877
[ 151.248882] got beacon with ts 4E22473 from 00:03:7f:10:4e:2a at 4B78FA7, now: 4B412D9. offset == 228558
[ 152.216627] got beacon with ts 4F1A579 from 00:03:7f:10:4e:0c at 4C2D336, now: 4C2D685. offset == -847
[ 152.250129] got beacon with ts 4F1A5DE from 00:03:7f:10:4e:19 at 4C3D5DB, now: 4C3595D. offset == 31870
[ 152.528127] got beacon with ts 8D3C41C7 from 00:03:7f:10:4e:11 at 4C793B6, now: 4C7971C. offset == -870
[ 153.134004] got beacon with ts 6978669 from 00:03:7f:10:4d:f7 at 4D0D1EF, now: 4D0D575. offset == -902
[ 153.220143] got beacon with ts 8CFDC6F1 from 00:03:7f:10:4e:1a at 4D3A2B7, now: 4D225DB. offset == 97500
[ 154.256008] got beacon with ts 510EF18 from 00:03:7f:10:4e:09 at 4E1EE8D, now: 4E1F1E2. offset == -853
[ 154.582264] got beacon with ts 8D5B9700 from 00:03:7f:10:4e:11 at 4E6E883, now: 4E6EC38. offset == -949
[ 155.149631] got beacon with ts 4B32180 from 64:70:02:0a:4e:a8 at 4EF909C, now: 4EF9410. offset == -884
[ 155.253254] got beacon with ts 5208441 from 00:03:7f:10:4d:e7 at 4F1A58F, now: 4F128C8. offset == 31943
[ 155.267500] got beacon with ts 8D1D01C4 from 00:03:7f:10:4e:1a at 4F1D980, now: 4F15DC5. offset == 31675
[ 155.280261] got beacon with ts 52082F1 from 00:03:7f:10:4e:24 at 4F1E0E7, now: 4F18FA5. offset == 20802
The TSF counter hasn't even wrapped 32 bits yet.
Thomas
Hi Adrian,
Thanks for your help.
On Mon, Oct 22, 2012 at 7:31 AM, Adrian Chadd <[email protected]> wrote:
> [snip]
>
>> The number after "at" is rx_status->mactime, "now" is drv_get_tsf(),
>> and offset (ie. stack delay) is mactime - tsf. RX_FLAG_MACTIME_MPDU
>> is on for all frames, and mesh sync is not adjusting the TSF.
>>
>> How can we possibly receive frames from the future!?
>>
>
> 1) Are you correctly merging the TSF32 values? Are you comparing TSF32
> to TSF64, are you converting 32->64, etc? TSF32 wraps pretty quickly
> and there's some kooky logic needed to make sure your TSF64 calculated
> value is correct.
It looks like the mactime reported by ath9k_htc already comes as a 64
bit value straight from the firmware in
(ath9k_htc_rx_status)->rs_timestamp. All variables involved are u64.
> 2) Are all your mesh nodes synchronised? Why don't you print out the
> timestamp inside the beacon frame as well, and compare whether they're
> all in reasonable sync?
I think the sync code currently just tries to account for clock drift.
Unfortunately this too is relying on an accurate mactime reading, so
bogus mactimes are interpreted as large clock drift :(
Some more output from a single peer:
[ 721.451551] got beacon with ts 641025272 from 64:70:02:0a:4e:d1 at 651540784, now is: 651508959. offset == 31825
[ 723.498926] got beacon with ts 643072384 from 64:70:02:0a:4e:d1 at 653784502, now is: 653556017. offset == 228485
[ 725.547303] got beacon with ts 645120384 from 64:70:02:0a:4e:d1 at 655603123, now is: 655604082. offset == -959
[ 731.692319] got beacon with ts 651264414 from 64:70:02:0a:4e:d1 at 661779913, now is: 661748125. offset == 31788
[ 733.741433] got beacon with ts 653313270 from 64:70:02:0a:4e:d1 at 663795998, now is: 663796927. offset == -929
[ 735.788803] got beacon with ts 655360384 from 64:70:02:0a:4e:d1 at 665843110, now is: 665843976. offset == -866
[ 739.885435] got beacon with ts 659456384 from 64:70:02:0a:4e:d1 at 670037411, now is: 669939978. offset == 97433
[ 741.933810] got beacon with ts 661504529 from 64:70:02:0a:4e:d1 at 671987247, now is: 671988032. offset == -785
[ 746.030328] got beacon with ts 665600384 from 64:70:02:0a:4e:d1 at 676312473, now is: 676083906. offset == 228567
[ 748.078714] got beacon with ts 667648384 from 64:70:02:0a:4e:d1 at 678163865, now is: 678131958. offset == 31907
[ 752.187941] got beacon with ts 671756958 from 64:70:02:0a:4e:d1 at 682337969, now is: 682240565. offset == 97404
[ 754.223820] got beacon with ts 673792384 from 64:70:02:0a:4e:d1 at 684307855, now is: 684276125. offset == 31730
[ 756.271944] got beacon with ts 675840384 from 64:70:02:0a:4e:d1 at 686814606, now is: 686323937. offset == 490669
[ 758.321077] got beacon with ts 677889131 from 64:70:02:0a:4e:d1 at 688371831, now is: 688372745. offset == -914
[ 760.369198] got beacon with ts 679937030 from 64:70:02:0a:4e:d1 at 690419726, now is: 690420550. offset == -824
[ 764.465200] got beacon with ts 684032384 from 64:70:02:0a:4e:d1 at 694547844, now is: 694515914. offset == 31930
[ 766.513576] got beacon with ts 686080384 from 64:70:02:0a:4e:d1 at 696563073, now is: 696563975. offset == -902
[ 768.561948] got beacon with ts 688128512 from 64:70:02:0a:4e:d1 at 698611198, now is: 698612029. offset == -831
[ 772.658704] got beacon with ts 692224497 from 64:70:02:0a:4e:d1 at 702805482, now is: 702708141. offset == 97341
Again, offset here is "rx->mactime - drv_get_tsf()", which should never be
positive. Here are some interesting numbers:
[ 1234.229588] got beacon with ts 8429161802 from 00:03:7f:10:4d:d7 at 1164435178, now is: 1164207083. offset == 228095
[ 1234.247839] got beacon with ts 8429159489 from 00:03:7f:10:4e:2a at 1164224481, now is: 1164225454. offset == -973
[ 1234.556464] got beacon with ts 1181697231 from 00:03:7f:10:4d:f7 at 1164565957, now is: 1164534031. offset == 31926
[ 1234.587093] got beacon with ts 8429573388 from 00:03:7f:10:4e:24 at 1164563800, now is: 1164564649. offset == -849
[ 1234.637841] got beacon with ts 8429570178 from 00:03:7f:10:4d:d7 at 1164614178, now is: 1164615399. offset == -1221
[ 1234.962592] got beacon with ts 8429978558 from 00:03:7f:10:4e:31 at 1164939247, now is: 1164940096. offset == -849
[ 1234.992221] got beacon with ts 8429978502 from 00:03:7f:10:4e:24 at 1165984724, now is: 1164969719. offset == 1015005
[ 1235.046093] got beacon with ts 8429978601 from 00:03:7f:10:4d:d7 at 1165022604, now is: 1165023584. offset == -980
Notice rx->mactime is not always steadily increasing, but the "future" frames
have sudden jumps in time. Frames with more moderate offsets seem to follow a
regular pattern, though.
Thomas
On 25 October 2012 17:41, Thomas Pedersen <[email protected]> wrote:
> Yep, the TSF actually wraps to 64 bits correctly :|
>
> [ 4394.156056] got beacon with ts ABA184D3 from 00:03:7f:10:4e:0a at FEF0D969, now: FEF0DC84. offset == -795
> [ 4395.723180] got beacon with ts 1020E8597 from 00:03:7f:10:4e:1a at FF08C20E, now: FF08C526. offset == -792
> [ 4397.771811] got beacon with ts 1022DC649 from 00:03:7f:10:4e:1a at FF2802BC, now: FF280671. offset == -949
> [ 4406.446193] got beacon with ts AC5D0568 from 00:03:7f:10:4e:0a at FFAC5A30, now: FFAC5D68. offset == -824
> [ 4412.109937] got beacon with ts 103088637 from 00:03:7f:10:4e:1a at 10002C292, now: 10002C5F0. offset == -862
> [ 4416.205946] got beacon with ts 10347041B from 00:03:7f:10:4e:1a at 100414070, now: 100414376. offset == -774
> [ 4418.253827] got beacon with ts 103664224 from 00:03:7f:10:4e:1a at 100607E74, now: 1006081C7. offset == -851
> [ 4418.735826] got beacon with ts AD188414 from 00:03:7f:10:4e:0a at 10067D90F, now: 10067DC4B. offset == -828
Sweet, thanks for verifying that.
Adrian
On 24 October 2012 18:54, Adrian Chadd <[email protected]> wrote:
> Yup, found it. The firmware TSF code was treating the TSF like it was
> an older pre-11n NIC and only dealing with the low handful of bits.
> Thomas has verified my test image works.
>
> I'll go and see about pushing this into the tree so Sujith's next
> ath9k_htc drop will include this fix.
>
> Thanks for the debugging info Thomas!
Thomas, is the TSF in the RX status descriptor actually 32 bits, or is
this a printing error?
The firmware side seems to be populating it with a 64 bit TSF value
and the rx_status field in question is actually 64 bits. Hence why I'm
confused.
Adrian
On 24 October 2012 18:54, Adrian Chadd <[email protected]> wrote:
> I'll go and see about pushing this into the tree so Sujith's next
> ath9k_htc drop will include this fix.
>
> Thanks for the debugging info Thomas!
I've just pushed the fix into the tree. I'll privately email you
another image to test with just to make sure my committed fix is
correct.
The next release will have this fixed.
Adrian
On 24 October 2012 15:08, Thomas Pedersen <[email protected]> wrote:
>> Also, we should check to see whether the MAC is paying attention to
>> any beacon frames
>
> How? For mesh mode support we currently just reuse the WDS AP firmware vif
> opmode.
>
>> and updating _its_ timestamp using that beacon frame timer. That's
>> one of the mesh-and-sta-and-ap-at-the-same-time problems - in
>> STA mode, the TSF gets fudged based on the AP TSF. In other modes this
>> doesn't necessarily hold.
>
> OK, but the TSF for AP shouldn't be automatically updated on other
> beacons, right?
Right but look at those AP macs, there's multiple overlapping TSFs there, right?
* Each MAC AP has an incrementing TSF;
* Multiple APs have different TSF spaces, they're not synchronised at all;
* There's one RX TSF namespace - if the vif is in AP mode then it's
free-running and not synchronising against any other TSF;
* Thus the above makes sense to me.
So far the beacon frames look correct and they're increasing; you're
not obviously synchronising TSF against anything.
Going back to the original question - you fetch TSF after you receive
the packet, and sometimes the TSF in the frame is greater than the TSF
read back by the ath9k TSF get method.
The ath9k_htc tsf read is just 'tsf = ath9k_hw_gettsf64(priv->ah);',
so it's doing direct register reads. That should be immediate, but
it's USB so it may take a while.
So hm, now you've made me open the ath9k_htc firmware source. I hate you.
Let me look.
Adrian
On 24 October 2012 14:22, Adrian Chadd <[email protected]> wrote:
> Hi,
>
> can you please repeat this experiment but dump the TSF32/TSF64 fields
> there as hexadecimal, rather than decimal?
>
> I'd like to see if this is an example of "bad TSF32->TSF64 promotion".
Also, we should check to see whether the MAC is paying attention to
any beacon frames and updating _its_ timestamp using that beacon frame
timer.
That's one of the mesh-and-sta-and-ap-at-the-same-time problems - in
STA mode, the TSF gets fudged based on the AP TSF. In other modes this
doesn't necessarily hold.
Adrian