Return-path: Received: from mail-qy0-f174.google.com ([209.85.216.174]:37481 "EHLO mail-qy0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753189Ab0LGBUb convert rfc822-to-8bit (ORCPT ); Mon, 6 Dec 2010 20:20:31 -0500 Received: by qyk11 with SMTP id 11so4331020qyk.19 for ; Mon, 06 Dec 2010 17:20:31 -0800 (PST) MIME-Version: 1.0 In-Reply-To: <201012061714.15863.br1@einfach.org> References: <201012061714.15863.br1@einfach.org> From: Jonathan Guerin Date: Tue, 7 Dec 2010 11:12:18 +1000 Message-ID: Subject: Re: ath5k: Weird Retransmission Behaviour To: Bruno Randolf Cc: linux-wireless , ath5k-devel , nbd@openwrt.org Content-Type: text/plain; charset=ISO-8859-1 Sender: linux-wireless-owner@vger.kernel.org List-ID: On Mon, Dec 6, 2010 at 6:14 PM, Bruno Randolf wrote: > On Mon December 6 2010 15:30:00 Jonathan Guerin wrote: >> Hi, >> >> >> I've been doing some investigation into the behaviour of contention >> windows and retransmissions. >> >> Firstly, I'll just describe the test scenario and setup that I have. I >> have 3 Via x86 nodes with Atheros AR5001X+ cards. They are tethered to >> each other via coaxial cables, into splitters. They have 20dB of fixed >> attenuation applied to each antenna output, plus a programmable >> variable attenuator on each link. One node acts as a sender, one as a >> receiver, and one simply runs a monitor-mode interface to capture >> packet traces. All 3 are running kernel version 2.6.37-rc2. The sender >> and receiver are configured as IBSS stations and are tuned to 5.18 >> GHz. >> >> Here's a really dodgy ASCII diagram of the setup: >> >> S-----[variable attenuator]-----R >> >> >> >> +------------M-------------------------+ >> >> where S is the Sender node, R is the Receiver node and M is the >> Monitoring capture node. >> >> >> Secondly, I have written a program which will parse a captured pcap >> file from the Monitoring station. It looks for 'chains' of frames with >> the same sequence number, and where the first frame has the Retry bit >> set to false in the header and all following have it set to true. Any >> deviation from this, and the program drops the current chain without >> including it in its stats, and looks for the next chain matching these >> requirements. It averages the amount of time per transmission number >> (i.e. the average of all transmissions which were the first, second, >> third etc. for a unique sequence number). The transmission time of a >> frame is the amount of time between the end of the frame and the end >> of the previous. It tracks these 'chains' of frames with the same >> sequence number. It considers the last transmission number in each >> chain as the 'final' transmission. >> >> Finally, the link is loaded using a saturated UDP flow, and the data >> rate is fixed to 54M and 36M. This is specified in the output. The >> output is attached below. >> >> The output describes the fixed link data rate, the variable >> attenuator's value, the delivery ratio, and the number of transmitted >> packets/s. I've added a discussion per result set. Each line outputs >> the transmission number, the average transmission time for this >> number, the total number of transmissions, the number of frames which >> ended their transmissions at this number (i.e. where the chain ended >> its final transmission - this is equivalent to the retransmission >> value from the Radiotap header + 1), and the average expected >> transmission time for all that particular transmission number in all >> chains. This is calculated using the airtime calculations from the >> 802.11a standard, with the receipt of an ACK frame, as well as a SIFS >> (16us), which is 28us. If the transmission did not receive an ACK, a >> normal ACK timeout is 50 us, but ath5k appears to have this set to 25 >> us, so the value shouldn't be too far out what to expect. >> >> The header to each result refers to the rate it was fixed at, as well >> as the variable attenuation being added to it. The link also has a >> fixed 40dB of attenuation both to protect the cards, as well as give >> the necessary range for the variable attenuator to control link >> quality. >> >> ==> iperf_33M_rate_36M_att_1dB.pcap.txt <== (good link, 100% delivery) >> Average time per TX No: >> TXNo ?Avg ? ? ? ? ? ? ? ? ? ? No ? ? ? ? ? ? ?Final ? ? ? ? ? ExpectedAvg >> 1 ? ? ? ? ? ? 477.604980 ? ? ?10463 ? 10462 ? ? ? ? ? 509 >> Overall average: 477.604980 >> >> [Discussion:] Nothing, appears normal. >> >> >> ==> iperf_33M_rate_36M_att_18dB.pcap.txt <== (lossy link, but still >> 100% delivery) >> Average time per TX No: >> TXNo ?Avg ? ? ? ? ? ? ? ? ? ? No ? ? ? ? ? ? ?Final ? ? ? ? ? ExpectedAvg >> 1 ? ? ? ? ? ? 476.966766 ? ? ?9808 ? ? ? ? ? ?8138 ? ?509 >> 2 ? ? ? ? ? ? 550.320496 ? ? ?1663 ? ? ? ? ? ?1403 ? ?581 >> 3 ? ? ? ? ? ? 697.552917 ? ? ?255 ? ? ? ? ? ? 218 ? ? 725 >> 4 ? ? ? ? ? ? 1028.756714 ? ? 37 ? ? ? ? ? ? ?30 ? ? ? ? ? ? ?1013 >> 5 ? ? ? ? ? ? 1603.428589 ? ? 7 ? ? ? ? ? ? ? 7 ? ? ? ? ? ? ? 1589 >> Overall average: 494.514618 >> >> [Discussion:] Nothing, appears normal. Contention window appears to >> double normally. >> >> ==> iperf_33M_rate_36M_att_19dB.pcap.txt <== (lossy link, but still >> 100% delivery) >> Average time per TX No: >> TXNo ?Avg ? ? ? ? ? ? ? ? ? ? No ? ? ? ? ? ? ?Final ? ? ? ? ? ExpectedAvg >> 1 ? ? ? ? ? ? 477.510437 ? ? ?14893 ? 8653 ? ?509 >> 2 ? ? ? ? ? ? 546.149048 ? ? ?6205 ? ? ? ? ? ?3624 ? ?581 >> 3 ? ? ? ? ? ? 692.270203 ? ? ?2561 ? ? ? ? ? ?1552 ? ?725 >> 4 ? ? ? ? ? ? 980.565857 ? ? ?1002 ? ? ? ? ? ?596 ? ? 1013 >> 5 ? ? ? ? ? ? 1542.079956 ? ? 400 ? ? ? ? ? ? 252 ? ? 1589 >> 6 ? ? ? ? ? ? 2758.693848 ? ? 147 ? ? ? ? ? ? 89 ? ? ? ? ? ? ?2741 >> 7 ? ? ? ? ? ? 4971.500000 ? ? 56 ? ? ? ? ? ? ?32 ? ? ? ? ? ? ?5045 >> 8 ? ? ? ? ? ? 4689.043457 ? ? 23 ? ? ? ? ? ? ?15 ? ? ? ? ? ? ?5045 >> 9 ? ? ? ? ? ? 4487.856934 ? ? 7 ? ? ? ? ? ? ? 3 ? ? ? ? ? ? ? 5045 >> 10 ? ? ? ? ? ?442.250000 ? ? ?4 ? ? ? ? ? ? ? 3 ? ? ? ? ? ? ? 5045 >> 11 ? ? ? ? ? ?488.000000 ? ? ?1 ? ? ? ? ? ? ? 1 ? ? ? ? ? ? ? 5045 >> Overall average: 580.976807 >> >> [Discussion:] Contention window appears to double until a plateau from >> 7 through 9. Weirdly, the contention window appears to be drop again >> from 10, but >> there are too few frames to draw a conclusion. >> >> ==> iperf_33M_rate_36M_att_21dB.pcap.txt <== (lossy link, < 1% delivery) >> TXNo ?Avg ? ? ? ? ? ? ? ? ? ? No ? ? ? ? ? ? ?Final ? ExpectedAvg >> 1 ? ? ? ? ? ? 485.390198 ? ? ?1940 ? ? ? ? ? ?3 ? ? ? ? ?509 >> 2 ? ? ? ? ? ? 479.113434 ? ? ?1922 ? ? ? ? ? ?2 ? ? ? ? ?581 >> 3 ? ? ? ? ? ? 479.681824 ? ? ?1914 ? ? ? ? ? ?0 ? ? ? ? ?725 >> 4 ? ? ? ? ? ? 485.083038 ? ? ?1903 ? ? ? ? ? ?1 ? ? ? ? ?1013 >> 5 ? ? ? ? ? ? 492.088135 ? ? ?1895 ? ? ? ? ? ?4 ? ? ? ? ?1589 >> 6 ? ? ? ? ? ? 508.322510 ? ? ?1876 ? ? ? ? ? ?1 ? ? ? ? ?2741 >> 7 ? ? ? ? ? ? 524.697876 ? ? ?1870 ? ? ? ? ? ?1 ? ? ? ? ?5045 >> 8 ? ? ? ? ? ? 543.054382 ? ? ?1857 ? ? ? ? ? ?0 ? ? ? ? ?5045 >> 9 ? ? ? ? ? ? 522.970703 ? ? ?1842 ? ? ? ? ? ?0 ? ? ? ? ?5045 >> 10 ? ? ? ? ? ?478.204132 ? ? ?1837 ? ? ? ? ? ?0 ? ? ? ? ?5045 >> 11 ? ? ? ? ? ?476.520782 ? ? ?1828 ? ? ? ? ? ?0 ? ? ? ? ?5045 >> 12 ? ? ? ? ? ?477.531342 ? ? ?1818 ? ? ? ? ? ?0 ? ? ? ? ?5045 >> 13 ? ? ? ? ? ?476.743652 ? ? ?1810 ? ? ? ? ? ?0 ? ? ? ? ?5045 >> 14 ? ? ? ? ? ?478.936554 ? ? ?1797 ? ? ? ? ? ?0 ? ? ? ? ?5045 >> 15 ? ? ? ? ? ?480.699097 ? ? ?1788 ? ? ? ? ? ?0 ? ? ? ? ?5045 >> 16 ? ? ? ? ? ?482.734314 ? ? ?1784 ? ? ? ? ? ?0 ? ? ? ? ?5045 >> 17 ? ? ? ? ? ?491.608459 ? ? ?1775 ? ? ? ? ? ?0 ? ? ? ? ?5045 >> 18 ? ? ? ? ? ?497.458984 ? ? ?1767 ? ? ? ? ? ?1 ? ? ? ? ?5045 >> 19 ? ? ? ? ? ?495.067932 ? ? ?1752 ? ? ? ? ? ?7 ? ? ? ? ?5045 >> 20 ? ? ? ? ? ?478.102417 ? ? ?1738 ? ? ? ? ? ?295 ? ? 5045 >> 21 ? ? ? ? ? ?475.128845 ? ? ?1436 ? ? ? ? ? ?1402 ? 5045 >> 22 ? ? ? ? ? ?492.692322 ? ? ?26 ? ? ? ? ? ? ?0 ? ? ? ? ?5045 >> 23 ? ? ? ? ? ?471.576935 ? ? ?26 ? ? ? ? ? ? ?0 ? ? ? ? ?5045 >> 24 ? ? ? ? ? ?466.884613 ? ? ?26 ? ? ? ? ? ? ?0 ? ? ? ? ?5045 >> 25 ? ? ? ? ? ?476.269226 ? ? ?26 ? ? ? ? ? ? ?0 ? ? ? ? ?5045 >> 26 ? ? ? ? ? ?462.192322 ? ? ?26 ? ? ? ? ? ? ?0 ? ? ? ? ?5045 >> 27 ? ? ? ? ? ?480.961548 ? ? ?26 ? ? ? ? ? ? ?1 ? ? ? ? ?5045 >> 28 ? ? ? ? ? ?463.600006 ? ? ?25 ? ? ? ? ? ? ?24 ? ? ? ? 5045 >> Overall average: 491.068359 >> >> [Discussion:] Contention does not appear to increase, and the number >> of transmission per frame is very large. This behaviour is replicated >> with the 54M scenario when a link is extremely lossy. >> >> ==> iperf_33M_rate_54M_att_1dB.pcap.txt <== (good link, 2400 packets/s) >> Average time per TX No: >> TXNo ?Avg ? ? ? ? ? ? ? ? ? ? No ? ? ? ? ? ? ?Final ? ? ? ? ? ExpectedAverage >> 1 ? ? ? ? ? ? 365.551849 ? ? ?23957 ? 23935 ? ? ? ? ? 393 >> 2 ? ? ? ? ? ? 409.571442 ? ? ?21 ? ? ? ? ? ? ?21 ? ? ? ? ? ? ?465 >> Overall average: 365.590424 >> >> [Discussion: ] Appears relatively normal. >> >> ==> iperf_33M_rate_54M_att_10dB.pcap.txt <== (lossy link, but still >> 100% delivery, 1500 packets/s) >> Average time per TX No: >> TXNo ?Avg ? ? ? ? ? ? ? ? ? ? No ? ? ? ? ? ? ?Final ? ? ? ? ? ExpectedAverage >> 1 ? ? ? ? ? ? 364.501190 ? ? ?10134 ? 5915 ? ?393 >> 2 ? ? ? ? ? ? 434.138000 ? ? ?4196 ? ? ? ? ? ?2461 ? ?465 >> 3 ? ? ? ? ? ? 579.482300 ? ? ?1721 ? ? ? ? ? ?1036 ? ?609 >> 4 ? ? ? ? ? ? 837.005859 ? ? ?682 ? ? ? ? ? ? 397 ? ? 897 >> 5 ? ? ? ? ? ? 1365.279175 ? ? 283 ? ? ? ? ? ? 155 ? ? 1473 >> 6 ? ? ? ? ? ? 2572.007812 ? ? 128 ? ? ? ? ? ? 81 ? ? ? ? ? ? ?2625 >> 7 ? ? ? ? ? ? 4905.195801 ? ? 46 ? ? ? ? ? ? ?27 ? ? ? ? ? ? ?4929 >> 8 ? ? ? ? ? ? 4985.947266 ? ? 19 ? ? ? ? ? ? ?12 ? ? ? ? ? ? ?4929 >> 9 ? ? ? ? ? ? 4627.285645 ? ? 7 ? ? ? ? ? ? ? 4 ? ? ? ? ? ? ? 4929 >> 10 ? ? ? ? ? ?366.000000 ? ? ?3 ? ? ? ? ? ? ? 1 ? ? ? ? ? ? ? 4929 >> 11 ? ? ? ? ? ?335.500000 ? ? ?2 ? ? ? ? ? ? ? 2 ? ? ? ? ? ? ? 4929 >> Overall average: 473.477020 >> >> [Discussion: ] Appears fine, until transmission 10, which appears to >> drop the contention window back to an equivalent first transmission >> value, but not enough frames at this point to draw a conclusion. >> >> ==> iperf_33M_rate_54M_att_11dB.pcap.txt <== (lossy link, but still >> 100% delivery, 680 packets/s) >> Average time per TX No: >> TXNo ?Avg ? ? ? ? ? ? ? ? ? ? No ? ? ? ? ? ? ?Final ? ? ? ? ? ExpectedAverage >> 1 ? ? ? ? ? ? 362.082825 ? ? ?2149 ? ? ? ? ? ?539 ? ? 393 >> 2 ? ? ? ? ? ? 434.672485 ? ? ?1606 ? ? ? ? ? ?368 ? ? 465 >> 3 ? ? ? ? ? ? 582.795288 ? ? ?1231 ? ? ? ? ? ?307 ? ? 609 >> 4 ? ? ? ? ? ? 820.347107 ? ? ?919 ? ? ? ? ? ? 237 ? ? 897 >> 5 ? ? ? ? ? ? 1424.753296 ? ? 673 ? ? ? ? ? ? 194 ? ? 1473 >> 6 ? ? ? ? ? ? 2626.403320 ? ? 466 ? ? ? ? ? ? 143 ? ? 2625 >> 7 ? ? ? ? ? ? 4734.233887 ? ? 308 ? ? ? ? ? ? 83 ? ? ? ? ? ? ?4929 >> 8 ? ? ? ? ? ? 4830.244141 ? ? 217 ? ? ? ? ? ? 65 ? ? ? ? ? ? ?4929 >> 9 ? ? ? ? ? ? 4449.702637 ? ? 148 ? ? ? ? ? ? 33 ? ? ? ? ? ? ?4929 >> 10 ? ? ? ? ? ?360.114044 ? ? ?114 ? ? ? ? ? ? 36 ? ? ? ? ? ? ?4929 >> 11 ? ? ? ? ? ?366.000000 ? ? ?78 ? ? ? ? ? ? ?20 ? ? ? ? ? ? ?4929 >> 12 ? ? ? ? ? ?460.655182 ? ? ?58 ? ? ? ? ? ? ?20 ? ? ? ? ? ? ?4929 >> 13 ? ? ? ? ? ?544.184204 ? ? ?38 ? ? ? ? ? ? ?9 ? ? ? ? ? ? ? 4929 >> 14 ? ? ? ? ? ?893.965515 ? ? ?29 ? ? ? ? ? ? ?7 ? ? ? ? ? ? ? 4929 >> 15 ? ? ? ? ? ?1361.409058 ? ? 22 ? ? ? ? ? ? ?8 ? ? ? ? ? ? ? 4929 >> 16 ? ? ? ? ? ?2675.285645 ? ? 14 ? ? ? ? ? ? ?2 ? ? ? ? ? ? ? 4929 >> 17 ? ? ? ? ? ?4239.500000 ? ? 12 ? ? ? ? ? ? ?5 ? ? ? ? ? ? ? 4929 >> 18 ? ? ? ? ? ?3198.142822 ? ? 7 ? ? ? ? ? ? ? 2 ? ? ? ? ? ? ? 4929 >> 19 ? ? ? ? ? ?5111.799805 ? ? 5 ? ? ? ? ? ? ? 3 ? ? ? ? ? ? ? 4929 >> 20 ? ? ? ? ? ?1403.000000 ? ? 2 ? ? ? ? ? ? ? 1 ? ? ? ? ? ? ? 4929 >> Overall average: 1063.129883 >> >> [Discussion: ] Everything appears fine until, once again, transmission >> 10, when the contention windows appears to 'restart' - it climbs >> steadily until 17. After this point, there are not enough frames to >> draw any conclusions. >> >> ==> iperf_33M_rate_54M_att_12dB.pcap.txt <== (lossy link, 6% delivery, >> 400 packets/s) >> Average time per TX No: >> TXNo ?Avg ? ? ? ? ? ? ? ? ? ? No ? ? ? ? ? ? ?Final ? ? ? ? ? ExpectedAvg >> 1 ? ? ? ? ? ? 360.460724 ? ? ?4482 ? ? ? ? ? ?14 ? ? ? ? ? ? ?393 >> 2 ? ? ? ? ? ? 366.068481 ? ? ?4453 ? ? ? ? ? ?16 ? ? ? ? ? ? ?465 >> 3 ? ? ? ? ? ? 360.871735 ? ? ?4413 ? ? ? ? ? ?13 ? ? ? ? ? ? ?609 >> 4 ? ? ? ? ? ? 361.535553 ? ? ?4386 ? ? ? ? ? ?18 ? ? ? ? ? ? ?897 >> 5 ? ? ? ? ? ? 367.526062 ? ? ?4357 ? ? ? ? ? ?60 ? ? ? ? ? ? ?1473 >> 6 ? ? ? ? ? ? 360.003967 ? ? ?4283 ? ? ? ? ? ?3839 ? ?2625 >> 7 ? ? ? ? ? ? 361.778046 ? ? ?419 ? ? ? ? ? ? 416 ? ? 4929 >> Overall average: 362.732910 >> >> [Discussion:] This exhibits the same problem as the extremely lossy >> 36M link - the contention window does not appear to rise. Even with >> enough frames to draw a good conclusion at transmission 6, the >> transmission time average (360) is way below the expected average >> (2625). >> ==> END OF OUTPUT <== >> >> The question here is: why does ath5k/mac80211 send out so many >> transmissions, and why does it vary so much based on link quality? >> Additionally, why does it appear to 'reset' the contention window >> after 9 retransmissions of a frame? >> >> Cheers, >> >> Jonathan > > Hi Jonathan! > > This is a very interesting setup and test. I guess nobody has looked so > closely yet... I think this is not necessarily ath5k related, but may be a bug > of mac80211 or minstrel, but not sure yet, of course... > > It's normal, that the CW is reset after the retry limits are reached, this is > what the standard says: > > "The CW shall be reset to aCWmin after every successful attempt to transmit an > MPDU or MMPDU, when SLRC reaches dot11LongRetryLimit, or when SSRC reaches > dot11ShortRetryLimit." (802.11-2007 p261) Good point, I forgot to check the standard on this! > > But it seems weird that there are so many retransmissions. The default maximum > numbers of retransmissions should be 7 for short frames and 4 for long frames > (dot11[Short|Long]RetryLimit), and this is what is set as defaults in mac80211 > (local->hw.conf.short_frame_max_tx_count). Seems we are getting many > retransmissions from minstel, i added some debug prints: > > *** txdesc tries 3 > *** mrr 0 tries 3 rate 11 > *** mrr 1 tries 3 rate 11 > *** mrr 2 tries 3 rate 11 > > This seems to be the normal case and that would already result in 12 > transmissions. > > Another thing that strikes me here is: why use multi rate retries if the rate > is all the same? (Ignore the actual value of the rate, this is the HW rate > code). > > Other examples: > > *** txdesc tries 2 > *** mrr 0 tries 9 rate 12 > *** mrr 1 tries 2 rate 13 > *** mrr 2 tries 3 rate 11 > > = 16 transmissions in sum. > > *** txdesc tries 9 > *** mrr 0 tries 3 rate 11 > *** mrr 1 tries 9 rate 8 > *** mrr 2 tries 3 rate 11 > > = 24 transmissions in sum. Again, rate[1] and rate[3] are the same, so why > bother setting it up twice? I'm not sure if you still had a fixed rate set here - and I don't know 100% how minstrel works - but it could be that minstrel is trying to do some probing for better rates (if it was set to auto-rate)? > > bruno >