Return-path:
Received: from mail.w1.fi ([212.71.239.96]:57573 "EHLO
	li674-96.members.linode.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1753108AbbBXK0Q (ORCPT );
	Tue, 24 Feb 2015 05:26:16 -0500
Date: Tue, 24 Feb 2015 12:26:11 +0200
From: Jouni Malinen
To: Andrew McGregor
Cc: Sujith Manoharan, Adrian Chadd, Linus Torvalds,
	"Luis R. Rodriguez", Kalle Valo,
	"ath9k-devel@lists.ath9k.org", Linux Wireless List
Subject: Re: AR9462 problems connecting again..
Message-ID: <20150224102611.GA30806@w1.fi> (sfid-20150224_112619_938236_74F667A8)
References: <20150223171700.GA29730@w1.fi>
	<20150223213050.GA23232@w1.fi>
	<20150223224305.GA30228@w1.fi>
	<21739.50662.902775.901924@gargle.gargle.HOWL>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
In-Reply-To:
Sender: linux-wireless-owner@vger.kernel.org
List-ID:

On Tue, Feb 24, 2015 at 01:29:27PM +1100, Andrew McGregor wrote:
> Over the weekend I found a bug in minstrel-ht that might well be
> implicated here.
>
> The last retransmit rate is meant to be a 'get the packet there
> reliably' rate; minstrel-ht doesn't do that right, and can pick a
> fairly flaky rate instead.
>
> Can't generate a proper patch right now, so this diff might not apply
> cleanly, but the fix is simply to change 75 to 99 in the two places
> below:

While this may indeed be helpful, I don't think it is sufficient for
this EAPOL frame related issue. What I would like to see is minstrel_ht
using a basic rate (something non-HT) at the end of the retry series for
EAPOL frames.

The current behavior looks very suspicious to me. The early EAPOL frames
after association are being used to probe for higher rates. This results
in the total number of retry attempts actually being smaller for them
than for any other frame, i.e., minstrel_ht seems to be using
significantly _less_ robust choices for the EAPOL frames than for the
"normal" data frames that follow! This should really be the other way
around..

As an example, I'm seeing this on the 5 GHz band (with the 75 to 99
change in place, but behavior was more or less identical without it):

- the first EAPOL frame (msg 2/4) getting one attempt at MCS 3,
  2 attempts at MCS 0, 2 attempts at MCS 0 (yes, identical to the
  previous one) with total maximum of 5 attempts
- the second EAPOL frame (msg 4/4) getting one attempt at MCS 9,
  2 attempts at MCS 0, 2 attempts at MCS 0 with total maximum of
  5 attempts
- another data frame after this: 5 attempts at MCS 9, 5 attempts at
  MCS 3, 5 attempts at MCS 3 with total maximum of 15 attempts(!!)

This cannot be the best approach here.. For the
IEEE80211_TX_CTRL_PORT_CTRL_PROTO cases, there are identified issues
where failing to deliver the frame results in significant problems,
either in getting connected in the first place or in getting
disconnected if rekeying fails.

I'm not sure how this would be implemented cleanly in minstrel_ht or
whether that is even the best place (i.e., rate.c could do this
instead), but I'd like the third entry of the retry series for control
port cases to drop to a (lowish) basic rate, and a non-MCS one at that,
since there may be some interop issues with HT MCS early during
association. Alternatively, with drivers like ath9k that support 4 rate
values, it would also be fine to add this basic rate attempt (or well,
I'd have multiple, say 4, such attempts) as an additional 4th entry,
which does not currently seem to get used with minstrel at all.

The "(lowish) basic rate" here could be defined as 6 Mbps OFDM for the
5 GHz band and either that or maybe even 2 Mbps or 5.5 Mbps on 2.4 GHz
(if included by the AP in the basic rate set).

-- 
Jouni Malinen                                            PGP id EFC895FA
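
For illustration only, the attempt counts above and the proposed
fallback can be modelled in a few lines of Python. This is a sketch of
the idea, not kernel code and not how minstrel_ht or rate.c is
structured. The names Stage, total_attempts, basic_fallback_rate and
with_basic_rate_fallback are hypothetical, and so are two policy
choices: preferring 2 Mbps over 5.5 Mbps on 2.4 GHz, and keeping the
replaced entry's attempt count when the series has no free slot.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Stage:
    """One entry of a retry series: a rate label and its attempt count."""
    rate: str
    attempts: int


def total_attempts(chain: Sequence[Stage]) -> int:
    """Maximum number of transmissions the retry series allows."""
    return sum(stage.attempts for stage in chain)


def basic_fallback_rate(band_ghz: float, basic_rates_mbps: Sequence[float]) -> str:
    """Pick the '(lowish) basic rate' described in the mail.

    5 GHz: 6 Mbps OFDM. 2.4 GHz: 2 or 5.5 Mbps if the AP lists it as a
    basic rate, else 6 Mbps OFDM. Trying 2 before 5.5 is an assumption.
    """
    if band_ghz < 5:
        for candidate in (2.0, 5.5):
            if candidate in basic_rates_mbps:
                return f"{candidate:g} Mbps"
    return "6 Mbps OFDM"


def with_basic_rate_fallback(
    chain: Sequence[Stage],
    is_control_port: bool,
    fallback_rate: str,
    max_stages: int = 4,      # ath9k-style hardware with 4 rate entries
    extra_attempts: int = 4,  # "say 4" attempts for an added entry
) -> List[Stage]:
    """Return a copy of the series that ends in a basic rate.

    Only control port (EAPOL) frames are changed. If a rate slot is
    free, the basic rate is appended as an extra entry; otherwise it
    replaces the last entry and keeps that entry's attempt count
    (an assumption, the mail does not specify this).
    """
    stages = list(chain)
    if not is_control_port or not stages:
        return stages
    if len(stages) < max_stages:
        stages.append(Stage(fallback_rate, extra_attempts))
    else:
        stages[-1] = Stage(fallback_rate, stages[-1].attempts)
    return stages


# Retry series as observed on 5 GHz in the mail above.
EAPOL_MSG_2_4 = [Stage("MCS 3", 1), Stage("MCS 0", 2), Stage("MCS 0", 2)]
EAPOL_MSG_4_4 = [Stage("MCS 9", 1), Stage("MCS 0", 2), Stage("MCS 0", 2)]
DATA_FRAME = [Stage("MCS 9", 5), Stage("MCS 3", 5), Stage("MCS 3", 5)]
```

With four rate entries available, EAPOL msg 2/4 would go out as MCS 3 x1,
MCS 0 x2, MCS 0 x2, 6 Mbps OFDM x4, i.e., up to 9 attempts instead of 5.
With only three entries (max_stages=3), the last MCS 0 entry is replaced
by the basic rate and the total stays at 5.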