Date: Fri, 15 Jun 2018 13:35:45 +0300 (EEST)
From: =?ISO-8859-15?Q?Ilpo_J=E4rvinen?=
To: Michal Kubecek
cc: Yuchung Cheng, netdev, Eric Dumazet, LKML
Subject: Re: [RFC PATCH RESEND] tcp: avoid F-RTO if SACK and timestamps are disabled
In-Reply-To: <20180615092753.lmxqh65moc33rzbq@unicorn.suse.cz>
References: <20180613164802.99B89A09E2@unicorn.suse.cz>
 <20180613165543.0F92DA09E2@unicorn.suse.cz>
 <20180614093408.5e34ijwhome4t5yn@unicorn.suse.cz>
 <20180614131801.hd474jgrhmtqzhag@unicorn.suse.cz>
 <20180615092753.lmxqh65moc33rzbq@unicorn.suse.cz>
User-Agent: Alpine 2.20 (DEB 67 2015-01-07)
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="8323329-4507525-1529057656=:29120"
X-Mailing-List: linux-kernel@vger.kernel.org

--8323329-4507525-1529057656=:29120
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: 8BIT

On Fri, 15 Jun 2018, Michal Kubecek wrote:

> On Fri, Jun 15, 2018 at 11:05:03AM +0300, Ilpo Järvinen wrote:
> > On Thu, 14 Jun 2018, Michal Kubecek wrote:
> >
> > My point was that the new data segment bursts that occur if the sender
> > isn't application limited indicate that there's something going wrong
> > with FRTO. And that wrong is also what is causing that RTO loop because
> > the sender doesn't see the previous FRTO recovery on second RTO. With
> > my FRTO undo fix, (new_recovery || icsk->icsk_retransmits) will be false
> > and that will prevent the RTO loop.
>
> Yes, it would prevent the loop in this case (except it would be a bit
> later, after second RTO rather than after first).

Hmm, I'm actually wrong about the new data missing bit I think. After
reading more code I'm quite sure conventional RTO recovery is triggered
right away (as long as that bogus undo that ends the recovery wouldn't
occur first like it does without my fix). So no second RTO would be
needed.

> But I'm not convinced
> the logic of the patch is correct. If I understand it correctly, it
> essentially changes "presumption of innocence" (if we get an ack past
> what we retransmitted, we assume it was received earlier - i.e. would
> have been sacked before if SACK was in use) to "presumption of guilt"
> (whenever a retransmitted segment is acked, we assume nothing else acked
> with it was received earlier). Or that you trade false negatives for
> false positives.

FRTO depends on knowing for sure what packet (original pre-RTO one or
something that was transmitted post-RTO) triggered the ACK. If FRTO
isn't sure that the ACK was generated by a post-RTO packet, it must not
assume innocence!

This change in practice affects just the time while the segment
rexmitted by RTO is still there, that is, processing in step 2 (if we
get a cumulative ACK beyond it because the next loss is not for the
subsequent segment but for some later segment, FLAG_ORIG_SACK_ACKED is
set and we'll incorrectly do step 3b while FRTO has only reached step 2
for real; this is fixed by my patch). ...The decision about
positive/negative only occurs _after_ that in step 3.

> Maybe I understand it wrong but it seems that you de facto prevent
> Step (3b) from ever happening in non-SACK case.

Only if any of the skbs that were ACKed had been retransmitted. There
shouldn't be any in step 3 because the RTO rexmit was ACKed (and also
because of how the new_recovery variable works wrt. earlier recoveries).
Thus, in step 3 the fix won't clear the FLAG_ORIG_SACK_ACKED flag,
allowing FRTO to detect spurious RTOs using step 3b.

> > > > No! The window should not update window on ACKs the receiver intends to
> > > > designate as "duplicate ACKs". That is not without some potential cost
> > > > though as it requires delaying window updates up to the next cumulative
> > > > ACK. In the non-SACK series one of the changes is fixing this for
> > > > non-SACK Linux TCP flows.
> > >
> > > That sounds like a reasonable change (at least at the first glance,
> > > I didn't think about it too deeply) but even if we fix Linux stack to
> > > behave like this, we cannot force everyone else to do the same.
> >
> > Unfortunately I don't know what the other stacks besides Linux do. But
> > for Linux, the cause for the changing receiver window is the receiver
> > window auto-tuning and I'm not sure if other stacks have a similar
> > feature (or if that affects (almost) all ACKs like in Linux).
>
> The capture from my previous e-mail and some others I have seen indicate
> that at least some implementations do not take care to never change
> window size when responding to an out-of-order segment. That means that
> even if we change linux TCP this way (we might still need to send
> a separate window update in some cases), we still cannot rely on others
> doing the same.

Those implementations ignore what is a duplicate ACK (RFC5681, which
RFC5682 also points to for its definition):

   DUPLICATE ACKNOWLEDGMENT: An acknowledgment is considered a
      "duplicate" ... (e) the advertised window in the incoming
      acknowledgment equals the advertised window in the last
      incoming acknowledgment.

Not sending duplicate ACKs also means that fast recovery will not work
for those flows, but that may not show up performance-wise as long as
you have enough capacity for any unnecessary rexmits the forced RTO
recovery is going to do. RTO recovery may actually improve completion
times for non-SACK flows, as NewReno recovers only one lost pkt/RTT
whereas RTO recovery needs log(outstanding packets) RTTs at worst. For
a large number of losses in a window, the log is going to win.

> I checked the capture attached to my previous e-mail again and there is
> one thing where our F-RTO implementation (in 4.4, at least) is wrong,
> IMHO. While the first ACK after "new data" (sent in (2b)) was a window
> update (and therefore not dupack by definition) so that we could take
> neither (3a) nor (3b), in some iterations there were further acks which
> did not change window size. The text just before Step 1 says
>
>   The F-RTO algorithm does not specify actions for receiving
>   a segment that neither acknowledges new data nor is a duplicate
>   acknowledgment. The TCP sender SHOULD ignore such segments and
>   wait for a segment that either acknowledges new data or is
>   a duplicate acknowledgment.
>
> My understanding is that this means that while the first ack after new
> data is correctly ignored, the following ack which preserves window size
> should be recognized as a dupack and (3a) should be taken.

Linux FRTO never gets that far (without my fix) if the ACK in step 2
covers beyond the RTO rexmit, because 3b is prematurely invoked; that's
why you never see what would occur if 3a is taken. TCP thinks it's not
recovering anymore and therefore can send only new data (if there's
some available). This is what I tried to say earlier: with new data
there, you can see that something else is wrong with FRTO too, besides
the RTO loop.

-- 
 i.

--8323329-4507525-1529057656=:29120--
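
Editorial illustration of the dupack discussion above: a minimal Python
sketch of how a sender could sort incoming ACKs into the three cases the
thread distinguishes (acknowledges new data, duplicate ACK per the
RFC5681 definition, or neither, which the F-RTO text says to ignore).
This is not the Linux implementation. The names Ack, SenderState and
classify_ack are hypothetical, conditions (a) to (d) are paraphrased
from RFC5681 rather than quoted in this message, and sequence-number
wraparound is deliberately not handled.

```python
from dataclasses import dataclass


@dataclass
class Ack:
    """Hypothetical, simplified view of an incoming ACK segment."""
    ack_no: int            # cumulative acknowledgment number
    window: int            # advertised receive window
    has_data: bool = False
    syn: bool = False
    fin: bool = False


@dataclass
class SenderState:
    """Hypothetical sender-side state; no sequence wraparound handling."""
    snd_una: int       # greatest cumulative ACK received so far
    snd_nxt: int       # next sequence number to be sent
    last_window: int   # window advertised in the previous incoming ACK


def classify_ack(state: SenderState, ack: Ack) -> str:
    """Return "new_data", "duplicate" or "ignore" and update the state."""
    if ack.ack_no > state.snd_una:
        kind = "new_data"
    elif (state.snd_nxt > state.snd_una             # (a) data outstanding
          and not ack.has_data                      # (b) carries no data
          and not (ack.syn or ack.fin)              # (c) SYN and FIN off
          and ack.ack_no == state.snd_una           # (d) same ACK number
          and ack.window == state.last_window):     # (e) same window
        kind = "duplicate"
    else:
        # Neither new data nor a duplicate ACK, e.g. a pure window update.
        kind = "ignore"
    state.snd_una = max(state.snd_una, ack.ack_no)
    state.last_window = ack.window
    return kind


if __name__ == "__main__":
    state = SenderState(snd_una=1000, snd_nxt=5000, last_window=100)
    # The window changes on the first ACK, so it is not a dupack.
    print(classify_ack(state, Ack(ack_no=1000, window=120)))  # ignore
    # Same ACK number and same window as the previous ACK.
    print(classify_ack(state, Ack(ack_no=1000, window=120)))  # duplicate
    print(classify_ack(state, Ack(ack_no=2000, window=120)))  # new_data
```

The usage at the bottom mirrors the sequence described in the quoted
text: a window update that is ignored, followed by an ACK that preserves
the window size and therefore counts as a duplicate.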