From: Jon Maxwell
To: davem@davemloft.net
Cc: edumazet@google.com, ncardwell@google.com, David.Laight@aculab.com,
    kuznet@ms2.inr.ac.ru, yoshfuji@linux-ipv6.org, netdev@vger.kernel.org,
    linux-kernel@vger.kernel.org, jmaxwell@redhat.com
Subject: [PATCH V4 net-next 0/3] tcp: improve setsockopt() TCP_USER_TIMEOUT accuracy
Date: Thu, 19 Jul 2018 11:14:41 +1000
Message-Id: <20180719011444.4694-1-jmaxwell37@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

The patch was growing based on feedback, so for V4 I have split it into
a series of 3 commits.

This series continues from V3 and the associated feedback:

https://patchwork.kernel.org/patch/10516195/

Suggestions by Neal Cardwell:

1) Fix up units mismatch regarding msec/jiffies.
2) Address possibility of time_remaining being negative.
3) Add a helper routine tcp_clamp_rto_to_user_timeout() to do the rto
   calculation.
4) Move start_ts logic into helper routine tcp_retransmit_stamp() to
   validate tcp_sk(sk)->retrans_stamp.
5) Some u32 declaration and return refactoring.
6) Return 0 instead of false in tcp_retransmit_stamp(), it's not a bool.

Suggestions by David Laight:

1) Don't cache rto in tcp_clamp_rto_to_user_timeout().

Suggestions by Eric Dumazet:

1) Make u32 declarations consistent.
2) Use patch series for easier review.
3) Convert icsk->icsk_user_timeout to milliseconds to avoid the jiffies
   to msec dance.
4) Use separate titles for each commit in the series.
5) Fix fuzzy indentation and line wrap issues.
6) Make commit titles descriptive.

Changes:

1) Call tcp_clamp_rto_to_user_timeout(sk) as an argument to
   inet_csk_reset_xmit_timer() to save on rto declaration.

Every time the TCP retransmission timer fires, it checks whether the
connection has timed out before scheduling the next retransmit timer.
The interval between retransmissions increases exponentially. The issue
is that for the timeout to be detected, the retransmit timer needs to
fire again. If the user timeout check happens after the 9th retransmit,
for example, it has to wait for the 10th retransmit timer to fire before
it can evaluate whether a timeout has occurred. If the interval is large
enough, the timeout will be inaccurate.

For example, with a TCP_USER_TIMEOUT of 10 seconds, without the patch:

1st retransmit:

22:25:18.973488 IP host1.49310 > host2.search-agent: Flags [.]

Last retransmit:

22:25:26.205499 IP host1.49310 > host2.search-agent: Flags [.]

Timeout:

send: Connection timed out
Sun Jul  1 22:25:34 EDT 2018

We can see that the last retransmit took ~7 seconds, which pushed the
total timeout to ~15 seconds instead of the expected 10 seconds. This
gets more inaccurate the larger the TCP_USER_TIMEOUT value, as the
interval increases.

Add tcp_clamp_rto_to_user_timeout() to determine whether the user
timeout has expired or whether the rto interval needs to be
recalculated. Use the original interval if the user timeout is not set.

With the patch, the test shows the expected 10 second timeout:

1st retransmit:

01:37:59.022555 IP host1.49310 > host2.search-agent: Flags [.]

Last retransmit:

01:38:06.486558 IP host1.49310 > host2.search-agent: Flags [.]

Timeout:

send: Connection timed out
Mon Jul  2 01:38:09 EDT 2018

Jon Maxwell (3):
  tcp: convert icsk_user_timeout from jiffies to msecs
  tcp: Add tcp_retransmit_stamp() helper routine
  tcp: Add tcp_clamp_rto_to_user_timeout() helper to improve accuracy

 net/ipv4/tcp.c       |  4 ++--
 net/ipv4/tcp_timer.c | 51 ++++++++++++++++++++++++++++++++++++++-------------
 2 files changed, 40 insertions(+), 15 deletions(-)

-- 
2.13.6
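
To illustrate the overshoot described in this cover letter, here is a
minimal userspace sketch. It is a toy model, not the code in this
series. The names clamp_rto_to_user_timeout() and simulate_timeout(),
the 1000 ms initial interval, and the choice to measure elapsed time
from time 0 are all assumptions made for illustration. The model only
checks the user timeout when the timer fires, and doubles the interval
after each firing.

```c
#include <stdint.h>

/*
 * Hypothetical userspace model, not kernel code. All times are in
 * milliseconds.
 *
 * Return the interval to arm the next retransmit timer with, so that
 * it fires no later than the user timeout deadline.
 */
static uint32_t clamp_rto_to_user_timeout(uint32_t rto_ms,
					  uint32_t user_timeout_ms,
					  uint32_t start_ms,
					  uint32_t now_ms)
{
	uint32_t elapsed, remaining;

	/* No user timeout set: keep the original interval. */
	if (user_timeout_ms == 0)
		return rto_ms;

	elapsed = now_ms - start_ms;

	/* Deadline already passed: fire as soon as possible. */
	if (elapsed >= user_timeout_ms)
		return 1;

	remaining = user_timeout_ms - elapsed;
	return rto_ms < remaining ? rto_ms : remaining;
}

/*
 * Walk through successive timer firings and return the time at which
 * the timeout is finally detected. The check only runs when the timer
 * fires, which is the source of the overshoot.
 */
static uint32_t simulate_timeout(uint32_t initial_rto_ms,
				 uint32_t user_timeout_ms,
				 int clamp)
{
	uint32_t now = 0;
	uint32_t rto = initial_rto_ms;

	for (;;) {
		uint32_t interval = rto;

		if (clamp)
			interval = clamp_rto_to_user_timeout(rto,
							     user_timeout_ms,
							     0, now);

		now += interval;		/* timer fires */
		if (now >= user_timeout_ms)
			return now;		/* timeout detected */

		rto *= 2;			/* exponential backoff */
	}
}
```

With these assumed numbers, simulate_timeout(1000, 10000, 0) returns
15000, because the timer fires at 1, 3, 7 and 15 seconds and the
10 second deadline is only noticed at the last firing. With clamping,
simulate_timeout(1000, 10000, 1) returns 10000, because the final
interval is shortened from 8000 ms to the 3000 ms that remain.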