From: Jon Maxwell
To: davem@davemloft.net
Cc: edumazet@google.com, eric.dumazet@gmail.com, ncardwell@google.com,
    David.Laight@aculab.com, kuznet@ms2.inr.ac.ru, yoshfuji@linux-ipv6.org,
    netdev@vger.kernel.org, linux-kernel@vger.kernel.org, jmaxwell@redhat.com
Subject: [net-next,v3] tcp: Improve setsockopt() TCP_USER_TIMEOUT accuracy
Date: Tue, 10 Jul 2018 16:51:47 +1000
Message-Id: <20180710065147.27647-1-jmaxwell37@gmail.com>
X-Mailer: git-send-email 2.13.6
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

v3 contains the following suggestions by Neal Cardwell:

1) Fix up units mismatch regarding msec/jiffies.
2) Address possibility of time_remaining being negative.
3) Add a helper routine tcp_clamp_rto_to_user_timeout() to do the rto
   calculation.
4) Move start_ts logic into helper routine tcp_retransmit_stamp() to
   validate tcp_sk(sk)->retrans_stamp.
5) Some u32 declaration and return refactoring.
6) Return 0 instead of false in tcp_retransmit_stamp(), it's not a bool.

Suggestions by David Laight:

1) Don't cache rto in tcp_clamp_rto_to_user_timeout().
2) Use conditional operator instead of min_t() in
   tcp_clamp_rto_to_user_timeout()

Changes:

1) Call tcp_clamp_rto_to_user_timeout(sk) as an argument to
   inet_csk_reset_xmit_timer() to save on rto declaration.

Every time the TCP retransmission timer fires, it checks whether a
timeout has occurred before scheduling the next retransmit timer. The
interval between retransmissions increases exponentially. The issue is
that the timeout can only be detected when the retransmit timer fires
again. If the user timeout check happens after the 9th retransmit, for
example, it has to wait for the 10th retransmit timer to fire before it
can evaluate whether a timeout has occurred. If the interval is large
enough, the timeout will be inaccurate.

For example, with a TCP_USER_TIMEOUT of 10 seconds without the patch:

1st retransmit:

22:25:18.973488 IP host1.49310 > host2.search-agent: Flags [.]

Last retransmit:

22:25:26.205499 IP host1.49310 > host2.search-agent: Flags [.]

Timeout:

send: Connection timed out
Sun Jul  1 22:25:34 EDT 2018

We can see that the wait after the last retransmit took ~8 seconds,
which pushed the total timeout to ~15 seconds instead of the expected
10 seconds. The inaccuracy grows with larger TCP_USER_TIMEOUT values,
because the retransmit interval keeps increasing.

Add tcp_clamp_rto_to_user_timeout() to determine whether the user
timeout has expired or whether the rto interval needs to be
recalculated. Use the original interval if the user timeout is not set.

With the patch, the test shows the expected 10 second timeout:

1st retransmit:

01:37:59.022555 IP host1.49310 > host2.search-agent: Flags [.]

Last retransmit:

01:38:06.486558 IP host1.49310 > host2.search-agent: Flags [.]

Timeout:

send: Connection timed out
Mon Jul  2 01:38:09 EDT 2018

Signed-off-by: Jon Maxwell
---
 net/ipv4/tcp_timer.c | 49 +++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 39 insertions(+), 10 deletions(-)

diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 3b3611729928..93239e58776d 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -22,6 +22,38 @@
 #include
 #include
 
+u32 tcp_retransmit_stamp(struct sock *sk)
+{
+	u32 start_ts = tcp_sk(sk)->retrans_stamp;
+
+	if (unlikely(!start_ts)) {
+		struct sk_buff *head = tcp_rtx_queue_head(sk);
+
+		if (!head)
+			return 0;
+		start_ts = tcp_skb_timestamp(head);
+	}
+	return start_ts;
+}
+
+static __u32 tcp_clamp_rto_to_user_timeout(struct sock *sk)
+{
+	struct inet_connection_sock *icsk = inet_csk(sk);
+	__u32 elapsed, user_timeout;
+	u32 start_ts;
+
+	start_ts = tcp_retransmit_stamp(sk);
+	if (!icsk->icsk_user_timeout || !start_ts)
+		return icsk->icsk_rto;
+	elapsed = tcp_time_stamp(tcp_sk(sk)) - start_ts;
+	user_timeout = jiffies_to_msecs(icsk->icsk_user_timeout);
+	if (elapsed >= user_timeout)
+		return 1; /* user timeout has passed; fire ASAP */
+	else
+		return (icsk->icsk_rto < msecs_to_jiffies(user_timeout - elapsed)) ?
+			icsk->icsk_rto : msecs_to_jiffies(user_timeout - elapsed);
+}
+
 /**
  *  tcp_write_err() - close socket and save error info
  *  @sk:  The socket the error has appeared on.
@@ -161,19 +193,15 @@ static bool retransmits_timed_out(struct sock *sk,
 				  unsigned int timeout)
 {
 	const unsigned int rto_base = TCP_RTO_MIN;
-	unsigned int linear_backoff_thresh, start_ts;
+	unsigned int linear_backoff_thresh;
+	u32 start_ts;
 
 	if (!inet_csk(sk)->icsk_retransmits)
 		return false;
 
-	start_ts = tcp_sk(sk)->retrans_stamp;
-	if (unlikely(!start_ts)) {
-		struct sk_buff *head = tcp_rtx_queue_head(sk);
-
-		if (!head)
-			return false;
-		start_ts = tcp_skb_timestamp(head);
-	}
+	start_ts = tcp_retransmit_stamp(sk);
+	if (!start_ts)
+		return false;
 
 	if (likely(timeout == 0)) {
 		linear_backoff_thresh = ilog2(TCP_RTO_MAX/rto_base);
@@ -535,7 +563,8 @@ void tcp_retransmit_timer(struct sock *sk)
 		/* Use normal (exponential) backoff */
 		icsk->icsk_rto = min(icsk->icsk_rto << 1, TCP_RTO_MAX);
 	}
-	inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS, icsk->icsk_rto, TCP_RTO_MAX);
+	inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS,
+				  tcp_clamp_rto_to_user_timeout(sk), TCP_RTO_MAX);
 	if (retransmits_timed_out(sk, net->ipv4.sysctl_tcp_retries1 + 1, 0))
 		__sk_dst_reset(sk);
 
-- 
2.13.6
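
For readers who want to check the clamping arithmetic outside the kernel,
here is a rough userspace model of the idea. It is only a sketch and not
the kernel code: every quantity is kept in milliseconds (so no
jiffies conversion is modelled), the names clamp_rto_to_user_timeout(),
simulate_timeout_ms(), RTO_MIN_MS and RTO_MAX_MS are made up for
illustration, the 200 ms starting RTO and 120 s ceiling are assumptions,
and the clock is taken to start when the first timer is armed.

```c
#include <stdbool.h>
#include <stdint.h>

#define RTO_MIN_MS 200u     /* assumed initial RTO for this model */
#define RTO_MAX_MS 120000u  /* assumed RTO ceiling for this model */

/*
 * Hypothetical model of the clamp: return the interval to arm the
 * retransmit timer with, given the current RTO, the user timeout and
 * the time elapsed since the first retransmit (all in ms).
 */
static uint32_t clamp_rto_to_user_timeout(uint32_t rto_ms,
					  uint32_t user_timeout_ms,
					  uint32_t elapsed_ms)
{
	uint32_t remaining_ms;

	if (!user_timeout_ms)
		return rto_ms;	/* no user timeout set: keep the RTO */
	if (elapsed_ms >= user_timeout_ms)
		return 1;	/* already expired: fire as soon as possible */
	remaining_ms = user_timeout_ms - elapsed_ms;
	return rto_ms < remaining_ms ? rto_ms : remaining_ms;
}

/*
 * Walk the exponential backoff and return the time (ms) at which the
 * timer fires and finds the user timeout exceeded.
 * Assumes user_timeout_ms > 0.
 */
static uint32_t simulate_timeout_ms(uint32_t user_timeout_ms, bool clamp)
{
	uint32_t rto_ms = RTO_MIN_MS;
	uint32_t interval_ms = rto_ms;
	uint32_t now_ms = 0;

	for (;;) {
		now_ms += interval_ms;	/* timer fires */
		if (now_ms >= user_timeout_ms)
			return now_ms;	/* timeout detected here */
		/* exponential backoff, capped at the ceiling */
		rto_ms = rto_ms * 2 < RTO_MAX_MS ? rto_ms * 2 : RTO_MAX_MS;
		interval_ms = clamp ?
			clamp_rto_to_user_timeout(rto_ms, user_timeout_ms,
						  now_ms) : rto_ms;
	}
}
```

Under these assumptions, simulate_timeout_ms(10000, false) reports the
timeout at 12600 ms, while simulate_timeout_ms(10000, true) reports it at
10000 ms, because the last interval is shortened from 6400 ms to the
3800 ms that remain.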