Received: by 2002:a05:6a10:8c0a:0:0:0:0 with SMTP id go10csp32315pxb; Fri, 19 Feb 2021 16:58:24 -0800 (PST) X-Google-Smtp-Source: ABdhPJwbhA4NmMWp+FPU3PATlkAZu2gsHE8y/et8IIrWmkJ3sDi36YCUdOFJ3TNP4Ay8dqdHFc2u X-Received: by 2002:a17:906:c00a:: with SMTP id e10mr11206231ejz.501.1613782703988; Fri, 19 Feb 2021 16:58:23 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1613782703; cv=none; d=google.com; s=arc-20160816; b=C9872YIwKugnBYVjPmqIAH7DPAQ/VMtL9xDWrEMk4685XV9NH0LcmQresGClZqaA+H oCMP7uEGrORE7o6S0Y0Xsb0tdmLYh4Fop8h7584QJt70DnsUahB0YaO17Lh85q5bdas5 F7Wjjg+OtsZklXx40cmnLCSmNPoDerSoeXMqYtFX/0whhN73Iqjrkp9LcBfhS7hpDiZS vvIa1sTT91ZDycvEIDs/89qDF+xqjvLYTynuT0r9BFvRfIci9J4EBZnEQNSqLE/ji0Mg Lq04BMMXMuuuqORbXnm+rYxy1R9PQlFc+KAHOJ46G1TH0hQWJyNZleVQ8bnts4CMYFlL yi7w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-disposition:mime-version:message-id :subject:cc:to:from:date:dkim-signature; bh=Sl18xJK0FBSUcbbYuyh+oreHTjQxjE1dyshoc+jjlUI=; b=g7Hnn/QNZpZyJKRetWCpLe+aaF/fzVWwQkoheE6Shb3MwaRj/nrJCFGWP60uAekdFX tk6UK7UHIAzY7Z/777otqlaEAe9A4K4fEmPVBuKsDO6bFkwXGcb9uNSvxXzum33r85+k bUuTLV/Q+SyVQJ+IAbdu66N+4sOT3sAkSAnUFd+5BTOQXJwrVlOCW8T27Cv/hkJjQOQr Y990dKSZ2RFGEbOPy+W5mJoK0MCnECMJE8FWT8M4mS2CKnH6zEfPr6k5AyKXwToHv8Y1 YGgOlcErNLFfwsmKONx9JofTKqxlBqZPRxqueKdZaYiK+ksIDrAvDzx7GHZrEkiAWjgx /XQg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=K7XCW1WW; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id bs11si6995893ejb.637.2021.02.19.16.58.01; Fri, 19 Feb 2021 16:58:23 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=K7XCW1WW; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229803AbhBTAzc (ORCPT + 99 others); Fri, 19 Feb 2021 19:55:32 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35912 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229700AbhBTAza (ORCPT ); Fri, 19 Feb 2021 19:55:30 -0500 Received: from mail-ot1-x32e.google.com (mail-ot1-x32e.google.com [IPv6:2607:f8b0:4864:20::32e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5A61AC061574; Fri, 19 Feb 2021 16:54:50 -0800 (PST) Received: by mail-ot1-x32e.google.com with SMTP id o10so6753951ote.13; Fri, 19 Feb 2021 16:54:50 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:from:to:cc:subject:message-id:mime-version:content-disposition; bh=Sl18xJK0FBSUcbbYuyh+oreHTjQxjE1dyshoc+jjlUI=; b=K7XCW1WWWeAL73AOb5FLLcLtM9+VV2lCYoajJA7RPceu3wu6hrJf+Gtj+D/j11sQI6 iZBucvwZ5S6JSA3T4TmFnM8zFrc95/bujHMA6ObGe2N9kw/ioi/x2PCB2K6LCuy9jG6n WA3Sxja+fyHbA8nStBSJTtC+bAt4oZwCP2zgts4yK0QhpnBZlNiaoOebbb5mhQiOJvPB 035EXtKGWpPUqcvxhqyyyMd5/MhYfVe2E3AheXdqOyyvCo8YEwTQ9Nx3NqrHEH3U42OZ ZMiPsSRKp089gIBox6Ecn6vf/6/NU/o2Hll47SmLgzcLHhOjdwIWXb+seieSxgIWCRQS SWTw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:mime-version :content-disposition; bh=Sl18xJK0FBSUcbbYuyh+oreHTjQxjE1dyshoc+jjlUI=; b=T2oyh4o8rfAc3VH0cAH9WC2MihbEVHgrYXsAiZo/5qFQRseRjkiJp9GmNlAlkoJ97I bv9/0iJ1oKyh2jps9/wYSYlisSBeSByOtveaoMKByWmVPMIwCcrBUoYiTOCckyOQzSXr ssecb8b8D3sNAg+08LqtmJ2uyq7y0zjyCI+f5Uf6ZM9+KaBARl3E/k50X07RW6mwSF/z 0vnDxwL7SpIQbL3+8fFnC6iZGMYoHDprmL2P+WYo1oluynseeyFlil/qLDvxlh6+zeCU brSsaIlXR7GavMSMrjJB8rflv0hmVnO2o4IdCEqZF66a/1y0y15IqI2gYubMBC0ME4ex tOtg== X-Gm-Message-State: AOAM53379TOQ3wbSxEiS3zxh3vYE9KCD5XK9RqKqgHvzOL+ZfUmhfR/C qGgZ4Y8ptjDEOW/tZFpvaCg= X-Received: by 2002:a05:6830:1d68:: with SMTP id l8mr8939616oti.238.1613782489766; Fri, 19 Feb 2021 16:54:49 -0800 (PST) Received: from localhost.localdomain (99-6-134-177.lightspeed.snmtca.sbcglobal.net. [99.6.134.177]) by smtp.gmail.com with ESMTPSA id u126sm2224426oig.55.2021.02.19.16.54.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 19 Feb 2021 16:54:49 -0800 (PST) Date: Fri, 19 Feb 2021 16:54:47 -0800 From: Enke Chen To: Eric Dumazet , "David S. Miller" , Hideaki YOSHIFUJI , David Ahern , Jakub Kicinski , netdev@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Neal Cardwell , Yuchung Cheng , enkechen2020@gmai.com Subject: [PATCH net] tcp: fix keepalive when data remain undelivered Message-ID: <20210220005447.GA93678@localhost.localdomain> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Enke Chen TCP keepalive does not timeout under the condition that network connection is lost and data remain undelivered (incl. retransmit). A very simple scenarios of the failure is to write data to a tcp socket after the network connection is lost. Under the specified condition the keepalive timeout is not evaluated in the keepalive timer. That is the primary cause of the failure. In addition, the keepalive probe is not sent out in the keepalive timer. Although packet retransmit or 0-window probe can serve a similar purpose, they have their own timers and backoffs that are generally not aligned with the keepalive parameters for probes and timeout. As the timing and conditions of the events involved are random, the tcp keepalive can fail randomly. Given the randomness of the failures, fixing the issue would not cause any backward compatibility issues. As was well said, "Determinism is a special case of randomness". The fix in this patch consists of the following: a. Always evaluate the keepalive timeout in the keepalive timer. b. Always send out the keepalive probe in the keepalive timer (post the keepalive idle time). Given that the keepalive intervals are usually in the range of 30 - 60 seconds, there is no need for an optimization to further reduce the number of keepalive probes in the presence of packet retransmit. c. Use the elapsed time (instead of the 0-window probe counter) in evaluating tcp keepalive timeout. Thanks to Eric Dumazet, Neal Cardwell, and Yuchung Cheng for helpful discussions about the issue and options for fixing it. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2 Initial git repository build") Signed-off-by: Enke Chen --- net/ipv4/tcp_timer.c | 20 ++++++-------------- 1 file changed, 6 insertions(+), 14 deletions(-) diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c index 4ef08079ccfa..16a044da20db 100644 --- a/net/ipv4/tcp_timer.c +++ b/net/ipv4/tcp_timer.c @@ -708,29 +708,23 @@ static void tcp_keepalive_timer (struct timer_list *t) ((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_SYN_SENT))) goto out; - elapsed = keepalive_time_when(tp); - - /* It is alive without keepalive 8) */ - if (tp->packets_out || !tcp_write_queue_empty(sk)) - goto resched; - elapsed = keepalive_time_elapsed(tp); if (elapsed >= keepalive_time_when(tp)) { /* If the TCP_USER_TIMEOUT option is enabled, use that * to determine when to timeout instead. */ - if ((icsk->icsk_user_timeout != 0 && - elapsed >= msecs_to_jiffies(icsk->icsk_user_timeout) && - icsk->icsk_probes_out > 0) || - (icsk->icsk_user_timeout == 0 && - icsk->icsk_probes_out >= keepalive_probes(tp))) { + u32 timeout = icsk->icsk_user_timeout ? + msecs_to_jiffies(icsk->icsk_user_timeout) : + keepalive_intvl_when(tp) * keepalive_probes(tp) + + keepalive_time_when(tp); + + if (elapsed >= timeout) { tcp_send_active_reset(sk, GFP_ATOMIC); tcp_write_err(sk); goto out; } if (tcp_write_wakeup(sk, LINUX_MIB_TCPKEEPALIVE) <= 0) { - icsk->icsk_probes_out++; elapsed = keepalive_intvl_when(tp); } else { /* If keepalive was lost due to local congestion, @@ -744,8 +738,6 @@ static void tcp_keepalive_timer (struct timer_list *t) } sk_mem_reclaim(sk); - -resched: inet_csk_reset_keepalive_timer (sk, elapsed); goto out; -- 2.29.2