2021-01-12 19:29:13

by Enke Chen

[permalink] [raw]
Subject: [PATCH] tcp: keepalive fixes

From: Enke Chen <[email protected]>

In this patch two issues with TCP keepalives are fixed:

1) TCP keepalive does not timeout when there are data waiting to be
delivered and then the connection got broken. The TCP keepalive
timeout is not evaluated in that condition.

The fix is to remove the code that prevents TCP keepalive from
being evaluated for timeout.

2) With the fix for #1, TCP keepalive can erroneously timeout after
the 0-window probe kicks in. The 0-window probe counter is wrongly
applied to TCP keepalives.

The fix is to use the elapsed time instead of the 0-window probe
counter in evaluating TCP keepalive timeout.

Cc: [email protected]
Signed-off-by: Enke Chen <[email protected]>
---
net/ipv4/tcp_timer.c | 15 +++------------
1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 6c62b9ea1320..40953aa40d53 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -696,12 +696,6 @@ static void tcp_keepalive_timer (struct timer_list *t)
((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_SYN_SENT)))
goto out;

- elapsed = keepalive_time_when(tp);
-
- /* It is alive without keepalive 8) */
- if (tp->packets_out || !tcp_write_queue_empty(sk))
- goto resched;
-
elapsed = keepalive_time_elapsed(tp);

if (elapsed >= keepalive_time_when(tp)) {
@@ -709,16 +703,15 @@ static void tcp_keepalive_timer (struct timer_list *t)
* to determine when to timeout instead.
*/
if ((icsk->icsk_user_timeout != 0 &&
- elapsed >= msecs_to_jiffies(icsk->icsk_user_timeout) &&
- icsk->icsk_probes_out > 0) ||
+ elapsed >= msecs_to_jiffies(icsk->icsk_user_timeout)) ||
(icsk->icsk_user_timeout == 0 &&
- icsk->icsk_probes_out >= keepalive_probes(tp))) {
+ (elapsed >= keepalive_time_when(tp) +
+ keepalive_intvl_when(tp) * keepalive_probes(tp)))) {
tcp_send_active_reset(sk, GFP_ATOMIC);
tcp_write_err(sk);
goto out;
}
if (tcp_write_wakeup(sk, LINUX_MIB_TCPKEEPALIVE) <= 0) {
- icsk->icsk_probes_out++;
elapsed = keepalive_intvl_when(tp);
} else {
/* If keepalive was lost due to local congestion,
@@ -732,8 +725,6 @@ static void tcp_keepalive_timer (struct timer_list *t)
}

sk_mem_reclaim(sk);
-
-resched:
inet_csk_reset_keepalive_timer (sk, elapsed);
goto out;

--
2.29.2


2021-01-13 03:22:45

by Eric Dumazet

[permalink] [raw]
Subject: Re: [PATCH] tcp: keepalive fixes

On Tue, Jan 12, 2021 at 11:48 PM Yuchung Cheng <[email protected]> wrote:
>
> On Tue, Jan 12, 2021 at 2:31 PM Enke Chen <[email protected]> wrote:
> >
> > From: Enke Chen <[email protected]>
> >
> > In this patch two issues with TCP keepalives are fixed:
> >
> > 1) TCP keepalive does not timeout when there are data waiting to be
> > delivered and then the connection got broken. The TCP keepalive
> > timeout is not evaluated in that condition.
> hi enke
> Do you have an example to demonstrate this issue -- in theory when
> there is data inflight, an RTO timer should be pending (which
> considers user-timeout setting). based on the user-timeout description
> (man tcp), the user timeout should abort the socket per the specified
> time after data commences. some data would help to understand the
> issue.
>

+1

A packetdrill test would be ideal.

Also, given that there is this ongoing issue with TCP_USER_TIMEOUT,
lets not mix things
or risk added work for backports to stable versions.

2021-01-13 03:34:47

by Enke Chen

[permalink] [raw]
Subject: Re: [PATCH] tcp: keepalive fixes

Hi, Yuchung:

I have attached the python script that reproduces the keepalive issues.
The script is a slight modification of the one written by Marek Majkowski:

https://github.com/cloudflare/cloudflare-blog/blob/master/2019-09-tcp-keepalives/test-zero.py

Please note that only the TCP keepalive is configured, and not the user timeout.

Thanks. -- Enke

On Tue, Jan 12, 2021 at 02:48:01PM -0800, Yuchung Cheng wrote:
> On Tue, Jan 12, 2021 at 2:31 PM Enke Chen <[email protected]> wrote:
> >
> > From: Enke Chen <[email protected]>
> >
> > In this patch two issues with TCP keepalives are fixed:
> >
> > 1) TCP keepalive does not timeout when there are data waiting to be
> > delivered and then the connection got broken. The TCP keepalive
> > timeout is not evaluated in that condition.
> hi enke
> Do you have an example to demonstrate this issue -- in theory when
> there is data inflight, an RTO timer should be pending (which
> considers user-timeout setting). based on the user-timeout description
> (man tcp), the user timeout should abort the socket per the specified
> time after data commences. some data would help to understand the
> issue.
>

------
#! /usr/bin/python

import io
import os
import select
import socket
import time
import utils
import ctypes

utils.new_ns()

port = 1

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
s.bind(('127.0.0.1', port))
s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1024)

s.listen(16)

tcpdump = utils.tcpdump_start(port)
c = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
c.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1024)
c.connect(('127.0.0.1', port))

x, _ = s.accept()

if False:
c.setsockopt(socket.IPPROTO_TCP, socket.TCP_USER_TIMEOUT, 90*1000)

if True:
c.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
c.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)
c.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 10)
c.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 10)

time.sleep(0.2)
print("[ ] c.send()")
import fcntl
TIOCOUTQ=0x5411
c.setblocking(False)
while True:
bytes_avail = ctypes.c_int()
fcntl.ioctl(c.fileno(), TIOCOUTQ, bytes_avail)
if bytes_avail.value > 64*1024:
break
try:
c.send(b"A" * 16384 * 4)
except io.BlockingIOError:
break
c.setblocking(True)
time.sleep(0.2)
utils.ss(port)
utils.check_buffer(c)

t0 = time.time()

if True:
utils.drop_start(dport=port)
utils.drop_start(sport=port)

poll = select.poll()
poll.register(c, select.POLLIN)
poll.poll()

utils.ss(port)


e = c.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
print("[ ] SO_ERROR = %s" % (e,))

t1 = time.time()
print("[ ] took: %f seconds" % (t1-t0,))

2021-01-13 04:10:15

by Yuchung Cheng

[permalink] [raw]
Subject: Re: [PATCH] tcp: keepalive fixes

On Tue, Jan 12, 2021 at 2:31 PM Enke Chen <[email protected]> wrote:
>
> From: Enke Chen <[email protected]>
>
> In this patch two issues with TCP keepalives are fixed:
>
> 1) TCP keepalive does not timeout when there are data waiting to be
> delivered and then the connection got broken. The TCP keepalive
> timeout is not evaluated in that condition.
hi enke
Do you have an example to demonstrate this issue -- in theory when
there is data inflight, an RTO timer should be pending (which
considers user-timeout setting). based on the user-timeout description
(man tcp), the user timeout should abort the socket per the specified
time after data commences. some data would help to understand the
issue.

>
> The fix is to remove the code that prevents TCP keepalive from
> being evaluated for timeout.
>
> 2) With the fix for #1, TCP keepalive can erroneously timeout after
> the 0-window probe kicks in. The 0-window probe counter is wrongly
> applied to TCP keepalives.
>
> The fix is to use the elapsed time instead of the 0-window probe
> counter in evaluating TCP keepalive timeout.
>
> Cc: [email protected]
> Signed-off-by: Enke Chen <[email protected]>
> ---
> net/ipv4/tcp_timer.c | 15 +++------------
> 1 file changed, 3 insertions(+), 12 deletions(-)
>
> diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
> index 6c62b9ea1320..40953aa40d53 100644
> --- a/net/ipv4/tcp_timer.c
> +++ b/net/ipv4/tcp_timer.c
> @@ -696,12 +696,6 @@ static void tcp_keepalive_timer (struct timer_list *t)
> ((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_SYN_SENT)))
> goto out;
>
> - elapsed = keepalive_time_when(tp);
> -
> - /* It is alive without keepalive 8) */
> - if (tp->packets_out || !tcp_write_queue_empty(sk))
> - goto resched;
> -
> elapsed = keepalive_time_elapsed(tp);
>
> if (elapsed >= keepalive_time_when(tp)) {
> @@ -709,16 +703,15 @@ static void tcp_keepalive_timer (struct timer_list *t)
> * to determine when to timeout instead.
> */
> if ((icsk->icsk_user_timeout != 0 &&
> - elapsed >= msecs_to_jiffies(icsk->icsk_user_timeout) &&
> - icsk->icsk_probes_out > 0) ||
> + elapsed >= msecs_to_jiffies(icsk->icsk_user_timeout)) ||
> (icsk->icsk_user_timeout == 0 &&
> - icsk->icsk_probes_out >= keepalive_probes(tp))) {
> + (elapsed >= keepalive_time_when(tp) +
> + keepalive_intvl_when(tp) * keepalive_probes(tp)))) {
> tcp_send_active_reset(sk, GFP_ATOMIC);
> tcp_write_err(sk);
> goto out;
> }
> if (tcp_write_wakeup(sk, LINUX_MIB_TCPKEEPALIVE) <= 0) {
> - icsk->icsk_probes_out++;
> elapsed = keepalive_intvl_when(tp);
> } else {
> /* If keepalive was lost due to local congestion,
> @@ -732,8 +725,6 @@ static void tcp_keepalive_timer (struct timer_list *t)
> }
>
> sk_mem_reclaim(sk);
> -
> -resched:
> inet_csk_reset_keepalive_timer (sk, elapsed);
> goto out;
>
> --
> 2.29.2
>

2021-01-13 20:10:15

by Enke Chen

[permalink] [raw]
Subject: Re: [PATCH] tcp: keepalive fixes

Hi, Eric:

Just to clarify: the issues for tcp keepalive and TCP_USER_TIMEOUT are
separate isues, and the fixes would not conflict afaik.

Thanks. -- Enke

On Tue, Jan 12, 2021 at 11:52:43PM +0100, Eric Dumazet wrote:
> On Tue, Jan 12, 2021 at 11:48 PM Yuchung Cheng <[email protected]> wrote:
> >
> > On Tue, Jan 12, 2021 at 2:31 PM Enke Chen <[email protected]> wrote:
> > >
> > > From: Enke Chen <[email protected]>
> > >
> > > In this patch two issues with TCP keepalives are fixed:
> > >
> > > 1) TCP keepalive does not timeout when there are data waiting to be
> > > delivered and then the connection got broken. The TCP keepalive
> > > timeout is not evaluated in that condition.
> > hi enke
> > Do you have an example to demonstrate this issue -- in theory when
> > there is data inflight, an RTO timer should be pending (which
> > considers user-timeout setting). based on the user-timeout description
> > (man tcp), the user timeout should abort the socket per the specified
> > time after data commences. some data would help to understand the
> > issue.
> >
>
> +1
>
> A packetdrill test would be ideal.
>
> Also, given that there is this ongoing issue with TCP_USER_TIMEOUT,
> lets not mix things
> or risk added work for backports to stable versions.

2021-01-13 20:31:39

by Enke Chen

[permalink] [raw]
Subject: Re: [PATCH] tcp: keepalive fixes

On Wed, Jan 13, 2021 at 12:06:27PM -0800, Enke Chen wrote:
> Hi, Eric:
>
> Just to clarify: the issues for tcp keepalive and TCP_USER_TIMEOUT are
> separate isues, and the fixes would not conflict afaik.
>
> Thanks. -- Enke

I have posted patches for both issues, and there is no conflict between
the patches.

Thanks. -- Enke

>
> On Tue, Jan 12, 2021 at 11:52:43PM +0100, Eric Dumazet wrote:
> > On Tue, Jan 12, 2021 at 11:48 PM Yuchung Cheng <[email protected]> wrote:
> > >
> > > On Tue, Jan 12, 2021 at 2:31 PM Enke Chen <[email protected]> wrote:
> > > >
> > > > From: Enke Chen <[email protected]>
> > > >
> > > > In this patch two issues with TCP keepalives are fixed:
> > > >
> > > > 1) TCP keepalive does not timeout when there are data waiting to be
> > > > delivered and then the connection got broken. The TCP keepalive
> > > > timeout is not evaluated in that condition.
> > > hi enke
> > > Do you have an example to demonstrate this issue -- in theory when
> > > there is data inflight, an RTO timer should be pending (which
> > > considers user-timeout setting). based on the user-timeout description
> > > (man tcp), the user timeout should abort the socket per the specified
> > > time after data commences. some data would help to understand the
> > > issue.
> > >
> >
> > +1
> >
> > A packetdrill test would be ideal.
> >
> > Also, given that there is this ongoing issue with TCP_USER_TIMEOUT,
> > lets not mix things
> > or risk added work for backports to stable versions.

2021-01-22 22:41:21

by Enke Chen

[permalink] [raw]
Subject: Re: [PATCH] tcp: keepalive fixes

Hi, Folks:

Please ignore this patch. I will split it into separate ones as suggested
off-list by Neal Cardwell <[email protected]>.

Thanks. -- Enke

On Tue, Jan 12, 2021 at 11:25:44AM -0800, Enke Chen wrote:
> From: Enke Chen <[email protected]>
>
> In this patch two issues with TCP keepalives are fixed:
>
> 1) TCP keepalive does not timeout when there are data waiting to be
> delivered and then the connection got broken. The TCP keepalive
> timeout is not evaluated in that condition.
>
> The fix is to remove the code that prevents TCP keepalive from
> being evaluated for timeout.
>
> 2) With the fix for #1, TCP keepalive can erroneously timeout after
> the 0-window probe kicks in. The 0-window probe counter is wrongly
> applied to TCP keepalives.
>
> The fix is to use the elapsed time instead of the 0-window probe
> counter in evaluating TCP keepalive timeout.
>
> Cc: [email protected]
> Signed-off-by: Enke Chen <[email protected]>
> ---
> net/ipv4/tcp_timer.c | 15 +++------------
> 1 file changed, 3 insertions(+), 12 deletions(-)
>
> diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
> index 6c62b9ea1320..40953aa40d53 100644
> --- a/net/ipv4/tcp_timer.c
> +++ b/net/ipv4/tcp_timer.c
> @@ -696,12 +696,6 @@ static void tcp_keepalive_timer (struct timer_list *t)
> ((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_SYN_SENT)))
> goto out;
>
> - elapsed = keepalive_time_when(tp);
> -
> - /* It is alive without keepalive 8) */
> - if (tp->packets_out || !tcp_write_queue_empty(sk))
> - goto resched;
> -
> elapsed = keepalive_time_elapsed(tp);
>
> if (elapsed >= keepalive_time_when(tp)) {
> @@ -709,16 +703,15 @@ static void tcp_keepalive_timer (struct timer_list *t)
> * to determine when to timeout instead.
> */
> if ((icsk->icsk_user_timeout != 0 &&
> - elapsed >= msecs_to_jiffies(icsk->icsk_user_timeout) &&
> - icsk->icsk_probes_out > 0) ||
> + elapsed >= msecs_to_jiffies(icsk->icsk_user_timeout)) ||
> (icsk->icsk_user_timeout == 0 &&
> - icsk->icsk_probes_out >= keepalive_probes(tp))) {
> + (elapsed >= keepalive_time_when(tp) +
> + keepalive_intvl_when(tp) * keepalive_probes(tp)))) {
> tcp_send_active_reset(sk, GFP_ATOMIC);
> tcp_write_err(sk);
> goto out;
> }
> if (tcp_write_wakeup(sk, LINUX_MIB_TCPKEEPALIVE) <= 0) {
> - icsk->icsk_probes_out++;
> elapsed = keepalive_intvl_when(tp);
> } else {
> /* If keepalive was lost due to local congestion,
> @@ -732,8 +725,6 @@ static void tcp_keepalive_timer (struct timer_list *t)
> }
>
> sk_mem_reclaim(sk);
> -
> -resched:
> inet_csk_reset_keepalive_timer (sk, elapsed);
> goto out;
>
> --
> 2.29.2
>