2007-11-02 01:33:32

by Felix von Leitner

Subject: TCP_DEFER_ACCEPT issues

I am trying to use TCP_DEFER_ACCEPT in my web server.
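
For reference, enabling it is just a setsockopt() on the listening socket;
a minimal sketch (helper name made up, error handling trimmed, fd is
already bound and listening):

  #include <netinet/in.h>
  #include <netinet/tcp.h>        /* TCP_DEFER_ACCEPT */
  #include <sys/socket.h>

  /* Ask the kernel not to wake us up in accept() until data has
     arrived; per tcp(7) the argument is a timeout in seconds. */
  static int defer_accept(int fd, int seconds)
  {
          return setsockopt(fd, IPPROTO_TCP, TCP_DEFER_ACCEPT,
                            &seconds, sizeof(seconds));
  }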

There are some operational problems. First of all: timeout handling. I
would like to be able to set a timeout in seconds (or better:
milliseconds) for how long the socket is allowed to sit there without
data coming in. For high load situations, I have been enforcing
timeouts in the range of 15 seconds, otherwise someone can DoS the
server by opening a lot of connections and tying up data structures.
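
That enforcement is plain userspace code; a sketch of the idea, with one
blocking poll() per connection for clarity (a real server would multiplex):

  #include <poll.h>
  #include <unistd.h>

  /* Wait up to 15 seconds for the client's first data, then give up.
     Note that poll() already takes its timeout in milliseconds. */
  static int wait_for_request(int c)
  {
          struct pollfd p = { .fd = c, .events = POLLIN };
          if (poll(&p, 1, 15 * 1000) <= 0) {      /* timeout or error */
                  close(c);
                  return -1;
          }
          return 0;
  }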

It is still possible, of course, to tie up kernel memory this way, by
not reacting to the FIN or RST packets and running into a timeout there,
too, but that is partially tunable via sysctl.

According to tcp(7), the int argument to TCP_DEFER_ACCEPT is in seconds.
In the kernel code, it's converted to TCP timeout units. When I ran my
server and connected without sending any data, nothing happened. No
timeout. Minutes later, the connection was still there. Even worse:
when I killed (!) the server process (thus closing the server socket),
the client did not get a reset. Only when I typed something into the
telnet session did I get a reset. This appears to be very broken.
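
Reproducing this needs nothing more than a client that completes the
handshake and then goes silent; telnet does it, or a toy client like this
(address and port made up):

  #include <arpa/inet.h>
  #include <netinet/in.h>
  #include <sys/socket.h>
  #include <unistd.h>

  int main(void)
  {
          int fd = socket(AF_INET, SOCK_STREAM, 0);
          struct sockaddr_in sa = { 0 };
          sa.sin_family = AF_INET;
          sa.sin_port = htons(80);                      /* assumed port */
          sa.sin_addr.s_addr = inet_addr("127.0.0.1");  /* assumed address */
          connect(fd, (struct sockaddr *)&sa, sizeof(sa));
          pause();        /* handshake is done; now never send anything */
          return 0;
  }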

My suggestion:

1. make the argument to the setsockopt be in seconds, or milliseconds.
2. if the server socket is closed, reset all pending connections.

Comments?

Felix


2007-11-02 02:52:08

by David Miller

Subject: Re: TCP_DEFER_ACCEPT issues

From: Felix von Leitner <[email protected]>
Date: Fri, 2 Nov 2007 02:33:21 +0100

> I am trying to use TCP_DEFER_ACCEPT in my web server.

You aren't going to reach many Linux kernel networking
experts on this mailing list. Please post your question
instead to [email protected], as that's where all
the networking developers hang out.

Thanks.

2007-11-02 07:26:23

by Eric Dumazet

Subject: Re: TCP_DEFER_ACCEPT issues

Felix von Leitner wrote:
> I am trying to use TCP_DEFER_ACCEPT in my web server.
>
> There are some operational problems. First of all: timeout handling. I
> would like to be able to set a timeout in seconds (or better:
> milliseconds) for how long the socket is allowed to sit there without
> data coming in. For high load situations, I have been enforcing
> timeouts in the range of 15 seconds, otherwise someone can DoS the
> server by opening a lot of connections and tying up data structures.
>
> It is still possible, of course, to tie up kernel memory this way, by
> not reacting to the FIN or RST packets and running into a timeout there,
> too, but that is partially tunable via sysctl.
>
> According to tcp(7), the int argument to TCP_DEFER_ACCEPT is in seconds.
> In the kernel code, it's converted to TCP timeout units. When I ran my
> server and connected without sending any data, nothing happened. No
> timeout. Minutes later, the connection was still there. Even worse:
> when I killed (!) the server process (thus closing the server socket),
> the client did not get a reset. Only when I typed something into the
> telnet session did I get a reset. This appears to be very broken.
>
> My suggestion:
>
> 1. make the argument to the setsockopt be in seconds, or milliseconds.
> 2. if the server socket is closed, reset all pending connections.
>
> Comments?
>

I agree that TCP_DEFER_ACCEPT is not worth it at the current time, once you
take into account bad guys or very slow networks.

1) Setting a timeout in the millisecond range (< 1000) is not very good,
because some clients may need much more time to send your server the data
(very long distance). So one-second granularity is OK.

2) After the timeout has elapsed, the server TCP stack has no socket
associated with your client's attempt, so closing the listening socket
won't be able to send a RST. I agree that a RST *should* be sent by the
server once the timeout is triggered.

A typical tcpdump of what happens with a TCP_DEFER_ACCEPT timeout of 20
seconds:

[1]08:52:47.480291 IP client.60930 > server.http: S 2498995442:2498995442(0)
win 5840 <mss 1460,sackOK,timestamp 2685904595 0,nop,wscale 2>
[2]08:52:47.480302 IP server.http > client.60930: S 1173302644:1173302644(0)
ack 2498995443 win 5840 <mss 1460>
[3]08:52:47.481669 IP client.60930 > server.http: . ack 1 win 5840

[4]08:52:50.757543 IP server.http > client.60930: S 1173302644:1173302644(0)
ack 2498995443 win 5840 <mss 1460>
[5]08:52:50.758953 IP client.60930 > server.http: . ack 1 win 5840

[6]08:52:56.760611 IP server.http > client.60930: S 1173302644:1173302644(0)
ack 2498995443 win 5840 <mss 1460>
[7]08:52:56.761886 IP client.60930 > server.http: . ack 1 win 5840

[8]08:53:08.771254 IP server.http > client.60930: S 1173302644:1173302644(0)
ack 2498995443 win 5840 <mss 1460>
[9]08:53:08.772514 IP client.60930 > server.http: . ack 1 win 5840

[10]08:53:32.782488 IP server.http > client.60930: S 1173302644:1173302644(0)
ack 2498995443 win 5840 <mss 1460>
[11]08:53:32.783754 IP client.60930 > server.http: . ack 1 win 5840

<a very long time, then client finally sends 2 bytes>

[12]08:59:30.509097 IP client.60930 > server.http: P 1:3(2) ack 1 win 5840
[13]08:59:30.509125 IP server.http > client.60930: R 1173302645:1173302645(0)
win 0


So TCP_DEFER_ACCEPT can send far more packets than needed. Packets 4, 6, 8
and 10 (and their corresponding acks 5, 7, 9 and 11) seem unnecessary, since
packets 1, 2 and 3 already completed a normal TCP three-way handshake.

We should only need to wait for data from the client before passing the new
socket to the listening application.


2007-11-02 22:19:23

by Felix von Leitner

Subject: Re: TCP_DEFER_ACCEPT issues

Thus spake Eric Dumazet ([email protected]):
> 1) Setting a timeout in the millisecond range (< 1000) is not very good,
> because some clients may need much more time to send your server the data
> (very long distance). So one-second granularity is OK.

I want millisecond accuracy for consistency. select and poll have it, and
we have a 1000 Hz timer, so we should expose that accuracy here as well.
I don't actually want sub-second timeouts, in case you were wondering.

> 2) After the timeout has elapsed, the server TCP stack has no socket
> associated with your client's attempt, so closing the listening socket
> won't be able to send a RST. I agree that a RST *should* be sent by the
> server once the timeout is triggered.

I don't see any evidence of a timeout happening at all.
I passed 1 as the argument to the setsockopt, so I'd expect a timeout to
happen pretty quickly. There was no connection reset until I Ctrl-C'd
the server 15 minutes (!) later.

> A typical tcpdump of what happens with a TCP_DEFER_ACCEPT timeout of 20
> seconds:

> [1]08:52:47.480291 IP client.60930 > server.http: S
> 2498995442:2498995442(0) win 5840 <mss 1460,sackOK,timestamp 2685904595
> 0,nop,wscale 2>
> [2]08:52:47.480302 IP server.http > client.60930: S
> 1173302644:1173302644(0) ack 2498995443 win 5840 <mss 1460>
> [3]08:52:47.481669 IP client.60930 > server.http: . ack 1 win 5840

> [4]08:52:50.757543 IP server.http > client.60930: S
> 1173302644:1173302644(0) ack 2498995443 win 5840 <mss 1460>
> [5]08:52:50.758953 IP client.60930 > server.http: . ack 1 win 5840

> [6]08:52:56.760611 IP server.http > client.60930: S
> 1173302644:1173302644(0) ack 2498995443 win 5840 <mss 1460>
> [7]08:52:56.761886 IP client.60930 > server.http: . ack 1 win 5840

> [8]08:53:08.771254 IP server.http > client.60930: S
> 1173302644:1173302644(0) ack 2498995443 win 5840 <mss 1460>
> [9]08:53:08.772514 IP client.60930 > server.http: . ack 1 win 5840

> [10]08:53:32.782488 IP server.http > client.60930: S
> 1173302644:1173302644(0) ack 2498995443 win 5840 <mss 1460>
> [11]08:53:32.783754 IP client.60930 > server.http: . ack 1 win 5840

> <a very long time, then client finally sends 2 bytes>

> [12]08:59:30.509097 IP client.60930 > server.http: P 1:3(2) ack 1 win 5840
> [13]08:59:30.509125 IP server.http > client.60930: R
> 1173302645:1173302645(0) win 0

I see this, too. If I connect and don't send anything, I expect the
kernel to drop the connection when the timeout is reached. Nothing like
that happens.

> So TCP_DEFER_ACCEPT can send far more packets than needed.

Only in the face of attackers, and only after the handshake. I could live
with that, if the timeout actually happened.

> We should only need to wait for data from the client before passing the new
> socket to the listening application.

Yes. And we should send a RST if no data comes in within the
timeout, which is not happening for me (2.6.23).
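
Until then, the best userspace can do is make its own timeout end in a RST
instead of a normal FIN close: set SO_LINGER to zero before closing, which
makes close() abort the connection. A sketch; the catch is that this only
works after accept(), which is exactly what TCP_DEFER_ACCEPT was supposed
to let us skip:

  #include <sys/socket.h>
  #include <unistd.h>

  /* Abort the connection: close() then sends a RST, not a FIN. */
  static void abort_connection(int c)
  {
          struct linger l = { 1, 0 };     /* linger on, zero timeout */
          setsockopt(c, SOL_SOCKET, SO_LINGER, &l, sizeof(l));
          close(c);
  }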

Felix

2007-11-04 17:18:28

by dean gaudet

Subject: Re: TCP_DEFER_ACCEPT issues

fwiw i also brought up the TCP_DEFER_ACCEPT problems at the end of last year:

http://www.mail-archive.com/[email protected]/msg28916.html

it's possible the final message in that thread is how we should define the
behaviour; i haven't tried the TCP_SYNCNT idea, though.
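
untested sketch of that idea, for reference -- cap the SYN/ACK retransmits
with TCP_SYNCNT on the listener, assuming that option even applies to
passive sockets (tcp(7) only documents it for outgoing connection attempts):

  #include <netinet/in.h>
  #include <netinet/tcp.h>        /* TCP_DEFER_ACCEPT, TCP_SYNCNT */
  #include <sys/socket.h>

  /* untested: combine deferred accept with a low retransmit count,
     hoping the kernel gives up on silent clients sooner */
  static int defer_with_cap(int fd, int seconds, int syn_retries)
  {
          if (setsockopt(fd, IPPROTO_TCP, TCP_DEFER_ACCEPT,
                         &seconds, sizeof(seconds)) == -1)
                  return -1;
          return setsockopt(fd, IPPROTO_TCP, TCP_SYNCNT,
                            &syn_retries, sizeof(syn_retries));
  }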

-dean