Linux (2.4.18) places incoming connection requests into the syn_recd state
when the server's backlog queue is full. I thought they were supposed to be
discarded if the server's backlog is full, forcing the client to
subsequently retransmit the request after it times out. Why does linux put
the server side into the syn_recd state when its backlog is full?
> Linux (2.4.18) places incoming connection requests into the syn_recd state
> when the server's backlog queue is full. I thought they were supposed to be
> discarded if the server's backlog is full, forcing the client to
> subsequently retransmit the request after it times out. Why does linux put
> the server side into the syn_recd state when its backlog is full?
Do you have tcp_syncookies on? And are you exceeding
the queue length configured by tcp_max_syn_backlog?
thanks,
Nivedita
[Please cc or post to netdev; most networking folk
don't subscribe to lkml.]
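For anyone who would rather check those two settings from a program than by hand, here is a minimal sketch in C. It assumes the usual /proc/sys/net/ipv4 locations for the sysctls; the helper name read_proc_long is made up for illustration, and tcp_syncookies is only present when the kernel was built with syncookie support.

```c
#include <stdio.h>

/* Hypothetical helper (not a kernel or libc API): read one integer
 * from a /proc/sys file. Returns the value, or -1 if the file is
 * missing or does not start with a number. */
long read_proc_long(const char *path)
{
    FILE *f = fopen(path, "r");
    long value = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

Calling read_proc_long("/proc/sys/net/ipv4/tcp_syncookies") and read_proc_long("/proc/sys/net/ipv4/tcp_max_syn_backlog") should give the two numbers asked about above; -1 means the file could not be read.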
Nivedita writes:
>
> Do you have tcp_syncookies on?
>
syncookies = 0.
>
>And are you exceeding the len as configured by tcp_max_syn_backlog?
>
max_syn_backlog = 256.
My server program sets its backlog to one and pauses ninety seconds before
accepting connections. Within that ninety second interval, I start three
client programs that do an active open to my server. I expect one of the
connections to get discarded when the server's connection backlog limit is
exceeded.
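A test server along these lines can be sketched in C as follows. This is not the original test program, only a guess at its shape: the function names are invented for illustration, and the port number is left to the caller.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Illustrative helper: open a TCP listener with the given backlog.
 * Returns the listening descriptor, or -1 on error. */
int open_listener(unsigned short port, int backlog)
{
    struct sockaddr_in addr;
    int one = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    /* Allow quick restarts of the test server on the same port. */
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one);

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Illustrative helper: wait, then accept and close up to `count`
 * connections. Returns how many were accepted. */
int accept_after_delay(int fd, unsigned delay_seconds, int count)
{
    int accepted = 0;

    sleep(delay_seconds);   /* clients connect during this pause */
    while (accepted < count) {
        int c = accept(fd, NULL, NULL);
        if (c < 0)
            break;
        close(c);
        accepted++;
    }
    return accepted;
}
```

The scenario described above would correspond to open_listener(port, 1) followed by accept_after_delay(fd, 90, 3), with three clients connecting during the pause while tcpdump and netstat watch the handshakes.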
Paul Albrecht wrote:
> My server program sets its backlog to one and pauses ninety seconds before
> accepting connections. Within that ninety second interval, I start three
> client programs that do an active open to my server. I expect one of the
> connections to get discarded when the server's connection backlog limit is
> exceeded.
We actually have two queues - the syn queue and the socket
accept queue. We move the connection request from the syn
queue to the accept queue of the socket once the 3 way
handshake is complete - i.e. once the state is ESTABLISHED.
If the syn queue is full, requests will get dropped and
the socket will not change state.
When you set the backlog to 1 in the listen call, what is
being capped is the accept queue. So I would expect your
server to allow only one of those requests in the accept
queue, and the kernel will drop the other two requests.
Actually, details, but we also apply some other conditions
before we actually drop the connection request - we try not to be
so harsh if the syn queue is still fairly empty.
I think that's so, at any rate :).
Nivedita
Nivedita Singhvi writes:
>
> When you set the backlog to 1 in the listen call, what is
> being capped is the accept queue. So I would expect your
> server to allow only one of those requests in the accept
> queue, and the kernel will drop the other two requests.
>
What you get when you set backlog to one is operating system dependent.
Tracing the flows with tcpdump, I get two clean handshakes so presumably,
for linux, one means two. The third connection request *isn't* dropped;
according to netstat, it's placed in the syn_recd state. I thought
berkeley-derived implementations followed the rule that if there is no room
on the backlog queue for the new connection, tcp ignored the received
syn.
>
> Actually, details, but we also apply some other conditions
> before we actually drop the connection request - we try not to be
> so harsh if the syn queue is still fairly empty..
>
Irrespective of whatever conditions linux applies, how can the connection
enter the syn_recd state if the backlog limit would be exceeded? What's the
client supposed to do with the syn/ack from the server? What's the server
supposed to do with the ack it gets back from the client?
Paul Albrecht wrote:
>>When you set the backlog to 1 in the listen call, what is
>>being capped is the accept queue. So I would expect your
>>server to allow only one of those requests in the accept
>>queue, and the kernel will drop the other two requests.
> What you get when you set backlog to one is operating system dependent.
You asked about Linux 2.4.18, and I was speaking
strictly for it. This is after all linux-netdev :).
> Tracing the flows with tcpdump, I get two clean handshakes so presumably,
> for linux, one means two. The third connection request *isn't* dropped;
Again, you're limiting the number of connection requests
that are allowed to wait in the *accept* queue, where
we move to once we're ESTABLISHED. You aren't limiting
a request sitting in the SYN queue.
> according to netstat, it's placed in the syn_recd state. I thought
> berkeley-derived implementations followed the rule that if there is no room
> on the backlog queue for the new connection, tcp ignored the received
> syn.
>>Actually, details, but we also apply some other conditions
>>before we actually drop the connection request - we try not to be
>>so harsh if the syn queue is still fairly empty..
>>
>
>
> Irrespective of whatever conditions linux applies, how can the connection
> enter the syn_recd state if the backlog limit would be exceeded? What's the
> client supposed to do with the syn/ack from the server? What's the server
> supposed to do with the ack it gets back from the client?
Er, complete the 3 way handshake? If the client gets the syn/ack, it
should send a SYN in response, and move to ESTABLISHED state. If the
server gets an ack back from the client, we process the ack. Our
processing involves moving the request from the syn queue to the
accept queue. Should the accept queue be full (which could occur
anytime - eg it could have occurred *after* the server recvd this
SYN) we would drop the request. Should the client then send data,
it would get a RST, letting it know our side (srvr) has had to
throw the connection away. It's quite possible that the accept queue
clears and a request can be moved from the SYN queue to the
accept queue in the interval of the handshake being completed (?)
If we get a SYN, it doesn't seem unreasonable that we enter
SYN_RCVD state :).
thanks,
Nivedita
Nivedita Singhvi wrote:
> Er, complete the 3 way handshake? If the client gets the syn/ack, it
> should send a SYN in response, and move to ESTABLISHED state. If the
~~~
my bad, sorry, that should be ACK, of course...
thanks,
Nivedita
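The two-queue behaviour described in this exchange can be pictured with a small toy model in C. It is only a sketch under stated assumptions, not kernel code: the struct and function names are invented, the full-queue test uses a strict greater-than comparison so that a backlog of 1 admits two connections (chosen to match the tcpdump trace reported earlier in the thread), and a request whose final ACK arrives while the accept queue is full is assumed to stay in the SYN queue rather than being discarded outright.

```c
#include <stdbool.h>

/* Toy model of one listening socket. These are NOT the kernel's
 * data structures; every name here is made up for illustration. */
struct toy_listener {
    int syn_queue_len;  /* requests in SYN_RECV, handshake unfinished */
    int syn_queue_max;  /* stands in for tcp_max_syn_backlog */
    int accept_len;     /* ESTABLISHED connections awaiting accept() */
    int backlog;        /* the value passed to listen() */
};

/* A SYN arrives: it is refused only when the SYN queue itself is full. */
bool toy_on_syn(struct toy_listener *l)
{
    if (l->syn_queue_len >= l->syn_queue_max)
        return false;       /* SYN dropped, no state change */
    l->syn_queue_len++;     /* SYN/ACK sent, request is in SYN_RECV */
    return true;
}

/* The final ACK arrives: move the request to the accept queue if
 * there is room. Assumption: otherwise the request stays queued. */
bool toy_on_ack(struct toy_listener *l)
{
    if (l->syn_queue_len == 0)
        return false;       /* nothing pending to complete */
    if (l->accept_len > l->backlog)
        return false;       /* accept queue full: ACK not acted on */
    l->syn_queue_len--;
    l->accept_len++;        /* connection is now ESTABLISHED */
    return true;
}

/* The server calls accept(): one connection leaves the accept queue. */
bool toy_accept(struct toy_listener *l)
{
    if (l->accept_len == 0)
        return false;
    l->accept_len--;
    return true;
}
```

With backlog set to 1 and three clients, the model lets two handshakes complete and leaves the third request in SYN_RECV until toy_accept() frees a slot, after which a later ACK for the third request succeeds.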
Nivedita Singhvi writes:
>
> Again, you're limiting the number of connection requests
> that are allowed to wait in the *accept* queue, where
> we move to once we're ESTABLISHED. You aren't limiting
> a request sitting in the SYN queue.
>
This statement is inconsistent with the description of this scenario in
Stevens's TCP/IP Illustrated. Specifically, continuing the handshake in the
TCP layer, i.e., sending a syn/ack and moving to the syn_recd state, is
incorrect if the limit of the server's socket backlog would be exceeded.
How do you account for this discrepancy between linux and other
berkeley-derived implementations?
"Paul Albrecht" writes:
> This statement is inconsistent with the description of this scenario in
> Stevens's TCP/IP Illustrated. Specifically, continuing the handshake in the
> TCP layer, i.e., sending a syn/ack and moving to the syn_recd state, is
> incorrect if the limit of the server's socket backlog would be exceeded.
> How do you account for this discrepancy between linux and other
> berkeley-derived implementations?
The 4.4BSD-Lite code described in Stevens is long outdated. All modern
BSDs (and probably most other Unixes too) do it in a similar way to what
Nivedita described. The keywords are "syn flood attack" and "DoS".
-Andi
Andi Kleen writes:
> "Paul Albrecht" writes:
>
> > This statement is inconsistent with the description of this scenario in
> > Stevens's TCP/IP Illustrated. Specifically, continuing the handshake in the
> > TCP layer, i.e., sending a syn/ack and moving to the syn_recd state, is
> > incorrect if the limit of the server's socket backlog would be exceeded.
> > How do you account for this discrepancy between linux and other
> > berkeley-derived implementations?
>
> The 4.4BSD-Lite code described in Stevens is long outdated. All modern
> BSDs (and probably most other Unixes too) do it in a similar way to what
> Nivedita described. The keywords are "syn flood attack" and "DoS".
And furthermore, IIRC, the current Linux networking code is not
Berkeley-derived, though an earlier version was.
-Doug
On 07 Jul 2003 18:25:17 -0400
Doug McNaught wrote:
> And furthermore, IIRC, the current Linux networking code is not
> Berkeley-derived, though an earlier version was.
The linux network stack was never BSD derived in any way.
[there are two header files that came from net2, but they do not
contain any code]
-Andi
Andi Kleen writes:
> On 07 Jul 2003 18:25:17 -0400
> Doug McNaught wrote:
>
> > And furthermore, IIRC, the current Linux networking code is not
> > Berkeley-derived, though an earlier version was.
>
> The linux network stack was never BSD derived in any way.
>
> [there are two header files that came from net2, but they do not
> contain any code]
OIDNRC, thanks for the correction. :)
Although, I distinctly remember seeing "Net-2" in one of the boot
messages in an early kernel (pre 1.0); was that just the header files'
doing?
-Doug
On 07 Jul 2003 20:17:57 -0400
Doug McNaught wrote:
> Although, I distinctly remember seeing "Net-2" in one of the boot
> messages in an early kernel (pre 1.0); was that just the header files'
> doing?
Net-2 was the name for a linux network code release too. The current code is net4
(actually more net5). But it has nothing to do with the similarly named
BSD release.
-Andi
Andi Kleen writes:
>
> The 4.4BSD-Lite code described in Stevens is long outdated.
>
I was referring to volume one subtitled: "The Protocols." It doesn't
describe implementation and the examples are not limited to bsd-lite.
>
> All modern BSDs (and probably most other Unixes too) do it in a similar way
> to what Nivedita described.
>
Linux doesn't operate in the manner Nivedita describes ... the tcp layer on
the server side moves to the syn_recd state, but doesn't accept the ack back
from the client. Instead it times out, resends its syn/ack to the client,
and again ignores the client's ack, ... Eventually, either there's room on
the backlog queue and the server side moves to the established state or the
server side stops resending its syn/ack. This doesn't seem to make much
sense. If the tcp layer can send the syn/ack it seems like it should
probably respond to the client's ack.
>
>The keywords are "syn flood attack" and "DoS".
>
I'd be interested in a more specific reference detailing the changes
required to the listen syscall as a consequence of the changes required for
avoidance of syn flood attacks. Thanks.
Doug McNaught said:
[...]
> Although, I distinctly remember seeing "Net-2" in one of the boot
> messages in an early kernel (pre 1.0); was that just the header files'
> doing?
Nope. There were NET, NET2, NET3, ... versions of the Linux native TCP/IP
stack. Just name coincidence.
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513
Andi Kleen writes:
>
> The 4.4BSD-Lite code described in Stevens is long outdated. All modern
> BSDs (and probably most other Unixes too) do it in a similar way to what
> Nivedita described. The keywords are "syn flood attack" and "DoS".
>
I have attached a copy of tcpdump output for two linux systems connected
over ether replaying the scenario for incoming request queue handling given
in Stevens's TCP/IP Illustrated Volume 1: The Protocols. What I don't
understand about the third handshake is this: if the server is going to send the
syn/ack in response to the client's initial syn, then why does the server
repeatedly ignore the subsequent ack from the client?