2007-12-16 01:31:55

by James Nichols

Subject: After many hours all outbound connections get stuck in SYN_SENT

Hello,

I have a Java application that makes a large number of outbound
webservice calls over HTTP/TCP. The hosts contacted are a fixed set
of about 2000 hosts and a web service call is made to each of them
approximately every 5 minutes by a pool of 200 Java threads. Over
time, on average a percentage of these hosts are unreachable for one
reason or another, usually because they are on wireless cell phone
NICs, so there is a persistent count of sockets in the SYN_SENT state
in the range of about 60-80. This is fine, as these failed connection
attempts eventually time out.

However, after approximately 38 hours of operation, all outbound
connection attempts get stuck in the SYN_SENT state. It happens
instantaneously, where I go from the baseline of about 60-80 sockets
in SYN_SENT to a count of 200 (corresponding to the # of java threads
that make these calls).
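
One way to track the SYN_SENT count described above is to tally the state column of /proc/net/tcp. The following is only a minimal sketch, assuming the usual Linux layout in which the fourth whitespace-separated field of each row is the hex state code (02 being SYN_SENT). The function name count_tcp_states is made up for illustration.

```python
from collections import Counter

# Hex state codes as they appear in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state from the text of /proc/net/tcp."""
    counts = Counter()
    # The first line is the column header, so skip it.
    for line in proc_net_tcp_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue  # blank or truncated row
        code = fields[3].upper()
        counts[TCP_STATES.get(code, code)] += 1
    return counts
```

Feeding it the contents of /proc/net/tcp once a minute and logging the SYN_SENT entry would show whether the jump from 60-80 to 200 is really instantaneous.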

When I stop and start the Java application, all the new outbound
connections still get stuck in SYN_SENT state. During this time, I am
still able to SSH to the box and run wget to Google, cnn, etc, so the
problem appears to be specific to the hosts that I'm accessing via the
webservices.

For a long time, the only thing that would resolve this was rebooting
the entire machine. Once I did this, the outbound connections could
be made successfully. However, very recently when I had one of these
incidents I disabled tcp_sack via:

echo "0" > /proc/sys/net/ipv4/tcp_sack

And the problem almost instantaneously resolved itself and outbound
connection attempts were successful. I hadn't attempted this before
because I assumed that if any of my network equipment or remote hosts
had a problem with SACK, it would never work. In my case, it worked
fine for about 38 hours before hitting a wall where no outbound
connections could be made.

I'm running kernel 2.6.18 on RedHat, but have had this problem occur
on earlier kernel versions (all 2.4 and 2.6). I know a lot of people
will say it must be the firewall, but I've had this issue with
different router vendors, firewall vendors, different co-location
facilities, NICs, and several other variables. I've totally rebuilt
every piece of the architecture at one time or another and still see
this issue. I've had this problem to varying degrees of severity for
the past 4 years or so. Up until this point, the only thing other
than a complete machine restart that fixes the problem is disabling
tcp_sack. When I disable it, the problem goes away almost
instantaneously.

Is there a kernel buffer or some data structure that tcp_sack uses
that gets filled up after an extended period of operation?
How can I debug this problem in the kernel to find out what the root cause is?

I've temporarily signed up on this list, but may opt-out if I can't
handle the traffic, so please CC me directly on any replies.

Thanks,
James Nichols


2007-12-17 23:14:48

by Jan Engelhardt

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT


On Dec 14 2007 15:39, James Nichols wrote:
>
>However, after approximately 38 hours of operation, all outbound
>connection attempts get stuck in the SYN_SENT state. It happens
>instantaneously, where I go from the baseline of about 60-80 sockets
>in SYN_SENT to a count of 200 (corresponding to the # of java threads
>that make these calls).
>
>When I stop and start the Java application, all the new outbound
- ^

at that point, try tcpdump. It may, or may not, show something.

>connections still get stuck in SYN_SENT state.
>During this time, I am still able to SSH to the box

Try uploading something through rsync+ssh, or scp+ssh. If it aborts
or hangs after a while, that may be a strong indication of a crappy
router. Also, I'd advise upgrading to something newer like >=
2.6.22. There was one of those SACK-broken routers around here too,
but it seems to have been replaced (or Linux got a mysterious fix
:-) as one day when I tried turning off SACK, rsync did not abort
anymore on new connections.

Though, if SACK was the problem, the problem would be much more
likely to appear after the handshake. YMMV.


>and run wget to Google, cnn, etc, so the
>problem appears to be specific to the hosts that I'm accessing via the
>webservices.
>
>For a long time, the only thing that would resolve this was rebooting
>the entire machine. Once I did this, the outbound connections could
>be made succesfully. However, very recently when I had once of these
>incidents I disabled tcp_sack via:
>
>echo "0" > /proc/sys/net/ipv4/tcp_sack
>
>And the problem almost instanteaously resolved itself and outbound
>connection attempts were succesful. I hadn't attempted this before
>because I assumed that if any of my network
>equipment or remote hosts had a problem with SACK, that it would never
>work. In my case, it worked fine for about 38 hours before hitting a
>wall where no outbound connections could be made.
>
>I'm running kernel 2.6.18 on RedHat, but have had this problem occur
>on earlier kernel versions (all 2.4 and 2.6).

2007-12-18 15:34:36

by James Nichols

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT

> Try uploading something through rsync+ssh, or scp+ssh. If it aborts
> or hangs after a while, that may be an strong indication of a crappy
> router. Also, I'd advise to upgrade to something newer like >=
> 2.6.22. There was one of those SACK-broken routers around here too,
> but it seemed to have been replaced (or linux got a mysterious fix
> :-) as one day when I tried turning off SACK, rsync didnot abort
> anymore on new connections.

Uploads to hosts other than the 2000 being hit by my app via
webservices run flawlessly to completion. I really don't think it is
my routing equipment: I've completely rebuilt the whole network
infrastructure several times in the 4 years that I've had this
problem, moved to different colo facilities, changed several other
network-related factors, and have still seen this problem.

It's very challenging for me to upgrade the kernel as this is a
production system and I need to run on whatever the latest RedHat
supports for support contract reasons. I probably could do it if
there are specific fixes that there is reason to believe will make my
problem go away.


> Though, if SACK was the problem, the problem would be much more
> likely to appear after the handshake. YMMV.

2007-12-18 16:05:23

by Jan Engelhardt

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT


On Dec 18 2007 10:34, James Nichols wrote:
>
>It's very challenging for me to upgrade the kernel as this is a
>production system and I need to run on whatever the latest RedHat
>supports for support contract reasons. I probably could do it if
>there are specific fixes that there is reason to believe will make my
>problem go away.
>
Well you could still blame Java. I am sure that if your program were
written in C, the problem could be narrowed down much more easily.

2007-12-18 16:45:50

by James Nichols

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT

> Well you could still blame Java. I am sure that if you program was C,
> the problem could be narrowed down much easier.

That may very well be true, but I can't rewrite the whole 500K line
application in C at this point. Plus, it's a web app which would be
"fun" to implement in C.

2007-12-18 17:20:11

by Jan Engelhardt

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT


On Dec 18 2007 11:45, James Nichols wrote:
>
>> Well you could still blame Java. I am sure that if you program was C,
>> the problem could be narrowed down much easier.
>
>That may very well be true, but I can't rewrite the whole 500K line
>application in C at this point. Plus, it's a web app which would be
>"fun" to implement in C.

Well I do not require you to do /that/, but you could try adhering to
the unix philosophy later on that one program should do (ideally) one
thing, and if the java blob already serves the webpage, then opening
sockets and doing xyz could probably live in another program.

2007-12-18 18:09:41

by James Nichols

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT

> >> Well you could still blame Java. I am sure that if you program was C,
> >> the problem could be narrowed down much easier.
> >
> >That may very well be true, but I can't rewrite the whole 500K line
> >application in C at this point. Plus, it's a web app which would be
> >"fun" to implement in C.
>
> Well I do not require you to do /that/, but you could try adhering to
> the unix philosophy later on that one program should do (ideally) one
> thing, and if the java blob already serves the webpage, then opening
> sockets and doing xyz could probably live in another program.

Fair enough. So if the application was written in C, how would that
make this problem any easier to narrow down?

2007-12-18 18:14:33

by Jan Engelhardt

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT


On Dec 18 2007 13:09, James Nichols wrote:
>
>> >> Well you could still blame Java. I am sure that if you program was C,
>> >> the problem could be narrowed down much easier.
>> >
>> >That may very well be true, but I can't rewrite the whole 500K line
>> >application in C at this point. Plus, it's a web app which would be
>> >"fun" to implement in C.
>>
>> Well I do not require you to do /that/, but you could try adhering to
>> the unix philosophy later on that one program should do (ideally) one
>> thing, and if the java blob already serves the webpage, then opening
>> sockets and doing xyz could probably live in another program.
>
>Fair enough. So if the application was written in C, how would that
>make this problem any easier to narrow down?
>
Here is a purely hypothetical (and in practice unlikely) idea:
Java opens up too many sockets (more than you really request) and the
kernel, for whatever reason, does not deliver packets to programs
which have maxed out their fds. Well, it would already help if the
java blob was split into multiple blobs (assuming the problem
persists), as the best testcase is the smallest possible one. So if
it is reproducible without the web blob, that is a great step.

2007-12-18 18:23:28

by James Nichols

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT

> Here is a purely hypothethical (and in practice unlikely) idea:
> Java opens up too many sockets (more than you really request) and the
> kernel, for whatever reason, does not deliver packets to programs
> which have maxed out their fds. Well it would already help if the
> java blob was split into multiple blobs (assuming the problem
> persists), as the best testcase is the smallest possible one. So if
> it is reproducable without the web blob, great step there.
>


Right, I don't disagree with you there. FWIW, I can disable entire
parts of the application and have already narrowed down reproduction
of this issue to the 200 threads that make the webservice calls, so it
doesn't have anything to do with any of the GUI or other background
services that my application executes.


You said:

> Well you could still blame Java. I am sure that if you program was C,
> the problem could be narrowed down much easier.

I'm curious to know how this problem would be easier to narrow down if
it were written in C.

2007-12-18 18:31:06

by Jan Engelhardt

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT


On Dec 18 2007 13:21, James Nichols wrote:
>
>> Well you could still blame Java. I am sure that if you program was C,
>> the problem could be narrowed down much easier.
>
>I'm curious to know how this problem would be easier to narrow down if
>it were written in C.
>
It depends on the developer's preference after all. libc is
'well-known' and who knows what bugs hide in the JRE (it was not so
open until recently). I have seen obscure things in Aurora Linux's
glibc whereby concurrent use of malloc() in threaded applications
will hang because some malloc_mutex was probably trashed. Which tells
me that the less code there is, the better.

In your specific case it is unlikely to be in userspace if the
socket remains in SYN_SENT, but I was speaking in general terms
(less is better). Usually, one would take out tcpdump and look for
SYN+ACK packets arriving. If they don't arrive, check the router. If
they do arrive, check the TCP input code path in the kernel. So much
for how to proceed :)

2007-12-18 18:32:55

by Eric Dumazet

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT

James Nichols wrote:
>> Here is a purely hypothethical (and in practice unlikely) idea:
>> Java opens up too many sockets (more than you really request) and the
>> kernel, for whatever reason, does not deliver packets to programs
>> which have maxed out their fds. Well it would already help if the
>> java blob was split into multiple blobs (assuming the problem
>> persists), as the best testcase is the smallest possible one. So if
>> it is reproducable without the web blob, great step there.
>>
>>
>
>
> Right, I don't disagree with you there. FWIW, I can disable entire
> parts of the application and have already narrowed down reproduction
> of this issue to the 200 threads that make the webservice calls, so it
> doesn't have anything to do with any of the GUI or other background
> services that my application executes.
>
>
> You said:
>
>
>> Well you could still blame Java. I am sure that if you program was C,
>> the problem could be narrowed down much easier.
>>
>
> I'm curious to know how this problem would be easier to narrow down if
> it were written in C.
>
Well... please don't start a flame war :(

Back to your SYN_SENT problem: I suppose the remote IP is known, so you
could probably post the result of a tcpdump here?

tcpdump -p -n -s 1600 -c 500 host IP_of_problematic_peer

Most probably the remote peer received too many attempts from you, and an
anti-DoS mechanism is dropping all SYN packets.

Ah well... I remember now that you mentioned the tcp_sack setting had an
effect, so forget the "Most probably..." and give us some tcpdump traces :)




2007-12-18 19:44:56

by James Nichols

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT

> Well... please dont start a flame war :(
>
> Back to your SYN_SENT problem, I suppose the remote IP is known, so you
> probably could post here the result of a tcdpump ?
>
> tcpdump -p -n -s 1600 host IP_of_problematic_peer -c 500
>
> Most probably remote peer received too many attempts from you, and a
> anti DOS mechanism is droping all SYN packets.
>
> Ah well... I remember now that you mentioned tcp_sack setting had an
> effect, so forget the "Most probably..." and give some tcpdump traces :)


I'm not trying to start a flame war. My situation is that I'm a
performance engineer and I have to restart the app every 38 hours due
to this issue; I'm not the person who wrote it. It's my job (and
the kernel's) to support whatever application is being run. Also,
I was seriously curious to know if there were any better tools for
debugging this in C that I wasn't aware of. Anyway...


I've run tcpdump for all IPs during this problem. I haven't tried
doing it for a single explicit IP address- due to the nature of the
workload it's very difficult to know which IPs will be hit at any
given moment. What I did see in the full IP captures is that the
returning ACKs don't show up in the packet capture. Unfortunately,
tcpdump reported that some packets were dropped during the capture.
Is it possible that the kernel was dropping the packets before they
could be captured by tcpdump?

Also, I have some doubts about it being the end points or an
intermediate router; please let me know if these are unreasonable:
1) We've completely replaced our routing equipment several times in
the past 4 years... totally different colos, router vendors, firewall
vendors, firewall rules, etc.
2) It occurs across all remote end points at the exact same time.
The endpoints are heterogeneous, run brain-dead OSes that don't do any
DoS detection, reboot at random times of the day, are geographically
distributed, are on different ISPs, etc.
3) Turning off tcp_sack instantaneously makes the problem go away. If
it were endpoints or a router, it seems like a stretch that removing a
single TCP option would make the problem instantly resolve itself in
so many places other than the originating host.

2007-12-18 20:29:18

by Chuck Ebbert

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT

On 12/18/2007 02:45 PM, James Nichols wrote:
>
> I've run tcpdump for all IPs during this problem. I haven't tried
> doing it for a single explicit IP address- due to the nature of the
> workload it's very difficult to know which IPs will be hit at any
> given moment. What I did see in the full IP captures is that the
> returning ACKs don't show up in the packet capture. Unfortunately,
> tcpdump reported that some packets were dropped during the capture.
> Is it possible that the kernel was dropping the packets before they
> could be captured by tcpdump?
>

The only way to get a reliable trace is to run a capture from a port
mirror on the switch the server is connected to. Capturing from inside
the server at the same time and comparing the traces could be useful.

2007-12-18 20:37:55

by Eric Dumazet

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT

James Nichols wrote:
>> Well... please dont start a flame war :(
>>
>> Back to your SYN_SENT problem, I suppose the remote IP is known, so you
>> probably could post here the result of a tcdpump ?
>>
>> tcpdump -p -n -s 1600 host IP_of_problematic_peer -c 500
>>
>> Most probably remote peer received too many attempts from you, and a
>> anti DOS mechanism is droping all SYN packets.
>>
>> Ah well... I remember now that you mentioned tcp_sack setting had an
>> effect, so forget the "Most probably..." and give some tcpdump traces :)
>
>
> I've run tcpdump for all IPs during this problem. I haven't tried
> doing it for a single explicit IP address- due to the nature of the
> workload it's very difficult to know which IPs will be hit at any
> given moment. What I did see in the full IP captures is that the
> returning ACKs don't show up in the packet capture. Unfortunately,
> tcpdump reported that some packets were dropped during the capture.
> Is it possible that the kernel was dropping the packets before they
> could be captured by tcpdump?

Yes, it can happen, because an active sniffer makes the stack use more
CPU cycles (timestamping, for example).

So you see outgoing SYN packets, but no SYN replies coming from the remote
peer? (You mention ACKs, but the first packet received from the remote
peer should be a SYN+ACK):

client->server SYN
server->client SYN+ACK
client->server ACK
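
The check implied by this handshake can be sketched in a few lines of Python: remember every bare SYN, and forget it again once a SYN+ACK arrives in the reverse direction. This is only an illustration. The packet tuples (source address, source port, destination address, destination port, flag set) are an assumed input format that would have to be produced from a capture by some other means, and unanswered_syns is a hypothetical name.

```python
def unanswered_syns(packets):
    """Return {(src, sport, dst, dport): syn_count} for connection
    attempts whose SYN never got a SYN+ACK in reply.

    packets is an iterable of (src, sport, dst, dport, flags) tuples,
    where flags is a collection of flag names such as {"SYN", "ACK"}.
    """
    pending = {}
    for src, sport, dst, dport, flags in packets:
        flags = set(flags)
        if flags == {"SYN"}:
            # Initial SYN or a retransmission of it.
            key = (src, sport, dst, dport)
            pending[key] = pending.get(key, 0) + 1
        elif {"SYN", "ACK"} <= flags:
            # The reply travels in the opposite direction, so swap ends.
            pending.pop((dst, dport, src, sport), None)
    return pending
```

If the result covers every attempt made during an incident, the replies are either not being sent or are being lost before the capture point, which is where comparing against a port-mirror capture helps.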


>
> Also, I have some doubts about it being the end points or an
> intermediate router, please let me know if these are unreasonable:
> 1) We've completely replaced our routing equipment several times in
> the past 4 years... totally different colos, router vendors, firewall
> vendors, firewall rules, etc.
> 2) It occurs across all remote end points at the exact same time.
> The endpoints are hetrogenous, run brain-dead OS's that don't do any
> DOS detection, reboot at random times of the day, are geographically
> distributed, are on different ISPs, etc. etc.
> 3) Turning of tcp_sack instantaneously makes the problem go away. If
> it were endpoints or a router, it seems like a stretch that removing a
> single TCP option would make the problem instantly resolve itself in
> so many places other than the originating host.

CC to netdev, where the Linux network guys might have an idea.

When the problem occurs, instead of restarting the application, please take a
tcpdump of, say, 10,000 packets.
Then turn off tcp_sack and take a second tcpdump sample, and make both samples
available to us.

If turning off tcp_sack makes the problem go away, why don't you turn it off
all the time?

2007-12-18 21:24:59

by Jan Engelhardt

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT


On Dec 18 2007 21:37, Eric Dumazet wrote:
>
> If turning off tcp_sack makes the problem go away, why dont you
> turn it off all the time ?
>
That would just be a workaround. I welcome the efforts to track this;
not all users have the time to do so.
Disabling tcp_sack also disables it kernel-wide, which, well... for
2.6.25 there is a TCPOPTSTRIP netfilter target slated with which
SACK could be stripped only for a given host list or for processes
of a given UID.

2007-12-19 16:53:22

by James Nichols

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT

> So you see outgoing SYN packets, but no SYN replies coming from the remote
> peer ? (you mention ACKS, but the first packet received from the remote
> peer should be a SYN+ACK),

Right, I meant to say SYN+ACK. I don't see them coming back.


> When the problem comes, instead of restarting the application, please take a
> tcpdump of say 10.000 packets.
> Then turn off tcp_sack and take a 2nd tcpdump sample, and make both samples
> available to us.

I can take these captures and take a look at the results.
Unfortunately, I don't think I'll be able to make the captures
available to the general public.



> If turning off tcp_sack makes the problem go away, why dont you turn it off
> all the time ?

Unfortunately, I think that will be the answer if I can't get any help
fixing this problem in the kernel. It's a bummer, because many of the
remote hosts my application communicates with are on wireless links,
so there may be performance implications to turning SACK off.

2007-12-19 17:07:45

by Eric Dumazet

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT

James Nichols wrote:
>> So you see outgoing SYN packets, but no SYN replies coming from the remote
>> peer ? (you mention ACKS, but the first packet received from the remote
>> peer should be a SYN+ACK),
>
> Right, I meant to say SYN+ACK. I don't see them coming back.

So... really unlikely to be a Linux problem, but ...

>
>
>> When the problem comes, instead of restarting the application, please take a
>> tcpdump of say 10.000 packets.
>> Then turn off tcp_sack and take a 2nd tcpdump sample, and make both samples
>> available to us.
>
> I can take these captures and take a look at the results.
> Unfortunately, I don't think I'll be able to make the captures
> available to the general public.

I don't understand. Why don't you mask the IPs as 192.168.X.Y, or
just ME and peer1, peer2, peer...
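
Masking of this kind can be automated over the text output of tcpdump -n. Here is a rough sketch, assuming one packet per line with dotted-quad IPv4 addresses. The anonymize helper and the sample addresses are invented for illustration, and any payload printed in the capture text would not be touched by it.

```python
import re

# Dotted-quad IPv4 address. tcpdump -n prints "addr.port", and the word
# boundary after the fourth octet leaves the ".port" suffix untouched.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def anonymize(lines, local_ip):
    """Replace local_ip with ME and every other address with peerN.

    Returns the rewritten lines plus the address-to-alias mapping, so
    the same peer keeps the same alias throughout the capture.
    """
    aliases = {local_ip: "ME"}

    def alias(match):
        ip = match.group(0)
        if ip not in aliases:
            # "ME" already occupies one slot, so the first peer is peer1.
            aliases[ip] = "peer%d" % len(aliases)
        return aliases[ip]

    return [IPV4.sub(alias, line) for line in lines], aliases
```

Keeping the returned mapping private lets the person who took the capture translate any peerN mentioned in a reply back to the real host.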

>
>
>
>> If turning off tcp_sack makes the problem go away, why dont you turn it off
>> all the time ?
>
> Unfortunately, I think that will be the answer if I can't get any help
> fixing this problem in the kernel. It's a bummer, because many of the
> remote hosts my application communicates with are on wireless links,
> so there may be performance implications to turning SACK off.
>

Random ideas :

1) Is your server behind a NET router or something ?
2) Are you sure you are not using connection tracking, and hit a limit on it ?

2007-12-19 17:43:26

by James Nichols

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT

On 12/19/07, Eric Dumazet <[email protected]> wrote:
> James Nichols wrote:
> >> So you see outgoing SYN packets, but no SYN replies coming from the remote
> >> peer ? (you mention ACKS, but the first packet received from the remote
> >> peer should be a SYN+ACK),
> >
> > Right, I meant to say SYN+ACK. I don't see them coming back.
>
> So... Really unlikely a linux problem, but ...
>


I don't know how you can be so sure. Turning tcp_sack off instantly
resolves the problem and all connections are successful. I can't
imagine even the most far-fetched scenario where a router or every
single remote endpoint would suddenly stop causing the problem just
by removing a single TCP option.


> > I can take these captures and take a look at the results.
> > Unfortunately, I don't think I'll be able to make the captures
> > available to the general public.
>
> I dont understand, why dont you change IPs to mask them with 192.168.X.Y, or
> just ME, and peer1, peer2, peer...

I will see if I can do that, but it's a major pain with 2000 hosts.
Plus, there is application data in the packets that I can't allow into
the public domain. I really don't think I can pull it off... I
literally would have to go through our legal department.

>
> Random ideas :
>
> 1) Is your server behind a NET router or something ?

What's a NET router? I am behind a Cisco router and a firewall, but
these network components have completely been replaced/rebuilt several
times in the 4+ years that we've had this problem. I've looked at the
logs there and neither are doing anything other than passing the
traffic along.

> 2) Are you sure you are not using connection tracking, and hit a limit on it ?

I'm using ip_conntrack, but the limit I have for max entries is 65K.
The most I've seen in there are a couple thousand- that was one of the
first things I monitored very closely.
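
Monitoring of this kind can be scripted by comparing the current conntrack entry count against the configured maximum. The sketch below rests on assumptions: that the ip_conntrack module exposes ip_conntrack_count and ip_conntrack_max under /proc/sys/net/ipv4/netfilter (newer kernels use nf_conntrack_* names under /proc/sys/net/netfilter instead), and conntrack_usage is a hypothetical helper name.

```python
import os


def conntrack_usage(sysctl_dir="/proc/sys/net/ipv4/netfilter"):
    """Return (count, limit, fraction_used) for the conntrack table.

    The file names below are an assumption for ip_conntrack-era
    kernels; adjust sysctl_dir and the names for other kernels.
    """
    def read_int(name):
        with open(os.path.join(sysctl_dir, name)) as f:
            return int(f.read().strip())

    count = read_int("ip_conntrack_count")
    limit = read_int("ip_conntrack_max")
    return count, limit, count / limit
```

Logging the returned fraction alongside the SYN_SENT count would show whether the table is anywhere near full at the moment connections start to hang.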

2007-12-19 17:59:16

by Jan Engelhardt

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT


On Dec 19 2007 12:43, James Nichols wrote:
>On 12/19/07, Eric Dumazet <[email protected]> wrote:
>> James Nichols wrote:
>> >> So you see outgoing SYN packets, but no SYN replies coming from the remote
>> >> peer ? (you mention ACKS, but the first packet received from the remote
>> >> peer should be a SYN+ACK),
>> >
>> > Right, I meant to say SYN+ACK. I don't see them coming back.
>>
>> So... Really unlikely a linux problem, but ...
>
>I don't know how you can be so sure. Turning tcp_sack off instantly
>resovles the problem and all connections are succesful. I can't
>imagine even the most far-fetched scenario where a router or every
>single remote endpoints would suddenly stop causing the problem just
>by removing a single TCP option.

The router could be sooo crappy that it drops all packets from
TCP streams that have SACK enabled and the client has opened
200+ SACK connections previously... something like that?

2007-12-19 18:03:50

by Eric Dumazet

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT

James Nichols wrote:
> On 12/19/07, Eric Dumazet <[email protected]> wrote:
>> James Nichols wrote:
>>>> So you see outgoing SYN packets, but no SYN replies coming from the remote
>>>> peer ? (you mention ACKS, but the first packet received from the remote
>>>> peer should be a SYN+ACK),
>>> Right, I meant to say SYN+ACK. I don't see them coming back.
>> So... Really unlikely a linux problem, but ...
>>
>
>
> I don't know how you can be so sure. Turning tcp_sack off instantly
> resovles the problem and all connections are succesful. I can't
> imagine even the most far-fetched scenario where a router or every
> single remote endpoints would suddenly stop causing the problem just
> by removing a single TCP option.
>
>
>>> I can take these captures and take a look at the results.
>>> Unfortunately, I don't think I'll be able to make the captures
>>> available to the general public.
>> I dont understand, why dont you change IPs to mask them with 192.168.X.Y, or
>> just ME, and peer1, peer2, peer...
>
> I will see if I can do that, but it's major pain with 2000 hosts.
> Plus, there is application data in the packets that I can't allow into
> the public domain. I really don't think I can pull it off... I
> literally would have to go through our legal department.

I still don't understand.

"tcpdump -p -n -s 1600 -c 10000" doesn't reveal user data at all.

Without any exact data from you, I am afraid nobody can help.

>
>> Random ideas :
>>
>> 1) Is your server behind a NET router or something ?
>
> What's a NET router? I am behind a Cisco router and a firewall, but
> these network components have completely been replaced/rebuilt several
> times in the 4+ years that we've had this problem. I've looked at the
> logs there and neither are doing anything other than passing the
> traffic along.

Typo, I meant NAT. Most routers doing NAT have some limits, timers, hacks...

>
>> 2) Are you sure you are not using connection tracking, and hit a limit on it ?
>
> I'm using ip_conntrack, but the limit I have for max entries is 65K.
> The most I've seen in there are a couple thousand- that was one of the
> first things I monitored very closely.

Now please try without the conntrack module. I saw many failures in the past
that were triggered by conntrack.

Do you have any firewall rules using netfilter modules like hashlimit?

2007-12-19 18:12:54

by James Nichols

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT

> The router could be sooo crappy that it drops all packets from
> TCP streams that have SACK enabled and the client has opened
> 200+ SACK connections previously... something like that?

I don't know, maybe. My router is a fairly new model Cisco and is
pretty major (i.e. pretty expensive), so it's not just a total piece
of crap. Plus, I never restart it when I see these issues. I just
turn tcp_sack off, the problem goes away, and I'm able to re-enable
tcp_sack a few hours later and it works fine until many hours later
when I see the SYN_SENT problem again.

2007-12-19 21:27:59

by Ilpo Järvinen

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT

On Wed, 19 Dec 2007, Eric Dumazet wrote:

> James Nichols a écrit :
> > On 12/19/07, Eric Dumazet wrote:
> > > James Nichols a écrit :
> > > > > So you see outgoing SYN packets, but no SYN replies coming from the
> > > > > remote
> > > > > peer ? (you mention ACKS, but the first packet received from the
> > > > > remote
> > > > > peer should be a SYN+ACK),
> > > > Right, I meant to say SYN+ACK. I don't see them coming back.
> > > So... Really unlikely a linux problem, but ...
> > >
> >
> >
> > I don't know how you can be so sure. Turning tcp_sack off instantly
> > resolves the problem and all connections are successful. I can't
> > imagine even the most far-fetched scenario where a router or every
> > single remote endpoint would suddenly stop causing the problem just
> > by removing a single TCP option.

You could also check if you can pull the same trick off by toggling
tcp_timestamps. It affects the TCP header as well.



--
i.

2007-12-20 15:51:31

by Glen Turner

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT

[speculation by network engineer -- not kernel hacker -- follows]

> The router could be sooo crappy that it drops all packets from
> TCP streams that have SACK enabled and the client has opened
> 200+ SACK connections previously... something like that?

As far as any third party is concerned the existing TCP connections
continue to have negotiated "SACK Permitted". Only new connections
will not negotiate this. So "router crappiness" promptly disappearing
doesn't seem too likely (one way I could see this happening is if the
Linux box sends an Ack for each connection and this clears out the Sack
data structures on the third party).

But I'd be very surprised if the router is acting as anything more
than a network-layer device. It might perhaps have some soft connection
state being used for generating accounting records. Being Cisco
it's probably a switch-router, so it might carry some per-port hard
state for validating source IP addresses and ARPs on each port.

The firewall is much more likely to be carrying per-flow Sack
state. The Cisco PIX had a bug with SACK handling (CSCse14419,
fixed in 7.0(7), 7.1(2.34), 7.2(2.2), 8.0(0.141) but perhaps it
has regressed). A simple trace either side of the firewall will
show the inconsistency between the TCP sequence number (which
gets randomised) and the Sack sequence number (which didn't).
You could disable the TCP Sequence Number Randomisation feature
and see if the fault reoccurs.
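
To make the suspected inconsistency concrete, here is a small worked example in Python. It is only a sketch of the arithmetic, assuming the firewall applies a fixed per-connection offset to the inside host's sequence numbers; the offset value, the function names, and the window numbers are invented for illustration and say nothing about how the PIX is actually implemented. If the ACK field is shifted back on the way in but the SACK block edges are not, the blocks land far outside anything the sender has transmitted.

```python
MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


def shift(seq, offset):
    """Add a (possibly negative) offset in 32-bit sequence space."""
    return (seq + offset) % MOD


def in_send_window(seq, snd_una, snd_nxt):
    """True if seq lies within [snd_una, snd_nxt], allowing for wraparound."""
    return (seq - snd_una) % MOD <= (snd_nxt - snd_una) % MOD


def sack_plausible(blocks, snd_una, snd_nxt):
    """A SACK block is only usable if both edges fall inside what was sent."""
    return all(in_send_window(left, snd_una, snd_nxt) and
               in_send_window(right, snd_una, snd_nxt)
               for left, right in blocks)


OFFSET = 0x5A5A0000            # hypothetical per-connection randomisation offset
snd_una, snd_nxt = 1000, 9000  # the sender's own (inside) view of its data

# On the outside, the peer acknowledges and SACKs in the shifted space.
wire_ack = shift(1000, OFFSET)
wire_sack = [(shift(4000, OFFSET), shift(6000, OFFSET))]

# Consistent rewriting: ACK and SACK edges are both shifted back.
consistent = [(shift(l, -OFFSET), shift(r, -OFFSET)) for l, r in wire_sack]
# Inconsistent rewriting: only the ACK field is shifted back.
inconsistent = wire_sack

print(shift(wire_ack, -OFFSET))                        # ACK as seen inside
print(sack_plausible(consistent, snd_una, snd_nxt))    # edges inside the window
print(sack_plausible(inconsistent, snd_una, snd_nxt))  # edges outside the window
```

In a real trace the same comparison would be made by eye: the SACK edges on the inside of the firewall should sit between the ACK number and the highest sequence number the inside host has sent.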

You should probably also investigate the Linux kernel,
especially the size and locks of the components of the Sack data
structures and what happens to those data structures after Sack is
disabled (presumably the Sack data structure is in some unhappy
circumstance, and disabling Sack allows the data to be discarded,
magically unclogging the box).

In the absence of the reporter wanting to dump the kernel's
core, how about a patch to print the Sack datastructure when
the command to disable Sack is received by the kernel?
Maybe just print the last 16b of the IP address?

Best wishes, Glen

2007-12-20 16:09:05

by James Nichols

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT

> I still don't understand.
>
> "tcpdump -p -n -s 1600 -c 10000" doesn't reveal user data at all.
>
> Without any exact data from you, I am afraid nobody can help.

Oh, I didn't see that you specified specific options. I'll still have
to anonymize 2000+ IP addresses, but I think there is an open source
tool that will do this for you.



> >> 2) Are you sure you are not using connection tracking, and hit a limit on it ?
> >
> > I'm using ip_conntrack, but the limit I have for max entries is 65K.
> > The most I've seen in there are a couple thousand- that was one of the
> > first things I monitored very closely.
>
> Now please try without the conntrack module. I have seen many failures in the
> past that were triggered by conntrack.
>
> Do you have any firewall rules that use netfilter modules like hashlimit?

I will have to look into this.

2007-12-20 16:37:57

by James Nichols

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT

> But I'd be very surprised if the router is acting as anything more
> than a network-layer device. It might perhaps have some soft connection
> state being used for generating accounting records. Being Cisco
> it's probably a switch-router, so it might carry some per-port hard
> state for validating source IP addresses and ARPs on each port.
>
> The firewall is much more likely to be carrying per-flow Sack
> state. The Cisco PIX had a bug with SACK handling (CSCse14419,
> fixed in 7.0(7), 7.1(2.34), 7.2(2.2), 8.0(0.141) but perhaps it
> has regressed). A simple trace either side of the firewall will
> show the inconsistency between the TCP sequence number (which
> gets randomised) and the Sack sequence number (which didn't).
> You could disable the TCP Sequence Number Randomisation feature
> and see if the fault reoccurs.

I do have TCP Sequence # Randomization enabled on my router. However,
if this was causing an issue, wouldn't it always occur and cause
connection issues, not just after 38 hours of correct operation? I
can look into turning this off, but I'll likely have to jump through
several hoops which will be challenging if I don't have a very clear
definitive reason why this is causing this issue. Plus, I've had this
problem with at least 2 other sets of network switches over the past 4
years. I'm actually running 7.0(6), which doesn't have the fix you
mentioned. If it really is possible that this issue wouldn't always
cause problems, but only after hours of successful operation, then I
could probably motivate the upgrade. I can try to set up a trace, but
this is a lot of work for other people in my organization, so it will
take quite some time.


> You should probably also investigate the Linux kernel,
> especially the size and locks of the components of the Sack data
> structures and what happens to those data structures after Sack is
> disabled (presumably the Sack data structure is in some unhappy
> circumstance, and disabling Sack allows the data to be discarded,
> magically unclogging the box).
>
> In the absence of the reporter wanting to dump the kernel's
> core, how about a patch to print the Sack datastructure when
> the command to disable Sack is received by the kernel?
> Maybe just print the last 16b of the IP address?

Given the fact that I've had this problem for so long, over a variety
of networking hardware vendors and colo-facilities, this really sounds
good to me. It will be challenging for me to justify a kernel core
dump, but a simple patch to dump the Sack data would be do-able.

2007-12-20 20:45:20

by Ilpo Järvinen

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT

On Thu, 20 Dec 2007, James Nichols wrote:

> > I still don't understand.
> >
> > "tcpdump -p -n -s 1600 -c 10000" doesn't reveal user data at all.
> >
> > Without any exact data from you, I am afraid nobody can help.
>
> Oh, I didn't see that you specified specific options. I'll still have
> to anonymize 2000+ IP addresses, but I think there is an open source
> tool that will do this for you.

Even a simple for loop in shell can do that. It's not that hard and
there's very little need for manual work! Ingredients: for, cut, grep
and sed.


--
i.

2007-12-20 20:57:23

by Justin Banks

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT

James Nichols wrote
> > I still don't understand.
> >
> > "tcpdump -p -n -s 1600 -c 10000" doesn't reveal user data at all.
> >
> > Without any exact data from you, I am afraid nobody can help.
>
> Oh, I didn't see that you specified specific options. I'll still have
> to anonymize 2000+ IP addresses, but I think there is an open source
> tool that will do this for you.


tcpdump -p -n -s 1600 -c 10000 | perl -pe 's/(\d+\.\d+\.\d+\.\d+)/HIDE.THIS.IP.ADDR/g'
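
The one-liner above maps every address to the same placeholder, so the peers can no longer be told apart in the trace. A minimal Python sketch that instead gives each distinct address a stable label (ME, peer1, peer2, ... as suggested earlier in the thread) might look like this. The function names are made up for illustration, and it assumes IPv4 addresses as they appear in tcpdump's text output with -n.

```python
import re

# Dotted-quad IPv4 address; in "10.0.0.5.40000" only "10.0.0.5" is matched,
# so the trailing port number printed by tcpdump is left in place.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def make_anonymizer(local_ip=None):
    """Return a function that replaces IPv4 addresses in a line of text.

    Each distinct address gets the same label every time it appears
    (peer1, peer2, ...). If local_ip is given, it is labelled ME.
    """
    labels = {}
    if local_ip is not None:
        labels[local_ip] = "ME"

    def anonymize(line):
        def replace(match):
            ip = match.group(0)
            if ip not in labels:
                # Number peers in order of first appearance, not counting ME.
                peers_so_far = sum(1 for v in labels.values() if v != "ME")
                labels[ip] = "peer%d" % (peers_so_far + 1)
            return labels[ip]
        return IPV4.sub(replace, line)

    return anonymize


def anonymize_lines(lines, local_ip=None):
    """Yield anonymized copies of the given lines, sharing one label table."""
    anonymize = make_anonymizer(local_ip)
    for line in lines:
        yield anonymize(line)
```

Feeding the tcpdump output through anonymize_lines(sys.stdin, local_ip=...) and printing each result keeps the per-peer structure of the trace while hiding the real addresses. The label table lives only in memory, so the mapping is not written anywhere.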

-justinb

--
Justin Banks
BakBone Software

2007-12-20 21:06:17

by Ilpo Järvinen

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT

On Thu, 20 Dec 2007, James Nichols wrote:

> > You should probably also investigate the Linux kernel,
> > especially the size and locks of the components of the Sack data
> > structures and what happens to those data structures after Sack is
> > disabled (presumably the Sack data structure is in some unhappy
> > circumstance, and disabling Sack allows the data to be discarded,
> > magically unclogging the box).

...Not sure you want to invent such a structure now. Yes, we have the per-skb
->sacked field, but in SYN_SENT very few things touch it at all, and they
just set it to zero (though even that would not be mandatory for
tcp_transmit_skb, IIRC; I checked that just a couple of days ago for
other reasons).

Another thing is rx_opt.sack_ok, which is just a couple of flag bits that
tell which TCP variant is in use (and it's mostly used only after the SYN
handshake completes). The rest (the actual SACK blocks) is in the ack_skb,
but again it has very little meaning in SYN_SENT state unless somebody is
crazy enough to add SACK blocks to SYN-ACKs :-).

> > In the absence of the reporter wanting to dump the kernel's
> > core, how about a patch to print the Sack datastructure when
> > the command to disable Sack is received by the kernel?
> > Maybe just print the last 16b of the IP address?
>
> Given the fact that I've had this problem for so long, over a variety
> of networking hardware vendors and colo-facilities, this really sounds
> good to me. It will be challenging for me to justify a kernel core
> dump, but a simple patch to dump the Sack data would be do-able.

If your symptoms really are that SYNs are leaving (if they show up in tcpdump,
they have certainly left the TCP code already) and SYN-ACKs are not showing up
even in something as early as tcpdump (so the TCP receive code has not executed
at that point yet), there's very little chance that Linux's TCP code has a bug
here. The only things that do anything in such a scenario are SYN generation
and SYN retransmission, and those are trivially verifiable from tcpdump.
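
That check can be done mechanically on a saved text trace. A minimal Python sketch follows, assuming the "Flags [S]" / "Flags [S.]" line format printed by recent tcpdump versions with -n (older versions print the flags differently, so the pattern would need adjusting). The function name is hypothetical.

```python
import re

# Matches e.g. "... IP 10.0.0.5.40000 > 203.0.113.7.80: Flags [S], seq ..."
# and captures source endpoint, destination endpoint and the flag string.
PACKET = re.compile(r"IP (\S+) > (\S+): Flags \[([^\]]*)\]")


def unanswered_syns(lines):
    """Return the (src, dst) endpoint pairs whose SYN never got a SYN-ACK.

    Endpoints are the "address.port" strings exactly as tcpdump prints them.
    Retransmitted SYNs for the same pair are counted once.
    """
    pending = set()
    for line in lines:
        match = PACKET.search(line)
        if not match:
            continue
        src, dst, flags = match.groups()
        if flags == "S":        # plain SYN leaving towards dst
            pending.add((src, dst))
        elif flags == "S.":     # SYN-ACK coming back: dst is the original sender
            pending.discard((dst, src))
    return pending
```

If every pair seen during an incident ends up in the returned set while the SYNs themselves look well-formed, the replies are being lost before they reach the capture point, which is the situation described above.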


--
i.

2007-12-21 04:58:57

by Glen Turner

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT


> I do have TCP Sequence # Randomization enabled on my router.

Huh? Do you mean a PIX blade in a Cisco switch-router chassis? It
would be very useful if you could be less vague about the
equipment in use.

> However,
> if this was causing an issue, wouldn't it always occur and cause
> connection issues, not just after 38 hours of correct operation?

That depends more on your customers' networking attributes
than you are sharing or perhaps even know. Perhaps your customer
base is very Windows-skewed and you simply aren't seeing any Sack
Permitted negotiations for the first 37.999 hours. Or
perhaps you've had a network glitch, and all of your
connections have done a Selective Ack, which the firewall
has trashed, leaving all the connections in a wacko state,
not just a few which you haven't noticed.

The actual failure mode needs a packet trace to determine,
but you should be able to do this yourself (or ask your
local network engineering staff).

If your firewall is trashing the Sack field, then it needs
to be fixed. Time to raise a case with the Cisco TAC and
ask them directly if your PIX version has bug CSCse14419.
You can't expect Sack to work when it's being fed trash,
so it is important to make sure that is not happening.

Cheers, Glen
#include <network_engineer.h>
#undef KERNEL_HACKER

2007-12-21 06:06:26

by Jan Engelhardt

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT


On Dec 20 2007 23:05, Ilpo Järvinen wrote:
>>
>> Given the fact that I've had this problem for so long, over a variety
>> of networking hardware vendors and colo-facilities, this really sounds
>> good to me. It will be challenging for me to justify a kernel core
>> dump, but a simple patch to dump the Sack data would be do-able.
>
>If your symptoms really are: SYNs leaving (if they show up in tcpdump, for
>sure they've left TCP code already) and SYN-ACK not showing up even in
>something as early as in tcpdump (for sure TCP side code didn't execute at
>that point yet), there's very little chance that Linux' TCP code has some
>bug in it, only things that do something in such scenario are the SYN
>generation and retransmitting SYNs (and those are trivially verifiable
>from tcpdump).
>
Take a machine, put two interfaces in it, configure it as a bridge (br0
over eth0 and eth1 without any assigned IP addresses), and put it between
the end node and the Cisco. Run tcpdump there, which should give an unbiased
view of the end node/Cisco traffic. Then perhaps also configure such a
listening bridge on the other side of the Cisco, e.g. on the link to
the internet, and watch that. Compare the two tcpdumps and see if
SACK got trashed.

2007-12-21 13:57:41

by James Nichols

Subject: Re: After many hours all outbound connections get stuck in SYN_SENT

> Huh? Do you mean a PIX blade in a Cisco switch-router chassis? It
> would be very useful if you could be less vague about the
> equipment in use.

Right, it's a PIX blade in a Cisco chassis. The PIX is running ASA version 7.0(6).



> That depends more on your customers' networking attributes
> than you are sharing or perhaps even know. Perhaps your customer
> base is very Windows-skewed and you simply aren't seeing any Sack
> Permitted negotiations for the first 37.999 hours. Or
> perhaps you've had a network glitch, and all of your
> connections have done a Selective Ack, which the firewall
> has trashed, leaving all the connections in a wacko state,
> not just a few which you haven't noticed.

I definitely see SACKs over the course of the 38 hours. I don't have
any Windows hosts, they are all running Linux except for a very small
number that run a proprietary OS and webserver. If the firewall were
to get trashed and hose the currently active connections, I would
expect that newly initiated connections would work.


> The actual failure mode needs a packet trace to determine,
> but you should be able to do this yourself (or ask your
> local network engineering staff).
>
> If your firewall is trashing the Sack field, then it needs
> to be fixed. Time to raise a case with the Cisco TAC and
> ask them directly if your PIX version has bug CSCse14419.
> You can't expect Sack to work when it's being fed trash,
> so it is important to make sure that is not happening.

I've pinged our dude that handles the PIX stuff to see about getting
an upgrade to 7.0(7). I should be able to get a packet trace, but it
may take some time. At this point I'm getting a lot of resistance and
people here telling me to just turn SACK off and not worry about what
is causing this issue, but I'd really like to get to the bottom of it.