LinuxLists.cc - poll() blocked / packets not received ?

2008-10-20 08:25:30

Subject: poll() blocked / packets not received ?

Hello,

We have an application that uses pthreads and (blocking) sockets.

When the application runs with one single thread in separate processes
(using fork()) we don't get any problem.

However when it's multithreaded, we sometimes get stuck while poll()ing
a socket (with events set to POLLIN). Even after the other side of the
connection has closed its side of the connection, we are still stuck
here. Adding a timeout only makes the poll() exit with 0, so we loop.

In case we don't loop the next operation is a recv() which will block as
well (which is consistent).

It seems like nothing is received on the socket any longer, but it's
difficult to verify with tcpdump since our server outputs something like
15MB at peak time with 150 hits per second.

We have Shorewall installed and enabled, but what seems strange is that
the problem depends on multithreading. It also occurs much more often on
the 4 core machines than on the 2 core ones (both with Hyperthreading
activated). We're using kernel 2.6.20-15-server (#2 SMP) provided by Ubuntu.

Any tip on how we could fix that or investigate further would be
appreciated. After one month of debugging we're really out of solutions now.

Best,
Nicolas

2008-10-20 10:15:57

by swivel

Subject: Re: poll() blocked / packets not received ?

On Mon, Oct 20, 2008 at 10:25:10AM +0200, Nicolas Cannasse wrote:
> Hello,
>
> We have an application that uses pthreads and (blocking) sockets.
>
> When the application runs with one single thread in separate processes
> (using fork()) we don't get any problem.
>
> However when it's multithreaded, we sometimes get stuck while poll()ing
> a socket (with events set to POLLIN). Even after the other side of the
> connection has closed its side of the connection, we are still stuck
> here. Adding a timeout only makes the poll() exit with 0, so we loop.
>
> In case we don't loop the next operation is a recv() which will block as
> well (which is consistent).
>
> It seems like nothing is longer received on the socket but it's
> difficult to verify with tcpdump since our server outputs something like
> 15MB at peak time with 150 hits per second.
>
> We have Shorewall installed and enabled, but what seems strange is that
> the problem depends on multithreading. It also occurs much more often on
> the 4 core machines than on a 2 core ones (both with Hyperthreading
> activated). We're using kernel 2.6.20-15-server (#2 SMP) provided by Ubuntu.
>
> Any tip on we could fix that or investigate further would be
> appreciated. After one month of debugging we're really out of solution now.
>
> Best,
> Nicolas

Your usage pattern is a very common one, I highly doubt you are experiencing
a kernel bug here or many people (including myself) would be complaining.

Shorewall sounds like it might be suspect, are FIN's not coming in when the
remote closes? You can look in the output of netstat to see what state the
TCP is in, still ESTABLISHED?

Have you tried just disabling the firewall to see if the problem
disappears?

Regards,
Vito Caputo

2008-10-20 10:47:14

by Nicolas Cannasse

Subject: Re: poll() blocked / packets not received ?

>> We have Shorewall installed and enabled, but what seems strange is that
>> the problem depends on multithreading. It also occurs much more often on
>> the 4 core machines than on a 2 core ones (both with Hyperthreading
>> activated). We're using kernel 2.6.20-15-server (#2 SMP) provided by Ubuntu.
>>
>> Any tip on we could fix that or investigate further would be
>> appreciated. After one month of debugging we're really out of solution now.
>>
>> Best,
>> Nicolas
>
> Your usage pattern is a very common one, I highly doubt you are experiencing
> a kernel bug here or many people (including myself) would be complaining.
>
> Shorewall sounds like it might be suspect, are FIN's not coming in when the
> remote closes? You can look in the output of netstat to see what state the
> TCP is in, still ESTABLISHED?

Yes, it's still ESTABLISHED, but we can't see the corresponding
connection on the other machine while running netstat. I'm not a TCP
expert, so I'm not sure in which case this can occur.

I agree with your comment in general, except that we have been running
the same application in a single-threaded environment for years without
running into this very specific problem.

The only logs we get in the dmesg are the following :

Either (a few every day) :

[10742708.006350] TCP: Treason uncloaked! Peer 213.209.177.218:32924/80
shrinks window 4049064122:4049064123. Repaired.

Or (more often) :

[10755036.856217] Shorewall:net2all:DROP:IN=eth0 OUT=
MAC=00:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:00 SRC=60.238.83.204
DST=XX.XX.XX.43 LEN=404 TOS=0x00 PREC=0x00 TTL=114 ID=12366 PROTO=UDP
SPT=1057 DPT=1434 LEN=384

Neither the SRC nor the DST IPs correspond to the connections that are
stalled, since those occur on the local network.

Best,
Nicolas

2008-10-20 11:39:52

by swivel

Subject: Re: poll() blocked / packets not received ?

On Mon, Oct 20, 2008 at 12:46:56PM +0200, Nicolas Cannasse wrote:
> >>We have Shorewall installed and enabled, but what seems strange is that
> >>the problem depends on multithreading. It also occurs much more often on
> >>the 4 core machines than on a 2 core ones (both with Hyperthreading
> >>activated). We're using kernel 2.6.20-15-server (#2 SMP) provided by
> >>Ubuntu.
> >>
> >>Any tip on we could fix that or investigate further would be
> >>appreciated. After one month of debugging we're really out of solution
> >>now.
> >>
> >>Best,
> >>Nicolas
> >
> >Your usage pattern is a very common one, I highly doubt you are
> >experiencing
> >a kernel bug here or many people (including myself) would be complaining.
> >
> >Shorewall sounds like it might be suspect, are FIN's not coming in when the
> >remote closes? You can look in the output of netstat to see what state the
> >TCP is in, still ESTABLISHED?
>
> Yes, it's still ESTABLISHED, but we can't see the corresponding
> connection on the other machine while running netstat. I'm not a TCP
> expert, so I'm not sure in which case this can occur.

If the end that's blocking still has the TCP in ESTABLISHED state, and
the other end doesn't have the TCP at all... you've already identified
why the one end is still ESTABLISHED. ESTABLISHED state won't be left
until the FIN is received from the other end, then entering CLOSE_WAIT
state.

When the other end of the TCP is _gone_ that leads me to believe a FIN
will not be coming, hence the indefinite ESTABLISHED state. Why it's
gone is a different question, maybe your problem is at the other end?
The end initiating a shutdown has to enter FIN_WAIT_1 then FIN_WAIT_2,
these transitions require the other side to leave ESTABLISHED (receive a
FIN then ACK) at the very least to proceed.

>
> I agree with your comment in general, except that we have been running
> the same application in single-thread environment for years without
> running into this very specific problem.
>

Perhaps when you run in multicore/threaded you are stressing the network
stacks at both ends more, including everything in-between? The
threading vs. single process relationship is probably not causal, but
just coincidental.

What is the protocol? Are there any timeouts to take care of these
situations? Do you schedule an alarm or use SO_RCVTIMEO to shutdown
dead connections and free up consumed threads?

TCP, being reliable, can block indefinitely; you can employ TCP keepalive
to change indefinite to quite a long time.
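
As an illustration of the two options mentioned above, here is a minimal
sketch in C for a Linux client. The function names set_recv_timeout and
enable_keepalive and their parameters are made up for this example; the
socket options themselves (SO_RCVTIMEO, SO_KEEPALIVE, TCP_KEEPIDLE,
TCP_KEEPINTVL, TCP_KEEPCNT) are the standard Linux ones. Note that
SO_RCVTIMEO only bounds recv(); it has no effect on poll().

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/time.h>

/* Hypothetical helper: make a blocking recv() on fd give up after
 * `seconds` without data (it then fails with EAGAIN/EWOULDBLOCK).
 * This does not change the behaviour of poll(). */
int set_recv_timeout(int fd, long seconds)
{
    struct timeval tv;

    assert(fd >= 0 && seconds > 0);
    tv.tv_sec = seconds;
    tv.tv_usec = 0;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);
}

/* Hypothetical helper: turn on TCP keepalive with per-socket timings.
 *   idle_s  - seconds of silence before the first probe is sent
 *   intvl_s - seconds between unanswered probes
 *   count   - unanswered probes before the connection is declared dead
 * Returns 0 on success, -1 on error (errno set by setsockopt). */
int enable_keepalive(int fd, int idle_s, int intvl_s, int count)
{
    int on = 1;

    assert(fd >= 0);
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof idle_s) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s, sizeof intvl_s) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof count) < 0)
        return -1;
    return 0;
}
```

With, say, enable_keepalive(fd, 60, 10, 3), a peer that has silently
vanished should be noticed after roughly 60 + 3 * 10 = 90 seconds, at
which point a thread blocked in poll() or recv() is woken with an error
instead of waiting forever. The values are only an example.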

Regards,
Vito Caputo

2008-10-20 12:13:50

by Nicolas Cannasse

Subject: Re: poll() blocked / packets not received ?

Vito Caputo wrote:
> When the other end of the TCP is _gone_ that leads me to believe a FIN
> will not be coming, hence the indefinite ESTABLISHED state. Why it's
> gone is a different question, maybe your problem is at the other end?
> The end initiating a shutdown has to enter FIN_WAIT_1 then FIN_WAIT_2,
> these transitions require the other side to leave ESTABLISHED (receive a
> FIN then ACK) at the very least to proceed.
>
>> I agree with your comment in general, except that we have been running
>> the same application in single-thread environment for years without
>> running into this very specific problem.
>>
>
> Perhaps when you run in multicore/threaded you are stressing the network
> stacks at both ends more, including everything in-between? The
> threading vs. single process relationship is probably not causal, but
> just coincidental.

Not sure why this should happen, since it's the same servers. The only
thing that changes is the part of the software that we use to handle our
server requests. It's either embedded in Apache 1.3 with fork() or a
standalone multithreaded server which acts as an Apache backend.

So the only difference for networking is that we have additional
Apache<->MT-Server communications, but they should be on 127.0.0.1 so I
think they are purely software and not hardware-related.

> What is the protocol? Are there any timeouts to take care of these
> situations? Do you schedule an alarm or use SO_RCVTIMEO to shutdown
> dead connections and free up consumed threads?

The protocol is MySQL. Since we had the problem with libmysqlclient, we
reimplemented the protocol from scratch to make sure that it was not
software-related.

What happens at the protocol-level is the following :

a) we connect to the server
b) we make several requests and get answers back
c) at some (random+rare) point - always after making a request - we're
stuck while waiting for the answer.

Sadly, this can happen inside a transaction while we hold the lock on
some shared resource. This will lock the whole website until we run out
of file descriptors due to accept'ed pending connections. In that case we
get an exception and the server (the multithreaded one, not MySQL)
restarts, which releases the lock.

In some other cases when we don't hold a lock, the thread remains
blocked in poll() as I described it. After a timeout (I think it's 28800
seconds) the MySQL server closes the connection. The client - which is
waiting in poll() - does not have any timeout activated (it's relying on
the mysql server). But it doesn't notice that the socket has been closed
either.

We investigated a lot about signals since poll() can also be interrupted
by Garbage Collector and child process signals, but we correctly handle
EINTR everywhere it's needed. So unless there's a possibility that
interrupting poll() with a signal might somehow consume the data, this
is not the problem here.

> TCP being reliable can block indefinitely, you can employ TCP keepalive
> to change indefinite to quite a long time.

Sure. We could also use a client timeout, but we don't want to hold the
lock more than required, and we can't tell the difference between a
given request that would take too much time to complete and a lost
connection.

Hope we can somehow understand what's going on.
Thanks for the answers so far,

Best,
Nicolas

2008-10-20 12:39:26

by Nicolas Cannasse

Subject: Re: poll() blocked / packets not received ?

> TCP being reliable can block indefinitely, you can employ TCP keepalive
> to change indefinite to quite a long time.

Ok, funny thing is that we just found what is occurring...

We had a process that was on a regular basis doing the following :

conntrack -F

This was done in order to prevent the table from growing too big, because
we were reaching the maximum size as reported by :

/proc/sys/net/ipv4/netfilter/ip_conntrack_max
and
/proc/sys/net/ipv4/netfilter/ip_conntrack_count

It seems that when there are active connections, this breaks netfilter
and stops packets from being delivered to the socket.

At least I will have a nice sleep tonight.

Best,
Nicolas

2008-10-20 15:53:47

by David Schwartz

Subject: RE: poll() blocked / packets not received ?

Nick Cannasse wrote:

> Ok, funny thing is that we just found what is occurring...
>
> We had a process that was on a regular basis doing the following :
>
> conntrack -F
>
> This was done in order to prevent the table to grow too big, because we
> were reaching the maximum size as told by :
>
> /proc/sys/net/ipv4/netfilter/ip_conntrack_max
> and
> /proc/sys/net/ipv4/netfilter/ip_conntrack_count
>
> Seems like when there are active connections, this will break netfilter
> and stop delivering packets to the socket.
>
> At least I will have nice sleep tonight.

Note that this solved your symptom, not your problem. You actually have two
problems:

1) You rely on TCP to detect a lost connection even by a side that will
never transmit any data. TCP simply does not do this. If you are not trying
to send data, you are not assured that a lost connection will be detected.
(You either need a timeout, or you need to send or dribble some data,
depending on the protocol.)

2) You hold a lock on a shared resource while you wait for a reply over a
network. If this is a low-level "block and wait indefinitely" lock, this
will cause many threads to line up behind a slow/stuck thread. The right fix
depends on your circumstances, but you need to use a synchronization
primitive that is suitable. (You need to be able to use multiple connections
or defer operations without holding a thread.)

With both of these bugs, you are vulnerable to precisely the scenario you
observed. The TCP connection close packets were lost (in this case due to
premature expiration of the connection tracking, but other things can do
it, such as the server rebooting), TCP could not detect the lost connection
because you never sent any data, so one thread blocked forever, and other
threads got in line behind it.

DS

2008-10-20 17:25:52

by Nicolas Cannasse

Subject: Re: poll() blocked / packets not received ?

David Schwartz wrote:
>> At least I will have nice sleep tonight.
>
> Note that this solved your symptom, not your problem. You actually have two
> problems:
>
> 1) You rely on TCP to detect a lost connection even by a side that will
> never transmit any data. TCP simply does not do this. If you are not trying
> to send data, you are not assured that a lost connection will be detected.
> (You either need a timeout, or you need to send or dribble some data,
> depending on the protocol.)
>
> 2) You hold a lock on a shared resource while you wait for a reply over a
> network. If this is a low-level "block and wait indefinitely" lock, this
> will cause many threads to line up behind a slow/stuck thread. The right fix
> depends on your circumstances, but you need to use a synchronization
> primitive that is suitable. (You need to be able to use multiple connections
> or defer operations without holding a thread.)

I agree with both points, but I can't modify the MySQL protocol to
implement that.

For (1) I can't add the timeout since I have no way to differentiate
between a lost connection and a request that takes time to execute. I'll
maybe check if the protocol allows pings while waiting for the request
result, but I'm not sure it does.

For (2) the shared resource is on the database side, not on the server
side. It's the transaction that has some rows locked. I have no
solution for that.

Best,
Nicolas

2008-10-20 23:22:25

by David Schwartz

Subject: RE: poll() blocked / packets not received ?

> I agree with both points, but I can't modify the MySQL protocol to
> implement that.

> For (1) I can't add the timeout since I have no way to differentiate
> between a lost connection and a request that takes time to execute. I'll
> maybe check if the protocol allow pings while waiting for the request
> result, but I'm not sure it does.

Sure you can. For example, you can run a proxy on both the server and the
client, with the two proxies speaking a protocol that carries the MySQL
protocol. The protocol between the server and the client can include two
types of messages, one being regular data (which the proxies pass to the
server and client software) and one being a ping (which the proxies use
internally to decide when to drop their connections). Each proxy can 'ping'
the other as often as required and drop both connections if the ping fails
to go through. This will ensure that your program detects a connection loss
rapidly.
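
The two message types described above could be framed along these lines.
This is only a sketch under assumptions: the 5-byte header layout (one
type byte followed by a 4-byte big-endian payload length) and the names
frame_type, frame_encode and frame_decode_header are invented for
illustration and are not part of MySQL or of any existing proxy.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire format between the two proxies:
 *   byte 0     message type (FRAME_DATA or FRAME_PING)
 *   bytes 1-4  payload length, big-endian
 *   bytes 5..  payload (MySQL traffic for DATA, empty for PING) */
enum frame_type { FRAME_DATA = 0, FRAME_PING = 1 };
#define FRAME_HDR_LEN 5

/* Write one frame into out[0..cap). Returns the number of bytes
 * written, or 0 if the buffer is too small. */
size_t frame_encode(uint8_t *out, size_t cap, enum frame_type type,
                    const void *payload, uint32_t len)
{
    uint32_t be = htonl(len);

    assert(out != NULL);
    if (cap < FRAME_HDR_LEN || cap - FRAME_HDR_LEN < len)
        return 0;
    out[0] = (uint8_t)type;
    memcpy(out + 1, &be, sizeof be);
    if (len > 0)
        memcpy(out + FRAME_HDR_LEN, payload, len);
    return FRAME_HDR_LEN + (size_t)len;
}

/* Parse a frame header from the first n received bytes.
 * Returns 1 on success, 0 if more bytes are needed, -1 on a bad type. */
int frame_decode_header(const uint8_t *in, size_t n,
                        enum frame_type *type, uint32_t *len)
{
    uint32_t be;

    if (n < FRAME_HDR_LEN)
        return 0;
    if (in[0] != FRAME_DATA && in[0] != FRAME_PING)
        return -1;
    memcpy(&be, in + 1, sizeof be);
    *type = (enum frame_type)in[0];
    *len = ntohl(be);
    return 1;
}
```

A proxy built on this would forward DATA payloads untouched, answer each
PING itself, and close both of its connections when no frame of either
type has arrived within its chosen deadline.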

There are many other possible solutions.

> For (2) the shared resources is on the database side, not on the server
> side. It's the transaction that have some rows locked. I have no
> solution for that.

That doesn't fit your problem description. Presumably the server detected
the loss of the connection and so would have released any resources it was
holding that were associated with it. The problem in this case was that the
client couldn't detect the loss of the connection.

> Best,
> Nicolas

Good luck.

DS

2008-10-21 05:13:22

by Willy Tarreau

Subject: Re: poll() blocked / packets not received ?

On Mon, Oct 20, 2008 at 07:24:14PM +0200, Nicolas Cannasse wrote:
> For (1) I can't add the timeout since I have no way to differentiate
> between a lost connection and a request that takes time to execute.

Not only can you, but you *must*. Any service assuming an infinite timeout
is doomed to fail. If you know that one request can take as long as one
minute for instance, then use a 2-minute timeout. The day a firewall
between client and server fails, all pending requests will be cleaned up
automatically, and you'll be happy not to have to come in and restart
the service to flush them.

There's a huge difference between using a very large timeout and none at
all.

Willy