2002-02-27 13:45:36

by Bjorn Wesen

[permalink] [raw]
Subject: What is TCPRenoRecoveryFail ?

I have a TCP connection that is sending bulk data from a Linux 2.4.17
machine to a client. At some point, one of the packets from the Linux
machine is lost, so the client asks for a retransmit by acking the last
received correct packet. Then the Linux machine just keeps filling the
clients open window, ignoring that and subsequent retransmit requests,
never retransmitting any data.

Around the time of the packet loss happened, the counter
TCPRenoRecoveryFail increased by one, but I'm not sufficiently into the
TCP code to figure out why that happens and if that is the reason why
Linux stop retransmitting anything.. any ideas ?

The client is a Windows machine, but the packets it sends does not seem
broken in any way.

/BW


2002-02-27 14:27:29

by bert hubert

[permalink] [raw]
Subject: Re: What is TCPRenoRecoveryFail ?

On Wed, Feb 27, 2002 at 01:46:55PM +0000, Bjorn Wesen wrote:
> I have a TCP connection that is sending bulk data from a Linux 2.4.17
> machine to a client. At some point, one of the packets from the Linux
> machine is lost, so the client asks for a retransmit by acking the last
> received correct packet. Then the Linux machine just keeps filling the
> clients open window, ignoring that and subsequent retransmit requests,
> never retransmitting any data.

Please show a tcpdump -v of this happening, including the initial SYN
packets. I strongly suspect something in your network of mucking with TCP
options.

> Around the time of the packet loss happened, the counter
> TCPRenoRecoveryFail increased by one, but I'm not sufficiently into the
> TCP code to figure out why that happens and if that is the reason why
> Linux stop retransmitting anything.. any ideas ?

See RFC2001. Might well be related.

Regards,

bert

--
http://www.PowerDNS.com Versatile DNS Software & Services
http://www.tk the dot in .tk
http://lartc.org Linux Advanced Routing & Traffic Control HOWTO

2002-02-27 14:46:34

by Andi Kleen

[permalink] [raw]
Subject: Re: What is TCPRenoRecoveryFail ?

Bjorn Wesen <[email protected]> writes:

> Around the time of the packet loss happened, the counter
> TCPRenoRecoveryFail increased by one, but I'm not sufficiently into the
> TCP code to figure out why that happens and if that is the reason why
> Linux stop retransmitting anything.. any ideas ?


TCPRenoRecovery fail just means that fast retransmit didn't help and
it is going into a full retransmit. Fast retransmit is a short cut retransmit
mechanism that works faster when only a few packets got lost. You lost
more than 1. You have to find out why you are losing them.

-Andi

2002-02-27 17:04:16

by Nivedita Singhvi

[permalink] [raw]
Subject: Re: What is TCPRenoRecoveryFail ?


> I have a TCP connection that is sending bulk data from
> a Linux 2.4.17 machine to a client. At some point, one of
> the packets from the Linux machine is lost, so the client
> asks for a retransmit by acking the last received correct
> packet. Then the Linux machine just keeps filling the
> clients open window, ignoring that and subsequent
> retransmit requests, never retransmitting any data.

Windows and Linux dont agree on DSACK options in some
situations, leading to symptoms you describe sometimes..

You can try echoing 0 into /proc/sys/net/ipv4/tcp_dsack
and see if that avoids the stall (just to identify the problem,
this is not the real way to go..).

Stats and a trace would be useful..

> /BW

thanks,
Nivedita

2002-02-27 17:09:14

by David Miller

[permalink] [raw]
Subject: Re: What is TCPRenoRecoveryFail ?

From: "Nivedita Singhvi" <[email protected]>
Date: Wed, 27 Feb 2002 09:03:27 -0800

Windows and Linux dont agree on DSACK options in some
situations, leading to symptoms you describe sometimes..

There are no options to negotiate DSACK, SACK implies DSACK will cause
no harm. An pre-DSACK implementation of SACK should effectively
treat the DSACKs as nops, ie. they are harmless.

2002-02-27 17:26:30

by Nivedita Singhvi

[permalink] [raw]
Subject: Re: What is TCPRenoRecoveryFail ?




> There are no options to negotiate DSACK,

Didnt mean option negotiation, the SACK permitted
works fine

> SACK implies DSACK will cause no harm. An
> pre-DSACK implementation of SACK should
> effectively treat the DSACKs as nops, ie.
> they are harmless.

You would hope ;), but there is some bug in some Windoze I suspect
that gets confused by a duplicate SACK in some situation. I havent
been able to reproduce this on my hw, but have seen a strange trace
a while ago that was pretty similar.

thanks,
Nivedita


2002-02-27 18:23:18

by Bjorn Wesen

[permalink] [raw]
Subject: Re: What is TCPRenoRecoveryFail ?

On Wed, 27 Feb 2002, bert hubert wrote:
> On Wed, Feb 27, 2002 at 01:46:55PM +0000, Bjorn Wesen wrote:
> > I have a TCP connection that is sending bulk data from a Linux 2.4.17
> > machine to a client. At some point, one of the packets from the Linux
> > machine is lost, so the client asks for a retransmit by acking the last
> > received correct packet. Then the Linux machine just keeps filling the
> > clients open window, ignoring that and subsequent retransmit requests,
> > never retransmitting any data.
>
> Please show a tcpdump -v of this happening, including the initial SYN
> packets. I strongly suspect something in your network of mucking with TCP
> options.

Ok this is mangled by the email client but i attached the binary dump of
the relevant packets. The dump is taken on the windows machine, which
complicates the analysis because perhaps the network card itself is
screwing up, but that is a low probability because all succeeding
communication after the TCP timeout works fine (and parallell
communication). tcpdump says 'squid' because the src port happened to be
3128..

23:46:43.009000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: . [tcp
sum ok] 4269884068:4269885528(1460) ack 7148250 win 5840 (DF) (ttl 64, id
37958, len 1500)
23:46:43.009000 dh10-13-18-213.axis.se.squid > 10.13.18.46.http: . [tcp
sum ok] ack 1460 win 8760 (DF) (ttl 128, id 54605, len 40)
23:46:43.009000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: . [tcp
sum ok] 1460:2920(1460) ack 1 win 5840 (DF) (ttl 64, id 37959, len 1500)
23:46:43.009000 dh10-13-18-213.axis.se.squid > 10.13.18.46.http: . [tcp
sum ok] ack 2920 win 8760 (DF) (ttl 128, id 54861, len 40)
23:46:43.010000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: . [tcp
sum ok] 2920:4380(1460) ack 1 win 5840 (DF) (ttl 64, id 37960, len 1500)
23:46:43.010000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: . [tcp
sum ok] 5840:7300(1460) ack 1 win 5840 (DF) (ttl 64, id 37962, len 1500)
^^^^-- the last data packet windows receives

23:46:43.010000 dh10-13-18-213.axis.se.squid > 10.13.18.46.http: . [tcp
sum ok] ack 4380 win 8760 (DF) (ttl 128, id 55117, len 40)
^^^^-- first in a row of futile ACK's at relative seq 4380

23:46:43.011000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: . [tcp
sum ok] 7300:8760(1460) ack 1 win 5840 (DF) (ttl 64, id 37963, len 1500)
23:46:43.011000 dh10-13-18-213.axis.se.squid > 10.13.18.46.http: . [tcp
sum ok] ack 4380 win 8760 (DF) (ttl 128, id 55373, len 40)
23:46:43.011000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: . [tcp
sum ok] 8760:10220(1460) ack 1 win 5840 (DF) (ttl 64, id 37964, len 1500)
23:46:43.011000 dh10-13-18-213.axis.se.squid > 10.13.18.46.http: . [tcp
sum ok] ack 4380 win 8760 (DF) (ttl 128, id 55629, len 40)
23:46:43.012000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: P [tcp
sum ok] 10220:11680(1460) ack 1 win 5840 (DF) (ttl 64, id 37965, len 1500)
23:46:43.012000 dh10-13-18-213.axis.se.squid > 10.13.18.46.http: . [tcp
sum ok] ack 4380 win 8760 (DF) (ttl 128, id 55885, len 40)

.. long timeout here until the server finally gives up the connection ..

23:56:46.111000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: F [tcp
sum ok] 11680:11680(0) ack 1 win 5840 (DF) (ttl 64, id 37966, len 40)

I just hope I'm not doing anything stupid.. well that's best of course
because it's less job to make it work :)

The seq numbers are close to 4G but do not wrap, btw.

-BW


Attachments:
tcp2.tcpdump (10.97 kB)

2002-02-27 23:01:20

by bert hubert

[permalink] [raw]
Subject: Re: What is TCPRenoRecoveryFail ?

On Wed, Feb 27, 2002 at 06:33:43PM +0000, Bjorn Wesen wrote:

> > Please show a tcpdump -v of this happening, including the initial SYN
> > packets. I strongly suspect something in your network of mucking with TCP
> > options.
>
> Ok this is mangled by the email client but i attached the binary dump of
> the relevant packets. The dump is taken on the windows machine, which
> complicates the analysis because perhaps the network card itself is
> screwing up, but that is a low probability because all succeeding

This dump is somewhat inconclusive. I miss the SYN packets, but I'm
especially interested in this bit:

> 23:46:43.011000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: . [tcp
> sum ok] 7300:8760(1460) ack 1 win 5840 (DF) (ttl 64, id 37963, len 1500)
> 23:46:43.011000 dh10-13-18-213.axis.se.squid > 10.13.18.46.http: . [tcp
> sum ok] ack 4380 win 8760 (DF) (ttl 128, id 55373, len 40)
> 23:46:43.011000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: . [tcp
> sum ok] 8760:10220(1460) ack 1 win 5840 (DF) (ttl 64, id 37964, len 1500)
> 23:46:43.011000 dh10-13-18-213.axis.se.squid > 10.13.18.46.http: . [tcp
> sum ok] ack 4380 win 8760 (DF) (ttl 128, id 55629, len 40)
> 23:46:43.012000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: P [tcp
> sum ok] 10220:11680(1460) ack 1 win 5840 (DF) (ttl 64, id 37965, len 1500)
> 23:46:43.012000 dh10-13-18-213.axis.se.squid > 10.13.18.46.http: . [tcp
> sum ok] ack 4380 win 8760 (DF) (ttl 128, id 55885, len 40)
>
> .. long timeout here until the server finally gives up the connection ..

how does this continue? I see the linux machine continuing to send data with
higher sequence numbers, but how does it end? It looks like the linux
machine never sent anything beyond 11680, relative.

To know more, dump at both ends. I really think something in between is
messing up - either a network adaptor or a ""smart"" network element.

Regards,

bert

--
http://www.PowerDNS.com Versatile DNS Software & Services
http://www.tk the dot in .tk
http://lartc.org Linux Advanced Routing & Traffic Control HOWTO

2002-02-27 23:14:25

by Bjorn Wesen

[permalink] [raw]
Subject: Re: What is TCPRenoRecoveryFail ?

On Thu, 28 Feb 2002, bert hubert wrote:
> This dump is somewhat inconclusive. I miss the SYN packets, but I'm

Sorry about that, the dump I got from the one who found the problem didn't
include the SYN's.. I'll dig it up

> how does this continue? I see the linux machine continuing to send data with
> higher sequence numbers, but how does it end? It looks like the linux
> machine never sent anything beyond 11680, relative.

It ends where the dump ends, nothing is sent for the 10 minutes until the
FIN packet. It seems as if Linux fills up the receive window and sits
there, it ignores the ACK's (if they get there).

> To know more, dump at both ends. I really think something in between is
> messing up - either a network adaptor or a ""smart"" network element.

Me too, especially since the dump was made on the windows machine itself..
it just seemed funny that it all works fine when the connection is
restarted (network elements not operating at or above OSI 3 would not know
the difference...). I'll see if I can get a dump "in between".

-BW

2002-02-28 00:16:29

by Nivedita Singhvi

[permalink] [raw]
Subject: Re: What is TCPRenoRecoveryFail ?


The retansmit from the Linux box should have been triggered
where that long timeout at the end begins, see below

> 1] 23:46:43.009000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: .
> [tcp sum ok] 4269884068:4269885528(1460) ack 7148250 win 5840
> (DF) (ttl 64, id 37958, len 1500)

linux -> win send 1460 bytes

> 2] 23:46:43.009000 dh10-13-18-213.axis.se.squid > 10.13.18.46.http: .
> [tcp sum ok] ack 1460 win 8760 (DF) (ttl 128, id 54605, len 40)

win -> linux ack

> 3] 23:46:43.009000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: .
> [tcp sum ok] 1460:2920(1460) ack 1 win 5840 (DF)
> (ttl 64, id 37959, len 1500)

linux -> win send 1460 bytes

> 4] 23:46:43.009000 dh10-13-18-213.axis.se.squid > 10.13.18.46.http: .
> [tcp sum ok] ack 2920 win 8760 (DF) (ttl 128, id 54861, len 40)

win -> linux ack

> 5] 23:46:43.010000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: .
> [tcp sum ok] 2920:4380(1460) ack 1 win 5840 (DF)
> (ttl 64, id 37960, len 1500)

linux -> win send 1460 bytes (upto rel seq # 4380)

> 6] 23:46:43.010000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: .
> [tcp sum ok] 5840:7300(1460) ack 1 win 5840 (DF)
> (ttl 64, id 37962, len 1500)

linux -> win send 1460 bytes (upto rel seq # 7300)

> 7] 23:46:43.010000 dh10-13-18-213.axis.se.squid > 10.13.18.46.http: .
> [tcp sum ok] ack 4380 win 8760 (DF) (ttl 128, id 55117, len 40)

win -> linux ack 4380 #1

> 8] 23:46:43.011000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: .
> [tcp sum ok] 7300:8760(1460) ack 1 win 5840 (DF)
> (ttl 64, id 37963, len 1500)

linux -> win send 1460 bytes (upto rel seq # 8760)

> 9] 23:46:43.011000 dh10-13-18-213.axis.se.squid > 10.13.18.46.http: .
> [tcp sum ok] ack 4380 win 8760 (DF) (ttl 128, id 55373, len 40)

win -> linux ack 4380 #2 (dup ack #1)

> 10] 23:46:43.011000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: .
> [tcp sum ok] 8760:10220(1460) ack 1 win 5840 (DF)
> (ttl 64, id 37964, len 1500)

linux -> win send 1460 bytes (upto rel seq # 10220)

> 11] 23:46:43.011000 dh10-13-18-213.axis.se.squid > 10.13.18.46.http: .
> [tcp sum ok] ack 4380 win 8760 (DF) (ttl 128, id 55629, len 40)

win -> linux ack 4380 #3 (dup ack #2)

> 12] 23:46:43.012000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: P
> [tcp sum ok] 10220:11680(1460) ack 1 win 5840 (DF)
> (ttl 64, id 37965, len 1500)

linux -> win send 1460 bytes (upto rel seq # 11680)

> 13] 23:46:43.012000 dh10-13-18-213.axis.se.squid > 10.13.18.46.http: .
> [tcp sum ok] ack 4380 win 8760 (DF) (ttl 128, id 55885, len 40)

win -> linux ack 4380 #4 (dup ack #3)

!! This ack should trigger fast retransmit from the
linux box. Apparently the linux box didnt generate
it, or the windows box never got it.

Do the statistics indicate a retransmission went out?

> .. long timeout here until the server finally gives up the connection ..

> 14] 23:56:46.111000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: F
> [tcp sum ok] 11680:11680(0) ack 1 win 5840 (DF) (ttl 64, id 37966, len
40)

thanks,
Nivedita