Message-ID: <619356.98592.qm@web63403.mail.re1.yahoo.com>
Date: Fri, 25 Sep 2009 09:02:19 -0700 (PDT)
From: Joe Cao
Subject: Re: TCP stack bug related to F-RTO?
To: zhigang gong
Cc: linux-kernel@vger.kernel.org, jcaoco2002@yahoo.com, netdev@vger.kernel.org
In-Reply-To: <40c9f5b20909250155l49ad5fd2if8efb4fd48ed6066@mail.gmail.com>

Hi Zhigang,

Thanks for helping to look into the issue. My answer to your analysis is that of course there won't be a third dup-ack, because the server only sends TWO NEW data packets each time. Clearly this is the server's problem and not the client's problem.

Thanks,
Joe

--- On Fri, 9/25/09, zhigang gong wrote:

> From: zhigang gong
> Subject: Re: TCP stack bug related to F-RTO?
> To: "Joe Cao"
> Cc: linux-kernel@vger.kernel.org, jcaoco2002@yahoo.com, netdev@vger.kernel.org
> Date: Friday, September 25, 2009, 1:55 AM
>
> Oh, I see, so I spoke too quickly in my last mail. You just left some packets out of the trace. I have analysed the traffic flow and have some findings, below; I hope they are helpful.
>
> >> > 1. The client opens up a big window,
> >> > 2. the server sends 19 packets in a row (pkt #14-#32 in the trace), but all of them are dropped due to some congestion.
> >> > 3. The server hits RTO and retransmits pkt #14 in #33
>
> This retransmission timer expiry causes the server's TCP/IP stack to enter slow start; as a result, the server's sending window is reduced to one.
>
> >> > 4. The client immediately acks #33 (=#14), and the server (seems to enter F-RTO) expands the window and sends *NEW* pkt #35 & #36. Timeout is doubled to 2*RTO; the client immediately sends two dup-acks to #35 and #36.
>
> The server is still in slow start and extends the window to 2.
>
> >> > 5. After 2*RTO, pkt #15 is retransmitted in #39.
>
> Here the second retransmission timer expiry occurs. The server's sending window is reduced to one again and it continues in slow start.
>
> >> > 6. The client immediately acks #39 (=#15) in #40, and the server continues to expand the window and sends two *NEW* pkt #41 & #42. Now the timeout is doubled to 4*RTO.
>
> Here you ignore the two duplicate acks, #37 and #38, sent by the client. As far as I know, the server must receive three or more duplicate acks before it enters fast retransmit; otherwise it stays in slow start and waits until the retransmission timer next expires before retransmitting the lost packets. And this is actually what you got.
>
> I'm not a kernel expert; I am just analysing this against the TCP protocol standard. From my point of view, there is no problem in the server's network stack. But there may be some problem on the client side (or in some intermediate network appliance), as it always sends just two duplicate acks at the same time and never sends a third one, no matter how long the interval is. In my opinion, if the client could send the third duplicate ack, the server would enter fast retransmit and then fast recovery, and everything would be OK.
>
> >> > 8. After 4*RTO timeout, #16 is retransmitted.
> >> > 9....
> >> > 10. The above steps repeat for retransmitting pkt #16-#32 and each time the timeout is doubled.
> >> > 11. It takes a very long time to retransmit all the lost packets and before that is done, the client sends a RST because of timeout.
>
> On Fri, Sep 25, 2009 at 2:42 PM, Joe Cao wrote:
> > Hi,
> >
> > On the wrong TCP checksum, that's because of hardware checksum offload.
> >
> > As for the seq/ack numbers, because the trace is long, I deliberately removed the irrelevant packets between the three-way handshake and the point where the problem happens. That can be seen from the timestamps.
> >
> > Please also note that I intentionally replaced the IP addresses and MAC addresses in the trace to hide proprietary information.
> >
> > Anyway, the problem is not related to the checksum or the seq/ack numbers; otherwise, you wouldn't see the behavior shown in the trace.
> >
> > Thanks,
> > Joe
> >
> > --- On Thu, 9/24/09, zhigang gong wrote:
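
For illustration only, the stall described in steps 3-11 can be sketched as a toy model. It takes the trace description at face value: each timeout retransmits one lost segment, the sender then emits two new segments, the receiver answers with two dup-acks, and the timeout doubles. The function name `simulate`, the 1-second initial RTO, and the 120-second cap are assumptions chosen for the sketch, not values read from the trace or from the kernel's actual code path.

```python
# Toy model of the retransmission stall described in the thread.
# All names and default values are illustrative assumptions.

DUPACK_THRESHOLD = 3  # dup-acks needed to trigger fast retransmit (RFC 5681)


def simulate(lost_segments=19, new_per_round=2, initial_rto=1.0, max_rto=120.0):
    """Return (elapsed_seconds, fast_retransmit_triggered)."""
    rto = initial_rto
    elapsed = 0.0
    for _ in range(lost_segments):
        # The retransmission timer expires and one lost segment is resent.
        elapsed += rto
        # The ack for that segment lets the sender emit `new_per_round`
        # new segments; each lands above the remaining hole and
        # produces exactly one dup-ack.
        dupacks = new_per_round
        if dupacks >= DUPACK_THRESHOLD:
            # Enough dup-acks: fast retransmit would take over from here.
            return elapsed, True
        # Too few dup-acks: wait for the next timeout, which is doubled.
        rto = min(rto * 2, max_rto)
    return elapsed, False


if __name__ == "__main__":
    print(simulate())                 # two new segments per round
    print(simulate(new_per_round=3))  # three new segments per round
```

Under these assumed values the first call reports roughly 1567 seconds to resend all 19 segments without ever reaching the dup-ack threshold, while the second call reaches the threshold after the first timeout. The model says nothing about which side is at fault; it only shows why two new segments per round can never produce a third dup-ack.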