2002-07-18 23:27:45

by Hayden Myers

[permalink] [raw]
Subject: 2.2 to 2.4 migration

We're finally migrating to the 2.4 kernel due to hardware
incompatibilities with the 2.2. The 2.2 has worked better for us in the
past as far as our application's performance goes. Our application is an
adserver, and it becomes bogged down in 2.4 when sending files such as
images across the wire. They're generally 20-50k in size. I've been
researching the differences between 2.4 and 2.2 and have noticed that a
lot of work has gone into autotuning in 2.4, and I'm wondering if this is
what's slowing things down. When I run tcpdump to see the traffic being
sent to the client, I notice that the receiver window is almost always
set to 6430 bytes. For the same transfer on our 2.2 boxes, the
receiver window is almost always over 31000 bytes. I've tried to increase
the size of the buffers using the proc settings that are provided, but
this hasn't made a difference: even after restarting the servers
after each change, the window is still 6430 bytes. I've also tried manually
setting the size with setsockopt calls in the server code, but this hasn't
helped either. I believe the problem is definitely with sending the
files over the line. The files are read into the socket to be sent across
the network byte by byte. The boss says this is the best way to do it, but
I'm curious whether this is so. The code that reads the file into the socket
to go across the network is below.


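#include <fcntl.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* writen() is the application's own write-all helper (not shown). */
int output_block(int socket, char *filename)
{
    int fd, count = 0;
    size_t total_bytes = 0;
    /*size_t buf_cnt = 1460;*/
    size_t buf_cnt = 512;
    char buffer[buf_cnt];
    fd_set rfds;    /* despite the name, passed to select() as the write set */
    struct timeval tv;

    if ((fd = open(filename, O_RDONLY)) < 0) {
        //fprintf(stderr, "Unable to open filename: %s\n", filename);
        return(-1);
    }

    /* Copy the file to the socket, buf_cnt bytes per read. */
    while ((count = read(fd, buffer, buf_cnt)) > 0) {

        /* Wait up to 10 seconds for the socket to become writable. */
        FD_ZERO(&rfds);
        FD_SET(socket, &rfds);
        tv.tv_sec = 10;
        tv.tv_usec = 0;
        if (select(socket+1, NULL, &rfds, NULL, &tv) <= 0) {
            //fprintf(stderr, "Output_block timeout\n");
            break;
        }

        if (writen(socket, buffer, count) <= 0)
            break;
        total_bytes += count;
    }

    close(fd);
    return(total_bytes);
}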

The application is a single-threaded app using a multiprocess pre-forking
model, if that helps any. I'm really baffled as to why using the 2.4
kernel is slowing us down. Any help is appreciated. Sorry if this has
come up before. I really have been looking for help for quite some time
before posting this.

Hayden Myers
Support Manager
Skyline Network Technologies
[email protected]
(410)583-1337 option 2



2002-07-19 17:01:58

by Hayden Myers

[permalink] [raw]
Subject: 2.2 to 2.4... serious TCP send slowdowns


We're finally migrating to the 2.4 kernel due to hardware
incompatibilities with the 2.2. The 2.2 has worked better for us in the
past as far as our application's performance goes. Our application is an
adserver, and it becomes bogged down in 2.4 when sending files such as
images across the wire. They're generally 20-50k in size. I've been
researching the differences between 2.4 and 2.2 and have noticed that a
lot of work has gone into autotuning in 2.4, and I'm wondering if this is
what's slowing things down. When I run tcpdump to see the traffic being
sent to the client, I notice that the receiver window is almost always
set to 6430 bytes. For the same transfer on our 2.2 boxes, the
receiver window is almost always over 31000 bytes. I've tried to increase
the size of the buffers using the proc settings that are provided, but
this hasn't made a difference: even after restarting the servers
after each change, the window is still 6430 bytes. I've also tried manually
setting the size with setsockopt calls in the server code, but this hasn't
helped either. I believe the problem is definitely with sending the
files over the line. The files are read into the socket to be sent across
the network byte by byte. The boss says this is the best way to do it, but
I'm curious whether this is so. The code that reads the file into the socket
to go across the network is below.


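#include <fcntl.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* writen() is the application's own write-all helper (not shown). */
int output_block(int socket, char *filename)
{
    int fd, count = 0;
    size_t total_bytes = 0;
    /*size_t buf_cnt = 1460;*/
    size_t buf_cnt = 512;
    char buffer[buf_cnt];
    fd_set rfds;    /* despite the name, passed to select() as the write set */
    struct timeval tv;

    if ((fd = open(filename, O_RDONLY)) < 0) {
        //fprintf(stderr, "Unable to open filename: %s\n", filename);
        return(-1);
    }

    /* Copy the file to the socket, buf_cnt bytes per read. */
    while ((count = read(fd, buffer, buf_cnt)) > 0) {

        /* Wait up to 10 seconds for the socket to become writable. */
        FD_ZERO(&rfds);
        FD_SET(socket, &rfds);
        tv.tv_sec = 10;
        tv.tv_usec = 0;
        if (select(socket+1, NULL, &rfds, NULL, &tv) <= 0) {
            //fprintf(stderr, "Output_block timeout\n");
            break;
        }

        if (writen(socket, buffer, count) <= 0)
            break;
        total_bytes += count;
    }

    close(fd);
    return(total_bytes);
}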

The application is a single-threaded app using a multiprocess pre-forking
model, if that helps any. I'm really baffled as to why using the 2.4
kernel is slowing us down. Any help is appreciated. Sorry if this has
come up before. I really have been looking for help for quite some time
before posting this.

Hayden Myers
Support Manager
Skyline Network Technologies
[email protected]
(410)583-1337 option 2



2002-07-19 20:07:30

by Nivedita Singhvi

[permalink] [raw]
Subject: Re: 2.2 to 2.4... serious TCP send slowdowns

> We're finally migrating to the 2.4 kernel due to hardware
> incompatibilities with the 2.2. The 2.2 has worked better
> for us in the past as far as our application performs.
> Our application is an adserver and becomes bogged down in 2.4
> when sending files such as images across

When you say bogged down, what exactly does that mean? Does it
hang? Can you quantify the slowdown with any measurements?
Have you looked at TCP and network statistics to check for
timeouts, drops, other errors, and the like? netstat -s should
give you some extended TCP stats which might help you diagnose
that sort of problem.

> the wire. They're in general between 20-50k in size. I've been
> researching the differences between 2.4 and 2.2 and have noticed
> that a lot of work has gone into autotuning with 2.4 and I'm
> wondering if this is what's slowing things down. When I do tcpdumps
> to see the traffic being sent to the client I'm noticing that the
> receiver window is almost always set to 6430 bytes. When looking at
> the same transfer on our 2.2 boxes the receiver window is almost
> always over 31000 bytes. I've tried to increase the size of the
> buffers using the proc settings that are provided however

What's your interface MTU? How did you change the size of the buffers?
Note that you need to increase tcp_rmem[1] and tcp_wmem[1] (the middle,
default value of each sysctl triple) to affect the default TCP socket
buffer sizes. Also note that approximately half of that is used by the
kernel, so if you really want 64K of user space, try setting the size to 128K.
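
A minimal sketch of the per-socket equivalent, for anyone who prefers
setsockopt over the sysctls. The helper name set_send_buffer is
hypothetical. It requests a send buffer size and reads back what the
kernel actually granted; on Linux the reported value is normally about
double the request, because the kernel counts its own bookkeeping
overhead in the same figure, and it is capped by
/proc/sys/net/core/wmem_max.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: request a send buffer of `requested` bytes on `sock`
 * and return the size the kernel reports afterwards, or -1 on error. */
int set_send_buffer(int sock, int requested)
{
    int actual = 0;
    socklen_t len = sizeof(actual);

    assert(requested > 0);

    if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF,
                   &requested, sizeof(requested)) < 0)
        return -1;

    /* Read the value back: it may differ from what was asked for. */
    if (getsockopt(sock, SOL_SOCKET, SO_SNDBUF, &actual, &len) < 0)
        return -1;

    return actual;
}
```

For the receive side (SO_RCVBUF), the Linux tcp(7) manpage notes that the
size has to be set before listen() or connect() to take effect on a
connection, which may be one reason a late setsockopt call appears to do
nothing.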

> this hasn't seemed to make a difference even after restarting
> servers after each change the window is still 6430 bytes. I've
> tried manually setting the size with setsockopt calls in the server
> code but this hasn't seemed to help. I believe the problem is
> definitely with sending the files over the line. The files are read
> into the socket to be sent across the network byte by byte. The boss
> says this is the best way to do it but I'm curious if this is so.

You can't optimize your read() from an fd and your write() to a socket fd
simultaneously. Are you setting TCP_NODELAY?

If all you are doing is reading large files from disk and sending them
out over a socket, consider using sendfile() instead. Much more
efficient.
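
A minimal sketch of what the sendfile() approach could look like on
Linux, assuming a blocking, connected socket. The helper name
send_whole_file is hypothetical and not taken from the application being
discussed. sendfile() may transfer fewer bytes than requested, so the
call sits in a loop that continues from the updated offset.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send the whole file at `path` to `sock` using
 * sendfile(2). Returns the number of bytes sent, or -1 on error. */
ssize_t send_whole_file(int sock, const char *path)
{
    struct stat st;
    off_t offset = 0;
    int fd;

    assert(path != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }

    /* sendfile() advances `offset` by the number of bytes it sent. */
    while (offset < st.st_size) {
        ssize_t n = sendfile(sock, fd, &offset,
                             (size_t)(st.st_size - offset));
        if (n < 0) {
            if (errno == EINTR)
                continue;
            close(fd);
            return -1;
        }
        if (n == 0)     /* file shrank underneath us; stop */
            break;
    }

    close(fd);
    return (ssize_t)offset;
}
```

The file data never passes through a user-space buffer here, which is
where the expected saving comes from; whether it helps for 20-50k files
would need measuring.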

thanks,
Nivedita




2002-07-20 19:39:07

by Alan

[permalink] [raw]
Subject: Re: 2.2 to 2.4... serious TCP send slowdowns

On Fri, 2002-07-19 at 18:04, Hayden Myers wrote:
> seemed to help. I believe the problem is definitely with sending the
> files over the line. The files are read into the socket to be sent across
> the network byte by byte. The boss says this is the best way to do it but
> I'm curious if this is so. The code that reads the file into the socket
> to go across the network is below.

Your buffers are way too small; buf_cnt probably wants to be 60K or
higher. Making it large ensures one write syscall will fill all
available space in the queue immediately, drastically reducing syscall
and wakeup rates and avoiding breaks in streaming.
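
A minimal sketch of the same copy loop with a 64K buffer, assuming a
blocking socket. The names copy_file_to_socket, write_all and
COPY_BUF_SIZE are hypothetical; write_all stands in for the writen
helper, which is not shown in the original post. The static buffer
assumes one transfer at a time per process, which fits a single-threaded
pre-forking server.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define COPY_BUF_SIZE (64 * 1024)   /* assumed size, per the 60K+ advice */

/* Write exactly `len` bytes, retrying on short writes and EINTR. */
static ssize_t write_all(int fd, const char *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        done += (size_t)n;
    }
    return (ssize_t)done;
}

/* Copy the file at `path` to `sock` in 64K chunks.
 * Returns the number of bytes copied, or -1 on error. */
ssize_t copy_file_to_socket(int sock, const char *path)
{
    static char buf[COPY_BUF_SIZE];
    ssize_t n, total = 0;
    int fd;

    assert(path != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    while ((n = read(fd, buf, sizeof(buf))) > 0) {
        if (write_all(sock, buf, (size_t)n) < 0) {
            total = -1;
            break;
        }
        total += n;
    }
    if (n < 0)          /* read() itself failed */
        total = -1;

    close(fd);
    return total;
}
```

With 20-50k files this means one read and usually one write per file,
instead of dozens of 512-byte read/select/write rounds.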

> The application is a single threaded app using a multiprocess pre forking
> model if that helps any. I'm really baffled as to why using the 2.4
> kernel is slowing us down. Any help is appreciated. Sorry if this has
> come up before. I really have been looking for help for quite some time
> before posting this.

Without tcpdump data it's hard to guess.

2002-07-22 18:44:13

by Hayden Myers

[permalink] [raw]
Subject: Re: 2.2 to 2.4... serious TCP send slowdowns

On 20 Jul 2002, Alan Cox wrote:

> Your buffers are way too small; buf_cnt probably wants to be 60K or
> higher. Making it large ensures one write syscall will fill all
> available space in the queue immediately, drastically reducing syscall
> and wakeup rates and avoiding breaks in streaming.

I've played around with changing the buf_cnt size and tests in house have
surprisingly shown a slight slowdown when increasing it to 64k. This is
most likely inconclusive but it didn't seem to make a large difference.
I also tried to do away with the read and writen syscalls and replace them
with a sendfile call but this seems to have made things even slower.

>
> Without tcpdump data its hard to guess
>
Tcpdump output is where I'm seeing the difference in the client's receive
window. Below is the tcpdump output from the server.

[root@install spinbox]# /usr/sbin/tcpdump src port 80
tcpdump: listening on eth0
11:37:21.003009 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53687: S 273731802:273731802(0) ack
2533363500 win 5792 <mss 1460,sackOK,timestamp 25697 104440615,nop,wscale
0> (DF)
11:37:21.006489 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53687: . ack 302 win 6432
<nop,nop,timestamp 25698 104440615> (DF)
11:37:21.009357 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53687: P 1:16(15) ack 302 win 6432
<nop,nop,timestamp 25698 104440615> (DF)
11:37:21.009529 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53687: P 16:123(107) ack 302 win 6432
<nop,nop,timestamp 25698 104440616> (DF)
11:37:21.009696 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53687: P 123:263(140) ack 302 win 6432
<nop,nop,timestamp 25698 104440616> (DF)
11:37:21.010081 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53687: . 263:1711(1448) ack 302 win
6432 <nop,nop,timestamp 25698 104440616> (DF)
11:37:21.010116 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53687: . 1711:3159(1448) ack 302 win
6432 <nop,nop,timestamp 25698 104440616> (DF)
11:37:21.010687 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53687: P 3159:4607(1448) ack 302 win
6432 <nop,nop,timestamp 25698 104440616> (DF)
11:37:21.010698 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53687: . 4607:6055(1448) ack 302 win
6432 <nop,nop,timestamp 25698 104440616> (DF)
11:37:21.010726 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53687: . 6055:7503(1448) ack 302 win
6432 <nop,nop,timestamp 25698 104440616> (DF)
11:37:21.010736 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53687: P 7503:8951(1448) ack 302 win
6432 <nop,nop,timestamp 25698 104440616> (DF)
11:37:21.011557 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53687: . 8951:10399(1448) ack 302 win
6432 <nop,nop,timestamp 25698 104440616> (DF)
11:37:21.011571 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53687: . 10399:11847(1448) ack 302 win
6432 <nop,nop,timestamp 25698 104440616> (DF)
11:37:21.011583 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53687: FP 11847:12744(897) ack 302 win
6432 <nop,nop,timestamp 25698 104440616> (DF)
11:37:21.058316 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53688: S 265781655:265781655(0) ack
2534779761 win 5792 <mss 1460,sackOK,timestamp 25703 104440621,nop,wscale
0> (DF)
11:37:21.059682 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53688: . ack 334 win 6432
<nop,nop,timestamp 25703 104440621> (DF)
11:37:21.061403 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53688: P 1:16(15) ack 334 win 6432
<nop,nop,timestamp 25703 104440621> (DF)
11:37:21.061574 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53688: P 16:123(107) ack 334 win 6432
<nop,nop,timestamp 25703 104440621> (DF)
11:37:21.061732 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53688: P 123:263(140) ack 334 win 6432
<nop,nop,timestamp 25703 104440621> (DF)
11:37:21.061973 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53688: . 263:1711(1448) ack 334 win
6432 <nop,nop,timestamp 25703 104440621> (DF)
11:37:21.062000 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53688: . 1711:3159(1448) ack 334 win
6432 <nop,nop,timestamp 25703 104440621> (DF)
11:37:21.062572 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53688: P 3159:4607(1448) ack 334 win
6432 <nop,nop,timestamp 25703 104440621> (DF)
11:37:21.062583 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53688: . 4607:6055(1448) ack 334 win
6432 <nop,nop,timestamp 25703 104440621> (DF)
11:37:21.062611 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53688: . 6055:7503(1448) ack 334 win
6432 <nop,nop,timestamp 25703 104440621> (DF)
11:37:21.062619 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53688: P 7503:8951(1448) ack 334 win
6432 <nop,nop,timestamp 25703 104440621> (DF)
11:37:21.063147 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53688: . 8951:10399(1448) ack 334 win
6432 <nop,nop,timestamp 25703 104440621> (DF)
11:37:21.063156 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53688: . 10399:11847(1448) ack 334 win
6432 <nop,nop,timestamp 25703 104440621> (DF)
11:37:21.063167 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53688: FP 11847:12744(897) ack 334 win
6432 <nop,nop,timestamp 25703 104440621> (DF)
11:37:21.093947 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53687: . ack 303 win 6432
<nop,nop,timestamp 25707 104440624> (DF)
11:37:21.112002 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53688: . ack 335 win 6432
<nop,nop,timestamp 25708 104440626> (DF)

According to the tcpdump manpage, win 6432 is the number of bytes of
receive buffer space available in the other direction of the connection.

Below is a tcpdump session for the same request on the same client but
with the server on a 2.2.20 kernel.


[root@install spinbox]# /usr/sbin/tcpdump src port 80
tcpdump: listening on eth0
11:45:00.379901 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: S 754852391:754852391(0) ack
2999938034 win 30660 <mss 1460,sackOK,timestamp 11434 104486504,nop,wscale
0> (DF)
11:45:00.383374 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: . ack 302 win 30660
<nop,nop,timestamp 11435 104486504> (DF)
11:45:00.386345 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: P 1:16(15) ack 302 win 31856
<nop,nop,timestamp 11435 104486504> (DF)
11:45:00.386571 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: P 16:47(31) ack 302 win 31856
<nop,nop,timestamp 11435 104486504> (DF)
11:45:00.386855 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: P 47:73(26) ack 302 win 31856
<nop,nop,timestamp 11435 104486504> (DF)
11:45:00.387116 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: P 73:154(81) ack 302 win 31856
<nop,nop,timestamp 11435 104486504> (DF)
11:45:00.387314 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: P 154:263(109) ack 302 win 31856
<nop,nop,timestamp 11435 104486504> (DF)
11:45:00.427821 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: S 754496145:754496145(0) ack
3011203697 win 30660 <mss 1460,sackOK,timestamp 11439 104486508,nop,wscale
0> (DF)
11:45:00.429069 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: . ack 334 win 30660
<nop,nop,timestamp 11439 104486508> (DF)
11:45:00.435172 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: P 1:16(15) ack 334 win 31856
<nop,nop,timestamp 11440 104486508> (DF)
11:45:00.435392 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: P 16:47(31) ack 334 win 31856
<nop,nop,timestamp 11440 104486509> (DF)
11:45:00.435636 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: P 47:73(26) ack 334 win 31856
<nop,nop,timestamp 11440 104486509> (DF)
11:45:00.435825 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: P 73:210(137) ack 334 win 31856
<nop,nop,timestamp 11440 104486509> (DF)
11:45:00.436047 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: P 210:263(53) ack 334 win 31856
<nop,nop,timestamp 11440 104486509> (DF)
11:45:00.468318 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: P 263:1711(1448) ack 302 win
31856 <nop,nop,timestamp 11443 104486504> (DF)
11:45:00.468380 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: P 1711:3159(1448) ack 302 win
31856 <nop,nop,timestamp 11443 104486504> (DF)
11:45:00.468422 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: P 3159:4607(1448) ack 302 win
31856 <nop,nop,timestamp 11443 104486504> (DF)
11:45:00.468467 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: . 4607:6055(1448) ack 302 win
31856 <nop,nop,timestamp 11443 104486504> (DF)
11:45:00.468945 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: . 6055:7503(1448) ack 302 win
31856 <nop,nop,timestamp 11443 104486512> (DF)
11:45:00.468959 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: . 7503:8951(1448) ack 302 win
31856 <nop,nop,timestamp 11443 104486512> (DF)
11:45:00.468980 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: . 8951:10399(1448) ack 302 win
31856 <nop,nop,timestamp 11443 104486512> (DF)
11:45:00.469235 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: . 10399:11847(1448) ack 302 win
31856 <nop,nop,timestamp 11443 104486512> (DF)
11:45:00.469250 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: FP 11847:12744(897) ack 302 win
31856 <nop,nop,timestamp 11443 104486512> (DF)
11:45:00.470040 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: P 263:1711(1448) ack 334 win
31856 <nop,nop,timestamp 11443 104486509> (DF)
11:45:00.470077 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: P 1711:3159(1448) ack 334 win
31856 <nop,nop,timestamp 11443 104486509> (DF)
11:45:00.470120 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: P 3159:4607(1448) ack 334 win
31856 <nop,nop,timestamp 11443 104486509> (DF)
11:45:00.470257 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: . 4607:6055(1448) ack 334 win
31856 <nop,nop,timestamp 11443 104486509> (DF)
11:45:00.470638 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: . 6055:7503(1448) ack 334 win
31856 <nop,nop,timestamp 11443 104486513> (DF)
11:45:00.470654 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: . 7503:8951(1448) ack 334 win
31856 <nop,nop,timestamp 11443 104486513> (DF)
11:45:00.470675 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: . 8951:10399(1448) ack 334 win
31856 <nop,nop,timestamp 11443 104486513> (DF)
11:45:00.471015 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: . 10399:11847(1448) ack 334 win
31856 <nop,nop,timestamp 11443 104486513> (DF)
11:45:00.471048 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: FP 11847:12744(897) ack 334 win
31856 <nop,nop,timestamp 11443 104486513> (DF)
11:45:00.487840 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: . ack 303 win 31856
<nop,nop,timestamp 11445 104486514> (DF)
11:45:00.532997 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: . ack 335 win 31856
<nop,nop,timestamp 11450 104486519> (DF)

It looks to be about the same amount of data going in each packet, but the
client's receive window is much larger on 2.2 for some reason. I've tried
tweaking all of the window parameters I can find, both in the kernel and in
proc, but none seem to increase this number. Is there a reason this number is
much lower in a vanilla 2.4 kernel? I'm not very familiar with 2.4 and
its ways yet, but I've been reading about the autotuning algorithms that
have been put in place and suspect them of causing the differences. Any
ideas as to why the windows are so different?



Hayden Myers
Support Manager
Skyline Network Technologies
[email protected]
(410)583-1337 option 2




2002-07-22 19:48:26

by Mika Liljeberg

[permalink] [raw]
Subject: Re: 2.2 to 2.4... serious TCP send slowdowns

On Mon, 2002-07-22 at 21:47, Hayden Myers wrote:

> Tcpdump output is where I'm seeing the difference in the client's receive
> window. Below is the tcpdump output from the server.
>
> [root@install spinbox]# /usr/sbin/tcpdump src port 80

Your dump is showing only one direction of the connection. The receive
window visible in this dump is used for the reverse direction. Use
"tcpdump port 80" instead to get some useful output.

Linux 2.4 starts with a small receive window and rapidly increases it
when the data starts to flow. This is a type of receiver oriented
congestion control. You don't see the window increase here, because
there is very little data sent from client to server.

Also, next time try not to wrap the dump output. Beastly hard to make
sense of.

MikaL

2002-07-22 20:29:49

by Mika Liljeberg

[permalink] [raw]
Subject: Re: 2.2 to 2.4... serious TCP send slowdowns

On Mon, 2002-07-22 at 23:09, Hayden Myers wrote:
> Is it possible the window scaling mechanism is slowing us down. Since the
> connections are so short the window never scales upwards.

I'm pretty sure the window DOES scale upwards. As I said, you didn't
dump the other direction of the connection, which would actually SHOW
the advertised window.

The 6432 you're seeing is the server telling the client how much the
client is allowed to send. Only, the client isn't sending anything. You
need to look at what the client is advertising to the server.

> Do you think
> I'd benefit by starting off with a larger window?

I don't think so. The transfer is unlikely to be limited by the
advertised window. Short TCP connections are constrained by the
congestion window, not the advertised window.
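
A rough worked example of why that is, under loudly stated assumptions:
idealized slow start, an initial congestion window of 2 segments, the
window doubling every round trip, and no loss or delayed ACKs. None of
these numbers are measured from the 2.2 or 2.4 stacks, and the function
name slow_start_round_trips is hypothetical. The 1448-byte segment size
and the 12744-byte response are taken from the dump earlier in the
thread.

```c
#include <assert.h>
#include <stdio.h>

/* Round trips needed to deliver `bytes` under idealized slow start:
 * the sender transmits `cwnd` segments per round trip and doubles
 * `cwnd` after each one. Assumes no loss and no receiver-window limit. */
unsigned slow_start_round_trips(unsigned long bytes, unsigned mss,
                                unsigned initial_cwnd)
{
    unsigned long segments, sent = 0, cwnd = initial_cwnd;
    unsigned rounds = 0;

    assert(mss > 0 && initial_cwnd > 0);

    segments = (bytes + mss - 1) / mss;     /* round up to whole segments */
    while (sent < segments) {
        sent += cwnd;
        cwnd *= 2;
        rounds++;
    }
    return rounds;
}

/* Print the estimate for one transfer size. */
void print_estimate(unsigned long bytes)
{
    printf("%lu bytes: %u round trips\n",
           bytes, slow_start_round_trips(bytes, 1448, 2));
}
```

Under these assumptions the 12744-byte response needs 9 segments and 3
round trips (2 + 4 + 8 segments), and a 50k file needs 5. At no point in
the first case are more than 8 segments (about 11.6k) in flight, so an
advertised window of 31856 bytes versus one that starts smaller and
grows would make little difference to a transfer this short.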

> My tests have shown
> that the 2.2 can handle more traffic with our application than the 2.4's
> I've used so far. I would expect the 2.4 to be faster. I imagine it's a
> tuning issue somewhere or an inefficient code issue. I changed the code
> for sending files from the disk across the wire from using read and writen
> to sendfile and set the tcp cork option with setsockopt but contrary to
> everyones messages about it being faster, it slowed things down, more
> noticeably in 2.2.

Not sure why you're seeing a difference here and it's hard to say
without a complete TCP dump. As far as I can see, the half-dump doesn't
exhibit any abnormalities.

This could easily be something completely unrelated to the networking
stack, however. You could be limited by file I/O, for instance. Have you
tried measuring pure TCP throughput without file access?

MikaL

2002-07-22 21:40:43

by Mika Liljeberg

[permalink] [raw]
Subject: Re: 2.2 to 2.4... serious TCP send slowdowns

On Tue, 2002-07-23 at 00:13, Hayden Myers wrote:

> I dumped the other side and it does confirm that the advertised window is
> increasing. This may not be the problem. So if I understand
> correctly, 2.4 has window scaling while 2.2 didn't, and that's why the
> window advertisement is larger initially in 2.2 than in 2.4.

The TCP window scaling option is something else. You are correct,
however, that 2.4 manages receiver side buffers differently and
therefore advertises differently. This is designed to save buffer memory
in a HTTP server and should actually be good for your application.

> I still have a theory as to why this smaller window size and/or scaling
> is slowing us down.
>
> >
> > The 6432 you're seeing is the server telling the client how much the
> > client is allowed send. Only, the client isn't sending anything. You
> > need to look at what the client is advertising to the server.
> The client's advertisement seems to increase as the packets keep rolling
> in.

Correct.

> I just noticed a large number of delayed acks in /proc/net/netstat. Do
> you think it's possible that a lot of clients have delayed ack and are
> holding up our connections?

Again, don't think so. Unless Alexey has changed something recently,
Linux quickacks every full sized segment immediately during the
slowstart phase. This means that, in the beginning, the congestion
window is growing as fast as the specs allow.

If in doubt, you can set the TCP_NODELAY option. Just make sure your app
is sending full sized segments if you do this, otherwise it will just
degrade performance.
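
A minimal sketch of setting the option, assuming a TCP socket. The
helper name set_tcp_nodelay is hypothetical. TCP_NODELAY disables
Nagle's algorithm on the sending socket, so small writes go out
immediately instead of being coalesced, which is why the advice about
full-sized segments matters.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: turn TCP_NODELAY on (on != 0) or off (on == 0)
 * for `sock`. Returns the setting read back (1 or 0), or -1 on error. */
int set_tcp_nodelay(int sock, int on)
{
    int value = on ? 1 : 0;
    socklen_t len = sizeof(value);

    assert(sock >= 0);

    if (setsockopt(sock, IPPROTO_TCP, TCP_NODELAY,
                   &value, sizeof(value)) < 0)
        return -1;

    /* Read it back so the caller can confirm the option took effect. */
    if (getsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &value, &len) < 0)
        return -1;

    return value != 0;
}
```

Combined with 512-byte writes this would likely put many small segments
on the wire, so it only makes sense together with a larger write size.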

Anyway, try posting some hard data. Maybe somebody on these lists can
figure it out.

MikaL

2002-07-23 07:20:47

by Buddy Lumpkin

[permalink] [raw]
Subject: RE: 2.2 to 2.4... serious TCP send slowdowns

How about a simple netstat -i, are you getting any collisions or errors?

11:45:00.468318 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: P 263:1711(1448) ack 302 win
31856 <nop,nop,timestamp 11443 104486504> (DF)
11:45:00.468380 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: P 1711:3159(1448) ack 302 win
31856 <nop,nop,timestamp 11443 104486504> (DF)
11:45:00.468422 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: P 3159:4607(1448) ack 302 win
31856 <nop,nop,timestamp 11443 104486504> (DF)
11:45:00.468467 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: . 4607:6055(1448) ack 302 win
31856 <nop,nop,timestamp 11443 104486504> (DF)
11:45:00.468945 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: . 6055:7503(1448) ack 302 win
31856 <nop,nop,timestamp 11443 104486512> (DF)
11:45:00.468959 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: . 7503:8951(1448) ack 302 win
31856 <nop,nop,timestamp 11443 104486512> (DF)
11:45:00.468980 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: . 8951:10399(1448) ack 302 win
31856 <nop,nop,timestamp 11443 104486512> (DF)
11:45:00.469235 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: . 10399:11847(1448) ack 302 win
31856 <nop,nop,timestamp 11443 104486512> (DF)
11:45:00.469250 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: FP 11847:12744(897) ack 302 win
31856 <nop,nop,timestamp 11443 104486512> (DF)
11:45:00.470040 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: P 263:1711(1448) ack 334 win
31856 <nop,nop,timestamp 11443 104486509> (DF)
11:45:00.470077 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: P 1711:3159(1448) ack 334 win
31856 <nop,nop,timestamp 11443 104486509> (DF)
11:45:00.470120 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: P 3159:4607(1448) ack 334 win
31856 <nop,nop,timestamp 11443 104486509> (DF)
11:45:00.470257 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: . 4607:6055(1448) ack 334 win
31856 <nop,nop,timestamp 11443 104486509> (DF)
11:45:00.470638 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: . 6055:7503(1448) ack 334 win
31856 <nop,nop,timestamp 11443 104486513> (DF)
11:45:00.470654 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: . 7503:8951(1448) ack 334 win
31856 <nop,nop,timestamp 11443 104486513> (DF)
11:45:00.470675 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: . 8951:10399(1448) ack 334 win
31856 <nop,nop,timestamp 11443 104486513> (DF)
11:45:00.471015 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: . 10399:11847(1448) ack 334 win
31856 <nop,nop,timestamp 11443 104486513> (DF)
11:45:00.471048 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: FP 11847:12744(897) ack 334 win
31856 <nop,nop,timestamp 11443 104486513> (DF)
11:45:00.487840 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53702: . ack 303 win 31856
<nop,nop,timestamp 11445 104486514> (DF)
11:45:00.532997 install.skyline.net.http >
leg-66-247-99-8-RLY.sprinthome.com.53703: . ack 335 win 31856
<nop,nop,timestamp 11450 104486519> (DF)

Each packet carries about the same amount of data, but the advertised
receive window is much larger on the 2.2 box for some reason. I have tried
tweaking every window parameter I can find, both in the kernel and in
/proc, but none of them increases this number. Is there a reason this
number is much lower in a vanilla 2.4 kernel? I'm not very familiar with
2.4 and its ways yet, but I've been reading about the autotuning
algorithms that have been put in place and suspect them of causing the
difference. Any ideas as to why the windows are so different?
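
For anyone who wants to reproduce the setsockopt experiment, a minimal
sketch of requesting a receive buffer size and reading back what the kernel
granted is below. The helper name set_rcvbuf and the 32768 figure in the
usage note are illustrative assumptions, not the code from our server. The
tcp(7) manpage says the buffer size has to be set before listen() or
connect() to take effect on a connection, so the sketch is meant to be
called on the listening socket before listen(). Whether that changes the
6432 figure on 2.4 is not something this sketch establishes.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Request a receive buffer of `bytes` on `fd` and return the size the
 * kernel reports afterwards, or -1 on error.
 * Assumption: called on the listening socket before listen(). */
int set_rcvbuf(int fd, int bytes)
{
    int granted = 0;
    socklen_t len = sizeof(granted);

    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) < 0)
        return -1;
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &granted, &len) < 0)
        return -1;
    return granted;   /* may differ from the requested value */
}
```

A call such as set_rcvbuf(listen_fd, 32768) before listen() and a log of
the return value would at least show whether the request is being honored
or clamped by the system-wide limits.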



Hayden Myers
Support Manager
Skyline Network Technologies
[email protected]
(410)583-1337 option 2