2004-03-31 16:36:15

by Richard B. Johnson

Subject: Powers-of-two - 7 for recv() length??


Linux version 2.4.24

Given a TCP/IP server feeding data as fast as it can
to a stream connection on a dedicated link, and a
client receiving that data, I observe that the data
received is usually (power-of-two - 7) bytes in
length, followed by the remaining 7 bytes returned by
a later recv() call.

It makes no difference if Nagle is turned OFF (TCP_NODELAY).

Very bad!
Recv value was 114681
Remaining length was was 409607
Very bad!
Recv value was 7
Remaining length was was 0
Very bad!
Recv value was 65529
Remaining length was was 458759
Very bad!
Recv value was 7
Remaining length was was 0
Very bad!
Recv value was 65529
Remaining length was was 458759
Very bad!
Recv value was 7
Remaining length was was 0
Very bad!
Recv value was 65529
Remaining length was was 458759
Very bad!
Recv value was 7
Remaining length was was 0
Very bad!
Recv value was 65529
Remaining length was was 458759
Very bad!
Recv value was 7
Remaining length was was 0
Very bad!
Recv value was 65529
Remaining length was was 458759
Very bad!
Recv value was 7
Remaining length was was 0
Very bad!
Recv value was 65529
Remaining length was was 458759
Very bad!
Recv value was 7
Remaining length was was 0
Very bad!
Recv value was 65529
Remaining length was was 458759
[SNIPPED....]

Code snippet:

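len = BUF_LEN;
while(len)
{
    if((ret = recv(s, cp, len, 0)) <= 0)
    {
        /* errno is only meaningful when recv() returned -1 */
        if(ret < 0 && errno == EINTR)
            continue;
        handler(0);     /* error or EOF; presumably does not return */
    }
    len -= ret;
    cp += ret;
    if(ret & 1)         /* report odd-length reads only */
    {
        fprintf(stderr, "Very bad!\n");
        fprintf(stderr, "Recv value was %d\n", ret);
        fprintf(stderr, "Remaining length was was %u\n", len);
    }
}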
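
For reference, a stream socket is allowed to return fewer bytes than
requested, so a receive loop has to cope with short reads of any size.
What follows is only a sketch of such a loop, not the poster's code: the
name recv_full and the call counter are hypothetical, added here so the
number of recv() calls per buffer can be observed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not from the original post): read exactly `want`
 * bytes from a stream socket, tolerating short reads of any size.
 * Returns the number of bytes read (less than `want` only on EOF),
 * or -1 on error with errno set by recv().
 * If `calls` is non-NULL it receives the number of recv() calls that
 * returned data.
 */
static ssize_t recv_full(int fd, void *buf, size_t want, unsigned *calls)
{
    char *cp = buf;
    size_t got = 0;
    unsigned n = 0;

    assert(buf != NULL);
    while (got < want) {
        ssize_t ret = recv(fd, cp + got, want - got, 0);
        if (ret < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;          /* real error, errno is set */
        }
        if (ret == 0)
            break;              /* peer closed the connection */
        got += (size_t)ret;
        n++;
    }
    if (calls)
        *calls = n;
    return (ssize_t)got;
}
```

A caller would pass the connected socket, the 1/2 megabyte buffer and
its length, then compare the returned count against the requested
length to detect an early EOF.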

Given that the transport sends only even numbers of bytes, I
would guess that there is considerable overhead associated with
the 7-byte break. This likely points to something being
broken and some work-around incorporated to "fix" it. The
additional calls necessary to receive a mere 7 bytes into
a buffer that expects to be filled with 1/2 megabyte
seriously reduce the throughput, not only because of the
additional call, but because of the odd-byte buffer
alignment left over for the next recv() call.
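
A note on reading the log, going only by the snippet above: it prints
odd-sized reads and nothing else, and it prints the remaining length
after subtracting the read. So 65529 + 7 = 65536 = 2^16, and
65529 + 458759 = 524288, which is the full 1/2 megabyte. The arithmetic
can be sketched as below. BUF_LEN is assumed to be 512 KiB (the post
does not show its definition) and the function names are made up for
illustration.

```c
#include <assert.h>

#define BUF_LEN (512u * 1024u)  /* assumed: "1/2 megabytes" in the post */

/* Nonzero if n is a power of two. */
static int is_pow2(unsigned n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Bytes still wanted just before a logged recv() returned. The snippet
 * prints `len` after `len -= ret`, so the two logged values add back up.
 */
static unsigned wanted_before(unsigned ret, unsigned remaining)
{
    return ret + remaining;
}

/*
 * Bytes that arrived in unlogged (even-sized) reads between two logged
 * odd-sized reads belonging to the same buffer.
 */
static unsigned unlogged_between(unsigned first_remaining,
                                 unsigned next_ret, unsigned next_remaining)
{
    assert(first_remaining >= next_ret + next_remaining);
    return first_remaining - (next_ret + next_remaining);
}
```

For the logged pair (65529, 458759) followed by (7, 0), wanted_before()
gives BUF_LEN and 7, and unlogged_between(458759, 7, 0) gives 458752.
If that reading is right, the short read is the first recv() of each
buffer and the 7-byte read is the last one, with the even-sized reads
in between never reaching the log.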

Could somebody who understands the network code please
find out what is going on? I can't find Alexy who used
to handle these kinds of problems off the list.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.24 on an i686 machine (797.90 BogoMips).
Note 96.31% of all statistics are fiction.



2004-03-31 17:22:24

by Stephen Hemminger

Subject: Re: Powers-of-two - 7 for recv() length??

What is the socket send/receive buffering, and what is the underlying network?
You need to look at the data flow with something like tcpdump and tcptrace.
If you get flow controlled, or for lots of other reasons, TCP will validly
send a small number of bytes (like 1), which will get things out of alignment.

2004-03-31 18:02:53

by Richard B. Johnson

Subject: Re: Powers-of-two - 7 for recv() length??

On Wed, 31 Mar 2004, Stephen Hemminger wrote:

> What is the socket send/receive buffering, and the underlying network.
> You need to look at the data flow with something like tcpdump and tcptrace.
> If you get flow controlled or lots of other reasons, TCP will validly
> send a small number of bytes (like 1) which will get things out of alignment.
>

Hmmm. I get lots of truncated IP packets. See attached. I've tried
to help by setting both SO_RCVBUF and SO_SNDBUF to 1/2 megabyte. Nothing
seems to work except sending only 1436 bytes at a time. That makes
everything miserably slow. 1436 comes from (1500 - 64): 1500 being the
Ethernet packet length, 64 being the IP header length.
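
Setting the buffer sizes as described might look roughly like the
sketch below. The helper name request_buf is hypothetical. One thing
worth checking is the value the kernel actually applied, since the
request may be adjusted (Linux documents that it doubles the value and
caps it at a system-wide limit), so the sketch reads the option back.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: ask for a socket buffer of `bytes` bytes and
 * return the size the kernel reports afterwards, or -1 on error.
 * `optname` must be SO_RCVBUF or SO_SNDBUF.
 */
static int request_buf(int fd, int optname, int bytes)
{
    int actual = 0;
    socklen_t optlen = sizeof(actual);

    assert(optname == SO_RCVBUF || optname == SO_SNDBUF);
    if (setsockopt(fd, SOL_SOCKET, optname, &bytes, sizeof(bytes)) < 0)
        return -1;
    /* Read the option back: the kernel may not grant what was asked. */
    if (getsockopt(fd, SOL_SOCKET, optname, &actual, &optlen) < 0)
        return -1;
    return actual;
}
```

Calling request_buf(s, SO_RCVBUF, 512 * 1024) and the same for
SO_SNDBUF, then printing both results, would show whether the 1/2
megabyte request really took effect.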

Cheers,
Dick Johnson
Penguin : Linux version 2.4.24 on an i686 machine (797.90 BogoMips).
Note 96.31% of all statistics are fiction.


Attachments:
tcpdump.log.gz (43.36 kB)

2004-03-31 18:50:49

by Stephen Hemminger

Subject: Re: Powers-of-two - 7 for recv() length??

On Wed, 31 Mar 2004 13:02:03 -0500 (EST)
"Richard B. Johnson" <[email protected]> wrote:

> On Wed, 31 Mar 2004, Stephen Hemminger wrote:
>
> > What is the socket send/receive buffering, and the underlying network.
> > You need to look at the data flow with something like tcpdump and tcptrace.
> > If you get flow controlled or lots of other reasons, TCP will validly
> > send a small number of bytes (like 1) which will get things out of alignment.
> >
>
> Hmmm. I get lots of truncated IP packets. See attached. I've tried
> to help by setting both RCV_BUF and SND_BUF to 1/2 megabytes. Nothing
> seems to work except sending only 1436 bytes at a time. That makes
> everything miserably slow. 1436 comes from (1500 - 64) 1500 being the
> ethernet packet length, 64 being the IP header length.
>

That is because you are running on the loopback interface, which has an
MTU of 16K. And tcpdump is being smart and only capturing part of each
packet.
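
To put numbers on the MTU point, here is a small sketch of the payload
a single TCP segment can carry for a given MTU. The header sizes are
the standard option-free IPv4 and TCP headers (20 bytes each) plus 12
bytes when TCP timestamps are negotiated. The loopback MTU of 16436
used in the note after the code is an assumption about kernels of that
era; the thread itself only says "16K".

```c
#include <assert.h>

enum {
    IPV4_HDR  = 20,     /* IPv4 header without options */
    TCP_HDR   = 20,     /* TCP header without options */
    TCP_TSOPT = 12      /* timestamp option plus padding, if negotiated */
};

/* Largest TCP payload that fits in one segment for the given MTU. */
static int tcp_payload(int mtu, int use_timestamps)
{
    assert(mtu > IPV4_HDR + TCP_HDR + TCP_TSOPT);
    return mtu - IPV4_HDR - TCP_HDR - (use_timestamps ? TCP_TSOPT : 0);
}
```

With these assumptions, tcp_payload(1500, 0) is 1460 and
tcp_payload(1500, 1) is 1448 on Ethernet, while tcp_payload(16436, 1)
is 16384 on loopback. That would fit the logged sizes, which fall 7
bytes short of multiples of 16384 (65536 = 4 * 16384 and
114688 = 7 * 16384), though that connection is a guess and not
something the thread confirms.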



> Cheers,
> Dick Johnson
> Penguin : Linux version 2.4.24 on an i686 machine (797.90 BogoMips).
> Note 96.31% of all statistics are fiction.
>
>


--
Stephen Hemminger mailto:[email protected]
Open Source Development Lab http://developer.osdl.org/shemminger