Could someone explain why send() is failing with EPIPE on the 2.4.x
kernel, while it works with the 2.2.x kernels?
The pseudocode:
sock = socket(AF_INET, SOCK_STREAM, 0)
buf = fcntl(sock, F_GETFL)
fcntl(sock, F_SETFL, buf | O_NONBLOCK) // we check the SETFL return value, it succeeds
while ((retval = connect(sock, addr, sizeof(struct sockaddr_in))) < 0)
{
    if (retval < 0) {
        if (errno != EINPROGRESS) return -1; // return failure
    }
} // the connect succeeds during first iteration with return value of 0.
send(sock, msg, msg_length, 0) // this connection is to the thttpd web server on the same host. XXX
XXX: send fails with EPIPE on the 2.4.0-test11-ac4 and 2.4.0-test12
kernels, whereas it does not fail on 2.2.14-5.0 (Red Hat kernel).
More info:
thttpd is working properly on the 2.4.x machine; I can access it via
Netscape. Our software is a proxy.
On Thu, Dec 14, 2000 at 03:12:27PM -0800, Adam Scislowicz wrote:
> Could someone explain why send() is failing with EPIPE on the 2.4.x
> kernel, while it works with the 2.2.x kernels?
>
> The pseudocode:
> sock = socket(AF_INET, SOCK_STREAM, 0)
> buf = fcntl(sock, F_GETFL)
> fcntl(sock, F_SETFL, buf | O_NONBLOCK) // we check the SETFL return value, it succeeds
> while ((retval = connect(sock, addr, sizeof(struct sockaddr_in))) < 0)
> {
>     if (retval < 0) {
>         if (errno != EINPROGRESS) return -1; // return failure
>     }
> } // the connect succeeds during first iteration with return value of 0.
>
> send(sock, msg, msg_length, 0) // this connection is to the thttpd web server on the same host. XXX
> XXX: send fails with EPIPE on the 2.4.0-test11-ac4 and 2.4.0-test12
> kernels, whereas it does not fail on 2.2.14-5.0 (Red Hat kernel).
EPIPE means that either the other end or you have closed the connection. It has
nothing to do with whether the socket is non-blocking.
-Andi
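The EPIPE behaviour described above can be reproduced in isolation. The following is a minimal sketch, assuming a Linux system: it uses an AF_UNIX socketpair instead of a TCP connection to thttpd so that it needs no server, and the function name `errno_after_peer_close` is made up for illustration. With TCP the error may only show up on a later send(), so this shows the principle rather than the exact timing in the proxy.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns the errno seen when sending to a peer that
 * has already closed, 0 if send() unexpectedly succeeds, or -1 if the
 * socket pair cannot be created. */
int errno_after_peer_close(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    close(sv[1]);                   /* the "other end" goes away */

    /* MSG_NOSIGNAL suppresses SIGPIPE, so the failure is reported through
     * the return value and errno instead of killing the process. */
    ssize_t n = send(sv[0], "x", 1, MSG_NOSIGNAL);
    int err = (n < 0) ? errno : 0;  /* save errno before any other call */

    close(sv[0]);
    return err;
}
```

Whether the socket is blocking or non-blocking makes no difference to the result here, which matches the point made in the reply.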
We understand the meaning of EPIPE. The question is why 2.4.x is returning
EPIPE, while 2.2.x is succeeding in sending the data to thttpd. Using the
2.2.x kernel our proxy functions, and I can access thttpd directly. In 2.4.x
I can access thttpd directly, but the proxy does not function.
I have already noticed that the 2.4.x kernel does not set errno = 0 in many
places where the 2.2.x kernel did, so there are differences.
-Adam
Andi wrote:
> EPIPE means that either the other end or you have closed the connection. It has
> nothing to do with whether the socket is non-blocking.
On Thu, Dec 14, 2000 at 03:26:53PM -0800, Adam Scislowicz wrote:
> We understand the meaning of EPIPE. The question is why 2.4.x is returning
> EPIPE, while 2.2.x is succeeding in sending the data to thttpd. Using the
> 2.2.x kernel our proxy functions, and I can access thttpd directly. In 2.4.x
> I can access thttpd
From your subject you seem not to.
To the best of my knowledge the receiver side EPIPE reporting has not changed,
so it must be something in the sender that causes it to close the connection
earlier. That is what you have to find out.
> I have already noticed that the 2.4.x kernel does not set errno = 0 in many
> places where the 2.2.x kernel did, so there are differences.
No system call ever sets errno = 0.
-Andi
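The point about errno can be checked with a few lines of C. This is a small sketch, not code from the proxy: the function name `errno_after_success` is invented for illustration. It relies on the POSIX rule that no library function or system call sets errno to zero, so a stale value survives later successful calls unless the application clears it.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: returns the errno value that is still visible after
 * a failed call has been followed by a successful one. */
int errno_after_success(void)
{
    errno = 0;              /* only the application ever clears errno */

    if (close(-1) == 0)     /* invalid descriptor, expected to fail (EBADF) */
        return 0;

    (void)getpid();         /* always succeeds, does not reset errno to 0 */

    return errno;           /* still non-zero, typically EBADF */
}
```

The practical consequence is to test the return value of a call first and to look at errno only when that return value reports a failure.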
> From your subject you seem not to.
>
I'm sorry for the subject. I just wanted to give the environmental factors, and it is a
non-blocking socket. At this point I am not sure whether that is relevant or not.
> To the best of my knowledge the receiver side EPIPE reporting has not changed,
> so it must be something in the sender that causes it to close the connection
> earlier. That is what you have to find out.
>
We simply rerun the same binary in the same environment, first with 2.2.x, and then
with 2.4.x. We have verified that the socket() and connect() calls are successful, and
all of our problems arise when we go to send().
We do not send() until our main select() loop sets the writable flag on our socket
descriptor, so our problem should not be related to a premature send().
I don't expect this to be a kernel bug, but I was hoping from the pseudocode I posted
to get a "you are doing this wrong" response.
Again, everything is working in 2.2.x, but not in 2.4.x. It may be that our coding error
is only expressed in combination with the 2.4.x kernel; that's why I asked on this
mailing list.
> No system call ever sets errno = 0.
Oh, something else in our system was doing this then. Thanx for the info.
-Adam
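For reference, the usual way to combine a non-blocking connect() with select() looks roughly like the sketch below. It is not the proxy's actual code: the function name `connect_nonblocking` and the `timeout_sec` parameter are assumptions made for illustration, and EINTR handling is left out for brevity. Two details matter. connect() is called once rather than in a loop, since a second call while the first is pending typically fails with EALREADY. And a socket reported as writable is not necessarily connected, so the pending error is fetched with SO_ERROR before any send().

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: starts a non-blocking connect on sock and waits up to
 * timeout_sec seconds for it to finish. Returns 0 on success, -1 on failure
 * with errno set. */
int connect_nonblocking(int sock, const struct sockaddr *addr,
                        socklen_t addrlen, int timeout_sec)
{
    int flags = fcntl(sock, F_GETFL);
    if (flags < 0 || fcntl(sock, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;

    if (connect(sock, addr, addrlen) == 0)
        return 0;                   /* completed immediately */
    if (errno != EINPROGRESS)
        return -1;                  /* genuine failure */

    fd_set wfds;
    FD_ZERO(&wfds);
    FD_SET(sock, &wfds);
    struct timeval tv = { timeout_sec, 0 };

    int ready = select(sock + 1, NULL, &wfds, NULL, &tv);
    if (ready == 0)
        errno = ETIMEDOUT;          /* nothing happened within the timeout */
    if (ready <= 0)
        return -1;

    /* Writable only means the attempt has finished: ask how it ended. */
    int err = 0;
    socklen_t len = sizeof(err);
    if (getsockopt(sock, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return -1;
    if (err != 0) {
        errno = err;
        return -1;
    }
    return 0;
}
```

A caller would pass a freshly created socket and a fully initialised address, and only call send() after this function has returned 0.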
On Thu, Dec 14, 2000 at 03:54:16PM -0800, Adam Scislowicz wrote:
> > From your subject you seem not to.
> >
> I'm sorry for the subject. I just wanted to give the environmental factors, and it is a
> non-blocking socket. At this point I am not sure whether that is relevant or not.
>
> > To the best of my knowledge the receiver side EPIPE reporting has not changed,
> > so it must be something in the sender that causes it to close the connection
> > earlier. That is what you have to find out.
> >
> We simply rerun the same binary in the same environment, first with 2.2.x, and then
> with 2.4.x. We have verified that the socket() and connect() calls are successful, and
> all of our problems arise when we go to send().
> We do not send() until our main select() loop sets the writable flag on our socket
> descriptor, so our problem should not be related to a premature send().
> I don't expect this to be a kernel bug, but I was hoping from the pseudocode I posted
> to get a "you are doing this wrong" response.
It is hard to be sure without a tcpdump log of the incident. If you send me one I'll look
at it.
-Andi
I previously wrote:
> Could someone explain why send is failing with EPIPE on the 2.4.x
> kernel, while it is working with the 2.2.x kernels.
It turns out the socket family was not being set to AF_INET :/
It was working in 2.2.x because in our situation the sock family was being
initialized to AF_INET. This is not behaviour we should have been depending on.
Sorry 'bout that.
-Adam
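A defensive way to avoid this class of bug is to zero the address structure and set every field explicitly before calling connect(). The sketch below assumes that the unset field was `sin_family` in the `struct sockaddr_in` passed to connect(); the thread does not show the real code, and the helper name `make_inet_addr` is made up for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fills *addr for an IPv4 endpoint. Returns 0 on
 * success, or -1 if ip is not a valid dotted-quad string. */
int make_inet_addr(struct sockaddr_in *addr, const char *ip,
                   unsigned short port)
{
    memset(addr, 0, sizeof(*addr)); /* no leftover stack or heap contents */
    addr->sin_family = AF_INET;     /* set explicitly, never left to chance */
    addr->sin_port = htons(port);   /* port in network byte order */

    /* inet_pton() returns 1 only when the string parses as an IPv4 address. */
    return inet_pton(AF_INET, ip, &addr->sin_addr) == 1 ? 0 : -1;
}
```

The result would then be passed to connect() as `(struct sockaddr *)&addr` with a length of `sizeof(addr)`. With the structure zeroed first, a forgotten field shows up as the same failure on every kernel instead of depending on whatever happened to be in memory.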