As a follow up to my question from yesterday on the netdev list, here is
what I think is a real problem, either in the kernel or in the POSIX spec.
The POSIX spec currently says this about SOCK_DGRAM sockets:
If address is a null address for the protocol, the socket’s peer
address shall be reset.
The term "null address" is not further specified but it will usually be
read to allow the following scenario to work out:
fd = socket(AF_INET6, ...)
connect(fd, ...some IPv6 address...)
struct sockaddr_in6 sin6 = { .sin6_family = AF_INET6 };
connect(fd, &sin6, sizeof (sin6));
connect(fd, ...some new IPv6 address...)
This does not work on Linux at the moment. The socket remains connected
to the old IPv6 address, yet the second connect() call succeeds (this
does not sound OK). What does work is using AF_UNSPEC instead of
AF_INET6 in the connect() call that disassociates the address.
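For reference, a minimal sketch of the AF_UNSPEC variant that does work.
The helper names (loopback_addr, has_peer, demo_unspec_reset), the use of
the loopback address and the port number 9 are assumptions made for
illustration, not part of the original report. getpeername() is used as
the indicator of whether the socket still has a peer.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Fill *ss with the loopback address of the given family and a port.
 * Returns the address length, or 0 for an unsupported family.
 * (Hypothetical helper, for illustration only.) */
static socklen_t loopback_addr(int family, unsigned short port,
                               struct sockaddr_storage *ss)
{
    assert(ss != NULL);
    memset(ss, 0, sizeof(*ss));
    if (family == AF_INET) {
        struct sockaddr_in *sin = (struct sockaddr_in *)ss;
        sin->sin_family = AF_INET;
        sin->sin_port = htons(port);
        sin->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        return sizeof(*sin);
    }
    if (family == AF_INET6) {
        struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)ss;
        sin6->sin6_family = AF_INET6;
        sin6->sin6_port = htons(port);
        sin6->sin6_addr = in6addr_loopback;
        return sizeof(*sin6);
    }
    return 0;
}

/* Returns 1 if getpeername() reports a peer, 0 on ENOTCONN,
 * -1 on any other error. */
static int has_peer(int fd)
{
    struct sockaddr_storage ss;
    socklen_t len = sizeof(ss);

    if (getpeername(fd, (struct sockaddr *)&ss, &len) == 0)
        return 1;
    return errno == ENOTCONN ? 0 : -1;
}

/* Connect a datagram socket, then dissolve the association with an
 * AF_UNSPEC address. Returns 0 if the socket had a peer after the first
 * connect() and none after the second one, -1 otherwise. */
int demo_unspec_reset(int family)
{
    struct sockaddr_storage peer;
    struct sockaddr unspec;
    socklen_t len = loopback_addr(family, 9, &peer);
    int fd, ok = -1;

    if (len == 0)
        return -1;
    fd = socket(family, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(&unspec, 0, sizeof(unspec));
    unspec.sa_family = AF_UNSPEC;

    if (connect(fd, (struct sockaddr *)&peer, len) == 0 &&
        has_peer(fd) == 1 &&
        connect(fd, &unspec, sizeof(unspec)) == 0 &&
        has_peer(fd) == 0)
        ok = 0;

    close(fd);
    return ok;
}
```

Calling demo_unspec_reset(AF_INET6) exercises the IPv6 case from the
scenario above, provided an IPv6 loopback address is configured.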
The question is: do people here think this is a problem in the POSIX
spec? Binding to :: and 0.0.0.0 isn't possible, so maybe the Linux
implementation should allow this?
If you think the POSIX spec is wrong (and can point to other
implementations doing the same as Linux) let me know and I'll work on
getting the spec changed.
- --
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
Ulrich Drepper <[email protected]> writes:
>
> fd = socket(AF_INET6, ...)
>
> connect(fd, ...some IPv6 address...)
>
> struct sockaddr_in6 sin6 = { .sin6_family = AF_INET6 };
> connect(fd, &sin6, sizeof (sin6));
The standard way to undo connect is to use AF_UNSPEC. Code to handle
that for dgram sockets is there. It's the same code for v4 and v6.
-Andi
From: Ulrich Drepper <[email protected]>
Date: Wed, 19 Sep 2007 08:21:47 -0700
> If you think the POSIX spec is wrong (and can point to other
> implementations doing the same as Linux) let me know and I'll work on
> getting the spec changed.
The whole AF_UNSPEC thing I'm almost certain comes from BSD, which has
behaved that way for centuries.
Someone needs to cull through Stevens' Volume 2 to verify this; I'm
too busy at the moment to do so myself.
On Wed, 19 Sep 2007 09:15:10 -0700 (PDT)
David Miller <[email protected]> wrote:
> From: Ulrich Drepper <[email protected]>
> Date: Wed, 19 Sep 2007 08:21:47 -0700
>
> > If you think the POSIX spec is wrong (and can point to other
> > implementations doing the same as Linux) let me know and I'll work on
> > getting the spec changed.
>
> The whole AF_UNSPEC thing I'm almost certain comes from BSD, which has
> behaved that way for centuries.
We got it from the 1003.4g draft socket specification if I remember
rightly. It's entirely plausible that it got it from 4BSD.
Alan
Andi Kleen wrote:
> The standard way to undo connect is to use AF_UNSPEC. Code to handle
> that for dgram sockets is there. It's the same code for v4 and v6.
I quoted the standard and it does not say anything about AF_UNSPEC. So
you cannot simply make such broad statements.
I also don't say that this behavior should be removed. It's certainly
useful, very much so in fact.
But the spec calls for a "null address" to be used and that's in my
understanding something different from using AF_UNSPEC.
I looked through Stevens' TCP/IP Illustrated Vol 2 and it seems not to
mention resetting the address at all. The POSIX spec certainly got this
text from .1g.
I cannot test it on other systems. If somebody has access to some
certified systems (and maybe others), write a bit of code which creates
a DGRAM socket, connects it to one address, calls connect with a "null
address", then connects to another address (which likely has to use a
different interface since otherwise the connect will just succeed, it
seems).
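A rough sketch of such a probe, under stated assumptions: the names
(struct probe_result, make_addr, probe_null_address) are hypothetical,
both real peers are loopback addresses with arbitrary ports 9 and 10
(so it does not cover the "different interface" case), and getpeername()
is taken as the indicator of whether a peer is still set, which may not
reflect the kernel's internal state on every implementation. The sketch
only records what happens and makes no claim about what any given
system returns.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

struct probe_result {
    int null_rc;      /* return value of connect() with the null address */
    int null_errno;   /* errno of that call, 0 if it succeeded */
    int peer_after;   /* 1 if getpeername() still reports a peer, else 0 */
    int reconnect_rc; /* return value of the final connect() */
};

/* Build an address of the given family. With loopback == 0 everything
 * except the family field stays zero, i.e. a "null address" in the sense
 * discussed in this thread. Returns the length, 0 if unsupported. */
static socklen_t make_addr(int family, int loopback, unsigned short port,
                           struct sockaddr_storage *ss)
{
    memset(ss, 0, sizeof(*ss));
    if (family == AF_INET) {
        struct sockaddr_in *sin = (struct sockaddr_in *)ss;
        sin->sin_family = AF_INET;
        sin->sin_port = htons(port);
        if (loopback)
            sin->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        return sizeof(*sin);
    }
    if (family == AF_INET6) {
        struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)ss;
        sin6->sin6_family = AF_INET6;
        sin6->sin6_port = htons(port);
        if (loopback)
            sin6->sin6_addr = in6addr_loopback;
        return sizeof(*sin6);
    }
    return 0;
}

/* connect() to a first peer, connect() with the null address of the same
 * family, then connect() to a second peer. Fills *out with what was
 * observed. Returns 0 if the probe could run, -1 if the setup failed. */
int probe_null_address(int family, struct probe_result *out)
{
    struct sockaddr_storage first, null_addr, second, tmp;
    socklen_t len = make_addr(family, 1, 9, &first);
    socklen_t tmplen = sizeof(tmp);
    int fd;

    assert(out != NULL);
    if (len == 0)
        return -1;
    make_addr(family, 0, 0, &null_addr);
    make_addr(family, 1, 10, &second);

    fd = socket(family, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&first, len) != 0) {
        close(fd);
        return -1;
    }

    errno = 0;
    out->null_rc = connect(fd, (struct sockaddr *)&null_addr, len);
    out->null_errno = out->null_rc == 0 ? 0 : errno;
    out->peer_after =
        getpeername(fd, (struct sockaddr *)&tmp, &tmplen) == 0;
    out->reconnect_rc = connect(fd, (struct sockaddr *)&second, len);

    close(fd);
    return 0;
}
```

Running it for both AF_INET and AF_INET6 on each system of interest and
comparing the recorded fields would give the kind of data asked for
here.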
- --
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
From: Ulrich Drepper <[email protected]>
Date: Wed, 19 Sep 2007 09:49:09 -0700
> But the spec calls for a "null address" to be used and that's in my
> understanding something different from using AF_UNSPEC.
It just occurred to me that AF_UNSPEC might be used simply
because "all zeros" might be a valid real bindable address
for some address family. And using AF_UNSPEC avoids that
problem entirely.
On Wed, Sep 19, 2007 at 09:49:09AM -0700, Ulrich Drepper wrote:
> Andi Kleen wrote:
> > The standard way to undo connect is to use AF_UNSPEC. Code to handle
> > that for dgram sockets is there. It's the same code for v4 and v6.
>
> I quoted the standard and it does not say anything about AF_UNSPEC. So
> you cannot simply make such broad statements.
Ok, "standard" was perhaps a poor choice of words.
AF_UNSPEC was introduced long ago by Alan, based on some early
POSIX draft iirc.
Also, incidentally, it's a null address:
include/linux/socket.h:#define AF_UNSPEC 0
> But the spec calls for a "null address" to be used and that's in my
> understanding something different from using AF_UNSPEC.
memset(&sockaddr, 0, sizeof(sockaddr)) should give you AF_UNSPEC
-Andi
Ulrich Drepper wrote:
> Yes, but for IPv4/6 it's not an issue. Some implementations might
> handle all-zeros and the spec _currently_ calls for it. In this case an
> alignment would be good.
Searching the web turns up this:
http://developer.apple.com/documentation/Darwin/Reference/ManPages/man2/connect.2.html
Datagram sockets may dissolve the association by connecting to an
invalid address, such as a null address or an address with the address
family set to AF_UNSPEC (the error EAFNOSUPPORT will be harmlessly
returned).
I.e., at least Apple implements both variants.
- --
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
Andi Kleen wrote:
>> But the spec calls for a "null address" to be used and that's in my
>> understanding something different from using AF_UNSPEC.
>
> memset(&sockaddr, 0, sizeof(sockaddr)) should give you AF_UNSPEC
But the spec calls for <quote>null address for the protocol</quote>.
That means the family for the null address is the same as the family of
the socket.
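To make the distinction concrete, a small sketch contrasting the two
readings. The function names are hypothetical, and the first check
assumes AF_UNSPEC is defined as 0 (as in the Linux header quoted
earlier in the thread): a fully zeroed structure carries AF_UNSPEC,
whereas a "null address for the protocol" under the reading above keeps
the socket's own family and zeroes everything else.

```c
#include <assert.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Build the all-zero address of the given family. Returns the address
 * length, or 0 for a family this sketch does not handle. */
static socklen_t null_address_for(int family, struct sockaddr_storage *ss)
{
    assert(ss != NULL);
    memset(ss, 0, sizeof(*ss));
    if (family == AF_INET) {
        ss->ss_family = AF_INET;    /* 0.0.0.0, port 0 */
        return sizeof(struct sockaddr_in);
    }
    if (family == AF_INET6) {
        ss->ss_family = AF_INET6;   /* ::, port 0 */
        return sizeof(struct sockaddr_in6);
    }
    return 0;
}

/* Returns 1 if address and port are zero for AF_INET or AF_INET6. */
static int is_null_address(const struct sockaddr_storage *ss)
{
    if (ss->ss_family == AF_INET) {
        const struct sockaddr_in *sin = (const struct sockaddr_in *)ss;
        return sin->sin_port == 0 &&
               sin->sin_addr.s_addr == htonl(INADDR_ANY);
    }
    if (ss->ss_family == AF_INET6) {
        const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *)ss;
        return sin6->sin6_port == 0 &&
               IN6_IS_ADDR_UNSPECIFIED(&sin6->sin6_addr);
    }
    return 0;
}

/* Returns 0 if a zeroed sockaddr_in6 has family AF_UNSPEC while the
 * per-protocol null address keeps family AF_INET6; -1 otherwise. */
int demo_null_vs_unspec(void)
{
    struct sockaddr_in6 zeroed;
    struct sockaddr_storage null6;

    memset(&zeroed, 0, sizeof(zeroed));
    if (zeroed.sin6_family != AF_UNSPEC)  /* assumes AF_UNSPEC == 0 */
        return -1;
    if (null_address_for(AF_INET6, &null6) != sizeof(struct sockaddr_in6))
        return -1;
    if (null6.ss_family != AF_INET6 || !is_null_address(&null6))
        return -1;
    return 0;
}
```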
- --
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
On Wed, Sep 19, 2007 at 10:46:54AM -0700, Ulrich Drepper wrote:
> Andi Kleen wrote:
> >> But the spec calls for a "null address" to be used and that's in my
> >> understanding something different from using AF_UNSPEC.
> >
> > memset(&sockaddr, 0, sizeof(sockaddr)) should give you AF_UNSPEC
>
> But the spec calls for <quote>null address for the protocol</quote>.
>
> That means the family for the null address is the same as the family of
> the socket.
Spec doesn't match traditional behaviour then. IPv4 0.0.0.0 is
traditionally a synonym for old style all broadcast (255.255.255.255)
on UDP/RAW and it's certainly possible to connect() to that.
-Andi
On Wed, 19 Sep 2007 10:46:54 -0700
Ulrich Drepper <[email protected]> wrote:
> Andi Kleen wrote:
> >> But the spec calls for a "null address" to be used and that's in my
> >> understanding something different from using AF_UNSPEC.
> >
> > memset(&sockaddr, 0, sizeof(sockaddr)) should give you AF_UNSPEC
>
> But the spec calls for <quote>null address for the protocol</quote>.
>
> That means the family for the null address is the same as the family of
> the socket.
Which is a valid address in some protocols. If I remember rightly then
appletalk net 0 node 0 port 0 is valid although I'd want to look in the
book to check that - ditto AF_ECONET although I doubt anyone cares too
much 8)
Alan
Andi Kleen wrote:
> Spec doesn't match traditional behaviour then.
Well, determining whether that's the case is part of this exercise.
> IPv4 0.0.0.0 is
> traditionally a synonym for old style all broadcast (255.255.255.255)
> on UDP/RAW and it's certainly possible to connect() to that.
Where do you get this from? And where is this implemented? I don't
doubt it but I have to convince people to change the standard and
possibly introduce incompatibility.
- --
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
On Wed, Sep 19, 2007 at 11:02:00AM -0700, Ulrich Drepper wrote:
> > on UDP/RAW and it's certainly possible to connect() to that.
>
> Where do you get this from? And where is this implemented? I don't
Sorry it's actually loopback, not broadcast as implemented in Linux.
In Linux it's implemented in ip_route_output_slow(). Essentially
converted to 127.0.0.1
I think it's traditional BSD behaviour but couldn't find it on
a quick look in FreeBSD source (but haven't looked very intensively)
Admittedly port 0 is somewhat dodgy for UDP too, but at least in RAW
context it might be valid.
-Andi
Andi Kleen wrote:
> On Wed, Sep 19, 2007 at 11:02:00AM -0700, Ulrich Drepper wrote:
>
>>>on UDP/RAW and it's certainly possible to connect() to that.
>>
>>Where do you get this from? And where is this implemented? I don't
>
>
> Sorry it's actually loopback, not broadcast as implemented in Linux.
> In Linux it's implemented in ip_route_output_slow(). Essentially
> converted to 127.0.0.1
>
> I think it's traditional BSD behaviour but couldn't find it on
> a quick look in FreeBSD source (but haven't looked very intensively)
One has to set their way-back machine pretty far back to find the *BSD
bits which used 0.0.0.0 as the "all nets, all subnets" (to mis-use a
term) broadcast IPv4 address when sending. Perhaps as far back as the
time before HP-UX 7 or SunOS4. The bit errors in my dimm memory get
pretty dense that far back...
It has hung-on in various places (stacks) as an "accepted" broadcast IP
in the receive path, but not the send path for quite possibly decades now.
rick jones
> It has hung-on in various places (stacks) as an "accepted" broadcast IP
> in the receive path, but not the send path for quite possibly decades now.
Well it is valid in Linux for sending. And who knows who relies on it.
-Andi
David Miller wrote:
> It just occurred to me that AF_UNSPEC might be used simply
> because "all zeros" might be a valid real bindable address
> for some address family. And using AF_UNSPEC avoids that
> problem entirely.
Yes, but for IPv4/6 it's not an issue. Some implementations might
handle all-zeros and the spec _currently_ calls for it. In this case an
alignment would be good.
I guess I'll just go ahead and file a problem report with the spec.
Maybe the Unix vendors will test their implementations and provide feedback.
- --
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
On Wed, 19 Sep 2007 11:38:57 PDT, Rick Jones said:
> One has to set their way-back machine pretty far back to find the *BSD
> bits which used 0.0.0.0 as the "all nets, all subnets" (to mis-use a
> term) broadcast IPv4 address when sending. Perhaps as far back as the
> time before HP-UX 7 or SunOS4. The bit errors in my dimm memory get
> pretty dense that far back...
That would be BSD4.2 - BSD4.3 went to all-ones, and it *was* quite the
little mess if you had both flavors of boxes on the same subnet at the same
time: it would packet-storm *quite* easily.