2003-07-23 13:17:09

by David Korn

Subject: kernel bug in socketpair()


I am not sure what the procedure is for reporting bugs, but here
is a description of two bugs and a program that can be used
to reproduce them.

$ uname -a
Linux fror.research.att.com 2.4.18-18.7.xsmp #1 SMP Wed Nov 13 19:01:42 EST 2002


The first problem is that files created with socketpair() are not accessible
via /dev/fd/n or /proc/$$/fd/n where n is the file descriptor returned
by socketpair(). Note that this is not a problem with pipe().

The second problem is that if fchmod(fd,S_IWUSR) is applied to the write end
of a pipe(), it causes the read() end to also be write only so that
opening /dev/fd/n for read fails.

The following program demonstrates these problems. If invoked without
arguments, socketpair() is used to create two files. The later
open() calls on /dev/fd/n and /proc/$$/fd/n then fail.

With one argument, pipe() is used instead of socketpair() and the
program works. With two arguments, pipe() is used but fchmod()
is also called, and then it fails.

==================cut here======================
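#include <sys/types.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>

int main(int argc, char *argv[])
{
    char buff[256];
    int pv[2], fd;

    /* any argument selects pipe(); no arguments selects socketpair() */
    if(argc>1)
        fd = pipe(pv);
    else
        fd = socketpair(PF_UNIX, SOCK_STREAM, 0, pv);
    if(fd<0)
    {
        fprintf(stderr,"%s failed err=%d\n",argc>1?"pipe":"socketpair",errno);
        exit(1);
    }
    if(argc<2)
    {
        /* make the socket pair one-directional, like a pipe */
        if(shutdown(pv[0],SHUT_WR)< 0)
        {
            fprintf(stderr,"shutdown send failed err=%d\n",errno);
            exit(1);
        }
        if(shutdown(pv[1],SHUT_RD)< 0)
        {
            fprintf(stderr,"shutdown recv failed err=%d\n",errno);
            exit(1);
        }
    }
    /* fchmod() is skipped only when exactly one argument is given */
    if(argc!=2)
    {
        fchmod(pv[0],S_IRUSR);
        fchmod(pv[1],S_IWUSR);
    }
    snprintf(buff,sizeof(buff),"/dev/fd/%d",pv[0]);
    errno = 0;
    fd = open(buff,O_RDONLY);
    fprintf(stderr,"name=%s fd=%d errno=%d\n",buff,fd,errno);
    snprintf(buff,sizeof(buff),"/proc/%d/fd/%d",(int)getpid(),pv[0]);
    errno = 0;  /* reset so a stale value from the first open() is not reported */
    fd = open(buff,O_RDONLY);
    fprintf(stderr,"name=%s fd=%d errno=%d\n",buff,fd,errno);
    return(0);
}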

==================cut here======================

David Korn
[email protected]


2003-07-23 13:50:11

by David Miller

Subject: Re: kernel bug in socketpair()

On Wed, 23 Jul 2003 09:32:09 -0400 (EDT)
David Korn <[email protected]> wrote:

[ Added [email protected], the proper place to discuss networking kernel issues. ]

> The first problem is that files created with socketpair() are not accessible
> via /dev/fd/n or /proc/$$/fd/n where n is the file descriptor returned
> by socketpair(). Note that this is not a problem with pipe().

Not a bug.

Sockets are not openable via /proc files under any circumstances,
not just the circumstances you describe. This is a policy decision and
prevents a whole slew of potential security holes.

2003-07-23 14:10:21

by Alan

Subject: Re: kernel bug in socketpair()

On Mer, 2003-07-23 at 14:32, David Korn wrote:
> The first problem is that files created with socketpair() are not accessible
> via /dev/fd/n or /proc/$$/fd/n where n is the file descriptor returned
> by socketpair(). Note that this is not a problem with pipe().

This is intentional - sockets do not have an "open" operation currently.

> The second problem is that if fchmod(fd,S_IWUSR) is applied to the write end
> of a pipe(), it causes the read() end to also be write only so that
> opening /dev/fd/n for read fails.


That doesn't directly surprise me. Our pipes are BSD style not streams
pipes. One thing that means is that the pipe itself is a single inode.
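
The single-inode point can be checked from user space. The following is a minimal sketch, assuming Linux pipe behaviour as described above; the function name `fchmod_visible_on_read_end` is hypothetical and introduced only for illustration. It applies fchmod() to the write end of a pipe and then inspects the read end with fstat().

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Applies fchmod(mode) to the write end of a fresh pipe and reports whether
 * the read end shows the same inode and the same permission bits.
 * Returns 1 if so, 0 if not, -1 on error.
 */
int fchmod_visible_on_read_end(mode_t mode)
{
    int pv[2];
    struct stat rd, wr;
    int result = -1;

    assert((mode & ~(mode_t)0777) == 0);    /* permission bits only */
    if (pipe(pv) < 0)
        return -1;
    if (fchmod(pv[1], mode) == 0 &&
        fstat(pv[0], &rd) == 0 && fstat(pv[1], &wr) == 0)
        result = rd.st_dev == wr.st_dev &&
                 rd.st_ino == wr.st_ino &&
                 (rd.st_mode & 0777) == mode;
    close(pv[0]);
    close(pv[1]);
    return result;
}
```

A return value of 1 for `fchmod_visible_on_read_end(S_IWUSR)` would match the report: the mode set through the write end is what a later open() of the read end's /dev/fd entry is checked against.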

2003-07-23 14:14:11

by David Korn

Subject: Re: Re: kernel bug in socketpair()


> On Wed, 23 Jul 2003 09:32:09 -0400 (EDT)
> David Korn <[email protected]> wrote:
>
> [ Added [email protected], the proper place to discuss networking kernel
> issues. ]
>
> > The first problem is that files created with socketpair() are not accessible
> > via /dev/fd/n or /proc/$$/fd/n where n is the file descriptor returned
> > by socketpair(). Note that this is not a problem with pipe().
>
> Not a bug.
>
> Sockets are not openable via /proc files under any circumstances,
> not just the circumstances you describe. This is a policy decision and
> prevents a whole slew of potential security holes.
>
>

Thanks for your quick response.

This makes sense for INET sockets, but I don't understand the security
considerations for UNIX domain sockets. Could you please elaborate?
Moreover, /dev/fd/n (as opposed to /proc/$$/fd/n) is restricted to
the current process and its descendants if close-on-exec is not specified.
Again, I don't understand why this would create a security problem,
since the socket is already accessible via the original
descriptor.

Finally, if this is a security problem, why is errno set to ENXIO
rather than EACCES?

David Korn
[email protected]

2003-07-23 14:33:28

by David Miller

Subject: Re: kernel bug in socketpair()

On Wed, 23 Jul 2003 10:28:22 -0400 (EDT)
David Korn <[email protected]> wrote:

> This makes sense for INET sockets, but I don't understand the security
> considerations for UNIX domain sockets. Could you please elaborate?
> Moreover, /dev/fd/n (as opposed to /proc/$$/fd/n) is restricted to
> the current process and its descendants if close-on-exec is not specified.
> Again, I don't understand why this would create a security problem,
> since the socket is already accessible via the original
> descriptor.

Someone else would have to comment, but I do know we've had
this behavior since day one.

And therefore I wouldn't be doing many people much of a favor
by changing the behavior today, what will people do who need
their things to work on the bazillion existing linux kernels
running out there? :-)

Also, see below for another reason why this behavior is unlikely
to change.

> Finally, if this is a security problem, why is errno set to ENXIO
> rather than EACCES?

Look at the /proc file we put there for socket FD's. It's a symbolic
link with a readable string of the form ("socket:[%d]", inode_nr)

So your program ends up doing a follow of a symbolic link with that
string name, which does not exist.

Thinking more about this, changing this behavior would probably break
more programs than it would help begin to function, so this is unlikely
to ever change.
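
The link text mentioned above can be read back with readlink(). This is a minimal sketch, assuming a Linux /proc mounted at the usual place; the helper names `fd_link_text` and `socket_link_is_synthetic` are hypothetical and not taken from the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Copies the link text of /proc/self/fd/<fd> into out; returns its length or -1. */
ssize_t fd_link_text(int fd, char *out, size_t outsz)
{
    char path[64];
    ssize_t n;

    assert(out != NULL && outsz > 1);
    snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
    n = readlink(path, out, outsz - 1);
    if (n >= 0)
        out[n] = '\0';          /* readlink() does not terminate the string */
    return n;
}

/* Returns 1 if the link for a socketpair() end starts with "socket:[", 0 if not, -1 on error. */
int socket_link_is_synthetic(void)
{
    int sv[2];
    char text[128];
    int ok;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    ok = fd_link_text(sv[0], text, sizeof(text)) > 0 &&
         strncmp(text, "socket:[", 8) == 0;
    close(sv[0]);
    close(sv[1]);
    return ok;
}
```

The text is a label rather than a real path, which is consistent with the open() in the test program not finding anything to open.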

2003-07-23 15:21:21

by David Miller

Subject: Re: kernel bug in socketpair()

On 23 Jul 2003 15:20:08 +0100
Alan Cox <[email protected]> wrote:

> On Mer, 2003-07-23 at 14:32, David Korn wrote:
> > The first problem is that files created with socketpair() are not accessible
> > via /dev/fd/n or /proc/$$/fd/n where n is the file descriptor returned
> > by socketpair(). Note that this is not a problem with pipe().
>
> This is intentional - sockets do not have an "open" operation currently.

Sure, but we've known this for a long time.

And because we knew, we decided not to add an "open"
method to sockets. The reason, as I remember it, was
security.

Was it not?

2003-07-23 16:05:09

by Alan

Subject: Re: kernel bug in socketpair()

On Mer, 2003-07-23 at 16:36, David S. Miller wrote:
> > This is intentional - sockets do not have an "open" operation currently.
>
> Sure, but we've known this for a long time.
>
> And because we knew, we decided not to add an "open"
> method to sockets. The reason, as I remember it, was
> security.
>
> Was it not?

Mostly, if I remember rightly, it was that if you don't do the check, then
because there is no open operation to create a new instance, you crash the box.
HPA did have some sensible ideas about how to do "open" on AF_UNIX sockets, but
for the others it's really unclear quite what "open" means.

2003-07-23 16:41:22

by Glenn Fowler

Subject: Re: kernel bug in socketpair()


you can eliminate the security implications for all fd types by
simply translating
    open("/dev/fd/N",...)
to
    dup(atoi(N))
w.r.t. fd N in the current process

the problem is that linux took an implementation shortcut by symlinking
/dev/fd/N -> /proc/self/fd/N
and by the time the kernel sees /proc/self/fd/N the "self"-ness is apparently
lost, and it is forced to do the security checks

if the /proc fd open code has access to the original /proc/PID/fd/N path
then it can do dup(atoi(N)) when the PID is the current process without
affecting security

otherwise there is a bug in the /dev/fd/N -> /proc/self/fd/N implementation
and /dev/fd/N should be separated out to its (original) dup(atoi(N))
semantics

see http://mail-index.netbsd.org/current-users/1994/03/29/0027.html for
an early (bsd) discussion of /dev/fd/N vs. /proc/self/fd/N
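
The translation proposed above can be sketched in user space. This is only an illustration of the dup(atoi(N)) semantics, not kernel code and not how any shell implements it; `open_or_dup` and `demo_open_socket_by_name` are hypothetical names.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Like open(path, flags), except that "/dev/fd/N" is served by dup(N). */
int open_or_dup(const char *path, int flags)
{
    static const char prefix[] = "/dev/fd/";
    size_t plen = sizeof(prefix) - 1;

    assert(path != NULL);
    if (strncmp(path, prefix, plen) == 0 &&
        isdigit((unsigned char)path[plen])) {
        char *end;
        long n;

        errno = 0;
        n = strtol(path + plen, &end, 10);
        if (errno == 0 && *end == '\0' && n <= INT_MAX)
            return dup((int)n);         /* works for any fd type, sockets included */
    }
    return open(path, flags);
}

/* Opens one end of a socketpair() by its /dev/fd name and reads a byte through it. */
int demo_open_socket_by_name(void)
{
    int sv[2], fd, ok = 0;
    char name[32], c = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    snprintf(name, sizeof(name), "/dev/fd/%d", sv[0]);
    fd = open_or_dup(name, O_RDONLY);
    if (fd >= 0) {
        ok = write(sv[1], "x", 1) == 1 && read(fd, &c, 1) == 1 && c == 'x';
        close(fd);
    }
    close(sv[0]);
    close(sv[1]);
    return ok;
}
```

Note that the flags argument is ignored on the dup() path, so the returned descriptor keeps the access mode of the original.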

-- Glenn Fowler <[email protected]> AT&T Labs Research, Florham Park NJ --

On Wed, 23 Jul 2003 07:46:15 -0700 David S. Miller wrote:
> On Wed, 23 Jul 2003 10:28:22 -0400 (EDT)
> David Korn <[email protected]> wrote:

> > This makes sense for INET sockets, but I don't understand the security
> > considerations for UNIX domain sockets. Could you please elaborate?
> > Moreover, /dev/fd/n (as opposed to /proc/$$/fd/n) is restricted to
> > the current process and its descendants if close-on-exec is not specified.
> > Again, I don't understand why this would create a security problem,
> > since the socket is already accessible via the original
> > descriptor.

> Someone else would have to comment, but I do know we've had
> this behavior since day one.

> And therefore I wouldn't be doing many people much of a favor
> by changing the behavior today, what will people do who need
> their things to work on the bazillion existing linux kernels
> running out there? :-)

> Also, see below for another reason why this behavior is unlikely
> to change.

> > Finally, if this is a security problem, why is errno set to ENXIO
> > rather than EACCES?

> Look at the /proc file we put there for socket FD's. It's a symbolic
> link with a readable string of the form ("socket:[%d]", inode_nr)

> So your program ends up doing a follow of a symbolic link with that
> string name, which does not exist.

> Thinking more about this, changing this behavior would probably break
> more programs than it would help begin to function, so this is unlikely
> to ever change.

2003-07-23 16:47:56

by David Miller

Subject: Re: kernel bug in socketpair()

On Wed, 23 Jul 2003 12:56:12 -0400 (EDT)
Glenn Fowler <[email protected]> wrote:

> the problem is that linux took an implementation shortcut by symlinking
> /dev/fd/N -> /proc/self/fd/N
> and by the time the kernel sees /proc/self/fd/N the "self"-ness is apparently
> lost, and it is forced to do the security checks

None of this is true. If you open /proc/self/fd/N directly the problem
is still there.

> if the /proc fd open code has access to the original /proc/PID/fd/N path
> then it can do dup(atoi(N)) when the PID is the current process without
> affecting security

If we're talking about the current process, there is no use in using
/proc/*/fd/N to open a file descriptor in the first place, you can
simply call open(N,...)

I've personally always viewed /proc/*/fd/N as a way to see who has
various files or sockets open, ie. a debugging tool, not as a generic
way for processes to get access to each other's FDs.

There is an existing mechanism, a portable non-Linux one, that you
can use to do that.

Pass the fd over a UNIX domain socket if you want that, truly.
That works on every system.
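
Descriptor passing over a UNIX domain socket uses an SCM_RIGHTS control message. The following is a minimal sketch; the helper names `send_fd`, `recv_fd` and `demo_pass_fd` are hypothetical, and the demo passes the descriptor within one process purely to keep the example short.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_ctl {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Sends fd over the UNIX domain socket sock; returns 0 or -1. */
int send_fd(int sock, int fd)
{
    char byte = 0;                      /* one data byte travels with the descriptor */
    struct iovec iov = { &byte, 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    struct cmsghdr *cm;

    assert(sock >= 0 && fd >= 0);
    memset(&msg, 0, sizeof(msg));
    memset(&ctl, 0, sizeof(ctl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);
    cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type = SCM_RIGHTS;
    cm->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &fd, sizeof(int));
    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receives one descriptor from sock; returns the new fd or -1. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { &byte, 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    struct cmsghdr *cm;
    int fd;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);
    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    cm = CMSG_FIRSTHDR(&msg);
    if (cm == NULL || cm->cmsg_level != SOL_SOCKET ||
        cm->cmsg_type != SCM_RIGHTS || cm->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cm), sizeof(int));
    return fd;
}

/* Passes the read end of a pipe through a socketpair and reads via the copy. */
int demo_pass_fd(void)
{
    int chan[2], pv[2], got, ok = 0;
    char c = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, chan) < 0)
        return -1;
    if (pipe(pv) < 0) {
        close(chan[0]);
        close(chan[1]);
        return -1;
    }
    if (send_fd(chan[0], pv[0]) == 0 && (got = recv_fd(chan[1])) >= 0) {
        ok = write(pv[1], "x", 1) == 1 && read(got, &c, 1) == 1 && c == 'x';
        close(got);
    }
    close(pv[0]);
    close(pv[1]);
    close(chan[0]);
    close(chan[1]);
    return ok;
}
```

In real use the two ends of `chan` would live in different processes, for example on either side of a fork() or across a connected named socket.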

2003-07-23 17:09:44

by Glenn Fowler

Subject: Re: kernel bug in socketpair()


On Wed, 23 Jul 2003 10:00:43 -0700 David S. Miller wrote:
> On Wed, 23 Jul 2003 12:56:12 -0400 (EDT)
> Glenn Fowler <[email protected]> wrote:

> > the problem is that linux took an implementation shortcut by symlinking
> > /dev/fd/N -> /proc/self/fd/N
> > and by the time the kernel sees /proc/self/fd/N the "self"-ness is apparently
> > lost, and it is forced to do the security checks

> None of this is true. If you open /proc/self/fd/N directly the problem
> is still there.

you missed the point that the original open() call is on /dev/fd/N,
not /proc/PID/fd/N; /proc/PID/fd/N only comes into play because the
linux implementation foists it on the user

> > if the /proc fd open code has access to the original /proc/PID/fd/N path
> > then it can do dup(atoi(N)) when the PID is the current process without
> > affecting security

> If we're talking about the current process, there is no use in using
> /proc/*/fd/N to open a file descriptor in the first place, you can
> simply call open(N,...)

no, in the notation above N is the fd number "so you could simply call dup(N)"

here is one reason why /dev/fd/N is useful:

/dev/fd/N is the underlying mechanism for implementing the bash and ksh

cmd-1 <(cmd-2 ...) ... <(cmd-n ...)

each <(cmd-i ...) is converted to a pipe() with the write side getting the
output of cmd-i (and marked close on exec) and the read side *not* marked
close on exec; cmd-1 is then executed as

cmd-1 /dev/fd/PIPE-READ-2 ... /dev/fd/PIPE-READ-n

where PIPE-READ-i is the fd number of the read side of the pipe for cmd-i
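
The mechanism described above can be sketched for a single `<(cmd-2)`. This is an illustration under stated assumptions, not shell source code: the child process stands in for cmd-2 and simply writes a fixed line, the parent plays the role of cmd-1 by opening the /dev/fd name instead of exec'ing a command, and `read_substituted_output` is a hypothetical name.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Reads the child's output through /dev/fd/<read end>; returns bytes read or -1. */
ssize_t read_substituted_output(char *out, size_t outsz)
{
    int pv[2], fd;
    char name[32];
    ssize_t n = -1;
    pid_t pid;

    assert(out != NULL && outsz > 1);
    if (pipe(pv) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(pv[0]);
        close(pv[1]);
        return -1;
    }
    if (pid == 0) {                         /* child: stands in for cmd-2 */
        static const char text[] = "hello\n";

        close(pv[0]);
        _exit(write(pv[1], text, sizeof(text) - 1) < 0);
    }
    close(pv[1]);                           /* parent keeps only the read side */
    snprintf(name, sizeof(name), "/dev/fd/%d", pv[0]);
    fd = open(name, O_RDONLY);              /* what cmd-1 would do with its argument */
    if (fd >= 0) {
        n = read(fd, out, outsz - 1);
        close(fd);
    }
    close(pv[0]);
    waitpid(pid, NULL, 0);
    if (n >= 0)
        out[n] = '\0';
    return n;
}
```

With pipe() this open succeeds on Linux, as reported at the start of the thread; replacing pipe() with socketpair() in the sketch is the case that fails.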

2003-07-23 17:18:47

by David Miller

Subject: Re: kernel bug in socketpair()

On Wed, 23 Jul 2003 13:24:36 -0400 (EDT)
Glenn Fowler <[email protected]> wrote:

> /dev/fd/N is the underlying mechanism for implementing the bash and ksh
>
> cmd-1 <(cmd-2 ...) ... <(cmd-n ...)
>

Interesting.

I looked at the bash code, and it uses pipes with /dev/fd/N, and for
/dev/fd/N which are pipes the open should work under Linux.

This is what David Korn said in his original report.

I guess the part that is left is the fchmod() issue which exists
because one inode is used to implement both sides of the pipe under
Linux.

Was the idea, since fchmod() on pipes modifies both sides,
to use UNIX domain sockets to implement this? And that's how
you discovered the /dev/fd/N failure for sockets?

Another idea is to use named unix sockets. Can that be
sufficient to solve your dilemma?

2003-07-23 17:42:26

by Alan

Subject: Re: kernel bug in socketpair()

On Mer, 2003-07-23 at 17:56, Glenn Fowler wrote:
> you can eliminate the security implications for all fd types by
> simply translating
> open("/dev/fd/N",...)
> to
> dup(atoi(N))
> w.r.t. fd N in the current process

This has very different semantics. Consider lseek().
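
The lseek() difference can be made concrete. In this minimal sketch, which uses a temporary regular file and the hypothetical name `demo_offset_sharing`, a dup()ed descriptor shares its file offset with the original, while a separate open() of the same file gets an offset of its own. The second open here goes through the file's path rather than /dev/fd so that the sketch does not depend on any particular /dev/fd implementation.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Seeks the original descriptor to offset 4, then reports the offsets seen
 * through a dup() copy and through an independent open(). Returns 0 or -1.
 */
int demo_offset_sharing(off_t *after_dup, off_t *after_open)
{
    char path[] = "/tmp/offset-demo-XXXXXX";
    int fd, d, o, rc = -1;

    assert(after_dup != NULL && after_open != NULL);
    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    d = dup(fd);                        /* same open file description */
    o = open(path, O_RDONLY);           /* new open file description */
    unlink(path);
    if (d >= 0 && o >= 0 &&
        write(fd, "0123456789", 10) == 10 &&
        lseek(fd, 4, SEEK_SET) == 4) {
        *after_dup = lseek(d, 0, SEEK_CUR);     /* follows the original */
        *after_open = lseek(o, 0, SEEK_CUR);    /* still at the start */
        rc = 0;
    }
    if (d >= 0)
        close(d);
    if (o >= 0)
        close(o);
    close(fd);
    return rc;
}
```

A program that opens /dev/fd/N and expects a fresh offset would therefore behave differently under dup() semantics whenever N refers to a seekable file.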

> otherwise there is a bug in the /dev/fd/N -> /proc/self/fd/N implementation
> and /dev/fd/N should be separated out to its (original) dup(atoi(N))
> semantics

I don't see a bug. I see differing behaviour between Linux and BSD on a
completely non standards defined item. Also btw nobody ever really wrote
a /dev/fd/ for Linux - it was just a byproduct of the proc stuff someone
noticed. I guess someone could write a Plan-9 style dev/fd or devfdfs
for Linux if they wanted.

Alan

2003-07-23 18:01:16

by Glenn Fowler

Subject: Re: kernel bug in socketpair()


On Wed, 23 Jul 2003 10:31:35 -0700 David S. Miller wrote:
> Interesting.

> I looked at the bash code, and it uses pipes with /dev/fd/N, and for
> /dev/fd/N which are pipes the open should work under Linux.

> This is what David Korn said in his original report.

> I guess the part that is left is the fchmod() issue which exists
> because one inode is used to implement both sides of the pipe under
> Linux.

> Was the idea, since fchmod() on pipes modifies both sides,
> to use UNIX domain sockets to implement this? And that's how
> you discovered the /dev/fd/N failure for sockets?

fchmod() came into play with socketpair() to get the fd modes to match
pipe(); it's not needed with pipe()

we use socketpair() to allow efficient peeking on pipe input (via recv()),
where peek means "read some data but don't advance the read/seek offset"
btw, this is on systems that don't allow ioctl(I_PEEK) on pipe() fds;
if there is a way to peek pipe() data on linux then we can switch back
to pipe() and be on our way

> Another idea is to use named unix sockets. Can that be
> sufficient to solve your dilemma?

named sockets seem a little heavyweight for this application

2003-07-23 18:11:27

by David Miller

Subject: Re: kernel bug in socketpair()

On Wed, 23 Jul 2003 14:14:57 -0400 (EDT)
Glenn Fowler <[email protected]> wrote:

> named sockets seem a little heavyweight for this application

I think it'll be cheaper than unnamed unix sockets and
groveling in /proc/*/fd/

And even if there is a minor performance issue, you'll more than get
that back due to the portability gain. :-)

2003-07-23 18:42:54

by Glenn Fowler

Subject: Re: kernel bug in socketpair()


On Wed, 23 Jul 2003 11:23:07 -0700 David S. Miller wrote:
> On Wed, 23 Jul 2003 14:14:57 -0400 (EDT)
> Glenn Fowler <[email protected]> wrote:

> > named sockets seem a little heavyweight for this application

> I think it'll be cheaper than unnamed unix sockets and
> groveling in /proc/*/fd/

> And even if there is a minor performance issue, you'll more than get
> that back due to the portability gain. :-)

named unix sockets reside in the fs namespace, no?
so they must be linked to a dir before use and unlinked after use
the unlink after use would be particularly tricky for the parent process
implementing
cmd <(cmd ...) ...

2003-07-23 18:53:15

by David Miller

Subject: Re: kernel bug in socketpair()

On Wed, 23 Jul 2003 14:54:49 -0400 (EDT)
Glenn Fowler <[email protected]> wrote:

> On Wed, 23 Jul 2003 11:23:07 -0700 David S. Miller wrote:
> > On Wed, 23 Jul 2003 14:14:57 -0400 (EDT)
> > Glenn Fowler <[email protected]> wrote:
>
> > > named sockets seem a little heavyweight for this application
>
> > I think it'll be cheaper than unnamed unix sockets and
> > groveling in /proc/*/fd/
>
> > And even if there is a minor performance issue, you'll more than get
> > that back due to the portability gain. :-)
>
> named unix sockets reside in the fs namespace, no?

Right.

> so they must be linked to a dir before use and unlinked after use
> the unlink after use would be particularly tricky for the parent process
> implementing
> cmd <(cmd ...) ...

Hmmm... true.

I honestly don't know what to suggest you use, sorry :(

Is bash totally broken because of all this? Or does the problem only
trigger when using (cmd) subprocesses in a certain way?

2003-07-23 18:58:00

by Glenn Fowler

Subject: Re: kernel bug in socketpair()


On Wed, 23 Jul 2003 12:04:57 -0700 David S. Miller wrote:
> Is bash totally broken because of all this? Or does the problem only
> trigger when using (cmd) subprocesses in a certain way?

bash uses pipe() so it's ok
using socketpair() instead of pipe() introduces the problem
and we will now have to find an alternative to work around the
linux /dev/fd/N implementation

thanks

2003-07-23 19:04:49

by David Miller

Subject: Re: kernel bug in socketpair()

On Wed, 23 Jul 2003 15:11:47 -0400 (EDT)
Glenn Fowler <[email protected]> wrote:

> On Wed, 23 Jul 2003 12:04:57 -0700 David S. Miller wrote:
> > Is bash totally broken because of all this? Or does the problem only
> > trigger when using (cmd) subprocesses in a certain way?
>
> bash uses pipe() so it's ok
> using socketpair() instead of pipe() introduces the problem
> and we will now have to find an alternative to work around the
> linux /dev/fd/N implementation

I missed the reason why you can't use pipes and bash
is able to, what is it?

If it's the fchmod() thing, why doesn't bash have this issue?

2003-07-23 19:11:27

by Alan

Subject: Re: kernel bug in socketpair()

On Mer, 2003-07-23 at 19:54, Glenn Fowler wrote:
> named unix sockets reside in the fs namespace, no?
> so they must be linked to a dir before use and unlinked after use
> the unlink after use would be particularly tricky for the parent process
> implementing
> cmd <(cmd ...) ...

Portable stuff yes, Linux also supports a pure socket namespace for them
when the path starts with a nul character
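
A minimal sketch of the Linux-specific namespace mentioned above: the socket address has a leading nul byte in sun_path, so nothing is created in the filesystem and there is nothing to unlink afterwards. The helper names (`abstract_addr`, `listen_abstract`, `connect_abstract`, `demo_abstract_socket`) and the socket name used in the demo are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Fills addr with an abstract-namespace address; returns the address length. */
static socklen_t abstract_addr(struct sockaddr_un *addr, const char *name)
{
    size_t len = strlen(name);

    assert(len > 0 && len < sizeof(addr->sun_path));
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    memcpy(addr->sun_path + 1, name, len);      /* sun_path[0] stays '\0' */
    return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + len);
}

/* Creates a listening socket bound to the abstract name; returns fd or -1. */
int listen_abstract(const char *name)
{
    struct sockaddr_un addr;
    socklen_t alen = abstract_addr(&addr, name);
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    if (bind(fd, (struct sockaddr *)&addr, alen) < 0 || listen(fd, 1) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Connects to the abstract name; returns fd or -1. */
int connect_abstract(const char *name)
{
    struct sockaddr_un addr;
    socklen_t alen = abstract_addr(&addr, name);
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&addr, alen) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Returns 1 if a client can reach a listener by abstract name, 0 if not, -1 on error. */
int demo_abstract_socket(void)
{
    char name[64];
    int srv, cli;

    snprintf(name, sizeof(name), "abstract-demo-%ld", (long)getpid());
    srv = listen_abstract(name);
    if (srv < 0)
        return -1;
    cli = connect_abstract(name);
    close(srv);
    if (cli < 0)
        return 0;
    close(cli);
    return 1;
}
```

The name disappears when the last descriptor for the bound socket is closed, which avoids the unlink-after-use problem raised for filesystem names. It does not by itself give a child command a path it can open().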

2003-07-23 19:17:20

by Glenn Fowler

Subject: Re: kernel bug in socketpair()


On Wed, 23 Jul 2003 12:14:36 -0700 David S. Miller wrote:
> I missed the reason why you can't use pipes and bash
> is able to, what is it?

we have some applications, ksh included, with semantics that require
stdin be read at most one line at a time; an inefficient implementation
of this does 1 byte read()s until newline is read; an efficient
implementation does a peek read (without advancing the read/seek offset),
determines how many chars to read up to and including the newline,
and then read()s that much

linux has ioctl(I_PEEK) for stream devices and recv() for sockets,
and neither of these work on pipes; if there is a linux alternative
for pipes then we'd be glad to use it

we switched from pipe() to socketpair() to take advantage of the linux
recv() peek read
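
The peek-then-read approach described above can be sketched as follows. This is an illustration of the technique with recv(MSG_PEEK) on a socket, not the ksh implementation; `read_line_peek` and `demo_two_lines` are hypothetical names.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Consumes at most one line from socket fd into buf. Returns the number of
 * bytes consumed (including the newline if one was seen), 0 on EOF, -1 on
 * error. If the peeked data holds no newline, everything peeked is consumed
 * and the caller is expected to call again.
 */
ssize_t read_line_peek(int fd, char *buf, size_t bufsz)
{
    ssize_t n;
    char *nl;

    assert(buf != NULL && bufsz > 0);
    n = recv(fd, buf, bufsz, MSG_PEEK);     /* look without consuming */
    if (n <= 0)
        return n;
    nl = memchr(buf, '\n', (size_t)n);
    if (nl != NULL)
        n = nl - buf + 1;                   /* stop just past the newline */
    return read(fd, buf, (size_t)n);        /* consume exactly that much */
}

/* Writes two lines into a socketpair and reads them back one at a time. */
int demo_two_lines(void)
{
    int sv[2], ok = 0;
    char buf[64];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if (write(sv[1], "one\ntwo\n", 8) == 8)
        ok = read_line_peek(sv[0], buf, sizeof(buf)) == 4 &&
             memcmp(buf, "one\n", 4) == 0 &&
             read_line_peek(sv[0], buf, sizeof(buf)) == 4 &&
             memcmp(buf, "two\n", 4) == 0;
    close(sv[0]);
    close(sv[1]);
    return ok;
}
```

Data after the first newline stays in the socket for whichever process reads next, which is the property the one-line-at-a-time semantics require.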

2003-07-23 19:17:20

by David Korn

Subject: Re: Re: kernel bug in socketpair()

cc: [email protected] [email protected] [email protected] [email protected]

> I missed the reason why you can't use pipes and bash
> is able to, what is it?
>
> If it's the fchmod() thing, why doesn't bash have this issue?
>
>

The reason is that we want to be able to peek ahead at data in
the pipe before advancing. You can do this with recv() but
this doesn't work with pipes. On some systems you can use
an ioctl() for this with pipes, but Linux doesn't support this,
so ksh configures to use socketpair() instead of pipe()
on Linux. Without the ability to peek ahead on pipes, a command
like
cat file | { head -6 > /dev/null; cat ;}
to remove the first 6 lines of a file would be hard to implement
unless head reads one byte at a time from the pipe.
(OK, you could read 6 bytes at first if you want to optimize head.)

David Korn
[email protected]

2003-07-23 19:28:55

by Andreas Jellinghaus

Subject: Re: kernel bug in socketpair()

On Mit, 2003-07-23 at 19:00, David S. Miller wrote:
> If we're talking about the current process, there is no use in using
> /proc/*/fd/N to open a file descriptor in the first place, you can
> simply call open(N,...)

maybe you can use open on /proc/*/fd/N to open a file
already deleted from the filesystem? That might be useful.
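
A minimal sketch of that idea, assuming a Linux /proc: a temporary file is unlinked while still held open, then reopened through /proc/self/fd/N. The function name `reopen_deleted_matches` and the temporary path template are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Writes content to a temporary file, unlinks it, reopens it via
 * /proc/self/fd/<fd>, and checks that the same bytes can be read back.
 * Returns 1 on a match, 0 otherwise, -1 if the file could not be created.
 */
int reopen_deleted_matches(const char *content)
{
    char path[] = "/tmp/deleted-demo-XXXXXX";
    char link[64], buf[64];
    size_t len;
    int fd, fd2, ok = 0;

    assert(content != NULL);
    len = strlen(content);
    if (len == 0 || len > sizeof(buf))
        return 0;
    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                           /* the name is gone, the file is not */
    if (write(fd, content, len) == (ssize_t)len) {
        snprintf(link, sizeof(link), "/proc/self/fd/%d", fd);
        fd2 = open(link, O_RDONLY);         /* fresh descriptor, offset 0 */
        if (fd2 >= 0) {
            ok = read(fd2, buf, sizeof(buf)) == (ssize_t)len &&
                 memcmp(buf, content, len) == 0;
            close(fd2);
        }
    }
    close(fd);
    return ok;
}
```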

Andreas

2003-07-23 19:43:41

by David Miller

Subject: Re: kernel bug in socketpair()

On Wed, 23 Jul 2003 15:29:03 -0400 (EDT)
Glenn Fowler <[email protected]> wrote:

> linux has ioctl(I_PEEK) for stream devices and recv() for sockets,
> and neither of these work on pipes; if there is a linux alternative
> for pipes then we'd be glad to use it

Alan mentioned the pure-socket namespace we have for named unix
sockets, but I don't think you can actually use it for your
problem unfortunately.

2003-07-23 22:10:06

by jw schultz

Subject: Re: kernel bug in socketpair()

On Wed, Jul 23, 2003 at 03:29:03PM -0400, Glenn Fowler wrote:
>
> On Wed, 23 Jul 2003 12:14:36 -0700 David S. Miller wrote:
> > I missed the reason why you can't use pipes and bash
> > is able to, what is it?
>
> we have some applications, ksh included, with semantics that require
> stdin be read at most one line at a time; an inefficient implementation
> of this does 1 byte read()s until newline is read; an efficient
> implementation does a peek read (without advancing the read/seek offset),
> determines how many chars to read up to and including the newline,
> and then read()s that much
>
> linux has ioctl(I_PEEK) for stream devices and recv() for sockets,
> and neither of these work on pipes; if there is a linux alternative
> for pipes then we'd be glad to use it
>
> we switched from pipe() to socketpair() to take advantage of the linux
> recv() peek read

Perhaps you'd rather code a patch adding peek functionality
for pipes.

--
________________________________________________________________
J.W. Schultz Pegasystems Technologies
email address: [email protected]

Remember Cernan and Schmitt

2003-07-23 23:12:53

by Bill Rugolsky Jr.

Subject: Re: kernel bug in socketpair()

On Wed, Jul 23, 2003 at 06:50:41PM +0100, Alan Cox wrote:
> > otherwise there is a bug in the /dev/fd/N -> /proc/self/fd/N implementation
> > and /dev/fd/N should be separated out to its (original) dup(atoi(N))
> > semantics
>
> I don't see a bug. I see differing behaviour between Linux and BSD on a
> completely non standards defined item. Also btw nobody ever really wrote
> a /dev/fd/ for Linux - it was just a byproduct of the proc stuff someone
> noticed. I guess someone could write a Plan-9 style dev/fd or devfdfs
> for Linux if they wanted.

I first posted about this several years ago, and it came up again earlier
in the year; see:

http://hypermail.idiosynkrasia.net/linux-kernel/archived/2003/week14/0314.html

As HPA and I had previously discussed, ->open() methods always return
a new file struct, so providing the dup() semantics would require a
restructuring of the ->open() methods -- unless, (and this is a dirty
hack,) one creates a devfdfs that abuses the ERESTART_RESTARTBLOCK
mechanism to restart the open() syscall with dup() instead. This requires
some minor pollution to the open() syscall path to interpret the error
return, but should require no other changes.

Regards,

Bill Rugolsky