2009-11-19 13:44:14

by Jeff Layton

[permalink] [raw]
Subject: [PATCH] procfs: make /proc style symlinks behave like "normal" symlinks

This is the fourth attempt to fix problems with opening files via
/proc/pid symlinks. This one takes yet another approach and makes
/proc/pid symlinks behave like "normal" symlinks.

/proc/pid symlinks are "special". They do not return an actual path
string like most filesystems. They instead return a filled out struct
path in the nameidata and set the last_type to LAST_BIND. This allows
the path resolution code to skip resolving a text path to a struct path
since in general the kernel already knows to which dentry these symlinks
point.

There are a couple of problems with this shortcut however:

1) It causes the path walking code to skip revalidating the dentry.
It's assumed that this dentry will be valid and in general, that's a
valid assumption. Still, some filesystems such as NFSv4 rely on a call
to d_revalidate to handle opens. Opening via a /proc symlink causes that
to get skipped leading to a filp with no NFSv4 state attached.

2) In certain situations, this behavior can cause directory permissions
to be ignored. The situations identified for this are rather contrived,
but it's still a security issue (albeit a rather benign one). Details of
that problem are in the following thread:

http://seclists.org/bugtraq/2009/Oct/179

At this point, we're faced with a choice -- add special casing to the
path walking code to fix these problems, or change /proc/pid symlinks
such that they behave like "normal" symlinks. This patch implements the
latter.

Instead of returning a struct path in the nameidata, have
proc_pid_follow_link allocate a page and convert the struct path
into a path string. That string is then returned and the path walking
code can treat it like any other symlink, converting it back to a
struct path.

This approach is less efficient than the current approach, but it allows
us to avoid adding new special cases to the path walking code. I don't
perceive /proc/pid symlinks to be a performance-critical codepath, so
I'm making the assumption that this is a reasonable tradeoff.

There is at least one user-visible change that this introduces. If a
symlink under /proc/pid points to a deleted file, the existing code will
still look like a valid symlink. With this code change that will no
longer be the case, but that may be a good thing. Users shouldn't need
to take special steps to ensure that an open file is no longer
accessible for new opens once it has been unlinked.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/proc/base.c | 85 +++++++++++++++++++++++++-------------------------------
1 files changed, 38 insertions(+), 47 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index af643b5..b383f08 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -81,6 +81,7 @@
#include <linux/elf.h>
#include <linux/pid_namespace.h>
#include <linux/fs_struct.h>
+#include <linux/highmem.h>
#include "internal.h"

/* NOTE:
@@ -1343,68 +1344,58 @@ static int proc_exe_link(struct inode *inode, struct path *exe_path)
static void *proc_pid_follow_link(struct dentry *dentry, struct nameidata *nd)
{
struct inode *inode = dentry->d_inode;
- int error = -EACCES;
-
- /* We don't need a base pointer in the /proc filesystem */
- path_put(&nd->path);
-
- /* Are we allowed to snoop on the tasks file descriptors? */
- if (!proc_fd_access_allowed(inode))
- goto out;
-
- error = PROC_I(inode)->op.proc_get_link(inode, &nd->path);
- nd->last_type = LAST_BIND;
-out:
- return ERR_PTR(error);
-}
-
-static int do_proc_readlink(struct path *path, char __user *buffer, int buflen)
-{
- char *tmp = (char*)__get_free_page(GFP_TEMPORARY);
char *pathname;
- int len;
-
- if (!tmp)
- return -ENOMEM;
-
- pathname = d_path(path, tmp, PAGE_SIZE);
- len = PTR_ERR(pathname);
- if (IS_ERR(pathname))
- goto out;
- len = tmp + PAGE_SIZE - 1 - pathname;
-
- if (len > buflen)
- len = buflen;
- if (copy_to_user(buffer, pathname, len))
- len = -EFAULT;
- out:
- free_page((unsigned long)tmp);
- return len;
-}
-
-static int proc_pid_readlink(struct dentry * dentry, char __user * buffer, int buflen)
-{
- int error = -EACCES;
- struct inode *inode = dentry->d_inode;
+ struct page *page = NULL;
struct path path;
+ int error;

/* Are we allowed to snoop on the tasks file descriptors? */
- if (!proc_fd_access_allowed(inode))
+ if (!proc_fd_access_allowed(inode)) {
+ pathname = ERR_PTR(-EACCES);
goto out;
+ }

error = PROC_I(inode)->op.proc_get_link(inode, &path);
- if (error)
+ if (error) {
+ pathname = ERR_PTR(error);
goto out;
+ }
+
+ page = alloc_page(GFP_HIGHUSER);
+ if (!page) {
+ pathname = ERR_PTR(-ENOMEM);
+ goto out_path_put;
+ }
+
+ pathname = kmap(page);
+ pathname = d_path(&path, pathname, PAGE_SIZE);
+ if (IS_ERR(pathname)) {
+ kunmap(page);
+ __free_page(page);
+ page = NULL;
+ }

- error = do_proc_readlink(&path, buffer, buflen);
+out_path_put:
path_put(&path);
out:
- return error;
+ nd_set_link(nd, pathname);
+ return page;
+}
+
+static void
+proc_pid_put_link(struct dentry *dentry, struct nameidata *nd, void *cookie)
+{
+ struct page *page = cookie;
+ if (page) {
+ kunmap(page);
+ __free_page(page);
+ }
}

static const struct inode_operations proc_pid_link_inode_operations = {
- .readlink = proc_pid_readlink,
+ .readlink = generic_readlink,
.follow_link = proc_pid_follow_link,
+ .put_link = proc_pid_put_link,
.setattr = proc_setattr,
};

--
1.5.5.6


2009-11-19 17:07:17

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] procfs: make /proc style symlinks behave like "normal" symlinks


Nacked-by: "Eric W. Biederman" <[email protected]>

This is broken. If the referenced file is in a different mount namespace
the path returned could point to a completely different path in your
own mount namespace. Even in your own mount namespace this makes the
proc symlinks racy and not guaranteed to return the file of interest.

I don't see any hope of this approach ever working.

Eric


Jeff Layton <[email protected]> writes:

> -static int proc_pid_readlink(struct dentry * dentry, char __user * buffer, int buflen)
> -{
> - int error = -EACCES;
> - struct inode *inode = dentry->d_inode;
> + struct page *page = NULL;
> struct path path;
> + int error;
>
> /* Are we allowed to snoop on the tasks file descriptors? */
> - if (!proc_fd_access_allowed(inode))
> + if (!proc_fd_access_allowed(inode)) {
> + pathname = ERR_PTR(-EACCES);
> goto out;
> + }
>
> error = PROC_I(inode)->op.proc_get_link(inode, &path);
> - if (error)
> + if (error) {
> + pathname = ERR_PTR(error);
> goto out;
> + }
> +
> + page = alloc_page(GFP_HIGHUSER);
> + if (!page) {
> + pathname = ERR_PTR(-ENOMEM);
> + goto out_path_put;
> + }
> +
> + pathname = kmap(page);
> + pathname = d_path(&path, pathname, PAGE_SIZE);

This is just nonsense.


> + if (IS_ERR(pathname)) {
> + kunmap(page);
> + __free_page(page);
> + page = NULL;
> + }
>
> - error = do_proc_readlink(&path, buffer, buflen);
> +out_path_put:
> path_put(&path);
> out:
> - return error;
> + nd_set_link(nd, pathname);
> + return page;
> +}

2009-11-19 18:28:34

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] procfs: make /proc style symlinks behave like "normal" symlinks

On Thu, 19 Nov 2009 09:07:16 -0800
[email protected] (Eric W. Biederman) wrote:

>
> Nacked-by: "Eric W. Biederman" <[email protected]>
>
> This is broken. If the referenced file is in a different mount namespace
> the path returned could point to a completely different path in your
> own mount namespace. Even in your own mount namespace this makes the
> proc symlinks racy and not guaranteed to return the file of interest.
>
> I don't see any hope of this approach ever working.
>
> Eric
>

Then is proc_pid_readlink broken in the same way?

>
> Jeff Layton <[email protected]> writes:
>
> > -static int proc_pid_readlink(struct dentry * dentry, char __user * buffer, int buflen)
> > -{
> > - int error = -EACCES;
> > - struct inode *inode = dentry->d_inode;
> > + struct page *page = NULL;
> > struct path path;
> > + int error;
> >
> > /* Are we allowed to snoop on the tasks file descriptors? */
> > - if (!proc_fd_access_allowed(inode))
> > + if (!proc_fd_access_allowed(inode)) {
> > + pathname = ERR_PTR(-EACCES);
> > goto out;
> > + }
> >
> > error = PROC_I(inode)->op.proc_get_link(inode, &path);
> > - if (error)
> > + if (error) {
> > + pathname = ERR_PTR(error);
> > goto out;
> > + }
> > +
> > + page = alloc_page(GFP_HIGHUSER);
> > + if (!page) {
> > + pathname = ERR_PTR(-ENOMEM);
> > + goto out_path_put;
> > + }
> > +
> > + pathname = kmap(page);
> > + pathname = d_path(&path, pathname, PAGE_SIZE);
>
> This is just nonsense.
>

How so? Care to elaborate?

>
> > + if (IS_ERR(pathname)) {
> > + kunmap(page);
> > + __free_page(page);
> > + page = NULL;
> > + }
> >
> > - error = do_proc_readlink(&path, buffer, buflen);
> > +out_path_put:
> > path_put(&path);
> > out:
> > - return error;
> > + nd_set_link(nd, pathname);
> > + return page;
> > +}


--
Jeff Layton <[email protected]>

2009-11-19 18:57:08

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] procfs: make /proc style symlinks behave like "normal" symlinks

Jeff Layton <[email protected]> writes:

> On Thu, 19 Nov 2009 09:07:16 -0800
> [email protected] (Eric W. Biederman) wrote:
>
>>
>> Nacked-by: "Eric W. Biederman" <[email protected]>
>>
>> This is broken. If the referenced file is in a different mount namespace
>> the path returned could point to a completely different path in your
>> own mount namespace. Even in your own mount namespace this makes the
>> proc symlinks racy and not guaranteed to return the file of interest.
>>
>> I don't see any hope of this approach ever working.
>>
>> Eric
>>
>
> Then is proc_pid_readlink broken in the same way?

proc_pid_readlink has the same deficiencies. The race is fundamental
to all readlink operations, the difference is that for normal symlinks
it is a don't care, and for proc it is incorrect behavior if you follow
the symlink to the wrong file. If you are dealing with a file in a
different namespace or a socket what you get back doesn't actually
work as a file in your local namespace but that is the best we can do
with a pathname, and if you know the context of what is going on readlink
is still useful.

Adding all of the short comings to followlink that readlink has is a problem,
especially as followlink does much better now.

At a practical level I think your changes are much easier to exploit than
Pavels contrived example.

I really don't have any problems with your first patch to proc to add the
missing revalidate.

Eric

2009-11-19 19:35:56

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] procfs: make /proc style symlinks behave like "normal" symlinks

On Thu, 19 Nov 2009 10:57:08 -0800
[email protected] (Eric W. Biederman) wrote:

> Jeff Layton <[email protected]> writes:
>
> > On Thu, 19 Nov 2009 09:07:16 -0800
> > [email protected] (Eric W. Biederman) wrote:
> >
> >>
> >> Nacked-by: "Eric W. Biederman" <[email protected]>
> >>
> >> This is broken. If the referenced file is in a different mount namespace
> >> the path returned could point to a completely different path in your
> >> own mount namespace. Even in your own mount namespace this makes the
> >> proc symlinks racy and not guaranteed to return the file of interest.
> >>
> >> I don't see any hope of this approach ever working.
> >>
> >> Eric
> >>
> >
> > Then is proc_pid_readlink broken in the same way?
>
> proc_pid_readlink has the same deficiencies. The race is fundamental
> to all readlink operations, the difference is that for normal symlinks
> it is a don't care, and for proc it is incorrect behavior if you follow
> the symlink to the wrong file. If you are dealing with a file in a
> different namespace or a socket what you get back doesn't actually
> work as a file in your local namespace but that is the best we can do
> with a pathname, and if you know the context of what is going on readlink
> is still useful.
>
> Adding all of the short comings to followlink that readlink has is a problem,
> especially as followlink does much better now.
>
> At a practical level I think your changes are much easier to exploit than
> Pavels contrived example.
>
> I really don't have any problems with your first patch to proc to add the
> missing revalidate.
>

Thanks, that makes sense. The raciness was evident once you pointed it
out, so I think you're correct that we can't take this approach.

Adding the missing revalidations is fine, but I don't believe that
helps to fix Pavel's issue. I'll go back and take a more careful look
at the suggestion that Miklos made and see whether it makes sense to
implement a new FS_* flag for this, and see what it'll take to fix
Pavel's issue.

--
Jeff Layton <[email protected]>

2009-11-19 21:31:20

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] procfs: make /proc style symlinks behave like "normal" symlinks

Jeff Layton <[email protected]> writes:

> Thanks, that makes sense. The raciness was evident once you pointed it
> out, so I think you're correct that we can't take this approach.
>
> Adding the missing revalidations is fine, but I don't believe that
> helps to fix Pavel's issue. I'll go back and take a more careful look
> at the suggestion that Miklos made and see whether it makes sense to
> implement a new FS_* flag for this, and see what it'll take to fix
> Pavel's issue.

Fundamentally today the proc/fd links are special kind of hard link,
the same kind as bind mounts and have the same issues crop up. Given
that definition there is absolutely nothing wrong their behavior.

The fact that we create the proc/fd implicitly is a little bit
surprising and the source of the concerns. So far the cost of
changing the semantics and breaking potentially valid user space
applications appears to outweigh the potential security benefit.


I can't find a reason for the missing revalidate. It is easy to get
into a state where the destination of of a proc/fd/N symlink or
a file bind mount has been deleted, and everything continues to
work. You can even open d_dropped file descriptors.

If you want confirmation just do:

$ mkdir toy
$ cd toy
$ echo 123 > f
$ touch g
$ sudo mount --bind ./f ./g
$ sleep 3000 < g &
$ child=$!
$ ls -l /proc/$child/fd/0
lr-x------ 1 eric eric 64 2009-11-19 13:27 /proc/28797/fd/0 -> /tmp/toy/g
$ rm f
$ ls -l /proc/$child/fd/0
lr-x------ 1 eric eric 64 2009-11-19 13:27 /proc/28797/fd/0 -> /tmp/toy/g (deleted)
$ cat /proc/$child/fd/0
123
$ kill $child
$ sudo umount ./g
$ cat g
$ rm g

What we can't do right now is revalidate the source of a mount as
that will make it impossible to find the vfsmount to unmount it.

Eric

2009-11-19 21:39:40

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH] procfs: make /proc style symlinks behave like "normal" symlinks

> > Adding all of the short comings to followlink that readlink has is a problem,
> > especially as followlink does much better now.
> >
> > At a practical level I think your changes are much easier to exploit than
> > Pavels contrived example.
> >
> > I really don't have any problems with your first patch to proc to add the
> > missing revalidate.
> >
>
> Thanks, that makes sense. The raciness was evident once you pointed it
> out, so I think you're correct that we can't take this approach.
>
> Adding the missing revalidations is fine, but I don't believe that
> helps to fix Pavel's issue. I'll go back and take a more careful look
> at the suggestion that Miklos made and see whether it makes sense to
> implement a new FS_* flag for this, and see what it'll take to fix
> Pavel's issue.

One posibility would be to make open(/proc/XX/fd/XX) behave like
dup(). That should solve the NFS problems, too, no?
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2009-11-19 21:56:43

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] procfs: make /proc style symlinks behave like "normal" symlinks

Pavel Machek <[email protected]> writes:

>> > Adding all of the short comings to followlink that readlink has is a problem,
>> > especially as followlink does much better now.
>> >
>> > At a practical level I think your changes are much easier to exploit than
>> > Pavels contrived example.
>> >
>> > I really don't have any problems with your first patch to proc to add the
>> > missing revalidate.
>> >
>>
>> Thanks, that makes sense. The raciness was evident once you pointed it
>> out, so I think you're correct that we can't take this approach.
>>
>> Adding the missing revalidations is fine, but I don't believe that
>> helps to fix Pavel's issue. I'll go back and take a more careful look
>> at the suggestion that Miklos made and see whether it makes sense to
>> implement a new FS_* flag for this, and see what it'll take to fix
>> Pavel's issue.
>
> One posibility would be to make open(/proc/XX/fd/XX) behave like
> dup(). That should solve the NFS problems, too, no?

Not for bind mounts, and the revalidate in follow_mounts is a bug regardless.

Eric

2009-11-19 22:30:20

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH] procfs: make /proc style symlinks behave like "normal" symlinks

Hi!

(Sorry, missed Cc's first time).

> >> Adding the missing revalidations is fine, but I don't believe that
> >> helps to fix Pavel's issue. I'll go back and take a more careful look
> >> at the suggestion that Miklos made and see whether it makes sense to
> >> implement a new FS_* flag for this, and see what it'll take to fix
> >> Pavel's issue.
> >
> > One posibility would be to make open(/proc/XX/fd/XX) behave like
> > dup(). That should solve the NFS problems, too, no?
>
> Not for bind mounts, and the revalidate in follow_mounts is a bug regardless.

Really? So dup() is also broken on nfs? How do bind mounts affect
dup()?
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2009-11-20 09:31:49

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [PATCH] procfs: make /proc style symlinks behave like "normal" symlinks

On Thu, 19 Nov 2009, Pavel Machek wrote:
> > Adding the missing revalidations is fine, but I don't believe that
> > helps to fix Pavel's issue. I'll go back and take a more careful look
> > at the suggestion that Miklos made and see whether it makes sense to
> > implement a new FS_* flag for this, and see what it'll take to fix
> > Pavel's issue.

What is this "Pavel's issue"?

> One posibility would be to make open(/proc/XX/fd/XX) behave like
> dup(). That should solve the NFS problems, too, no?

You mean, reuse the old "struct file" for the new open? Or what?

Thanks,
Miklos

2009-11-20 09:51:32

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH] procfs: make /proc style symlinks behave like "normal" symlinks

On Fri 2009-11-20 10:31:40, Miklos Szeredi wrote:
> On Thu, 19 Nov 2009, Pavel Machek wrote:
> > > Adding the missing revalidations is fine, but I don't believe that
> > > helps to fix Pavel's issue. I'll go back and take a more careful look
> > > at the suggestion that Miklos made and see whether it makes sense to
> > > implement a new FS_* flag for this, and see what it'll take to fix
> > > Pavel's issue.
>
> What is this "Pavel's issue"?

The one we were talking about, see bugtraq.

> > One posibility would be to make open(/proc/XX/fd/XX) behave like
> > dup(). That should solve the NFS problems, too, no?
>
> You mean, reuse the old "struct file" for the new open? Or what?

Aha, dup() results in file position being shared. You are right,
that's not going to fly :-(.
Pavel

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html