2009-11-06 13:19:51

by Jeff Layton

[permalink] [raw]
Subject: [PATCH] proc: revalidate dentry returned by proc_pid_follow_link

The procfs follow_link operation doesn't provide an actual pathname
string. Instead, it returns a struct path in the nameidata and sets the
last_type field in the nameidata to LAST_BIND. This makes the open path
walking routine (do_filp_open) skip doing a lookup against the path
string and has it just trust the info in the struct path.

The problem here is that this makes that code shortcut any lookup or
revalidation of the dentry. In general, this isn't a problem -- in most
cases the dentry is known to be good. It is a problem however for NFSv4.
If this symlink is followed on an open operation no actual open call
occurs and the open state isn't properly established. This causes
problems when we later try to use this file descriptor for actual
operations.

This patch takes a minimalist approach to fixing this by making the
/proc/pid follow_link routine revalidate the dentry before returning it.
It seems to fix the reproducer I have for this, but I wonder whether it
might be better to do this at a higher level, maybe in __do_follow_link
in case there are other filesystems in the future that return a
last_type of LAST_BIND. Also, should this dentry be d_invalidated if
it d_revalidate returns 0?

Comments welcome.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/proc/base.c | 14 +++++++++++++-
1 files changed, 13 insertions(+), 1 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 837469a..1bd994c 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1343,7 +1343,7 @@ static int proc_exe_link(struct inode *inode, struct path *exe_path)
static void *proc_pid_follow_link(struct dentry *dentry, struct nameidata *nd)
{
struct inode *inode = dentry->d_inode;
- int error = -EACCES;
+ int valid = 1, error = -EACCES;

/* We don't need a base pointer in the /proc filesystem */
path_put(&nd->path);
@@ -1354,6 +1354,18 @@ static void *proc_pid_follow_link(struct dentry *dentry, struct nameidata *nd)

error = PROC_I(inode)->op.proc_get_link(inode, &nd->path);
nd->last_type = LAST_BIND;
+
+ if (error)
+ goto out;
+
+ dentry = nd->path.dentry;
+ if (dentry->d_op && dentry->d_op->d_revalidate)
+ valid = dentry->d_op->d_revalidate(dentry, nd);
+
+ if (valid <= 0) {
+ path_put(&nd->path);
+ error = -ESTALE;
+ }
out:
return ERR_PTR(error);
}
--
1.5.5.6



2009-11-06 20:36:01

by Jamie Lokier

[permalink] [raw]
Subject: Re: [PATCH] proc: revalidate dentry returned by proc_pid_follow_link

Jeff Layton wrote:
> The problem here is that this makes that code shortcut any lookup or
> revalidation of the dentry. In general, this isn't a problem -- in most
> cases the dentry is known to be good. It is a problem however for NFSv4.
> If this symlink is followed on an open operation no actual open call
> occurs and the open state isn't properly established. This causes
> problems when we later try to use this file descriptor for actual
> operations.

As NFS uses open() as a kind of fcntl-lock barrier, I can see it's
important to do _something_ on new opens, rather than just cloning
most of the file descriptor.

> This patch takes a minimalist approach to fixing this by making the
> /proc/pid follow_link routine revalidate the dentry before returning it.

What happens if the file descriptor you are re-opening is for a file
which has been deleted. Does it still have a revalidatable dentry?

-- Jamie

2009-11-06 21:06:18

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] proc: revalidate dentry returned by proc_pid_follow_link

On Fri, 6 Nov 2009 20:36:01 +0000
Jamie Lokier <[email protected]> wrote:

> Jeff Layton wrote:
> > The problem here is that this makes that code shortcut any lookup or
> > revalidation of the dentry. In general, this isn't a problem -- in most
> > cases the dentry is known to be good. It is a problem however for NFSv4.
> > If this symlink is followed on an open operation no actual open call
> > occurs and the open state isn't properly established. This causes
> > problems when we later try to use this file descriptor for actual
> > operations.
>
> As NFS uses open() as a kind of fcntl-lock barrier, I can see it's
> important to do _something_ on new opens, rather than just cloning
> most of the file descriptor.
>

I guess you mean the close-to-open cache consistency? If so, this
problem doesn't actually break that. The actual nfs_file_open call does
occur even when you're opening by following one of these symlinks. I
believe the cache consistency code occurs there.

The problem here is really nfsv4 specific. There the on-the-wire open
call and initialization of state actually happens during d_lookup and
d_revalidate. Neither of these happens with these LAST_BIND symlinks so
we end up with a filp that has no NFSv4 state attached.

> > This patch takes a minimalist approach to fixing this by making the
> > /proc/pid follow_link routine revalidate the dentry before returning it.
>
> What happens if the file descriptor you are re-opening is for a file
> which has been deleted. Does it still have a revalidatable dentry?
>

Well, these LAST_BIND symlinks return a real dget'ed dentry today. If
we assume that it always returns a valid dentry (which seems to be the
case), then I suppose it's OK to do a d_revalidate against it.

It's possible though that that revalidate will either fail though or
return that it's no good. In that case, this code just returns ESTALE
which should make the path walking code revalidate all the way up the
chain. That should (hopefully) make whatever syscall we're servicing
return an error.

Thanks for the review so far.

--
Jeff Layton <[email protected]>

2009-11-08 10:15:57

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] proc: revalidate dentry returned by proc_pid_follow_link

Jeff Layton <[email protected]> writes:

> On Fri, 6 Nov 2009 20:36:01 +0000
> Jamie Lokier <[email protected]> wrote:
>
>> Jeff Layton wrote:
>> > The problem here is that this makes that code shortcut any lookup or
>> > revalidation of the dentry. In general, this isn't a problem -- in most
>> > cases the dentry is known to be good. It is a problem however for NFSv4.
>> > If this symlink is followed on an open operation no actual open call
>> > occurs and the open state isn't properly established. This causes
>> > problems when we later try to use this file descriptor for actual
>> > operations.
>>
>> As NFS uses open() as a kind of fcntl-lock barrier, I can see it's
>> important to do _something_ on new opens, rather than just cloning
>> most of the file descriptor.
>>
>
> I guess you mean the close-to-open cache consistency? If so, this
> problem doesn't actually break that. The actual nfs_file_open call does
> occur even when you're opening by following one of these symlinks. I
> believe the cache consistency code occurs there.
>
> The problem here is really nfsv4 specific. There the on-the-wire open
> call and initialization of state actually happens during d_lookup and
> d_revalidate. Neither of these happens with these LAST_BIND symlinks so
> we end up with a filp that has no NFSv4 state attached.
>
>> > This patch takes a minimalist approach to fixing this by making the
>> > /proc/pid follow_link routine revalidate the dentry before returning it.
>>
>> What happens if the file descriptor you are re-opening is for a file
>> which has been deleted. Does it still have a revalidatable dentry?
>>
>
> Well, these LAST_BIND symlinks return a real dget'ed dentry today. If
> we assume that it always returns a valid dentry (which seems to be the
> case), then I suppose it's OK to do a d_revalidate against it.
>
> It's possible though that that revalidate will either fail though or
> return that it's no good. In that case, this code just returns ESTALE
> which should make the path walking code revalidate all the way up the
> chain. That should (hopefully) make whatever syscall we're servicing
> return an error.

Hmm. Looking at the code I get the impression that a file bind mount
will have exactly the same problem.

Can you confirm.

If file bind mounts also have this problem a bugfix to to just
proc seems questionable.

Eric

2009-11-09 01:12:34

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] proc: revalidate dentry returned by proc_pid_follow_link

On Sun, 08 Nov 2009 02:15:57 -0800
[email protected] (Eric W. Biederman) wrote:

> Jeff Layton <[email protected]> writes:
>
> > On Fri, 6 Nov 2009 20:36:01 +0000
> > Jamie Lokier <[email protected]> wrote:
> >
> >> Jeff Layton wrote:
> >> > The problem here is that this makes that code shortcut any lookup or
> >> > revalidation of the dentry. In general, this isn't a problem -- in most
> >> > cases the dentry is known to be good. It is a problem however for NFSv4.
> >> > If this symlink is followed on an open operation no actual open call
> >> > occurs and the open state isn't properly established. This causes
> >> > problems when we later try to use this file descriptor for actual
> >> > operations.
> >>
> >> As NFS uses open() as a kind of fcntl-lock barrier, I can see it's
> >> important to do _something_ on new opens, rather than just cloning
> >> most of the file descriptor.
> >>
> >
> > I guess you mean the close-to-open cache consistency? If so, this
> > problem doesn't actually break that. The actual nfs_file_open call does
> > occur even when you're opening by following one of these symlinks. I
> > believe the cache consistency code occurs there.
> >
> > The problem here is really nfsv4 specific. There the on-the-wire open
> > call and initialization of state actually happens during d_lookup and
> > d_revalidate. Neither of these happens with these LAST_BIND symlinks so
> > we end up with a filp that has no NFSv4 state attached.
> >
> >> > This patch takes a minimalist approach to fixing this by making the
> >> > /proc/pid follow_link routine revalidate the dentry before returning it.
> >>
> >> What happens if the file descriptor you are re-opening is for a file
> >> which has been deleted. Does it still have a revalidatable dentry?
> >>
> >
> > Well, these LAST_BIND symlinks return a real dget'ed dentry today. If
> > we assume that it always returns a valid dentry (which seems to be the
> > case), then I suppose it's OK to do a d_revalidate against it.
> >
> > It's possible though that that revalidate will either fail though or
> > return that it's no good. In that case, this code just returns ESTALE
> > which should make the path walking code revalidate all the way up the
> > chain. That should (hopefully) make whatever syscall we're servicing
> > return an error.
>
> Hmm. Looking at the code I get the impression that a file bind mount
> will have exactly the same problem.
>
> Can you confirm.
>
> If file bind mounts also have this problem a bugfix to to just
> proc seems questionable.
>

I'm not sure I understand what you mean by "file bind mount". Is that
something like mounting with "-o loop" ?

I'm not at all opposed to fixing this in a more broad fashion, but as
best I can tell, the only place that LAST_BIND is used is in procfs.

--
Jeff Layton <[email protected]>

2009-11-09 03:30:55

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] proc: revalidate dentry returned by proc_pid_follow_link

Jeff Layton <[email protected]> writes:

>> Hmm. Looking at the code I get the impression that a file bind mount
>> will have exactly the same problem.
>>
>> Can you confirm.
>>
>> If file bind mounts also have this problem a bugfix to to just
>> proc seems questionable.
>>
>
> I'm not sure I understand what you mean by "file bind mount". Is that
> something like mounting with "-o loop" ?

# cd /tmp
# echo foo > foo
# echo bar > bar
# mount --bind foo bar
# cat bar
foo
#

> I'm not at all opposed to fixing this in a more broad fashion, but as
> best I can tell, the only place that LAST_BIND is used is in procfs.

proc does appear to be the only user of LAST_BIND. With a file bind
mount we can get to the same ok: label without a revalidate. The
difference is that we came from __follow_mount instead of follow_link.

At least that is how I read the code.

Eric

2009-11-09 13:11:32

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] proc: revalidate dentry returned by proc_pid_follow_link

On Sun, 08 Nov 2009 19:30:55 -0800
[email protected] (Eric W. Biederman) wrote:

> Jeff Layton <[email protected]> writes:
>
> >> Hmm. Looking at the code I get the impression that a file bind mount
> >> will have exactly the same problem.
> >>
> >> Can you confirm.
> >>
> >> If file bind mounts also have this problem a bugfix to to just
> >> proc seems questionable.
> >>
> >
> > I'm not sure I understand what you mean by "file bind mount". Is that
> > something like mounting with "-o loop" ?
>
> # cd /tmp
> # echo foo > foo
> # echo bar > bar
> # mount --bind foo bar
> # cat bar
> foo
> #
>
> > I'm not at all opposed to fixing this in a more broad fashion, but as
> > best I can tell, the only place that LAST_BIND is used is in procfs.
>
> proc does appear to be the only user of LAST_BIND. With a file bind
> mount we can get to the same ok: label without a revalidate. The
> difference is that we came from __follow_mount instead of follow_link.
>
> At least that is how I read the code.
>

Thanks.

Yes, you're correct. That scenario does seem to have a similar problem.
I'm not quite sure yet how to fix it there yet.

It's easy enough to force a d_revalidate at a higher level. The tricky
bit is what to do when that returns 0 or an error. When I spoke to Al
about this, he suggested returning -ESTALE from the follow_link routine
if that occurs. That would force LOOKUP_REVAL to get set and cause the
whole chain of dentries to be revalidated back up to the root (if
necessary).

That's a little more difficult in the file bind mount case as I don't
see where it calls link_path_walk at all and hence can't really deal
with an ESTALE on a reval properly. I'll need to study this code some
more to see what the right fix is. If you (or anyone) has a suggestion,
I'd love to hear it.

--
Jeff Layton <[email protected]>