2014-07-14 05:14:16

by NeilBrown

[permalink] [raw]
Subject: [PATCH] NFS: nfs4_lookup_revalidate need to report STALE inodes.


If an 'open' of a file in an NFSv4 filesystem finds that the dentry is
in cache, but the inode is stale (on the server), the dentry will not
be re-validated immediately and may cause ESTALE to be returned to
user-space.

For a non-create 'open', do_last() calls lookup_fast() and on success
will eventually call may_open() which calls into nfs_permission().
If nfs_permission() makes the ACCESS call to the server it will get
NFS4ERR_STALE, resulting in ESTALE from may_open() and thence from
do_last().
The retry-on-ESTALE in filename_lookup() will repeat exactly the same
process because nothing in this path will invalidate the dentry due to
the inode being stale, so the ESTALE will be returned.

lookup_fast() calls ->d_revalidate(), but for an OPEN on an NFSv4
filesystem, that will succeed for regular files:
/* Let f_op->open() actually open (and revalidate) the file */

Unfortunately in the case of a STALE inode, f_op->open() never gets
called. If we teach nfs4_lookup_revalidate() to report a failure on
NFS_STALE() inodes, then the dentry will be invalidated and a full
lookup will be attempted. The ESTALE errors go away.


While I think this fix is correct, I'm not convinced that it is
sufficient, particularly if lookupcache=none.
The current code will fail an "open" is nfs_permission() fails,
without having performed a LOOKUP. i.e. it will use the cache.
nfs_lookup_revalidate will force a lookup before the permission check
if NFS_MOUNT_LOOKUP_CACHE_NONE, but nfs4_lookup_revalidate will not.


Signed-off-by: NeilBrown <[email protected]>

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 4a3d4ef76127..4f7414afca27 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1563,6 +1563,8 @@ static int nfs4_lookup_revalidate(struct dentry *dentry, unsigned int flags)
/* We cannot do exclusive creation on a positive dentry */
if (flags & LOOKUP_EXCL)
goto no_open_dput;
+ if (NFS_STALE(inode))
+ goto no_open_dput;

/* Let f_op->open() actually open (and revalidate) the file */
ret = 1;


Attachments:
signature.asc (828.00 B)

2014-07-14 13:00:31

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] NFS: nfs4_lookup_revalidate need to report STALE inodes.

On Mon, 14 Jul 2014 22:35:13 +1000
NeilBrown <[email protected]> wrote:

> On Mon, 14 Jul 2014 08:14:55 -0400 Jeff Layton <[email protected]>
> wrote:
>
> > On Mon, 14 Jul 2014 15:14:05 +1000
> > NeilBrown <[email protected]> wrote:
> >
> > >
> > > If an 'open' of a file in an NFSv4 filesystem finds that the dentry is
> > > in cache, but the inode is stale (on the server), the dentry will not
> > > be re-validated immediately and may cause ESTALE to be returned to
> > > user-space.
> > >
> > > For a non-create 'open', do_last() calls lookup_fast() and on success
> > > will eventually call may_open() which calls into nfs_permission().
> > > If nfs_permission() makes the ACCESS call to the server it will get
> > > NFS4ERR_STALE, resulting in ESTALE from may_open() and thence from
> > > do_last().
> > > The retry-on-ESTALE in filename_lookup() will repeat exactly the same
> > > process because nothing in this path will invalidate the dentry due to
> > > the inode being stale, so the ESTALE will be returned.
> > >
> > > lookup_fast() calls ->d_revalidate(), but for an OPEN on an NFSv4
> > > filesystem, that will succeed for regular files:
> > > /* Let f_op->open() actually open (and revalidate) the file */
> > >
> > > Unfortunately in the case of a STALE inode, f_op->open() never gets
> > > called. If we teach nfs4_lookup_revalidate() to report a failure on
> > > NFS_STALE() inodes, then the dentry will be invalidated and a full
> > > lookup will be attempted. The ESTALE errors go away.
> > >
> > >
> > > While I think this fix is correct, I'm not convinced that it is
> > > sufficient, particularly if lookupcache=none.
> > > The current code will fail an "open" is nfs_permission() fails,
> > > without having performed a LOOKUP. i.e. it will use the cache.
> > > nfs_lookup_revalidate will force a lookup before the permission check
> > > if NFS_MOUNT_LOOKUP_CACHE_NONE, but nfs4_lookup_revalidate will not.
> > >
> >
> > This patch should make the code fall through to nfs_lookup_revalidate,
> > which would then force the lookup, right?
>
> Yes ... though maybe that's not what I really want to do. I really wanted to
> just return '0', though I would need to check that is right in all cases.
>
> >
> > Also, I'm a little unclear...
> >
> > Why would may_open fail with ESTALE after the v4 OPEN succeeds? The
> > OPEN should be returning a filehandle and attributes for the inode
> > actually opened. It seems like we ought to be doing any permission
> > checks vs. that inode, not anything we had in cache. Presumably the
> > server is then holding it open so it shouldn't be stale.
>
> may_open is called *before* and v4 OPEN.
>
> In do_last, if the inode is already in cache, then
> lookup_fast is called, which calls d_revalidate
> then may_open (calls ->permission)
> then finish_open which calls f_op->open
>
> Yes, we should be doing permission checking against whatever 'open' finds.
> But the VFS is structured to the the permission check after d_revalidate and
> before ->open. So maybe d_revalidate needs to do the NFS open??
>

Ok, I see. Ugh, having the revalidate do the open sounds...messy.

A simpler fix might be to fix it so that an -ESTALE return from
may_open triggers a retry. Something like this maybe (probably
whitespace damaged, so just for discussion purposes):

diff --git a/fs/namei.c b/fs/namei.c
index 985c6f368485..c1657deea52c 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3045,8 +3045,13 @@ finish_open:
}
finish_open_created:
error = may_open(&nd->path, acc_mode, open_flag);
- if (error)
+ if (error) {
+ if (error == -ESTALE)
+ goto stale_open;
goto out;
+ }
file->f_path.mnt = nd->path.mnt;
error = finish_open(file, nd->path.dentry, NULL, opened);
if (error) {


...though might need to convert the ESTALE to EOPENSTALE there too, not
sure...

>
> >
> > Are we not properly updating the dcache (and attrcache) after the
> > OPEN reply?
>
> I think so, yes. But in the problem case, we don't even send an OPEN
> request.
>
>
> >
> > >
> > > Signed-off-by: NeilBrown <[email protected]>
> > >
> > > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> > > index 4a3d4ef76127..4f7414afca27 100644
> > > --- a/fs/nfs/dir.c
> > > +++ b/fs/nfs/dir.c
> > > @@ -1563,6 +1563,8 @@ static int nfs4_lookup_revalidate(struct
> > > dentry *dentry, unsigned int flags) /* We cannot do exclusive
> > > creation on a positive dentry */ if (flags & LOOKUP_EXCL)
> > > goto no_open_dput;
> > > + if (NFS_STALE(inode))
> > > + goto no_open_dput;
> > >
> > > /* Let f_op->open() actually open (and revalidate) the
> > > file */ ret = 1;
> >
> > Looks legit to me too, but it seems like the inode could go stale
> > w/o us knowing after this point.
> >
> > Acked-by: Jeff Layton <[email protected]>
>
> Thanks,
> NeilBrown


--
Jeff Layton <[email protected]>


Attachments:
signature.asc (819.00 B)

2014-07-14 12:35:26

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH] NFS: nfs4_lookup_revalidate need to report STALE inodes.

On Mon, 14 Jul 2014 08:14:55 -0400 Jeff Layton <[email protected]>
wrote:

> On Mon, 14 Jul 2014 15:14:05 +1000
> NeilBrown <[email protected]> wrote:
>
> >
> > If an 'open' of a file in an NFSv4 filesystem finds that the dentry is
> > in cache, but the inode is stale (on the server), the dentry will not
> > be re-validated immediately and may cause ESTALE to be returned to
> > user-space.
> >
> > For a non-create 'open', do_last() calls lookup_fast() and on success
> > will eventually call may_open() which calls into nfs_permission().
> > If nfs_permission() makes the ACCESS call to the server it will get
> > NFS4ERR_STALE, resulting in ESTALE from may_open() and thence from
> > do_last().
> > The retry-on-ESTALE in filename_lookup() will repeat exactly the same
> > process because nothing in this path will invalidate the dentry due to
> > the inode being stale, so the ESTALE will be returned.
> >
> > lookup_fast() calls ->d_revalidate(), but for an OPEN on an NFSv4
> > filesystem, that will succeed for regular files:
> > /* Let f_op->open() actually open (and revalidate) the file */
> >
> > Unfortunately in the case of a STALE inode, f_op->open() never gets
> > called. If we teach nfs4_lookup_revalidate() to report a failure on
> > NFS_STALE() inodes, then the dentry will be invalidated and a full
> > lookup will be attempted. The ESTALE errors go away.
> >
> >
> > While I think this fix is correct, I'm not convinced that it is
> > sufficient, particularly if lookupcache=none.
> > The current code will fail an "open" is nfs_permission() fails,
> > without having performed a LOOKUP. i.e. it will use the cache.
> > nfs_lookup_revalidate will force a lookup before the permission check
> > if NFS_MOUNT_LOOKUP_CACHE_NONE, but nfs4_lookup_revalidate will not.
> >
>
> This patch should make the code fall through to nfs_lookup_revalidate,
> which would then force the lookup, right?

Yes ... though maybe that's not what I really want to do. I really wanted to
just return '0', though I would need to check that is right in all cases.

>
> Also, I'm a little unclear...
>
> Why would may_open fail with ESTALE after the v4 OPEN succeeds? The
> OPEN should be returning a filehandle and attributes for the inode
> actually opened. It seems like we ought to be doing any permission
> checks vs. that inode, not anything we had in cache. Presumably the
> server is then holding it open so it shouldn't be stale.

may_open is called *before* and v4 OPEN.

In do_last, if the inode is already in cache, then
lookup_fast is called, which calls d_revalidate
then may_open (calls ->permission)
then finish_open which calls f_op->open

Yes, we should be doing permission checking against whatever 'open' finds.
But the VFS is structured to the the permission check after d_revalidate and
before ->open. So maybe d_revalidate needs to do the NFS open??


>
> Are we not properly updating the dcache (and attrcache) after the OPEN
> reply?

I think so, yes. But in the problem case, we don't even send an OPEN request.


>
> >
> > Signed-off-by: NeilBrown <[email protected]>
> >
> > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> > index 4a3d4ef76127..4f7414afca27 100644
> > --- a/fs/nfs/dir.c
> > +++ b/fs/nfs/dir.c
> > @@ -1563,6 +1563,8 @@ static int nfs4_lookup_revalidate(struct dentry
> > *dentry, unsigned int flags) /* We cannot do exclusive creation on a
> > positive dentry */ if (flags & LOOKUP_EXCL)
> > goto no_open_dput;
> > + if (NFS_STALE(inode))
> > + goto no_open_dput;
> >
> > /* Let f_op->open() actually open (and revalidate) the file
> > */ ret = 1;
>
> Looks legit to me too, but it seems like the inode could go stale w/o
> us knowing after this point.
>
> Acked-by: Jeff Layton <[email protected]>

Thanks,
NeilBrown


Attachments:
signature.asc (828.00 B)

2014-07-14 23:47:40

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] NFS: nfs4_lookup_revalidate need to report STALE inodes.

On Tue, 15 Jul 2014 08:57:27 +1000
NeilBrown <[email protected]> wrote:

> On Mon, 14 Jul 2014 09:00:28 -0400 Jeff Layton <[email protected]>
> wrote:
>
> > On Mon, 14 Jul 2014 22:35:13 +1000
> > NeilBrown <[email protected]> wrote:
> >
> > > On Mon, 14 Jul 2014 08:14:55 -0400 Jeff Layton <[email protected]>
> > > wrote:
> > >
> > > > On Mon, 14 Jul 2014 15:14:05 +1000
> > > > NeilBrown <[email protected]> wrote:
> > > >
> > > > >
> > > > > If an 'open' of a file in an NFSv4 filesystem finds that the dentry is
> > > > > in cache, but the inode is stale (on the server), the dentry will not
> > > > > be re-validated immediately and may cause ESTALE to be returned to
> > > > > user-space.
> > > > >
> > > > > For a non-create 'open', do_last() calls lookup_fast() and on success
> > > > > will eventually call may_open() which calls into nfs_permission().
> > > > > If nfs_permission() makes the ACCESS call to the server it will get
> > > > > NFS4ERR_STALE, resulting in ESTALE from may_open() and thence from
> > > > > do_last().
> > > > > The retry-on-ESTALE in filename_lookup() will repeat exactly the same
> > > > > process because nothing in this path will invalidate the dentry due to
> > > > > the inode being stale, so the ESTALE will be returned.
> > > > >
> > > > > lookup_fast() calls ->d_revalidate(), but for an OPEN on an NFSv4
> > > > > filesystem, that will succeed for regular files:
> > > > > /* Let f_op->open() actually open (and revalidate) the file */
> > > > >
> > > > > Unfortunately in the case of a STALE inode, f_op->open() never gets
> > > > > called. If we teach nfs4_lookup_revalidate() to report a failure on
> > > > > NFS_STALE() inodes, then the dentry will be invalidated and a full
> > > > > lookup will be attempted. The ESTALE errors go away.
> > > > >
> > > > >
> > > > > While I think this fix is correct, I'm not convinced that it is
> > > > > sufficient, particularly if lookupcache=none.
> > > > > The current code will fail an "open" is nfs_permission() fails,
> > > > > without having performed a LOOKUP. i.e. it will use the cache.
> > > > > nfs_lookup_revalidate will force a lookup before the permission check
> > > > > if NFS_MOUNT_LOOKUP_CACHE_NONE, but nfs4_lookup_revalidate will not.
> > > > >
> > > >
> > > > This patch should make the code fall through to nfs_lookup_revalidate,
> > > > which would then force the lookup, right?
> > >
> > > Yes ... though maybe that's not what I really want to do. I really wanted to
> > > just return '0', though I would need to check that is right in all cases.
> > >
> > > >
> > > > Also, I'm a little unclear...
> > > >
> > > > Why would may_open fail with ESTALE after the v4 OPEN succeeds? The
> > > > OPEN should be returning a filehandle and attributes for the inode
> > > > actually opened. It seems like we ought to be doing any permission
> > > > checks vs. that inode, not anything we had in cache. Presumably the
> > > > server is then holding it open so it shouldn't be stale.
> > >
> > > may_open is called *before* and v4 OPEN.
> > >
> > > In do_last, if the inode is already in cache, then
> > > lookup_fast is called, which calls d_revalidate
> > > then may_open (calls ->permission)
> > > then finish_open which calls f_op->open
> > >
> > > Yes, we should be doing permission checking against whatever 'open' finds.
> > > But the VFS is structured to the the permission check after d_revalidate and
> > > before ->open. So maybe d_revalidate needs to do the NFS open??
> > >
> >
> > Ok, I see. Ugh, having the revalidate do the open sounds...messy.
>
> Having the VFS call into the file system in dribs and drabs, rather than just
> asking the filesystem to "open" and letting it call back to VFS libraries
> for name lookup etc it what is really messy (IMO).
>
> So yes - definite mess. Not entirely sure where the mess is.
>

Yeah, that might have been cleaner overall. I'm not sure how we can get
there from where the code is today though...

> >
> > A simpler fix might be to fix it so that an -ESTALE return from
> > may_open triggers a retry. Something like this maybe (probably
> > whitespace damaged, so just for discussion purposes):
>
> Nice idea but doesn't work.
> We get back to retry_lookup and call lookup_open().
> lookup_dcache calls d_revalidate which reports that everything is fine, so it
> tells lookup_open which jumps to out_no_open and does nothing useful.
> So we end up in may_open() again which returns ESTALE again but now we've
> used up all our extra lives...
>

Ahh right, so you'd probably need to pair that with the patch you
already have. Regardless, it seems like getting back an ESTALE from
may_open should trigger a retry rather than just erroring out.

>
> One thing I noticed while exploring this is that do_last calls "may_open"
> *before* finish_open() while atomic_open() calls "may_open" *after*
> finish_open() (which it calls by virtual of the fact that all ->atomic_open
> methods call finish_open()).
>
> I was very tempted to just move the 'may_open' call in 'do_last' to after the
> 'finish_open' call. That fixed the problem, but I'm not sure it is "right".
>
> I think the real core messiness here is that permission checking should be
> neither before nor after finish_open, but should be an integral part of
> finish_open with the filesystem doing the permission check in f_op->open().
>
> I'm currently thinking this is the best patch for now:
>
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index 4f7414afca27..5c40cfd3ae29 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -1563,9 +1563,10 @@ static int nfs4_lookup_revalidate(struct dentry *dentry, unsigned int flags)
> /* We cannot do exclusive creation on a positive dentry */
> if (flags & LOOKUP_EXCL)
> goto no_open_dput;
>
> - /* Let f_op->open() actually open (and revalidate) the file */
> - ret = 1;
> + if (!NFS_STALE(inode))
> + /* Let f_op->open() actually open (and revalidate) the file */
> + ret = 1;
>
> out:
> dput(parent);
>
>
> Thanks,
> NeilBrown
>

That looks fine too, but I think you probably will also want to pair it
with making may_open retry the open on an ESTALE return.

The problem with the above check alone is that it's only going to fire
if you previously found the inode to be stale. It may be stale on the
server, but the client doesn't realize it yet, or could go stale after
this check and before the ACCESS call. In that case, you'll still end
up getting back an ESTALE once you hit may_open (unless I'm missing
something) and that won't trigger a reattempt either.

>
> >
> > diff --git a/fs/namei.c b/fs/namei.c
> > index 985c6f368485..c1657deea52c 100644
> > --- a/fs/namei.c
> > +++ b/fs/namei.c
> > @@ -3045,8 +3045,13 @@ finish_open:
> > }
> > finish_open_created:
> > error = may_open(&nd->path, acc_mode, open_flag);
> > - if (error)
> > + if (error) {
> > + if (error == -ESTALE)
> > + goto stale_open;
> > goto out;
> > + }
> > file->f_path.mnt = nd->path.mnt;
> > error = finish_open(file, nd->path.dentry, NULL, opened);
> > if (error) {
> >
> >
> > ...though might need to convert the ESTALE to EOPENSTALE there too, not
> > sure...
> >
> > >
> > > >
> > > > Are we not properly updating the dcache (and attrcache) after the
> > > > OPEN reply?
> > >
> > > I think so, yes. But in the problem case, we don't even send an OPEN
> > > request.
> > >
> > >
> > > >
> > > > >
> > > > > Signed-off-by: NeilBrown <[email protected]>
> > > > >
> > > > > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> > > > > index 4a3d4ef76127..4f7414afca27 100644
> > > > > --- a/fs/nfs/dir.c
> > > > > +++ b/fs/nfs/dir.c
> > > > > @@ -1563,6 +1563,8 @@ static int nfs4_lookup_revalidate(struct
> > > > > dentry *dentry, unsigned int flags) /* We cannot do exclusive
> > > > > creation on a positive dentry */ if (flags & LOOKUP_EXCL)
> > > > > goto no_open_dput;
> > > > > + if (NFS_STALE(inode))
> > > > > + goto no_open_dput;
> > > > >
> > > > > /* Let f_op->open() actually open (and revalidate) the
> > > > > file */ ret = 1;
> > > >
> > > > Looks legit to me too, but it seems like the inode could go stale
> > > > w/o us knowing after this point.
> > > >
> > > > Acked-by: Jeff Layton <[email protected]>
> > >
> > > Thanks,
> > > NeilBrown
> >
> >
>


--
Jeff Layton <[email protected]>


Attachments:
signature.asc (819.00 B)

2014-07-17 12:52:01

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [PATCH] NFS: nfs4_lookup_revalidate need to report STALE inodes.

On Thu, Jul 17, 2014 at 1:22 PM, Jeff Layton
<[email protected]> wrote:

> What's so special about an EOPENSTALE return from finish_open that we
> need to handle retries in do_last? It seems like we could get rid of the
> stale_open label and just let do_filp_open handle it like we would
> an ESTALE return from any other spot in the function.
>
> Just for giggles, here's an RFC patch. It builds but I haven't tested
> it. It might also be possible to do some cleanup around saved_parent
> with this.
>
> Thoughts?

EOPENSTALE is an optimization for the redoing only the last component.
It's the analogue of ->d_revalidate() failure, in which case lookup of
that component only is retried path components before that are not.

I'm not sure if it's a valid optimization, but if not, then we should
also consider doing LOOKUP_REVAL on the whole path on any
d_revalidate() failure as well.

Thanks,
Miklos

2014-07-14 22:57:36

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH] NFS: nfs4_lookup_revalidate need to report STALE inodes.

On Mon, 14 Jul 2014 09:00:28 -0400 Jeff Layton <[email protected]>
wrote:

> On Mon, 14 Jul 2014 22:35:13 +1000
> NeilBrown <[email protected]> wrote:
>
> > On Mon, 14 Jul 2014 08:14:55 -0400 Jeff Layton <[email protected]>
> > wrote:
> >
> > > On Mon, 14 Jul 2014 15:14:05 +1000
> > > NeilBrown <[email protected]> wrote:
> > >
> > > >
> > > > If an 'open' of a file in an NFSv4 filesystem finds that the dentry is
> > > > in cache, but the inode is stale (on the server), the dentry will not
> > > > be re-validated immediately and may cause ESTALE to be returned to
> > > > user-space.
> > > >
> > > > For a non-create 'open', do_last() calls lookup_fast() and on success
> > > > will eventually call may_open() which calls into nfs_permission().
> > > > If nfs_permission() makes the ACCESS call to the server it will get
> > > > NFS4ERR_STALE, resulting in ESTALE from may_open() and thence from
> > > > do_last().
> > > > The retry-on-ESTALE in filename_lookup() will repeat exactly the same
> > > > process because nothing in this path will invalidate the dentry due to
> > > > the inode being stale, so the ESTALE will be returned.
> > > >
> > > > lookup_fast() calls ->d_revalidate(), but for an OPEN on an NFSv4
> > > > filesystem, that will succeed for regular files:
> > > > /* Let f_op->open() actually open (and revalidate) the file */
> > > >
> > > > Unfortunately in the case of a STALE inode, f_op->open() never gets
> > > > called. If we teach nfs4_lookup_revalidate() to report a failure on
> > > > NFS_STALE() inodes, then the dentry will be invalidated and a full
> > > > lookup will be attempted. The ESTALE errors go away.
> > > >
> > > >
> > > > While I think this fix is correct, I'm not convinced that it is
> > > > sufficient, particularly if lookupcache=none.
> > > > The current code will fail an "open" is nfs_permission() fails,
> > > > without having performed a LOOKUP. i.e. it will use the cache.
> > > > nfs_lookup_revalidate will force a lookup before the permission check
> > > > if NFS_MOUNT_LOOKUP_CACHE_NONE, but nfs4_lookup_revalidate will not.
> > > >
> > >
> > > This patch should make the code fall through to nfs_lookup_revalidate,
> > > which would then force the lookup, right?
> >
> > Yes ... though maybe that's not what I really want to do. I really wanted to
> > just return '0', though I would need to check that is right in all cases.
> >
> > >
> > > Also, I'm a little unclear...
> > >
> > > Why would may_open fail with ESTALE after the v4 OPEN succeeds? The
> > > OPEN should be returning a filehandle and attributes for the inode
> > > actually opened. It seems like we ought to be doing any permission
> > > checks vs. that inode, not anything we had in cache. Presumably the
> > > server is then holding it open so it shouldn't be stale.
> >
> > may_open is called *before* and v4 OPEN.
> >
> > In do_last, if the inode is already in cache, then
> > lookup_fast is called, which calls d_revalidate
> > then may_open (calls ->permission)
> > then finish_open which calls f_op->open
> >
> > Yes, we should be doing permission checking against whatever 'open' finds.
> > But the VFS is structured to the the permission check after d_revalidate and
> > before ->open. So maybe d_revalidate needs to do the NFS open??
> >
>
> Ok, I see. Ugh, having the revalidate do the open sounds...messy.

Having the VFS call into the file system in dribs and drabs, rather than just
asking the filesystem to "open" and letting it call back to VFS libraries
for name lookup etc it what is really messy (IMO).

So yes - definite mess. Not entirely sure where the mess is.

>
> A simpler fix might be to fix it so that an -ESTALE return from
> may_open triggers a retry. Something like this maybe (probably
> whitespace damaged, so just for discussion purposes):

Nice idea but doesn't work.
We get back to retry_lookup and call lookup_open().
lookup_dcache calls d_revalidate which reports that everything is fine, so it
tells lookup_open which jumps to out_no_open and does nothing useful.
So we end up in may_open() again which returns ESTALE again but now we've
used up all our extra lives...


One thing I noticed while exploring this is that do_last calls "may_open"
*before* finish_open() while atomic_open() calls "may_open" *after*
finish_open() (which it calls by virtual of the fact that all ->atomic_open
methods call finish_open()).

I was very tempted to just move the 'may_open' call in 'do_last' to after the
'finish_open' call. That fixed the problem, but I'm not sure it is "right".

I think the real core messiness here is that permission checking should be
neither before nor after finish_open, but should be an integral part of
finish_open with the filesystem doing the permission check in f_op->open().

I'm currently thinking this is the best patch for now:

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 4f7414afca27..5c40cfd3ae29 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1563,9 +1563,10 @@ static int nfs4_lookup_revalidate(struct dentry *dentry, unsigned int flags)
/* We cannot do exclusive creation on a positive dentry */
if (flags & LOOKUP_EXCL)
goto no_open_dput;

- /* Let f_op->open() actually open (and revalidate) the file */
- ret = 1;
+ if (!NFS_STALE(inode))
+ /* Let f_op->open() actually open (and revalidate) the file */
+ ret = 1;

out:
dput(parent);


Thanks,
NeilBrown


>
> diff --git a/fs/namei.c b/fs/namei.c
> index 985c6f368485..c1657deea52c 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -3045,8 +3045,13 @@ finish_open:
> }
> finish_open_created:
> error = may_open(&nd->path, acc_mode, open_flag);
> - if (error)
> + if (error) {
> + if (error == -ESTALE)
> + goto stale_open;
> goto out;
> + }
> file->f_path.mnt = nd->path.mnt;
> error = finish_open(file, nd->path.dentry, NULL, opened);
> if (error) {
>
>
> ...though might need to convert the ESTALE to EOPENSTALE there too, not
> sure...
>
> >
> > >
> > > Are we not properly updating the dcache (and attrcache) after the
> > > OPEN reply?
> >
> > I think so, yes. But in the problem case, we don't even send an OPEN
> > request.
> >
> >
> > >
> > > >
> > > > Signed-off-by: NeilBrown <[email protected]>
> > > >
> > > > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> > > > index 4a3d4ef76127..4f7414afca27 100644
> > > > --- a/fs/nfs/dir.c
> > > > +++ b/fs/nfs/dir.c
> > > > @@ -1563,6 +1563,8 @@ static int nfs4_lookup_revalidate(struct
> > > > dentry *dentry, unsigned int flags) /* We cannot do exclusive
> > > > creation on a positive dentry */ if (flags & LOOKUP_EXCL)
> > > > goto no_open_dput;
> > > > + if (NFS_STALE(inode))
> > > > + goto no_open_dput;
> > > >
> > > > /* Let f_op->open() actually open (and revalidate) the
> > > > file */ ret = 1;
> > >
> > > Looks legit to me too, but it seems like the inode could go stale
> > > w/o us knowing after this point.
> > >
> > > Acked-by: Jeff Layton <[email protected]>
> >
> > Thanks,
> > NeilBrown
>
>


Attachments:
signature.asc (828.00 B)

2014-07-17 01:50:35

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH] NFS: nfs4_lookup_revalidate need to report STALE inodes.

On Mon, 14 Jul 2014 19:47:38 -0400 Jeff Layton <[email protected]>
wrote:

> On Tue, 15 Jul 2014 08:57:27 +1000
> NeilBrown <[email protected]> wrote:
>
> > On Mon, 14 Jul 2014 09:00:28 -0400 Jeff Layton <[email protected]>
> > wrote:
> >
> > > On Mon, 14 Jul 2014 22:35:13 +1000
> > > NeilBrown <[email protected]> wrote:
> > >
> > > > On Mon, 14 Jul 2014 08:14:55 -0400 Jeff Layton <[email protected]>
> > > > wrote:
> > > >
> > > > > On Mon, 14 Jul 2014 15:14:05 +1000
> > > > > NeilBrown <[email protected]> wrote:
> > > > >
> > > > > >
> > > > > > If an 'open' of a file in an NFSv4 filesystem finds that the dentry is
> > > > > > in cache, but the inode is stale (on the server), the dentry will not
> > > > > > be re-validated immediately and may cause ESTALE to be returned to
> > > > > > user-space.
> > > > > >
> > > > > > For a non-create 'open', do_last() calls lookup_fast() and on success
> > > > > > will eventually call may_open() which calls into nfs_permission().
> > > > > > If nfs_permission() makes the ACCESS call to the server it will get
> > > > > > NFS4ERR_STALE, resulting in ESTALE from may_open() and thence from
> > > > > > do_last().
> > > > > > The retry-on-ESTALE in filename_lookup() will repeat exactly the same
> > > > > > process because nothing in this path will invalidate the dentry due to
> > > > > > the inode being stale, so the ESTALE will be returned.
> > > > > >
> > > > > > lookup_fast() calls ->d_revalidate(), but for an OPEN on an NFSv4
> > > > > > filesystem, that will succeed for regular files:
> > > > > > /* Let f_op->open() actually open (and revalidate) the file */
> > > > > >
> > > > > > Unfortunately in the case of a STALE inode, f_op->open() never gets
> > > > > > called. If we teach nfs4_lookup_revalidate() to report a failure on
> > > > > > NFS_STALE() inodes, then the dentry will be invalidated and a full
> > > > > > lookup will be attempted. The ESTALE errors go away.
> > > > > >
> > > > > >
> > > > > > While I think this fix is correct, I'm not convinced that it is
> > > > > > sufficient, particularly if lookupcache=none.
> > > > > > The current code will fail an "open" is nfs_permission() fails,
> > > > > > without having performed a LOOKUP. i.e. it will use the cache.
> > > > > > nfs_lookup_revalidate will force a lookup before the permission check
> > > > > > if NFS_MOUNT_LOOKUP_CACHE_NONE, but nfs4_lookup_revalidate will not.
> > > > > >
> > > > >
> > > > > This patch should make the code fall through to nfs_lookup_revalidate,
> > > > > which would then force the lookup, right?
> > > >
> > > > Yes ... though maybe that's not what I really want to do. I really wanted to
> > > > just return '0', though I would need to check that is right in all cases.
> > > >
> > > > >
> > > > > Also, I'm a little unclear...
> > > > >
> > > > > Why would may_open fail with ESTALE after the v4 OPEN succeeds? The
> > > > > OPEN should be returning a filehandle and attributes for the inode
> > > > > actually opened. It seems like we ought to be doing any permission
> > > > > checks vs. that inode, not anything we had in cache. Presumably the
> > > > > server is then holding it open so it shouldn't be stale.
> > > >
> > > > may_open is called *before* and v4 OPEN.
> > > >
> > > > In do_last, if the inode is already in cache, then
> > > > lookup_fast is called, which calls d_revalidate
> > > > then may_open (calls ->permission)
> > > > then finish_open which calls f_op->open
> > > >
> > > > Yes, we should be doing permission checking against whatever 'open' finds.
> > > > But the VFS is structured to the the permission check after d_revalidate and
> > > > before ->open. So maybe d_revalidate needs to do the NFS open??
> > > >
> > >
> > > Ok, I see. Ugh, having the revalidate do the open sounds...messy.
> >
> > Having the VFS call into the file system in dribs and drabs, rather than just
> > asking the filesystem to "open" and letting it call back to VFS libraries
> > for name lookup etc it what is really messy (IMO).
> >
> > So yes - definite mess. Not entirely sure where the mess is.
> >
>
> Yeah, that might have been cleaner overall. I'm not sure how we can get
> there from where the code is today though...
>
> > >
> > > A simpler fix might be to fix it so that an -ESTALE return from
> > > may_open triggers a retry. Something like this maybe (probably
> > > whitespace damaged, so just for discussion purposes):
> >
> > Nice idea but doesn't work.
> > We get back to retry_lookup and call lookup_open().
> > lookup_dcache calls d_revalidate which reports that everything is fine, so it
> > tells lookup_open which jumps to out_no_open and does nothing useful.
> > So we end up in may_open() again which returns ESTALE again but now we've
> > used up all our extra lives...
> >
>
> Ahh right, so you'd probably need to pair that with the patch you
> already have. Regardless, it seems like getting back an ESTALE from
> may_open should trigger a retry rather than just erroring out.
>
> >
> > One thing I noticed while exploring this is that do_last calls "may_open"
> > *before* finish_open() while atomic_open() calls "may_open" *after*
> > finish_open() (which it calls by virtual of the fact that all ->atomic_open
> > methods call finish_open()).
> >
> > I was very tempted to just move the 'may_open' call in 'do_last' to after the
> > 'finish_open' call. That fixed the problem, but I'm not sure it is "right".
> >
> > I think the real core messiness here is that permission checking should be
> > neither before nor after finish_open, but should be an integral part of
> > finish_open with the filesystem doing the permission check in f_op->open().
> >
> > I'm currently thinking this is the best patch for now:
> >
> > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> > index 4f7414afca27..5c40cfd3ae29 100644
> > --- a/fs/nfs/dir.c
> > +++ b/fs/nfs/dir.c
> > @@ -1563,9 +1563,10 @@ static int nfs4_lookup_revalidate(struct dentry *dentry, unsigned int flags)
> > /* We cannot do exclusive creation on a positive dentry */
> > if (flags & LOOKUP_EXCL)
> > goto no_open_dput;
> >
> > - /* Let f_op->open() actually open (and revalidate) the file */
> > - ret = 1;
> > + if (!NFS_STALE(inode))
> > + /* Let f_op->open() actually open (and revalidate) the file */
> > + ret = 1;
> >
> > out:
> > dput(parent);
> >
> >
> > Thanks,
> > NeilBrown
> >
>
> That looks fine too, but I think you probably will also want to pair it
> with making may_open retry the open on an ESTALE return.
>
> The problem with the above check alone is that it's only going to fire
> if you previously found the inode to be stale. It may be stale on the
> server, but the client doesn't realize it yet, or could go stale after
> this check and before the ACCESS call. In that case, you'll still end
> up getting back an ESTALE once you hit may_open (unless I'm missing
> something) and that won't trigger a reattempt either.

I must admit to being a bit confused by your position here.

You are the one who introduced the high-level retry-on-ESTALE functionality
into namei.c. So you presumably know that an ESTALE will already be
retried. Yet you are suggesting to that we add another retry here??

The way I understanding it, ESTALE should only be retried if it was a cached
inode that was found to be STALE. When that happens, the dentry needs to be
invalidated and then the whole path retried again from the top with
LOOKUP_REVAL. This time we won't trust anything that is cached so any ESTALE
we find is a real ESTALE that must be returned to the caller.

From this perspective, the problem is either something is seeing a STALE
inode in the first pass and not invalidating the dentry, or that something is
not revalidating the dentry on the second pass despite LOOKUP_REVAL being set.
I'm assuming that nfs4_look_revalidate should be invalidating the dentry on
the first pass (by returning 0). Other fixes might be possible, but further
retries should be pointless - we already have the required retry in place
thanks to you!

Thanks,
NeilBrown


Attachments:
signature.asc (828.00 B)

2014-07-14 12:14:59

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] NFS: nfs4_lookup_revalidate need to report STALE inodes.

On Mon, 14 Jul 2014 15:14:05 +1000
NeilBrown <[email protected]> wrote:

>
> If an 'open' of a file in an NFSv4 filesystem finds that the dentry is
> in cache, but the inode is stale (on the server), the dentry will not
> be re-validated immediately and may cause ESTALE to be returned to
> user-space.
>
> For a non-create 'open', do_last() calls lookup_fast() and on success
> will eventually call may_open() which calls into nfs_permission().
> If nfs_permission() makes the ACCESS call to the server it will get
> NFS4ERR_STALE, resulting in ESTALE from may_open() and thence from
> do_last().
> The retry-on-ESTALE in filename_lookup() will repeat exactly the same
> process because nothing in this path will invalidate the dentry due to
> the inode being stale, so the ESTALE will be returned.
>
> lookup_fast() calls ->d_revalidate(), but for an OPEN on an NFSv4
> filesystem, that will succeed for regular files:
> /* Let f_op->open() actually open (and revalidate) the file */
>
> Unfortunately in the case of a STALE inode, f_op->open() never gets
> called. If we teach nfs4_lookup_revalidate() to report a failure on
> NFS_STALE() inodes, then the dentry will be invalidated and a full
> lookup will be attempted. The ESTALE errors go away.
>
>
> While I think this fix is correct, I'm not convinced that it is
> sufficient, particularly if lookupcache=none.
> The current code will fail an "open" is nfs_permission() fails,
> without having performed a LOOKUP. i.e. it will use the cache.
> nfs_lookup_revalidate will force a lookup before the permission check
> if NFS_MOUNT_LOOKUP_CACHE_NONE, but nfs4_lookup_revalidate will not.
>

This patch should make the code fall through to nfs_lookup_revalidate,
which would then force the lookup, right?

Also, I'm a little unclear...

Why would may_open fail with ESTALE after the v4 OPEN succeeds? The
OPEN should be returning a filehandle and attributes for the inode
actually opened. It seems like we ought to be doing any permission
checks vs. that inode, not anything we had in cache. Presumably the
server is then holding it open so it shouldn't be stale.

Are we not properly updating the dcache (and attrcache) after the OPEN
reply?

>
> Signed-off-by: NeilBrown <[email protected]>
>
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index 4a3d4ef76127..4f7414afca27 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -1563,6 +1563,8 @@ static int nfs4_lookup_revalidate(struct dentry
> *dentry, unsigned int flags) /* We cannot do exclusive creation on a
> positive dentry */ if (flags & LOOKUP_EXCL)
> goto no_open_dput;
> + if (NFS_STALE(inode))
> + goto no_open_dput;
>
> /* Let f_op->open() actually open (and revalidate) the file
> */ ret = 1;

Looks legit to me too, but it seems like the inode could go stale w/o
us knowing after this point.

Acked-by: Jeff Layton <[email protected]>


Attachments:
signature.asc (819.00 B)

2014-07-17 11:22:51

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] NFS: nfs4_lookup_revalidate need to report STALE inodes.

On Thu, 17 Jul 2014 11:50:24 +1000
NeilBrown <[email protected]> wrote:

> On Mon, 14 Jul 2014 19:47:38 -0400 Jeff Layton <[email protected]>
> wrote:
>
> > On Tue, 15 Jul 2014 08:57:27 +1000
> > NeilBrown <[email protected]> wrote:
> >
> > > On Mon, 14 Jul 2014 09:00:28 -0400 Jeff Layton <[email protected]>
> > > wrote:
> > >
> > > > On Mon, 14 Jul 2014 22:35:13 +1000
> > > > NeilBrown <[email protected]> wrote:
> > > >
> > > > > On Mon, 14 Jul 2014 08:14:55 -0400 Jeff Layton <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > On Mon, 14 Jul 2014 15:14:05 +1000
> > > > > > NeilBrown <[email protected]> wrote:
> > > > > >
> > > > > > >
> > > > > > > If an 'open' of a file in an NFSv4 filesystem finds that the dentry is
> > > > > > > in cache, but the inode is stale (on the server), the dentry will not
> > > > > > > be re-validated immediately and may cause ESTALE to be returned to
> > > > > > > user-space.
> > > > > > >
> > > > > > > For a non-create 'open', do_last() calls lookup_fast() and on success
> > > > > > > will eventually call may_open() which calls into nfs_permission().
> > > > > > > If nfs_permission() makes the ACCESS call to the server it will get
> > > > > > > NFS4ERR_STALE, resulting in ESTALE from may_open() and thence from
> > > > > > > do_last().
> > > > > > > The retry-on-ESTALE in filename_lookup() will repeat exactly the same
> > > > > > > process because nothing in this path will invalidate the dentry due to
> > > > > > > the inode being stale, so the ESTALE will be returned.
> > > > > > >
> > > > > > > lookup_fast() calls ->d_revalidate(), but for an OPEN on an NFSv4
> > > > > > > filesystem, that will succeed for regular files:
> > > > > > > /* Let f_op->open() actually open (and revalidate) the file */
> > > > > > >
> > > > > > > Unfortunately in the case of a STALE inode, f_op->open() never gets
> > > > > > > called. If we teach nfs4_lookup_revalidate() to report a failure on
> > > > > > > NFS_STALE() inodes, then the dentry will be invalidated and a full
> > > > > > > lookup will be attempted. The ESTALE errors go away.
> > > > > > >
> > > > > > >
> > > > > > > While I think this fix is correct, I'm not convinced that it is
> > > > > > > sufficient, particularly if lookupcache=none.
> > > > > > > The current code will fail an "open" is nfs_permission() fails,
> > > > > > > without having performed a LOOKUP. i.e. it will use the cache.
> > > > > > > nfs_lookup_revalidate will force a lookup before the permission check
> > > > > > > if NFS_MOUNT_LOOKUP_CACHE_NONE, but nfs4_lookup_revalidate will not.
> > > > > > >
> > > > > >
> > > > > > This patch should make the code fall through to nfs_lookup_revalidate,
> > > > > > which would then force the lookup, right?
> > > > >
> > > > > Yes ... though maybe that's not what I really want to do. I really wanted to
> > > > > just return '0', though I would need to check that is right in all cases.
> > > > >
> > > > > >
> > > > > > Also, I'm a little unclear...
> > > > > >
> > > > > > Why would may_open fail with ESTALE after the v4 OPEN succeeds? The
> > > > > > OPEN should be returning a filehandle and attributes for the inode
> > > > > > actually opened. It seems like we ought to be doing any permission
> > > > > > checks vs. that inode, not anything we had in cache. Presumably the
> > > > > > server is then holding it open so it shouldn't be stale.
> > > > >
> > > > > may_open is called *before* and v4 OPEN.
> > > > >
> > > > > In do_last, if the inode is already in cache, then
> > > > > lookup_fast is called, which calls d_revalidate
> > > > > then may_open (calls ->permission)
> > > > > then finish_open which calls f_op->open
> > > > >
> > > > > Yes, we should be doing permission checking against whatever 'open' finds.
> > > > > But the VFS is structured to the the permission check after d_revalidate and
> > > > > before ->open. So maybe d_revalidate needs to do the NFS open??
> > > > >
> > > >
> > > > Ok, I see. Ugh, having the revalidate do the open sounds...messy.
> > >
> > > Having the VFS call into the file system in dribs and drabs, rather than just
> > > asking the filesystem to "open" and letting it call back to VFS libraries
> > > for name lookup etc it what is really messy (IMO).
> > >
> > > So yes - definite mess. Not entirely sure where the mess is.
> > >
> >
> > Yeah, that might have been cleaner overall. I'm not sure how we can get
> > there from where the code is today though...
> >
> > > >
> > > > A simpler fix might be to fix it so that an -ESTALE return from
> > > > may_open triggers a retry. Something like this maybe (probably
> > > > whitespace damaged, so just for discussion purposes):
> > >
> > > Nice idea but doesn't work.
> > > We get back to retry_lookup and call lookup_open().
> > > lookup_dcache calls d_revalidate which reports that everything is fine, so it
> > > tells lookup_open which jumps to out_no_open and does nothing useful.
> > > So we end up in may_open() again which returns ESTALE again but now we've
> > > used up all our extra lives...
> > >
> >
> > Ahh right, so you'd probably need to pair that with the patch you
> > already have. Regardless, it seems like getting back an ESTALE from
> > may_open should trigger a retry rather than just erroring out.
> >
> > >
> > > One thing I noticed while exploring this is that do_last calls "may_open"
> > > *before* finish_open() while atomic_open() calls "may_open" *after*
> > > finish_open() (which it calls by virtual of the fact that all ->atomic_open
> > > methods call finish_open()).
> > >
> > > I was very tempted to just move the 'may_open' call in 'do_last' to after the
> > > 'finish_open' call. That fixed the problem, but I'm not sure it is "right".
> > >
> > > I think the real core messiness here is that permission checking should be
> > > neither before nor after finish_open, but should be an integral part of
> > > finish_open with the filesystem doing the permission check in f_op->open().
> > >
> > > I'm currently thinking this is the best patch for now:
> > >
> > > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> > > index 4f7414afca27..5c40cfd3ae29 100644
> > > --- a/fs/nfs/dir.c
> > > +++ b/fs/nfs/dir.c
> > > @@ -1563,9 +1563,10 @@ static int nfs4_lookup_revalidate(struct dentry *dentry, unsigned int flags)
> > > /* We cannot do exclusive creation on a positive dentry */
> > > if (flags & LOOKUP_EXCL)
> > > goto no_open_dput;
> > >
> > > - /* Let f_op->open() actually open (and revalidate) the file */
> > > - ret = 1;
> > > + if (!NFS_STALE(inode))
> > > + /* Let f_op->open() actually open (and revalidate) the file */
> > > + ret = 1;
> > >
> > > out:
> > > dput(parent);
> > >
> > >
> > > Thanks,
> > > NeilBrown
> > >
> >
> > That looks fine too, but I think you probably will also want to pair it
> > with making may_open retry the open on an ESTALE return.
> >
> > The problem with the above check alone is that it's only going to fire
> > if you previously found the inode to be stale. It may be stale on the
> > server, but the client doesn't realize it yet, or could go stale after
> > this check and before the ACCESS call. In that case, you'll still end
> > up getting back an ESTALE once you hit may_open (unless I'm missing
> > something) and that won't trigger a reattempt either.
>
> I must admit to being a bit confused by your position here.
>
> You are the one who introduced the high-level retry-on-ESTALE functionality
> into namei.c. So you presumably know that an ESTALE will already be
> retried. Yet you are suggesting to that we add another retry here??
>
> The way I understanding it, ESTALE should only be retried if it was a cached
> inode that was found to be STALE. When that happens, the dentry needs to be
> invalidated and then the whole path retried again from the top with
> LOOKUP_REVAL. This time we won't trust anything that is cached so any ESTALE
> we find is a real ESTALE that must be returned to the caller.
>
> From this perspective, the problem is either something is seeing a STALE
> inode in the first pass and not invalidating the dentry, or that something is
> not revalidating the dentry on the second pass despite LOOKUP_REVAL being set.
> I'm assuming that nfs4_look_revalidate should be invalidating the dentry on
> the first pass (by returning 0). Other fixes might be possible, but further
> retries should be pointless - we already have the required retry in place
> thanks to you!
>
> Thanks,
> NeilBrown

(cc'ing Miklos)

<facepalm>
You're totally correct. I had forgotten that we do retries on ESTALE at
a higher level. I got confused by the EOPENSTALE handling there after
finish_open.
</facepalm>

So, on your patch:

Acked-by: Jeff Layton <[email protected]>

That said, it does sort of bring up an unrelated question:

What's so special about an EOPENSTALE return from finish_open that we
need to handle retries in do_last? It seems like we could get rid of the
stale_open label and just let do_filp_open handle it like we would
an ESTALE return from any other spot in the function.

Just for giggles, here's an RFC patch. It builds but I haven't tested
it. It might also be possible to do some cleanup around saved_parent
with this.

Thoughts?

-------------------------[snip]-------------------

[PATCH] vfs: don't handle EOPENSTALE retries in do_last

We already handle ESTALE retries at higher levels. Retrying the lookup
and open in do_last is somewhat redundant. Remove the logic that
for that from do_last and just let the upper layers handle it.

Cc: Miklos Szeredi <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
---
fs/namei.c | 26 +-------------------------
1 file changed, 1 insertion(+), 25 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 985c6f368485..34c6d008d0e5 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2882,7 +2882,6 @@ static int do_last(struct nameidata *nd, struct path *path,
struct inode *inode;
bool symlink_ok = false;
struct path save_parent = { .dentry = NULL, .mnt = NULL };
- bool retried = false;
int error;

nd->flags &= ~LOOKUP_PARENT;
@@ -2927,7 +2926,6 @@ static int do_last(struct nameidata *nd, struct path *path,
goto out;
}

-retry_lookup:
if (op->open_flag & (O_CREAT | O_TRUNC | O_WRONLY | O_RDWR)) {
error = mnt_want_write(nd->path.mnt);
if (!error)
@@ -3017,7 +3015,6 @@ finish_lookup:
save_parent.dentry = nd->path.dentry;
save_parent.mnt = mntget(path->mnt);
nd->path.dentry = path->dentry;
-
}
nd->inode = inode;
/* Why this, you ask? _Now_ we might have grown LOOKUP_JUMPED... */
@@ -3049,11 +3046,8 @@ finish_open_created:
goto out;
file->f_path.mnt = nd->path.mnt;
error = finish_open(file, nd->path.dentry, NULL, opened);
- if (error) {
- if (error == -EOPENSTALE)
- goto stale_open;
+ if (error)
goto out;
- }
opened:
error = open_check_o_direct(file);
if (error)
@@ -3080,24 +3074,6 @@ exit_dput:
exit_fput:
fput(file);
goto out;
-
-stale_open:
- /* If no saved parent or already retried then can't retry */
- if (!save_parent.dentry || retried)
- goto out;
-
- BUG_ON(save_parent.dentry != dir);
- path_put(&nd->path);
- nd->path = save_parent;
- nd->inode = dir->d_inode;
- save_parent.mnt = NULL;
- save_parent.dentry = NULL;
- if (got_write) {
- mnt_drop_write(nd->path.mnt);
- got_write = false;
- }
- retried = true;
- goto retry_lookup;
}

static int do_tmpfile(int dfd, struct filename *pathname,
--
1.9.3




Attachments:
signature.asc (819.00 B)

2014-07-17 14:42:03

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] NFS: nfs4_lookup_revalidate need to report STALE inodes.

On Thu, 17 Jul 2014 14:52:00 +0200
Miklos Szeredi <[email protected]> wrote:

> On Thu, Jul 17, 2014 at 1:22 PM, Jeff Layton
> <[email protected]> wrote:
>
> > What's so special about an EOPENSTALE return from finish_open that we
> > need to handle retries in do_last? It seems like we could get rid of the
> > stale_open label and just let do_filp_open handle it like we would
> > an ESTALE return from any other spot in the function.
> >
> > Just for giggles, here's an RFC patch. It builds but I haven't tested
> > it. It might also be possible to do some cleanup around saved_parent
> > with this.
> >
> > Thoughts?
>
> EOPENSTALE is an optimization for the redoing only the last component.
> It's the analogue of ->d_revalidate() failure, in which case lookup of
> that component only is retried path components before that are not.
>
> I'm not sure if it's a valid optimization, but if not, then we should
> also consider doing LOOKUP_REVAL on the whole path on any
> d_revalidate() failure as well.
>
> Thanks,
> Miklos


Ok, that makes sense and it's seems like good enough reason to keep it
as is for now. We can just drop my RFC patch...

Thanks,
--
Jeff Layton <[email protected]>