2015-02-24 00:15:01

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH 1/3] NFS: Don't invalidate a submounted dentry in nfs_prime_dcache()

If we're traversing a directory which contains a submounted filesystem,
or one that has a referral, the NFS server that is processing the READDIR
request will often return information for the underlying (mounted-on)
directory. It may, or may not, also return filehandle information.

If this happens, and the lookup in nfs_prime_dcache() returns the
dentry for the submounted directory, the filehandle comparison will
fail, and we call d_invalidate(). Post-commit 8ed936b5671bf
("vfs: Lazily remove mounts on unlinked files and directories."), this
means the entire subtree is unmounted.

The following minimal patch addresses this problem by punting on
the invalidation if there is a submount.

Kudos to Neil Brown <[email protected]> for having tracked down this
issue (see link).

Reported-by: Nix <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Cc: [email protected] # 3.18+
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/dir.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 9b0c55cb2a2e..0da617a61c0b 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -479,6 +479,10 @@ void nfs_prime_dcache(struct dentry *parent, struct nfs_entry *entry)

dentry = d_lookup(parent, &filename);
if (dentry != NULL) {
+ /* Is there a mountpoint here? If so, just exit */
+ if (!nfs_fsid_equal(&NFS_SB(dentry->d_sb)->fsid,
+ &entry->fattr->fsid))
+ goto out;
if (nfs_same_file(dentry, entry)) {
nfs_set_verifier(dentry, nfs_save_change_attribute(dir));
status = nfs_refresh_inode(dentry->d_inode, entry->fattr);
--
2.1.0



2015-02-24 00:15:02

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH 2/3] NFSv3: Use the readdir fileid as the mounted-on-fileid

When we call readdirplus, set the fileid normally returned by readdir
as the mounted-on-fileid, since that is commonly the case if there is
a mountpoint. To ensure that we get it right, we only set the flag if
the readdir fileid differs from the one returned in the readdirplus
attributes.

This again means that we can avoid the issues described in commit
2ef47eb1aee17 ("NFS: Fix use of nfs_attr_use_mounted_on_fileid()"),
which only fixed NFSv4.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs3xdr.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/fs/nfs/nfs3xdr.c b/fs/nfs/nfs3xdr.c
index 2a932fdc57cb..53852a4bd88b 100644
--- a/fs/nfs/nfs3xdr.c
+++ b/fs/nfs/nfs3xdr.c
@@ -1987,6 +1987,11 @@ int nfs3_decode_dirent(struct xdr_stream *xdr, struct nfs_entry *entry,
if (entry->fattr->valid & NFS_ATTR_FATTR_V3)
entry->d_type = nfs_umode_to_dtype(entry->fattr->mode);

+ if (entry->fattr->fileid != entry->ino) {
+ entry->fattr->mounted_on_fileid = entry->ino;
+ entry->fattr->valid |= NFS_ATTR_FATTR_MOUNTED_ON_FILEID;
+ }
+
/* In fact, a post_op_fh3: */
p = xdr_inline_decode(xdr, 4);
if (unlikely(p == NULL))
--
2.1.0


2015-02-24 00:15:04

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH 3/3] NFS: Don't require a filehandle to refresh the inode in nfs_prime_dcache()

If the server does not return a valid set of attributes that we can
use to either create a file or refresh the inode, then there is no
value in calling nfs_prime_dcache().

However if we're just refreshing the inode using the attributes that
the server returned, then it shouldn't matter whether or not we have
a filehandle, as long as we check the fsid+fileid combination.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/dir.c | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 0da617a61c0b..c19e16f0b2d0 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -408,14 +408,22 @@ static int xdr_decode(nfs_readdir_descriptor_t *desc,
return 0;
}

+/* Match file and dirent using either filehandle or fileid
+ * Note: caller is responsible for checking the fsid
+ */
static
int nfs_same_file(struct dentry *dentry, struct nfs_entry *entry)
{
+ struct nfs_inode *nfsi;
+
if (dentry->d_inode == NULL)
goto different;
- if (nfs_compare_fh(entry->fh, NFS_FH(dentry->d_inode)) != 0)
- goto different;
- return 1;
+
+ nfsi = NFS_I(dentry->d_inode);
+ if (entry->fattr->fileid == nfsi->fileid)
+ return 1;
+ if (nfs_compare_fh(entry->fh, &nfsi->fh) == 0)
+ return 1;
different:
return 0;
}
@@ -469,6 +477,10 @@ void nfs_prime_dcache(struct dentry *parent, struct nfs_entry *entry)
struct inode *inode;
int status;

+ if (!(entry->fattr->valid & NFS_ATTR_FATTR_FILEID))
+ return;
+ if (!(entry->fattr->valid & NFS_ATTR_FATTR_FSID))
+ return;
if (filename.name[0] == '.') {
if (filename.len == 1)
return;
--
2.1.0


2015-02-24 21:53:26

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH 3/3] NFS: Don't require a filehandle to refresh the inode in nfs_prime_dcache()

On Mon, 23 Feb 2015 19:14:57 -0500 Trond Myklebust
<[email protected]> wrote:

> If the server does not return a valid set of attributes that we can
> use to either create a file or refresh the inode, then there is no
> value in calling nfs_prime_dcache().
>
> However if we're just refreshing the inode using the attributes that
> the server returned, then it shouldn't matter whether or not we have
> a filehandle, as long as we check the fsid+fileid combination.
>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> fs/nfs/dir.c | 18 +++++++++++++++---
> 1 file changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index 0da617a61c0b..c19e16f0b2d0 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -408,14 +408,22 @@ static int xdr_decode(nfs_readdir_descriptor_t *desc,
> return 0;
> }
>
> +/* Match file and dirent using either filehandle or fileid
> + * Note: caller is responsible for checking the fsid
> + */
> static
> int nfs_same_file(struct dentry *dentry, struct nfs_entry *entry)
> {
> + struct nfs_inode *nfsi;
> +
> if (dentry->d_inode == NULL)
> goto different;
> - if (nfs_compare_fh(entry->fh, NFS_FH(dentry->d_inode)) != 0)
> - goto different;
> - return 1;
> +
> + nfsi = NFS_I(dentry->d_inode);
> + if (entry->fattr->fileid == nfsi->fileid)
> + return 1;
> + if (nfs_compare_fh(entry->fh, &nfsi->fh) == 0)
> + return 1;
> different:
> return 0;
> }
> @@ -469,6 +477,10 @@ void nfs_prime_dcache(struct dentry *parent, struct nfs_entry *entry)
> struct inode *inode;
> int status;
>
> + if (!(entry->fattr->valid & NFS_ATTR_FATTR_FILEID))
> + return;
> + if (!(entry->fattr->valid & NFS_ATTR_FATTR_FSID))
> + return;
> if (filename.name[0] == '.') {
> if (filename.len == 1)
> return;


I believe this will fix the observed problem. This is partly because the
Linux NFSv3 server either returns both a filehandle and attributes, or
neither.

If a server happened to return postop attributes, but no filehandle, then the
"nfs_compare_fh()" would be a meaningless test.

I think you should abort nfs_prime_dcache if entry->fh->size is zero for
exactly the same reason that you abort if NFS_ATTR_FATTR_FSID is not set.

NeilBrown


Attachments:
(No filename) (811.00 B)
OpenPGP digital signature

2015-02-24 21:46:32

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH 2/3] NFSv3: Use the readdir fileid as the mounted-on-fileid

On Mon, 23 Feb 2015 19:14:56 -0500 Trond Myklebust
<[email protected]> wrote:

> When we call readdirplus, set the fileid normally returned by readdir
> as the mounted-on-fileid, since that is commonly the case if there is
> a mountpoint. To ensure that we get it right, we only set the flag if
> the readdir fileid differs from the one returned in the readdirplus
> attributes.
>
> This again means that we can avoid the issues described in commit
> 2ef47eb1aee17 ("NFS: Fix use of nfs_attr_use_mounted_on_fileid()"),
> which only fixed NFSv4.
>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> fs/nfs/nfs3xdr.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/fs/nfs/nfs3xdr.c b/fs/nfs/nfs3xdr.c
> index 2a932fdc57cb..53852a4bd88b 100644
> --- a/fs/nfs/nfs3xdr.c
> +++ b/fs/nfs/nfs3xdr.c
> @@ -1987,6 +1987,11 @@ int nfs3_decode_dirent(struct xdr_stream *xdr, struct nfs_entry *entry,
> if (entry->fattr->valid & NFS_ATTR_FATTR_V3)
> entry->d_type = nfs_umode_to_dtype(entry->fattr->mode);
>
> + if (entry->fattr->fileid != entry->ino) {
> + entry->fattr->mounted_on_fileid = entry->ino;
> + entry->fattr->valid |= NFS_ATTR_FATTR_MOUNTED_ON_FILEID;
> + }
> +
> /* In fact, a post_op_fh3: */
> p = xdr_inline_decode(xdr, 4);
> if (unlikely(p == NULL))


I like this!

Reviewed-by: NeilBrown <[email protected]>

if you like.


Thanks,
NeilBrown


Attachments:
(No filename) (811.00 B)
OpenPGP digital signature

2015-02-24 03:09:11

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH 1/3] NFS: Don't invalidate a submounted dentry in nfs_prime_dcache()

On Mon, Feb 23, 2015 at 7:14 PM, Trond Myklebust
<[email protected]> wrote:
> If we're traversing a directory which contains a submounted filesystem,
> or one that has a referral, the NFS server that is processing the READDIR
> request will often return information for the underlying (mounted-on)
> directory. It may, or may not, also return filehandle information.
>
> If this happens, and the lookup in nfs_prime_dcache() returns the
> dentry for the submounted directory, the filehandle comparison will
> fail, and we call d_invalidate(). Post-commit 8ed936b5671bf
> ("vfs: Lazily remove mounts on unlinked files and directories."), this
> means the entire subtree is unmounted.
>
> The following minimal patch addresses this problem by punting on
> the invalidation if there is a submount.
>
> Kudos to Neil Brown <[email protected]> for having tracked down this
> issue (see link).
>
> Reported-by: Nix <[email protected]>
> Link: http://lkml.kernel.org/r/[email protected]
> Cc: [email protected] # 3.18+
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> fs/nfs/dir.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index 9b0c55cb2a2e..0da617a61c0b 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -479,6 +479,10 @@ void nfs_prime_dcache(struct dentry *parent, struct nfs_entry *entry)
>
> dentry = d_lookup(parent, &filename);
> if (dentry != NULL) {
> + /* Is there a mountpoint here? If so, just exit */
> + if (!nfs_fsid_equal(&NFS_SB(dentry->d_sb)->fsid,
> + &entry->fattr->fsid))
> + goto out;
> if (nfs_same_file(dentry, entry)) {
> nfs_set_verifier(dentry, nfs_save_change_attribute(dir));
> status = nfs_refresh_inode(dentry->d_inode, entry->fattr);
> --

...and this of course needs the test for NFS_ATTR_FATTR_FSID from
3/3.... I've updated.



--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]

2015-02-24 21:49:59

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH 1/3] NFS: Don't invalidate a submounted dentry in nfs_prime_dcache()

On Mon, 23 Feb 2015 22:09:09 -0500 Trond Myklebust
<[email protected]> wrote:

> On Mon, Feb 23, 2015 at 7:14 PM, Trond Myklebust
> <[email protected]> wrote:
> > If we're traversing a directory which contains a submounted filesystem,
> > or one that has a referral, the NFS server that is processing the READDIR
> > request will often return information for the underlying (mounted-on)
> > directory. It may, or may not, also return filehandle information.
> >
> > If this happens, and the lookup in nfs_prime_dcache() returns the
> > dentry for the submounted directory, the filehandle comparison will
> > fail, and we call d_invalidate(). Post-commit 8ed936b5671bf
> > ("vfs: Lazily remove mounts on unlinked files and directories."), this
> > means the entire subtree is unmounted.
> >
> > The following minimal patch addresses this problem by punting on
> > the invalidation if there is a submount.
> >
> > Kudos to Neil Brown <[email protected]> for having tracked down this
> > issue (see link).
> >
> > Reported-by: Nix <[email protected]>
> > Link: http://lkml.kernel.org/r/[email protected]
> > Cc: [email protected] # 3.18+
> > Signed-off-by: Trond Myklebust <[email protected]>
> > ---
> > fs/nfs/dir.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> > index 9b0c55cb2a2e..0da617a61c0b 100644
> > --- a/fs/nfs/dir.c
> > +++ b/fs/nfs/dir.c
> > @@ -479,6 +479,10 @@ void nfs_prime_dcache(struct dentry *parent, struct nfs_entry *entry)
> >
> > dentry = d_lookup(parent, &filename);
> > if (dentry != NULL) {
> > + /* Is there a mountpoint here? If so, just exit */
> > + if (!nfs_fsid_equal(&NFS_SB(dentry->d_sb)->fsid,
> > + &entry->fattr->fsid))
> > + goto out;
> > if (nfs_same_file(dentry, entry)) {
> > nfs_set_verifier(dentry, nfs_save_change_attribute(dir));
> > status = nfs_refresh_inode(dentry->d_inode, entry->fattr);
> > --
>
> ...and this of course needs the test for NFS_ATTR_FATTR_FSID from
> 3/3.... I've updated.
>
>
>

Sorry ... I didn't see this before my earlier reply...

What exactly do you do if NFS_ATTR_FATTR_FSID isn't set?
Hopefully you "goto out".

Thanks,
NeilBrown


Attachments:
(No filename) (811.00 B)
OpenPGP digital signature

2015-02-24 21:44:15

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH 1/3] NFS: Don't invalidate a submounted dentry in nfs_prime_dcache()

On Mon, 23 Feb 2015 19:14:55 -0500 Trond Myklebust
<[email protected]> wrote:

> If we're traversing a directory which contains a submounted filesystem,
> or one that has a referral, the NFS server that is processing the READDIR
> request will often return information for the underlying (mounted-on)
> directory. It may, or may not, also return filehandle information.
>
> If this happens, and the lookup in nfs_prime_dcache() returns the
> dentry for the submounted directory, the filehandle comparison will
> fail, and we call d_invalidate(). Post-commit 8ed936b5671bf
> ("vfs: Lazily remove mounts on unlinked files and directories."), this
> means the entire subtree is unmounted.
>
> The following minimal patch addresses this problem by punting on
> the invalidation if there is a submount.
>
> Kudos to Neil Brown <[email protected]> for having tracked down this
> issue (see link).
>
> Reported-by: Nix <[email protected]>
> Link: http://lkml.kernel.org/r/[email protected]
> Cc: [email protected] # 3.18+
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> fs/nfs/dir.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index 9b0c55cb2a2e..0da617a61c0b 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -479,6 +479,10 @@ void nfs_prime_dcache(struct dentry *parent, struct nfs_entry *entry)
>
> dentry = d_lookup(parent, &filename);
> if (dentry != NULL) {
> + /* Is there a mountpoint here? If so, just exit */
> + if (!nfs_fsid_equal(&NFS_SB(dentry->d_sb)->fsid,
> + &entry->fattr->fsid))
> + goto out;

Surely we should only consider fattr->fsid if NFS_ATTR_FATTR_FSID is set.
In the case of the Linux NFSv3 server, if this were a mountpoint on the
server, then NFS_ATTR_FATTR_FSID will not be set (as the server returns
neither postop attrs nor filehandle).

I realise that this issue is addressed in the subsequent patch (3/3), but
that isn't tagged for -stable, and this is.

NeilBrown



> if (nfs_same_file(dentry, entry)) {
> nfs_set_verifier(dentry, nfs_save_change_attribute(dir));
> status = nfs_refresh_inode(dentry->d_inode, entry->fattr);


Attachments:
(No filename) (811.00 B)
OpenPGP digital signature

2015-02-25 00:17:46

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH 1/3] NFS: Don't invalidate a submounted dentry in nfs_prime_dcache()

On Tue, Feb 24, 2015 at 4:49 PM, NeilBrown <[email protected]> wrote:
> On Mon, 23 Feb 2015 22:09:09 -0500 Trond Myklebust
> <[email protected]> wrote:
>
>> On Mon, Feb 23, 2015 at 7:14 PM, Trond Myklebust
>> <[email protected]> wrote:
>> > If we're traversing a directory which contains a submounted filesystem,
>> > or one that has a referral, the NFS server that is processing the READDIR
>> > request will often return information for the underlying (mounted-on)
>> > directory. It may, or may not, also return filehandle information.
>> >
>> > If this happens, and the lookup in nfs_prime_dcache() returns the
>> > dentry for the submounted directory, the filehandle comparison will
>> > fail, and we call d_invalidate(). Post-commit 8ed936b5671bf
>> > ("vfs: Lazily remove mounts on unlinked files and directories."), this
>> > means the entire subtree is unmounted.
>> >
>> > The following minimal patch addresses this problem by punting on
>> > the invalidation if there is a submount.
>> >
>> > Kudos to Neil Brown <[email protected]> for having tracked down this
>> > issue (see link).
>> >
>> > Reported-by: Nix <[email protected]>
>> > Link: http://lkml.kernel.org/r/[email protected]
>> > Cc: [email protected] # 3.18+
>> > Signed-off-by: Trond Myklebust <[email protected]>
>> > ---
>> > fs/nfs/dir.c | 4 ++++
>> > 1 file changed, 4 insertions(+)
>> >
>> > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
>> > index 9b0c55cb2a2e..0da617a61c0b 100644
>> > --- a/fs/nfs/dir.c
>> > +++ b/fs/nfs/dir.c
>> > @@ -479,6 +479,10 @@ void nfs_prime_dcache(struct dentry *parent, struct nfs_entry *entry)
>> >
>> > dentry = d_lookup(parent, &filename);
>> > if (dentry != NULL) {
>> > + /* Is there a mountpoint here? If so, just exit */
>> > + if (!nfs_fsid_equal(&NFS_SB(dentry->d_sb)->fsid,
>> > + &entry->fattr->fsid))
>> > + goto out;
>> > if (nfs_same_file(dentry, entry)) {
>> > nfs_set_verifier(dentry, nfs_save_change_attribute(dir));
>> > status = nfs_refresh_inode(dentry->d_inode, entry->fattr);
>> > --
>>
>> ...and this of course needs the test for NFS_ATTR_FATTR_FSID from
>> 3/3.... I've updated.
>>
>>
>>
>
> Sorry ... I didn't see this before my earlier reply...
>
> What exactly do you do if NFS_ATTR_FATTR_FSID isn't set?
> Hopefully you "goto out".
>

Yes. The test is made immediately upon entering the function, and so a
simple 'return' is sufficient.



--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]

2015-03-12 23:15:14

by Nix

[permalink] [raw]
Subject: Re: [PATCH 1/3] NFS: Don't invalidate a submounted dentry in nfs_prime_dcache()

On 24 Feb 2015, Trond Myklebust stated:

> If we're traversing a directory which contains a submounted filesystem,
> or one that has a referral, the NFS server that is processing the READDIR
> request will often return information for the underlying (mounted-on)
> directory. It may, or may not, also return filehandle information.

FWIW, this never seems to have made it into 3.19.x, nor is it in the
stable queue. Was this intentional?

> Cc: [email protected] # 3.18+

(... hm, I guess so...)

--
NULL && (void)