2018-03-01 00:25:39

by Eric Rannaud

[permalink] [raw]
Subject: [PATCH] nfs: nfs_fileid_valid() is wrong on NFSv3 and incomplete on NFSv4

On NFSv3, in normal circumstances, NFS_ATTR_FATTR_MOUNTED_ON_FILEID is
not set on fattr->valid. It is only set in nfs3_decode_dirent() if
entry->fattr->fileid != entry->ino, which typically only happens for
mountpoints. (See 1ae04b252351 NFSv3: Use the readdir fileid as the
mounted-on-fileid.)

nfs_fileid_valid() returns 'ret1 || ret2', initializing both ret1 and
ret2 to true, but sets them only conditionally. Therefore, in normal
circumstances, nfs_fileid_valid() wrongly returns 'false || true' when
the fileid has changed.

There is another bug in this function for NFSv4: a server may return a
fattr with NFS_ATTR_FATTR_FILEID|NFS_ATTR_FATTR_MOUNTED_ON_FILEID.
nfs_fileid_valid() will fail to detect a change in fileid in the case
where the fattr->fileid happens to match, by chance, the stored
nfsi->fileid (copied from a previous fattr->mounted_on_fileid).

If this is a mountpoint, the current fattr->fileid and the old
fattr->mounted_on_fileid come from different filesystems and may well
happen to be identical.

The condition should be written as an if / else if:

if (fattr->valid & NFS_ATTR_FATTR_MOUNTED_ON_FILEID)
else if (fattr->valid & NFS_ATTR_FATTR_FILEID)

as is done elsewhere in NFSv4 code.
---
fs/nfs/inode.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 7d893543cf3b..e1ad5ec75795 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -1717,13 +1717,11 @@ EXPORT_SYMBOL_GPL(nfs_post_op_update_inode_force_wcc);
static inline bool nfs_fileid_valid(struct nfs_inode *nfsi,
struct nfs_fattr *fattr)
{
- bool ret1 = true, ret2 = true;
-
- if (fattr->valid & NFS_ATTR_FATTR_FILEID)
- ret1 = (nfsi->fileid == fattr->fileid);
if (fattr->valid & NFS_ATTR_FATTR_MOUNTED_ON_FILEID)
- ret2 = (nfsi->fileid == fattr->mounted_on_fileid);
- return ret1 || ret2;
+ return nfsi->fileid == fattr->mounted_on_fileid;
+ else if (fattr->valid & NFS_ATTR_FATTR_FILEID)
+ return nfsi->fileid == fattr->fileid;
+ return true;
}

/*
--
2.16.2



2018-03-14 19:37:04

by Anna Schumaker

[permalink] [raw]
Subject: Re: [PATCH] nfs: nfs_fileid_valid() is wrong on NFSv3 and incomplete on NFSv4

Hi Eric,

On 02/28/2018 07:25 PM, Eric Rannaud wrote:
> On NFSv3, in normal circumstances, NFS_ATTR_FATTR_MOUNTED_ON_FILEID is
> not set on fattr->valid. It is only set in nfs3_decode_dirent() if
> entry->fattr->fileid != entry->ino, which typically only happens for
> mountpoints. (See 1ae04b252351 NFSv3: Use the readdir fileid as the
> mounted-on-fileid.)
>
> nfs_fileid_valid() returns 'ret1 || ret2', initializing both ret1 and
> ret2 to true, but sets them only conditionally. Therefore, in normal
> circumstances, nfs_fileid_valid() wrongly returns 'false || true' when
> the fileid has changed.
>
> There is another bug in this function for NFSv4: a server may return a
> fattr with NFS_ATTR_FATTR_FILEID|NFS_ATTR_FATTR_MOUNTED_ON_FILEID.
> nfs_fileid_valid() will fail to detect a change in fileid in the case
> where the fattr->fileid happens to match, by chance, the stored
> nfsi->fileid (copied from a previous fattr->mounted_on_fileid).
>
> If this is a mountpoint, the current fattr->fileid and the old
> fattr->mounted_on_fileid come from different filesystems and may well
> happen to be identical.
>
> The condition should be written as an if / else if:
>
> if (fattr->valid & NFS_ATTR_FATTR_MOUNTED_ON_FILEID)
> else if (fattr->valid & NFS_ATTR_FATTR_FILEID)

We're requesting both of these attributes as part of the READDIR operation, so won't this return the wrong answer in some cases?

I'm running xfstests "quick" tests with NFS v4.1, and my server is exporting an xfs filesystem for both test and scratch directories (they're on the same partition, but different exported directories using the fsid=X export option). I'm seeing the following message fairly frequently in my dmesg log once I apply your patch:

[ 222.093367] NFS: server 192.168.100.215 error: fileid changed
fsid 0:48: expected fileid 0x60, got 0x60

So I'm thinking that the server is replying with both FATTR_MOUNTED_ON_FILEID and FATTR_FILEID, and then returning false in the mounted-on check when the plain fileid should be compared instead.

I wonder if it would make sense to add another field to the nfs inode to keep track of which fileid is being used.

Thoughts?
Anna

>
> as is done elsewhere in NFSv4 code.
> ---
> fs/nfs/inode.c | 10 ++++------
> 1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> index 7d893543cf3b..e1ad5ec75795 100644
> --- a/fs/nfs/inode.c
> +++ b/fs/nfs/inode.c
> @@ -1717,13 +1717,11 @@ EXPORT_SYMBOL_GPL(nfs_post_op_update_inode_force_wcc);
> static inline bool nfs_fileid_valid(struct nfs_inode *nfsi,
> struct nfs_fattr *fattr)
> {
> - bool ret1 = true, ret2 = true;
> -
> - if (fattr->valid & NFS_ATTR_FATTR_FILEID)
> - ret1 = (nfsi->fileid == fattr->fileid);
> if (fattr->valid & NFS_ATTR_FATTR_MOUNTED_ON_FILEID)
> - ret2 = (nfsi->fileid == fattr->mounted_on_fileid);
> - return ret1 || ret2;
> + return nfsi->fileid == fattr->mounted_on_fileid;
> + else if (fattr->valid & NFS_ATTR_FATTR_FILEID)
> + return nfsi->fileid == fattr->fileid;
> + return true;
> }
>
> /*
>

2018-03-17 00:58:20

by Eric Rannaud

[permalink] [raw]
Subject: Re: [PATCH] nfs: nfs_fileid_valid() is wrong on NFSv3 and incomplete on NFSv4

Hi Anna,

On Wed, Mar 14, 2018 at 12:36 PM, Anna Schumaker
<[email protected]> wrote:
>> The condition should be written as an if / else if:
>>
>> if (fattr->valid & NFS_ATTR_FATTR_MOUNTED_ON_FILEID)
>> else if (fattr->valid & NFS_ATTR_FATTR_FILEID)
>
> We're requesting both of these attributes as part of the READDIR operatio=
n, so won't this return the wrong answer in some cases?
>
> I'm running xfstests "quick" tests with NFS v4.1, and my server is export=
ing an xfs filesystem for both test and scratch directories (they're on the=
same partition, but different exported directories using the fsid=3DX expo=
rt option). I'm seeing the following message fairly frequently in my dmesg=
log once I apply your patch:
>
> [ 222.093367] NFS: server 192.168.100.215 error: fileid changed
> fsid 0:48: expected fileid 0x60, got 0x60
>
> So I'm thinking that the server is replying with both FATTR_MOUNTED_ON_FI=
LEID and FATTR_FILEID, and then returning false in the mounted-on check whe=
n the plain fileid should be compared instead.
>
> I wonder if it would make sense to add another field to the nfs inode to =
keep track of which fileid is being used.
>

Thanks for testing. I was able to reproduce by using the following
setup (which I believe is similar enough to yours).

# server
/mnt/xfs/ # mount point of XFS filesystem
/mnt/xfs/test # normal directory
/mnt/xfs/scratch # normal directory
# /etc/exports
/mnt/xfs *(rw,sync,insecure,no_root_squash,fsid=3D0)
/mnt/xfs/test *(rw,sync,insecure,no_root_squash,fsid=3D1)
/mnt/xfs/scratch *(rw,sync,insecure,no_root_squash,fsid=3D2)
# client
mount 10.0.2.2:/test /mnt/test
mount 10.0.2.2:/scratch /mnt/scratch
# dmesg client
NFS: server 10.0.2.2 error: fileid changed
fsid 0:34: expected fileid 0x60, got 0x60

Just mounting both /test and /scratch is enough to get the error message.

0x60 is the root of the XFS filesystem, which is not itself mounted,
but the nfs4 client first gets the export root FH (the
root-of-all-exports) and then follows the export path to the actual
export being mounted. This is why this filehandle gets loaded at all.
On the second mount, the inode for the export root gets revalidated
and we get the error.

I was getting confused as I thought the following invariant held:

fattr->mounted_on_fileid !=3D fattr->fileid # fattr as returned
by the server
<=3D> super_block->fsid !=3D fattr->fsid # which is how nfs_fhget()
chooses whether to use mounted_on_fileid

as the server only returns a different mounted_on_fileid if the target
FH is itself a mount point on the server (leading to a different
fsid).

But that's not true for the export root when it happens to be a mount
point on the server:
- the server does return a different mounted_on_fileid for it,
because it is a mount point;
- but the client sees that its fsid =3D=3D super_block->fsid and does
not use mounted_on_fileid.

So you're correct that we need to more accurately choose whether we
want to compare against fileid or mounted_on_fileid when revalidating.

We could try to write in nfs_fileid_valid() the full logic that is
used elsewhere to decide whether to use fileid or mounted_on_fileid
(assuming we have access to the needed information -- I think so? --
but are all code paths consistent?), or we could remember in the NFS
inode which one was used to set fileid, which is your suggestion (can
we use nfs_inode's flags?).

As an experiment, I implemented your suggestion with a flag in the NFS
inode, and xfstests nfs/quick does not generate that error anymore.

I do think it would be better to have nfs_fileid_valid() be capable of
figuring out on its own which one to compare against, using a
(side-effect free) version of the logic (confusingly) spread across
nfs_attr_check_mountpoint() and nfs_attr_use_mounted_on_fileid(). And
make sure that everybody is using the same helper in the client.
What's your preference?

As an aside, nfs_fhget() overwrites fattr->fileid with
fattr->mounted_on_fileid if it decides to use mounted_on_fileid (so
that nfs_find_actor() and nfs_init_locked() do the right thing) which
to my eye is quite confusing for callers of nfs_fhget() which may be
entitled to think that fattr will be left untouched.

Also, the error message is not printing which of fileid or
mounted_on_fileid we compared against, and we get something asinine
like "expected 0x60, but got 0x60". I'll fix that in the next version
of this patch.

Thanks,
Eric

2018-03-19 19:52:48

by Anna Schumaker

[permalink] [raw]
Subject: Re: [PATCH] nfs: nfs_fileid_valid() is wrong on NFSv3 and incomplete on NFSv4



On 03/16/2018 08:57 PM, Eric Rannaud wrote:
> Hi Anna,
>
> On Wed, Mar 14, 2018 at 12:36 PM, Anna Schumaker
> <[email protected]> wrote:
>>> The condition should be written as an if / else if:
>>>
>>> if (fattr->valid & NFS_ATTR_FATTR_MOUNTED_ON_FILEID)
>>> else if (fattr->valid & NFS_ATTR_FATTR_FILEID)
>>
>> We're requesting both of these attributes as part of the READDIR operation, so won't this return the wrong answer in some cases?
>>
>> I'm running xfstests "quick" tests with NFS v4.1, and my server is exporting an xfs filesystem for both test and scratch directories (they're on the same partition, but different exported directories using the fsid=X export option). I'm seeing the following message fairly frequently in my dmesg log once I apply your patch:
>>
>> [ 222.093367] NFS: server 192.168.100.215 error: fileid changed
>> fsid 0:48: expected fileid 0x60, got 0x60
>>
>> So I'm thinking that the server is replying with both FATTR_MOUNTED_ON_FILEID and FATTR_FILEID, and then returning false in the mounted-on check when the plain fileid should be compared instead.
>>
>> I wonder if it would make sense to add another field to the nfs inode to keep track of which fileid is being used.
>>
>
> Thanks for testing. I was able to reproduce by using the following
> setup (which I believe is similar enough to yours).
>
> # server
> /mnt/xfs/ # mount point of XFS filesystem
> /mnt/xfs/test # normal directory
> /mnt/xfs/scratch # normal directory
> # /etc/exports
> /mnt/xfs *(rw,sync,insecure,no_root_squash,fsid=0)
> /mnt/xfs/test *(rw,sync,insecure,no_root_squash,fsid=1)
> /mnt/xfs/scratch *(rw,sync,insecure,no_root_squash,fsid=2)
> # client
> mount 10.0.2.2:/test /mnt/test
> mount 10.0.2.2:/scratch /mnt/scratch
> # dmesg client
> NFS: server 10.0.2.2 error: fileid changed
> fsid 0:34: expected fileid 0x60, got 0x60
>
> Just mounting both /test and /scratch is enough to get the error message.
>
> 0x60 is the root of the XFS filesystem, which is not itself mounted,
> but the nfs4 client first gets the export root FH (the
> root-of-all-exports) and then follows the export path to the actual
> export being mounted. This is why this filehandle gets loaded at all.
> On the second mount, the inode for the export root gets revalidated
> and we get the error.
>
> I was getting confused as I thought the following invariant held:
>
> fattr->mounted_on_fileid != fattr->fileid # fattr as returned
> by the server
> <=> super_block->fsid != fattr->fsid # which is how nfs_fhget()
> chooses whether to use mounted_on_fileid
>
> as the server only returns a different mounted_on_fileid if the target
> FH is itself a mount point on the server (leading to a different
> fsid).
>
> But that's not true for the export root when it happens to be a mount
> point on the server:
> - the server does return a different mounted_on_fileid for it,
> because it is a mount point;
> - but the client sees that its fsid == super_block->fsid and does
> not use mounted_on_fileid.
>
> So you're correct that we need to more accurately choose whether we
> want to compare against fileid or mounted_on_fileid when revalidating.
>
> We could try to write in nfs_fileid_valid() the full logic that is
> used elsewhere to decide whether to use fileid or mounted_on_fileid
> (assuming we have access to the needed information -- I think so? --
> but are all code paths consistent?), or we could remember in the NFS
> inode which one was used to set fileid, which is your suggestion (can
> we use nfs_inode's flags?).
>
> As an experiment, I implemented your suggestion with a flag in the NFS
> inode, and xfstests nfs/quick does not generate that error anymore.
>
> I do think it would be better to have nfs_fileid_valid() be capable of
> figuring out on its own which one to compare against, using a
> (side-effect free) version of the logic (confusingly) spread across
> nfs_attr_check_mountpoint() and nfs_attr_use_mounted_on_fileid(). And
> make sure that everybody is using the same helper in the client.
> What's your preference?

I wonder if it would be easier in the long run just to add a mounted-on-fileid field to the nfs_inode, removing the need to guess which one we're supposed to compare against. It might also help with the overwriting issue / potential confusion you mention in the next paragraph.

What do you think?
Anna

>
> As an aside, nfs_fhget() overwrites fattr->fileid with
> fattr->mounted_on_fileid if it decides to use mounted_on_fileid (so
> that nfs_find_actor() and nfs_init_locked() do the right thing) which
> to my eye is quite confusing for callers of nfs_fhget() which may be
> entitled to think that fattr will be left untouched.
>
> Also, the error message is not printing which of fileid or
> mounted_on_fileid we compared against, and we get something asinine
> like "expected 0x60, but got 0x60". I'll fix that in the next version
> of this patch.
>
> Thanks,
> Eric
>