From: Trond Myklebust <[email protected]>
Both NFSv3 and NFSv2 generate their change attribute from the ctime
value that was supplied by the server. However the problem is that there
are plenty of servers out there with ctime resolutions of 1ms or worse.
In a modern performance system, this is insufficient when trying to
decide which is the most recent set of attributes when, for instance, a
READ or GETATTR call races with a WRITE or SETATTR.
For this reason, let's revert to labelling the NFSv2/v3 change
attributes as NFS4_CHANGE_TYPE_IS_UNDEFINED. This will ensure we protect
against such races.
Fixes: 7b24dacf0840 ("NFS: Another inode revalidation improvement")
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/inode.c | 4 +++-
fs/nfs/nfs3xdr.c | 2 +-
fs/nfs/proc.c | 2 +-
3 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 4f45281c47cf..0f092ccb0ca1 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -1777,8 +1777,10 @@ static int nfs_inode_finish_partial_attr_update(const struct nfs_fattr *fattr,
NFS_INO_INVALID_BLOCKS | NFS_INO_INVALID_OTHER |
NFS_INO_INVALID_NLINK;
unsigned long cache_validity = NFS_I(inode)->cache_validity;
+ enum nfs4_change_attr_type ctype = NFS_SERVER(inode)->change_attr_type;
- if (!(cache_validity & NFS_INO_INVALID_CHANGE) &&
+ if (ctype != NFS4_CHANGE_TYPE_IS_UNDEFINED &&
+ !(cache_validity & NFS_INO_INVALID_CHANGE) &&
(cache_validity & check_valid) != 0 &&
(fattr->valid & NFS_ATTR_FATTR_CHANGE) != 0 &&
nfs_inode_attrs_cmp_monotonic(fattr, inode) == 0)
diff --git a/fs/nfs/nfs3xdr.c b/fs/nfs/nfs3xdr.c
index e6eca1d7481b..9274c9c5efea 100644
--- a/fs/nfs/nfs3xdr.c
+++ b/fs/nfs/nfs3xdr.c
@@ -2227,7 +2227,7 @@ static int decode_fsinfo3resok(struct xdr_stream *xdr,
/* ignore properties */
result->lease_time = 0;
- result->change_attr_type = NFS4_CHANGE_TYPE_IS_TIME_METADATA;
+ result->change_attr_type = NFS4_CHANGE_TYPE_IS_UNDEFINED;
return 0;
}
diff --git a/fs/nfs/proc.c b/fs/nfs/proc.c
index ea19dbf12301..ecc4e717808c 100644
--- a/fs/nfs/proc.c
+++ b/fs/nfs/proc.c
@@ -91,7 +91,7 @@ nfs_proc_get_root(struct nfs_server *server, struct nfs_fh *fhandle,
info->dtpref = fsinfo.tsize;
info->maxfilesize = 0x7FFFFFFF;
info->lease_time = 0;
- info->change_attr_type = NFS4_CHANGE_TYPE_IS_TIME_METADATA;
+ info->change_attr_type = NFS4_CHANGE_TYPE_IS_UNDEFINED;
return 0;
}
--
2.31.1
> On Sep 27, 2021, at 9:52 AM, [email protected] wrote:
>
> From: Trond Myklebust <[email protected]>
>
> Both NFSv3 and NFSv2 generate their change attribute from the ctime
> value that was supplied by the server. However the problem is that there
> are plenty of servers out there with ctime resolutions of 1ms or worse.
> In a modern performance system, this is insufficient when trying to
> decide which is the most recent set of attributes when, for instance, a
> READ or GETATTR call races with a WRITE or SETATTR.
>
> For this reason, let's revert to labelling the NFSv2/v3 change
> attributes as NFS4_CHANGE_TYPE_IS_UNDEFINED. This will ensure we protect
> against such races.
>
> Fixes: 7b24dacf0840 ("NFS: Another inode revalidation improvement")
> Signed-off-by: Trond Myklebust <[email protected]>
[root@morisot ~]# mount -o vers=3,rdma,sec=sys klimt.ib:/export/tmp /mnt
[root@morisot ~]# (cd /mnt; /home/cel/src/xfstests/ltp/fsx -q -l 262144 -o 65536 -S 191110531 -N 1000000 -R -W fsx_std_nommap)
All 1000000 operations completed A-OK!
[root@morisot ~]# (cd /mnt; /home/cel/src/xfstests/ltp/fsx -q -l 262144 -o 65536 -S 191110531 -N 1000000 -R -W fsx_std_nommap)
All 1000000 operations completed A-OK!
[root@morisot ~]# (cd /mnt; /home/cel/src/xfstests/ltp/fsx -q -l 262144 -o 65536 -S 191110531 -N 1000000 -R -W fsx_std_nommap)
All 1000000 operations completed A-OK!
[root@morisot ~]# umount /mnt
[root@morisot ~]#
Tested-by: Chuck Lever <[email protected]>
> ---
> fs/nfs/inode.c | 4 +++-
> fs/nfs/nfs3xdr.c | 2 +-
> fs/nfs/proc.c | 2 +-
> 3 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> index 4f45281c47cf..0f092ccb0ca1 100644
> --- a/fs/nfs/inode.c
> +++ b/fs/nfs/inode.c
> @@ -1777,8 +1777,10 @@ static int nfs_inode_finish_partial_attr_update(const struct nfs_fattr *fattr,
> NFS_INO_INVALID_BLOCKS | NFS_INO_INVALID_OTHER |
> NFS_INO_INVALID_NLINK;
> unsigned long cache_validity = NFS_I(inode)->cache_validity;
> + enum nfs4_change_attr_type ctype = NFS_SERVER(inode)->change_attr_type;
>
> - if (!(cache_validity & NFS_INO_INVALID_CHANGE) &&
> + if (ctype != NFS4_CHANGE_TYPE_IS_UNDEFINED &&
> + !(cache_validity & NFS_INO_INVALID_CHANGE) &&
> (cache_validity & check_valid) != 0 &&
> (fattr->valid & NFS_ATTR_FATTR_CHANGE) != 0 &&
> nfs_inode_attrs_cmp_monotonic(fattr, inode) == 0)
> diff --git a/fs/nfs/nfs3xdr.c b/fs/nfs/nfs3xdr.c
> index e6eca1d7481b..9274c9c5efea 100644
> --- a/fs/nfs/nfs3xdr.c
> +++ b/fs/nfs/nfs3xdr.c
> @@ -2227,7 +2227,7 @@ static int decode_fsinfo3resok(struct xdr_stream *xdr,
>
> /* ignore properties */
> result->lease_time = 0;
> - result->change_attr_type = NFS4_CHANGE_TYPE_IS_TIME_METADATA;
> + result->change_attr_type = NFS4_CHANGE_TYPE_IS_UNDEFINED;
> return 0;
> }
>
> diff --git a/fs/nfs/proc.c b/fs/nfs/proc.c
> index ea19dbf12301..ecc4e717808c 100644
> --- a/fs/nfs/proc.c
> +++ b/fs/nfs/proc.c
> @@ -91,7 +91,7 @@ nfs_proc_get_root(struct nfs_server *server, struct nfs_fh *fhandle,
> info->dtpref = fsinfo.tsize;
> info->maxfilesize = 0x7FFFFFFF;
> info->lease_time = 0;
> - info->change_attr_type = NFS4_CHANGE_TYPE_IS_TIME_METADATA;
> + info->change_attr_type = NFS4_CHANGE_TYPE_IS_UNDEFINED;
> return 0;
> }
>
> --
> 2.31.1
>
--
Chuck Lever