If a write delegation isn't available, the Linux NFS client uses
a zero-stateid when performing a SETATTR.
NFSv4.0 provides no mechanism for an NFS server to match such a
request to a particular client. It recalls all delegations for that
file, even delegations held by the client issuing the request. If
that client happens to hold a read delegation, the server will
recall it immediately, resulting in an NFS4ERR_DELAY/CB_RECALL/
DELEGRETURN sequence.
Optimize out this pipeline bubble by having the client return any
delegations it may hold on a file before it issues a
SETATTR(zero-stateid) on that file.
Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfs/nfs4proc.c | 2 ++
1 file changed, 2 insertions(+)
Changes since v1:
- Return the delegation only for NFSv4.0 mounts
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index dbd01548335b..bca7245f1e78 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -3314,6 +3314,8 @@ static int _nfs4_do_setattr(struct inode *inode,
goto zero_stateid;
} else {
zero_stateid:
+ if (server->nfs_client->cl_minorversion == 0)
+ nfs4_inode_return_delegation(inode);
nfs4_stateid_copy(&arg->stateid, &zero_stateid);
}
if (delegation_cred)
On Sat, 2020-08-29 at 13:16 -0400, Chuck Lever wrote:
> If a write delegation isn't available, the Linux NFS client uses
> a zero-stateid when performing a SETATTR.
>
> NFSv4.0 provides no mechanism for an NFS server to match such a
> request to a particular client. It recalls all delegations for that
> file, even delegations held by the client issuing the request. If
> that client happens to hold a read delegation, the server will
> recall it immediately, resulting in an NFS4ERR_DELAY/CB_RECALL/
> DELEGRETURN sequence.
>
> Optimize out this pipeline bubble by having the client return any
> delegations it may hold on a file before it issues a
> SETATTR(zero-stateid) on that file.
>
> Signed-off-by: Chuck Lever <[email protected]>
> ---
> fs/nfs/nfs4proc.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> Changes since v1:
> - Return the delegation only for NFSv4.0 mounts
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index dbd01548335b..bca7245f1e78 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -3314,6 +3314,8 @@ static int _nfs4_do_setattr(struct inode
> *inode,
> goto zero_stateid;
> } else {
> zero_stateid:
> + if (server->nfs_client->cl_minorversion == 0)
> + nfs4_inode_return_delegation(inode);
So, the intention is that nfs4_inode_make_writeable() takes care of
this, and in principle it is done in the cases that matter in
nfs4_proc_setattr().
I agree that the zero_stateid case is not currently being taken care
of, but that only matters for the case of truncate. So perhaps we can
just add a single call to nfs4_inode_make_writeable() above the
zero_stateid label instead of adding redundancy for all the other
cases?
> nfs4_stateid_copy(&arg->stateid, &zero_stateid);
> }
> if (delegation_cred)
>
>
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]
> On Aug 29, 2020, at 5:17 PM, Trond Myklebust <[email protected]> wrote:
>
> On Sat, 2020-08-29 at 13:16 -0400, Chuck Lever wrote:
>> If a write delegation isn't available, the Linux NFS client uses
>> a zero-stateid when performing a SETATTR.
>>
>> NFSv4.0 provides no mechanism for an NFS server to match such a
>> request to a particular client. It recalls all delegations for that
>> file, even delegations held by the client issuing the request. If
>> that client happens to hold a read delegation, the server will
>> recall it immediately, resulting in an NFS4ERR_DELAY/CB_RECALL/
>> DELEGRETURN sequence.
>>
>> Optimize out this pipeline bubble by having the client return any
>> delegations it may hold on a file before it issues a
>> SETATTR(zero-stateid) on that file.
>>
>> Signed-off-by: Chuck Lever <[email protected]>
>> ---
>> fs/nfs/nfs4proc.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> Changes since v1:
>> - Return the delegation only for NFSv4.0 mounts
>>
>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>> index dbd01548335b..bca7245f1e78 100644
>> --- a/fs/nfs/nfs4proc.c
>> +++ b/fs/nfs/nfs4proc.c
>> @@ -3314,6 +3314,8 @@ static int _nfs4_do_setattr(struct inode
>> *inode,
>> goto zero_stateid;
>> } else {
>> zero_stateid:
>> + if (server->nfs_client->cl_minorversion == 0)
>> + nfs4_inode_return_delegation(inode);
>
> So, the intention is that nfs4_inode_make_writeable() takes care of
> this, and in principle it is done in the cases that matter in
> nfs4_proc_setattr().
Thanks for pointing out the nfs4_inode_make_writeable call site!
4219 /* Return any delegations if we're going to change ACLs */
4220 if ((sattr->ia_valid & (ATTR_MODE|ATTR_UID|ATTR_GID)) != 0)
4221 nfs4_inode_make_writeable(inode);
> I agree that the zero_stateid case is not currently being taken care
> of, but that only matters for the case of truncate. So perhaps we can
> just add a single call to nfs4_inode_make_writeable() above the
> zero_stateid label instead of adding redundancy for all the other
> cases?
I'm willing to consider other solutions, but something else is going
on here. I've added some instrumentation to nfsd_setattr. It shows
that the iattr mask does not have the SIZE bit set:
nfsd_compound: xid=0x8ffc7b48 opcnt=3
nfsd_compound_status: op=1/3 OP_PUTFH status=0
nfsd_setattr: xid=0x8ffc7b48 fh_hash=0x2aed0c4d valid=ATIME|MTIME|ATIME_SET|MTIME_SET
time_out_leases: fl=0xffff8887006c7ea0 dev=0x0:0x23 ino=0x16d825 fl_blocker=(nil) fl_owner=0xffff88872928e000 fl_flags=FL_DELEG fl_type=F_RDLCK fl_break_time=0 fl_downgrade
_time=0
leases_conflict: conflict 1: lease=0xffff8887006c7ea0 fl_flags=FL_DELEG fl_type=F_RDLCK; breaker=0xffff8887006c6e38 fl_flags=FL_DELEG fl_type=F_WRLCK
leases_conflict: conflict 1: lease=0xffff8887006c7ea0 fl_flags=FL_DELEG fl_type=F_RDLCK; breaker=0xffff8887006c6e38 fl_flags=FL_DELEG fl_type=F_WRLCK
nfsd_deleg_break: client 5f4a926c:cd5044d5 stateid 00063dd9:00000001
break_lease_noblock: fl=0xffff8887006c6e38 dev=0x0:0x23 ino=0x16d825 fl_blocker=(nil) fl_owner=(nil) fl_flags=FL_DELEG fl_type=F_WRLCK fl_break_time=0 fl_downgrade_time=0
nfsd_cb_work: addr=192.168.2.51:35037 client 5f4a926c:cd5044d5 procedure=CB_RECALL
nfsd_compound_status: op=2/3 OP_SETATTR status=10008
--
Chuck Lever