2022-09-08 22:34:18

by Chuck Lever III

[permalink] [raw]
Subject: [PATCH v4 0/8] Wait for DELEGRETURN before returning NFS4ERR_DELAY

This patch series adds logic to the Linux NFS server to make it wait
a few moments for clients to return a delegation before replying
with NFS4ERR_DELAY. There are two main benefits:

- It improves responsiveness when a delegated file is accessed from
another client, and
- It removes an unnecessary latency if a client has neglected to
return a delegation before attempting a RENAME or SETATTR

NFS4ERR_DELAY during NFSv4 OPEN is still not handled. However, I'm
comfortable merging the infrastructure in this series and support
for using it in SETATTR, RENAME, and REMOVE now, and then handling
OPEN at a later time.

This series applies against v6.0-rc4.

Changes since v3:
- Also handle JUKEBOX when an NFSv3 operation triggers a CB_RECALL

Changes since v2:
- Wake immediately after server receives DELEGRETURN
- More tracepoint improvements

Changes since RFC:
- Eliminate spurious console messages on the server
- Wait for DELEGRETURN for RENAME operations
- Add CB done tracepoints
- Rework the retry loop

---

Chuck Lever (8):
NFSD: Replace dprintk() call site in fh_verify()
NFSD: Trace NFSv4 COMPOUND tags
NFSD: Add tracepoints to report NFSv4 callback completions
NFSD: Add a mechanism to wait for a DELEGRETURN
NFSD: Refactor nfsd_setattr()
NFSD: Make nfsd4_setattr() wait before returning NFS4ERR_DELAY
NFSD: Make nfsd4_rename() wait before returning NFS4ERR_DELAY
NFSD: Make nfsd4_remove() wait before returning NFS4ERR_DELAY


fs/nfsd/nfs4layouts.c | 2 +-
fs/nfsd/nfs4proc.c | 6 +-
fs/nfsd/nfs4state.c | 34 +++++++++++
fs/nfsd/nfsd.h | 7 +++
fs/nfsd/nfsfh.c | 8 +--
fs/nfsd/trace.h | 131 ++++++++++++++++++++++++++++++++++++++----
fs/nfsd/vfs.c | 100 ++++++++++++++++++++------------
7 files changed, 233 insertions(+), 55 deletions(-)

--
Chuck Lever


2022-09-08 22:34:30

by Chuck Lever III

[permalink] [raw]
Subject: [PATCH v4 4/8] NFSD: Add a mechanism to wait for a DELEGRETURN

Subsequent patches will use this mechanism to wake up an operation
that is waiting for a client to return a delegation.

The new tracepoint records whether the wait timed out or was
properly awoken by the expected DELEGRETURN:

nfsd-1155 [002] 83799.493199: nfsd_delegret_wakeup: xid=0x14b7d6ef fh_hash=0xf6826792 (timed out)

Suggested-by: Jeff Layton <[email protected]>
Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfsd/nfs4state.c | 30 ++++++++++++++++++++++++++++++
fs/nfsd/nfsd.h | 7 +++++++
fs/nfsd/trace.h | 23 +++++++++++++++++++++++
3 files changed, 60 insertions(+)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 561f3556b1d2..54bc70427ce3 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4717,6 +4717,35 @@ nfs4_share_conflict(struct svc_fh *current_fh, unsigned int deny_type)
return ret;
}

+static bool nfsd4_deleg_present(const struct inode *inode)
+{
+ struct file_lock_context *ctx = smp_load_acquire(&inode->i_flctx);
+
+ return ctx && !list_empty_careful(&ctx->flc_lease);
+}
+
+/**
+ * nfsd_wait_for_delegreturn - wait for delegations to be returned
+ * @rqstp: the RPC transaction being executed
+ * @inode: in-core inode of the file being waited for
+ *
+ * The timeout prevents deadlock if all nfsd threads happen to be
+ * tied up waiting for returning delegations.
+ *
+ * Return values:
+ * %true: delegation was returned
+ * %false: timed out waiting for delegreturn
+ */
+bool nfsd_wait_for_delegreturn(struct svc_rqst *rqstp, struct inode *inode)
+{
+ long __maybe_unused timeo;
+
+ timeo = wait_var_event_timeout(inode, !nfsd4_deleg_present(inode),
+ NFSD_DELEGRETURN_TIMEOUT);
+ trace_nfsd_delegret_wakeup(rqstp, inode, timeo);
+ return timeo > 0;
+}
+
static void nfsd4_cb_recall_prepare(struct nfsd4_callback *cb)
{
struct nfs4_delegation *dp = cb_to_delegation(cb);
@@ -6779,6 +6808,7 @@ nfsd4_delegreturn(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
if (status)
goto put_stateid;

+ wake_up_var(d_inode(cstate->current_fh.fh_dentry));
destroy_delegation(dp);
put_stateid:
nfs4_put_stid(&dp->dl_stid);
diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
index 57a468ed85c3..6ab4ad41ae84 100644
--- a/fs/nfsd/nfsd.h
+++ b/fs/nfsd/nfsd.h
@@ -164,6 +164,7 @@ char * nfs4_recoverydir(void);
bool nfsd4_spo_must_allow(struct svc_rqst *rqstp);
int nfsd4_create_laundry_wq(void);
void nfsd4_destroy_laundry_wq(void);
+bool nfsd_wait_for_delegreturn(struct svc_rqst *rqstp, struct inode *inode);
#else
static inline int nfsd4_init_slabs(void) { return 0; }
static inline void nfsd4_free_slabs(void) { }
@@ -179,6 +180,11 @@ static inline bool nfsd4_spo_must_allow(struct svc_rqst *rqstp)
}
static inline int nfsd4_create_laundry_wq(void) { return 0; };
static inline void nfsd4_destroy_laundry_wq(void) {};
+static inline bool nfsd_wait_for_delegreturn(struct svc_rqst *rqstp,
+ struct inode *inode)
+{
+ return false;
+}
#endif

/*
@@ -343,6 +349,7 @@ void nfsd_lockd_shutdown(void);
#define NFSD_COURTESY_CLIENT_TIMEOUT (24 * 60 * 60) /* seconds */
#define NFSD_CLIENT_MAX_TRIM_PER_RUN 128
#define NFS4_CLIENTS_PER_GB 1024
+#define NFSD_DELEGRETURN_TIMEOUT (HZ / 34) /* 30ms */

/*
* The following attributes are currently not supported by the NFSv4 server:
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index ec8e08315779..06a96e955bd0 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -538,6 +538,29 @@ DEFINE_NFSD_COPY_ERR_EVENT(clone_file_range_err);
#include "filecache.h"
#include "vfs.h"

+TRACE_EVENT(nfsd_delegret_wakeup,
+ TP_PROTO(
+ const struct svc_rqst *rqstp,
+ const struct inode *inode,
+ long timeo
+ ),
+ TP_ARGS(rqstp, inode, timeo),
+ TP_STRUCT__entry(
+ __field(u32, xid)
+ __field(const void *, inode)
+ __field(long, timeo)
+ ),
+ TP_fast_assign(
+ __entry->xid = be32_to_cpu(rqstp->rq_xid);
+ __entry->inode = inode;
+ __entry->timeo = timeo;
+ ),
+ TP_printk("xid=0x%08x inode=%p%s",
+ __entry->xid, __entry->inode,
+ __entry->timeo == 0 ? " (timed out)" : ""
+ )
+);
+
DECLARE_EVENT_CLASS(nfsd_stateid_class,
TP_PROTO(stateid_t *stp),
TP_ARGS(stp),


2022-09-08 22:38:19

by Chuck Lever III

[permalink] [raw]
Subject: [PATCH v4 5/8] NFSD: Refactor nfsd_setattr()

Move code that will be retried (in a subsequent patch) into a helper
function.

Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfsd/vfs.c | 74 ++++++++++++++++++++++++++++++---------------------------
1 file changed, 39 insertions(+), 35 deletions(-)

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 9f486b788ed0..02f31d8c727a 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -339,6 +339,44 @@ nfsd_get_write_access(struct svc_rqst *rqstp, struct svc_fh *fhp,
return nfserrno(get_write_access(inode));
}

+static int __nfsd_setattr(struct dentry *dentry, struct iattr *iap)
+{
+ int host_err;
+
+ if (iap->ia_valid & ATTR_SIZE) {
+ /*
+ * RFC5661, Section 18.30.4:
+ * Changing the size of a file with SETATTR indirectly
+ * changes the time_modify and change attributes.
+ *
+ * (and similar for the older RFCs)
+ */
+ struct iattr size_attr = {
+ .ia_valid = ATTR_SIZE | ATTR_CTIME | ATTR_MTIME,
+ .ia_size = iap->ia_size,
+ };
+
+ if (iap->ia_size < 0)
+ return -EFBIG;
+
+ host_err = notify_change(&init_user_ns, dentry, &size_attr, NULL);
+ if (host_err)
+ return host_err;
+ iap->ia_valid &= ~ATTR_SIZE;
+
+ /*
+ * Avoid the additional setattr call below if the only other
+ * attribute that the client sends is the mtime, as we update
+ * it as part of the size change above.
+ */
+ if ((iap->ia_valid & ~ATTR_MTIME) == 0)
+ return 0;
+ }
+
+ iap->ia_valid |= ATTR_CTIME;
+ return notify_change(&init_user_ns, dentry, iap, NULL);
+}
+
/*
* Set various file attributes. After this call fhp needs an fh_put.
*/
@@ -417,41 +455,7 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp,
}

inode_lock(inode);
- if (size_change) {
- /*
- * RFC5661, Section 18.30.4:
- * Changing the size of a file with SETATTR indirectly
- * changes the time_modify and change attributes.
- *
- * (and similar for the older RFCs)
- */
- struct iattr size_attr = {
- .ia_valid = ATTR_SIZE | ATTR_CTIME | ATTR_MTIME,
- .ia_size = iap->ia_size,
- };
-
- host_err = -EFBIG;
- if (iap->ia_size < 0)
- goto out_unlock;
-
- host_err = notify_change(&init_user_ns, dentry, &size_attr, NULL);
- if (host_err)
- goto out_unlock;
- iap->ia_valid &= ~ATTR_SIZE;
-
- /*
- * Avoid the additional setattr call below if the only other
- * attribute that the client sends is the mtime, as we update
- * it as part of the size change above.
- */
- if ((iap->ia_valid & ~ATTR_MTIME) == 0)
- goto out_unlock;
- }
-
- iap->ia_valid |= ATTR_CTIME;
- host_err = notify_change(&init_user_ns, dentry, iap, NULL);
-
-out_unlock:
+ host_err = __nfsd_setattr(dentry, iap);
if (attr->na_seclabel && attr->na_seclabel->len)
attr->na_labelerr = security_inode_setsecctx(dentry,
attr->na_seclabel->data, attr->na_seclabel->len);


2022-09-12 16:24:50

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v4 0/8] Wait for DELEGRETURN before returning NFS4ERR_DELAY

On Thu, 2022-09-08 at 18:13 -0400, Chuck Lever wrote:
> This patch series adds logic to the Linux NFS server to make it wait
> a few moments for clients to return a delegation before replying
> with NFS4ERR_DELAY. There are two main benefits:
>
> - It improves responsiveness when a delegated file is accessed from
> another client, and
> - It removes an unnecessary latency if a client has neglected to
> return a delegation before attempting a RENAME or SETATTR
>
> NFS4ERR_DELAY during NFSv4 OPEN is still not handled. However, I'm
> comfortable merging the infrastructure in this series and support
> for using it in SETATTR, RENAME, and REMOVE now, and then handling
> OPEN at a later time.
>
> This series applies against v6.0-rc4.
>
> Changes since v3:
> - Also handle JUKEBOX when an NFSv3 operation triggers a CB_RECALL
>
> Changes since v2:
> - Wake immediately after server receives DELEGRETURN
> - More tracepoint improvements
>
> Changes since RFC:
> - Eliminate spurious console messages on the server
> - Wait for DELEGRETURN for RENAME operations
> - Add CB done tracepoints
> - Rework the retry loop
>
> ---
>
> Chuck Lever (8):
> NFSD: Replace dprintk() call site in fh_verify()
> NFSD: Trace NFSv4 COMPOUND tags
> NFSD: Add tracepoints to report NFSv4 callback completions
> NFSD: Add a mechanism to wait for a DELEGRETURN
> NFSD: Refactor nfsd_setattr()
> NFSD: Make nfsd4_setattr() wait before returning NFS4ERR_DELAY
> NFSD: Make nfsd4_rename() wait before returning NFS4ERR_DELAY
> NFSD: Make nfsd4_remove() wait before returning NFS4ERR_DELAY
>
>
> fs/nfsd/nfs4layouts.c | 2 +-
> fs/nfsd/nfs4proc.c | 6 +-
> fs/nfsd/nfs4state.c | 34 +++++++++++
> fs/nfsd/nfsd.h | 7 +++
> fs/nfsd/nfsfh.c | 8 +--
> fs/nfsd/trace.h | 131 ++++++++++++++++++++++++++++++++++++++----
> fs/nfsd/vfs.c | 100 ++++++++++++++++++++------------
> 7 files changed, 233 insertions(+), 55 deletions(-)
>
> --
> Chuck Lever
>

Nice work, Chuck. These all look reasonable to me.

Reviewed-by: Jeff Layton <[email protected]>