LinuxLists.cc - [PATCH RFC 00/13] NFSD backchannel fixes

2024-01-25 16:28:43

Subject: [PATCH RFC 00/13] NFSD backchannel fixes

The first three patches fix bugs that prevent NFSD's backchannel
from reliably retransmitting after a client reconnects. These fixes
might be appropriate for 6.8-rc.

Following that are some new trace points that might be helpful for
field troubleshooting.

Then there are some minor clean-ups.

I am still testing this series, and there is one msleep() call that
needs some thought. Thoughts, comments, opinions, rotten fruit? You
know the drill.

---

Chuck Lever (13):
NFSD: Reset cb_seq_status after NFS4ERR_DELAY
NFSD: Reschedule CB operations when backchannel rpc_clnt is shut down
NFSD: Retransmit callbacks after client reconnects
NFSD: Add nfsd_seq4_status trace event
NFSD: Replace dprintks in nfsd4_cb_sequence_done()
NFSD: Rename nfsd_cb_state trace point
NFSD: Add callback operation lifetime trace points
SUNRPC: Remove EXPORT_SYMBOL_GPL for svc_process_bc()
NFSD: Remove unused @reason argument
NFSD: Replace comment with lockdep assertion
NFSD: Remove BUG_ON in nfsd4_process_cb_update()
SUNRPC: Remove stale comments
NFSD: Remove redundant cb_seq_status initialization

fs/nfsd/nfs4callback.c | 81 +++++++++++++-------
fs/nfsd/nfs4state.c | 1 +
fs/nfsd/trace.h | 162 ++++++++++++++++++++++++++++++++++++++-
include/trace/misc/nfs.h | 34 ++++++++
net/sunrpc/svc.c | 1 -
net/sunrpc/xprtsock.c | 9 ---
6 files changed, 250 insertions(+), 38 deletions(-)

--
Chuck Lever

2024-01-25 16:28:52

by Chuck Lever

[permalink] [raw]

Subject: [PATCH RFC 01/13] NFSD: Reset cb_seq_status after NFS4ERR_DELAY

From: Chuck Lever <[email protected]>

I noticed that once an NFSv4.1 callback operation gets a
NFS4ERR_DELAY status on CB_SEQUENCE and then the connection is lost,
the callback client loops, resending it indefinitely.

The switch arm in nfsd4_cb_sequence_done() that handles
NFS4ERR_DELAY uses rpc_restart_call() to rearm the RPC state machine
for the retransmit, but that path does not call the rpc_prepare_call
callback again. Thus cb_seq_status is set to -10008 by the first
NFS4ERR_DELAY result, but is never set back to 1 for the retransmits.

nfsd4_cb_sequence_done() thinks it's getting nothing but a
long series of CB_SEQUENCE NFS4ERR_DELAY replies.

Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfsd/nfs4callback.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 926c29879c6a..43b0a34a5d5b 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1178,6 +1178,7 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
ret = false;
break;
case -NFS4ERR_DELAY:
+ cb->cb_seq_status = 1;
if (!rpc_restart_call(task))
goto out;

2024-01-25 16:28:57

by Chuck Lever

[permalink] [raw]

Subject: [PATCH RFC 02/13] NFSD: Reschedule CB operations when backchannel rpc_clnt is shut down

From: Chuck Lever <[email protected]>

As part of managing a client disconnect, NFSD closes down and
replaces the backchannel rpc_clnt.

If a callback operation is pending when the backchannel rpc_clnt is
shut down, currently nfsd4_run_cb_work() just discards that
callback. But there are multiple cases to deal with here:

o The client's lease is getting destroyed. Throw the CB away.

o The client disconnected. It might be forcing a retransmit of
CB operations, or it could have disconnected for other reasons.
Reschedule the CB so it is retransmitted when the client
reconnects.

Since callback operations can now be rescheduled, ensure that
cb_ops->prepare can be called only once by moving the
cb_ops->prepare paragraph down to just before the rpc_call_async()
call.

Fixes: 2bbfed98a4d8 ("nfsd: Fix races between nfsd4_cb_release() and nfsd4_shutdown_callback()")
Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfsd/nfs4callback.c | 26 +++++++++++++++++---------
1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 43b0a34a5d5b..b2844abcb51f 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1375,20 +1375,22 @@ nfsd4_run_cb_work(struct work_struct *work)
struct rpc_clnt *clnt;
int flags;

- if (cb->cb_need_restart) {
- cb->cb_need_restart = false;
- } else {
- if (cb->cb_ops && cb->cb_ops->prepare)
- cb->cb_ops->prepare(cb);
- }
-
if (clp->cl_flags & NFSD4_CLIENT_CB_FLAG_MASK)
nfsd4_process_cb_update(cb);

clnt = clp->cl_cb_client;
if (!clnt) {
- /* Callback channel broken, or client killed; give up: */
- nfsd41_destroy_cb(cb);
+ if (test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags))
+ nfsd41_destroy_cb(cb);
+ else {
+ /*
+ * XXX: Ideally, we would wait for the client to
+ * reconnect, but I haven't figured out how
+ * to do that yet.
+ */
+ msleep(30);
+ nfsd4_queue_cb(cb);
+ }
return;
}

@@ -1401,6 +1403,12 @@ nfsd4_run_cb_work(struct work_struct *work)
return;
}

+ if (cb->cb_need_restart) {
+ cb->cb_need_restart = false;
+ } else {
+ if (cb->cb_ops && cb->cb_ops->prepare)
+ cb->cb_ops->prepare(cb);
+ }
cb->cb_msg.rpc_cred = clp->cl_cb_cred;
flags = clp->cl_minorversion ? RPC_TASK_NOCONNECT : RPC_TASK_SOFTCONN;
rpc_call_async(clnt, &cb->cb_msg, RPC_TASK_SOFT | flags,

2024-01-25 16:29:02

by Chuck Lever

[permalink] [raw]

Subject: [PATCH RFC 03/13] NFSD: Retransmit callbacks after client reconnects

From: Chuck Lever <[email protected]>

NFSv4.1 clients assume that if they disconnect, that will force the
server to resend pending callback operations once a fresh connection
has been established.

Turns out NFSD has not been resending after reconnect.

Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfsd/nfs4callback.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index b2844abcb51f..1ff64efe1f5c 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1171,12 +1171,21 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
break;
case -ESERVERFAULT:
++session->se_cb_seq_nr;
- fallthrough;
+ nfsd4_mark_cb_fault(cb->cb_clp, cb->cb_seq_status);
+ ret = false;
+ break;
case 1:
+ /*
+ * cb_seq_status remains 1 if an RPC Reply was never
+ * received. NFSD can't know if the client processed
+ * the CB_SEQUENCE operation. Ask the client to send a
+ * DESTROY_SESSION to recover.
+ */
+ fallthrough;
case -NFS4ERR_BADSESSION:
nfsd4_mark_cb_fault(cb->cb_clp, cb->cb_seq_status);
ret = false;
- break;
+ goto need_restart;
case -NFS4ERR_DELAY:
cb->cb_seq_status = 1;
if (!rpc_restart_call(task))

2024-01-25 16:29:10

by Chuck Lever

[permalink] [raw]

Subject: [PATCH RFC 04/13] NFSD: Add nfsd_seq4_status trace event

From: Chuck Lever <[email protected]>

Add a trace point that records SEQ4_STATUS flags returned in an
NFSv4.1 SEQUENCE response. SEQ4_STATUS flags report backchannel
issues and changes to lease state to clients. Knowing what the
server is reporting to clients is useful for debugging both
configuration and operational issues in real time.

For example, upcoming patches will enable server administrators to
revoke parts of a client's lease; that revocation is indicated to
the client when a subsequent SEQUENCE operation has one or more
SEQ4_STATUS flags that are set.

Sample trace records:

nfsd-927 [006] 615.581821: nfsd_seq4_status: xid=0x095ded07 sessionid=65a032c3:b7845faf:00000001:00000000 status_flags=BACKCHANNEL_FAULT
nfsd-927 [006] 615.588043: nfsd_seq4_status: xid=0x0a5ded07 sessionid=65a032c3:b7845faf:00000001:00000000 status_flags=BACKCHANNEL_FAULT
nfsd-928 [003] 615.588448: nfsd_seq4_status: xid=0x0b5ded07 sessionid=65a032c3:b7845faf:00000001:00000000 status_flags=BACKCHANNEL_FAULT

Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfsd/nfs4state.c | 1 +
fs/nfsd/trace.h | 35 +++++++++++++++++++++++++++++++++++
2 files changed, 36 insertions(+)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index d7d561b29fb0..3d6e3dcfdee1 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4058,6 +4058,7 @@ nfsd4_sequence(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
}
if (!list_empty(&clp->cl_revoked))
seq->status_flags |= SEQ4_STATUS_RECALLABLE_STATE_REVOKED;
+ trace_nfsd_seq4_status(rqstp, seq);
out_no_session:
if (conn)
free_conn(conn);
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index d1e8cf079b0f..38d11b43779c 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -696,6 +696,41 @@ DEFINE_EVENT(nfsd_stid_class, nfsd_stid_##name, \

DEFINE_STID_EVENT(revoke);

+TRACE_EVENT_CONDITION(nfsd_seq4_status,
+ TP_PROTO(
+ const struct svc_rqst *rqstp,
+ const struct nfsd4_sequence *sequence
+ ),
+ TP_ARGS(rqstp, sequence),
+ TP_CONDITION(sequence->status_flags),
+ TP_STRUCT__entry(
+ __field(unsigned int, netns_ino)
+ __field(u32, xid)
+ __field(u32, cl_boot)
+ __field(u32, cl_id)
+ __field(u32, seqno)
+ __field(u32, reserved)
+ __field(unsigned long, status_flags)
+ ),
+ TP_fast_assign(
+ const struct nfsd4_sessionid *sid =
+ (struct nfsd4_sessionid *)&sequence->sessionid;
+
+ __entry->netns_ino = SVC_NET(rqstp)->ns.inum;
+ __entry->xid = be32_to_cpu(rqstp->rq_xid);
+ __entry->cl_boot = sid->clientid.cl_boot;
+ __entry->cl_id = sid->clientid.cl_id;
+ __entry->seqno = sid->sequence;
+ __entry->reserved = sid->reserved;
+ __entry->status_flags = sequence->status_flags;
+ ),
+ TP_printk("xid=0x%08x sessionid=%08x:%08x:%08x:%08x status_flags=%s",
+ __entry->xid, __entry->cl_boot, __entry->cl_id,
+ __entry->seqno, __entry->reserved,
+ show_nfs4_seq4_status(__entry->status_flags)
+ )
+);
+
DECLARE_EVENT_CLASS(nfsd_clientid_class,
TP_PROTO(const clientid_t *clid),
TP_ARGS(clid),

2024-01-25 16:29:14

by Chuck Lever

[permalink] [raw]

Subject: [PATCH RFC 05/13] NFSD: Replace dprintks in nfsd4_cb_sequence_done()

From: Chuck Lever <[email protected]>

Improve observability of backchannel session operation.

Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfsd/nfs4callback.c | 9 ++---
fs/nfsd/trace.h | 82 ++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 86 insertions(+), 5 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 1ff64efe1f5c..9f5aebeef83c 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1158,6 +1158,8 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
if (!cb->cb_holds_slot)
goto need_restart;

+ /* This is the operation status code for CB_SEQUENCE */
+ trace_nfsd_cb_seq_status(task, cb);
switch (cb->cb_seq_status) {
case 0:
/*
@@ -1203,13 +1205,10 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
break;
default:
nfsd4_mark_cb_fault(cb->cb_clp, cb->cb_seq_status);
- dprintk("%s: unprocessed error %d\n", __func__,
- cb->cb_seq_status);
}
-
nfsd41_cb_release_slot(cb);
- dprintk("%s: freed slot, new seqid=%d\n", __func__,
- clp->cl_cb_session->se_cb_seq_nr);
+
+ trace_nfsd_cb_free_slot(task, cb);

if (RPC_SIGNALLED(task))
goto need_restart;
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index 38d11b43779c..c134c755ae5d 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -9,8 +9,10 @@
#define _NFSD_TRACE_H

#include <linux/tracepoint.h>
+#include <linux/sunrpc/clnt.h>
#include <linux/sunrpc/xprt.h>
#include <trace/misc/nfs.h>
+#include <trace/misc/sunrpc.h>

#include "export.h"
#include "nfsfh.h"
@@ -1440,6 +1442,86 @@ TRACE_EVENT(nfsd_cb_setup_err,
__entry->error)
);

+TRACE_EVENT(nfsd_cb_seq_status,
+ TP_PROTO(
+ const struct rpc_task *task,
+ const struct nfsd4_callback *cb
+ ),
+ TP_ARGS(task, cb),
+ TP_STRUCT__entry(
+ __field(unsigned int, task_id)
+ __field(unsigned int, client_id)
+ __field(u32, cl_boot)
+ __field(u32, cl_id)
+ __field(u32, seqno)
+ __field(u32, reserved)
+ __field(int, tk_status)
+ __field(int, seq_status)
+ ),
+ TP_fast_assign(
+ const struct nfs4_client *clp = cb->cb_clp;
+ const struct nfsd4_session *session = clp->cl_cb_session;
+ const struct nfsd4_sessionid *sid =
+ (struct nfsd4_sessionid *)&session->se_sessionid;
+
+ __entry->task_id = task->tk_pid;
+ __entry->client_id = task->tk_client ?
+ task->tk_client->cl_clid : -1;
+ __entry->cl_boot = sid->clientid.cl_boot;
+ __entry->cl_id = sid->clientid.cl_id;
+ __entry->seqno = sid->sequence;
+ __entry->reserved = sid->reserved;
+ __entry->tk_status = task->tk_status;
+ __entry->seq_status = cb->cb_seq_status;
+ ),
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " sessionid=%08x:%08x:%08x:%08x tk_status=%d seq_status=%d\n",
+ __entry->task_id, __entry->client_id,
+ __entry->cl_boot, __entry->cl_id,
+ __entry->seqno, __entry->reserved,
+ __entry->tk_status, __entry->seq_status
+ )
+);
+
+TRACE_EVENT(nfsd_cb_free_slot,
+ TP_PROTO(
+ const struct rpc_task *task,
+ const struct nfsd4_callback *cb
+ ),
+ TP_ARGS(task, cb),
+ TP_STRUCT__entry(
+ __field(unsigned int, task_id)
+ __field(unsigned int, client_id)
+ __field(u32, cl_boot)
+ __field(u32, cl_id)
+ __field(u32, seqno)
+ __field(u32, reserved)
+ __field(u32, slot_seqno)
+ ),
+ TP_fast_assign(
+ const struct nfs4_client *clp = cb->cb_clp;
+ const struct nfsd4_session *session = clp->cl_cb_session;
+ const struct nfsd4_sessionid *sid =
+ (struct nfsd4_sessionid *)&session->se_sessionid;
+
+ __entry->task_id = task->tk_pid;
+ __entry->client_id = task->tk_client ?
+ task->tk_client->cl_clid : -1;
+ __entry->cl_boot = sid->clientid.cl_boot;
+ __entry->cl_id = sid->clientid.cl_id;
+ __entry->seqno = sid->sequence;
+ __entry->reserved = sid->reserved;
+ __entry->slot_seqno = session->se_cb_seq_nr;
+ ),
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " sessionid=%08x:%08x:%08x:%08x new slot seqno=%u\n",
+ __entry->task_id, __entry->client_id,
+ __entry->cl_boot, __entry->cl_id,
+ __entry->seqno, __entry->reserved,
+ __entry->slot_seqno
+ )
+);
+
TRACE_EVENT_CONDITION(nfsd_cb_recall,
TP_PROTO(
const struct nfs4_stid *stid

2024-01-25 16:29:36

by Chuck Lever

[permalink] [raw]

Subject: [PATCH RFC 06/13] NFSD: Rename nfsd_cb_state trace point

From: Chuck Lever <[email protected]>

Make it clear where backchannel state is updated.

Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfsd/nfs4callback.c | 4 +++-
fs/nfsd/trace.h | 3 ++-
2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 9f5aebeef83c..1c85426830b1 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -999,7 +999,7 @@ static void nfsd4_mark_cb_state(struct nfs4_client *clp, int newstate)
{
if (clp->cl_cb_state != newstate) {
clp->cl_cb_state = newstate;
- trace_nfsd_cb_state(clp);
+ trace_nfsd_cb_new_state(clp);
}
}

@@ -1383,6 +1383,8 @@ nfsd4_run_cb_work(struct work_struct *work)
struct rpc_clnt *clnt;
int flags;

+ trace_nfsd_cb_start(clp);
+
if (clp->cl_flags & NFSD4_CLIENT_CB_FLAG_MASK)
nfsd4_process_cb_update(cb);

diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index c134c755ae5d..6003af2bee33 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -1371,7 +1371,8 @@ DEFINE_EVENT(nfsd_cb_class, nfsd_cb_##name, \
TP_PROTO(const struct nfs4_client *clp), \
TP_ARGS(clp))

-DEFINE_NFSD_CB_EVENT(state);
+DEFINE_NFSD_CB_EVENT(start);
+DEFINE_NFSD_CB_EVENT(new_state);
DEFINE_NFSD_CB_EVENT(probe);
DEFINE_NFSD_CB_EVENT(lost);
DEFINE_NFSD_CB_EVENT(shutdown);

2024-01-25 16:29:52

by Chuck Lever

[permalink] [raw]

Subject: [PATCH RFC 08/13] SUNRPC: Remove EXPORT_SYMBOL_GPL for svc_process_bc()

From: Chuck Lever <[email protected]>

svc_process_bc(), previously known as bc_svc_process(), was
added in commit 4d6bbb6233c9 ("nfs41: Backchannel bc_svc_process()")
but there has never been a call site outside of the sunrpc.ko
module.

Signed-off-by: Chuck Lever <[email protected]>
---
net/sunrpc/svc.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index eb5856e1351d..425c1ca9a772 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -1603,7 +1603,6 @@ void svc_process_bc(struct rpc_rqst *req, struct svc_rqst *rqstp)
WARN_ON_ONCE(atomic_read(&task->tk_count) != 1);
rpc_put_task(task);
}
-EXPORT_SYMBOL_GPL(svc_process_bc);
#endif /* CONFIG_SUNRPC_BACKCHANNEL */

/**

2024-01-25 16:29:55

by Chuck Lever

[permalink] [raw]

Subject: [PATCH RFC 09/13] NFSD: Remove unused @reason argument

From: Chuck Lever <[email protected]>

Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfsd/nfs4callback.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 4d5a6370b92c..1e0f5a0bd804 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -45,7 +45,7 @@

#define NFSDDBG_FACILITY NFSDDBG_PROC

-static void nfsd4_mark_cb_fault(struct nfs4_client *, int reason);
+static void nfsd4_mark_cb_fault(struct nfs4_client *clp);

#define NFSPROC4_CB_NULL 0
#define NFSPROC4_CB_COMPOUND 1
@@ -1004,14 +1004,14 @@ static void nfsd4_mark_cb_state(struct nfs4_client *clp, int newstate)
}
}

-static void nfsd4_mark_cb_down(struct nfs4_client *clp, int reason)
+static void nfsd4_mark_cb_down(struct nfs4_client *clp)
{
if (test_bit(NFSD4_CLIENT_CB_UPDATE, &clp->cl_flags))
return;
nfsd4_mark_cb_state(clp, NFSD4_CB_DOWN);
}

-static void nfsd4_mark_cb_fault(struct nfs4_client *clp, int reason)
+static void nfsd4_mark_cb_fault(struct nfs4_client *clp)
{
if (test_bit(NFSD4_CLIENT_CB_UPDATE, &clp->cl_flags))
return;
@@ -1023,7 +1023,7 @@ static void nfsd4_cb_probe_done(struct rpc_task *task, void *calldata)
struct nfs4_client *clp = container_of(calldata, struct nfs4_client, cl_cb_null);

if (task->tk_status)
- nfsd4_mark_cb_down(clp, task->tk_status);
+ nfsd4_mark_cb_down(clp);
else
nfsd4_mark_cb_state(clp, NFSD4_CB_UP);
}
@@ -1175,7 +1175,7 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
break;
case -ESERVERFAULT:
++session->se_cb_seq_nr;
- nfsd4_mark_cb_fault(cb->cb_clp, cb->cb_seq_status);
+ nfsd4_mark_cb_fault(cb->cb_clp);
ret = false;
break;
case 1:
@@ -1187,7 +1187,7 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
*/
fallthrough;
case -NFS4ERR_BADSESSION:
- nfsd4_mark_cb_fault(cb->cb_clp, cb->cb_seq_status);
+ nfsd4_mark_cb_fault(cb->cb_clp);
ret = false;
goto need_restart;
case -NFS4ERR_DELAY:
@@ -1206,7 +1206,7 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
}
break;
default:
- nfsd4_mark_cb_fault(cb->cb_clp, cb->cb_seq_status);
+ nfsd4_mark_cb_fault(cb->cb_clp);
}
nfsd41_cb_release_slot(cb);

@@ -1252,7 +1252,7 @@ static void nfsd4_cb_done(struct rpc_task *task, void *calldata)
case -EIO:
case -ETIMEDOUT:
case -EACCES:
- nfsd4_mark_cb_down(clp, task->tk_status);
+ nfsd4_mark_cb_down(clp);
}
break;
default:
@@ -1374,7 +1374,7 @@ static void nfsd4_process_cb_update(struct nfsd4_callback *cb)

err = setup_callback_client(clp, &conn, ses);
if (err) {
- nfsd4_mark_cb_down(clp, err);
+ nfsd4_mark_cb_down(clp);
if (c)
svc_xprt_put(c->cn_xprt);
return;

2024-01-25 16:29:56

by Chuck Lever

[permalink] [raw]

Subject: [PATCH RFC 10/13] NFSD: Replace comment with lockdep assertion

From: Chuck Lever <[email protected]>

Convert a code comment into a real assertion.

Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfsd/nfs4callback.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 1e0f5a0bd804..dc308569bc31 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1307,12 +1307,13 @@ void nfsd4_shutdown_callback(struct nfs4_client *clp)
nfsd41_cb_inflight_wait_complete(clp);
}

-/* requires cl_lock: */
static struct nfsd4_conn * __nfsd4_find_backchannel(struct nfs4_client *clp)
{
struct nfsd4_session *s;
struct nfsd4_conn *c;

+ lockdep_assert_held(&clp->cl_lock);
+
list_for_each_entry(s, &clp->cl_sessions, se_perclnt) {
list_for_each_entry(c, &s->se_conns, cn_persession) {
if (c->cn_flags & NFS4_CDFC4_BACK)

2024-01-25 16:29:59

by Chuck Lever

[permalink] [raw]

Subject: [PATCH RFC 11/13] NFSD: Remove BUG_ON in nfsd4_process_cb_update()

From: Chuck Lever <[email protected]>

Don't kill the kworker thread, and don't panic while cl_lock is
held. There's no need for scorching the earth here.

Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfsd/nfs4callback.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index dc308569bc31..46ecb3ec0f8f 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1362,8 +1362,9 @@ static void nfsd4_process_cb_update(struct nfsd4_callback *cb)
* Only serialized callback code is allowed to clear these
* flags; main nfsd code can only set them:
*/
- BUG_ON(!(clp->cl_flags & NFSD4_CLIENT_CB_FLAG_MASK));
+ WARN_ON(!(clp->cl_flags & NFSD4_CLIENT_CB_FLAG_MASK));
clear_bit(NFSD4_CLIENT_CB_UPDATE, &clp->cl_flags);
+
memcpy(&conn, &cb->cb_clp->cl_cb_conn, sizeof(struct nfs4_cb_conn));
c = __nfsd4_find_backchannel(clp);
if (c) {

2024-01-25 16:30:00

by Chuck Lever

[permalink] [raw]

Subject: [PATCH RFC 12/13] SUNRPC: Remove stale comments

From: Chuck Lever <[email protected]>

bc_close() and bc_destroy now do something, so the comments are
no longer correct. Commit 6221f1d9b63f ("SUNRPC: Fix backchannel
RPC soft lockups") should have removed these.

Signed-off-by: Chuck Lever <[email protected]>
---
net/sunrpc/xprtsock.c | 9 ---------
1 file changed, 9 deletions(-)

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 58f3dc8d0d71..d92c13e78a56 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -2987,20 +2987,11 @@ static int bc_send_request(struct rpc_rqst *req)
return len;
}

-/*
- * The close routine. Since this is client initiated, we do nothing
- */
-
static void bc_close(struct rpc_xprt *xprt)
{
xprt_disconnect_done(xprt);
}

-/*
- * The xprt destroy routine. Again, because this connection is client
- * initiated, we do nothing
- */
-
static void bc_destroy(struct rpc_xprt *xprt)
{
dprintk("RPC: bc_destroy xprt %p\n", xprt);

2024-01-25 16:31:10

by Chuck Lever

[permalink] [raw]

Subject: [PATCH RFC 13/13] NFSD: Remove redundant cb_seq_status initialization

From: Chuck Lever <[email protected]>

As far as I can see, setting cb_seq_status in nfsd4_init_cb() is
superfluous because it is set again in nfsd4_cb_prepare().

Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfsd/nfs4callback.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 46ecb3ec0f8f..e72a904f120b 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -1443,7 +1443,6 @@ void nfsd4_init_cb(struct nfsd4_callback *cb, struct nfs4_client *clp,
cb->cb_msg.rpc_resp = cb;
cb->cb_ops = ops;
INIT_WORK(&cb->cb_work, nfsd4_run_cb_work);
- cb->cb_seq_status = 1;
cb->cb_status = 0;
cb->cb_need_restart = false;
cb->cb_holds_slot = false;

2024-01-25 16:33:52

by Chuck Lever

[permalink] [raw]

Subject: [PATCH RFC 07/13] NFSD: Add callback operation lifetime trace points

From: Chuck Lever <[email protected]>

Help observe the flow of callback operations.

bc_shutdown() records exactly when the backchannel RPC client is
destroyed and cl_cb_client is replaced with NULL.

Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfsd/nfs4callback.c | 7 +++++++
fs/nfsd/trace.h | 42 ++++++++++++++++++++++++++++++++++++++++++
include/trace/misc/nfs.h | 34 ++++++++++++++++++++++++++++++++++
3 files changed, 83 insertions(+)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 1c85426830b1..4d5a6370b92c 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -887,6 +887,7 @@ static struct workqueue_struct *callback_wq;

static bool nfsd4_queue_cb(struct nfsd4_callback *cb)
{
+ trace_nfsd_cb_queue(cb->cb_clp, cb);
return queue_work(callback_wq, &cb->cb_work);
}

@@ -1106,6 +1107,7 @@ static void nfsd41_destroy_cb(struct nfsd4_callback *cb)
{
struct nfs4_client *clp = cb->cb_clp;

+ trace_nfsd_cb_destroy(clp, cb);
nfsd41_cb_release_slot(cb);
if (cb->cb_ops && cb->cb_ops->release)
cb->cb_ops->release(cb);
@@ -1220,6 +1222,7 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
goto out;
need_restart:
if (!test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags)) {
+ trace_nfsd_cb_restart(clp, cb);
task->tk_status = 0;
cb->cb_need_restart = true;
}
@@ -1333,11 +1336,14 @@ static void nfsd4_process_cb_update(struct nfsd4_callback *cb)
struct nfsd4_conn *c;
int err;

+ trace_nfsd_cb_bc_update(clp, cb);
+
/*
* This is either an update, or the client dying; in either case,
* kill the old client:
*/
if (clp->cl_cb_client) {
+ trace_nfsd_cb_bc_shutdown(clp, cb);
rpc_shutdown_client(clp->cl_cb_client);
clp->cl_cb_client = NULL;
put_cred(clp->cl_cb_cred);
@@ -1349,6 +1355,7 @@ static void nfsd4_process_cb_update(struct nfsd4_callback *cb)
}
if (test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags))
return;
+
spin_lock(&clp->cl_lock);
/*
* Only serialized callback code is allowed to clear these
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index 6003af2bee33..9f9e58debc26 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -1443,6 +1443,48 @@ TRACE_EVENT(nfsd_cb_setup_err,
__entry->error)
);

+DECLARE_EVENT_CLASS(nfsd_cb_lifetime_class,
+ TP_PROTO(
+ const struct nfs4_client *clp,
+ const struct nfsd4_callback *cb
+ ),
+ TP_ARGS(clp, cb),
+ TP_STRUCT__entry(
+ __field(u32, cl_boot)
+ __field(u32, cl_id)
+ __field(const void *, cb)
+ __field(bool, need_restart)
+ __sockaddr(addr, clp->cl_cb_conn.cb_addrlen)
+ ),
+ TP_fast_assign(
+ __entry->cl_boot = clp->cl_clientid.cl_boot;
+ __entry->cl_id = clp->cl_clientid.cl_id;
+ __entry->cb = cb;
+ __entry->need_restart = cb->cb_need_restart;
+ __assign_sockaddr(addr, &clp->cl_cb_conn.cb_addr,
+ clp->cl_cb_conn.cb_addrlen)
+ ),
+ TP_printk("addr=%pISpc client %08x:%08x cb=%p%s",
+ __get_sockaddr(addr), __entry->cl_boot, __entry->cl_id,
+ __entry->cb, __entry->need_restart ?
+ " (need restart)" : " (first try)"
+ )
+);
+
+#define DEFINE_NFSD_CB_LIFETIME_EVENT(name) \
+DEFINE_EVENT(nfsd_cb_lifetime_class, nfsd_cb_##name, \
+ TP_PROTO( \
+ const struct nfs4_client *clp, \
+ const struct nfsd4_callback *cb \
+ ), \
+ TP_ARGS(clp, cb))
+
+DEFINE_NFSD_CB_LIFETIME_EVENT(queue);
+DEFINE_NFSD_CB_LIFETIME_EVENT(destroy);
+DEFINE_NFSD_CB_LIFETIME_EVENT(restart);
+DEFINE_NFSD_CB_LIFETIME_EVENT(bc_update);
+DEFINE_NFSD_CB_LIFETIME_EVENT(bc_shutdown);
+
TRACE_EVENT(nfsd_cb_seq_status,
TP_PROTO(
const struct rpc_task *task,
diff --git a/include/trace/misc/nfs.h b/include/trace/misc/nfs.h
index 0d9d48dca38a..64ab5dac59ce 100644
--- a/include/trace/misc/nfs.h
+++ b/include/trace/misc/nfs.h
@@ -385,3 +385,37 @@ TRACE_DEFINE_ENUM(IOMODE_ANY);
{ SEQ4_STATUS_RESTART_RECLAIM_NEEDED, "RESTART_RECLAIM_NEEDED" }, \
{ SEQ4_STATUS_CB_PATH_DOWN_SESSION, "CB_PATH_DOWN_SESSION" }, \
{ SEQ4_STATUS_BACKCHANNEL_FAULT, "BACKCHANNEL_FAULT" })
+
+TRACE_DEFINE_ENUM(OP_CB_GETATTR);
+TRACE_DEFINE_ENUM(OP_CB_RECALL);
+TRACE_DEFINE_ENUM(OP_CB_LAYOUTRECALL);
+TRACE_DEFINE_ENUM(OP_CB_NOTIFY);
+TRACE_DEFINE_ENUM(OP_CB_PUSH_DELEG);
+TRACE_DEFINE_ENUM(OP_CB_RECALL_ANY);
+TRACE_DEFINE_ENUM(OP_CB_RECALLABLE_OBJ_AVAIL);
+TRACE_DEFINE_ENUM(OP_CB_RECALL_SLOT);
+TRACE_DEFINE_ENUM(OP_CB_SEQUENCE);
+TRACE_DEFINE_ENUM(OP_CB_WANTS_CANCELLED);
+TRACE_DEFINE_ENUM(OP_CB_NOTIFY_LOCK);
+TRACE_DEFINE_ENUM(OP_CB_NOTIFY_DEVICEID);
+TRACE_DEFINE_ENUM(OP_CB_OFFLOAD);
+TRACE_DEFINE_ENUM(OP_CB_ILLEGAL);
+
+#define show_nfs4_cb_op(x) \
+ __print_symbolic(x, \
+ { 0, "CB_NULL" }, \
+ { 1, "CB_COMPOUND" }, \
+ { OP_CB_GETATTR, "CB_GETATTR" }, \
+ { OP_CB_RECALL, "CB_RECALL" }, \
+ { OP_CB_LAYOUTRECALL, "CB_LAYOUTRECALL" }, \
+ { OP_CB_NOTIFY, "CB_NOTIFY" }, \
+ { OP_CB_PUSH_DELEG, "CB_PUSH_DELEG" }, \
+ { OP_CB_RECALL_ANY, "CB_RECALL_ANY" }, \
+ { OP_CB_RECALLABLE_OBJ_AVAIL, "CB_RECALLABLE_OBJ_AVAIL" }, \
+ { OP_CB_RECALL_SLOT, "CB_RECALL_SLOT" }, \
+ { OP_CB_SEQUENCE, "CB_SEQUENCE" }, \
+ { OP_CB_WANTS_CANCELLED, "CB_WANTS_CANCELLED" }, \
+ { OP_CB_NOTIFY_LOCK, "CB_NOTIFY_LOCK" }, \
+ { OP_CB_NOTIFY_DEVICEID, "CB_NOTIFY_DEVICEID" }, \
+ { OP_CB_OFFLOAD, "CB_OFFLOAD" }, \
+ { OP_CB_ILLEGAL, "CB_ILLEGAL" })

2024-01-25 20:12:51

by Jeffrey Layton

[permalink] [raw]

Subject: Re: [PATCH RFC 01/13] NFSD: Reset cb_seq_status after NFS4ERR_DELAY

On Thu, 2024-01-25 at 11:28 -0500, Chuck Lever wrote:
> From: Chuck Lever <[email protected]>
>
> I noticed that once an NFSv4.1 callback operation gets a
> NFS4ERR_DELAY status on CB_SEQUENCE and then the connection is lost,
> the callback client loops, resending it indefinitely.
>
> The switch arm in nfsd4_cb_sequence_done() that handles
> NFS4ERR_DELAY uses rpc_restart_call() to rearm the RPC state machine
> for the retransmit, but that path does not call the rpc_prepare_call
> callback again. Thus cb_seq_status is set to -10008 by the first
> NFS4ERR_DELAY result, but is never set back to 1 for the retransmits.
>
> nfsd4_cb_sequence_done() thinks it's getting nothing but a
> long series of CB_SEQUENCE NFS4ERR_DELAY replies.
>
> Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
> Signed-off-by: Chuck Lever <[email protected]>
> ---
> fs/nfsd/nfs4callback.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> index 926c29879c6a..43b0a34a5d5b 100644
> --- a/fs/nfsd/nfs4callback.c
> +++ b/fs/nfsd/nfs4callback.c
> @@ -1178,6 +1178,7 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> ret = false;
> break;
> case -NFS4ERR_DELAY:
> + cb->cb_seq_status = 1;
> if (!rpc_restart_call(task))
> goto out;
>
>
>
>

Nice catch!

Reviewed-by: Jeff Layton <[email protected]>

2024-01-25 20:19:46

by Jeffrey Layton

[permalink] [raw]

Subject: Re: [PATCH RFC 02/13] NFSD: Reschedule CB operations when backchannel rpc_clnt is shut down

On Thu, 2024-01-25 at 11:28 -0500, Chuck Lever wrote:
> From: Chuck Lever <[email protected]>
>
> As part of managing a client disconnect, NFSD closes down and
> replaces the backchannel rpc_clnt.
>
> If a callback operation is pending when the backchannel rpc_clnt is
> shut down, currently nfsd4_run_cb_work() just discards that
> callback. But there are multiple cases to deal with here:
>
> o The client's lease is getting destroyed. Throw the CB away.
>
> o The client disconnected. It might be forcing a retransmit of
> CB operations, or it could have disconnected for other reasons.
> Reschedule the CB so it is retransmitted when the client
> reconnects.
>
> Since callback operations can now be rescheduled, ensure that
> cb_ops->prepare can be called only once by moving the
> cb_ops->prepare paragraph down to just before the rpc_call_async()
> call.
>
> Fixes: 2bbfed98a4d8 ("nfsd: Fix races between nfsd4_cb_release() and nfsd4_shutdown_callback()")
> Signed-off-by: Chuck Lever <[email protected]>
> ---
> fs/nfsd/nfs4callback.c | 26 +++++++++++++++++---------
> 1 file changed, 17 insertions(+), 9 deletions(-)
>
> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> index 43b0a34a5d5b..b2844abcb51f 100644
> --- a/fs/nfsd/nfs4callback.c
> +++ b/fs/nfsd/nfs4callback.c
> @@ -1375,20 +1375,22 @@ nfsd4_run_cb_work(struct work_struct *work)
> struct rpc_clnt *clnt;
> int flags;
>
> - if (cb->cb_need_restart) {
> - cb->cb_need_restart = false;
> - } else {
> - if (cb->cb_ops && cb->cb_ops->prepare)
> - cb->cb_ops->prepare(cb);
> - }
> -
> if (clp->cl_flags & NFSD4_CLIENT_CB_FLAG_MASK)
> nfsd4_process_cb_update(cb);
>
> clnt = clp->cl_cb_client;
> if (!clnt) {
> - /* Callback channel broken, or client killed; give up: */
> - nfsd41_destroy_cb(cb);
> + if (test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags))
> + nfsd41_destroy_cb(cb);
> + else {
> + /*
> + * XXX: Ideally, we would wait for the client to
> + * reconnect, but I haven't figured out how
> + * to do that yet.
> + */
> + msleep(30);
> + nfsd4_queue_cb(cb);

It would probably be better to just queue the cb as delayed_work here,
so you don't squat on the workqueue thread. That'll mean changing
cb_work to struct delayed_work, but that should be NBD.

> + }
> return;
> }
>
> @@ -1401,6 +1403,12 @@ nfsd4_run_cb_work(struct work_struct *work)
> return;
> }
>
> + if (cb->cb_need_restart) {
> + cb->cb_need_restart = false;
> + } else {
> + if (cb->cb_ops && cb->cb_ops->prepare)
> + cb->cb_ops->prepare(cb);
> + }
> cb->cb_msg.rpc_cred = clp->cl_cb_cred;
> flags = clp->cl_minorversion ? RPC_TASK_NOCONNECT : RPC_TASK_SOFTCONN;
> rpc_call_async(clnt, &cb->cb_msg, RPC_TASK_SOFT | flags,
>
>
>

--
Jeff Layton <[email protected]>

2024-01-25 20:24:32

by Jeffrey Layton

[permalink] [raw]

Subject: Re: [PATCH RFC 03/13] NFSD: Retransmit callbacks after client reconnects

On Thu, 2024-01-25 at 11:28 -0500, Chuck Lever wrote:
> From: Chuck Lever <[email protected]>
>
> NFSv4.1 clients assume that if they disconnect, that will force the
> server to resend pending callback operations once a fresh connection
> has been established.
>
> Turns out NFSD has not been resending after reconnect.
>
> Fixes: 7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
> Signed-off-by: Chuck Lever <[email protected]>
> ---
> fs/nfsd/nfs4callback.c | 13 +++++++++++--
> 1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> index b2844abcb51f..1ff64efe1f5c 100644
> --- a/fs/nfsd/nfs4callback.c
> +++ b/fs/nfsd/nfs4callback.c
> @@ -1171,12 +1171,21 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> break;
> case -ESERVERFAULT:
> ++session->se_cb_seq_nr;
> - fallthrough;
> + nfsd4_mark_cb_fault(cb->cb_clp, cb->cb_seq_status);
> + ret = false;
> + break;
> case 1:
> + /*
> + * cb_seq_status remains 1 if an RPC Reply was never
> + * received. NFSD can't know if the client processed
> + * the CB_SEQUENCE operation. Ask the client to send a
> + * DESTROY_SESSION to recover.
> + */
> + fallthrough;
> case -NFS4ERR_BADSESSION:
> nfsd4_mark_cb_fault(cb->cb_clp, cb->cb_seq_status);
> ret = false;
> - break;
> + goto need_restart;
> case -NFS4ERR_DELAY:
> cb->cb_seq_status = 1;
> if (!rpc_restart_call(task))
>
>
>

Nice work tracking that down.

Reviewed-by: Jeff Layton <[email protected]>

2024-01-25 20:31:43

by Chuck Lever III

[permalink] [raw]

Subject: Re: [PATCH RFC 02/13] NFSD: Reschedule CB operations when backchannel rpc_clnt is shut down

On Thu, Jan 25, 2024 at 03:19:41PM -0500, Jeff Layton wrote:
> On Thu, 2024-01-25 at 11:28 -0500, Chuck Lever wrote:
> > From: Chuck Lever <[email protected]>
> >
> > As part of managing a client disconnect, NFSD closes down and
> > replaces the backchannel rpc_clnt.
> >
> > If a callback operation is pending when the backchannel rpc_clnt is
> > shut down, currently nfsd4_run_cb_work() just discards that
> > callback. But there are multiple cases to deal with here:
> >
> > o The client's lease is getting destroyed. Throw the CB away.
> >
> > o The client disconnected. It might be forcing a retransmit of
> > CB operations, or it could have disconnected for other reasons.
> > Reschedule the CB so it is retransmitted when the client
> > reconnects.
> >
> > Since callback operations can now be rescheduled, ensure that
> > cb_ops->prepare can be called only once by moving the
> > cb_ops->prepare paragraph down to just before the rpc_call_async()
> > call.
> >
> > Fixes: 2bbfed98a4d8 ("nfsd: Fix races between nfsd4_cb_release() and nfsd4_shutdown_callback()")
> > Signed-off-by: Chuck Lever <[email protected]>
> > ---
> > fs/nfsd/nfs4callback.c | 26 +++++++++++++++++---------
> > 1 file changed, 17 insertions(+), 9 deletions(-)
> >
> > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > index 43b0a34a5d5b..b2844abcb51f 100644
> > --- a/fs/nfsd/nfs4callback.c
> > +++ b/fs/nfsd/nfs4callback.c
> > @@ -1375,20 +1375,22 @@ nfsd4_run_cb_work(struct work_struct *work)
> > struct rpc_clnt *clnt;
> > int flags;
> >
> > - if (cb->cb_need_restart) {
> > - cb->cb_need_restart = false;
> > - } else {
> > - if (cb->cb_ops && cb->cb_ops->prepare)
> > - cb->cb_ops->prepare(cb);
> > - }
> > -
> > if (clp->cl_flags & NFSD4_CLIENT_CB_FLAG_MASK)
> > nfsd4_process_cb_update(cb);
> >
> > clnt = clp->cl_cb_client;
> > if (!clnt) {
> > - /* Callback channel broken, or client killed; give up: */
> > - nfsd41_destroy_cb(cb);
> > + if (test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags))
> > + nfsd41_destroy_cb(cb);
> > + else {
> > + /*
> > + * XXX: Ideally, we would wait for the client to
> > + * reconnect, but I haven't figured out how
> > + * to do that yet.
> > + */
> > + msleep(30);
> > + nfsd4_queue_cb(cb);
>
> It would probably be better to just queue the cb as delayed_work here,
> so you don't squat on the workqueue thread.

You found my placeholder :-)

> That'll mean changing
> cb_work to struct delayed_work, but that should be NBD.

I've looked at that. I wanted to be sure, before going that route,
that there is no obvious way to implement a "wait for client to
reconnect". msleep (or delayed_work) is basically a slow busy wait.

> > + }
> > return;
> > }
> >
> > @@ -1401,6 +1403,12 @@ nfsd4_run_cb_work(struct work_struct *work)
> > return;
> > }
> >
> > + if (cb->cb_need_restart) {
> > + cb->cb_need_restart = false;
> > + } else {
> > + if (cb->cb_ops && cb->cb_ops->prepare)
> > + cb->cb_ops->prepare(cb);
> > + }
> > cb->cb_msg.rpc_cred = clp->cl_cb_cred;
> > flags = clp->cl_minorversion ? RPC_TASK_NOCONNECT : RPC_TASK_SOFTCONN;
> > rpc_call_async(clnt, &cb->cb_msg, RPC_TASK_SOFT | flags,
> >
> >
> >
>
> --
> Jeff Layton <[email protected]>

--
Chuck Lever

2024-01-25 20:41:46

by Jeffrey Layton

[permalink] [raw]

Subject: Re: [PATCH RFC 00/13] NFSD backchannel fixes

On Thu, 2024-01-25 at 11:28 -0500, Chuck Lever wrote:
> The first three patches fix bugs that prevent NFSD's backchannel
> from reliably retransmitting after a client reconnects. These fixes
> might be appropriate for 6.8-rc.
>
> Following that are some new trace points that might be helpful for
> field troubleshooting.
>
> Then there are some minor clean-ups.
>
> I am still testing this series, and there is one msleep() call that
> needs some thought. Thoughts, comments, opinions, rotten fruit? You
> know the drill.
>
> ---
>
> Chuck Lever (13):
> NFSD: Reset cb_seq_status after NFS4ERR_DELAY
> NFSD: Reschedule CB operations when backchannel rpc_clnt is shut down
> NFSD: Retransmit callbacks after client reconnects
> NFSD: Add nfsd_seq4_status trace event
> NFSD: Replace dprintks in nfsd4_cb_sequence_done()
> NFSD: Rename nfsd_cb_state trace point
> NFSD: Add callback operation lifetime trace points
> SUNRPC: Remove EXPORT_SYMBOL_GPL for svc_process_bc()
> NFSD: Remove unused @reason argument
> NFSD: Replace comment with lockdep assertion
> NFSD: Remove BUG_ON in nfsd4_process_cb_update()
> SUNRPC: Remove stale comments
> NFSD: Remove redundant cb_seq_status initialization
>
>
> fs/nfsd/nfs4callback.c | 81 +++++++++++++-------
> fs/nfsd/nfs4state.c | 1 +
> fs/nfsd/trace.h | 162 ++++++++++++++++++++++++++++++++++++++-
> include/trace/misc/nfs.h | 34 ++++++++
> net/sunrpc/svc.c | 1 -
> net/sunrpc/xprtsock.c | 9 ---
> 6 files changed, 250 insertions(+), 38 deletions(-)
>
> --
> Chuck Lever
>
>

Love some of the new backchannel tracepoints. That should be helpful.

You can add this to 4-13:

Reviewed-by: Jeff Layton <[email protected]>

2024-01-25 21:57:06

by Benjamin Coddington

[permalink] [raw]

Subject: Re: [PATCH RFC 07/13] NFSD: Add callback operation lifetime trace points

On 25 Jan 2024, at 11:29, Chuck Lever wrote:

> From: Chuck Lever <[email protected]>
>
> Help observe the flow of callback operations.
>
> bc_shutdown() records exactly when the backchannel RPC client is
> destroyed and cl_cb_client is replaced with NULL.
>
> Signed-off-by: Chuck Lever <[email protected]>
> ---
> fs/nfsd/nfs4callback.c | 7 +++++++
> fs/nfsd/trace.h | 42 ++++++++++++++++++++++++++++++++++++++++++
> include/trace/misc/nfs.h | 34 ++++++++++++++++++++++++++++++++++
> 3 files changed, 83 insertions(+)
>
> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> index 1c85426830b1..4d5a6370b92c 100644
> --- a/fs/nfsd/nfs4callback.c
> +++ b/fs/nfsd/nfs4callback.c
> @@ -887,6 +887,7 @@ static struct workqueue_struct *callback_wq;
>
> static bool nfsd4_queue_cb(struct nfsd4_callback *cb)
> {
> + trace_nfsd_cb_queue(cb->cb_clp, cb);
> return queue_work(callback_wq, &cb->cb_work);
> }
>
> @@ -1106,6 +1107,7 @@ static void nfsd41_destroy_cb(struct nfsd4_callback *cb)
> {
> struct nfs4_client *clp = cb->cb_clp;
>
> + trace_nfsd_cb_destroy(clp, cb);
> nfsd41_cb_release_slot(cb);
> if (cb->cb_ops && cb->cb_ops->release)
> cb->cb_ops->release(cb);
> @@ -1220,6 +1222,7 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> goto out;
> need_restart:
> if (!test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags)) {
> + trace_nfsd_cb_restart(clp, cb);
> task->tk_status = 0;
> cb->cb_need_restart = true;

I think you want to call the tracepoint /after/ setting cb_need_restart here..

> }
> @@ -1333,11 +1336,14 @@ static void nfsd4_process_cb_update(struct nfsd4_callback *cb)
> struct nfsd4_conn *c;
> int err;
>
> + trace_nfsd_cb_bc_update(clp, cb);
> +
> /*
> * This is either an update, or the client dying; in either case,
> * kill the old client:
> */
> if (clp->cl_cb_client) {
> + trace_nfsd_cb_bc_shutdown(clp, cb);
> rpc_shutdown_client(clp->cl_cb_client);
> clp->cl_cb_client = NULL;
> put_cred(clp->cl_cb_cred);
> @@ -1349,6 +1355,7 @@ static void nfsd4_process_cb_update(struct nfsd4_callback *cb)
> }
> if (test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags))
> return;
> +

I'm in favor of this whitespace change, but did you mean to include it?

Ben

2024-01-25 22:09:39

by Benjamin Coddington

[permalink] [raw]

Subject: Re: [PATCH RFC 00/13] NFSD backchannel fixes

On 25 Jan 2024, at 11:28, Chuck Lever wrote:

> The first three patches fix bugs that prevent NFSD's backchannel
> from reliably retransmitting after a client reconnects. These fixes
> might be appropriate for 6.8-rc.
>
> Following that are some new trace points that might be helpful for
> field troubleshooting.
>
> Then there are some minor clean-ups.
>
> I am still testing this series, and there is one msleep() call that
> needs some thought. Thoughts, comments, opinions, rotten fruit? You
> know the drill.
>
> ---
>
> Chuck Lever (13):
> NFSD: Reset cb_seq_status after NFS4ERR_DELAY
> NFSD: Reschedule CB operations when backchannel rpc_clnt is shut down
> NFSD: Retransmit callbacks after client reconnects
> NFSD: Add nfsd_seq4_status trace event
> NFSD: Replace dprintks in nfsd4_cb_sequence_done()
> NFSD: Rename nfsd_cb_state trace point
> NFSD: Add callback operation lifetime trace points
> SUNRPC: Remove EXPORT_SYMBOL_GPL for svc_process_bc()
> NFSD: Remove unused @reason argument
> NFSD: Replace comment with lockdep assertion
> NFSD: Remove BUG_ON in nfsd4_process_cb_update()
> SUNRPC: Remove stale comments
> NFSD: Remove redundant cb_seq_status initialization
>
>
> fs/nfsd/nfs4callback.c | 81 +++++++++++++-------
> fs/nfsd/nfs4state.c | 1 +
> fs/nfsd/trace.h | 162 ++++++++++++++++++++++++++++++++++++++-
> include/trace/misc/nfs.h | 34 ++++++++
> net/sunrpc/svc.c | 1 -
> net/sunrpc/xprtsock.c | 9 ---
> 6 files changed, 250 insertions(+), 38 deletions(-)

These are great, looking forward to see how 02/13 waits for reconnection.
Seems like a wait_on_bit or wait_on_var triggered from nfsd4_init_conn()
would do, but that's just my wild speculation.

Ben

2024-01-26 13:57:25

by Chuck Lever III

[permalink] [raw]

Subject: Re: [PATCH RFC 00/13] NFSD backchannel fixes

> On Jan 25, 2024, at 5:09 PM, Benjamin Coddington <[email protected]> wrote:
>
> On 25 Jan 2024, at 11:28, Chuck Lever wrote:
>
>> The first three patches fix bugs that prevent NFSD's backchannel
>> from reliably retransmitting after a client reconnects. These fixes
>> might be appropriate for 6.8-rc.
>>
>> Following that are some new trace points that might be helpful for
>> field troubleshooting.
>>
>> Then there are some minor clean-ups.
>>
>> I am still testing this series, and there is one msleep() call that
>> needs some thought. Thoughts, comments, opinions, rotten fruit? You
>> know the drill.
>>
>> ---
>>
>> Chuck Lever (13):
>> NFSD: Reset cb_seq_status after NFS4ERR_DELAY
>> NFSD: Reschedule CB operations when backchannel rpc_clnt is shut down
>> NFSD: Retransmit callbacks after client reconnects
>> NFSD: Add nfsd_seq4_status trace event
>> NFSD: Replace dprintks in nfsd4_cb_sequence_done()
>> NFSD: Rename nfsd_cb_state trace point
>> NFSD: Add callback operation lifetime trace points
>> SUNRPC: Remove EXPORT_SYMBOL_GPL for svc_process_bc()
>> NFSD: Remove unused @reason argument
>> NFSD: Replace comment with lockdep assertion
>> NFSD: Remove BUG_ON in nfsd4_process_cb_update()
>> SUNRPC: Remove stale comments
>> NFSD: Remove redundant cb_seq_status initialization
>>
>>
>> fs/nfsd/nfs4callback.c | 81 +++++++++++++-------
>> fs/nfsd/nfs4state.c | 1 +
>> fs/nfsd/trace.h | 162 ++++++++++++++++++++++++++++++++++++++-
>> include/trace/misc/nfs.h | 34 ++++++++
>> net/sunrpc/svc.c | 1 -
>> net/sunrpc/xprtsock.c | 9 ---
>> 6 files changed, 250 insertions(+), 38 deletions(-)
>
>
> These are great, looking forward to see how 02/13 waits for reconnection.
> Seems like a wait_on_bit or wait_on_var triggered from nfsd4_init_conn()
> would do, but that's just my wild speculation.

Thanks for having a look!

Well Jeff pointed out that we don't want to put a workqueue worker
thread to sleep, so wait_on_* would work but is not an appropriate
long-term solution.

I'd rather prefer to have a mechanism where the callback RPC tasks
can be "hard" and then they would wait in the RPC client for the
reconnect. The issue then is how connection establishment is
indicated so that the waiting tasks are woken at the right time.

--
Chuck Lever

2024-01-26 14:14:48

by Chuck Lever III

[permalink] [raw]

Subject: Re: [PATCH RFC 07/13] NFSD: Add callback operation lifetime trace points

On Thu, Jan 25, 2024 at 04:49:17PM -0500, Benjamin Coddington wrote:
> On 25 Jan 2024, at 11:29, Chuck Lever wrote:
>
> > From: Chuck Lever <[email protected]>
> >
> > Help observe the flow of callback operations.
> >
> > bc_shutdown() records exactly when the backchannel RPC client is
> > destroyed and cl_cb_client is replaced with NULL.
> >
> > Signed-off-by: Chuck Lever <[email protected]>
> > ---
> > fs/nfsd/nfs4callback.c | 7 +++++++
> > fs/nfsd/trace.h | 42 ++++++++++++++++++++++++++++++++++++++++++
> > include/trace/misc/nfs.h | 34 ++++++++++++++++++++++++++++++++++
> > 3 files changed, 83 insertions(+)
> >
> > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > index 1c85426830b1..4d5a6370b92c 100644
> > --- a/fs/nfsd/nfs4callback.c
> > +++ b/fs/nfsd/nfs4callback.c
> > @@ -887,6 +887,7 @@ static struct workqueue_struct *callback_wq;
> >
> > static bool nfsd4_queue_cb(struct nfsd4_callback *cb)
> > {
> > + trace_nfsd_cb_queue(cb->cb_clp, cb);
> > return queue_work(callback_wq, &cb->cb_work);
> > }
> >
> > @@ -1106,6 +1107,7 @@ static void nfsd41_destroy_cb(struct nfsd4_callback *cb)
> > {
> > struct nfs4_client *clp = cb->cb_clp;
> >
> > + trace_nfsd_cb_destroy(clp, cb);
> > nfsd41_cb_release_slot(cb);
> > if (cb->cb_ops && cb->cb_ops->release)
> > cb->cb_ops->release(cb);
> > @@ -1220,6 +1222,7 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> > goto out;
> > need_restart:
> > if (!test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags)) {
> > + trace_nfsd_cb_restart(clp, cb);
> > task->tk_status = 0;
> > cb->cb_need_restart = true;
>
> I think you want to call the tracepoint /after/ setting cb_need_restart here..

Maybe?

The trace point currently captures the value of cb_need_restart
before this logic overwrites it. Is that beneficial for
troubleshooting?

> > }
> > @@ -1333,11 +1336,14 @@ static void nfsd4_process_cb_update(struct nfsd4_callback *cb)
> > struct nfsd4_conn *c;
> > int err;
> >
> > + trace_nfsd_cb_bc_update(clp, cb);
> > +
> > /*
> > * This is either an update, or the client dying; in either case,
> > * kill the old client:
> > */
> > if (clp->cl_cb_client) {
> > + trace_nfsd_cb_bc_shutdown(clp, cb);
> > rpc_shutdown_client(clp->cl_cb_client);
> > clp->cl_cb_client = NULL;
> > put_cred(clp->cl_cb_cred);
> > @@ -1349,6 +1355,7 @@ static void nfsd4_process_cb_update(struct nfsd4_callback *cb)
> > }
> > if (test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags))
> > return;
> > +
>
> I'm in favor of this whitespace change, but did you mean to include it?

Yes, but it might be better off in one of the other patches.

--
Chuck Lever

2024-01-26 14:28:31

by Benjamin Coddington

[permalink] [raw]

Subject: Re: [PATCH RFC 07/13] NFSD: Add callback operation lifetime trace points

On 26 Jan 2024, at 9:12, Chuck Lever wrote:

> On Thu, Jan 25, 2024 at 04:49:17PM -0500, Benjamin Coddington wrote:
>> On 25 Jan 2024, at 11:29, Chuck Lever wrote:
>>
>>> From: Chuck Lever <[email protected]>
>>>
>>> Help observe the flow of callback operations.
>>>
>>> bc_shutdown() records exactly when the backchannel RPC client is
>>> destroyed and cl_cb_client is replaced with NULL.
>>>
>>> Signed-off-by: Chuck Lever <[email protected]>
>>> ---
>>> fs/nfsd/nfs4callback.c | 7 +++++++
>>> fs/nfsd/trace.h | 42 ++++++++++++++++++++++++++++++++++++++++++
>>> include/trace/misc/nfs.h | 34 ++++++++++++++++++++++++++++++++++
>>> 3 files changed, 83 insertions(+)
>>>
>>> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
>>> index 1c85426830b1..4d5a6370b92c 100644
>>> --- a/fs/nfsd/nfs4callback.c
>>> +++ b/fs/nfsd/nfs4callback.c
>>> @@ -887,6 +887,7 @@ static struct workqueue_struct *callback_wq;
>>>
>>> static bool nfsd4_queue_cb(struct nfsd4_callback *cb)
>>> {
>>> + trace_nfsd_cb_queue(cb->cb_clp, cb);
>>> return queue_work(callback_wq, &cb->cb_work);
>>> }
>>>
>>> @@ -1106,6 +1107,7 @@ static void nfsd41_destroy_cb(struct nfsd4_callback *cb)
>>> {
>>> struct nfs4_client *clp = cb->cb_clp;
>>>
>>> + trace_nfsd_cb_destroy(clp, cb);
>>> nfsd41_cb_release_slot(cb);
>>> if (cb->cb_ops && cb->cb_ops->release)
>>> cb->cb_ops->release(cb);
>>> @@ -1220,6 +1222,7 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
>>> goto out;
>>> need_restart:
>>> if (!test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags)) {
>>> + trace_nfsd_cb_restart(clp, cb);
>>> task->tk_status = 0;
>>> cb->cb_need_restart = true;
>>
>> I think you want to call the tracepoint /after/ setting cb_need_restart here..
>
> Maybe?
>
> The trace point currently captures the value of cb_need_restart
> before this logic overwrites it. Is that beneficial for
> troubleshooting?

Got it, yes its better the way you have it - and the tracepoint name makes
clear we want to restart.

Ben