2021-10-05 17:14:42

by Chuck Lever

[permalink] [raw]
Subject: [PATCH RFC 0/5] Update RPC task pid as displayed by tracepoints

Hi-

So based on Trond's earlier comment, this is what I've come up with.
These apply on top of the three patches Trond has already accepted.

I kind of wanted to do "task:%04x@%04x", but the "-1" still displays
as "ffffffff", so I left it as "task:%08x@%08x" to keep the fields
lined up in aligned columns. Any different ideas are welcome.

There are a couple of other fix-ups I found while working on this
update, and I've also included an optional patch to add the
nfs_readpage(s) tracepoints I found useful while tracking down
recent fsx failures.

---

Chuck Lever (5):
SUNRPC: Tracepoints should display tk_pid and cl_clid as a fixed-size field
SUNRPC: Avoid NULL pointer dereferences in tracepoints
SUNRPC: Use BIT() macro in rpc_show_xprt_state()
SUNRPC: Don't dereference xprt->snd_task if it's a cookie
NFS: Replace dprintk callsites in nfs_readpage(s)


fs/nfs/nfs4trace.h | 17 ++-
fs/nfs/nfstrace.h | 79 ++++++++++++-
fs/nfs/read.c | 8 +-
include/trace/events/rpcgss.h | 36 +++---
include/trace/events/rpcrdma.h | 104 +++++++----------
include/trace/events/sunrpc.h | 182 +++++++++++++----------------
include/trace/events/sunrpc_base.h | 42 +++++++
7 files changed, 265 insertions(+), 203 deletions(-)
create mode 100644 include/trace/events/sunrpc_base.h

--
Chuck Lever


2021-10-05 17:14:42

by Chuck Lever

[permalink] [raw]
Subject: [PATCH RFC 2/5] SUNRPC: Avoid NULL pointer dereferences in tracepoints

On occasion, a NULL rpc_task pointer is passed into tracepoints.
We've open-coded a few places where this is expected, but let's
be defensive and protect every place that wants to dereference
a task to assign the tk_pid and cl_clid.

Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfs/nfs4trace.h | 8 +---
fs/nfs/nfstrace.h | 3 -
include/trace/events/rpcgss.h | 18 +++-----
include/trace/events/rpcrdma.h | 62 ++++++++---------------------
include/trace/events/sunrpc.h | 77 +++++++++---------------------------
include/trace/events/sunrpc_base.h | 15 +++++++
6 files changed, 61 insertions(+), 122 deletions(-)

diff --git a/fs/nfs/nfs4trace.h b/fs/nfs/nfs4trace.h
index d4f061046c09..66fbd3c33c15 100644
--- a/fs/nfs/nfs4trace.h
+++ b/fs/nfs/nfs4trace.h
@@ -686,10 +686,8 @@ TRACE_EVENT(nfs4_xdr_bad_operation,

TP_fast_assign(
const struct rpc_rqst *rqstp = xdr->rqst;
- const struct rpc_task *task = rqstp->rq_task;

- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(rqstp->rq_task);
__entry->xid = be32_to_cpu(rqstp->rq_xid);
__entry->op = op;
__entry->expected = expected;
@@ -721,10 +719,8 @@ DECLARE_EVENT_CLASS(nfs4_xdr_event,

TP_fast_assign(
const struct rpc_rqst *rqstp = xdr->rqst;
- const struct rpc_task *task = rqstp->rq_task;

- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(rqstp->rq_task);
__entry->xid = be32_to_cpu(rqstp->rq_xid);
__entry->op = op;
__entry->error = error;
diff --git a/fs/nfs/nfstrace.h b/fs/nfs/nfstrace.h
index 82b51120450f..e9be65b52bfe 100644
--- a/fs/nfs/nfstrace.h
+++ b/fs/nfs/nfstrace.h
@@ -1401,8 +1401,7 @@ DECLARE_EVENT_CLASS(nfs_xdr_event,
const struct rpc_rqst *rqstp = xdr->rqst;
const struct rpc_task *task = rqstp->rq_task;

- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(task);
__entry->xid = be32_to_cpu(rqstp->rq_xid);
__entry->version = task->tk_client->cl_vers;
__entry->error = error;
diff --git a/include/trace/events/rpcgss.h b/include/trace/events/rpcgss.h
index 3ba63319af3c..87b17ebd09c3 100644
--- a/include/trace/events/rpcgss.h
+++ b/include/trace/events/rpcgss.h
@@ -96,8 +96,7 @@ DECLARE_EVENT_CLASS(rpcgss_gssapi_event,
),

TP_fast_assign(
- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(task);
__entry->maj_stat = maj_stat;
),

@@ -330,8 +329,7 @@ TRACE_EVENT(rpcgss_unwrap_failed,
),

TP_fast_assign(
- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(task);
),

TP_printk(SUNRPC_TRACE_TASK_SPECIFIER,
@@ -355,8 +353,7 @@ TRACE_EVENT(rpcgss_bad_seqno,
),

TP_fast_assign(
- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(task);
__entry->expected = expected;
__entry->received = received;
),
@@ -384,8 +381,7 @@ TRACE_EVENT(rpcgss_seqno,
TP_fast_assign(
const struct rpc_rqst *rqst = task->tk_rqstp;

- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(task);
__entry->xid = be32_to_cpu(rqst->rq_xid);
__entry->seqno = rqst->rq_seqno;
),
@@ -414,8 +410,7 @@ TRACE_EVENT(rpcgss_need_reencode,
),

TP_fast_assign(
- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(task);
__entry->xid = be32_to_cpu(task->tk_rqstp->rq_xid);
__entry->seq_xmit = seq_xmit;
__entry->seqno = task->tk_rqstp->rq_seqno;
@@ -448,8 +443,7 @@ TRACE_EVENT(rpcgss_update_slack,
),

TP_fast_assign(
- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(task);
__entry->xid = be32_to_cpu(task->tk_rqstp->rq_xid);
__entry->auth = auth;
__entry->rslack = auth->au_rslack;
diff --git a/include/trace/events/rpcrdma.h b/include/trace/events/rpcrdma.h
index 7f46ef621c95..c2ab9e5d775b 100644
--- a/include/trace/events/rpcrdma.h
+++ b/include/trace/events/rpcrdma.h
@@ -271,8 +271,7 @@ DECLARE_EVENT_CLASS(xprtrdma_rdch_event,
),

TP_fast_assign(
- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(task);
__entry->pos = pos;
__entry->nents = mr->mr_nents;
__entry->handle = mr->mr_handle;
@@ -320,8 +319,7 @@ DECLARE_EVENT_CLASS(xprtrdma_wrch_event,
),

TP_fast_assign(
- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(task);
__entry->nents = mr->mr_nents;
__entry->handle = mr->mr_handle;
__entry->length = mr->mr_length;
@@ -380,15 +378,8 @@ DECLARE_EVENT_CLASS(xprtrdma_mr_class,
TP_fast_assign(
const struct rpcrdma_req *req = mr->mr_req;

- if (req) {
- const struct rpc_task *task = req->rl_slot.rq_task;
-
- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client->cl_clid;
- } else {
- __entry->task_id = 0;
- __entry->client_id = -1;
- }
+ if (req)
+ SUNRPC_TRACE_TASK_ASSIGN(req->rl_slot.rq_task);
__entry->mr_id = mr->mr_ibmr->res.id;
__entry->nents = mr->mr_nents;
__entry->handle = mr->mr_handle;
@@ -633,10 +624,7 @@ TRACE_EVENT(xprtrdma_nomrs_err,
),

TP_fast_assign(
- const struct rpc_rqst *rqst = &req->rl_slot;
-
- __entry->task_id = rqst->rq_task->tk_pid;
- __entry->client_id = rqst->rq_task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(req->rl_slot.rq_task);
__assign_str(addr, rpcrdma_addrstr(r_xprt));
__assign_str(port, rpcrdma_portstr(r_xprt));
),
@@ -694,8 +682,7 @@ TRACE_EVENT(xprtrdma_marshal,
TP_fast_assign(
const struct rpc_rqst *rqst = &req->rl_slot;

- __entry->task_id = rqst->rq_task->tk_pid;
- __entry->client_id = rqst->rq_task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(rqst->rq_task);
__entry->xid = be32_to_cpu(rqst->rq_xid);
__entry->hdrlen = req->rl_hdrbuf.len;
__entry->headlen = rqst->rq_snd_buf.head[0].iov_len;
@@ -730,8 +717,7 @@ TRACE_EVENT(xprtrdma_marshal_failed,
),

TP_fast_assign(
- __entry->task_id = rqst->rq_task->tk_pid;
- __entry->client_id = rqst->rq_task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(rqst->rq_task);
__entry->xid = be32_to_cpu(rqst->rq_xid);
__entry->ret = ret;
),
@@ -757,8 +743,7 @@ TRACE_EVENT(xprtrdma_prepsend_failed,
),

TP_fast_assign(
- __entry->task_id = rqst->rq_task->tk_pid;
- __entry->client_id = rqst->rq_task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(rqst->rq_task);
__entry->xid = be32_to_cpu(rqst->rq_xid);
__entry->ret = ret;
),
@@ -791,9 +776,7 @@ TRACE_EVENT(xprtrdma_post_send,

__entry->cq_id = sc->sc_cid.ci_queue_id;
__entry->completion_id = sc->sc_cid.ci_completion_id;
- __entry->task_id = rqst->rq_task->tk_pid;
- __entry->client_id = rqst->rq_task->tk_client ?
- rqst->rq_task->tk_client->cl_clid : -1;
+ SUNRPC_TRACE_TASK_ASSIGN(rqst->rq_task);
__entry->num_sge = req->rl_wr.num_sge;
__entry->signaled = req->rl_wr.send_flags & IB_SEND_SIGNALED;
),
@@ -827,9 +810,7 @@ TRACE_EVENT(xprtrdma_post_send_err,
const struct rpcrdma_ep *ep = r_xprt->rx_ep;

__entry->cq_id = ep ? ep->re_attr.recv_cq->res.id : 0;
- __entry->task_id = rqst->rq_task->tk_pid;
- __entry->client_id = rqst->rq_task->tk_client ?
- rqst->rq_task->tk_client->cl_clid : -1;
+ SUNRPC_TRACE_TASK_ASSIGN(rqst->rq_task);
__entry->rc = rc;
),

@@ -938,10 +919,7 @@ TRACE_EVENT(xprtrdma_post_linv_err,
),

TP_fast_assign(
- const struct rpc_task *task = req->rl_slot.rq_task;
-
- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(req->rl_slot.rq_task);
__entry->status = status;
),

@@ -1127,8 +1105,7 @@ TRACE_EVENT(xprtrdma_reply,
),

TP_fast_assign(
- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(task);
__entry->xid = be32_to_cpu(rep->rr_xid);
__entry->credits = credits;
),
@@ -1162,8 +1139,7 @@ TRACE_EVENT(xprtrdma_err_vers,
),

TP_fast_assign(
- __entry->task_id = rqst->rq_task->tk_pid;
- __entry->client_id = rqst->rq_task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(rqst->rq_task);
__entry->xid = be32_to_cpu(rqst->rq_xid);
__entry->min = be32_to_cpup(min);
__entry->max = be32_to_cpup(max);
@@ -1189,8 +1165,7 @@ TRACE_EVENT(xprtrdma_err_chunk,
),

TP_fast_assign(
- __entry->task_id = rqst->rq_task->tk_pid;
- __entry->client_id = rqst->rq_task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(rqst->rq_task);
__entry->xid = be32_to_cpu(rqst->rq_xid);
),

@@ -1215,8 +1190,7 @@ TRACE_EVENT(xprtrdma_err_unrecognized,
),

TP_fast_assign(
- __entry->task_id = rqst->rq_task->tk_pid;
- __entry->client_id = rqst->rq_task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(rqst->rq_task);
__entry->procedure = be32_to_cpup(procedure);
),

@@ -1244,8 +1218,7 @@ TRACE_EVENT(xprtrdma_fixup,
),

TP_fast_assign(
- __entry->task_id = rqst->rq_task->tk_pid;
- __entry->client_id = rqst->rq_task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(rqst->rq_task);
__entry->fixup = fixup;
__entry->headlen = rqst->rq_rcv_buf.head[0].iov_len;
__entry->pagelen = rqst->rq_rcv_buf.page_len;
@@ -1298,8 +1271,7 @@ TRACE_EVENT(xprtrdma_mrs_zap,
),

TP_fast_assign(
- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(task);
),

TP_printk(SUNRPC_TRACE_TASK_SPECIFIER,
diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h
index 92def7d6663e..4fd6299bc907 100644
--- a/include/trace/events/sunrpc.h
+++ b/include/trace/events/sunrpc.h
@@ -69,9 +69,7 @@ DECLARE_EVENT_CLASS(rpc_xdr_buf_class,
),

TP_fast_assign(
- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client ?
- task->tk_client->cl_clid : -1;
+ SUNRPC_TRACE_TASK_ASSIGN(task);
__entry->head_base = xdr->head[0].iov_base;
__entry->head_len = xdr->head[0].iov_len;
__entry->tail_base = xdr->tail[0].iov_base;
@@ -248,8 +246,7 @@ DECLARE_EVENT_CLASS(rpc_task_status,
),

TP_fast_assign(
- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(task);
__entry->status = task->tk_status;
),

@@ -285,8 +282,7 @@ TRACE_EVENT(rpc_request,
),

TP_fast_assign(
- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(task);
__entry->version = task->tk_client->cl_vers;
__entry->async = RPC_IS_ASYNC(task);
__assign_str(progname, task->tk_client->cl_program->name);
@@ -344,9 +340,7 @@ DECLARE_EVENT_CLASS(rpc_task_running,
),

TP_fast_assign(
- __entry->client_id = task->tk_client ?
- task->tk_client->cl_clid : -1;
- __entry->task_id = task->tk_pid;
+ SUNRPC_TRACE_TASK_ASSIGN(task);
__entry->action = action;
__entry->runstate = task->tk_runstate;
__entry->status = task->tk_status;
@@ -396,9 +390,7 @@ DECLARE_EVENT_CLASS(rpc_task_queued,
),

TP_fast_assign(
- __entry->client_id = task->tk_client ?
- task->tk_client->cl_clid : -1;
- __entry->task_id = task->tk_pid;
+ SUNRPC_TRACE_TASK_ASSIGN(task);
__entry->timeout = rpc_task_timeout(task);
__entry->runstate = task->tk_runstate;
__entry->status = task->tk_status;
@@ -439,8 +431,7 @@ DECLARE_EVENT_CLASS(rpc_failure,
),

TP_fast_assign(
- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(task);
),

TP_printk(SUNRPC_TRACE_TASK_SPECIFIER,
@@ -476,8 +467,7 @@ DECLARE_EVENT_CLASS(rpc_reply_event,
),

TP_fast_assign(
- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(task);
__entry->xid = be32_to_cpu(task->tk_rqstp->rq_xid);
__assign_str(progname, task->tk_client->cl_program->name);
__entry->version = task->tk_client->cl_vers;
@@ -539,8 +529,7 @@ TRACE_EVENT(rpc_buf_alloc,
),

TP_fast_assign(
- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(task);
__entry->callsize = task->tk_rqstp->rq_callsize;
__entry->recvsize = task->tk_rqstp->rq_rcvsize;
__entry->status = status;
@@ -570,8 +559,7 @@ TRACE_EVENT(rpc_call_rpcerror,
),

TP_fast_assign(
- __entry->client_id = task->tk_client->cl_clid;
- __entry->task_id = task->tk_pid;
+ SUNRPC_TRACE_TASK_ASSIGN(task);
__entry->tk_status = tk_status;
__entry->rpc_status = rpc_status;
),
@@ -606,8 +594,7 @@ TRACE_EVENT(rpc_stats_latency,
),

TP_fast_assign(
- __entry->client_id = task->tk_client->cl_clid;
- __entry->task_id = task->tk_pid;
+ SUNRPC_TRACE_TASK_ASSIGN(task);
__entry->xid = be32_to_cpu(task->tk_rqstp->rq_xid);
__entry->version = task->tk_client->cl_vers;
__assign_str(progname, task->tk_client->cl_program->name);
@@ -655,8 +642,7 @@ TRACE_EVENT(rpc_xdr_overflow,
if (xdr->rqst) {
const struct rpc_task *task = xdr->rqst->rq_task;

- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(task);
__assign_str(progname,
task->tk_client->cl_program->name);
__entry->version = task->tk_client->cl_vers;
@@ -721,8 +707,7 @@ TRACE_EVENT(rpc_xdr_alignment,
TP_fast_assign(
const struct rpc_task *task = xdr->rqst->rq_task;

- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(task);
__assign_str(progname,
task->tk_client->cl_program->name);
__entry->version = task->tk_client->cl_vers;
@@ -922,8 +907,7 @@ TRACE_EVENT(rpc_socket_nospace,
),

TP_fast_assign(
- __entry->task_id = rqst->rq_task->tk_pid;
- __entry->client_id = rqst->rq_task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(rqst->rq_task);
__entry->total = rqst->rq_slen;
__entry->remaining = rqst->rq_slen - transport->xmit.offset;
),
@@ -1046,9 +1030,7 @@ TRACE_EVENT(xprt_transmit,
),

TP_fast_assign(
- __entry->task_id = rqst->rq_task->tk_pid;
- __entry->client_id = rqst->rq_task->tk_client ?
- rqst->rq_task->tk_client->cl_clid : -1;
+ SUNRPC_TRACE_TASK_ASSIGN(rqst->rq_task);
__entry->xid = be32_to_cpu(rqst->rq_xid);
__entry->seqno = rqst->rq_seqno;
__entry->status = status;
@@ -1082,9 +1064,7 @@ TRACE_EVENT(xprt_retransmit,
TP_fast_assign(
struct rpc_task *task = rqst->rq_task;

- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client ?
- task->tk_client->cl_clid : -1;
+ SUNRPC_TRACE_TASK_ASSIGN(task);
__entry->xid = be32_to_cpu(rqst->rq_xid);
__entry->ntrans = rqst->rq_ntrans;
__entry->timeout = task->tk_timeout;
@@ -1137,14 +1117,7 @@ DECLARE_EVENT_CLASS(xprt_writelock_event,
),

TP_fast_assign(
- if (task) {
- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client ?
- task->tk_client->cl_clid : -1;
- } else {
- __entry->task_id = -1;
- __entry->client_id = -1;
- }
+ SUNRPC_TRACE_TASK_ASSIGN(task);
__entry->snd_task_id = xprt->snd_task ?
xprt->snd_task->tk_pid : -1;
),
@@ -1183,14 +1156,7 @@ DECLARE_EVENT_CLASS(xprt_cong_event,
),

TP_fast_assign(
- if (task) {
- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client ?
- task->tk_client->cl_clid : -1;
- } else {
- __entry->task_id = -1;
- __entry->client_id = -1;
- }
+ SUNRPC_TRACE_TASK_ASSIGN(task);
__entry->snd_task_id = xprt->snd_task ?
xprt->snd_task->tk_pid : -1;
__entry->cong = xprt->cong;
@@ -1233,8 +1199,7 @@ TRACE_EVENT(xprt_reserve,
),

TP_fast_assign(
- __entry->task_id = rqst->rq_task->tk_pid;
- __entry->client_id = rqst->rq_task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(rqst->rq_task);
__entry->xid = be32_to_cpu(rqst->rq_xid);
),

@@ -1318,8 +1283,7 @@ TRACE_EVENT(rpcb_getport,
),

TP_fast_assign(
- __entry->task_id = task->tk_pid;
- __entry->client_id = clnt->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(task);
__entry->program = clnt->cl_prog;
__entry->version = clnt->cl_vers;
__entry->protocol = task->tk_xprt->prot;
@@ -1352,8 +1316,7 @@ TRACE_EVENT(rpcb_setport,
),

TP_fast_assign(
- __entry->task_id = task->tk_pid;
- __entry->client_id = task->tk_client->cl_clid;
+ SUNRPC_TRACE_TASK_ASSIGN(task);
__entry->status = status;
__entry->port = port;
),
diff --git a/include/trace/events/sunrpc_base.h b/include/trace/events/sunrpc_base.h
index 588557d07ea8..2cbed4a9a63a 100644
--- a/include/trace/events/sunrpc_base.h
+++ b/include/trace/events/sunrpc_base.h
@@ -10,6 +10,21 @@

#include <linux/tracepoint.h>

+#define SUNRPC_TRACE_PID_SPECIAL (-1)
+
+#define SUNRPC_TRACE_TASK_ASSIGN(t) \
+ do { \
+ if ((t) != NULL) { \
+ __entry->task_id = (t)->tk_pid; \
+ __entry->client_id = (t)->tk_client ? \
+ (t)->tk_client->cl_clid : \
+ SUNRPC_TRACE_PID_SPECIAL; \
+ } else { \
+ __entry->task_id = SUNRPC_TRACE_PID_SPECIAL; \
+ __entry->client_id = SUNRPC_TRACE_PID_SPECIAL; \
+ } \
+ } while (0);
+
#define SUNRPC_TRACE_PID_SPECIFIER "%08x"
#define SUNRPC_TRACE_CLID_SPECIFIER "%08x"
#define SUNRPC_TRACE_TASK_SPECIFIER \


2021-10-05 17:14:42

by Chuck Lever

[permalink] [raw]
Subject: [PATCH RFC 1/5] SUNRPC: Tracepoints should display tk_pid and cl_clid as a fixed-size field

For certain special cases, RPC-related tracepoints record a -1 as
the task ID or the client ID. It's ugly for a trace event to display
4 billion in these cases.

To help keep SUNRPC tracepoints consistent, create a macro that
defines the print format specifiers for tk_pid and cl_clid. At some
point in the future we might try tk_pid with a wider range of values
than 0..64K so this makes it easier to make that change.

RPC tracepoints now look like this:

<...>-1276 [009] 149.720358: rpc_clnt_new: client=00000005 peer=[192.168.2.55]:20049 program=nfs server=klimt.ib

<...>-1342 [004] 149.921234: rpc_xdr_recvfrom: task:0000001a@00000005 head=[0xff1242d9ab6dc01c,144] page=0 tail=[(nil),0] len=144
<...>-1342 [004] 149.921235: xprt_release_cong: task:0000001a@00000005 snd_task:ffffffff cong=256 cwnd=16384
<...>-1342 [004] 149.921235: xprt_put_cong: task:0000001a@00000005 snd_task:ffffffff cong=0 cwnd=16384

Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfs/nfs4trace.h | 9 ++--
fs/nfs/nfstrace.h | 6 ++-
include/trace/events/rpcgss.h | 18 ++++++---
include/trace/events/rpcrdma.h | 42 ++++++++++++--------
include/trace/events/sunrpc.h | 74 ++++++++++++++++++++++--------------
include/trace/events/sunrpc_base.h | 18 +++++++++
6 files changed, 108 insertions(+), 59 deletions(-)
create mode 100644 include/trace/events/sunrpc_base.h

diff --git a/fs/nfs/nfs4trace.h b/fs/nfs/nfs4trace.h
index 7a2567aa2b86..d4f061046c09 100644
--- a/fs/nfs/nfs4trace.h
+++ b/fs/nfs/nfs4trace.h
@@ -9,6 +9,7 @@
#define _TRACE_NFS4_H

#include <linux/tracepoint.h>
+#include <trace/events/sunrpc_base.h>

TRACE_DEFINE_ENUM(EPERM);
TRACE_DEFINE_ENUM(ENOENT);
@@ -694,8 +695,8 @@ TRACE_EVENT(nfs4_xdr_bad_operation,
__entry->expected = expected;
),

- TP_printk(
- "task:%u@%d xid=0x%08x operation=%u, expected=%u",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " xid=0x%08x operation=%u, expected=%u",
__entry->task_id, __entry->client_id, __entry->xid,
__entry->op, __entry->expected
)
@@ -729,8 +730,8 @@ DECLARE_EVENT_CLASS(nfs4_xdr_event,
__entry->error = error;
),

- TP_printk(
- "task:%u@%d xid=0x%08x error=%ld (%s) operation=%u",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " xid=0x%08x error=%ld (%s) operation=%u",
__entry->task_id, __entry->client_id, __entry->xid,
-__entry->error, show_nfsv4_errors(__entry->error),
__entry->op
diff --git a/fs/nfs/nfstrace.h b/fs/nfs/nfstrace.h
index 44fd016a8e65..82b51120450f 100644
--- a/fs/nfs/nfstrace.h
+++ b/fs/nfs/nfstrace.h
@@ -11,6 +11,8 @@
#include <linux/tracepoint.h>
#include <linux/iversion.h>

+#include <trace/events/sunrpc_base.h>
+
#define nfs_show_file_type(ftype) \
__print_symbolic(ftype, \
{ DT_UNKNOWN, "UNKNOWN" }, \
@@ -1409,8 +1411,8 @@ DECLARE_EVENT_CLASS(nfs_xdr_event,
__assign_str(procedure, task->tk_msg.rpc_proc->p_name);
),

- TP_printk(
- "task:%u@%d xid=0x%08x %sv%d %s error=%ld (%s)",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " xid=0x%08x %sv%d %s error=%ld (%s)",
__entry->task_id, __entry->client_id, __entry->xid,
__get_str(program), __entry->version,
__get_str(procedure), -__entry->error,
diff --git a/include/trace/events/rpcgss.h b/include/trace/events/rpcgss.h
index b2a2672e6632..3ba63319af3c 100644
--- a/include/trace/events/rpcgss.h
+++ b/include/trace/events/rpcgss.h
@@ -13,6 +13,8 @@

#include <linux/tracepoint.h>

+#include <trace/events/sunrpc_base.h>
+
/**
** GSS-API related trace events
**/
@@ -99,7 +101,7 @@ DECLARE_EVENT_CLASS(rpcgss_gssapi_event,
__entry->maj_stat = maj_stat;
),

- TP_printk("task:%u@%u maj_stat=%s",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER " maj_stat=%s",
__entry->task_id, __entry->client_id,
__entry->maj_stat == 0 ?
"GSS_S_COMPLETE" : show_gss_status(__entry->maj_stat))
@@ -332,7 +334,8 @@ TRACE_EVENT(rpcgss_unwrap_failed,
__entry->client_id = task->tk_client->cl_clid;
),

- TP_printk("task:%u@%u", __entry->task_id, __entry->client_id)
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER,
+ __entry->task_id, __entry->client_id)
);

TRACE_EVENT(rpcgss_bad_seqno,
@@ -358,7 +361,8 @@ TRACE_EVENT(rpcgss_bad_seqno,
__entry->received = received;
),

- TP_printk("task:%u@%u expected seqno %u, received seqno %u",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " expected seqno %u, received seqno %u",
__entry->task_id, __entry->client_id,
__entry->expected, __entry->received)
);
@@ -386,7 +390,7 @@ TRACE_EVENT(rpcgss_seqno,
__entry->seqno = rqst->rq_seqno;
),

- TP_printk("task:%u@%u xid=0x%08x seqno=%u",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER " xid=0x%08x seqno=%u",
__entry->task_id, __entry->client_id,
__entry->xid, __entry->seqno)
);
@@ -418,7 +422,8 @@ TRACE_EVENT(rpcgss_need_reencode,
__entry->ret = ret;
),

- TP_printk("task:%u@%u xid=0x%08x rq_seqno=%u seq_xmit=%u reencode %sneeded",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " xid=0x%08x rq_seqno=%u seq_xmit=%u reencode %sneeded",
__entry->task_id, __entry->client_id,
__entry->xid, __entry->seqno, __entry->seq_xmit,
__entry->ret ? "" : "un")
@@ -452,7 +457,8 @@ TRACE_EVENT(rpcgss_update_slack,
__entry->verfsize = auth->au_verfsize;
),

- TP_printk("task:%u@%u xid=0x%08x auth=%p rslack=%u ralign=%u verfsize=%u\n",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " xid=0x%08x auth=%p rslack=%u ralign=%u verfsize=%u\n",
__entry->task_id, __entry->client_id, __entry->xid,
__entry->auth, __entry->rslack, __entry->ralign,
__entry->verfsize)
diff --git a/include/trace/events/rpcrdma.h b/include/trace/events/rpcrdma.h
index afb2e394797c..7f46ef621c95 100644
--- a/include/trace/events/rpcrdma.h
+++ b/include/trace/events/rpcrdma.h
@@ -14,7 +14,9 @@
#include <linux/sunrpc/rpc_rdma_cid.h>
#include <linux/tracepoint.h>
#include <rdma/ib_cm.h>
+
#include <trace/events/rdma.h>
+#include <trace/events/sunrpc_base.h>

/**
** Event classes
@@ -279,7 +281,8 @@ DECLARE_EVENT_CLASS(xprtrdma_rdch_event,
__entry->nsegs = nsegs;
),

- TP_printk("task:%u@%u pos=%u %u@0x%016llx:0x%08x (%s)",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " pos=%u %u@0x%016llx:0x%08x (%s)",
__entry->task_id, __entry->client_id,
__entry->pos, __entry->length,
(unsigned long long)__entry->offset, __entry->handle,
@@ -326,7 +329,8 @@ DECLARE_EVENT_CLASS(xprtrdma_wrch_event,
__entry->nsegs = nsegs;
),

- TP_printk("task:%u@%u %u@0x%016llx:0x%08x (%s)",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " %u@0x%016llx:0x%08x (%s)",
__entry->task_id, __entry->client_id,
__entry->length, (unsigned long long)__entry->offset,
__entry->handle,
@@ -393,7 +397,8 @@ DECLARE_EVENT_CLASS(xprtrdma_mr_class,
__entry->dir = mr->mr_dir;
),

- TP_printk("task:%u@%u mr.id=%u nents=%d %u@0x%016llx:0x%08x (%s)",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " mr.id=%u nents=%d %u@0x%016llx:0x%08x (%s)",
__entry->task_id, __entry->client_id,
__entry->mr_id, __entry->nents, __entry->length,
(unsigned long long)__entry->offset, __entry->handle,
@@ -636,9 +641,9 @@ TRACE_EVENT(xprtrdma_nomrs_err,
__assign_str(port, rpcrdma_portstr(r_xprt));
),

- TP_printk("peer=[%s]:%s task:%u@%u",
- __get_str(addr), __get_str(port),
- __entry->task_id, __entry->client_id
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER " peer=[%s]:%s",
+ __entry->task_id, __entry->client_id,
+ __get_str(addr), __get_str(port)
)
);

@@ -700,7 +705,8 @@ TRACE_EVENT(xprtrdma_marshal,
__entry->wtype = wtype;
),

- TP_printk("task:%u@%u xid=0x%08x: hdr=%u xdr=%u/%u/%u %s/%s",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " xid=0x%08x hdr=%u xdr=%u/%u/%u %s/%s",
__entry->task_id, __entry->client_id, __entry->xid,
__entry->hdrlen,
__entry->headlen, __entry->pagelen, __entry->taillen,
@@ -730,7 +736,7 @@ TRACE_EVENT(xprtrdma_marshal_failed,
__entry->ret = ret;
),

- TP_printk("task:%u@%u xid=0x%08x: ret=%d",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER " xid=0x%08x ret=%d",
__entry->task_id, __entry->client_id, __entry->xid,
__entry->ret
)
@@ -757,7 +763,7 @@ TRACE_EVENT(xprtrdma_prepsend_failed,
__entry->ret = ret;
),

- TP_printk("task:%u@%u xid=0x%08x: ret=%d",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER " xid=0x%08x ret=%d",
__entry->task_id, __entry->client_id, __entry->xid,
__entry->ret
)
@@ -792,7 +798,7 @@ TRACE_EVENT(xprtrdma_post_send,
__entry->signaled = req->rl_wr.send_flags & IB_SEND_SIGNALED;
),

- TP_printk("task:%u@%u cq.id=%u cid=%d (%d SGE%s) %s",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER " cq.id=%u cid=%d (%d SGE%s) %s",
__entry->task_id, __entry->client_id,
__entry->cq_id, __entry->completion_id,
__entry->num_sge, (__entry->num_sge == 1 ? "" : "s"),
@@ -827,7 +833,7 @@ TRACE_EVENT(xprtrdma_post_send_err,
__entry->rc = rc;
),

- TP_printk("task:%u@%u cq.id=%u rc=%d",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER " cq.id=%u rc=%d",
__entry->task_id, __entry->client_id,
__entry->cq_id, __entry->rc
)
@@ -939,7 +945,7 @@ TRACE_EVENT(xprtrdma_post_linv_err,
__entry->status = status;
),

- TP_printk("task:%u@%u status=%d",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER " status=%d",
__entry->task_id, __entry->client_id, __entry->status
)
);
@@ -1127,7 +1133,7 @@ TRACE_EVENT(xprtrdma_reply,
__entry->credits = credits;
),

- TP_printk("task:%u@%u xid=0x%08x credits=%u",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER " xid=0x%08x credits=%u",
__entry->task_id, __entry->client_id, __entry->xid,
__entry->credits
)
@@ -1163,7 +1169,7 @@ TRACE_EVENT(xprtrdma_err_vers,
__entry->max = be32_to_cpup(max);
),

- TP_printk("task:%u@%u xid=0x%08x versions=[%u, %u]",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER " xid=0x%08x versions=[%u, %u]",
__entry->task_id, __entry->client_id, __entry->xid,
__entry->min, __entry->max
)
@@ -1188,7 +1194,7 @@ TRACE_EVENT(xprtrdma_err_chunk,
__entry->xid = be32_to_cpu(rqst->rq_xid);
),

- TP_printk("task:%u@%u xid=0x%08x",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER " xid=0x%08x",
__entry->task_id, __entry->client_id, __entry->xid
)
);
@@ -1214,7 +1220,7 @@ TRACE_EVENT(xprtrdma_err_unrecognized,
__entry->procedure = be32_to_cpup(procedure);
),

- TP_printk("task:%u@%u xid=0x%08x procedure=%u",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER " xid=0x%08x procedure=%u",
__entry->task_id, __entry->client_id, __entry->xid,
__entry->procedure
)
@@ -1246,7 +1252,7 @@ TRACE_EVENT(xprtrdma_fixup,
__entry->taillen = rqst->rq_rcv_buf.tail[0].iov_len;
),

- TP_printk("task:%u@%u fixup=%lu xdr=%zu/%u/%zu",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER " fixup=%lu xdr=%zu/%u/%zu",
__entry->task_id, __entry->client_id, __entry->fixup,
__entry->headlen, __entry->pagelen, __entry->taillen
)
@@ -1296,7 +1302,7 @@ TRACE_EVENT(xprtrdma_mrs_zap,
__entry->client_id = task->tk_client->cl_clid;
),

- TP_printk("task:%u@%u",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER,
__entry->task_id, __entry->client_id
)
);
diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h
index 2d04eb96d418..92def7d6663e 100644
--- a/include/trace/events/sunrpc.h
+++ b/include/trace/events/sunrpc.h
@@ -14,6 +14,8 @@
#include <linux/net.h>
#include <linux/tracepoint.h>

+#include <trace/events/sunrpc_base.h>
+
TRACE_DEFINE_ENUM(SOCK_STREAM);
TRACE_DEFINE_ENUM(SOCK_DGRAM);
TRACE_DEFINE_ENUM(SOCK_RAW);
@@ -78,7 +80,8 @@ DECLARE_EVENT_CLASS(rpc_xdr_buf_class,
__entry->msg_len = xdr->len;
),

- TP_printk("task:%u@%u head=[%p,%zu] page=%u tail=[%p,%zu] len=%u",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " head=[%p,%zu] page=%u tail=[%p,%zu] len=%u",
__entry->task_id, __entry->client_id,
__entry->head_base, __entry->head_len, __entry->page_len,
__entry->tail_base, __entry->tail_len, __entry->msg_len
@@ -114,7 +117,7 @@ DECLARE_EVENT_CLASS(rpc_clnt_class,
__entry->client_id = clnt->cl_clid;
),

- TP_printk("clid=%u", __entry->client_id)
+ TP_printk("client=" SUNRPC_TRACE_CLID_SPECIFIER, __entry->client_id)
);

#define DEFINE_RPC_CLNT_EVENT(name) \
@@ -158,7 +161,8 @@ TRACE_EVENT(rpc_clnt_new,
__assign_str(server, server);
),

- TP_printk("client=%u peer=[%s]:%s program=%s server=%s",
+ TP_printk("client=" SUNRPC_TRACE_CLID_SPECIFIER
+ " peer=[%s]:%s program=%s server=%s",
__entry->client_id, __get_str(addr), __get_str(port),
__get_str(program), __get_str(server))
);
@@ -206,7 +210,8 @@ TRACE_EVENT(rpc_clnt_clone_err,
__entry->error = error;
),

- TP_printk("client=%u error=%d", __entry->client_id, __entry->error)
+ TP_printk("client=" SUNRPC_TRACE_CLID_SPECIFIER " error=%d",
+ __entry->client_id, __entry->error)
);


@@ -248,7 +253,7 @@ DECLARE_EVENT_CLASS(rpc_task_status,
__entry->status = task->tk_status;
),

- TP_printk("task:%u@%u status=%d",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER " status=%d",
__entry->task_id, __entry->client_id,
__entry->status)
);
@@ -288,7 +293,7 @@ TRACE_EVENT(rpc_request,
__assign_str(procname, rpc_proc_name(task));
),

- TP_printk("task:%u@%u %sv%d %s (%ssync)",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER " %sv%d %s (%ssync)",
__entry->task_id, __entry->client_id,
__get_str(progname), __entry->version,
__get_str(procname), __entry->async ? "a": ""
@@ -348,7 +353,8 @@ DECLARE_EVENT_CLASS(rpc_task_running,
__entry->flags = task->tk_flags;
),

- TP_printk("task:%u@%d flags=%s runstate=%s status=%d action=%ps",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " flags=%s runstate=%s status=%d action=%ps",
__entry->task_id, __entry->client_id,
rpc_show_task_flags(__entry->flags),
rpc_show_runstate(__entry->runstate),
@@ -400,7 +406,8 @@ DECLARE_EVENT_CLASS(rpc_task_queued,
__assign_str(q_name, rpc_qname(q));
),

- TP_printk("task:%u@%d flags=%s runstate=%s status=%d timeout=%lu queue=%s",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " flags=%s runstate=%s status=%d timeout=%lu queue=%s",
__entry->task_id, __entry->client_id,
rpc_show_task_flags(__entry->flags),
rpc_show_runstate(__entry->runstate),
@@ -436,7 +443,7 @@ DECLARE_EVENT_CLASS(rpc_failure,
__entry->client_id = task->tk_client->cl_clid;
),

- TP_printk("task:%u@%u",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER,
__entry->task_id, __entry->client_id)
);

@@ -478,7 +485,8 @@ DECLARE_EVENT_CLASS(rpc_reply_event,
__assign_str(servername, task->tk_xprt->servername);
),

- TP_printk("task:%u@%d server=%s xid=0x%08x %sv%d %s",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " server=%s xid=0x%08x %sv%d %s",
__entry->task_id, __entry->client_id, __get_str(servername),
__entry->xid, __get_str(progname), __entry->version,
__get_str(procname))
@@ -538,7 +546,8 @@ TRACE_EVENT(rpc_buf_alloc,
__entry->status = status;
),

- TP_printk("task:%u@%u callsize=%zu recvsize=%zu status=%d",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " callsize=%zu recvsize=%zu status=%d",
__entry->task_id, __entry->client_id,
__entry->callsize, __entry->recvsize, __entry->status
)
@@ -567,7 +576,8 @@ TRACE_EVENT(rpc_call_rpcerror,
__entry->rpc_status = rpc_status;
),

- TP_printk("task:%u@%u tk_status=%d rpc_status=%d",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " tk_status=%d rpc_status=%d",
__entry->task_id, __entry->client_id,
__entry->tk_status, __entry->rpc_status)
);
@@ -607,7 +617,8 @@ TRACE_EVENT(rpc_stats_latency,
__entry->execute = ktime_to_us(execute);
),

- TP_printk("task:%u@%d xid=0x%08x %sv%d %s backlog=%lu rtt=%lu execute=%lu",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " xid=0x%08x %sv%d %s backlog=%lu rtt=%lu execute=%lu",
__entry->task_id, __entry->client_id, __entry->xid,
__get_str(progname), __entry->version, __get_str(procname),
__entry->backlog, __entry->rtt, __entry->execute)
@@ -651,8 +662,8 @@ TRACE_EVENT(rpc_xdr_overflow,
__entry->version = task->tk_client->cl_vers;
__assign_str(procedure, task->tk_msg.rpc_proc->p_name);
} else {
- __entry->task_id = 0;
- __entry->client_id = 0;
+ __entry->task_id = -1;
+ __entry->client_id = -1;
__assign_str(progname, "unknown");
__entry->version = 0;
__assign_str(procedure, "unknown");
@@ -668,8 +679,8 @@ TRACE_EVENT(rpc_xdr_overflow,
__entry->len = xdr->buf->len;
),

- TP_printk(
- "task:%u@%u %sv%d %s requested=%zu p=%p end=%p xdr=[%p,%zu]/%u/[%p,%zu]/%u\n",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " %sv%d %s requested=%zu p=%p end=%p xdr=[%p,%zu]/%u/[%p,%zu]/%u\n",
__entry->task_id, __entry->client_id,
__get_str(progname), __entry->version, __get_str(procedure),
__entry->requested, __entry->p, __entry->end,
@@ -727,8 +738,8 @@ TRACE_EVENT(rpc_xdr_alignment,
__entry->len = xdr->buf->len;
),

- TP_printk(
- "task:%u@%u %sv%d %s offset=%zu copied=%u xdr=[%p,%zu]/%u/[%p,%zu]/%u\n",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " %sv%d %s offset=%zu copied=%u xdr=[%p,%zu]/%u/[%p,%zu]/%u\n",
__entry->task_id, __entry->client_id,
__get_str(progname), __entry->version, __get_str(procedure),
__entry->offset, __entry->copied,
@@ -917,7 +928,8 @@ TRACE_EVENT(rpc_socket_nospace,
__entry->remaining = rqst->rq_slen - transport->xmit.offset;
),

- TP_printk("task:%u@%u total=%u remaining=%u",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " total=%u remaining=%u",
__entry->task_id, __entry->client_id,
__entry->total, __entry->remaining
)
@@ -1042,8 +1054,8 @@ TRACE_EVENT(xprt_transmit,
__entry->status = status;
),

- TP_printk(
- "task:%u@%u xid=0x%08x seqno=%u status=%d",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " xid=0x%08x seqno=%u status=%d",
__entry->task_id, __entry->client_id, __entry->xid,
__entry->seqno, __entry->status)
);
@@ -1082,8 +1094,8 @@ TRACE_EVENT(xprt_retransmit,
__assign_str(procname, rpc_proc_name(task));
),

- TP_printk(
- "task:%u@%u xid=0x%08x %sv%d %s ntrans=%d timeout=%lu",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " xid=0x%08x %sv%d %s ntrans=%d timeout=%lu",
__entry->task_id, __entry->client_id, __entry->xid,
__get_str(progname), __entry->version, __get_str(procname),
__entry->ntrans, __entry->timeout
@@ -1137,7 +1149,8 @@ DECLARE_EVENT_CLASS(xprt_writelock_event,
xprt->snd_task->tk_pid : -1;
),

- TP_printk("task:%u@%u snd_task:%u",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " snd_task:" SUNRPC_TRACE_PID_SPECIFIER,
__entry->task_id, __entry->client_id,
__entry->snd_task_id)
);
@@ -1185,7 +1198,9 @@ DECLARE_EVENT_CLASS(xprt_cong_event,
__entry->wait = test_bit(XPRT_CWND_WAIT, &xprt->state);
),

- TP_printk("task:%u@%u snd_task:%u cong=%lu cwnd=%lu%s",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " snd_task:" SUNRPC_TRACE_PID_SPECIFIER
+ " cong=%lu cwnd=%lu%s",
__entry->task_id, __entry->client_id,
__entry->snd_task_id, __entry->cong, __entry->cwnd,
__entry->wait ? " (wait)" : "")
@@ -1223,7 +1238,7 @@ TRACE_EVENT(xprt_reserve,
__entry->xid = be32_to_cpu(rqst->rq_xid);
),

- TP_printk("task:%u@%u xid=0x%08x",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER " xid=0x%08x",
__entry->task_id, __entry->client_id, __entry->xid
)
);
@@ -1312,7 +1327,8 @@ TRACE_EVENT(rpcb_getport,
__assign_str(servername, task->tk_xprt->servername);
),

- TP_printk("task:%u@%u server=%s program=%u version=%u protocol=%d bind_version=%u",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
+ " server=%s program=%u version=%u protocol=%d bind_version=%u",
__entry->task_id, __entry->client_id, __get_str(servername),
__entry->program, __entry->version, __entry->protocol,
__entry->bind_version
@@ -1342,7 +1358,7 @@ TRACE_EVENT(rpcb_setport,
__entry->port = port;
),

- TP_printk("task:%u@%u status=%d port=%u",
+ TP_printk(SUNRPC_TRACE_TASK_SPECIFIER " status=%d port=%u",
__entry->task_id, __entry->client_id,
__entry->status, __entry->port
)
diff --git a/include/trace/events/sunrpc_base.h b/include/trace/events/sunrpc_base.h
new file mode 100644
index 000000000000..588557d07ea8
--- /dev/null
+++ b/include/trace/events/sunrpc_base.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021 Oracle and/or its affiliates.
+ *
+ * Common types and format specifiers for sunrpc.
+ */
+
+#if !defined(_TRACE_SUNRPC_BASE_H)
+#define _TRACE_SUNRPC_BASE_H
+
+#include <linux/tracepoint.h>
+
+#define SUNRPC_TRACE_PID_SPECIFIER "%08x"
+#define SUNRPC_TRACE_CLID_SPECIFIER "%08x"
+#define SUNRPC_TRACE_TASK_SPECIFIER \
+ "task:" SUNRPC_TRACE_PID_SPECIFIER "@" SUNRPC_TRACE_CLID_SPECIFIER
+
+#endif /* _TRACE_SUNRPC_BASE_H */


2021-10-05 17:14:47

by Chuck Lever

[permalink] [raw]
Subject: [PATCH RFC 3/5] SUNRPC: Use BIT() macro in rpc_show_xprt_state()

Clean up: BIT() is preferred over open-coding the shift.

Signed-off-by: Chuck Lever <[email protected]>
---
include/trace/events/sunrpc.h | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h
index 4fd6299bc907..9caf4533366e 100644
--- a/include/trace/events/sunrpc.h
+++ b/include/trace/events/sunrpc.h
@@ -921,18 +921,18 @@ TRACE_EVENT(rpc_socket_nospace,

#define rpc_show_xprt_state(x) \
__print_flags(x, "|", \
- { (1UL << XPRT_LOCKED), "LOCKED"}, \
- { (1UL << XPRT_CONNECTED), "CONNECTED"}, \
- { (1UL << XPRT_CONNECTING), "CONNECTING"}, \
- { (1UL << XPRT_CLOSE_WAIT), "CLOSE_WAIT"}, \
- { (1UL << XPRT_BOUND), "BOUND"}, \
- { (1UL << XPRT_BINDING), "BINDING"}, \
- { (1UL << XPRT_CLOSING), "CLOSING"}, \
- { (1UL << XPRT_OFFLINE), "OFFLINE"}, \
- { (1UL << XPRT_REMOVE), "REMOVE"}, \
- { (1UL << XPRT_CONGESTED), "CONGESTED"}, \
- { (1UL << XPRT_CWND_WAIT), "CWND_WAIT"}, \
- { (1UL << XPRT_WRITE_SPACE), "WRITE_SPACE"})
+ { BIT(XPRT_LOCKED), "LOCKED" }, \
+ { BIT(XPRT_CONNECTED), "CONNECTED" }, \
+ { BIT(XPRT_CONNECTING), "CONNECTING" }, \
+ { BIT(XPRT_CLOSE_WAIT), "CLOSE_WAIT" }, \
+ { BIT(XPRT_BOUND), "BOUND" }, \
+ { BIT(XPRT_BINDING), "BINDING" }, \
+ { BIT(XPRT_CLOSING), "CLOSING" }, \
+ { BIT(XPRT_OFFLINE), "OFFLINE" }, \
+ { BIT(XPRT_REMOVE), "REMOVE" }, \
+ { BIT(XPRT_CONGESTED), "CONGESTED" }, \
+ { BIT(XPRT_CWND_WAIT), "CWND_WAIT" }, \
+ { BIT(XPRT_WRITE_SPACE), "WRITE_SPACE" })

DECLARE_EVENT_CLASS(rpc_xprt_lifetime_class,
TP_PROTO(


2021-10-05 17:14:50

by Chuck Lever

[permalink] [raw]
Subject: [PATCH RFC 4/5] SUNRPC: Don't dereference xprt->snd_task if it's a cookie

Fixes: e26d9972720e ("SUNRPC: Clean up scheduling of autoclose")
Signed-off-by: Chuck Lever <[email protected]>
---
include/trace/events/sunrpc.h | 9 ++++-----
include/trace/events/sunrpc_base.h | 9 +++++++++
2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h
index 9caf4533366e..83c2a1cb2e3a 100644
--- a/include/trace/events/sunrpc.h
+++ b/include/trace/events/sunrpc.h
@@ -932,7 +932,8 @@ TRACE_EVENT(rpc_socket_nospace,
{ BIT(XPRT_REMOVE), "REMOVE" }, \
{ BIT(XPRT_CONGESTED), "CONGESTED" }, \
{ BIT(XPRT_CWND_WAIT), "CWND_WAIT" }, \
- { BIT(XPRT_WRITE_SPACE), "WRITE_SPACE" })
+ { BIT(XPRT_WRITE_SPACE), "WRITE_SPACE" }, \
+ { BIT(XPRT_SND_IS_COOKIE), "SND_IS_COOKIE" })

DECLARE_EVENT_CLASS(rpc_xprt_lifetime_class,
TP_PROTO(
@@ -1118,8 +1119,7 @@ DECLARE_EVENT_CLASS(xprt_writelock_event,

TP_fast_assign(
SUNRPC_TRACE_TASK_ASSIGN(task);
- __entry->snd_task_id = xprt->snd_task ?
- xprt->snd_task->tk_pid : -1;
+ SUNRPC_TRACE_SNDTASK_ASSIGN(xprt);
),

TP_printk(SUNRPC_TRACE_TASK_SPECIFIER
@@ -1157,8 +1157,7 @@ DECLARE_EVENT_CLASS(xprt_cong_event,

TP_fast_assign(
SUNRPC_TRACE_TASK_ASSIGN(task);
- __entry->snd_task_id = xprt->snd_task ?
- xprt->snd_task->tk_pid : -1;
+ SUNRPC_TRACE_SNDTASK_ASSIGN(xprt);
__entry->cong = xprt->cong;
__entry->cwnd = xprt->cwnd;
__entry->wait = test_bit(XPRT_CWND_WAIT, &xprt->state);
diff --git a/include/trace/events/sunrpc_base.h b/include/trace/events/sunrpc_base.h
index 2cbed4a9a63a..abdffe4f53d6 100644
--- a/include/trace/events/sunrpc_base.h
+++ b/include/trace/events/sunrpc_base.h
@@ -25,6 +25,15 @@
} \
} while (0);

+#define SUNRPC_TRACE_SNDTASK_ASSIGN(x) \
+ do { \
+ if ((x)->snd_task && \
+ !test_bit(XPRT_SND_IS_COOKIE, &(x)->state)) \
+ __entry->snd_task_id = (x)->snd_task->tk_pid; \
+ else \
+ __entry->snd_task_id = SUNRPC_TRACE_PID_SPECIAL; \
+ } while(0);
+
#define SUNRPC_TRACE_PID_SPECIFIER "%08x"
#define SUNRPC_TRACE_CLID_SPECIFIER "%08x"
#define SUNRPC_TRACE_TASK_SPECIFIER \


2021-10-05 17:14:56

by Chuck Lever

[permalink] [raw]
Subject: [PATCH RFC 5/5] NFS: Replace dprintk callsites in nfs_readpage(s)

There are two new events that report slightly different information
for readpage and readpages.

For readpage:
fsx-1387 [006] 380.761896: nfs_aops_readpage: fileid=00:28:2 fhandle=0x36fbbe51 version=1752899355910932437 page_index=32

The index of a synchronous single-page read is reported.

For readpages:

fsx-1387 [006] 380.760847: nfs_aops_readpages: fileid=00:28:2 fhandle=0x36fbbe51 version=1752899355909932456 nr_pages=3

The count of pages requested is reported. readpages does not wait
for the READ requests to complete.

Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfs/nfstrace.h | 70 +++++++++++++++++++++++++++++++++++++++++++++++++++++
fs/nfs/read.c | 8 ++----
2 files changed, 72 insertions(+), 6 deletions(-)

diff --git a/fs/nfs/nfstrace.h b/fs/nfs/nfstrace.h
index e9be65b52bfe..0534d090ee55 100644
--- a/fs/nfs/nfstrace.h
+++ b/fs/nfs/nfstrace.h
@@ -862,6 +862,76 @@ TRACE_EVENT(nfs_sillyrename_unlink,
)
);

+TRACE_EVENT(nfs_aops_readpage,
+ TP_PROTO(
+ const struct inode *inode,
+ struct page *page
+ ),
+
+ TP_ARGS(inode, page),
+
+ TP_STRUCT__entry(
+ __field(dev_t, dev)
+ __field(u32, fhandle)
+ __field(u64, fileid)
+ __field(u64, version)
+ __field(pgoff_t, index)
+ ),
+
+ TP_fast_assign(
+ const struct nfs_inode *nfsi = NFS_I(inode);
+
+ __entry->dev = inode->i_sb->s_dev;
+ __entry->fileid = nfsi->fileid;
+ __entry->fhandle = nfs_fhandle_hash(&nfsi->fh);
+ __entry->version = inode_peek_iversion_raw(inode);
+ __entry->index = page_index(page);
+ ),
+
+ TP_printk(
+ "fileid=%02x:%02x:%llu fhandle=0x%08x version=%llu page_index=%lu",
+ MAJOR(__entry->dev), MINOR(__entry->dev),
+ (unsigned long long)__entry->fileid,
+ __entry->fhandle, __entry->version,
+ __entry->index
+ )
+);
+
+TRACE_EVENT(nfs_aops_readpages,
+ TP_PROTO(
+ const struct inode *inode,
+ unsigned int nr_pages
+ ),
+
+ TP_ARGS(inode, nr_pages),
+
+ TP_STRUCT__entry(
+ __field(dev_t, dev)
+ __field(u32, fhandle)
+ __field(u64, fileid)
+ __field(u64, version)
+ __field(unsigned int, nr_pages)
+ ),
+
+ TP_fast_assign(
+ const struct nfs_inode *nfsi = NFS_I(inode);
+
+ __entry->dev = inode->i_sb->s_dev;
+ __entry->fileid = nfsi->fileid;
+ __entry->fhandle = nfs_fhandle_hash(&nfsi->fh);
+ __entry->version = inode_peek_iversion_raw(inode);
+ __entry->nr_pages = nr_pages;
+ ),
+
+ TP_printk(
+ "fileid=%02x:%02x:%llu fhandle=0x%08x version=%llu nr_pages=%u",
+ MAJOR(__entry->dev), MINOR(__entry->dev),
+ (unsigned long long)__entry->fileid,
+ __entry->fhandle, __entry->version,
+ __entry->nr_pages
+ )
+);
+
TRACE_EVENT(nfs_initiate_read,
TP_PROTO(
const struct nfs_pgio_header *hdr
diff --git a/fs/nfs/read.c b/fs/nfs/read.c
index 08d6cc57cbc3..94690eda2a88 100644
--- a/fs/nfs/read.c
+++ b/fs/nfs/read.c
@@ -337,8 +337,7 @@ int nfs_readpage(struct file *file, struct page *page)
struct inode *inode = page_file_mapping(page)->host;
int ret;

- dprintk("NFS: nfs_readpage (%p %ld@%lu)\n",
- page, PAGE_SIZE, page_index(page));
+ trace_nfs_aops_readpage(inode, page);
nfs_inc_stats(inode, NFSIOS_VFSREADPAGE);

/*
@@ -403,10 +402,7 @@ int nfs_readpages(struct file *file, struct address_space *mapping,
struct inode *inode = mapping->host;
int ret;

- dprintk("NFS: nfs_readpages (%s/%Lu %d)\n",
- inode->i_sb->s_id,
- (unsigned long long)NFS_FILEID(inode),
- nr_pages);
+ trace_nfs_aops_readpages(inode, nr_pages);
nfs_inc_stats(inode, NFSIOS_VFSREADPAGES);

ret = -ESTALE;


2021-10-05 17:24:53

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH RFC 5/5] NFS: Replace dprintk callsites in nfs_readpage(s)



> On Oct 5, 2021, at 1:14 PM, Chuck Lever <[email protected]> wrote:
>
> There are two new events that report slightly different information
> for readpage and readpages.
>
> For readpage:
> fsx-1387 [006] 380.761896: nfs_aops_readpage: fileid=00:28:2 fhandle=0x36fbbe51 version=1752899355910932437 page_index=32
>
> The index of a synchronous single-page read is reported.
>
> For readpages:
>
> fsx-1387 [006] 380.760847: nfs_aops_readpages: fileid=00:28:2 fhandle=0x36fbbe51 version=1752899355909932456 nr_pages=3
>
> The count of pages requested is reported. readpages does not wait
> for the READ requests to complete.
>
> Signed-off-by: Chuck Lever <[email protected]>

Actually, %s/aops_/aop_/g might be a little nicer.


> ---
> fs/nfs/nfstrace.h | 70 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> fs/nfs/read.c | 8 ++----
> 2 files changed, 72 insertions(+), 6 deletions(-)
>
> diff --git a/fs/nfs/nfstrace.h b/fs/nfs/nfstrace.h
> index e9be65b52bfe..0534d090ee55 100644
> --- a/fs/nfs/nfstrace.h
> +++ b/fs/nfs/nfstrace.h
> @@ -862,6 +862,76 @@ TRACE_EVENT(nfs_sillyrename_unlink,
> )
> );
>
> +TRACE_EVENT(nfs_aops_readpage,
> + TP_PROTO(
> + const struct inode *inode,
> + struct page *page
> + ),
> +
> + TP_ARGS(inode, page),
> +
> + TP_STRUCT__entry(
> + __field(dev_t, dev)
> + __field(u32, fhandle)
> + __field(u64, fileid)
> + __field(u64, version)
> + __field(pgoff_t, index)
> + ),
> +
> + TP_fast_assign(
> + const struct nfs_inode *nfsi = NFS_I(inode);
> +
> + __entry->dev = inode->i_sb->s_dev;
> + __entry->fileid = nfsi->fileid;
> + __entry->fhandle = nfs_fhandle_hash(&nfsi->fh);
> + __entry->version = inode_peek_iversion_raw(inode);
> + __entry->index = page_index(page);
> + ),
> +
> + TP_printk(
> + "fileid=%02x:%02x:%llu fhandle=0x%08x version=%llu page_index=%lu",
> + MAJOR(__entry->dev), MINOR(__entry->dev),
> + (unsigned long long)__entry->fileid,
> + __entry->fhandle, __entry->version,
> + __entry->index
> + )
> +);
> +
> +TRACE_EVENT(nfs_aops_readpages,
> + TP_PROTO(
> + const struct inode *inode,
> + unsigned int nr_pages
> + ),
> +
> + TP_ARGS(inode, nr_pages),
> +
> + TP_STRUCT__entry(
> + __field(dev_t, dev)
> + __field(u32, fhandle)
> + __field(u64, fileid)
> + __field(u64, version)
> + __field(unsigned int, nr_pages)
> + ),
> +
> + TP_fast_assign(
> + const struct nfs_inode *nfsi = NFS_I(inode);
> +
> + __entry->dev = inode->i_sb->s_dev;
> + __entry->fileid = nfsi->fileid;
> + __entry->fhandle = nfs_fhandle_hash(&nfsi->fh);
> + __entry->version = inode_peek_iversion_raw(inode);
> + __entry->nr_pages = nr_pages;
> + ),
> +
> + TP_printk(
> + "fileid=%02x:%02x:%llu fhandle=0x%08x version=%llu nr_pages=%u",
> + MAJOR(__entry->dev), MINOR(__entry->dev),
> + (unsigned long long)__entry->fileid,
> + __entry->fhandle, __entry->version,
> + __entry->nr_pages
> + )
> +);
> +
> TRACE_EVENT(nfs_initiate_read,
> TP_PROTO(
> const struct nfs_pgio_header *hdr
> diff --git a/fs/nfs/read.c b/fs/nfs/read.c
> index 08d6cc57cbc3..94690eda2a88 100644
> --- a/fs/nfs/read.c
> +++ b/fs/nfs/read.c
> @@ -337,8 +337,7 @@ int nfs_readpage(struct file *file, struct page *page)
> struct inode *inode = page_file_mapping(page)->host;
> int ret;
>
> - dprintk("NFS: nfs_readpage (%p %ld@%lu)\n",
> - page, PAGE_SIZE, page_index(page));
> + trace_nfs_aops_readpage(inode, page);
> nfs_inc_stats(inode, NFSIOS_VFSREADPAGE);
>
> /*
> @@ -403,10 +402,7 @@ int nfs_readpages(struct file *file, struct address_space *mapping,
> struct inode *inode = mapping->host;
> int ret;
>
> - dprintk("NFS: nfs_readpages (%s/%Lu %d)\n",
> - inode->i_sb->s_id,
> - (unsigned long long)NFS_FILEID(inode),
> - nr_pages);
> + trace_nfs_aops_readpages(inode, nr_pages);
> nfs_inc_stats(inode, NFSIOS_VFSREADPAGES);
>
> ret = -ESTALE;
>
>

--
Chuck Lever



2021-10-07 15:02:18

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH RFC 5/5] NFS: Replace dprintk callsites in nfs_readpage(s)



> On Oct 7, 2021, at 9:05 AM, David Wysochanski <[email protected]> wrote:
>
> On Tue, Oct 5, 2021 at 1:14 PM Chuck Lever <[email protected]> wrote:
> >
> > There are two new events that report slightly different information
> > for readpage and readpages.
> >
> > For readpage:
> > fsx-1387 [006] 380.761896: nfs_aops_readpage: fileid=00:28:2 fhandle=0x36fbbe51 version=1752899355910932437 page_index=32
> >
> > The index of a synchronous single-page read is reported.
> >
> > For readpages:
> >
> > fsx-1387 [006] 380.760847: nfs_aops_readpages: fileid=00:28:2 fhandle=0x36fbbe51 version=1752899355909932456 nr_pages=3
> >
> > The count of pages requested is reported. readpages does not wait
> > for the READ requests to complete.
> >
> > Signed-off-by: Chuck Lever <[email protected]>
> > ---
> > fs/nfs/nfstrace.h | 70 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> > fs/nfs/read.c | 8 ++----
> > 2 files changed, 72 insertions(+), 6 deletions(-)
> >
> > diff --git a/fs/nfs/nfstrace.h b/fs/nfs/nfstrace.h
> > index e9be65b52bfe..0534d090ee55 100644
> > --- a/fs/nfs/nfstrace.h
> > +++ b/fs/nfs/nfstrace.h
> > @@ -862,6 +862,76 @@ TRACE_EVENT(nfs_sillyrename_unlink,
> > )
> > );
> >
> > +TRACE_EVENT(nfs_aops_readpage,
> > + TP_PROTO(
> > + const struct inode *inode,
> > + struct page *page
> > + ),
> > +
> > + TP_ARGS(inode, page),
> > +
> > + TP_STRUCT__entry(
> > + __field(dev_t, dev)
> > + __field(u32, fhandle)
> > + __field(u64, fileid)
> > + __field(u64, version)
> > + __field(pgoff_t, index)
> > + ),
> > +
> > + TP_fast_assign(
> > + const struct nfs_inode *nfsi = NFS_I(inode);
> > +
> > + __entry->dev = inode->i_sb->s_dev;
> > + __entry->fileid = nfsi->fileid;
> > + __entry->fhandle = nfs_fhandle_hash(&nfsi->fh);
> > + __entry->version = inode_peek_iversion_raw(inode);
> > + __entry->index = page_index(page);
> > + ),
> > +
> > + TP_printk(
> > + "fileid=%02x:%02x:%llu fhandle=0x%08x version=%llu page_index=%lu",
> > + MAJOR(__entry->dev), MINOR(__entry->dev),
> > + (unsigned long long)__entry->fileid,
> > + __entry->fhandle, __entry->version,
> > + __entry->index
> > + )
> > +);
> > +
> > +TRACE_EVENT(nfs_aops_readpages,
> > + TP_PROTO(
> > + const struct inode *inode,
> > + unsigned int nr_pages
> > + ),
> > +
> > + TP_ARGS(inode, nr_pages),
> > +
> > + TP_STRUCT__entry(
> > + __field(dev_t, dev)
> > + __field(u32, fhandle)
> > + __field(u64, fileid)
> > + __field(u64, version)
> > + __field(unsigned int, nr_pages)
> > + ),
> > +
> > + TP_fast_assign(
> > + const struct nfs_inode *nfsi = NFS_I(inode);
> > +
> > + __entry->dev = inode->i_sb->s_dev;
> > + __entry->fileid = nfsi->fileid;
> > + __entry->fhandle = nfs_fhandle_hash(&nfsi->fh);
> > + __entry->version = inode_peek_iversion_raw(inode);
> > + __entry->nr_pages = nr_pages;
> > + ),
> > +
> > + TP_printk(
> > + "fileid=%02x:%02x:%llu fhandle=0x%08x version=%llu nr_pages=%u",
> > + MAJOR(__entry->dev), MINOR(__entry->dev),
> > + (unsigned long long)__entry->fileid,
> > + __entry->fhandle, __entry->version,
> > + __entry->nr_pages
> > + )
> > +);
> > +
> > TRACE_EVENT(nfs_initiate_read,
> > TP_PROTO(
> > const struct nfs_pgio_header *hdr
> > diff --git a/fs/nfs/read.c b/fs/nfs/read.c
> > index 08d6cc57cbc3..94690eda2a88 100644
> > --- a/fs/nfs/read.c
> > +++ b/fs/nfs/read.c
> > @@ -337,8 +337,7 @@ int nfs_readpage(struct file *file, struct page *page)
> > struct inode *inode = page_file_mapping(page)->host;
> > int ret;
> >
> > - dprintk("NFS: nfs_readpage (%p %ld@%lu)\n",
> > - page, PAGE_SIZE, page_index(page));
> > + trace_nfs_aops_readpage(inode, page);
> > nfs_inc_stats(inode, NFSIOS_VFSREADPAGE);
> >
> > /*
> > @@ -403,10 +402,7 @@ int nfs_readpages(struct file *file, struct address_space *mapping,
> > struct inode *inode = mapping->host;
> > int ret;
> >
> > - dprintk("NFS: nfs_readpages (%s/%Lu %d)\n",
> > - inode->i_sb->s_id,
> > - (unsigned long long)NFS_FILEID(inode),
> > - nr_pages);
> > + trace_nfs_aops_readpages(inode, nr_pages);
> > nfs_inc_stats(inode, NFSIOS_VFSREADPAGES);
> >
> > ret = -ESTALE;
> >
> >
>
> I added this on top of my fscache patches and have been testing.
> Should we be tracing (only?) the return point with the return
> value?

The purpose of the entry point is: you get a timestamp, filehandle
information, and you know what is driving the READ request (sync
read or async readahead).

There is a dprintk() at the top of the function as well as a
performance metric counter, but there currently isn't a dprintk()
at the bottom of the function. So I assumed the return code is
not a critical piece of information. I'm willing to be educated,
though.

A return point trace event could be generated only when there
is an unexpected error condition, for example, to reduce trace
log noise?


> bigfile-6279 [004] ..... 11550.387232: nfs_aops_readpages: fileid=00:2f:26127 fhandle=0xb6d0e8f0 version=1633611037513339503 nr_pages=32
> bigfile-6279 [004] ..... 11550.387236: nfs_fscache_page_event_read: fileid=00:2f:26127 fhandle=0xb6d0e8f0 offset=0 count=4096
> bigfile-6279 [004] ..... 11550.387237: nfs_fscache_page_event_read_done: fileid=00:2f:26127 fhandle=0xb6d0e8f0 offset=0 count=4096 error=-105
> bigfile-6279 [004] ..... 11550.387248: nfs_fscache_page_event_read: fileid=00:2f:26127 fhandle=0xb6d0e8f0 offset=4096 count=4096
> bigfile-6279 [004] ..... 11550.387248: nfs_fscache_page_event_read_done: fileid=00:2f:26127 fhandle=0xb6d0e8f0 offset=4096 count=4096 error=-105
> bigfile-6279 [004] ..... 11550.387250: nfs_fscache_page_event_read: fileid=00:2f:26127 fhandle=0xb6d0e8f0 offset=8192 count=4096
> bigfile-6279 [004] ..... 11550.387250: nfs_fscache_page_event_read_done: fileid=00:2f:26127 fhandle=0xb6d0e8f0 offset=8192 count=4096 error=-105


--
Chuck Lever



2021-10-07 15:54:22

by David Wysochanski

[permalink] [raw]
Subject: Re: [PATCH RFC 5/5] NFS: Replace dprintk callsites in nfs_readpage(s)

On Thu, Oct 7, 2021 at 11:01 AM Chuck Lever III <[email protected]> wrote:
>
>
>
> > On Oct 7, 2021, at 9:05 AM, David Wysochanski <[email protected]> wrote:
> >
> > On Tue, Oct 5, 2021 at 1:14 PM Chuck Lever <[email protected]> wrote:
> > >
> > > There are two new events that report slightly different information
> > > for readpage and readpages.
> > >
> > > For readpage:
> > > fsx-1387 [006] 380.761896: nfs_aops_readpage: fileid=00:28:2 fhandle=0x36fbbe51 version=1752899355910932437 page_index=32
> > >
> > > The index of a synchronous single-page read is reported.
> > >
> > > For readpages:
> > >
> > > fsx-1387 [006] 380.760847: nfs_aops_readpages: fileid=00:28:2 fhandle=0x36fbbe51 version=1752899355909932456 nr_pages=3
> > >
> > > The count of pages requested is reported. readpages does not wait
> > > for the READ requests to complete.
> > >
> > > Signed-off-by: Chuck Lever <[email protected]>
> > > ---
> > > fs/nfs/nfstrace.h | 70 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > fs/nfs/read.c | 8 ++----
> > > 2 files changed, 72 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/fs/nfs/nfstrace.h b/fs/nfs/nfstrace.h
> > > index e9be65b52bfe..0534d090ee55 100644
> > > --- a/fs/nfs/nfstrace.h
> > > +++ b/fs/nfs/nfstrace.h
> > > @@ -862,6 +862,76 @@ TRACE_EVENT(nfs_sillyrename_unlink,
> > > )
> > > );
> > >
> > > +TRACE_EVENT(nfs_aops_readpage,
> > > + TP_PROTO(
> > > + const struct inode *inode,
> > > + struct page *page
> > > + ),
> > > +
> > > + TP_ARGS(inode, page),
> > > +
> > > + TP_STRUCT__entry(
> > > + __field(dev_t, dev)
> > > + __field(u32, fhandle)
> > > + __field(u64, fileid)
> > > + __field(u64, version)
> > > + __field(pgoff_t, index)
> > > + ),
> > > +
> > > + TP_fast_assign(
> > > + const struct nfs_inode *nfsi = NFS_I(inode);
> > > +
> > > + __entry->dev = inode->i_sb->s_dev;
> > > + __entry->fileid = nfsi->fileid;
> > > + __entry->fhandle = nfs_fhandle_hash(&nfsi->fh);
> > > + __entry->version = inode_peek_iversion_raw(inode);
> > > + __entry->index = page_index(page);
> > > + ),
> > > +
> > > + TP_printk(
> > > + "fileid=%02x:%02x:%llu fhandle=0x%08x version=%llu page_index=%lu",
> > > + MAJOR(__entry->dev), MINOR(__entry->dev),
> > > + (unsigned long long)__entry->fileid,
> > > + __entry->fhandle, __entry->version,
> > > + __entry->index
> > > + )
> > > +);
> > > +
> > > +TRACE_EVENT(nfs_aops_readpages,
> > > + TP_PROTO(
> > > + const struct inode *inode,
> > > + unsigned int nr_pages
> > > + ),
> > > +
> > > + TP_ARGS(inode, nr_pages),
> > > +
> > > + TP_STRUCT__entry(
> > > + __field(dev_t, dev)
> > > + __field(u32, fhandle)
> > > + __field(u64, fileid)
> > > + __field(u64, version)
> > > + __field(unsigned int, nr_pages)
> > > + ),
> > > +
> > > + TP_fast_assign(
> > > + const struct nfs_inode *nfsi = NFS_I(inode);
> > > +
> > > + __entry->dev = inode->i_sb->s_dev;
> > > + __entry->fileid = nfsi->fileid;
> > > + __entry->fhandle = nfs_fhandle_hash(&nfsi->fh);
> > > + __entry->version = inode_peek_iversion_raw(inode);
> > > + __entry->nr_pages = nr_pages;
> > > + ),
> > > +
> > > + TP_printk(
> > > + "fileid=%02x:%02x:%llu fhandle=0x%08x version=%llu nr_pages=%u",
> > > + MAJOR(__entry->dev), MINOR(__entry->dev),
> > > + (unsigned long long)__entry->fileid,
> > > + __entry->fhandle, __entry->version,
> > > + __entry->nr_pages
> > > + )
> > > +);
> > > +
> > > TRACE_EVENT(nfs_initiate_read,
> > > TP_PROTO(
> > > const struct nfs_pgio_header *hdr
> > > diff --git a/fs/nfs/read.c b/fs/nfs/read.c
> > > index 08d6cc57cbc3..94690eda2a88 100644
> > > --- a/fs/nfs/read.c
> > > +++ b/fs/nfs/read.c
> > > @@ -337,8 +337,7 @@ int nfs_readpage(struct file *file, struct page *page)
> > > struct inode *inode = page_file_mapping(page)->host;
> > > int ret;
> > >
> > > - dprintk("NFS: nfs_readpage (%p %ld@%lu)\n",
> > > - page, PAGE_SIZE, page_index(page));
> > > + trace_nfs_aops_readpage(inode, page);
> > > nfs_inc_stats(inode, NFSIOS_VFSREADPAGE);
> > >
> > > /*
> > > @@ -403,10 +402,7 @@ int nfs_readpages(struct file *file, struct address_space *mapping,
> > > struct inode *inode = mapping->host;
> > > int ret;
> > >
> > > - dprintk("NFS: nfs_readpages (%s/%Lu %d)\n",
> > > - inode->i_sb->s_id,
> > > - (unsigned long long)NFS_FILEID(inode),
> > > - nr_pages);
> > > + trace_nfs_aops_readpages(inode, nr_pages);
> > > nfs_inc_stats(inode, NFSIOS_VFSREADPAGES);
> > >
> > > ret = -ESTALE;
> > >
> > >
> >
> > I added this on top of my fscache patches and have been testing.
> > Should we be tracing (only?) the return point with the return
> > value?
>
> The purpose of the entry point is: you get a timestamp, filehandle
> information, and you know what is driving the READ request (sync
> read or async readahead).
>
> There is a dprintk() at the top of the function as well as a
> performance metric counter, but there currently isn't a dprintk()
> at the bottom of the function. So I assumed the return code is
> not a critical piece of information. I'm willing to be educated,
> though.
>

Well, I was trying to understand the various approaches in nfstrace.h.
There are tracepoints that are paired with entry and exit, but I guess
these have multiple pieces of information that may change from
start to finish. Examples:
nfs_refresh_inode_enter
nfs_refresh_inode_exit
nfs_lookup_revalidate_enter
nfs_lookup_revalidate_exit

> A return point trace event could be generated only when there
> is an unexpected error condition, for example, to reduce trace
> log noise?
>
Ok so you would just add a second tracepoint for non-zero returns?

>
> > bigfile-6279 [004] ..... 11550.387232: nfs_aops_readpages: fileid=00:2f:26127 fhandle=0xb6d0e8f0 version=1633611037513339503 nr_pages=32
> > bigfile-6279 [004] ..... 11550.387236: nfs_fscache_page_event_read: fileid=00:2f:26127 fhandle=0xb6d0e8f0 offset=0 count=4096
> > bigfile-6279 [004] ..... 11550.387237: nfs_fscache_page_event_read_done: fileid=00:2f:26127 fhandle=0xb6d0e8f0 offset=0 count=4096 error=-105
> > bigfile-6279 [004] ..... 11550.387248: nfs_fscache_page_event_read: fileid=00:2f:26127 fhandle=0xb6d0e8f0 offset=4096 count=4096
> > bigfile-6279 [004] ..... 11550.387248: nfs_fscache_page_event_read_done: fileid=00:2f:26127 fhandle=0xb6d0e8f0 offset=4096 count=4096 error=-105
> > bigfile-6279 [004] ..... 11550.387250: nfs_fscache_page_event_read: fileid=00:2f:26127 fhandle=0xb6d0e8f0 offset=8192 count=4096
> > bigfile-6279 [004] ..... 11550.387250: nfs_fscache_page_event_read_done: fileid=00:2f:26127 fhandle=0xb6d0e8f0 offset=8192 count=4096 error=-105

Example:
bigfile-6279 [004] ..... 11550.387252: nfs_aops_readpages_error:
fileid=00:2f:26127 fhandle=0xb6d0e8f0 version=1633611037513339503
nr_pages=6 error=-5



>
> --
> Chuck Lever
>
>
>

2021-10-07 16:37:46

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH RFC 5/5] NFS: Replace dprintk callsites in nfs_readpage(s)



> On Oct 7, 2021, at 11:25 AM, David Wysochanski <[email protected]> wrote:
>
> On Thu, Oct 7, 2021 at 11:01 AM Chuck Lever III <[email protected]> wrote:
>>
>>
>>
>>> On Oct 7, 2021, at 9:05 AM, David Wysochanski <[email protected]> wrote:
>>>
>>> On Tue, Oct 5, 2021 at 1:14 PM Chuck Lever <[email protected]> wrote:
>>>>
>>>> There are two new events that report slightly different information
>>>> for readpage and readpages.
>>>>
>>>> For readpage:
>>>> fsx-1387 [006] 380.761896: nfs_aops_readpage: fileid=00:28:2 fhandle=0x36fbbe51 version=1752899355910932437 page_index=32
>>>>
>>>> The index of a synchronous single-page read is reported.
>>>>
>>>> For readpages:
>>>>
>>>> fsx-1387 [006] 380.760847: nfs_aops_readpages: fileid=00:28:2 fhandle=0x36fbbe51 version=1752899355909932456 nr_pages=3
>>>>
>>>> The count of pages requested is reported. readpages does not wait
>>>> for the READ requests to complete.
>>>>
>>>> Signed-off-by: Chuck Lever <[email protected]>
>>>> ---
>>>> fs/nfs/nfstrace.h | 70 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>> fs/nfs/read.c | 8 ++----
>>>> 2 files changed, 72 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/fs/nfs/nfstrace.h b/fs/nfs/nfstrace.h
>>>> index e9be65b52bfe..0534d090ee55 100644
>>>> --- a/fs/nfs/nfstrace.h
>>>> +++ b/fs/nfs/nfstrace.h
>>>> @@ -862,6 +862,76 @@ TRACE_EVENT(nfs_sillyrename_unlink,
>>>> )
>>>> );
>>>>
>>>> +TRACE_EVENT(nfs_aops_readpage,
>>>> + TP_PROTO(
>>>> + const struct inode *inode,
>>>> + struct page *page
>>>> + ),
>>>> +
>>>> + TP_ARGS(inode, page),
>>>> +
>>>> + TP_STRUCT__entry(
>>>> + __field(dev_t, dev)
>>>> + __field(u32, fhandle)
>>>> + __field(u64, fileid)
>>>> + __field(u64, version)
>>>> + __field(pgoff_t, index)
>>>> + ),
>>>> +
>>>> + TP_fast_assign(
>>>> + const struct nfs_inode *nfsi = NFS_I(inode);
>>>> +
>>>> + __entry->dev = inode->i_sb->s_dev;
>>>> + __entry->fileid = nfsi->fileid;
>>>> + __entry->fhandle = nfs_fhandle_hash(&nfsi->fh);
>>>> + __entry->version = inode_peek_iversion_raw(inode);
>>>> + __entry->index = page_index(page);
>>>> + ),
>>>> +
>>>> + TP_printk(
>>>> + "fileid=%02x:%02x:%llu fhandle=0x%08x version=%llu page_index=%lu",
>>>> + MAJOR(__entry->dev), MINOR(__entry->dev),
>>>> + (unsigned long long)__entry->fileid,
>>>> + __entry->fhandle, __entry->version,
>>>> + __entry->index
>>>> + )
>>>> +);
>>>> +
>>>> +TRACE_EVENT(nfs_aops_readpages,
>>>> + TP_PROTO(
>>>> + const struct inode *inode,
>>>> + unsigned int nr_pages
>>>> + ),
>>>> +
>>>> + TP_ARGS(inode, nr_pages),
>>>> +
>>>> + TP_STRUCT__entry(
>>>> + __field(dev_t, dev)
>>>> + __field(u32, fhandle)
>>>> + __field(u64, fileid)
>>>> + __field(u64, version)
>>>> + __field(unsigned int, nr_pages)
>>>> + ),
>>>> +
>>>> + TP_fast_assign(
>>>> + const struct nfs_inode *nfsi = NFS_I(inode);
>>>> +
>>>> + __entry->dev = inode->i_sb->s_dev;
>>>> + __entry->fileid = nfsi->fileid;
>>>> + __entry->fhandle = nfs_fhandle_hash(&nfsi->fh);
>>>> + __entry->version = inode_peek_iversion_raw(inode);
>>>> + __entry->nr_pages = nr_pages;
>>>> + ),
>>>> +
>>>> + TP_printk(
>>>> + "fileid=%02x:%02x:%llu fhandle=0x%08x version=%llu nr_pages=%u",
>>>> + MAJOR(__entry->dev), MINOR(__entry->dev),
>>>> + (unsigned long long)__entry->fileid,
>>>> + __entry->fhandle, __entry->version,
>>>> + __entry->nr_pages
>>>> + )
>>>> +);
>>>> +
>>>> TRACE_EVENT(nfs_initiate_read,
>>>> TP_PROTO(
>>>> const struct nfs_pgio_header *hdr
>>>> diff --git a/fs/nfs/read.c b/fs/nfs/read.c
>>>> index 08d6cc57cbc3..94690eda2a88 100644
>>>> --- a/fs/nfs/read.c
>>>> +++ b/fs/nfs/read.c
>>>> @@ -337,8 +337,7 @@ int nfs_readpage(struct file *file, struct page *page)
>>>> struct inode *inode = page_file_mapping(page)->host;
>>>> int ret;
>>>>
>>>> - dprintk("NFS: nfs_readpage (%p %ld@%lu)\n",
>>>> - page, PAGE_SIZE, page_index(page));
>>>> + trace_nfs_aops_readpage(inode, page);
>>>> nfs_inc_stats(inode, NFSIOS_VFSREADPAGE);
>>>>
>>>> /*
>>>> @@ -403,10 +402,7 @@ int nfs_readpages(struct file *file, struct address_space *mapping,
>>>> struct inode *inode = mapping->host;
>>>> int ret;
>>>>
>>>> - dprintk("NFS: nfs_readpages (%s/%Lu %d)\n",
>>>> - inode->i_sb->s_id,
>>>> - (unsigned long long)NFS_FILEID(inode),
>>>> - nr_pages);
>>>> + trace_nfs_aops_readpages(inode, nr_pages);
>>>> nfs_inc_stats(inode, NFSIOS_VFSREADPAGES);
>>>>
>>>> ret = -ESTALE;
>>>>
>>>>
>>>
>>> I added this on top of my fscache patches and have been testing.
>>> Should we be tracing (only?) the return point with the return
>>> value?
>>
>> The purpose of the entry point is: you get a timestamp, filehandle
>> information, and you know what is driving the READ request (sync
>> read or async readahead).
>>
>> There is a dprintk() at the top of the function as well as a
>> performance metric counter, but there currently isn't a dprintk()
>> at the bottom of the function. So I assumed the return code is
>> not a critical piece of information. I'm willing to be educated,
>> though.
>>
>
> Well, I was trying to understand the various approaches in nfstrace.h.
> There are tracepoints that are paired with entry and exit, but I guess
> these have multiple pieces of information that may change from
> start to finish. Examples:
> nfs_refresh_inode_enter
> nfs_refresh_inode_exit
> nfs_lookup_revalidate_enter
> nfs_lookup_revalidate_exit

These were added over many years by several different people.
Unfortunately there isn't much strategy here at this point.

But yes, especially for nfs_readpages, the *pages argument
is a list that is drained by read_cache_pages() so it wouldn't
have the same value at the end of the function (not that the
trace point is recording it, but you get the idea).


>> A return point trace event could be generated only when there
>> is an unexpected error condition, for example, to reduce trace
>> log noise?
>>
> Ok so you would just add a second tracepoint for non-zero returns?

So for I/O operations, there is precedent for recording the
completion with a _done trace event. I'm redriving this patch
to include an unconditional return point trace event. Stand by...


>>> bigfile-6279 [004] ..... 11550.387232: nfs_aops_readpages: fileid=00:2f:26127 fhandle=0xb6d0e8f0 version=1633611037513339503 nr_pages=32
>>> bigfile-6279 [004] ..... 11550.387236: nfs_fscache_page_event_read: fileid=00:2f:26127 fhandle=0xb6d0e8f0 offset=0 count=4096
>>> bigfile-6279 [004] ..... 11550.387237: nfs_fscache_page_event_read_done: fileid=00:2f:26127 fhandle=0xb6d0e8f0 offset=0 count=4096 error=-105
>>> bigfile-6279 [004] ..... 11550.387248: nfs_fscache_page_event_read: fileid=00:2f:26127 fhandle=0xb6d0e8f0 offset=4096 count=4096
>>> bigfile-6279 [004] ..... 11550.387248: nfs_fscache_page_event_read_done: fileid=00:2f:26127 fhandle=0xb6d0e8f0 offset=4096 count=4096 error=-105
>>> bigfile-6279 [004] ..... 11550.387250: nfs_fscache_page_event_read: fileid=00:2f:26127 fhandle=0xb6d0e8f0 offset=8192 count=4096
>>> bigfile-6279 [004] ..... 11550.387250: nfs_fscache_page_event_read_done: fileid=00:2f:26127 fhandle=0xb6d0e8f0 offset=8192 count=4096 error=-105
>
> Example:
> bigfile-6279 [004] ..... 11550.387252: nfs_aops_readpages_error:
> fileid=00:2f:26127 fhandle=0xb6d0e8f0 version=1633611037513339503
> nr_pages=6 error=-5
>
>
>
>>
>> --
>> Chuck Lever

--
Chuck Lever