2021-03-31 17:24:00

by Chuck Lever

[permalink] [raw]
Subject: [PATCH 1/3] SUNRPC: Move fault injection call sites

I've hit some crashes that occur in the xprt_rdma_inject_disconnect
path. It appears that, for some provides, rdma_disconnect() can
take so long that the transport can disconnect and release its
hardware resources while rdma_disconnect() is still running,
resulting in a UAF in the provider.

The transport's fault injection method may depend on the stability
of transport data structures. That means it needs to be invoked
only from contexts that hold the transport write lock.

Fixes: 4a0682583988 ("SUNRPC: Transport fault injection")
Signed-off-by: Chuck Lever <[email protected]>
---
net/sunrpc/clnt.c | 1 -
net/sunrpc/xprt.c | 6 ++++--
net/sunrpc/xprtrdma/transport.c | 6 ++++--
3 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 612f0a641f4c..c2a01125be1a 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -1799,7 +1799,6 @@ call_allocate(struct rpc_task *task)

status = xprt->ops->buf_alloc(task);
trace_rpc_buf_alloc(task, status);
- xprt_inject_disconnect(xprt);
if (status == 0)
return;
if (status != -ENOMEM) {
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 691ccf8049a4..d616b93751d8 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -1483,7 +1483,10 @@ bool xprt_prepare_transmit(struct rpc_task *task)

void xprt_end_transmit(struct rpc_task *task)
{
- xprt_release_write(task->tk_rqstp->rq_xprt, task);
+ struct rpc_xprt *xprt = task->tk_rqstp->rq_xprt;
+
+ xprt_inject_disconnect(xprt);
+ xprt_release_write(xprt, task);
}

/**
@@ -1885,7 +1888,6 @@ void xprt_release(struct rpc_task *task)
spin_unlock(&xprt->transport_lock);
if (req->rq_buffer)
xprt->ops->buf_free(task);
- xprt_inject_disconnect(xprt);
xdr_free_bvec(&req->rq_rcv_buf);
xdr_free_bvec(&req->rq_snd_buf);
if (req->rq_cred != NULL)
diff --git a/net/sunrpc/xprtrdma/transport.c b/net/sunrpc/xprtrdma/transport.c
index 78d29d1bcc20..09953597d055 100644
--- a/net/sunrpc/xprtrdma/transport.c
+++ b/net/sunrpc/xprtrdma/transport.c
@@ -262,8 +262,10 @@ xprt_rdma_connect_worker(struct work_struct *work)
* xprt_rdma_inject_disconnect - inject a connection fault
* @xprt: transport context
*
- * If @xprt is connected, disconnect it to simulate spurious connection
- * loss.
+ * If @xprt is connected, disconnect it to simulate spurious
+ * connection loss. Caller must hold @xprt's send lock to
+ * ensure that data structures and hardware resources are
+ * stable during the rdma_disconnect() call.
*/
static void
xprt_rdma_inject_disconnect(struct rpc_xprt *xprt)



2021-03-31 17:25:08

by Chuck Lever

[permalink] [raw]
Subject: [PATCH 3/3] SUNRPC: Add tracepoint that fires when an RPC is retransmitted

A separate tracepoint can be left enabled all the time to capture
rare but important retransmission events. So for example:

kworker/u26:3-568 [009] 156.967933: xprt_retransmit: task:44093@5 xid=0xa25dbc79 nfsv3 WRITE ntrans=2

Or, for example, enable all nfs and nfs4 tracepoints, and set up a
trigger to disable tracing when xprt_retransmit fires to capture
everything that leads up to it.

Signed-off-by: Chuck Lever <[email protected]>
---
include/trace/events/sunrpc.h | 40 ++++++++++++++++++++++++++++++++++++++++
net/sunrpc/xprt.c | 4 +++-
2 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h
index 690988530d60..8e68268e2a98 100644
--- a/include/trace/events/sunrpc.h
+++ b/include/trace/events/sunrpc.h
@@ -1079,6 +1079,46 @@ TRACE_EVENT(xprt_transmit,
__entry->seqno, __entry->status)
);

+TRACE_EVENT(xprt_retransmit,
+ TP_PROTO(
+ const struct rpc_rqst *rqst
+ ),
+
+ TP_ARGS(rqst),
+
+ TP_STRUCT__entry(
+ __field(unsigned int, task_id)
+ __field(unsigned int, client_id)
+ __field(u32, xid)
+ __field(int, ntrans)
+ __field(int, version)
+ __string(progname,
+ rqst->rq_task->tk_client->cl_program->name)
+ __string(procedure,
+ rqst->rq_task->tk_msg.rpc_proc->p_name)
+ ),
+
+ TP_fast_assign(
+ struct rpc_task *task = rqst->rq_task;
+
+ __entry->task_id = task->tk_pid;
+ __entry->client_id = task->tk_client ?
+ task->tk_client->cl_clid : -1;
+ __entry->xid = be32_to_cpu(rqst->rq_xid);
+ __entry->ntrans = rqst->rq_ntrans;
+ __assign_str(progname,
+ task->tk_client->cl_program->name)
+ __entry->version = task->tk_client->cl_vers;
+ __assign_str(procedure, task->tk_msg.rpc_proc->p_name)
+ ),
+
+ TP_printk(
+ "task:%u@%u xid=0x%08x %sv%d %s ntrans=%d",
+ __entry->task_id, __entry->client_id, __entry->xid,
+ __get_str(progname), __entry->version, __get_str(procedure),
+ __entry->ntrans)
+);
+
TRACE_EVENT(xprt_ping,
TP_PROTO(const struct rpc_xprt *xprt, int status),

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index b694af4504c4..fcdb17666d2d 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -1540,8 +1540,10 @@ xprt_request_transmit(struct rpc_rqst *req, struct rpc_task *snd_task)
return status;
}

- if (is_retrans)
+ if (is_retrans) {
task->tk_client->cl_stats->rpcretrans++;
+ trace_xprt_retransmit(req);
+ }

xprt_inject_disconnect(xprt);



2021-03-31 17:25:50

by Chuck Lever

[permalink] [raw]
Subject: [PATCH 2/3] SUNRPC: Fix trace_xprt_transmit_queued()

This tracepoint can crash when dereferencing snd_task because
when some transports connect, they put a cookie in that field
instead of a pointer to an rpc_task.

BUG: KASAN: use-after-free in trace_event_raw_event_xprt_writelock_event+0x141/0x18e [sunrpc]
Read of size 2 at addr ffff8881a83bd3a0 by task git/331872

CPU: 11 PID: 331872 Comm: git Tainted: G S 5.12.0-rc2-00007-g3ab6e585a7f9 #1453
Hardware name: Supermicro SYS-6028R-T/X10DRi, BIOS 1.1a 10/16/2015
Call Trace:
dump_stack+0x9c/0xcf
print_address_description.constprop.0+0x18/0x239
kasan_report+0x174/0x1b0
trace_event_raw_event_xprt_writelock_event+0x141/0x18e [sunrpc]
xprt_prepare_transmit+0x8e/0xc1 [sunrpc]
call_transmit+0x4d/0xc6 [sunrpc]

Fixes: 9ce07ae5eb1d ("SUNRPC: Replace dprintk() call site in xprt_prepare_transmit")
Signed-off-by: Chuck Lever <[email protected]>
---
include/trace/events/sunrpc.h | 35 ++++++++++++++++++++++++++++++++++-
net/sunrpc/xprt.c | 2 +-
2 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h
index 036eb1f5c133..690988530d60 100644
--- a/include/trace/events/sunrpc.h
+++ b/include/trace/events/sunrpc.h
@@ -1141,7 +1141,40 @@ DECLARE_EVENT_CLASS(xprt_writelock_event,

DEFINE_WRITELOCK_EVENT(reserve_xprt);
DEFINE_WRITELOCK_EVENT(release_xprt);
-DEFINE_WRITELOCK_EVENT(transmit_queued);
+
+TRACE_EVENT(xprt_transmit_queued,
+ TP_PROTO(
+ const struct rpc_task *task
+ ),
+
+ TP_ARGS(task),
+
+ TP_STRUCT__entry(
+ __field(unsigned int, task_id)
+ __field(unsigned int, client_id)
+ __field(unsigned long, runstate)
+ __field(u32, xid)
+ __field(int, status)
+ __field(unsigned short, flags)
+ ),
+
+ TP_fast_assign(
+ __entry->task_id = task->tk_pid;
+ __entry->client_id =
+ task->tk_client ? task->tk_client->cl_clid : -1;
+ __entry->runstate = task->tk_runstate;
+ __entry->xid = be32_to_cpu(task->tk_rqstp->rq_xid);
+ __entry->status = task->tk_status;
+ __entry->flags = task->tk_flags;
+ ),
+
+ TP_printk("task:%u@%u xid=0x%08x flags=%s runstate=%s status=%d",
+ __entry->task_id, __entry->client_id, __entry->xid,
+ rpc_show_task_flags(__entry->flags),
+ rpc_show_runstate(__entry->runstate),
+ __entry->status
+ )
+);

DECLARE_EVENT_CLASS(xprt_cong_event,
TP_PROTO(
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index d616b93751d8..b694af4504c4 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -1469,7 +1469,7 @@ bool xprt_prepare_transmit(struct rpc_task *task)
struct rpc_xprt *xprt = req->rq_xprt;

if (!xprt_lock_write(xprt, task)) {
- trace_xprt_transmit_queued(xprt, task);
+ trace_xprt_transmit_queued(task);

/* Race breaker: someone may have transmitted us */
if (!test_bit(RPC_TASK_NEED_XMIT, &task->tk_runstate))


2021-03-31 17:53:52

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH 2/3] SUNRPC: Fix trace_xprt_transmit_queued()

On Wed, 2021-03-31 at 13:22 -0400, Chuck Lever wrote:
> This tracepoint can crash when dereferencing snd_task because
> when some transports connect, they put a cookie in that field
> instead of a pointer to an rpc_task.
>
> BUG: KASAN: use-after-free in
> trace_event_raw_event_xprt_writelock_event+0x141/0x18e [sunrpc]
> Read of size 2 at addr ffff8881a83bd3a0 by task git/331872
>
> CPU: 11 PID: 331872 Comm: git Tainted: G S                5.12.0-rc2-
> 00007-g3ab6e585a7f9 #1453
> Hardware name: Supermicro SYS-6028R-T/X10DRi, BIOS 1.1a 10/16/2015
> Call Trace:
>  dump_stack+0x9c/0xcf
>  print_address_description.constprop.0+0x18/0x239
>  kasan_report+0x174/0x1b0
>  trace_event_raw_event_xprt_writelock_event+0x141/0x18e [sunrpc]
>  xprt_prepare_transmit+0x8e/0xc1 [sunrpc]
>  call_transmit+0x4d/0xc6 [sunrpc]
>
> Fixes: 9ce07ae5eb1d ("SUNRPC: Replace dprintk() call site in
> xprt_prepare_transmit")
> Signed-off-by: Chuck Lever <[email protected]>
> ---
>  include/trace/events/sunrpc.h |   35
> ++++++++++++++++++++++++++++++++++-
>  net/sunrpc/xprt.c             |    2 +-
>  2 files changed, 35 insertions(+), 2 deletions(-)
>
> diff --git a/include/trace/events/sunrpc.h
> b/include/trace/events/sunrpc.h
> index 036eb1f5c133..690988530d60 100644
> --- a/include/trace/events/sunrpc.h
> +++ b/include/trace/events/sunrpc.h
> @@ -1141,7 +1141,40 @@ DECLARE_EVENT_CLASS(xprt_writelock_event,
>  
>  DEFINE_WRITELOCK_EVENT(reserve_xprt);
>  DEFINE_WRITELOCK_EVENT(release_xprt);
> -DEFINE_WRITELOCK_EVENT(transmit_queued);
> +
> +TRACE_EVENT(xprt_transmit_queued,
> +       TP_PROTO(
> +               const struct rpc_task *task
> +       ),
> +
> +       TP_ARGS(task),
> +
> +       TP_STRUCT__entry(
> +               __field(unsigned int, task_id)
> +               __field(unsigned int, client_id)
> +               __field(unsigned long, runstate)
> +               __field(u32, xid)
> +               __field(int, status)
> +               __field(unsigned short, flags)
> +       ),
> +
> +       TP_fast_assign(
> +               __entry->task_id = task->tk_pid;
> +               __entry->client_id =
> +                       task->tk_client ? task->tk_client->cl_clid :
> -1;
> +               __entry->runstate = task->tk_runstate;
> +               __entry->xid = be32_to_cpu(task->tk_rqstp->rq_xid);
> +               __entry->status = task->tk_status;
> +               __entry->flags = task->tk_flags;
> +       ),
> +
> +       TP_printk("task:%u@%u xid=0x%08x flags=%s runstate=%s
> status=%d",
> +               __entry->task_id, __entry->client_id, __entry->xid,
> +               rpc_show_task_flags(__entry->flags),
> +               rpc_show_runstate(__entry->runstate),
> +               __entry->status
> +       )
> +);
>  
>  DECLARE_EVENT_CLASS(xprt_cong_event,
>         TP_PROTO(
> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> index d616b93751d8..b694af4504c4 100644
> --- a/net/sunrpc/xprt.c
> +++ b/net/sunrpc/xprt.c
> @@ -1469,7 +1469,7 @@ bool xprt_prepare_transmit(struct rpc_task
> *task)
>         struct rpc_xprt *xprt = req->rq_xprt;
>  
>         if (!xprt_lock_write(xprt, task)) {
> -               trace_xprt_transmit_queued(xprt, task);
> +               trace_xprt_transmit_queued(task);

Why do we need this tracepoint? The event we're logging is just
"grabbing the transport write lock failed due to external
circumstances".
 
>                 /* Race breaker: someone may have transmitted us */
>                 if (!test_bit(RPC_TASK_NEED_XMIT, &task-
> >tk_runstate))
>
>

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]


2021-03-31 19:23:49

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH 2/3] SUNRPC: Fix trace_xprt_transmit_queued()



> On Mar 31, 2021, at 1:52 PM, Trond Myklebust <[email protected]> wrote:
>
> On Wed, 2021-03-31 at 13:22 -0400, Chuck Lever wrote:
>> This tracepoint can crash when dereferencing snd_task because
>> when some transports connect, they put a cookie in that field
>> instead of a pointer to an rpc_task.
>>
>> BUG: KASAN: use-after-free in
>> trace_event_raw_event_xprt_writelock_event+0x141/0x18e [sunrpc]
>> Read of size 2 at addr ffff8881a83bd3a0 by task git/331872
>>
>> CPU: 11 PID: 331872 Comm: git Tainted: G S 5.12.0-rc2-
>> 00007-g3ab6e585a7f9 #1453
>> Hardware name: Supermicro SYS-6028R-T/X10DRi, BIOS 1.1a 10/16/2015
>> Call Trace:
>> dump_stack+0x9c/0xcf
>> print_address_description.constprop.0+0x18/0x239
>> kasan_report+0x174/0x1b0
>> trace_event_raw_event_xprt_writelock_event+0x141/0x18e [sunrpc]
>> xprt_prepare_transmit+0x8e/0xc1 [sunrpc]
>> call_transmit+0x4d/0xc6 [sunrpc]
>>
>> Fixes: 9ce07ae5eb1d ("SUNRPC: Replace dprintk() call site in
>> xprt_prepare_transmit")
>> Signed-off-by: Chuck Lever <[email protected]>
>> ---
>> include/trace/events/sunrpc.h | 35
>> ++++++++++++++++++++++++++++++++++-
>> net/sunrpc/xprt.c | 2 +-
>> 2 files changed, 35 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/trace/events/sunrpc.h
>> b/include/trace/events/sunrpc.h
>> index 036eb1f5c133..690988530d60 100644
>> --- a/include/trace/events/sunrpc.h
>> +++ b/include/trace/events/sunrpc.h
>> @@ -1141,7 +1141,40 @@ DECLARE_EVENT_CLASS(xprt_writelock_event,
>>
>> DEFINE_WRITELOCK_EVENT(reserve_xprt);
>> DEFINE_WRITELOCK_EVENT(release_xprt);
>> -DEFINE_WRITELOCK_EVENT(transmit_queued);
>> +
>> +TRACE_EVENT(xprt_transmit_queued,
>> + TP_PROTO(
>> + const struct rpc_task *task
>> + ),
>> +
>> + TP_ARGS(task),
>> +
>> + TP_STRUCT__entry(
>> + __field(unsigned int, task_id)
>> + __field(unsigned int, client_id)
>> + __field(unsigned long, runstate)
>> + __field(u32, xid)
>> + __field(int, status)
>> + __field(unsigned short, flags)
>> + ),
>> +
>> + TP_fast_assign(
>> + __entry->task_id = task->tk_pid;
>> + __entry->client_id =
>> + task->tk_client ? task->tk_client->cl_clid :
>> -1;
>> + __entry->runstate = task->tk_runstate;
>> + __entry->xid = be32_to_cpu(task->tk_rqstp->rq_xid);
>> + __entry->status = task->tk_status;
>> + __entry->flags = task->tk_flags;
>> + ),
>> +
>> + TP_printk("task:%u@%u xid=0x%08x flags=%s runstate=%s
>> status=%d",
>> + __entry->task_id, __entry->client_id, __entry->xid,
>> + rpc_show_task_flags(__entry->flags),
>> + rpc_show_runstate(__entry->runstate),
>> + __entry->status
>> + )
>> +);
>>
>> DECLARE_EVENT_CLASS(xprt_cong_event,
>> TP_PROTO(
>> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
>> index d616b93751d8..b694af4504c4 100644
>> --- a/net/sunrpc/xprt.c
>> +++ b/net/sunrpc/xprt.c
>> @@ -1469,7 +1469,7 @@ bool xprt_prepare_transmit(struct rpc_task
>> *task)
>> struct rpc_xprt *xprt = req->rq_xprt;
>>
>> if (!xprt_lock_write(xprt, task)) {
>> - trace_xprt_transmit_queued(xprt, task);
>> + trace_xprt_transmit_queued(task);
>
> Why do we need this tracepoint? The event we're logging is just
> "grabbing the transport write lock failed due to external
> circumstances".

This tracepoint records the decision to queue rather than send immediately.

You can count the number of these to get an idea whether the transport
is backed up.

You can trigger on this event if you want to capture the events that
lead up to queuing a request.

You can see whether the client logic is making the right choice when
congestion control is in play.


>> /* Race breaker: someone may have transmitted us */
>> if (!test_bit(RPC_TASK_NEED_XMIT, &task-
>>> tk_runstate))
>>
>>
>
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> [email protected]

--
Chuck Lever



2021-03-31 19:41:43

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH 2/3] SUNRPC: Fix trace_xprt_transmit_queued()

On Wed, 2021-03-31 at 19:22 +0000, Chuck Lever III wrote:
>
>
> > On Mar 31, 2021, at 1:52 PM, Trond Myklebust <
> > [email protected]> wrote:
> >
> > On Wed, 2021-03-31 at 13:22 -0400, Chuck Lever wrote:
> > > This tracepoint can crash when dereferencing snd_task because
> > > when some transports connect, they put a cookie in that field
> > > instead of a pointer to an rpc_task.
> > >
> > > BUG: KASAN: use-after-free in
> > > trace_event_raw_event_xprt_writelock_event+0x141/0x18e [sunrpc]
> > > Read of size 2 at addr ffff8881a83bd3a0 by task git/331872
> > >
> > > CPU: 11 PID: 331872 Comm: git Tainted: G S                5.12.0-
> > > rc2-
> > > 00007-g3ab6e585a7f9 #1453
> > > Hardware name: Supermicro SYS-6028R-T/X10DRi, BIOS 1.1a
> > > 10/16/2015
> > > Call Trace:
> > >  dump_stack+0x9c/0xcf
> > >  print_address_description.constprop.0+0x18/0x239
> > >  kasan_report+0x174/0x1b0
> > >  trace_event_raw_event_xprt_writelock_event+0x141/0x18e [sunrpc]
> > >  xprt_prepare_transmit+0x8e/0xc1 [sunrpc]
> > >  call_transmit+0x4d/0xc6 [sunrpc]
> > >
> > > Fixes: 9ce07ae5eb1d ("SUNRPC: Replace dprintk() call site in
> > > xprt_prepare_transmit")
> > > Signed-off-by: Chuck Lever <[email protected]>
> > > ---
> > >  include/trace/events/sunrpc.h |   35
> > > ++++++++++++++++++++++++++++++++++-
> > >  net/sunrpc/xprt.c             |    2 +-
> > >  2 files changed, 35 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/include/trace/events/sunrpc.h
> > > b/include/trace/events/sunrpc.h
> > > index 036eb1f5c133..690988530d60 100644
> > > --- a/include/trace/events/sunrpc.h
> > > +++ b/include/trace/events/sunrpc.h
> > > @@ -1141,7 +1141,40 @@ DECLARE_EVENT_CLASS(xprt_writelock_event,
> > >  
> > >  DEFINE_WRITELOCK_EVENT(reserve_xprt);
> > >  DEFINE_WRITELOCK_EVENT(release_xprt);
> > > -DEFINE_WRITELOCK_EVENT(transmit_queued);
> > > +
> > > +TRACE_EVENT(xprt_transmit_queued,
> > > +       TP_PROTO(
> > > +               const struct rpc_task *task
> > > +       ),
> > > +
> > > +       TP_ARGS(task),
> > > +
> > > +       TP_STRUCT__entry(
> > > +               __field(unsigned int, task_id)
> > > +               __field(unsigned int, client_id)
> > > +               __field(unsigned long, runstate)
> > > +               __field(u32, xid)
> > > +               __field(int, status)
> > > +               __field(unsigned short, flags)
> > > +       ),
> > > +
> > > +       TP_fast_assign(
> > > +               __entry->task_id = task->tk_pid;
> > > +               __entry->client_id =
> > > +                       task->tk_client ? task->tk_client-
> > > >cl_clid :
> > > -1;
> > > +               __entry->runstate = task->tk_runstate;
> > > +               __entry->xid = be32_to_cpu(task->tk_rqstp-
> > > >rq_xid);
> > > +               __entry->status = task->tk_status;
> > > +               __entry->flags = task->tk_flags;
> > > +       ),
> > > +
> > > +       TP_printk("task:%u@%u xid=0x%08x flags=%s runstate=%s
> > > status=%d",
> > > +               __entry->task_id, __entry->client_id, __entry-
> > > >xid,
> > > +               rpc_show_task_flags(__entry->flags),
> > > +               rpc_show_runstate(__entry->runstate),
> > > +               __entry->status
> > > +       )
> > > +);
> > >  
> > >  DECLARE_EVENT_CLASS(xprt_cong_event,
> > >         TP_PROTO(
> > > diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> > > index d616b93751d8..b694af4504c4 100644
> > > --- a/net/sunrpc/xprt.c
> > > +++ b/net/sunrpc/xprt.c
> > > @@ -1469,7 +1469,7 @@ bool xprt_prepare_transmit(struct rpc_task
> > > *task)
> > >         struct rpc_xprt *xprt = req->rq_xprt;
> > >  
> > >         if (!xprt_lock_write(xprt, task)) {
> > > -               trace_xprt_transmit_queued(xprt, task);
> > > +               trace_xprt_transmit_queued(task);
> >
> > Why do we need this tracepoint? The event we're logging is just
> > "grabbing the transport write lock failed due to external
> > circumstances".
>
> This tracepoint records the decision to queue rather than send
> immediately.

The request is already queued before we call xprt_prepare_transmit().

The decision being made here is whether or not we can move forward with
driving the bind or connect or transmission. This tracepoint just
records the case where something is blocking this specific rpc_task
from doing so.

> You can count the number of these to get an idea whether the
> transport
> is backed up.
>
> You can trigger on this event if you want to capture the events that
> lead up to queuing a request.

The lock can be unavailable simply because some other rpc_task is
already in the process of draining the queue.

If you want to find out if transmission is actually blocked, you're
better off triggering on the event that caused the blockage (either a
disconnect or a failure to reconnect or rebind). This tracepoint is
going to cause way too many false positives to be of any use for
tracing blockage.

> You can see whether the client logic is making the right choice when
> congestion control is in play.

Don't we already have a tracepoint in xprt_reserve_xprt_cong() for
this?

>
>
> > >                 /* Race breaker: someone may have transmitted us
> > > */
> > >                 if (!test_bit(RPC_TASK_NEED_XMIT, &task-
> > > > tk_runstate))
> > >
> > >
> >
> > --
> > Trond Myklebust
> > Linux NFS client maintainer, Hammerspace
> > [email protected]
>
> --
> Chuck Lever
>
>
>

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]