2019-10-16 16:19:33

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH 2/3] SUNRPC: The RDMA back channel mustn't disappear while requests are outstanding

If there are RDMA back channel requests either being processed by the
server threads, then we should hold a reference to the transport
to ensure it doesn't get freed from underneath us.

Reported-by: Neil Brown <[email protected]>
Fixes: 63cae47005af ("xprtrdma: Handle incoming backward direction RPC calls")
Signed-off-by: Trond Myklebust <[email protected]>
---
net/sunrpc/xprtrdma/backchannel.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/net/sunrpc/xprtrdma/backchannel.c b/net/sunrpc/xprtrdma/backchannel.c
index 50e075fcdd8f..b458bf53ca69 100644
--- a/net/sunrpc/xprtrdma/backchannel.c
+++ b/net/sunrpc/xprtrdma/backchannel.c
@@ -163,6 +163,7 @@ void xprt_rdma_bc_free_rqst(struct rpc_rqst *rqst)
spin_lock(&xprt->bc_pa_lock);
list_add_tail(&rqst->rq_bc_pa_list, &xprt->bc_pa_list);
spin_unlock(&xprt->bc_pa_lock);
+ xprt_put(xprt);
}

static struct rpc_rqst *rpcrdma_bc_rqst_get(struct rpcrdma_xprt *r_xprt)
@@ -259,6 +260,7 @@ void rpcrdma_bc_receive_call(struct rpcrdma_xprt *r_xprt,

/* Queue rqst for ULP's callback service */
bc_serv = xprt->bc_serv;
+ xprt_get(xprt);
spin_lock(&bc_serv->sv_cb_lock);
list_add(&rqst->rq_bc_list, &bc_serv->sv_cb_list);
spin_unlock(&bc_serv->sv_cb_lock);
--
2.21.0


2019-10-16 16:19:33

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH 3/3] SUNRPC: Destroy the back channel when we destroy the host transport

When we're destroying the host transport mechanism, we should ensure
that we do not leak memory by failing to release any back channel
slots that might still exist.

Reported-by: Neil Brown <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
---
net/sunrpc/xprt.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 8a45b3ccc313..41df4c507193 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -1942,6 +1942,11 @@ static void xprt_destroy_cb(struct work_struct *work)
rpc_destroy_wait_queue(&xprt->sending);
rpc_destroy_wait_queue(&xprt->backlog);
kfree(xprt->servername);
+ /*
+ * Destroy any existing back channel
+ */
+ xprt_destroy_backchannel(xprt, UINT_MAX);
+
/*
* Tear down transport state and free the rpc_xprt
*/
--
2.21.0

2019-10-17 18:38:35

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH 3/3] SUNRPC: Destroy the back channel when we destroy the host transport

On Wed, Oct 16 2019, Trond Myklebust wrote:

> When we're destroying the host transport mechanism, we should ensure
> that we do not leak memory by failing to release any back channel
> slots that might still exist.
>
> Reported-by: Neil Brown <[email protected]>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> net/sunrpc/xprt.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> index 8a45b3ccc313..41df4c507193 100644
> --- a/net/sunrpc/xprt.c
> +++ b/net/sunrpc/xprt.c
> @@ -1942,6 +1942,11 @@ static void xprt_destroy_cb(struct work_struct *work)
> rpc_destroy_wait_queue(&xprt->sending);
> rpc_destroy_wait_queue(&xprt->backlog);
> kfree(xprt->servername);
> + /*
> + * Destroy any existing back channel
> + */
> + xprt_destroy_backchannel(xprt, UINT_MAX);
> +

This will cause xprt->bc_alloc_max to become meaningless.
That isn't really a problem as the xprt is about to be freed, but it
still seems a little untidy - fragile maybe.
How about another line in the comment:

* Note: this corrupts ->bc_alloc_max, but it is too late for that to
* matter.
??

Also, possibly add
WARN_ON(atomic_read(&xprt->bc_slot_count);
either before or after the xprt_destroy_backchannel - because there
really shouldn't be any requests by this stage.

Thanks,
NeilBrown


> /*
> * Tear down transport state and free the rpc_xprt
> */
> --
> 2.21.0


Attachments:
signature.asc (847.00 B)

2019-10-18 16:00:44

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH 3/3] SUNRPC: Destroy the back channel when we destroy the host transport

On Thu, 2019-10-17 at 09:08 +1100, NeilBrown wrote:
> On Wed, Oct 16 2019, Trond Myklebust wrote:
>
> > When we're destroying the host transport mechanism, we should
> > ensure
> > that we do not leak memory by failing to release any back channel
> > slots that might still exist.
> >
> > Reported-by: Neil Brown <[email protected]>
> > Signed-off-by: Trond Myklebust <[email protected]>
> > ---
> > net/sunrpc/xprt.c | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> > index 8a45b3ccc313..41df4c507193 100644
> > --- a/net/sunrpc/xprt.c
> > +++ b/net/sunrpc/xprt.c
> > @@ -1942,6 +1942,11 @@ static void xprt_destroy_cb(struct
> > work_struct *work)
> > rpc_destroy_wait_queue(&xprt->sending);
> > rpc_destroy_wait_queue(&xprt->backlog);
> > kfree(xprt->servername);
> > + /*
> > + * Destroy any existing back channel
> > + */
> > + xprt_destroy_backchannel(xprt, UINT_MAX);
> > +
>
> This will cause xprt->bc_alloc_max to become meaningless.

Fixed in v2.

> That isn't really a problem as the xprt is about to be freed, but it
> still seems a little untidy - fragile maybe.
> How about another line in the comment:
>
> * Note: this corrupts ->bc_alloc_max, but it is too late for that
> to
> * matter.
> ??
>
> Also, possibly add
> WARN_ON(atomic_read(&xprt->bc_slot_count);
> either before or after the xprt_destroy_backchannel - because there
> really shouldn't be any requests by this stage.

I considered it, but since RDMA doesn't use those variable, it wouldn't
really be a sufficient check.

Thanks
Trond

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]