These fixes resolved crashes due to resource leak BUG_ON checks. The
resource leaks were detected by introducing asynchronous transport errors.
Signed-off-by: Steve Wise <[email protected]>
Signed-off-by: Tom Tucker <[email protected]>
---
net/sunrpc/xprtrdma/svc_rdma_sendto.c | 3 +++
net/sunrpc/xprtrdma/svc_rdma_transport.c | 3 ++-
2 files changed, 5 insertions(+), 1 deletions(-)
diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
index 6c26a67..8b510c5 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
@@ -183,6 +183,7 @@ static int fast_reg_xdr(struct svcxprt_rdma *xprt,
fatal_err:
printk("svcrdma: Error fast registering memory for xprt %p\n", xprt);
+ vec->frmr = NULL;
svc_rdma_put_frmr(xprt, frmr);
return -EIO;
}
@@ -516,6 +517,7 @@ static int send_reply(struct svcxprt_rdma *rdma,
"svcrdma: could not post a receive buffer, err=%d."
"Closing transport %p.\n", ret, rdma);
set_bit(XPT_CLOSE, &rdma->sc_xprt.xpt_flags);
+ svc_rdma_put_frmr(rdma, vec->frmr);
svc_rdma_put_context(ctxt, 0);
return -ENOTCONN;
}
@@ -606,6 +608,7 @@ static int send_reply(struct svcxprt_rdma *rdma,
return 0;
err:
+ svc_rdma_unmap_dma(ctxt);
svc_rdma_put_frmr(rdma, vec->frmr);
svc_rdma_put_context(ctxt, 1);
return -EIO;
diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c b/net/sunrpc/xprtrdma/svc_rdma_transport.c
index 3d810e7..4b0c2fa 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
@@ -520,8 +520,9 @@ int svc_rdma_post_recv(struct svcxprt_rdma *xprt)
svc_xprt_get(&xprt->sc_xprt);
ret = ib_post_recv(xprt->sc_qp, &recv_wr, &bad_recv_wr);
if (ret) {
- svc_xprt_put(&xprt->sc_xprt);
+ svc_rdma_unmap_dma(ctxt);
svc_rdma_put_context(ctxt, 1);
+ svc_xprt_put(&xprt->sc_xprt);
}
return ret;
Hey Bruce,
J. Bruce Fields wrote:
> On Wed, Apr 29, 2009 at 02:14:00PM -0500, Steve Wise wrote:
>
>> These fixes resolved crashes due to resource leak BUG_ON checks. The
>> resource leaks were detected by introducing asynchronous transport errors.
>>
>
> Thanks, applied for 2.6.30. (And also appropriate for stable (2.6.29),
> I assume?)
>
> But, could someone take a closer look at the error paths here? Questions:
>
> - svc_rdma_post_recv() does a svc_rdma_put_context() on error--
> are you sure its caller needs to as well?
>
The svc_rdma_put_context() call inside svc_rdma_post_recv() is for the
recv context that was allocated inside that function. The caller, in
this case send_reply(), also does a svc_rdma_put_context(), but that is
for the send context. So I think this is correct.
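
To make the ownership split above concrete, here is a minimal user-space
sketch. It is not the kernel code: struct op_ctxt, get_context(),
put_context(), model_post_recv(), model_send_reply() and the live_ctxts
counter are hypothetical stand-ins, and assert() plays the role of the
BUG_ON leak checks. The point is only that each function releases the
context it owns, so the two puts are not a double release.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-operation context; not the kernel type. */
struct op_ctxt {
	int in_use;
};

static int live_ctxts;	/* contexts currently held; must end at zero */

static struct op_ctxt *get_context(void)
{
	struct op_ctxt *c = malloc(sizeof(*c));

	assert(c != NULL);
	c->in_use = 1;
	live_ctxts++;
	return c;
}

static void put_context(struct op_ctxt *c)
{
	assert(c->in_use);	/* stand-in for a BUG_ON on a bad release */
	c->in_use = 0;
	live_ctxts--;
	free(c);
}

/* Models svc_rdma_post_recv(): allocates and owns the recv context. */
int model_post_recv(int fail)
{
	struct op_ctxt *recv_ctxt = get_context();

	if (fail) {
		put_context(recv_ctxt);	/* error path releases its own context */
		return -EIO;
	}
	/*
	 * In the kernel a completion handler would release this later;
	 * it is released immediately here to keep the model small.
	 */
	put_context(recv_ctxt);
	return 0;
}

/* Models send_reply(): owns only the send context passed in. */
int model_send_reply(struct op_ctxt *send_ctxt, int fail_recv)
{
	int ret = model_post_recv(fail_recv);

	/* Release the send context; model_post_recv() handled the recv one. */
	put_context(send_ctxt);
	return ret ? -ENOTCONN : 0;
}
```

With this model, a failed post leaves live_ctxts at zero only because
both puts happen; dropping either one leaks a context.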
> - In send_reply, some of the cleanout is shared between the
> first return -ENOTCONN and the final err: cleanup. Could we
> add another err: label and share some of that cleanup?
>
The only common logic I see is the svc_rdma_put_context() call that
could be shared. But one case calls it with free_pages == 1 after the
pages have been mapped, and the other with 0 since no pages are mapped
at that point (when the call to svc_rdma_post_recv() fails). So I'm
not sure it's worth doing.
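
A rough sketch of why the two exits differ, again as a user-space model
and not the actual send_reply(): struct send_ctxt, attach_page(),
put_context(), model_send_reply(), the fail_point enum and the
pages_held counter are all hypothetical names. The early exit runs
before any page is attached to the send context, so it passes 0; the
err: exit runs after, so it passes 1.

```c
#include <assert.h>
#include <errno.h>

#define MAX_PAGES 4

/* Hypothetical stand-in for the send context; not the kernel type. */
struct send_ctxt {
	int count;		/* pages attached so far */
	int pages[MAX_PAGES];	/* 1 = page reference held */
};

static int pages_held;		/* page references outstanding */

static void attach_page(struct send_ctxt *c)
{
	assert(c->count < MAX_PAGES);
	c->pages[c->count++] = 1;
	pages_held++;
}

/* Models svc_rdma_put_context(ctxt, free_pages). */
static void put_context(struct send_ctxt *c, int free_pages)
{
	if (free_pages) {
		for (int i = 0; i < c->count; i++) {
			c->pages[i] = 0;
			pages_held--;
		}
	}
	c->count = 0;
}

enum fail_point { FAIL_NONE, FAIL_POST_RECV, FAIL_POST_SEND };

int model_send_reply(struct send_ctxt *c, enum fail_point fp)
{
	/* Early exit: posting the receive failed, nothing attached yet. */
	if (fp == FAIL_POST_RECV) {
		put_context(c, 0);
		return -ENOTCONN;
	}

	/* Pages are attached to the context only after that point. */
	attach_page(c);
	attach_page(c);

	if (fp == FAIL_POST_SEND)
		goto err;

	/* Success: a completion handler would release; modelled inline. */
	put_context(c, 1);
	return 0;

err:
	/* Late exit: pages are attached, so they must be released too. */
	put_context(c, 1);
	return -EIO;
}
```

A shared label would have to carry the free_pages value in a variable,
which is about as much code as the one duplicated call it would save.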
Steve.
On Wed, May 13, 2009 at 05:42:23PM -0500, Steve Wise wrote:
> Hey Bruce,
>
> J. Bruce Fields wrote:
>> On Wed, Apr 29, 2009 at 02:14:00PM -0500, Steve Wise wrote:
>>
>>> These fixes resolved crashes due to resource leak BUG_ON checks. The
>>> resource leaks were detected by introducing asynchronous transport errors.
>>>
>>
>> Thanks, applied for 2.6.30. (And also appropriate for stable (2.6.29),
>> I assume?)
>>
>> But, could someone take a closer look at the error paths here? Questions:
>>
>> - svc_rdma_post_recv() does a svc_rdma_put_context() on error--
>> are you sure its caller needs to as well?
>>
>
> The svc_rdma_put_context() call inside svc_rdma_post_recv() is for the
> recv context that was allocated inside that function. The caller, in
> this case send_reply(), also does a svc_rdma_put_context(), but that is
> for the send context. So I think this is correct.
>
>> - In send_reply, some of the cleanout is shared between the
>> first return -ENOTCONN and the final err: cleanup. Could we
>> add another err: label and share some of that cleanup?
>>
>
> The only common logic I see is the svc_rdma_put_context() call that
> could be shared. But one case calls it with free_pages == 1 after the
> pages have been mapped, and the other with 0 since no pages are mapped
> at that point (when the call to svc_rdma_post_recv() fails). So I'm
> not sure it's worth doing.
No, I think you're probably right about both of these. Thanks for
taking a look.
--b.
On Wed, Apr 29, 2009 at 02:14:00PM -0500, Steve Wise wrote:
> These fixes resolved crashes due to resource leak BUG_ON checks. The
> resource leaks were detected by introducing asynchronous transport errors.
Thanks, applied for 2.6.30. (And also appropriate for stable (2.6.29),
I assume?)
But, could someone take a closer look at the error paths here? Questions:
- svc_rdma_post_recv() does a svc_rdma_put_context() on error--
are you sure its caller needs to as well?
- In send_reply, some of the cleanout is shared between the
first return -ENOTCONN and the final err: cleanup. Could we
add another err: label and share some of that cleanup?
--b.