2009-01-05 17:04:54

by Tom Tucker

[permalink] [raw]
Subject: Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports

Trond Myklebust wrote:
> On Sun, 2009-01-04 at 14:12 -0500, Trond Myklebust wrote:
[...snip...]

>> I see nothing that stops svc_delete_xprt() from setting XPT_DEAD after
>> the above test in svc_revisit(), and before the test inside
>> svc_xprt_enqueue(). What's preventing a race there?
>
> I suppose one way to fix it would be to hold the xprt->xpt_lock across
> the above test, and to make sure that you set XPT_DEFERRED while holding
> the lock, and _before_ you test for XPT_DEAD. That way, you guarantee
> that the svc_deferred_dequeue() loop in svc_delete_xprt() will pick up
> anything that races with the setting of XPT_DEAD.
>
> Trond

I think this patch fixes this. Thanks again,

From: Tom Tucker <[email protected]>
Date: Mon, 5 Jan 2009 10:56:03 -0600
Subject: [PATCH] svc: Clean up deferred requests on transport destruction

A race between svc_revisit and svc_delete_xprt can result in
deferred requests holding references on a transport that can never be
recovered because dead transports are not enqueued for subsequent
processing.

Check for XPT_DEAD in revisit to clean up completing deferrals on a dead
transport and sweep a transport's deferred queue to do the same for queued
but unprocessed deferrals.

Signed-off-by: Tom Tucker <[email protected]>

---
net/sunrpc/svc_xprt.c | 29 ++++++++++++++++++++++-------
1 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index bf5b5cd..375a695 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -837,6 +837,15 @@ static void svc_age_temp_xprts(unsigned long closure)
void svc_delete_xprt(struct svc_xprt *xprt)
{
struct svc_serv *serv = xprt->xpt_server;
+ struct svc_deferred_req *dr;
+
+ /* Only do this once */
+ spin_lock(&xprt->xpt_lock);
+ if (test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
+ spin_unlock(&xprt->xpt_lock);
+ return;
+ }
+ spin_unlock(&xprt->xpt_lock);

dprintk("svc: svc_delete_xprt(%p)\n", xprt);
xprt->xpt_ops->xpo_detach(xprt);
@@ -851,12 +860,16 @@ void svc_delete_xprt(struct svc_xprt *xprt)
* while still attached to a queue, the queue itself
* is about to be destroyed (in svc_destroy).
*/
- if (!test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
- BUG_ON(atomic_read(&xprt->xpt_ref.refcount) < 2);
- if (test_bit(XPT_TEMP, &xprt->xpt_flags))
- serv->sv_tmpcnt--;
+ if (test_bit(XPT_TEMP, &xprt->xpt_flags))
+ serv->sv_tmpcnt--;
+
+ for (dr = svc_deferred_dequeue(xprt); dr;
+ dr = svc_deferred_dequeue(xprt)) {
svc_xprt_put(xprt);
+ kfree(dr);
}
+
+ svc_xprt_put(xprt);
spin_unlock_bh(&serv->sv_lock);
}

@@ -902,17 +915,19 @@ static void svc_revisit(struct cache_deferred_req *dreq, int too_many)
container_of(dreq, struct svc_deferred_req, handle);
struct svc_xprt *xprt = dr->xprt;

- if (too_many) {
+ spin_lock(&xprt->xpt_lock);
+ if (too_many || test_bit(XPT_DEAD, &xprt->xpt_flags)) {
+ spin_unlock(&xprt->xpt_lock);
+ dprintk("revisit cancelled\n");
svc_xprt_put(xprt);
kfree(dr);
return;
}
dprintk("revisit queued\n");
dr->xprt = NULL;
- spin_lock(&xprt->xpt_lock);
list_add(&dr->handle.recent, &xprt->xpt_deferred);
- spin_unlock(&xprt->xpt_lock);
set_bit(XPT_DEFERRED, &xprt->xpt_flags);
+ spin_unlock(&xprt->xpt_lock);
svc_xprt_enqueue(xprt);
svc_xprt_put(xprt);
}


2009-01-05 17:13:42

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports

On Mon, 2009-01-05 at 11:04 -0600, Tom Tucker wrote:
> Trond Myklebust wrote:
> > On Sun, 2009-01-04 at 14:12 -0500, Trond Myklebust wrote:
> [...snip...]
>
> >> I see nothing that stops svc_delete_xprt() from setting XPT_DEAD after
> >> the above test in svc_revisit(), and before the test inside
> >> svc_xprt_enqueue(). What's preventing a race there?
> >
> > I suppose one way to fix it would be to hold the xprt->xpt_lock across
> > the above test, and to make sure that you set XPT_DEFERRED while holding
> > the lock, and _before_ you test for XPT_DEAD. That way, you guarantee
> > that the svc_deferred_dequeue() loop in svc_delete_xprt() will pick up
> > anything that races with the setting of XPT_DEAD.
> >
> > Trond
>
> I think this patch fixes this. Thanks again,
>
> From: Tom Tucker <[email protected]>
> Date: Mon, 5 Jan 2009 10:56:03 -0600
> Subject: [PATCH] svc: Clean up deferred requests on transport destruction
>
> A race between svc_revisit and svc_delete_xprt can result in
> deferred requests holding references on a transport that can never be
> recovered because dead transports are not enqueued for subsequent
> processing.
>
> Check for XPT_DEAD in revisit to clean up completing deferrals on a dead
> transport and sweep a transport's deferred queue to do the same for queued
> but unprocessed deferrals.
>
> Signed-off-by: Tom Tucker <[email protected]>
>
> ---
> net/sunrpc/svc_xprt.c | 29 ++++++++++++++++++++++-------
> 1 files changed, 22 insertions(+), 7 deletions(-)
>
> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> index bf5b5cd..375a695 100644
> --- a/net/sunrpc/svc_xprt.c
> +++ b/net/sunrpc/svc_xprt.c
> @@ -837,6 +837,15 @@ static void svc_age_temp_xprts(unsigned long closure)
> void svc_delete_xprt(struct svc_xprt *xprt)
> {
> struct svc_serv *serv = xprt->xpt_server;
> + struct svc_deferred_req *dr;
> +
> + /* Only do this once */
> + spin_lock(&xprt->xpt_lock);
> + if (test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
> + spin_unlock(&xprt->xpt_lock);
> + return;
> + }
> + spin_unlock(&xprt->xpt_lock);

You shouldn't need to take the spinlock here if you just move the line
set_bit(XPT_DEFERRED, &xprt->xpt_flags);
in svc_revisit(). See below...

> dprintk("svc: svc_delete_xprt(%p)\n", xprt);
> xprt->xpt_ops->xpo_detach(xprt);
> @@ -851,12 +860,16 @@ void svc_delete_xprt(struct svc_xprt *xprt)
> * while still attached to a queue, the queue itself
> * is about to be destroyed (in svc_destroy).
> */
> - if (!test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
> - BUG_ON(atomic_read(&xprt->xpt_ref.refcount) < 2);
> - if (test_bit(XPT_TEMP, &xprt->xpt_flags))
> - serv->sv_tmpcnt--;
> + if (test_bit(XPT_TEMP, &xprt->xpt_flags))
> + serv->sv_tmpcnt--;
> +
> + for (dr = svc_deferred_dequeue(xprt); dr;
> + dr = svc_deferred_dequeue(xprt)) {
> svc_xprt_put(xprt);
> + kfree(dr);
> }
> +
> + svc_xprt_put(xprt);
> spin_unlock_bh(&serv->sv_lock);
> }
>
> @@ -902,17 +915,19 @@ static void svc_revisit(struct cache_deferred_req *dreq, int too_many)
> container_of(dreq, struct svc_deferred_req, handle);
> struct svc_xprt *xprt = dr->xprt;
>
> - if (too_many) {
> + spin_lock(&xprt->xpt_lock);

+ set_bit(XPT_DEFERRED, &xprt->xpt_flags);


> + if (too_many || test_bit(XPT_DEAD, &xprt->xpt_flags)) {
> + spin_unlock(&xprt->xpt_lock);
> + dprintk("revisit cancelled\n");
> svc_xprt_put(xprt);
> kfree(dr);
> return;
> }
> dprintk("revisit queued\n");
> dr->xprt = NULL;
> - spin_lock(&xprt->xpt_lock);
> list_add(&dr->handle.recent, &xprt->xpt_deferred);
> - spin_unlock(&xprt->xpt_lock);
> set_bit(XPT_DEFERRED, &xprt->xpt_flags);

- set_bit(XPT_DEFERRED, &xprt->xpt_flags);

> + spin_unlock(&xprt->xpt_lock);
> svc_xprt_enqueue(xprt);
> svc_xprt_put(xprt);
> }