From: Tom Tucker Subject: Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports Date: Mon, 05 Jan 2009 11:04:52 -0600 Message-ID: <49623DB4.6090103@opengridcomputing.com> References: <20081017123207.GA14979@rabbit.intern.cm-ag> <1224484046.23068.14.camel@localhost.localdomain> <1225539927.2221.3.camel@localhost.localdomain> <1225546878.4390.3.camel@heimdal.trondhjem.org> <1227596962.16868.22.camel@localhost.localdomain> <1227619696.7057.19.camel@heimdal.trondhjem.org> <1227620339.9425.99.camel@zakaz.uk.xensource.com> <1227621434.7057.33.camel@heimdal.trondhjem.org> <1227621877.9425.102.camel@zakaz.uk.xensource.com> <1227737539.31008.2.camel@localhost.localdomain> <1228090631.7112.11.camel@heimdal.trondhjem.org> <1228090815.7112.15.camel@heimdal.trondhjem.org> <49491C58.9@opengridcomputing.com> <1229540877.7257.97.camel@heimdal.trondhjem.org> <495E8AB2.8020805@opengridcomputing.com> <1231096358.7363.6.camel@heimdal.trondhjem.org> <1231097131.7 363.11.camel@heimdal.trondhjem.org> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Cc: "J. Bruce Fields" , linux-nfs@vger.kernel.org To: Trond Myklebust Return-path: Received: from smtp.opengridcomputing.com ([209.198.142.2]:42052 "EHLO smtp.opengridcomputing.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751793AbZAEREy (ORCPT ); Mon, 5 Jan 2009 12:04:54 -0500 In-Reply-To: <1231097131.7363.11.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org> Sender: linux-nfs-owner@vger.kernel.org List-ID: Trond Myklebust wrote: > On Sun, 2009-01-04 at 14:12 -0500, Trond Myklebust wrote: [...snip...] >> I see nothing that stops svc_delete_xprt() from setting XPT_DEAD after >> the above test in svc_revisit(), and before the test inside >> svc_xprt_enqueue(). What's preventing a race there? > > I suppose one way to fix it would be to hold the xprt->xpt_lock across > the above test, and to make sure that you set XPT_DEFERRED while holding > the lock, and _before_ you test for XPT_DEAD. That way, you guarantee > that the svc_deferred_dequeue() loop in svc_delete_xprt() will pick up > anything that races with the setting of XPT_DEAD. > > Trond I think this patch fixes this. Thanks again, From: Tom Tucker Date: Mon, 5 Jan 2009 10:56:03 -0600 Subject: [PATCH] svc: Clean up deferred requests on transport destruction A race between svc_revisit and svc_delete_xprt can result in deferred requests holding references on a transport that can never be recovered because dead transports are not enqueued for subsequent processing. Check for XPT_DEAD in revisit to clean up completing deferrals on a dead transport and sweep a transport's deferred queue to do the same for queued but unprocessed deferrals. Signed-off-by: Tom Tucker --- net/sunrpc/svc_xprt.c | 29 ++++++++++++++++++++++------- 1 files changed, 22 insertions(+), 7 deletions(-) diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c index bf5b5cd..375a695 100644 --- a/net/sunrpc/svc_xprt.c +++ b/net/sunrpc/svc_xprt.c @@ -837,6 +837,15 @@ static void svc_age_temp_xprts(unsigned long closure) void svc_delete_xprt(struct svc_xprt *xprt) { struct svc_serv *serv = xprt->xpt_server; + struct svc_deferred_req *dr; + + /* Only do this once */ + spin_lock(&xprt->xpt_lock); + if (test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) { + spin_unlock(&xprt->xpt_lock); + return; + } + spin_unlock(&xprt->xpt_lock); dprintk("svc: svc_delete_xprt(%p)\n", xprt); xprt->xpt_ops->xpo_detach(xprt); @@ -851,12 +860,16 @@ void svc_delete_xprt(struct svc_xprt *xprt) * while still attached to a queue, the queue itself * is about to be destroyed (in svc_destroy). */ - if (!test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) { - BUG_ON(atomic_read(&xprt->xpt_ref.refcount) < 2); - if (test_bit(XPT_TEMP, &xprt->xpt_flags)) - serv->sv_tmpcnt--; + if (test_bit(XPT_TEMP, &xprt->xpt_flags)) + serv->sv_tmpcnt--; + + for (dr = svc_deferred_dequeue(xprt); dr; + dr = svc_deferred_dequeue(xprt)) { svc_xprt_put(xprt); + kfree(dr); } + + svc_xprt_put(xprt); spin_unlock_bh(&serv->sv_lock); } @@ -902,17 +915,19 @@ static void svc_revisit(struct cache_deferred_req *dreq, int too_many) container_of(dreq, struct svc_deferred_req, handle); struct svc_xprt *xprt = dr->xprt; - if (too_many) { + spin_lock(&xprt->xpt_lock); + if (too_many || test_bit(XPT_DEAD, &xprt->xpt_flags)) { + spin_unlock(&xprt->xpt_lock); + dprintk("revisit cancelled\n"); svc_xprt_put(xprt); kfree(dr); return; } dprintk("revisit queued\n"); dr->xprt = NULL; - spin_lock(&xprt->xpt_lock); list_add(&dr->handle.recent, &xprt->xpt_deferred); - spin_unlock(&xprt->xpt_lock); set_bit(XPT_DEFERRED, &xprt->xpt_flags); + spin_unlock(&xprt->xpt_lock); svc_xprt_enqueue(xprt); svc_xprt_put(xprt); }