2009-09-09 06:33:06

by NeilBrown

Subject: [PATCH 5/9] sunrpc/cache: allow threads to block while waiting for cache update.

The current practice of waiting for cache updates by queueing the
whole request to be retried has (at least) two problems.

1/ With NFSv4, requests can be quite complex and re-trying a whole
request when a latter part fails should only be a last-resort, not a
normal practice.

2/ Large requests, and in particular any 'write' request, will not be
queued by the current code and doing so would be undesirable.

In many cases only a very short wait is needed before the cache gets
valid data.

So, providing the underlying transport permits it by setting
->thread_wait, arrange to wait briefly for an upcall to be completed
(as reflected in the clearing of CACHE_PENDING). If the short wait
was not long enough and CACHE_PENDING is still set, fall back on the
old approach.

The 'thread_wait' value is set to 5 seconds when there are spare
threads, and 1 second when there are no spare threads.

These values are probably much higher than needed, but will ensure
some forward progress.

Signed-off-by: NeilBrown <[email protected]>
---

include/linux/sunrpc/cache.h | 3 +++
net/sunrpc/cache.c | 44 +++++++++++++++++++++++++++++++++++++++++-
net/sunrpc/svc_xprt.c | 11 +++++++++++
3 files changed, 57 insertions(+), 1 deletions(-)

diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
index 6f52b4d..ef3db11 100644
--- a/include/linux/sunrpc/cache.h
+++ b/include/linux/sunrpc/cache.h
@@ -125,6 +125,9 @@ struct cache_detail {
  */
 struct cache_req {
 	struct cache_deferred_req *(*defer)(struct cache_req *req);
+	int thread_wait;  /* How long (jiffies) we can block the
+			   * current thread to wait for updates.
+			   */
 };
 /* this must be embedded in a deferred_request that is being
  * delayed awaiting cache-fill
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 54bbd83..46e9e2b 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -498,10 +498,22 @@ static LIST_HEAD(cache_defer_list);
 static struct list_head cache_defer_hash[DFR_HASHSIZE];
 static int cache_defer_cnt;

+struct thread_deferred_req {
+	struct cache_deferred_req handle;
+	wait_queue_head_t wait;
+};
+static void cache_restart_thread(struct cache_deferred_req *dreq, int too_many)
+{
+	struct thread_deferred_req *dr =
+		container_of(dreq, struct thread_deferred_req, handle);
+	wake_up(&dr->wait);
+}
+
 static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 {
 	struct cache_deferred_req *dreq, *discard;
 	int hash = DFR_HASH(item);
+	struct thread_deferred_req sleeper;

 	if (cache_defer_cnt >= DFR_MAX) {
 		/* too much in the cache, randomly drop this one,
@@ -510,7 +522,14 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 		if (net_random()&1)
 			return -ENOMEM;
 	}
-	dreq = req->defer(req);
+	if (req->thread_wait) {
+		dreq = &sleeper.handle;
+		init_waitqueue_head(&sleeper.wait);
+		dreq->revisit = cache_restart_thread;
+	} else
+		dreq = req->defer(req);
+
+ retry:
 	if (dreq == NULL)
 		return -ENOMEM;

@@ -544,6 +563,29 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
 		cache_revisit_request(item);
 		return -EAGAIN;
 	}
+
+	if (dreq == &sleeper.handle) {
+		wait_event_interruptible_timeout(
+			sleeper.wait,
+			!test_bit(CACHE_PENDING, &item->flags)
+			|| list_empty(&sleeper.handle.hash),
+			req->thread_wait);
+		spin_lock(&cache_defer_lock);
+		if (!list_empty(&sleeper.handle.hash)) {
+			list_del_init(&sleeper.handle.recent);
+			list_del_init(&sleeper.handle.hash);
+			cache_defer_cnt--;
+		}
+		spin_unlock(&cache_defer_lock);
+		if (test_bit(CACHE_PENDING, &item->flags)) {
+			/* item is still pending, try request
+			 * deferral
+			 */
+			dreq = req->defer(req);
+			goto retry;
+		}
+		return 0;
+	}
 	return 0;
 }

diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 912dea5..65f6a25 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -655,12 +655,23 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
 		pool->sp_nwaking--;
 		BUG_ON(pool->sp_nwaking < 0);
 	}
+
+	/* Normally we will wait up to 5 seconds for any required
+	 * cache information to be provided.
+	 */
+	rqstp->rq_chandle.thread_wait = 5*HZ;
 	xprt = svc_xprt_dequeue(pool);
 	if (xprt) {
 		rqstp->rq_xprt = xprt;
 		svc_xprt_get(xprt);
 		rqstp->rq_reserved = serv->sv_max_mesg;
 		atomic_add(rqstp->rq_reserved, &xprt->xpt_reserved);
+
+		/* As there is a shortage of threads and this request
+		 * had to be queued, don't allow the thread to wait so
+		 * long for cache updates.
+		 */
+		rqstp->rq_chandle.thread_wait = 1*HZ;
 	} else {
 		/* No data pending. Go to sleep */
 		svc_thread_enqueue(pool, rqstp);
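
The policy in the commit message can be modelled outside the kernel. What follows is a minimal userspace sketch, not kernel code: MODEL_HZ, pick_thread_wait() and wait_or_defer() are hypothetical names, and time is a plain tick counter rather than jiffies. It shows the two wait budgets (5 seconds normally, 1 second when the request had to be queued) and the fall back to deferral when the upcall has not finished within the budget.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed tick rate for this model only; the kernel's HZ is a build-time setting. */
#define MODEL_HZ 1000L

enum wait_result {
	WAIT_PROCEED,	/* upcall finished within the budget */
	WAIT_DEFER	/* still pending: fall back to deferring the request */
};

/* Mirrors the svc_recv() hunk: a request that had to be queued implies a
 * shortage of threads, so the thread may block for less time. */
static long pick_thread_wait(bool request_was_queued)
{
	return request_was_queued ? 1 * MODEL_HZ : 5 * MODEL_HZ;
}

/* upcall_done_at: tick, relative to now, at which the pending flag clears.
 * thread_wait: budget in ticks; 0 means the transport does not allow blocking. */
static enum wait_result wait_or_defer(long upcall_done_at, long thread_wait)
{
	assert(upcall_done_at >= 0 && thread_wait >= 0);
	if (thread_wait == 0)
		return WAIT_DEFER;
	return upcall_done_at <= thread_wait ? WAIT_PROCEED : WAIT_DEFER;
}
```

In this model an upcall that takes 2 seconds lets a thread with the 5-second budget proceed, while a thread serving a queued request gives up after 1 second and defers.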




2009-12-02 20:58:24

by J. Bruce Fields

Subject: Re: [PATCH 5/9] sunrpc/cache: allow threads to block while waiting for cache update.

On Wed, Sep 09, 2009 at 04:32:54PM +1000, NeilBrown wrote:
> The current practice of waiting for cache updates by queueing the
> whole request to be retried has (at least) two problems.

Apologies for the delay!

> 1/ With NFSv4, requests can be quite complex and re-trying a whole
> request when a latter part fails should only be a last-resort, not a
> normal practice.
>
> 2/ Large requests, and in particular any 'write' request, will not be
> queued by the current code and doing so would be undesirable.
>
> In many cases only a very short wait is needed before the cache gets
> valid data.
>
> So, providing the underlying transport permits it by setting
> ->thread_wait,
> arrange to wait briefly for an upcall to be completed (as reflected in
> the clearing of CACHE_PENDING).
> If the short wait was not long enough and CACHE_PENDING is still set,
> fall back on the old approach.
>
> The 'thread_wait' value is set to 5 seconds when there are spare
> threads, and 1 second when there are no spare threads.
>
> These values are probably much higher than needed, but will ensure
> some forward progress.

This looks fine, and I want to merge it. One mainly superficial
complaint:

> static int cache_defer_req(struct cache_req *req, struct cache_head *item)
> {
> struct cache_deferred_req *dreq, *discard;
> int hash = DFR_HASH(item);
> + struct thread_deferred_req sleeper;
>
> if (cache_defer_cnt >= DFR_MAX) {
> /* too much in the cache, randomly drop this one,
> @@ -510,7 +522,14 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
> if (net_random()&1)
> return -ENOMEM;
> }
> - dreq = req->defer(req);
> + if (req->thread_wait) {
> + dreq = &sleeper.handle;
> + init_waitqueue_head(&sleeper.wait);
> + dreq->revisit = cache_restart_thread;
> + } else
> + dreq = req->defer(req);
> +
> + retry:
> if (dreq == NULL)
> return -ENOMEM;
>
> @@ -544,6 +563,29 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
> cache_revisit_request(item);
> return -EAGAIN;
> }
> +
> + if (dreq == &sleeper.handle) {
> + wait_event_interruptible_timeout(
> + sleeper.wait,
> + !test_bit(CACHE_PENDING, &item->flags)
> + || list_empty(&sleeper.handle.hash),
> + req->thread_wait);
> + spin_lock(&cache_defer_lock);
> + if (!list_empty(&sleeper.handle.hash)) {
> + list_del_init(&sleeper.handle.recent);
> + list_del_init(&sleeper.handle.hash);
> + cache_defer_cnt--;
> + }
> + spin_unlock(&cache_defer_lock);
> + if (test_bit(CACHE_PENDING, &item->flags)) {
> + /* item is still pending, try request
> + * deferral
> + */
> + dreq = req->defer(req);
> + goto retry;
> + }
> + return 0;
> + }

With this, cache_defer_req is tending towards the long and complicated
side. It'd probably suffice to do something as simple as moving some of
the code into helper functions to hide the details.

--b.
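
One way to read the suggestion above is to pull the list bookkeeping out of cache_defer_req() into small helpers. The following is a userspace sketch under assumptions, not the patch itself: hash_deferred_req() and unhash_deferred_req() are hypothetical helper names, the list primitives are minimal stand-ins for the kernel's, and the cache_defer_lock locking is omitted because the model is single-threaded.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal stand-in for the kernel's circular doubly linked list. */
struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h; h->prev = h; }
static bool list_empty(const struct list_head *h) { return h->next == h; }

static void list_add(struct list_head *n, struct list_head *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

static void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	INIT_LIST_HEAD(e);
}

/* Model of the two list links a deferred request carries. */
struct deferred_req { struct list_head hash, recent; };

static struct list_head defer_hash, defer_list;
static int defer_cnt;

static void model_init(void)
{
	INIT_LIST_HEAD(&defer_hash);
	INIT_LIST_HEAD(&defer_list);
	defer_cnt = 0;
}

/* Hypothetical helper: record a request on both lists. */
static void hash_deferred_req(struct deferred_req *dreq)
{
	list_add(&dreq->recent, &defer_list);
	list_add(&dreq->hash, &defer_hash);
	defer_cnt++;
}

/* Hypothetical helper: undo hash_deferred_req(); a second call is a no-op. */
static void unhash_deferred_req(struct deferred_req *dreq)
{
	if (list_empty(&dreq->hash))
		return;
	list_del_init(&dreq->recent);
	list_del_init(&dreq->hash);
	defer_cnt--;
}
```

With helpers of this shape, the post-wait cleanup in the patch would reduce to one call made under the lock, and the counter could not drift from the list contents.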

2009-12-02 21:23:05

by Trond Myklebust

Subject: Re: [PATCH 5/9] sunrpc/cache: allow threads to block while waiting for cache update.

On Wed, 2009-12-02 at 15:59 -0500, J. Bruce Fields wrote:
> On Wed, Sep 09, 2009 at 04:32:54PM +1000, NeilBrown wrote:
> > The current practice of waiting for cache updates by queueing the
> > whole request to be retried has (at least) two problems.
>
> Apologies for the delay!
>
> > 1/ With NFSv4, requests can be quite complex and re-trying a whole
> > request when a latter part fails should only be a last-resort, not a
> > normal practice.
> >
> > 2/ Large requests, and in particular any 'write' request, will not be
> > queued by the current code and doing so would be undesirable.
> >
> > In many cases only a very short wait is needed before the cache gets
> > valid data.
> >
> > So, providing the underlying transport permits it by setting
> > ->thread_wait,
> > arrange to wait briefly for an upcall to be completed (as reflected in
> > the clearing of CACHE_PENDING).
> > If the short wait was not long enough and CACHE_PENDING is still set,
> > fall back on the old approach.
> >
> > The 'thread_wait' value is set to 5 seconds when there are spare
> > threads, and 1 second when there are no spare threads.
> >
> > These values are probably much higher than needed, but will ensure
> > some forward progress.
>
> This looks fine, and I want to merge it. One mainly superficial
> complaint:
>
> > static int cache_defer_req(struct cache_req *req, struct cache_head *item)
> > {
> > struct cache_deferred_req *dreq, *discard;
> > int hash = DFR_HASH(item);
> > + struct thread_deferred_req sleeper;
> >
> > if (cache_defer_cnt >= DFR_MAX) {
> > /* too much in the cache, randomly drop this one,
> > @@ -510,7 +522,14 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
> > if (net_random()&1)
> > return -ENOMEM;
> > }
> > - dreq = req->defer(req);
> > + if (req->thread_wait) {
> > + dreq = &sleeper.handle;
> > + init_waitqueue_head(&sleeper.wait);
> > + dreq->revisit = cache_restart_thread;
> > + } else
> > + dreq = req->defer(req);
> > +
> > + retry:
> > if (dreq == NULL)
> > return -ENOMEM;
> >
> > @@ -544,6 +563,29 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
> > cache_revisit_request(item);
> > return -EAGAIN;
> > }
> > +
> > + if (dreq == &sleeper.handle) {
> > + wait_event_interruptible_timeout(
> > + sleeper.wait,
> > + !test_bit(CACHE_PENDING, &item->flags)
> > + || list_empty(&sleeper.handle.hash),
> > + req->thread_wait);
> > + spin_lock(&cache_defer_lock);
> > + if (!list_empty(&sleeper.handle.hash)) {
> > + list_del_init(&sleeper.handle.recent);
> > + list_del_init(&sleeper.handle.hash);
> > + cache_defer_cnt--;
> > + }
> > + spin_unlock(&cache_defer_lock);
> > + if (test_bit(CACHE_PENDING, &item->flags)) {
> > + /* item is still pending, try request
> > + * deferral
> > + */
> > + dreq = req->defer(req);
> > + goto retry;
> > + }
> > + return 0;
> > + }
>
> With this, cache_defer_req is tending towards the long and complicated
> side. It'd probably suffice to do something as simple as moving some of
> the code into helper functions to hide the details.

Couldn't you also simplify things a good deal by just adding a
completion to struct cache_deferred_req, and then letting
cache_defer_req() call wait_for_completion_timeout()?

That's pretty much all we do in fs/nfs/cache_lib.c...

Cheers
Trond


2009-12-02 21:51:02

by Trond Myklebust

Subject: Re: [PATCH 5/9] sunrpc/cache: allow threads to block while waiting for cache update.

On Wed, 2009-12-02 at 16:23 -0500, Trond Myklebust wrote:
> Couldn't you also simplify things a good deal by just adding a
> completion to struct cache_deferred_req, and then letting
> cache_defer_req() call wait_for_completion_timeout()?

Better yet, the caller of cache_check() could decide whether or not to
also call wait_for_completion_timeout().

In addition to allowing the rpc server thread to wait upon the
authentication upcall, this could also simplify the nfsv4 server's
idmapper code, and the nfsv4 client's dns resolver code.
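
The completion-based idea, with the caller choosing whether to wait, might look roughly like the following. This is a single-threaded userspace model under stated assumptions, not kernel code: struct entry, cache_check_model(), upcall_tick(), wait_model() and check_and_wait() are hypothetical names, struct completion here is a stand-in for the kernel type, and the upcall is simulated by a tick counter. wait_model() follows the return convention of the kernel's wait_for_completion_timeout(): 0 on timeout, otherwise the time left (at least 1).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's struct completion. */
struct completion { unsigned int done; };

static void complete(struct completion *c) { c->done++; }

/* Hypothetical cache entry: the simulated upcall fills it at tick ready_at. */
struct entry {
	bool valid;
	struct completion filled;
	long ready_at;
};

/* Non-blocking check: 0 if valid, -EAGAIN while the upcall is pending. */
static int cache_check_model(const struct entry *e)
{
	return e->valid ? 0 : -EAGAIN;
}

/* Advance the simulated upcall to tick `now`. */
static void upcall_tick(struct entry *e, long now)
{
	if (!e->valid && now >= e->ready_at) {
		e->valid = true;
		complete(&e->filled);
	}
}

/* Returns 0 on timeout, otherwise the ticks left (at least 1). */
static long wait_model(struct entry *e, long timeout)
{
	long left = timeout;

	assert(timeout >= 0);
	for (long now = 0; ; now++) {
		upcall_tick(e, now);
		if (e->filled.done) {
			e->filled.done--;
			return left > 0 ? left : 1;
		}
		if (left == 0)
			return 0;
		left--;
	}
}

/* A caller that is allowed to block, such as an RPC server thread. */
static int check_and_wait(struct entry *e, long timeout)
{
	int rv = cache_check_model(e);

	if (rv != -EAGAIN)
		return rv;
	if (wait_model(e, timeout) == 0)
		return -ETIMEDOUT;
	return cache_check_model(e);
}
```

A caller that must not block would use only the non-blocking check and handle -EAGAIN itself, for example by deferring the request, which keeps the waiting policy out of the cache code.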