From: "J. Bruce Fields" <bfields@fieldses.org>
Subject: Re: [PATCH 5/9] sunrpc/cache: allow threads to block while waiting
	for cache update.
Date: Wed, 2 Dec 2009 15:59:28 -0500
Message-ID: <20091202205928.GF15045@fieldses.org>
References: <20090909062539.20462.67466.stgit@notabene.brown> <20090909063254.20462.99277.stgit@notabene.brown>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: linux-nfs@vger.kernel.org
To: NeilBrown <neilb@suse.de>
In-Reply-To: <20090909063254.20462.99277.stgit@notabene.brown>
Sender: linux-nfs-owner@vger.kernel.org

On Wed, Sep 09, 2009 at 04:32:54PM +1000, NeilBrown wrote:
> The current practice of waiting for cache updates by queueing the
> whole request to be retried has (at least) two problems.

Apologies for the delay!

> 1/ With NFSv4, requests can be quite complex and re-trying a whole
>   request when a later part fails should only be a last resort, not a
>   normal practice.
> 
> 2/ Large requests, and in particular any 'write' request, will not be
>   queued by the current code and doing so would be undesirable.
> 
> In many cases only a very short wait is needed before the cache gets
> valid data.
> 
> So, providing the underlying transport permits it by setting
>  ->thread_wait,
> arrange to wait briefly for an upcall to be completed (as reflected in
> the clearing of CACHE_PENDING).
> If the short wait was not long enough and CACHE_PENDING is still set,
> fall back on the old approach.
> 
> The 'thread_wait' value is set to 5 seconds when there are spare
> threads, and 1 second when there are no spare threads.
> 
> These values are probably much higher than needed, but will ensure
> some forward progress.

This looks fine, and I want to merge it.  One mainly superficial
complaint:

>  static int cache_defer_req(struct cache_req *req, struct cache_head *item)
>  {
>  	struct cache_deferred_req *dreq, *discard;
>  	int hash = DFR_HASH(item);
> +	struct thread_deferred_req sleeper;
>  
>  	if (cache_defer_cnt >= DFR_MAX) {
>  		/* too much in the cache, randomly drop this one,
> @@ -510,7 +522,14 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
>  		if (net_random()&1)
>  			return -ENOMEM;
>  	}
> -	dreq = req->defer(req);
> +	if (req->thread_wait) {
> +		dreq = &sleeper.handle;
> +		init_waitqueue_head(&sleeper.wait);
> +		dreq->revisit = cache_restart_thread;
> +	} else
> +		dreq = req->defer(req);
> +
> + retry:
>  	if (dreq == NULL)
>  		return -ENOMEM;
>  
> @@ -544,6 +563,29 @@ static int cache_defer_req(struct cache_req *req, struct cache_head *item)
>  		cache_revisit_request(item);
>  		return -EAGAIN;
>  	}
> +
> +	if (dreq == &sleeper.handle) {
> +		wait_event_interruptible_timeout(
> +			sleeper.wait,
> +			!test_bit(CACHE_PENDING, &item->flags)
> +			|| list_empty(&sleeper.handle.hash),
> +			req->thread_wait);
> +		spin_lock(&cache_defer_lock);
> +		if (!list_empty(&sleeper.handle.hash)) {
> +			list_del_init(&sleeper.handle.recent);
> +			list_del_init(&sleeper.handle.hash);
> +			cache_defer_cnt--;
> +		}
> +		spin_unlock(&cache_defer_lock);
> +		if (test_bit(CACHE_PENDING, &item->flags)) {
> +			/* item is still pending, try request
> +			 * deferral
> +			 */
> +			dreq = req->defer(req);
> +			goto retry;
> +		}
> +		return 0;
> +	}

With this, cache_defer_req is tending towards the long and complicated
side.  It'd probably suffice to do something as simple as moving some of
the code into helper functions to hide the details.
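
To show the shape of such a split, here is a rough userspace toy, not
kernel code and not taken from the patch.  Every toy_* name is made up
for illustration, a pthread condition variable stands in for the wait
queue, and a plain bool stands in for CACHE_PENDING.  The point is only
that the top-level function shrinks to "try the short wait, else fall
back" once the details live in helpers:

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Toy stand-in for a cache item; "pending" plays the role of CACHE_PENDING. */
struct toy_item {
	pthread_mutex_t lock;
	pthread_cond_t done;
	bool pending;
};

/*
 * Helper 1 (hypothetical): wait up to timeout_ms for the pending flag to
 * clear.  Returns true if the item became valid before the deadline.
 */
static bool toy_wait_for_update(struct toy_item *item, long timeout_ms)
{
	struct timespec deadline;
	bool ok;
	int rc = 0;

	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&item->lock);
	/* Loop guards against spurious wakeups; stop on timeout or error. */
	while (item->pending && rc == 0)
		rc = pthread_cond_timedwait(&item->done, &item->lock, &deadline);
	ok = !item->pending;
	pthread_mutex_unlock(&item->lock);
	return ok;
}

/* Helper 2 (hypothetical): stub for the old "queue the whole request" path. */
static int toy_defer_whole_request(void)
{
	return -1;	/* stand-in for "deferred, retry later" */
}

/* The caller now only expresses policy: short wait first, then fall back. */
static int toy_defer_req(struct toy_item *item, long thread_wait_ms)
{
	if (thread_wait_ms > 0 && toy_wait_for_update(item, thread_wait_ms))
		return 0;
	return toy_defer_whole_request();
}

/* Marks the item valid and wakes waiters, as a completed upcall would. */
static void toy_complete_update(struct toy_item *item)
{
	pthread_mutex_lock(&item->lock);
	item->pending = false;
	pthread_cond_broadcast(&item->done);
	pthread_mutex_unlock(&item->lock);
}
```

In the real function the helpers would of course also have to handle
the hash/recent list removal under cache_defer_lock; the toy leaves
that out.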

--b.