Date: Thu, 4 Feb 2010 08:23:54 +1100
From: Neil Brown <neilb@suse.de>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>, linux-nfs@vger.kernel.org
Subject: Re: [PATCH 6/9] sunrpc: close connection when a request is
 irretrievably lost.
Message-ID: <20100204082354.0bf3b7e5@notabene.brown>
In-Reply-To: <4B699988.9000209@oracle.com>
References: <20100203060657.12945.27293.stgit@notabene.brown>
	<20100203063131.12945.34978.stgit@notabene.brown>
	<4B699988.9000209@oracle.com>
Content-Type: text/plain; charset=US-ASCII
Sender: linux-nfs-owner@vger.kernel.org
MIME-Version: 1.0

On Wed, 03 Feb 2010 10:43:04 -0500
Chuck Lever <chuck.lever@oracle.com> wrote:

> On 02/03/2010 01:31 AM, NeilBrown wrote:
> > If we drop a request in the sunrpc layer, either due to kmalloc failure,
> > or due to a cache miss when we could not queue the request for later
> > replay, then close the connection to encourage the client to retry sooner.
> 
> I studied connection dropping behavior a few years back, and decided 
> that dropping the connection on a retransmit is nearly always 
> counterproductive.  Any other pending requests on a connection that is 
> dropped must also be retransmitted, which means one retransmit suddenly 
> turns into many.  And then you get into issues of idempotency and all 
> the extra traffic and the long delays and the risk of reconnecting on a 
> different port so that XID replay is undetectable...

You make some good points there, thanks.

> 
> I don't think dropping the connection will cause the client to 
> retransmit sooner.  Clients I have encountered will reconnect and 
> retransmit only after their retransmit timeout fires, never sooner.
> 

I thought I had noticed the Linux client resending immediately, but that would
have been a while ago, and I could easily be misremembering.

My reasoning was that if the connection is closed then the client can *know*
that it won't get a response to any outstanding requests, rather than
having to use the timeout heuristic.  How the client uses that information I
don't know, but at least it would have it.

> Unfortunately NFSv4 requires a connection drop before a retransmit, but 
> NFSv3 does not.  NFSv4 servers are rather supposed to try very hard not 
> to drop requests.
> 
> How often do you expect this kind of recovery to be necessary?  Would it 
> be possible to drop only for NFSv4 connections?
> 

With the improved handling of large requests, I would expect this kind of
recovery to be needed only very rarely.

Yes, it would be quite easy to drop only those connections on which we have
seen an NFSv4 request... and maybe also connections on which we have not yet
successfully handled any request(?).

What if, instead of closing the connection, we set a flag so that it would be
closed as soon as it had been idle for 1 second, thus flushing any other
pending requests?  That probably doesn't help - there would easily be real
cases where other threads of activity keep the connection busy, while the
thread waiting for the lost request still needs a full time-out.

I would be happy with the v4-only version.
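
Roughly, the v4-only policy could look like the sketch below.  This is
standalone C for illustration only, not sunrpc code: struct conn_state and
the conn_* helpers are hypothetical names, and the "nothing handled yet"
rule is just the speculative (?) case mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-connection bookkeeping; NOT the kernel's struct svc_xprt. */
struct conn_state {
	bool seen_nfsv4;        /* an NFSv4 request has arrived on this connection */
	unsigned long handled;  /* requests answered successfully so far */
};

/* Record the NFS version of an incoming request. */
static void conn_note_request(struct conn_state *conn, unsigned int nfs_version)
{
	assert(conn != NULL);
	if (nfs_version == 4)
		conn->seen_nfsv4 = true;
}

/* Record that a reply was sent successfully. */
static void conn_note_reply(struct conn_state *conn)
{
	assert(conn != NULL);
	conn->handled++;
}

/*
 * Called when a request has been irretrievably lost: should the
 * connection be closed so the client learns no reply is coming?
 */
static bool conn_should_close_on_drop(const struct conn_state *conn)
{
	assert(conn != NULL);
	if (conn->seen_nfsv4)
		return true;            /* v4 needs a disconnect before retransmit */
	return conn->handled == 0;      /* speculative: nothing handled yet */
}
```

With this, a connection that has only carried NFSv3 traffic and has
answered at least one request is left open when a request is dropped, so
its other pending requests are not forced into a retransmit.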

Thanks,
NeilBrown