Subject: Re: NFS/TCP timeout sequence
From: Trond Myklebust <Trond.Myklebust@netapp.com>
To: Max Matveev <makc@redhat.com>
Cc: linux-nfs@vger.kernel.org
Date: Thu, 07 Jul 2011 09:47:19 -0400
In-Reply-To: <19989.27202.793003.725608@regina.usersys.redhat.com>
References: <19989.27202.793003.725608@regina.usersys.redhat.com>
Content-Type: text/plain; charset="UTF-8"
Message-ID: <1310046439.3863.30.camel@lade.trondhjem.org>
Sender: linux-nfs-owner@vger.kernel.org
MIME-Version: 1.0

On Thu, 2011-07-07 at 18:11 +1000, Max Matveev wrote: 
> I've had to look at the way NFS/TCP does its timeouts and backoff
> and it does not make a lot of sense to me: according to the
> following paragraph from nfs(5) on Fedora 14 (I'm using Fedora 14
> because it has more text than the same page in nfs-utils):
> 
>       timeo=n    The time (in tenths of a second) the  NFS  client  waits
>                  for a response before it retries an NFS request. If this
>                  option is not specified, requests are retried  every  60
>                  seconds  for NFS over TCP.  The NFS client does not per‐
>                  form any kind of timeout backoff for NFS over TCP.
> 
> but if I try the mount with timeo=20,retrans=7 then I'm getting
> retransmits which are 2, 4, 6, 8, 2, 4, 6, 8 seconds apart, i.e.
> there is a) linear backoff and b) the backoff is not long enough to
> let the complete sequence of 7 retransmits run its course.

Sigh... Firstly, 2-second timeouts are complete lunacy when using a
protocol that guarantees reliable delivery, as TCP does. Anyone who
tries it deserves exactly what they get: poor, unreliable performance.

Secondly, the _other_ fix for this problem is to fix the documentation.
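
To make the mismatch with the man page concrete, here is a minimal
sketch that reproduces the intervals reported above for timeo=20. It
only models the observed sequence; the function names and the reset
after four steps are assumptions read off the report, not values taken
from the kernel source.

```python
def documented_intervals(timeo_tenths, count):
    """Intervals as nfs(5) describes them for TCP: no backoff at all."""
    base = timeo_tenths / 10.0  # timeo is given in tenths of a second
    return [base] * count


def reported_intervals(timeo_tenths, count, steps_before_reset=4):
    """Intervals as reported: linear growth by the base timeout, then a reset.

    steps_before_reset=4 is an assumption taken from the observed
    2, 4, 6, 8, 2, 4, 6, 8 pattern, not from the kernel source.
    """
    base = timeo_tenths / 10.0
    return [base * (i % steps_before_reset + 1) for i in range(count)]


if __name__ == "__main__":
    print(documented_intervals(20, 8))
    print(reported_intervals(20, 8))
```

With timeo=20 the first list is a constant 2 seconds, while the second
matches the reported 2, 4, 6, 8, 2, 4, 6, 8.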

Trond
-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com