Return-Path: linux-nfs-owner@vger.kernel.org
Received: from mx12.netapp.com ([216.240.18.77]:20791 "EHLO mx12.netapp.com"
    rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
    id S1757277Ab3DYOKi convert rfc822-to-8bit (ORCPT );
    Thu, 25 Apr 2013 10:10:38 -0400
From: "Myklebust, Trond"
To: "bfields@fieldses.org"
CC: David Wysochanski , Dave Chiluk , "linux-nfs@vger.kernel.org" ,
    "linux-kernel@vger.kernel.org"
Subject: Re: [PATCH] NFSv4: Use exponential backoff delay for NFS4_ERRDELAY
Date: Thu, 25 Apr 2013 14:10:36 +0000
Message-ID: <1366899034.6812.4.camel@leira.trondhjem.org>
References: <1366836949-18465-1-git-send-email-chiluk@canonical.com>
    <1366838926.22397.25.camel@leira.trondhjem.org>
    <5178549A.7010402@canonical.com>
    <1366842905.22397.49.camel@leira.trondhjem.org>
    <1366892374.26249.294.camel@localhost.localdomain>
    <20130425132907.GB31851@fieldses.org>
    <1366896654.4719.18.camel@leira.trondhjem.org>
    <20130425134918.GC31851@fieldses.org>
In-Reply-To: <20130425134918.GC31851@fieldses.org>
Content-Type: text/plain; charset=US-ASCII
MIME-Version: 1.0
Sender: linux-nfs-owner@vger.kernel.org
List-ID:

On Thu, 2013-04-25 at 09:49 -0400, bfields@fieldses.org wrote:
> On Thu, Apr 25, 2013 at 01:30:58PM +0000, Myklebust, Trond wrote:
> > On Thu, 2013-04-25 at 09:29 -0400, bfields@fieldses.org wrote:
> > >
> > > My position is that we simply have no idea what order of magnitude even
> > > delay should be. And that in such a situation exponential backoff such
> > > as implemented in the synchronous case seems the reasonable default as
> > > it guarantees at worst doubling the delay while still bounding the
> > > long-term average frequency of retries.
> >
> > So we start with a 15 second delay, and then go to 60 seconds?
>
> I agree that a server should normally be doing the wait on its own if
> the wait would be on the order of an rpc round trip.
>
> So I'd be inclined to start with a delay that was an order of magnitude
> or two more than a round trip.
>
> And I'd expect NFS isn't common on networks with 1-second latencies.
>
> So the 1/10 second we're using in the synchronous case sounds closer to
> the right ballpark to me.

OK, then. Now all I need is actual motivation for changing the existing
code other than handwaving arguments about "polling is better than flat
waits". What actual use cases are impacting us now, other than the AIX
design decision to force CLOSE to retry at least once before succeeding?

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com