From: jtk@us.ibm.com (John T. Kohl)
Subject: Re: [NFS] Re: [RFC] Multiple server selection and replicated mount failover
Date: 24 May 2006 16:45:04 -0400
Message-ID: <6cpsi36tkf.fsf@sumu.lexma.ibm.com>
References: <44745972.2010305@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Ian Kent , nfs@lists.sourceforge.net, linux-fsdevel , autofs mailing list
Return-path:
To: Peter Staubach
In-Reply-To: <44745972.2010305@redhat.com>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

>>>>> "PS" == Peter Staubach writes:

PS> When the Solaris client gets a timeout from an RPC, it checks to see
PS> whether this file and mount are failover'able.  This checks to see
PS> whether there are alternate servers in the list and could contain a
PS> check to see if there are locks existing on the file.  If there are
PS> locks, then don't failover.  The alternative to doing this is to
PS> attempt to move the lock, but this could be problematic because
PS> there would be no guarantee that the new lock could be acquired.
PS> Anyway, if the file is failover'able, then a new server is chosen
PS> from the list and the file handle associated with the file is
PS> remapped to the equivalent file on the new server.  This is done by
PS> repeating the lookups done to get the original file handle.  Once
PS> the new file handle is acquired, then some minimal checks are done
PS> to try to ensure that the files are the "same".  This is probably
PS> mostly checking to see whether the sizes of the two files are the
PS> same.
PS>
PS> Please note that this approach contains the interesting aspect that
PS> files are only failed over when they need to be and are not failed over
PS> proactively.  This can lead to the situation where processes using
PS> the file system can be talking to many of the different underlying
PS> servers, all at the same time.  If a server goes down and then comes back
PS> up before a process, which was talking to that server, notices, then it
PS> will just continue to use that server, while another process, which
PS> noticed the failed server, may have failed over to a new server.

If you have multiple processes talking to different server replicas, can
you then get cases where the processes aren't sharing the same files
given the same name?

Process "A" looks up /mount/a/b/c/file.c (using server 1), opens it, and
starts working on it.  It then sits around doing nothing for a while.

Process "B" cd's to /mount/a/b, gets a timeout, fails over to server 2,
and then looks up "c/file.c", which will be referencing the object on
server 2?

A & B then try locking to cooperate...

Are replicas only useful for read-only copies?  If they're read-only, do
locks even make sense?

-- 
John Kohl
Senior Software Engineer - Rational Software - IBM Software Group
Lexington, Massachusetts, USA
jtk@us.ibm.com
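
The A/B scenario above can be made concrete with a toy simulation. This is only a sketch under stated assumptions, not the Solaris or Linux client code: `Replica`, `Client` and `split_lock_scenario` are hypothetical names, an RPC timeout is modelled as a simple `up` flag, and each replica is assumed to keep its own lock table with no coordination between servers.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical model of one server; each keeps an independent lock table."""
    name: str
    up: bool = True
    locks: dict = field(default_factory=dict)  # path -> owning process name

    def lock(self, path, owner):
        """Grant the lock unless another owner already holds it on THIS replica."""
        holder = self.locks.setdefault(path, owner)
        return holder == owner


@dataclass
class Client:
    """Hypothetical process that fails over lazily, only on its own timeout."""
    name: str
    replicas: list
    current: int = 0

    def server(self):
        """Return the replica in use, moving down the list if it is unreachable."""
        for _ in range(len(self.replicas)):
            replica = self.replicas[self.current]
            if replica.up:  # assumption: "down" stands in for an RPC timeout
                return replica
            self.current = (self.current + 1) % len(self.replicas)
        raise RuntimeError("no replica available")


def split_lock_scenario():
    s1, s2 = Replica("server1"), Replica("server2")
    a = Client("A", [s1, s2])
    b = Client("B", [s1, s2])
    path = "/mount/a/b/c/file.c"

    a.server()       # A looks up the file via server 1, then sits idle
    s1.up = False    # server 1 goes down
    b.server()       # B gets a timeout and fails over to server 2
    s1.up = True     # server 1 comes back before A notices anything

    # Both processes now try to lock "the same" file by name.
    a_locked = a.server().lock(path, a.name)
    b_locked = b.server().lock(path, b.name)
    return a.server().name, b.server().name, a_locked, b_locked


if __name__ == "__main__":
    print(split_lock_scenario())
```

In this model both lock requests succeed because they land on different replicas, so the lock gives A and B no mutual exclusion at all, which is the concern behind the question about read-only replicas.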