From: "Andrew Dixie"
Subject: Re: (fwd) nfs hang on 2.6.24
Date: Thu, 7 Feb 2008 10:19:06 +1300 (NZDT)
Message-ID: <37673.203.167.214.129.1202332746.squirrel@mail.orcon.net.nz>
References: <20080205090132.GA8286@stro.at>
 <1202248931.12271.18.camel@heimdal.trondhjem.org>
 <003301c86888$ed735a20$0301a8c0@MURTLE>
 <1202310021.12647.6.camel@heimdal.trondhjem.org>
 <20080206150739.GA5342@fieldses.org>
 <1202310924.12647.24.camel@heimdal.trondhjem.org>
 <20080206172315.GE5342@fieldses.org>
 <1202320337.14889.18.camel@heimdal.trondhjem.org>
 <20080206183128.GG5342@fieldses.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Cc: "Trond Myklebust", "Andrew Dixie", linux-nfs@vger.kernel.org, "maximilian attems"
To: "J. Bruce Fields"
Return-path:
Received: from mail-out3.orcon.net.nz ([219.88.242.31]:34565 "EHLO mx5.orcon.net.nz" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1758721AbYBFWRv (ORCPT ); Wed, 6 Feb 2008 17:17:51 -0500
Received: from Debian-exim by mx5.orcon.net.nz with local (Exim 4.67) (envelope-from ) id 1JMrfi-0006j5-Rc for linux-nfs@vger.kernel.org; Thu, 07 Feb 2008 10:19:10 +1300
In-Reply-To: <20080206183128.GG5342@fieldses.org>
Sender: linux-nfs-owner@vger.kernel.org
List-ID:

> Oh, right, I was confusing client and server reboot and assuming the
> client would forget the uniquifier on server reboot. That's obviously
> wrong! The client will forget its own uniquifier on client reboot, but
> that's alright since it's happy enough just to let that old state time
> out at that point. So the only possible problem is suboptimal behavior
> when the client reboot time is less than the lease time.

There is one client, a stable connection between client and server, and
neither client nor server is being rebooted.
Are the "string in use by client" messages still expected?

Below is a program that attempts to open a file that is contained in a
directory that has been deleted by another client.
I'm not sure these conditions occur normally; it's just something I
encountered while trying to reproduce the hang.

This reliably reproduces:
Feb 7 09:55:01 devfile kernel: NFSD: preprocess_seqid_op: bad seqid (expected 20, got 22)

And about 1 in 10 times it also reproduces:
Feb 7 09:55:01 devfile kernel: NFSD: setclientid: string in use by client(clientid 47a627bd/0000044b)

The server is 2.6.18-5 from debian.

---
/* Header names were lost from the archived copy; these are the ones the
 * calls below need. */
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>

#define ASSERT(x) \
    do { if (!(x)) { fprintf(stderr, "%s:%i:assert:" #x "\n", __FILE__, __LINE__); abort(); } } while (0)

#define testdir "/home/andrewd/testdir"
#define testfile testdir "/fred"

int main(int argc, char *argv[])
{
    int fd;
    int rv;

    rv = mkdir(testdir, 0777);
    ASSERT(rv == 0 || errno == EEXIST);

    /* O_CREAT requires a mode argument; 0666 is filtered by the umask. */
    fd = open(testfile, O_CREAT|O_WRONLY, 0666);
    ASSERT(fd != -1);
    rv = write(fd, "stuff\n", 6);
    ASSERT(rv == 6);
    close(fd);

    rv = access(testfile, 0);
    ASSERT(rv == 0);

    // Remove directory via another client (nfsv3)
    system("ssh devlin7 rm -r " testdir);

    // Try to open file
    fd = open(testfile, O_RDONLY);
    printf("got fd:%i errno:%i\n", fd, errno);
    // fd == -1, errno = ENOENT
    // This is expected, error on nfs server is not.

    return 0;
}
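
To count how often each message shows up over repeated runs (the "about 1 in 10" figure above), something like the following could be applied to each line of the server's kernel log. This is only a rough sketch: the function names parse_bad_seqid and is_clientid_in_use are made up for illustration, and the matching assumes the log text has exactly the form of the two messages quoted above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 and fills *expected and *got when the
 * line contains an nfsd "bad seqid" message in the form quoted above,
 * otherwise returns 0 and leaves the outputs untouched. */
int parse_bad_seqid(const char *line, unsigned *expected, unsigned *got)
{
    const char *p;
    unsigned e, g;

    assert(line != NULL);
    p = strstr(line, "preprocess_seqid_op: bad seqid");
    if (p == NULL)
        return 0;
    if (sscanf(p, "preprocess_seqid_op: bad seqid (expected %u, got %u)",
               &e, &g) != 2)
        return 0;
    *expected = e;
    *got = g;
    return 1;
}

/* Hypothetical helper: returns 1 when the line contains the setclientid
 * "string in use by client" message, otherwise 0. */
int is_clientid_in_use(const char *line)
{
    assert(line != NULL);
    return strstr(line, "setclientid: string in use by client") != NULL;
}
```

Feeding the log to these with fgets() and keeping two counters would give the ratio between the two messages; the difference got - expected shows how far the client's seqid had run ahead of what the server expected.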