Return-Path:
Received: from mailsrv.ikr.uni-stuttgart.de ([129.69.170.2]:56891 "EHLO
	mailsrv.ikr.uni-stuttgart.de" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1752972AbbHaMIN (ORCPT );
	Mon, 31 Aug 2015 08:08:13 -0400
From: Ulrich Gemkow
To: "J. Bruce Fields"
Subject: Re: NFSv4 mount fails on Sun Solaris 10 after reboot of client
Date: Mon, 31 Aug 2015 14:08:08 +0200
References: <201508241452.57718.ulrich.gemkow@ikr.uni-stuttgart.de>
	<201508262154.24455.ulrich.gemkow@ikr.uni-stuttgart.de>
	<20150826200940.GE4161@fieldses.org>
In-Reply-To: <20150826200940.GE4161@fieldses.org>
MIME-Version: 1.0
Cc: linux-nfs@vger.kernel.org
Content-Type: Text/Plain; charset="iso-8859-1"
Message-Id: <201508311408.10693.ulrich.gemkow@ikr.uni-stuttgart.de>
Sender: linux-nfs-owner@vger.kernel.org
List-ID:

Hello Bruce,

On Wednesday 26 August 2015 22:09:40 you wrote:
> On Wed, Aug 26, 2015 at 09:54:22PM +0200, Ulrich Gemkow wrote:
> > Hello Bruce,
> >
> > On Tuesday 25 August 2015 23:54:56 J. Bruce Fields wrote:
> > > The SERVERFAULT is on SETCLIENTID_CONFIRM.
> > >
> > > In nfsd4_setclientid_confirm():
> > >
> > > 	conf = find_confirmed_client(clid, false, nn);
> > > 	unconf = find_unconfirmed_client(clid, false, nn);
> > > 	/*
> > > 	 * We try hard to give out unique clientid's, so if we get an
> > > 	 * attempt to confirm the same clientid with a different cred,
> > > 	 * there's a bug somewhere. Let's charitably assume it's our
> > > 	 * bug.
> > > 	 */
> > > 	status = nfserr_serverfault;
> > > 	if (unconf && !same_creds(&unconf->cl_cred, &rqstp->rq_cred))
> > > 		goto out;
> > > 	if (conf && !same_creds(&conf->cl_cred, &rqstp->rq_cred))
> > > 		goto out;
> > >
> > > The SETCLIENTID and SETCLIENTID_CONFIRM are done with identical
> > > auth_unix creds.
> > >
> > > The clientid that we're looking up there was returned from the previous
> > > SETCLIENTID, generated by this logic:
> > >
> > > 	if (conf && same_verf(&conf->cl_verifier, &clverifier))
> > > 		/* case 1: probable callback update */
> > > 		copy_clid(new, conf);
> > > 	else /* case 4 (new client) or cases 2, 3 (client reboot): */
> > > 		gen_clid(new, nn);
> > >
> > > So it should be a brand new clientid, unless the client was reusing the old
> > > verifier.
> > >
> > > So perhaps the client is sending the SETCLIENTID with a verifier set to what it
> > > used on the previous boot? That sounds like a client bug. The linux
> > > client uses a timestamp for the verifier, looks like the Solaris client
> > > might too. Is there some reason the clock on this client isn't
> > > advancing on reboot?
> >
> > Thank you for the analysis. But the clock of the client advances
> > regularly and as one would expect.
>
> OK, thanks for checking that.
>
> > The client is SPARC Solaris 10 with the latest patches
> > applied - I cannot believe that this client has such a
> > basic NFS bug.
>
> To confirm or deny my hypothesis, I think what we want is a longer
> capture that gets the failing SETCLIENTID_CONFIRM (as seen in the
> previous capture) but also shows what clientid the client was using
> before the reboot. So ideal might be something like:
>
> 	- start the capture
> 	- mount
> 	- create a file (I just want to make sure the client does at
> 	  least one open)
> 	- reboot the client
> 	- mount again, see the failure
> 	- stop the capture

I tried, but probably made a mistake: to have a defined state for the
test, I rebooted the server, clearing all its NFS state, and
reinstalled the client - both with exactly the same configuration as
before. Now the bug unfortunately no longer occurs; the mount always
succeeds.
I had also reinstalled the client before my first mail, to be sure, so
it seems that the server had reached an invalid state - whatever may
have caused it.

I can only wait until the bug happens again (hoping it does not :-).
Maybe you are able to find a cause from the information given
earlier. I regret that I cannot be of more help. If there is anything
I can do, please tell me.

Thank you very much again and best regards

-Ulrich

--
| Ulrich Gemkow
| University of Stuttgart
| Institute of Communication Networks and Computer Engineering (IKR)