Date: Tue, 12 Nov 2013 11:08:31 -0500
From: Jeff Layton <jlayton@redhat.com>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: trond.myklebust@netapp.com, linux-nfs@vger.kernel.org, steved@redhat.com
Subject: Re: [PATCH 0/2] sunrpc: more reliable detection of running gssd
Message-ID: <20131112110831.72234c64@tlielax.poochiereds.net>
In-Reply-To: <A0488E98-0DF9-4ECE-A39D-C8E460E4C546@oracle.com>
References: <1384261225-28559-1-git-send-email-jlayton@redhat.com>
	<A0488E98-0DF9-4ECE-A39D-C8E460E4C546@oracle.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Sender: linux-nfs-owner@vger.kernel.org

On Tue, 12 Nov 2013 11:02:42 -0500
Chuck Lever <chuck.lever@oracle.com> wrote:

> 
> On Nov 12, 2013, at 8:00 AM, Jeff Layton <jlayton@redhat.com> wrote:
> 
> > We've gotten a lot of complaints recently about the 15s delay when
> > doing a sec=sys mount without gssd running.
> > 
> > A large part of the problem is that the kernel isn't able to reliably
> > detect when rpc.gssd is running. What we currently have is a
> > gssd_running flag that is initially set to 1. When an upcall times out,
> > that gets set to 0, and subsequent upcalls get a much shorter timeout
> > (1/4s instead of 15s). It's reset back to '1' when a pipe is reopened.
> > 
> > The approach of using a flag like this is pretty inadequate. First, it
> > doesn't eliminate the long delay on the initial upcall attempt. Also,
> > if gssd spontaneously dies, then the flag will still be set to 1 until
> > the next upcall attempt times out. Finally, it currently requires that
> > the pipe be reopened in order to reset the flag back to true.
> > 
> > This patchset replaces that flag with a more reliable mechanism for
> > detecting when gssd is running. When rpc_pipefs is mounted, it creates a
> > new "dummy" pipe that gssd will naturally find and hold open. We'll
> > never send an upcall down this pipe, and writing to it always fails.
> > But, since we can detect when something is holding it open, we can use
> > that to determine whether gssd is running.
> > 
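
To make the idea above concrete, here is a minimal userspace sketch in C. It
is only a model, not the code in the patchset: the struct, the function names
and the opener counter are all hypothetical, and the timeout values are simply
the 15s and 1/4s figures mentioned earlier, expressed in milliseconds. It
assumes the kernel can count how many processes hold the dummy pipe open.

```c
#include <stdbool.h>

/* Hypothetical model of the "dummy" pipe; names are illustrative only. */
struct dummy_pipe {
	unsigned int nopeners;	/* processes currently holding the pipe open */
};

/* Called when a process (e.g. gssd) opens the pipe. */
void dummy_pipe_open(struct dummy_pipe *p)
{
	p->nopeners++;
}

/* Called when a process closes the pipe or exits. */
void dummy_pipe_release(struct dummy_pipe *p)
{
	if (p->nopeners > 0)
		p->nopeners--;
}

/* The dummy pipe never carries upcalls, so writing to it always fails. */
int dummy_pipe_write(struct dummy_pipe *p)
{
	(void)p;
	return -1;
}

/* gssd is considered running while something holds the pipe open. */
bool gssd_is_running(const struct dummy_pipe *p)
{
	return p->nopeners > 0;
}

/* Assumed policy: long timeout only when a daemon appears to be present. */
unsigned int upcall_timeout_ms(const struct dummy_pipe *p)
{
	return gssd_is_running(p) ? 15000 : 250;
}
```

In this model the short timeout applies from the very first upcall when
nothing holds the pipe open, and the state tracks the daemon exiting without
waiting for an upcall to time out.
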
> > The current patch just replaces the gssd_running flag with this new
> > mechanism. This shortens the long delay when mounting without gssd
> > running, but does not silence these warnings:
> > 
> >    RPC: AUTH_GSS upcall timed out.
> >    Please check user daemon is running.
> > 
> > I'm willing to add a patch to do that, but I'm a little unclear on the
> > best way to do so. Those messages are generated by the auth_gss code. We
> > probably do want to print them if someone mounted with sec=krb5, but
> > suppress them when mounting with sec=sys.
> > 
> > Do we need to somehow pass down that intent to auth_gss? Another idea
> > would be to call gssd_running() from the nfs mount code and use that to
> > determine whether to try and use krb5 at all...
> > 
> > Discuss!
> 
> I'd like to pursue the module loading solution as well.
> 

Sorry, I missed that part of the discussion.

What's the module loading solution?

-- 
Jeff Layton <jlayton@redhat.com>