2013-11-13 14:31:00

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v2 0/3] sunrpc/nfs: more reliable detection of running gssd

v2:
- change name of toplevel pipefs dir from "dummy" to "gssd" (per
Trond's suggestion)

- when gssd isn't running, don't bother to upcall (per Neil B.'s
suggestion)

- fix lifecycle of rpc_pipe data. Previously it would have leaked
after umount. With this set, it's created and destroyed along with
the netns, and just attached to the pipe inode on mount/unmount
of rpc_pipefs.

- patch has been added to skip attempting setclientid with krb5i
if gssd isn't running. This avoids the "AUTH_GSS upcall timed out"
message when gssd isn't running and you mount with sec=sys. It also
shortens the delay when gssd isn't up.

The original cover letter from the v1 posting follows. Note that this
set does address the warnings about the AUTH_GSS upcall timing out.

-------------------------[snip]-----------------------------

We've gotten a lot of complaints recently about the 15s delay when
doing a sec=sys mount without gssd running.

A large part of the problem is that the kernel isn't able to reliably
detect when rpc.gssd is running. What we currently have is a
gssd_running flag that is initially set to 1. When an upcall times out,
that gets set to 0, and subsequent upcalls get a much shorter timeout
(1/4s instead of 15s). It's reset back to '1' when a pipe is reopened.

The approach of using a flag like this is pretty inadequate. First, it
doesn't eliminate the long delay on the initial upcall attempt. Also,
if gssd spontaneously dies, then the flag will still be set to 1 until
the next upcall attempt times out. Finally, it currently requires that
the pipe be reopened in order to reset the flag back to true.

This patchset replaces that flag with a more reliable mechanism for
detecting when gssd is running. When rpc_pipefs is mounted, it creates a
new "dummy" pipe that gssd will naturally find and hold open. We'll
never send an upcall down this pipe, and writing to it always fails.
But, since we can detect when something is holding it open, we can use
that to determine whether gssd is running.

The current patch just uses this mechanism to replace the gssd_running
flag with this new mechanism. This shortens the long delay when mounting
without gssd running, but does not silence these warnings:

RPC: AUTH_GSS upcall timed out.
Please check user daemon is running.

I'm willing to add a patch to do that, but I'm a little unclear on the
best way to do so. Those messages are generated by the auth_gss code. We
probably do want to print them if someone mounted with sec=krb5, but
suppress them when mounting with sec=sys.

Do we need to somehow pass down that intent to auth_gss? Another idea
would be to call gssd_running() from the nfs mount code and use that to
determine whether to try and use krb5 at all...

Discuss!

Jeff Layton (3):
sunrpc: create a new dummy pipe for gssd to hold open
sunrpc: replace sunrpc_net->gssd_running flag with a more reliable
check
nfs: check if gssd is running before attempting to use krb5i auth in
SETCLIENTID call

fs/nfs/client.c | 5 +-
fs/nfs/internal.h | 4 +-
fs/nfs/nfs4client.c | 8 ++-
include/linux/nfs_xdr.h | 2 +-
include/linux/sunrpc/auth_gss.h | 10 ++++
include/linux/sunrpc/rpc_pipe_fs.h | 7 ++-
net/sunrpc/auth_gss/auth_gss.c | 19 +++++--
net/sunrpc/netns.h | 3 +-
net/sunrpc/rpc_pipe.c | 104 ++++++++++++++++++++++++++++++++++---
net/sunrpc/sunrpc_syms.c | 8 ++-
10 files changed, 147 insertions(+), 23 deletions(-)

--
1.8.3.1



2013-11-13 14:31:03

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v2 2/3] sunrpc: replace sunrpc_net->gssd_running flag with a more reliable check

Now that we have a more reliable method to tell if gssd is running, we
can replace the sn->gssd_running flag with a function that will query to
see if it's up and running.

There's also no need to attempt an upcall that we know will fail, so
just return -EACCES if gssd isn't running in that case.

Signed-off-by: Jeff Layton <[email protected]>
---
include/linux/sunrpc/rpc_pipe_fs.h | 4 ++++
net/sunrpc/auth_gss/auth_gss.c | 18 +++++++++++++-----
net/sunrpc/netns.h | 2 --
net/sunrpc/rpc_pipe.c | 11 +++++++----
4 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/include/linux/sunrpc/rpc_pipe_fs.h b/include/linux/sunrpc/rpc_pipe_fs.h
index 85f1342..6312d5d 100644
--- a/include/linux/sunrpc/rpc_pipe_fs.h
+++ b/include/linux/sunrpc/rpc_pipe_fs.h
@@ -5,6 +5,8 @@

#include <linux/workqueue.h>

+struct sunrpc_net;
+
struct rpc_pipe_dir_head {
struct list_head pdh_entries;
struct dentry *pdh_dentry;
@@ -131,5 +133,7 @@ extern int rpc_unlink(struct dentry *);
extern int register_rpc_pipefs(void);
extern void unregister_rpc_pipefs(void);

+extern bool rpc_pipe_is_open(struct rpc_pipe *pipe);
+
#endif
#endif
diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
index 97912b4..399390e 100644
--- a/net/sunrpc/auth_gss/auth_gss.c
+++ b/net/sunrpc/auth_gss/auth_gss.c
@@ -592,6 +592,14 @@ out:
return err;
}

+static bool
+gssd_running(struct net *net)
+{
+ struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+
+ return rpc_pipe_is_open(sn->gssd_dummy);
+}
+
static inline int
gss_create_upcall(struct gss_auth *gss_auth, struct gss_cred *gss_cred)
{
@@ -608,17 +616,17 @@ gss_create_upcall(struct gss_auth *gss_auth, struct gss_cred *gss_cred)
__func__, from_kuid(&init_user_ns, cred->cr_uid));
retry:
err = 0;
- /* Default timeout is 15s unless we know that gssd is not running */
+ /* if gssd is down, just skip upcalling altogether */
+ if (!gssd_running(net)) {
+ warn_gssd();
+ return -EACCES;
+ }
timeout = 15 * HZ;
- if (!sn->gssd_running)
- timeout = HZ >> 2;
gss_msg = gss_setup_upcall(gss_auth, cred);
if (PTR_ERR(gss_msg) == -EAGAIN) {
err = wait_event_interruptible_timeout(pipe_version_waitqueue,
sn->pipe_version >= 0, timeout);
if (sn->pipe_version < 0) {
- if (err == 0)
- sn->gssd_running = 0;
warn_gssd();
err = -EACCES;
}
diff --git a/net/sunrpc/netns.h b/net/sunrpc/netns.h
index 8a8e841..94e506f 100644
--- a/net/sunrpc/netns.h
+++ b/net/sunrpc/netns.h
@@ -33,8 +33,6 @@ struct sunrpc_net {
int pipe_version;
atomic_t pipe_users;
struct proc_dir_entry *use_gssp_proc;
-
- unsigned int gssd_running;
};

extern int sunrpc_net_id;
diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index 34efdbf..917e385 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -216,14 +216,11 @@ rpc_destroy_inode(struct inode *inode)
static int
rpc_pipe_open(struct inode *inode, struct file *filp)
{
- struct net *net = inode->i_sb->s_fs_info;
- struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
struct rpc_pipe *pipe;
int first_open;
int res = -ENXIO;

mutex_lock(&inode->i_mutex);
- sn->gssd_running = 1;
pipe = RPC_I(inode)->pipe;
if (pipe == NULL)
goto out;
@@ -1231,7 +1228,6 @@ int rpc_pipefs_init_net(struct net *net)
return PTR_ERR(sn->gssd_dummy);

mutex_init(&sn->pipefs_sb_lock);
- sn->gssd_running = 1;
sn->pipe_version = -1;
return 0;
}
@@ -1385,6 +1381,13 @@ err_depopulate:
return err;
}

+bool
+rpc_pipe_is_open(struct rpc_pipe *pipe)
+{
+ return (pipe->nreaders || pipe->nwriters);
+}
+EXPORT_SYMBOL_GPL(rpc_pipe_is_open);
+
static struct dentry *
rpc_mount(struct file_system_type *fs_type,
int flags, const char *dev_name, void *data)
--
1.8.3.1


2013-11-13 14:39:03

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] nfs: check if gssd is running before attempting to use krb5i auth in SETCLIENTID call


On Nov 13, 2013, at 9:30 AM, Jeff Layton <[email protected]> wrote:

> Currently, the client will attempt to use krb5i in the SETCLIENTID call
> even if rpc.gssd is running. If that fails, it'll then fall back to
> RPC_AUTH_UNIX. This introduced a delay when mounting if rpc.gssd isn't
> running, and causes warning messages to pop up in the ring buffer.
>
> Check to see if rpc.gssd is running before even attempting to use krb5i
> auth, and just silently skip trying to do so if it isn't.
>
> Signed-off-by: Jeff Layton <[email protected]>
> ---
> fs/nfs/client.c | 5 +++--
> fs/nfs/internal.h | 4 ++--
> fs/nfs/nfs4client.c | 8 ++++++--
> include/linux/nfs_xdr.h | 2 +-
> include/linux/sunrpc/auth_gss.h | 10 ++++++++++
> net/sunrpc/auth_gss/auth_gss.c | 3 ++-
> 6 files changed, 24 insertions(+), 8 deletions(-)
>
> diff --git a/fs/nfs/client.c b/fs/nfs/client.c
> index 1d09289..fbee354 100644
> --- a/fs/nfs/client.c
> +++ b/fs/nfs/client.c
> @@ -501,7 +501,8 @@ nfs_get_client(const struct nfs_client_initdata *cl_init,
> &nn->nfs_client_list);
> spin_unlock(&nn->nfs_client_lock);
> new->cl_flags = cl_init->init_flags;
> - return rpc_ops->init_client(new, timeparms, ip_addr);
> + return rpc_ops->init_client(new, timeparms, ip_addr,
> + cl_init->net);
> }
>
> spin_unlock(&nn->nfs_client_lock);
> @@ -700,7 +701,7 @@ EXPORT_SYMBOL_GPL(nfs_init_server_rpcclient);
> */
> struct nfs_client *nfs_init_client(struct nfs_client *clp,
> const struct rpc_timeout *timeparms,
> - const char *ip_addr)
> + const char *ip_addr, struct net *net)
> {
> int error;
>
> diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
> index bca6a3e..69d3d1c 100644
> --- a/fs/nfs/internal.h
> +++ b/fs/nfs/internal.h
> @@ -273,7 +273,7 @@ extern struct rpc_procinfo nfs4_procedures[];
> void nfs_close_context(struct nfs_open_context *ctx, int is_sync);
> extern struct nfs_client *nfs_init_client(struct nfs_client *clp,
> const struct rpc_timeout *timeparms,
> - const char *ip_addr);
> + const char *ip_addr, struct net *net);
>
> /* dir.c */
> extern unsigned long nfs_access_cache_count(struct shrinker *shrink,
> @@ -462,7 +462,7 @@ extern ssize_t nfs_dreq_bytes_left(struct nfs_direct_req *dreq);
> extern void __nfs4_read_done_cb(struct nfs_read_data *);
> extern struct nfs_client *nfs4_init_client(struct nfs_client *clp,
> const struct rpc_timeout *timeparms,
> - const char *ip_addr);
> + const char *ip_addr, struct net *net);
> extern int nfs40_walk_client_list(struct nfs_client *clp,
> struct nfs_client **result,
> struct rpc_cred *cred);
> diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
> index b4a160a..988aebf 100644
> --- a/fs/nfs/nfs4client.c
> +++ b/fs/nfs/nfs4client.c
> @@ -8,6 +8,7 @@
> #include <linux/nfs_mount.h>
> #include <linux/sunrpc/addr.h>
> #include <linux/sunrpc/auth.h>
> +#include <linux/sunrpc/auth_gss.h>
> #include <linux/sunrpc/xprt.h>
> #include <linux/sunrpc/bc_xprt.h>
> #include "internal.h"
> @@ -351,7 +352,7 @@ static int nfs4_init_client_minor_version(struct nfs_client *clp)
> */
> struct nfs_client *nfs4_init_client(struct nfs_client *clp,
> const struct rpc_timeout *timeparms,
> - const char *ip_addr)
> + const char *ip_addr, struct net *net)
> {
> char buf[INET6_ADDRSTRLEN + 1];
> struct nfs_client *old;
> @@ -370,7 +371,10 @@ struct nfs_client *nfs4_init_client(struct nfs_client *clp,
> __set_bit(NFS_CS_INFINITE_SLOTS, &clp->cl_flags);
> __set_bit(NFS_CS_DISCRTRY, &clp->cl_flags);
> __set_bit(NFS_CS_NO_RETRANS_TIMEOUT, &clp->cl_flags);
> - error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_GSS_KRB5I);
> +
> + error = -EINVAL;
> + if (gssd_running(net))
> + error = nfs_create_rpc_client(clp, timeparms,RPC_AUTH_GSS_KRB5I);
> if (error == -EINVAL)
> error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_UNIX);

This feels like a layering violation. Why wouldn't you put a gssd_running check in gss_create(), for example, and have rpcauth_create() return -EINVAL?


> if (error < 0)
> diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
> index 3ccfcec..e75b2cc 100644
> --- a/include/linux/nfs_xdr.h
> +++ b/include/linux/nfs_xdr.h
> @@ -1486,7 +1486,7 @@ struct nfs_rpc_ops {
> struct nfs_client *(*alloc_client) (const struct nfs_client_initdata *);
> struct nfs_client *
> (*init_client) (struct nfs_client *, const struct rpc_timeout *,
> - const char *);
> + const char *, struct net *);
> void (*free_client) (struct nfs_client *);
> struct nfs_server *(*create_server)(struct nfs_mount_info *, struct nfs_subversion *);
> struct nfs_server *(*clone_server)(struct nfs_server *, struct nfs_fh *,
> diff --git a/include/linux/sunrpc/auth_gss.h b/include/linux/sunrpc/auth_gss.h
> index f1cfd4c..cecf684 100644
> --- a/include/linux/sunrpc/auth_gss.h
> +++ b/include/linux/sunrpc/auth_gss.h
> @@ -86,6 +86,16 @@ struct gss_cred {
> unsigned long gc_upcall_timestamp;
> };
>
> +#if IS_ENABLED(CONFIG_SUNRPC_GSS)
> +extern bool gssd_running(struct net *net);
> +#else /* !CONFIG_SUNRPC_GSS */
> +static inline bool
> +gssd_running(struct net *net)
> +{
> + return false;
> +}
> +#endif /* CONFIG_SUNRPC_GSS */
> +
> #endif /* __KERNEL__ */
> #endif /* _LINUX_SUNRPC_AUTH_GSS_H */
>
> diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
> index 399390e..0cc2174 100644
> --- a/net/sunrpc/auth_gss/auth_gss.c
> +++ b/net/sunrpc/auth_gss/auth_gss.c
> @@ -592,13 +592,14 @@ out:
> return err;
> }
>
> -static bool
> +bool
> gssd_running(struct net *net)
> {
> struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
>
> return rpc_pipe_is_open(sn->gssd_dummy);
> }
> +EXPORT_SYMBOL_GPL(gssd_running);
>
> static inline int
> gss_create_upcall(struct gss_auth *gss_auth, struct gss_cred *gss_cred)
> --
> 1.8.3.1
>

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





2013-11-13 15:54:49

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] nfs: check if gssd is running before attempting to use krb5i auth in SETCLIENTID call

On Wed, 13 Nov 2013 15:49:13 +0000
"Myklebust, Trond" <[email protected]> wrote:

>
> On Nov 13, 2013, at 10:35, Jeff Layton <[email protected]> wrote:
>
> > On Wed, 13 Nov 2013 10:14:30 -0500
> > Chuck Lever <[email protected]> wrote:
> >
> >>
> >> On Nov 13, 2013, at 9:48 AM, "Myklebust, Trond" <[email protected]> wrote:
> >>
> >>>
> >>> On Nov 13, 2013, at 9:38, Chuck Lever <[email protected]> wrote:
> >>>
> >>>>
> >>>> On Nov 13, 2013, at 9:30 AM, Jeff Layton <[email protected]> wrote:
> >>>>
> >>>>> Currently, the client will attempt to use krb5i in the SETCLIENTID call
> >>>>> even if rpc.gssd is running. If that fails, it'll then fall back to
> >>>>> RPC_AUTH_UNIX. This introduced a delay when mounting if rpc.gssd isn't
> >>>>> running, and causes warning messages to pop up in the ring buffer.
> >>>>>
> >>>>> Check to see if rpc.gssd is running before even attempting to use krb5i
> >>>>> auth, and just silently skip trying to do so if it isn't.
> >>>>>
> >>>>> Signed-off-by: Jeff Layton <[email protected]>
> >>>>> ---
> >>>>>
> >>>>> diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
> >>>>> index b4a160a..988aebf 100644
> >>>>> --- a/fs/nfs/nfs4client.c
> >>>>> +++ b/fs/nfs/nfs4client.c
> >>>>> @@ -8,6 +8,7 @@
> >>>>> #include <linux/nfs_mount.h>
> >>>>> #include <linux/sunrpc/addr.h>
> >>>>> #include <linux/sunrpc/auth.h>
> >>>>> +#include <linux/sunrpc/auth_gss.h>
> >>>>> #include <linux/sunrpc/xprt.h>
> >>>>> #include <linux/sunrpc/bc_xprt.h>
> >>>>> #include "internal.h"
> >>>>> @@ -351,7 +352,7 @@ static int nfs4_init_client_minor_version(struct nfs_client *clp)
> >>>>> */
> >>>>> struct nfs_client *nfs4_init_client(struct nfs_client *clp,
> >>>>> const struct rpc_timeout *timeparms,
> >>>>> - const char *ip_addr)
> >>>>> + const char *ip_addr, struct net *net)
> >>>
> >>> Why the ‘struct net’ argument? clp->cl_net should already be initialized here.
> >>>
> >>>>> {
> >>>>> char buf[INET6_ADDRSTRLEN + 1];
> >>>>> struct nfs_client *old;
> >>>>> @@ -370,7 +371,10 @@ struct nfs_client *nfs4_init_client(struct nfs_client *clp,
> >>>>> __set_bit(NFS_CS_INFINITE_SLOTS, &clp->cl_flags);
> >>>>> __set_bit(NFS_CS_DISCRTRY, &clp->cl_flags);
> >>>>> __set_bit(NFS_CS_NO_RETRANS_TIMEOUT, &clp->cl_flags);
> >>>>> - error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_GSS_KRB5I);
> >>>>> +
> >>>>> + error = -EINVAL;
> >>>>> + if (gssd_running(net))
> >>>>> + error = nfs_create_rpc_client(clp, timeparms,RPC_AUTH_GSS_KRB5I);
> >>>>> if (error == -EINVAL)
> >>>>> error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_UNIX);
> >>>>
> >>>> This feels like a layering violation. Why wouldn't you put a gssd_running check in gss_create(), for example, and have rpcauth_create() return -EINVAL?
> >>>>
> >>>
> >>> It would be better to put it in the upcall.
> >>
> >> Waiting until the upcall has its benefits. At that point, GSSD (if it is running) can report other errors, such as that there is no material to form a machine credential.
> >>
> >> My point to Jeff is that, aside from architectural aesthetics, direct calls from RPC consumers to the GSS layer causes a module dependency problem. The right way to plumb this is to create an RPC client API that invokes gssd_running() but only if auth_rpcgss.ko is loaded.
>
> Chuck, I’ve already told you that the auth_rpcgss dependency is a non-starter. Turning off automatic loading of rpcsec_gss modules is a _worse_ regression than the ones we already have and (as I already said) can be trivially defeated by just compiling them into the kernel.
>
> We _want_ the kernel to be able to autoload modules so that we can add new security flavours etc without having to recompile.
>
> >> However, I don't see why the existing RPC client APIs shouldn't just fail where appropriate if GSSD isn't available. Is there a strong need to expose gssd_running() as a separate API?
> >>
> >
> > One of the complaints about this whole "use krb5i by default" change is
> > that we now get the warnings in the ring buffer when gssd isn't
> > running. That's a good thing if krb5 was explicitly requested, but is
> > useless and confusing if the user just wants to use AUTH_SYS.
> >
> > If we wait until gss_create, it's too late to know what the "intent"
> > was. We'll either fire the warning inappropriately like we do now, or
> > miss it altogether when we actually should have printed it.
>
> What if the user intended to use krb5i, but the daemon failed to start? This whole “kernel second guessing what the admin _actually_ wanted to do” thing is a red herring. Let’s just fix the real problem and then leave it at that.
>

In that case, they will get a failure and warning when they go to the
next stage of the mount (I forget which RPC it is). With this change,
krb5/krb5i mounts will still fail just like they do today when gssd
isn't running. You just get a single warning in the ring buffer about
it instead of two.

--
Jeff Layton <[email protected]>

2013-11-13 16:58:08

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] nfs: check if gssd is running before attempting to use krb5i auth in SETCLIENTID call


On Nov 13, 2013, at 10:35 AM, Jeff Layton <[email protected]> wrote:

> On Wed, 13 Nov 2013 10:14:30 -0500
> Chuck Lever <[email protected]> wrote:
>
>>
>> On Nov 13, 2013, at 9:48 AM, "Myklebust, Trond" <[email protected]> wrote:
>>
>>>
>>> On Nov 13, 2013, at 9:38, Chuck Lever <[email protected]> wrote:
>>>
>>>>
>>>> On Nov 13, 2013, at 9:30 AM, Jeff Layton <[email protected]> wrote:
>>>>
>>>>> Currently, the client will attempt to use krb5i in the SETCLIENTID call
>>>>> even if rpc.gssd is running. If that fails, it'll then fall back to
>>>>> RPC_AUTH_UNIX. This introduced a delay when mounting if rpc.gssd isn't
>>>>> running, and causes warning messages to pop up in the ring buffer.
>>>>>
>>>>> Check to see if rpc.gssd is running before even attempting to use krb5i
>>>>> auth, and just silently skip trying to do so if it isn't.
>>>>>
>>>>> Signed-off-by: Jeff Layton <[email protected]>
>>>>> ---
>>>>>
>>>>> diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
>>>>> index b4a160a..988aebf 100644
>>>>> --- a/fs/nfs/nfs4client.c
>>>>> +++ b/fs/nfs/nfs4client.c
>>>>> @@ -8,6 +8,7 @@
>>>>> #include <linux/nfs_mount.h>
>>>>> #include <linux/sunrpc/addr.h>
>>>>> #include <linux/sunrpc/auth.h>
>>>>> +#include <linux/sunrpc/auth_gss.h>
>>>>> #include <linux/sunrpc/xprt.h>
>>>>> #include <linux/sunrpc/bc_xprt.h>
>>>>> #include "internal.h"
>>>>> @@ -351,7 +352,7 @@ static int nfs4_init_client_minor_version(struct nfs_client *clp)
>>>>> */
>>>>> struct nfs_client *nfs4_init_client(struct nfs_client *clp,
>>>>> const struct rpc_timeout *timeparms,
>>>>> - const char *ip_addr)
>>>>> + const char *ip_addr, struct net *net)
>>>
>>> Why the ?struct net? argument? clp->cl_net should already be initialized here.
>>>
>>>>> {
>>>>> char buf[INET6_ADDRSTRLEN + 1];
>>>>> struct nfs_client *old;
>>>>> @@ -370,7 +371,10 @@ struct nfs_client *nfs4_init_client(struct nfs_client *clp,
>>>>> __set_bit(NFS_CS_INFINITE_SLOTS, &clp->cl_flags);
>>>>> __set_bit(NFS_CS_DISCRTRY, &clp->cl_flags);
>>>>> __set_bit(NFS_CS_NO_RETRANS_TIMEOUT, &clp->cl_flags);
>>>>> - error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_GSS_KRB5I);
>>>>> +
>>>>> + error = -EINVAL;
>>>>> + if (gssd_running(net))
>>>>> + error = nfs_create_rpc_client(clp, timeparms,RPC_AUTH_GSS_KRB5I);
>>>>> if (error == -EINVAL)
>>>>> error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_UNIX);
>>>>
>>>> This feels like a layering violation. Why wouldn't you put a gssd_running check in gss_create(), for example, and have rpcauth_create() return -EINVAL?
>>>>
>>>
>>> It would be better to put it in the upcall.
>>
>> Waiting until the upcall has its benefits. At that point, GSSD (if it is running) can report other errors, such as that there is no material to form a machine credential.
>>
>> My point to Jeff is that, aside from architectural aesthetics, direct calls from RPC consumers to the GSS layer causes a module dependency problem. The right way to plumb this is to create an RPC client API that invokes gssd_running() but only if auth_rpcgss.ko is loaded.
>>
>> However, I don't see why the existing RPC client APIs shouldn't just fail where appropriate if GSSD isn't available. Is there a strong need to expose gssd_running() as a separate API?
>>
>
> One of the complaints about this whole "use krb5i by default" change is
> that we now get the warnings in the ring buffer when gssd isn't
> running. That's a good thing if krb5 was explicitly requested, but is
> useless and confusing if the user just wants to use AUTH_SYS.
>
> If we wait until gss_create, it's too late to know what the "intent"
> was. We'll either fire the warning inappropriately like we do now, or
> miss it altogether when we actually should have printed it.

Right. If the kernel autoloads auth_rpcgss.ko, it has no way to know whether the administrator forgot to load it, or whether the administrator didn't load it because user space isn't configured. Thus we have to make the kernel more complicated to deal with both of those cases.

We don't have another security flavor that has both a kernel and user space component. That's why I think auth_rpcgss.ko should be handled differently. That's off topic, though.

> So, that was the main reason for the layering violation here. I do
> however get your point on the module dependency, but IIUC...don't we get
> that anyway?

Strictly speaking, you can't depend on gssd_running() to be available until auth_rpcgss.ko is loaded.

You are invoking gssd_running() before nfs_create_rpc_client() has had a chance to autoload auth_rpcgss.ko. Thus, the very first time through this path, gssd_running() may not yet be available.

If you need this API in the NFS client, you have to wrap it with something that ensures auth_rpcgss.ko is loaded first, then invokes gssd_running(). See rpcauth_get_pseudoflavor() and rpcauth_get_gssinfo() for examples.

Yucky, huh?

> Now that you try using krb5i by default for SETCLIENTID,
> don't we end up loading auth_gss.ko anyway on every nfsv4 mount
> regardless?

Yes we do, but that's the problem I'm trying to address elsewhere. (Note though, that the very first time through, even that doesn't protect you from calling a function that may not be loaded yet).

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





2013-11-13 16:18:35

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] nfs: check if gssd is running before attempting to use krb5i auth in SETCLIENTID call

On Wed, 13 Nov 2013 11:10:52 -0500
Chuck Lever <[email protected]> wrote:

>
> On Nov 13, 2013, at 11:09 AM, Jeff Layton <[email protected]> wrote:
>
> > On Wed, 13 Nov 2013 10:57:02 -0500
> > Jeff Layton <[email protected]> wrote:
> >
> >> On Wed, 13 Nov 2013 15:49:13 +0000
> >> "Myklebust, Trond" <[email protected]> wrote:
> >>
> >>>
> >>> On Nov 13, 2013, at 10:35, Jeff Layton <[email protected]> wrote:
> >>>
> >>>> On Wed, 13 Nov 2013 10:14:30 -0500
> >>>> Chuck Lever <[email protected]> wrote:
> >>>>
> >>>>>
> >>>>> On Nov 13, 2013, at 9:48 AM, "Myklebust, Trond" <[email protected]> wrote:
> >>>>>
> >>>>>>
> >>>>>> On Nov 13, 2013, at 9:38, Chuck Lever <[email protected]> wrote:
> >>>>>>
> >>>>>>>
> >>>>>>> On Nov 13, 2013, at 9:30 AM, Jeff Layton <[email protected]> wrote:
> >>>>>>>
> >>>>>>>> Currently, the client will attempt to use krb5i in the SETCLIENTID call
> >>>>>>>> even if rpc.gssd is running. If that fails, it'll then fall back to
> >>>>>>>> RPC_AUTH_UNIX. This introduced a delay when mounting if rpc.gssd isn't
> >>>>>>>> running, and causes warning messages to pop up in the ring buffer.
> >>>>>>>>
> >>>>>>>> Check to see if rpc.gssd is running before even attempting to use krb5i
> >>>>>>>> auth, and just silently skip trying to do so if it isn't.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Jeff Layton <[email protected]>
> >>>>>>>> ---
> >>>>>>>>
> >>>>>>>> diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
> >>>>>>>> index b4a160a..988aebf 100644
> >>>>>>>> --- a/fs/nfs/nfs4client.c
> >>>>>>>> +++ b/fs/nfs/nfs4client.c
> >>>>>>>> @@ -8,6 +8,7 @@
> >>>>>>>> #include <linux/nfs_mount.h>
> >>>>>>>> #include <linux/sunrpc/addr.h>
> >>>>>>>> #include <linux/sunrpc/auth.h>
> >>>>>>>> +#include <linux/sunrpc/auth_gss.h>
> >>>>>>>> #include <linux/sunrpc/xprt.h>
> >>>>>>>> #include <linux/sunrpc/bc_xprt.h>
> >>>>>>>> #include "internal.h"
> >>>>>>>> @@ -351,7 +352,7 @@ static int nfs4_init_client_minor_version(struct nfs_client *clp)
> >>>>>>>> */
> >>>>>>>> struct nfs_client *nfs4_init_client(struct nfs_client *clp,
> >>>>>>>> const struct rpc_timeout *timeparms,
> >>>>>>>> - const char *ip_addr)
> >>>>>>>> + const char *ip_addr, struct net *net)
> >>>>>>
> >>>>>> Why the ‘struct net’ argument? clp->cl_net should already be initialized here.
> >>>>>>
> >>>>>>>> {
> >>>>>>>> char buf[INET6_ADDRSTRLEN + 1];
> >>>>>>>> struct nfs_client *old;
> >>>>>>>> @@ -370,7 +371,10 @@ struct nfs_client *nfs4_init_client(struct nfs_client *clp,
> >>>>>>>> __set_bit(NFS_CS_INFINITE_SLOTS, &clp->cl_flags);
> >>>>>>>> __set_bit(NFS_CS_DISCRTRY, &clp->cl_flags);
> >>>>>>>> __set_bit(NFS_CS_NO_RETRANS_TIMEOUT, &clp->cl_flags);
> >>>>>>>> - error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_GSS_KRB5I);
> >>>>>>>> +
> >>>>>>>> + error = -EINVAL;
> >>>>>>>> + if (gssd_running(net))
> >>>>>>>> + error = nfs_create_rpc_client(clp, timeparms,RPC_AUTH_GSS_KRB5I);
> >>>>>>>> if (error == -EINVAL)
> >>>>>>>> error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_UNIX);
> >>>>>>>
> >>>>>>> This feels like a layering violation. Why wouldn't you put a gssd_running check in gss_create(), for example, and have rpcauth_create() return -EINVAL?
> >>>>>>>
> >>>>>>
> >>>>>> It would be better to put it in the upcall.
> >>>>>
> >>>>> Waiting until the upcall has its benefits. At that point, GSSD (if it is running) can report other errors, such as that there is no material to form a machine credential.
> >>>>>
> >>>>> My point to Jeff is that, aside from architectural aesthetics, direct calls from RPC consumers to the GSS layer causes a module dependency problem. The right way to plumb this is to create an RPC client API that invokes gssd_running() but only if auth_rpcgss.ko is loaded.
> >>>
> >>> Chuck, I’ve already told you that the auth_rpcgss dependency is a non-starter. Turning off automatic loading of rpcsec_gss modules is a _worse_ regression than the ones we already have and (as I already said) can be trivially defeated by just compiling them into the kernel.
> >>>
> >>> We _want_ the kernel to be able to autoload modules so that we can add new security flavours etc without having to recompile.
> >>>
> >
> > Right, and just because auth_gss isn't currently plugged in, doesn't
> > mean that it's not able to be plugged in. If this is the first mount
> > attempt then it's likely that auth_gss isn't loaded yet, even if
> > rpc.gssd is running.
> >
> >>>>> However, I don't see why the existing RPC client APIs shouldn't just fail where appropriate if GSSD isn't available. Is there a strong need to expose gssd_running() as a separate API?
> >>>>>
> >>>>
> >>>> One of the complaints about this whole "use krb5i by default" change is
> >>>> that we now get the warnings in the ring buffer when gssd isn't
> >>>> running. That's a good thing if krb5 was explicitly requested, but is
> >>>> useless and confusing if the user just wants to use AUTH_SYS.
> >>>>
> >>>> If we wait until gss_create, it's too late to know what the "intent"
> >>>> was. We'll either fire the warning inappropriately like we do now, or
> >>>> miss it altogether when we actually should have printed it.
> >>>
> >>> What if the user intended to use krb5i, but the daemon failed to start? This whole “kernel second guessing what the admin _actually_ wanted to do” thing is a red herring. Let’s just fix the real problem and then leave it at that.
> >>>
> >>
> >> In that case, they will get a failure and warning when they go to the
> >> next stage of the mount (I forget which RPC it is). With this change,
> >> krb5/krb5i mounts will still fail just like they do today when gssd
> >> isn't running. You just get a single warning in the ring buffer about
> >> it instead of two.
> >>
> >
> > So to clarify...today we do this when gssd isn't running and we try an
> > AUTH_GSS mount:
> >
> > - attempt SETCLIENTID with krb5i
> > - when that fails, log a warning to ring buffer
> > - attempt SETCLIENTID with AUTH_SYS
> > - attempt rest of mount with krb5i
>
> Hold it. This step should not be happening. Lease management should try krb5i by default, but why is the rest of the mount attempted with krb5i?
>

Sorry, I should have been more clear...the rest of the mount is
attempted with krb5i because sec=krb5i was specified on the command
line.

IOW, this patch just shortcuts attempting to do the lease
establishment with krb5i when we know that that will fail. The main
benefit being that we don't end up logging a warning about AUTH_GSS not
running in that case.

The warning will be logged if/when a later call attempts to use GSSAPI.

> > - log another warning to ring buffer when it fails
> >
> > ...with the first two patches, that doesn't really change. With the
> > third patch in place, we just skip the first two steps if gssd isn't
> > running. You'll still get a single warning in the ring buffer (which is
> > as expected).
> >
> > Trond, is that acceptable or do you want me to just drop the 3rd patch?
>
> NAK from me on the third patch as it stands. Find some other way to invoke gssd_running() if you really need to do that.
>

Again, it's not clear to me why this is a concern. You're attempting to
call auth_gss functions in order to do the SETCLIENTID call anyway, why
is calling gssd_running prior to doing that a problem? In what
situations will that break?

--
Jeff Layton <[email protected]>

2013-11-14 19:26:02

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] sunrpc/nfs: more reliable detection of running gssd

On Wed, Nov 13, 2013 at 09:30:50AM -0500, Jeff Layton wrote:
> v2:
> - change name of toplevel pipefs dir from "dummy" to "gssd" (per
> Trond's suggestion)
>
> - when gssd isn't running, don't bother to upcall (per Neil B.'s
> suggestion)
>
> - fix lifecycle of rpc_pipe data. Previously it would have leaked
> after umount. With this set, it's created and destroyed along with
> the netns, and just attached to the pipe inode on mount/unmount
> of rpc_pipefs.
>
> - patch has been added to skip attempting setclientid with krb5i
> if gssd isn't running. This avoids the "AUTH_GSS upcall timed out"
> message when gssd isn't running and you mount with sec=sys. It also
> shortens the delay when gssd isn't up.
>
> The original cover letter from the v1 posting follows. Note that this
> set does address the warnings about the AUTH_GSS upcall timing out.
>
> -------------------------[snip]-----------------------------
>
> We've gotten a lot of complaints recently about the 15s delay when
> doing a sec=sys mount without gssd running.
>
> A large part of the problem is that the kernel isn't able to reliably
> detect when rpc.gssd is running. What we currently have is a
> gssd_running flag that is initially set to 1. When an upcall times out,
> that gets set to 0, and subsequent upcalls get a much shorter timeout
> (1/4s instead of 15s). It's reset back to '1' when a pipe is reopened.
>
> The approach of using a flag like this is pretty inadequate. First, it
> doesn't eliminate the long delay on the initial upcall attempt. Also,
> if gssd spontaneously dies, then the flag will still be set to 1 until
> the next upcall attempt times out. Finally, it currently requires that
> the pipe be reopened in order to reset the flag back to true.
>
> This patchset replaces that flag with a more reliable mechanism for
> detecting when gssd is running. When rpc_pipefs is mounted, it creates a
> new "dummy" pipe that gssd will naturally find and hold open. We'll
> never send an upcall down this pipe, and writing to it always fails.
> But, since we can detect when something is holding it open, we can use
> that to determine whether gssd is running.

I think this might have been addressed before, I don't remember: does
the init system currently have a way to wait till gssd has gotten as far
as scanning pipefs before allowing mounts?

(To avoid the race where a krb5 mount fails because gssd is still in the
process of being started.)

--b.

>
> The current patch just uses this mechanism to replace the gssd_running
> flag with this new mechanism. This shortens the long delay when mounting
> without gssd running, but does not silence these warnings:
>
> RPC: AUTH_GSS upcall timed out.
> Please check user daemon is running.
>
> I'm willing to add a patch to do that, but I'm a little unclear on the
> best way to do so. Those messages are generated by the auth_gss code. We
> probably do want to print them if someone mounted with sec=krb5, but
> suppress them when mounting with sec=sys.
>
> Do we need to somehow pass down that intent to auth_gss? Another idea
> would be to call gssd_running() from the nfs mount code and use that to
> determine whether to try and use krb5 at all...
>
> Discuss!
>
> Jeff Layton (3):
> sunrpc: create a new dummy pipe for gssd to hold open
> sunrpc: replace sunrpc_net->gssd_running flag with a more reliable
> check
> nfs: check if gssd is running before attempting to use krb5i auth in
> SETCLIENTID call
>
> fs/nfs/client.c | 5 +-
> fs/nfs/internal.h | 4 +-
> fs/nfs/nfs4client.c | 8 ++-
> include/linux/nfs_xdr.h | 2 +-
> include/linux/sunrpc/auth_gss.h | 10 ++++
> include/linux/sunrpc/rpc_pipe_fs.h | 7 ++-
> net/sunrpc/auth_gss/auth_gss.c | 19 +++++--
> net/sunrpc/netns.h | 3 +-
> net/sunrpc/rpc_pipe.c | 104 ++++++++++++++++++++++++++++++++++---
> net/sunrpc/sunrpc_syms.c | 8 ++-
> 10 files changed, 147 insertions(+), 23 deletions(-)
>
> --
> 1.8.3.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2013-11-13 15:14:43

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] nfs: check if gssd is running before attempting to use krb5i auth in SETCLIENTID call


On Nov 13, 2013, at 9:48 AM, "Myklebust, Trond" <[email protected]> wrote:

>
> On Nov 13, 2013, at 9:38, Chuck Lever <[email protected]> wrote:
>
>>
>> On Nov 13, 2013, at 9:30 AM, Jeff Layton <[email protected]> wrote:
>>
>>> Currently, the client will attempt to use krb5i in the SETCLIENTID call
>>> even if rpc.gssd is running. If that fails, it'll then fall back to
>>> RPC_AUTH_UNIX. This introduced a delay when mounting if rpc.gssd isn't
>>> running, and causes warning messages to pop up in the ring buffer.
>>>
>>> Check to see if rpc.gssd is running before even attempting to use krb5i
>>> auth, and just silently skip trying to do so if it isn't.
>>>
>>> Signed-off-by: Jeff Layton <[email protected]>
>>> ---
>>>
>>> diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
>>> index b4a160a..988aebf 100644
>>> --- a/fs/nfs/nfs4client.c
>>> +++ b/fs/nfs/nfs4client.c
>>> @@ -8,6 +8,7 @@
>>> #include <linux/nfs_mount.h>
>>> #include <linux/sunrpc/addr.h>
>>> #include <linux/sunrpc/auth.h>
>>> +#include <linux/sunrpc/auth_gss.h>
>>> #include <linux/sunrpc/xprt.h>
>>> #include <linux/sunrpc/bc_xprt.h>
>>> #include "internal.h"
>>> @@ -351,7 +352,7 @@ static int nfs4_init_client_minor_version(struct nfs_client *clp)
>>> */
>>> struct nfs_client *nfs4_init_client(struct nfs_client *clp,
>>> const struct rpc_timeout *timeparms,
>>> - const char *ip_addr)
>>> + const char *ip_addr, struct net *net)
>
> Why the ?struct net? argument? clp->cl_net should already be initialized here.
>
>>> {
>>> char buf[INET6_ADDRSTRLEN + 1];
>>> struct nfs_client *old;
>>> @@ -370,7 +371,10 @@ struct nfs_client *nfs4_init_client(struct nfs_client *clp,
>>> __set_bit(NFS_CS_INFINITE_SLOTS, &clp->cl_flags);
>>> __set_bit(NFS_CS_DISCRTRY, &clp->cl_flags);
>>> __set_bit(NFS_CS_NO_RETRANS_TIMEOUT, &clp->cl_flags);
>>> - error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_GSS_KRB5I);
>>> +
>>> + error = -EINVAL;
>>> + if (gssd_running(net))
>>> + error = nfs_create_rpc_client(clp, timeparms,RPC_AUTH_GSS_KRB5I);
>>> if (error == -EINVAL)
>>> error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_UNIX);
>>
>> This feels like a layering violation. Why wouldn't you put a gssd_running check in gss_create(), for example, and have rpcauth_create() return -EINVAL?
>>
>
> It would be better to put it in the upcall.

Waiting until the upcall has its benefits. At that point, GSSD (if it is running) can report other errors, such as that there is no material to form a machine credential.

My point to Jeff is that, aside from architectural aesthetics, direct calls from RPC consumers to the GSS layer causes a module dependency problem. The right way to plumb this is to create an RPC client API that invokes gssd_running() but only if auth_rpcgss.ko is loaded.

However, I don't see why the existing RPC client APIs shouldn't just fail where appropriate if GSSD isn't available. Is there a strong need to expose gssd_running() as a separate API?

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





2013-11-13 16:11:01

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] nfs: check if gssd is running before attempting to use krb5i auth in SETCLIENTID call


On Nov 13, 2013, at 11:09 AM, Jeff Layton <[email protected]> wrote:

> On Wed, 13 Nov 2013 10:57:02 -0500
> Jeff Layton <[email protected]> wrote:
>
>> On Wed, 13 Nov 2013 15:49:13 +0000
>> "Myklebust, Trond" <[email protected]> wrote:
>>
>>>
>>> On Nov 13, 2013, at 10:35, Jeff Layton <[email protected]> wrote:
>>>
>>>> On Wed, 13 Nov 2013 10:14:30 -0500
>>>> Chuck Lever <[email protected]> wrote:
>>>>
>>>>>
>>>>> On Nov 13, 2013, at 9:48 AM, "Myklebust, Trond" <[email protected]> wrote:
>>>>>
>>>>>>
>>>>>> On Nov 13, 2013, at 9:38, Chuck Lever <[email protected]> wrote:
>>>>>>
>>>>>>>
>>>>>>> On Nov 13, 2013, at 9:30 AM, Jeff Layton <[email protected]> wrote:
>>>>>>>
>>>>>>>> Currently, the client will attempt to use krb5i in the SETCLIENTID call
>>>>>>>> even if rpc.gssd is running. If that fails, it'll then fall back to
>>>>>>>> RPC_AUTH_UNIX. This introduced a delay when mounting if rpc.gssd isn't
>>>>>>>> running, and causes warning messages to pop up in the ring buffer.
>>>>>>>>
>>>>>>>> Check to see if rpc.gssd is running before even attempting to use krb5i
>>>>>>>> auth, and just silently skip trying to do so if it isn't.
>>>>>>>>
>>>>>>>> Signed-off-by: Jeff Layton <[email protected]>
>>>>>>>> ---
>>>>>>>>
>>>>>>>> diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
>>>>>>>> index b4a160a..988aebf 100644
>>>>>>>> --- a/fs/nfs/nfs4client.c
>>>>>>>> +++ b/fs/nfs/nfs4client.c
>>>>>>>> @@ -8,6 +8,7 @@
>>>>>>>> #include <linux/nfs_mount.h>
>>>>>>>> #include <linux/sunrpc/addr.h>
>>>>>>>> #include <linux/sunrpc/auth.h>
>>>>>>>> +#include <linux/sunrpc/auth_gss.h>
>>>>>>>> #include <linux/sunrpc/xprt.h>
>>>>>>>> #include <linux/sunrpc/bc_xprt.h>
>>>>>>>> #include "internal.h"
>>>>>>>> @@ -351,7 +352,7 @@ static int nfs4_init_client_minor_version(struct nfs_client *clp)
>>>>>>>> */
>>>>>>>> struct nfs_client *nfs4_init_client(struct nfs_client *clp,
>>>>>>>> const struct rpc_timeout *timeparms,
>>>>>>>> - const char *ip_addr)
>>>>>>>> + const char *ip_addr, struct net *net)
>>>>>>
>>>>>> Why the ?struct net? argument? clp->cl_net should already be initialized here.
>>>>>>
>>>>>>>> {
>>>>>>>> char buf[INET6_ADDRSTRLEN + 1];
>>>>>>>> struct nfs_client *old;
>>>>>>>> @@ -370,7 +371,10 @@ struct nfs_client *nfs4_init_client(struct nfs_client *clp,
>>>>>>>> __set_bit(NFS_CS_INFINITE_SLOTS, &clp->cl_flags);
>>>>>>>> __set_bit(NFS_CS_DISCRTRY, &clp->cl_flags);
>>>>>>>> __set_bit(NFS_CS_NO_RETRANS_TIMEOUT, &clp->cl_flags);
>>>>>>>> - error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_GSS_KRB5I);
>>>>>>>> +
>>>>>>>> + error = -EINVAL;
>>>>>>>> + if (gssd_running(net))
>>>>>>>> + error = nfs_create_rpc_client(clp, timeparms,RPC_AUTH_GSS_KRB5I);
>>>>>>>> if (error == -EINVAL)
>>>>>>>> error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_UNIX);
>>>>>>>
>>>>>>> This feels like a layering violation. Why wouldn't you put a gssd_running check in gss_create(), for example, and have rpcauth_create() return -EINVAL?
>>>>>>>
>>>>>>
>>>>>> It would be better to put it in the upcall.
>>>>>
>>>>> Waiting until the upcall has its benefits. At that point, GSSD (if it is running) can report other errors, such as that there is no material to form a machine credential.
>>>>>
>>>>> My point to Jeff is that, aside from architectural aesthetics, direct calls from RPC consumers to the GSS layer causes a module dependency problem. The right way to plumb this is to create an RPC client API that invokes gssd_running() but only if auth_rpcgss.ko is loaded.
>>>
>>> Chuck, I?ve already told you that the auth_rpcgss dependency is a non-starter. Turning off automatic loading of rpcsec_gss modules is a _worse_ regression than the ones we already have and (as I already said) can be trivially defeated by just compiling them into the kernel.
>>>
>>> We _want_ the kernel to be able to autoload modules so that we can add new security flavours etc without having to recompile.
>>>
>
> Right, and just because auth_gss isn't currently plugged in, doesn't
> mean that it's not able to be plugged in. If this is the first mount
> attempt then it's likely that auth_gss isn't loaded yet, even if
> rpc.gssd is running.
>
>>>>> However, I don't see why the existing RPC client APIs shouldn't just fail where appropriate if GSSD isn't available. Is there a strong need to expose gssd_running() as a separate API?
>>>>>
>>>>
>>>> One of the complaints about this whole "use krb5i by default" change is
>>>> that we now get the warnings in the ring buffer when gssd isn't
>>>> running. That's a good thing if krb5 was explicitly requested, but is
>>>> useless and confusing if the user just wants to use AUTH_SYS.
>>>>
>>>> If we wait until gss_create, it's too late to know what the "intent"
>>>> was. We'll either fire the warning inappropriately like we do now, or
>>>> miss it altogether when we actually should have printed it.
>>>
>>> What if the user intended to use krb5i, but the daemon failed to start? This whole ?kernel second guessing what the admin _actually_ wanted to do? thing is a red herring. Let?s just fix the real problem and then leave it at that.
>>>
>>
>> In that case, they will get a failure and warning when they go to the
>> next stage of the mount (I forget which RPC it is). With this change,
>> krb5/krb5i mounts will still fail just like they do today when gssd
>> isn't running. You just get a single warning in the ring buffer about
>> it instead of two.
>>
>
> So to clarify...today we do this when gssd isn't running and we try an
> AUTH_GSS mount:
>
> - attempt SETCLIENTID with krb5i
> - when that fails, log a warning to ring buffer
> - attempt SETCLIENTID with AUTH_SYS
> - attempt rest of mount with krb5i

Hold it. This step should not be happening. Lease management should try krb5i by default, but why is the rest of the mount attempted with krb5i?

> - log another warning to ring buffer when it fails
>
> ...with the first two patches, that doesn't really change. With the
> third patch in place, we just skip the first two steps if gssd isn't
> running. You'll still get a single warning in the ring buffer (which is
> as expected).
>
> Trond, is that acceptable or do you want me to just drop the 3rd patch?

NAK from me on the third patch as it stands. Find some other way to invoke gssd_running() if you really need to do that.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





2013-11-13 17:05:13

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] nfs: check if gssd is running before attempting to use krb5i auth in SETCLIENTID call


On Nov 13, 2013, at 11:20 AM, Jeff Layton <[email protected]> wrote:

> On Wed, 13 Nov 2013 11:10:52 -0500
> Chuck Lever <[email protected]> wrote:
>
>>
>> On Nov 13, 2013, at 11:09 AM, Jeff Layton <[email protected]> wrote:
>>>
>>> So to clarify...today we do this when gssd isn't running and we try an
>>> AUTH_GSS mount:
>>>
>>> - attempt SETCLIENTID with krb5i
>>> - when that fails, log a warning to ring buffer
>>> - attempt SETCLIENTID with AUTH_SYS
>>> - attempt rest of mount with krb5i
>>
>> Hold it. This step should not be happening. Lease management should try krb5i by default, but why is the rest of the mount attempted with krb5i?
>>
>
> Sorry, I should have been more clear...the rest of the mount is
> attempted with krb5i because sec=krb5i was specified on the command
> line.
>
> IOW, this patch just shortcuts attempting to do the lease
> establishment with krb5i when we know that that will fail. The main
> benefit being that we don't end up logging a warning about AUTH_GSS not
> running in that case.
>
> The warning will be logged if/when a later call attempts to use GSSAPI.

Just a thought: The usual way we have of dealing with problems like this is WARN_ONCE() (or a user space equivalent).


--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





2013-11-13 14:36:12

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v2 3/3] nfs: check if gssd is running before attempting to use krb5i auth in SETCLIENTID call

Currently, the client will attempt to use krb5i in the SETCLIENTID call
even if rpc.gssd is running. If that fails, it'll then fall back to
RPC_AUTH_UNIX. This introduced a delay when mounting if rpc.gssd isn't
running, and causes warning messages to pop up in the ring buffer.

Check to see if rpc.gssd is running before even attempting to use krb5i
auth, and just silently skip trying to do so if it isn't.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfs/client.c | 5 +++--
fs/nfs/internal.h | 4 ++--
fs/nfs/nfs4client.c | 8 ++++++--
include/linux/nfs_xdr.h | 2 +-
include/linux/sunrpc/auth_gss.h | 10 ++++++++++
net/sunrpc/auth_gss/auth_gss.c | 3 ++-
6 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index 1d09289..fbee354 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -501,7 +501,8 @@ nfs_get_client(const struct nfs_client_initdata *cl_init,
&nn->nfs_client_list);
spin_unlock(&nn->nfs_client_lock);
new->cl_flags = cl_init->init_flags;
- return rpc_ops->init_client(new, timeparms, ip_addr);
+ return rpc_ops->init_client(new, timeparms, ip_addr,
+ cl_init->net);
}

spin_unlock(&nn->nfs_client_lock);
@@ -700,7 +701,7 @@ EXPORT_SYMBOL_GPL(nfs_init_server_rpcclient);
*/
struct nfs_client *nfs_init_client(struct nfs_client *clp,
const struct rpc_timeout *timeparms,
- const char *ip_addr)
+ const char *ip_addr, struct net *net)
{
int error;

diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index bca6a3e..69d3d1c 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -273,7 +273,7 @@ extern struct rpc_procinfo nfs4_procedures[];
void nfs_close_context(struct nfs_open_context *ctx, int is_sync);
extern struct nfs_client *nfs_init_client(struct nfs_client *clp,
const struct rpc_timeout *timeparms,
- const char *ip_addr);
+ const char *ip_addr, struct net *net);

/* dir.c */
extern unsigned long nfs_access_cache_count(struct shrinker *shrink,
@@ -462,7 +462,7 @@ extern ssize_t nfs_dreq_bytes_left(struct nfs_direct_req *dreq);
extern void __nfs4_read_done_cb(struct nfs_read_data *);
extern struct nfs_client *nfs4_init_client(struct nfs_client *clp,
const struct rpc_timeout *timeparms,
- const char *ip_addr);
+ const char *ip_addr, struct net *net);
extern int nfs40_walk_client_list(struct nfs_client *clp,
struct nfs_client **result,
struct rpc_cred *cred);
diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
index b4a160a..988aebf 100644
--- a/fs/nfs/nfs4client.c
+++ b/fs/nfs/nfs4client.c
@@ -8,6 +8,7 @@
#include <linux/nfs_mount.h>
#include <linux/sunrpc/addr.h>
#include <linux/sunrpc/auth.h>
+#include <linux/sunrpc/auth_gss.h>
#include <linux/sunrpc/xprt.h>
#include <linux/sunrpc/bc_xprt.h>
#include "internal.h"
@@ -351,7 +352,7 @@ static int nfs4_init_client_minor_version(struct nfs_client *clp)
*/
struct nfs_client *nfs4_init_client(struct nfs_client *clp,
const struct rpc_timeout *timeparms,
- const char *ip_addr)
+ const char *ip_addr, struct net *net)
{
char buf[INET6_ADDRSTRLEN + 1];
struct nfs_client *old;
@@ -370,7 +371,10 @@ struct nfs_client *nfs4_init_client(struct nfs_client *clp,
__set_bit(NFS_CS_INFINITE_SLOTS, &clp->cl_flags);
__set_bit(NFS_CS_DISCRTRY, &clp->cl_flags);
__set_bit(NFS_CS_NO_RETRANS_TIMEOUT, &clp->cl_flags);
- error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_GSS_KRB5I);
+
+ error = -EINVAL;
+ if (gssd_running(net))
+ error = nfs_create_rpc_client(clp, timeparms,RPC_AUTH_GSS_KRB5I);
if (error == -EINVAL)
error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_UNIX);
if (error < 0)
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index 3ccfcec..e75b2cc 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -1486,7 +1486,7 @@ struct nfs_rpc_ops {
struct nfs_client *(*alloc_client) (const struct nfs_client_initdata *);
struct nfs_client *
(*init_client) (struct nfs_client *, const struct rpc_timeout *,
- const char *);
+ const char *, struct net *);
void (*free_client) (struct nfs_client *);
struct nfs_server *(*create_server)(struct nfs_mount_info *, struct nfs_subversion *);
struct nfs_server *(*clone_server)(struct nfs_server *, struct nfs_fh *,
diff --git a/include/linux/sunrpc/auth_gss.h b/include/linux/sunrpc/auth_gss.h
index f1cfd4c..cecf684 100644
--- a/include/linux/sunrpc/auth_gss.h
+++ b/include/linux/sunrpc/auth_gss.h
@@ -86,6 +86,16 @@ struct gss_cred {
unsigned long gc_upcall_timestamp;
};

+#if IS_ENABLED(CONFIG_SUNRPC_GSS)
+extern bool gssd_running(struct net *net);
+#else /* !CONFIG_SUNRPC_GSS */
+static inline bool
+gssd_running(struct net *net)
+{
+ return false;
+}
+#endif /* CONFIG_SUNRPC_GSS */
+
#endif /* __KERNEL__ */
#endif /* _LINUX_SUNRPC_AUTH_GSS_H */

diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
index 399390e..0cc2174 100644
--- a/net/sunrpc/auth_gss/auth_gss.c
+++ b/net/sunrpc/auth_gss/auth_gss.c
@@ -592,13 +592,14 @@ out:
return err;
}

-static bool
+bool
gssd_running(struct net *net)
{
struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);

return rpc_pipe_is_open(sn->gssd_dummy);
}
+EXPORT_SYMBOL_GPL(gssd_running);

static inline int
gss_create_upcall(struct gss_auth *gss_auth, struct gss_cred *gss_cred)
--
1.8.3.1


2013-11-13 16:13:00

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] nfs: check if gssd is running before attempting to use krb5i auth in SETCLIENTID call


On Nov 13, 2013, at 10:49 AM, "Myklebust, Trond" <[email protected]> wrote:

>
> On Nov 13, 2013, at 10:35, Jeff Layton <[email protected]> wrote:
>
>> On Wed, 13 Nov 2013 10:14:30 -0500
>> Chuck Lever <[email protected]> wrote:
>>
>>>
>>> On Nov 13, 2013, at 9:48 AM, "Myklebust, Trond" <[email protected]> wrote:
>>>
>>>>
>>>> On Nov 13, 2013, at 9:38, Chuck Lever <[email protected]> wrote:
>>>>
>>>>>
>>>>> On Nov 13, 2013, at 9:30 AM, Jeff Layton <[email protected]> wrote:
>>>>>
>>>>>> Currently, the client will attempt to use krb5i in the SETCLIENTID call
>>>>>> even if rpc.gssd is running. If that fails, it'll then fall back to
>>>>>> RPC_AUTH_UNIX. This introduced a delay when mounting if rpc.gssd isn't
>>>>>> running, and causes warning messages to pop up in the ring buffer.
>>>>>>
>>>>>> Check to see if rpc.gssd is running before even attempting to use krb5i
>>>>>> auth, and just silently skip trying to do so if it isn't.
>>>>>>
>>>>>> Signed-off-by: Jeff Layton <[email protected]>
>>>>>> ---
>>>>>>
>>>>>> diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
>>>>>> index b4a160a..988aebf 100644
>>>>>> --- a/fs/nfs/nfs4client.c
>>>>>> +++ b/fs/nfs/nfs4client.c
>>>>>> @@ -8,6 +8,7 @@
>>>>>> #include <linux/nfs_mount.h>
>>>>>> #include <linux/sunrpc/addr.h>
>>>>>> #include <linux/sunrpc/auth.h>
>>>>>> +#include <linux/sunrpc/auth_gss.h>
>>>>>> #include <linux/sunrpc/xprt.h>
>>>>>> #include <linux/sunrpc/bc_xprt.h>
>>>>>> #include "internal.h"
>>>>>> @@ -351,7 +352,7 @@ static int nfs4_init_client_minor_version(struct nfs_client *clp)
>>>>>> */
>>>>>> struct nfs_client *nfs4_init_client(struct nfs_client *clp,
>>>>>> const struct rpc_timeout *timeparms,
>>>>>> - const char *ip_addr)
>>>>>> + const char *ip_addr, struct net *net)
>>>>
>>>> Why the ?struct net? argument? clp->cl_net should already be initialized here.
>>>>
>>>>>> {
>>>>>> char buf[INET6_ADDRSTRLEN + 1];
>>>>>> struct nfs_client *old;
>>>>>> @@ -370,7 +371,10 @@ struct nfs_client *nfs4_init_client(struct nfs_client *clp,
>>>>>> __set_bit(NFS_CS_INFINITE_SLOTS, &clp->cl_flags);
>>>>>> __set_bit(NFS_CS_DISCRTRY, &clp->cl_flags);
>>>>>> __set_bit(NFS_CS_NO_RETRANS_TIMEOUT, &clp->cl_flags);
>>>>>> - error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_GSS_KRB5I);
>>>>>> +
>>>>>> + error = -EINVAL;
>>>>>> + if (gssd_running(net))
>>>>>> + error = nfs_create_rpc_client(clp, timeparms,RPC_AUTH_GSS_KRB5I);
>>>>>> if (error == -EINVAL)
>>>>>> error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_UNIX);
>>>>>
>>>>> This feels like a layering violation. Why wouldn't you put a gssd_running check in gss_create(), for example, and have rpcauth_create() return -EINVAL?
>>>>>
>>>>
>>>> It would be better to put it in the upcall.
>>>
>>> Waiting until the upcall has its benefits. At that point, GSSD (if it is running) can report other errors, such as that there is no material to form a machine credential.
>>>
>>> My point to Jeff is that, aside from architectural aesthetics, direct calls from RPC consumers to the GSS layer causes a module dependency problem. The right way to plumb this is to create an RPC client API that invokes gssd_running() but only if auth_rpcgss.ko is loaded.
>
> Chuck, I?ve already told you that the auth_rpcgss dependency is a non-starter.

Maybe you thought that in your brain, but look at the e-mail archive: You actually haven't explicitly objected yet. You simply observed that this mechanism doesn't work for built-in scenarios, and I agree with that observation. It isn't supposed to work for built-in.

But I don't think you understand my full proposal or my objection to Jeff's third patch.

> Turning off automatic loading of rpcsec_gss modules is a _worse_ regression than the ones we already have and (as I already said) can be trivially defeated by just compiling them into the kernel.

Automatic loading of auth_rpcgss never worked until 71afa85e (March 16). It simply isn't possible that we've depended on automatic loading of auth_rpcgss.ko in the past, because it has never worked until recently.

> We _want_ the kernel to be able to autoload modules so that we can add new security flavours etc without having to recompile.

I'm not proposing to turn off automatic loading of the GSS mechanisms. If auth_rpcgss.ko is loaded, the mechanisms get loaded automatically.

>
>>> However, I don't see why the existing RPC client APIs shouldn't just fail where appropriate if GSSD isn't available. Is there a strong need to expose gssd_running() as a separate API?
>>>
>>
>> One of the complaints about this whole "use krb5i by default" change is
>> that we now get the warnings in the ring buffer when gssd isn't
>> running. That's a good thing if krb5 was explicitly requested, but is
>> useless and confusing if the user just wants to use AUTH_SYS.
>>
>> If we wait until gss_create, it's too late to know what the "intent"
>> was. We'll either fire the warning inappropriately like we do now, or
>> miss it altogether when we actually should have printed it.
>
> What if the user intended to use krb5i, but the daemon failed to start?
> This whole ?kernel second guessing what the admin _actually_ wanted to do? thing is a red herring. Let?s just fix the real problem and then leave it at that.

I agree that the kernel shouldn't guess. Autoloading auth_rpcgss is guessing what the admin wanted, IMO.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





2013-11-13 14:31:02

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v2 1/3] sunrpc: create a new dummy pipe for gssd to hold open

rpc.gssd will naturally hold open any pipe named */clnt*/gssd that shows
up under rpc_pipefs. That behavior gives us a reliable mechanism to tell
whether it's actually running or not.

Create a new toplevel "dummy" directory in rpc_pipefs when it's mounted.
Under that directory create another directory called "clntXX", and then
within that a pipe called "gssd".

We'll never send an upcall along that pipe, and any downcall written to
it will just return -EINVAL.

Signed-off-by: Jeff Layton <[email protected]>
---
include/linux/sunrpc/rpc_pipe_fs.h | 3 +-
net/sunrpc/netns.h | 1 +
net/sunrpc/rpc_pipe.c | 93 ++++++++++++++++++++++++++++++++++++--
net/sunrpc/sunrpc_syms.c | 8 +++-
4 files changed, 100 insertions(+), 5 deletions(-)

diff --git a/include/linux/sunrpc/rpc_pipe_fs.h b/include/linux/sunrpc/rpc_pipe_fs.h
index a353e03..85f1342 100644
--- a/include/linux/sunrpc/rpc_pipe_fs.h
+++ b/include/linux/sunrpc/rpc_pipe_fs.h
@@ -84,7 +84,8 @@ enum {

extern struct dentry *rpc_d_lookup_sb(const struct super_block *sb,
const unsigned char *dir_name);
-extern void rpc_pipefs_init_net(struct net *net);
+extern int rpc_pipefs_init_net(struct net *net);
+extern void rpc_pipefs_exit_net(struct net *net);
extern struct super_block *rpc_get_sb_net(const struct net *net);
extern void rpc_put_sb_net(const struct net *net);

diff --git a/net/sunrpc/netns.h b/net/sunrpc/netns.h
index 779742c..8a8e841 100644
--- a/net/sunrpc/netns.h
+++ b/net/sunrpc/netns.h
@@ -14,6 +14,7 @@ struct sunrpc_net {
struct cache_detail *rsi_cache;

struct super_block *pipefs_sb;
+ struct rpc_pipe *gssd_dummy;
struct mutex pipefs_sb_lock;

struct list_head all_clients;
diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index d0d14a0..34efdbf 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -38,7 +38,7 @@
#define NET_NAME(net) ((net == &init_net) ? " (init_net)" : "")

static struct file_system_type rpc_pipe_fs_type;
-
+static const struct rpc_pipe_ops gssd_dummy_pipe_ops;

static struct kmem_cache *rpc_inode_cachep __read_mostly;

@@ -1168,6 +1168,7 @@ enum {
RPCAUTH_nfsd4_cb,
RPCAUTH_cache,
RPCAUTH_nfsd,
+ RPCAUTH_gssd,
RPCAUTH_RootEOF
};

@@ -1204,6 +1205,10 @@ static const struct rpc_filelist files[] = {
.name = "nfsd",
.mode = S_IFDIR | S_IRUGO | S_IXUGO,
},
+ [RPCAUTH_gssd] = {
+ .name = "gssd",
+ .mode = S_IFDIR | S_IRUGO | S_IXUGO,
+ },
};

/*
@@ -1217,13 +1222,25 @@ struct dentry *rpc_d_lookup_sb(const struct super_block *sb,
}
EXPORT_SYMBOL_GPL(rpc_d_lookup_sb);

-void rpc_pipefs_init_net(struct net *net)
+int rpc_pipefs_init_net(struct net *net)
{
struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);

+ sn->gssd_dummy = rpc_mkpipe_data(&gssd_dummy_pipe_ops, 0);
+ if (IS_ERR(sn->gssd_dummy))
+ return PTR_ERR(sn->gssd_dummy);
+
mutex_init(&sn->pipefs_sb_lock);
sn->gssd_running = 1;
sn->pipe_version = -1;
+ return 0;
+}
+
+void rpc_pipefs_exit_net(struct net *net)
+{
+ struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+
+ rpc_destroy_pipe_data(sn->gssd_dummy);
}

/*
@@ -1253,11 +1270,73 @@ void rpc_put_sb_net(const struct net *net)
}
EXPORT_SYMBOL_GPL(rpc_put_sb_net);

+static const struct rpc_filelist gssd_dummy_clnt_dir[] = {
+ [0] = {
+ .name = "clntXX",
+ .mode = S_IFDIR | S_IRUGO | S_IXUGO,
+ },
+};
+
+static ssize_t
+dummy_downcall(struct file *filp, const char __user *src, size_t len)
+{
+ return -EINVAL;
+}
+
+static const struct rpc_pipe_ops gssd_dummy_pipe_ops = {
+ .upcall = rpc_pipe_generic_upcall,
+ .downcall = dummy_downcall,
+};
+
+/**
+ * rpc_gssd_dummy_populate - create a dummy gssd pipe
+ * @root: root of the rpc_pipefs filesystem
+ * @pipe_data: pipe data created when netns is initialized
+ *
+ * Create a dummy set of directories and a pipe that gssd can hold open to
+ * indicate that it is up and running.
+ */
+static struct dentry *
+rpc_gssd_dummy_populate(struct dentry *root, struct rpc_pipe *pipe_data)
+{
+ int ret = 0;
+ struct dentry *gssd_dentry;
+ struct dentry *clnt_dentry = NULL;
+ struct dentry *pipe_dentry = NULL;
+ struct qstr q = QSTR_INIT(files[RPCAUTH_gssd].name,
+ strlen(files[RPCAUTH_gssd].name));
+
+ /* We should never get this far if "gssd" doesn't exist */
+ gssd_dentry = d_hash_and_lookup(root, &q);
+ if (!gssd_dentry)
+ return ERR_PTR(-ENOENT);
+
+ ret = rpc_populate(gssd_dentry, gssd_dummy_clnt_dir, 0, 1, NULL);
+ if (ret) {
+ pipe_dentry = ERR_PTR(ret);
+ goto out;
+ }
+
+ q.name = gssd_dummy_clnt_dir[0].name;
+ q.len = strlen(gssd_dummy_clnt_dir[0].name);
+ clnt_dentry = d_hash_and_lookup(gssd_dentry, &q);
+ if (!clnt_dentry) {
+ pipe_dentry = ERR_PTR(-ENOENT);
+ goto out;
+ }
+
+ pipe_dentry = rpc_mkpipe_dentry(clnt_dentry, "gssd", NULL, pipe_data);
+out:
+ dput(clnt_dentry);
+ dput(gssd_dentry);
+ return pipe_dentry;
+}
+
static int
rpc_fill_super(struct super_block *sb, void *data, int silent)
{
struct inode *inode;
- struct dentry *root;
+ struct dentry *root, *gssd_dentry;
struct net *net = data;
struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
int err;
@@ -1275,6 +1354,13 @@ rpc_fill_super(struct super_block *sb, void *data, int silent)
return -ENOMEM;
if (rpc_populate(root, files, RPCAUTH_lockd, RPCAUTH_RootEOF, NULL))
return -ENOMEM;
+
+ gssd_dentry = rpc_gssd_dummy_populate(root, sn->gssd_dummy);
+ if (IS_ERR(gssd_dentry)) {
+ __rpc_depopulate(root, files, RPCAUTH_lockd, RPCAUTH_RootEOF);
+ return PTR_ERR(gssd_dentry);
+ }
+
dprintk("RPC: sending pipefs MOUNT notification for net %p%s\n",
net, NET_NAME(net));
mutex_lock(&sn->pipefs_sb_lock);
@@ -1289,6 +1375,7 @@ rpc_fill_super(struct super_block *sb, void *data, int silent)
return 0;

err_depopulate:
+ dput(gssd_dentry);
blocking_notifier_call_chain(&rpc_pipefs_notifier_list,
RPC_PIPEFS_UMOUNT,
sb);
diff --git a/net/sunrpc/sunrpc_syms.c b/net/sunrpc/sunrpc_syms.c
index 3d6498a..cd30120 100644
--- a/net/sunrpc/sunrpc_syms.c
+++ b/net/sunrpc/sunrpc_syms.c
@@ -44,12 +44,17 @@ static __net_init int sunrpc_init_net(struct net *net)
if (err)
goto err_unixgid;

- rpc_pipefs_init_net(net);
+ err = rpc_pipefs_init_net(net);
+ if (err)
+ goto err_pipefs;
+
INIT_LIST_HEAD(&sn->all_clients);
spin_lock_init(&sn->rpc_client_lock);
spin_lock_init(&sn->rpcb_clnt_lock);
return 0;

+err_pipefs:
+ unix_gid_cache_destroy(net);
err_unixgid:
ip_map_cache_destroy(net);
err_ipmap:
@@ -60,6 +65,7 @@ err_proc:

static __net_exit void sunrpc_exit_net(struct net *net)
{
+ rpc_pipefs_exit_net(net);
unix_gid_cache_destroy(net);
ip_map_cache_destroy(net);
rpc_proc_exit(net);
--
1.8.3.1


2013-11-13 14:48:18

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] nfs: check if gssd is running before attempting to use krb5i auth in SETCLIENTID call


On Nov 13, 2013, at 9:38, Chuck Lever <[email protected]> wrote:

>
> On Nov 13, 2013, at 9:30 AM, Jeff Layton <[email protected]> wrote:
>
>> Currently, the client will attempt to use krb5i in the SETCLIENTID call
>> even if rpc.gssd is running. If that fails, it'll then fall back to
>> RPC_AUTH_UNIX. This introduced a delay when mounting if rpc.gssd isn't
>> running, and causes warning messages to pop up in the ring buffer.
>>
>> Check to see if rpc.gssd is running before even attempting to use krb5i
>> auth, and just silently skip trying to do so if it isn't.
>>
>> Signed-off-by: Jeff Layton <[email protected]>
>> ---
>>
>> diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
>> index b4a160a..988aebf 100644
>> --- a/fs/nfs/nfs4client.c
>> +++ b/fs/nfs/nfs4client.c
>> @@ -8,6 +8,7 @@
>> #include <linux/nfs_mount.h>
>> #include <linux/sunrpc/addr.h>
>> #include <linux/sunrpc/auth.h>
>> +#include <linux/sunrpc/auth_gss.h>
>> #include <linux/sunrpc/xprt.h>
>> #include <linux/sunrpc/bc_xprt.h>
>> #include "internal.h"
>> @@ -351,7 +352,7 @@ static int nfs4_init_client_minor_version(struct nfs_client *clp)
>> */
>> struct nfs_client *nfs4_init_client(struct nfs_client *clp,
>> const struct rpc_timeout *timeparms,
>> - const char *ip_addr)
>> + const char *ip_addr, struct net *net)

Why the ?struct net? argument? clp->cl_net should already be initialized here.

>> {
>> char buf[INET6_ADDRSTRLEN + 1];
>> struct nfs_client *old;
>> @@ -370,7 +371,10 @@ struct nfs_client *nfs4_init_client(struct nfs_client *clp,
>> __set_bit(NFS_CS_INFINITE_SLOTS, &clp->cl_flags);
>> __set_bit(NFS_CS_DISCRTRY, &clp->cl_flags);
>> __set_bit(NFS_CS_NO_RETRANS_TIMEOUT, &clp->cl_flags);
>> - error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_GSS_KRB5I);
>> +
>> + error = -EINVAL;
>> + if (gssd_running(net))
>> + error = nfs_create_rpc_client(clp, timeparms,RPC_AUTH_GSS_KRB5I);
>> if (error == -EINVAL)
>> error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_UNIX);
>
> This feels like a layering violation. Why wouldn't you put a gssd_running check in gss_create(), for example, and have rpcauth_create() return -EINVAL?
>

It would be better to put it in the upcall.

Cheers
Trond

2013-11-13 16:06:50

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] nfs: check if gssd is running before attempting to use krb5i auth in SETCLIENTID call

On Wed, 13 Nov 2013 10:57:02 -0500
Jeff Layton <[email protected]> wrote:

> On Wed, 13 Nov 2013 15:49:13 +0000
> "Myklebust, Trond" <[email protected]> wrote:
>
> >
> > On Nov 13, 2013, at 10:35, Jeff Layton <[email protected]> wrote:
> >
> > > On Wed, 13 Nov 2013 10:14:30 -0500
> > > Chuck Lever <[email protected]> wrote:
> > >
> > >>
> > >> On Nov 13, 2013, at 9:48 AM, "Myklebust, Trond" <[email protected]> wrote:
> > >>
> > >>>
> > >>> On Nov 13, 2013, at 9:38, Chuck Lever <[email protected]> wrote:
> > >>>
> > >>>>
> > >>>> On Nov 13, 2013, at 9:30 AM, Jeff Layton <[email protected]> wrote:
> > >>>>
> > >>>>> Currently, the client will attempt to use krb5i in the SETCLIENTID call
> > >>>>> even if rpc.gssd is running. If that fails, it'll then fall back to
> > >>>>> RPC_AUTH_UNIX. This introduced a delay when mounting if rpc.gssd isn't
> > >>>>> running, and causes warning messages to pop up in the ring buffer.
> > >>>>>
> > >>>>> Check to see if rpc.gssd is running before even attempting to use krb5i
> > >>>>> auth, and just silently skip trying to do so if it isn't.
> > >>>>>
> > >>>>> Signed-off-by: Jeff Layton <[email protected]>
> > >>>>> ---
> > >>>>>
> > >>>>> diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
> > >>>>> index b4a160a..988aebf 100644
> > >>>>> --- a/fs/nfs/nfs4client.c
> > >>>>> +++ b/fs/nfs/nfs4client.c
> > >>>>> @@ -8,6 +8,7 @@
> > >>>>> #include <linux/nfs_mount.h>
> > >>>>> #include <linux/sunrpc/addr.h>
> > >>>>> #include <linux/sunrpc/auth.h>
> > >>>>> +#include <linux/sunrpc/auth_gss.h>
> > >>>>> #include <linux/sunrpc/xprt.h>
> > >>>>> #include <linux/sunrpc/bc_xprt.h>
> > >>>>> #include "internal.h"
> > >>>>> @@ -351,7 +352,7 @@ static int nfs4_init_client_minor_version(struct nfs_client *clp)
> > >>>>> */
> > >>>>> struct nfs_client *nfs4_init_client(struct nfs_client *clp,
> > >>>>> const struct rpc_timeout *timeparms,
> > >>>>> - const char *ip_addr)
> > >>>>> + const char *ip_addr, struct net *net)
> > >>>
> > >>> Why the ‘struct net’ argument? clp->cl_net should already be initialized here.
> > >>>
> > >>>>> {
> > >>>>> char buf[INET6_ADDRSTRLEN + 1];
> > >>>>> struct nfs_client *old;
> > >>>>> @@ -370,7 +371,10 @@ struct nfs_client *nfs4_init_client(struct nfs_client *clp,
> > >>>>> __set_bit(NFS_CS_INFINITE_SLOTS, &clp->cl_flags);
> > >>>>> __set_bit(NFS_CS_DISCRTRY, &clp->cl_flags);
> > >>>>> __set_bit(NFS_CS_NO_RETRANS_TIMEOUT, &clp->cl_flags);
> > >>>>> - error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_GSS_KRB5I);
> > >>>>> +
> > >>>>> + error = -EINVAL;
> > >>>>> + if (gssd_running(net))
> > >>>>> + error = nfs_create_rpc_client(clp, timeparms,RPC_AUTH_GSS_KRB5I);
> > >>>>> if (error == -EINVAL)
> > >>>>> error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_UNIX);
> > >>>>
> > >>>> This feels like a layering violation. Why wouldn't you put a gssd_running check in gss_create(), for example, and have rpcauth_create() return -EINVAL?
> > >>>>
> > >>>
> > >>> It would be better to put it in the upcall.
> > >>
> > >> Waiting until the upcall has its benefits. At that point, GSSD (if it is running) can report other errors, such as that there is no material to form a machine credential.
> > >>
> > >> My point to Jeff is that, aside from architectural aesthetics, direct calls from RPC consumers to the GSS layer causes a module dependency problem. The right way to plumb this is to create an RPC client API that invokes gssd_running() but only if auth_rpcgss.ko is loaded.
> >
> > Chuck, I’ve already told you that the auth_rpcgss dependency is a non-starter. Turning off automatic loading of rpcsec_gss modules is a _worse_ regression than the ones we already have and (as I already said) can be trivially defeated by just compiling them into the kernel.
> >
> > We _want_ the kernel to be able to autoload modules so that we can add new security flavours etc without having to recompile.
> >

Right, and just because auth_gss isn't currently plugged in, doesn't
mean that it's not able to be plugged in. If this is the first mount
attempt then it's likely that auth_gss isn't loaded yet, even if
rpc.gssd is running.

> > >> However, I don't see why the existing RPC client APIs shouldn't just fail where appropriate if GSSD isn't available. Is there a strong need to expose gssd_running() as a separate API?
> > >>
> > >
> > > One of the complaints about this whole "use krb5i by default" change is
> > > that we now get the warnings in the ring buffer when gssd isn't
> > > running. That's a good thing if krb5 was explicitly requested, but is
> > > useless and confusing if the user just wants to use AUTH_SYS.
> > >
> > > If we wait until gss_create, it's too late to know what the "intent"
> > > was. We'll either fire the warning inappropriately like we do now, or
> > > miss it altogether when we actually should have printed it.
> >
> > What if the user intended to use krb5i, but the daemon failed to start? This whole “kernel second guessing what the admin _actually_ wanted to do” thing is a red herring. Let’s just fix the real problem and then leave it at that.
> >
>
> In that case, they will get a failure and warning when they go to the
> next stage of the mount (I forget which RPC it is). With this change,
> krb5/krb5i mounts will still fail just like they do today when gssd
> isn't running. You just get a single warning in the ring buffer about
> it instead of two.
>

So to clarify...today we do this when gssd isn't running and we try an
AUTH_GSS mount:

- attempt SETCLIENTID with krb5i
- when that fails, log a warning to ring buffer
- attempt SETCLIENTID with AUTH_SYS
- attempt rest of mount with krb5i
- log another warning to ring buffer when it fails

...with the first two patches, that doesn't really change. With the
third patch in place, we just skip the first two steps if gssd isn't
running. You'll still get a single warning in the ring buffer (which is
as expected).

Trond, is that acceptable or do you want me to just drop the 3rd patch?

--
Jeff Layton <[email protected]>

2013-11-13 15:32:55

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] nfs: check if gssd is running before attempting to use krb5i auth in SETCLIENTID call

On Wed, 13 Nov 2013 10:14:30 -0500
Chuck Lever <[email protected]> wrote:

>
> On Nov 13, 2013, at 9:48 AM, "Myklebust, Trond" <[email protected]> wrote:
>
> >
> > On Nov 13, 2013, at 9:38, Chuck Lever <[email protected]> wrote:
> >
> >>
> >> On Nov 13, 2013, at 9:30 AM, Jeff Layton <[email protected]> wrote:
> >>
> >>> Currently, the client will attempt to use krb5i in the SETCLIENTID call
> >>> even if rpc.gssd is running. If that fails, it'll then fall back to
> >>> RPC_AUTH_UNIX. This introduced a delay when mounting if rpc.gssd isn't
> >>> running, and causes warning messages to pop up in the ring buffer.
> >>>
> >>> Check to see if rpc.gssd is running before even attempting to use krb5i
> >>> auth, and just silently skip trying to do so if it isn't.
> >>>
> >>> Signed-off-by: Jeff Layton <[email protected]>
> >>> ---
> >>>
> >>> diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
> >>> index b4a160a..988aebf 100644
> >>> --- a/fs/nfs/nfs4client.c
> >>> +++ b/fs/nfs/nfs4client.c
> >>> @@ -8,6 +8,7 @@
> >>> #include <linux/nfs_mount.h>
> >>> #include <linux/sunrpc/addr.h>
> >>> #include <linux/sunrpc/auth.h>
> >>> +#include <linux/sunrpc/auth_gss.h>
> >>> #include <linux/sunrpc/xprt.h>
> >>> #include <linux/sunrpc/bc_xprt.h>
> >>> #include "internal.h"
> >>> @@ -351,7 +352,7 @@ static int nfs4_init_client_minor_version(struct nfs_client *clp)
> >>> */
> >>> struct nfs_client *nfs4_init_client(struct nfs_client *clp,
> >>> const struct rpc_timeout *timeparms,
> >>> - const char *ip_addr)
> >>> + const char *ip_addr, struct net *net)
> >
> > Why the ‘struct net’ argument? clp->cl_net should already be initialized here.
> >
> >>> {
> >>> char buf[INET6_ADDRSTRLEN + 1];
> >>> struct nfs_client *old;
> >>> @@ -370,7 +371,10 @@ struct nfs_client *nfs4_init_client(struct nfs_client *clp,
> >>> __set_bit(NFS_CS_INFINITE_SLOTS, &clp->cl_flags);
> >>> __set_bit(NFS_CS_DISCRTRY, &clp->cl_flags);
> >>> __set_bit(NFS_CS_NO_RETRANS_TIMEOUT, &clp->cl_flags);
> >>> - error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_GSS_KRB5I);
> >>> +
> >>> + error = -EINVAL;
> >>> + if (gssd_running(net))
> >>> + error = nfs_create_rpc_client(clp, timeparms,RPC_AUTH_GSS_KRB5I);
> >>> if (error == -EINVAL)
> >>> error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_UNIX);
> >>
> >> This feels like a layering violation. Why wouldn't you put a gssd_running check in gss_create(), for example, and have rpcauth_create() return -EINVAL?
> >>
> >
> > It would be better to put it in the upcall.
>
> Waiting until the upcall has its benefits. At that point, GSSD (if it is running) can report other errors, such as that there is no material to form a machine credential.
>
> My point to Jeff is that, aside from architectural aesthetics, direct calls from RPC consumers to the GSS layer causes a module dependency problem. The right way to plumb this is to create an RPC client API that invokes gssd_running() but only if auth_rpcgss.ko is loaded.
>
> However, I don't see why the existing RPC client APIs shouldn't just fail where appropriate if GSSD isn't available. Is there a strong need to expose gssd_running() as a separate API?
>

One of the complaints about this whole "use krb5i by default" change is
that we now get the warnings in the ring buffer when gssd isn't
running. That's a good thing if krb5 was explicitly requested, but is
useless and confusing if the user just wants to use AUTH_SYS.

If we wait until gss_create, it's too late to know what the "intent"
was. We'll either fire the warning inappropriately like we do now, or
miss it altogether when we actually should have printed it.

So, that was the main reason for the layering violation here. I do
however get your point on the module dependency, but IIUC...don't we get
that anyway? Now that you try using krb5i by default for SETCLIENTID,
don't we end up loading auth_gss.ko anyway on every nfsv4 mount
regardless?

--
Jeff Layton <[email protected]>

2013-11-14 20:35:44

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] sunrpc/nfs: more reliable detection of running gssd

On Thu, 14 Nov 2013 14:26:01 -0500
"J. Bruce Fields" <[email protected]> wrote:

> On Wed, Nov 13, 2013 at 09:30:50AM -0500, Jeff Layton wrote:
> > v2:
> > - change name of toplevel pipefs dir from "dummy" to "gssd" (per
> > Trond's suggestion)
> >
> > - when gssd isn't running, don't bother to upcall (per Neil B.'s
> > suggestion)
> >
> > - fix lifecycle of rpc_pipe data. Previously it would have leaked
> > after umount. With this set, it's created and destroyed along with
> > the netns, and just attached to the pipe inode on mount/unmount
> > of rpc_pipefs.
> >
> > - patch has been added to skip attempting setclientid with krb5i
> > if gssd isn't running. This avoids the "AUTH_GSS upcall timed out"
> > message when gssd isn't running and you mount with sec=sys. It also
> > shortens the delay when gssd isn't up.
> >
> > The original cover letter from the v1 posting follows. Note that this
> > set does address the warnings about the AUTH_GSS upcall timing out.
> >
> > -------------------------[snip]-----------------------------
> >
> > We've gotten a lot of complaints recently about the 15s delay when
> > doing a sec=sys mount without gssd running.
> >
> > A large part of the problem is that the kernel isn't able to reliably
> > detect when rpc.gssd is running. What we currently have is a
> > gssd_running flag that is initially set to 1. When an upcall times out,
> > that gets set to 0, and subsequent upcalls get a much shorter timeout
> > (1/4s instead of 15s). It's reset back to '1' when a pipe is reopened.
> >
> > The approach of using a flag like this is pretty inadequate. First, it
> > doesn't eliminate the long delay on the initial upcall attempt. Also,
> > if gssd spontaneously dies, then the flag will still be set to 1 until
> > the next upcall attempt times out. Finally, it currently requires that
> > the pipe be reopened in order to reset the flag back to true.
> >
> > This patchset replaces that flag with a more reliable mechanism for
> > detecting when gssd is running. When rpc_pipefs is mounted, it creates a
> > new "dummy" pipe that gssd will naturally find and hold open. We'll
> > never send an upcall down this pipe, and writing to it always fails.
> > But, since we can detect when something is holding it open, we can use
> > that to determine whether gssd is running.
>
> I think this might have been addressed before, I don't remember: does
> the init system currently have a way to wait till gssd has gotten as far
> as scanning pipefs before allowing mounts?
>
> (To avoid the race where a krb5 mount fails because gssd is still in the
> process of being started.)
>
> --b.
>

I don't think it does.

gssd forks (using daemon()) before it does any real work, and then the
parent exits. I imagine it's possible to hit such a race.

Personally, I've never seen it happen in practice, and I have tested
sec=krb5 mounts in /etc/fstab on recent Fedora. That may just be a
matter of timing or of enough stuff running between rpc.gssd and the
mounts being done.

It might not hurt to patch gssd so that the parent delays exiting until
the child starts up and does its initial run in the !fg case...

--
Jeff Layton <[email protected]>

2013-11-13 15:49:16

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] nfs: check if gssd is running before attempting to use krb5i auth in SETCLIENTID call


On Nov 13, 2013, at 10:35, Jeff Layton <[email protected]> wrote:

> On Wed, 13 Nov 2013 10:14:30 -0500
> Chuck Lever <[email protected]> wrote:
>
>>
>> On Nov 13, 2013, at 9:48 AM, "Myklebust, Trond" <[email protected]> wrote:
>>
>>>
>>> On Nov 13, 2013, at 9:38, Chuck Lever <[email protected]> wrote:
>>>
>>>>
>>>> On Nov 13, 2013, at 9:30 AM, Jeff Layton <[email protected]> wrote:
>>>>
>>>>> Currently, the client will attempt to use krb5i in the SETCLIENTID call
>>>>> even if rpc.gssd is running. If that fails, it'll then fall back to
>>>>> RPC_AUTH_UNIX. This introduced a delay when mounting if rpc.gssd isn't
>>>>> running, and causes warning messages to pop up in the ring buffer.
>>>>>
>>>>> Check to see if rpc.gssd is running before even attempting to use krb5i
>>>>> auth, and just silently skip trying to do so if it isn't.
>>>>>
>>>>> Signed-off-by: Jeff Layton <[email protected]>
>>>>> ---
>>>>>
>>>>> diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
>>>>> index b4a160a..988aebf 100644
>>>>> --- a/fs/nfs/nfs4client.c
>>>>> +++ b/fs/nfs/nfs4client.c
>>>>> @@ -8,6 +8,7 @@
>>>>> #include <linux/nfs_mount.h>
>>>>> #include <linux/sunrpc/addr.h>
>>>>> #include <linux/sunrpc/auth.h>
>>>>> +#include <linux/sunrpc/auth_gss.h>
>>>>> #include <linux/sunrpc/xprt.h>
>>>>> #include <linux/sunrpc/bc_xprt.h>
>>>>> #include "internal.h"
>>>>> @@ -351,7 +352,7 @@ static int nfs4_init_client_minor_version(struct nfs_client *clp)
>>>>> */
>>>>> struct nfs_client *nfs4_init_client(struct nfs_client *clp,
>>>>> const struct rpc_timeout *timeparms,
>>>>> - const char *ip_addr)
>>>>> + const char *ip_addr, struct net *net)
>>>
>>> Why the ?struct net? argument? clp->cl_net should already be initialized here.
>>>
>>>>> {
>>>>> char buf[INET6_ADDRSTRLEN + 1];
>>>>> struct nfs_client *old;
>>>>> @@ -370,7 +371,10 @@ struct nfs_client *nfs4_init_client(struct nfs_client *clp,
>>>>> __set_bit(NFS_CS_INFINITE_SLOTS, &clp->cl_flags);
>>>>> __set_bit(NFS_CS_DISCRTRY, &clp->cl_flags);
>>>>> __set_bit(NFS_CS_NO_RETRANS_TIMEOUT, &clp->cl_flags);
>>>>> - error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_GSS_KRB5I);
>>>>> +
>>>>> + error = -EINVAL;
>>>>> + if (gssd_running(net))
>>>>> + error = nfs_create_rpc_client(clp, timeparms,RPC_AUTH_GSS_KRB5I);
>>>>> if (error == -EINVAL)
>>>>> error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_UNIX);
>>>>
>>>> This feels like a layering violation. Why wouldn't you put a gssd_running check in gss_create(), for example, and have rpcauth_create() return -EINVAL?
>>>>
>>>
>>> It would be better to put it in the upcall.
>>
>> Waiting until the upcall has its benefits. At that point, GSSD (if it is running) can report other errors, such as that there is no material to form a machine credential.
>>
>> My point to Jeff is that, aside from architectural aesthetics, direct calls from RPC consumers to the GSS layer causes a module dependency problem. The right way to plumb this is to create an RPC client API that invokes gssd_running() but only if auth_rpcgss.ko is loaded.

Chuck, I?ve already told you that the auth_rpcgss dependency is a non-starter. Turning off automatic loading of rpcsec_gss modules is a _worse_ regression than the ones we already have and (as I already said) can be trivially defeated by just compiling them into the kernel.

We _want_ the kernel to be able to autoload modules so that we can add new security flavours etc without having to recompile.

>> However, I don't see why the existing RPC client APIs shouldn't just fail where appropriate if GSSD isn't available. Is there a strong need to expose gssd_running() as a separate API?
>>
>
> One of the complaints about this whole "use krb5i by default" change is
> that we now get the warnings in the ring buffer when gssd isn't
> running. That's a good thing if krb5 was explicitly requested, but is
> useless and confusing if the user just wants to use AUTH_SYS.
>
> If we wait until gss_create, it's too late to know what the "intent"
> was. We'll either fire the warning inappropriately like we do now, or
> miss it altogether when we actually should have printed it.

What if the user intended to use krb5i, but the daemon failed to start? This whole ?kernel second guessing what the admin _actually_ wanted to do? thing is a red herring. Let?s just fix the real problem and then leave it at that.

--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com