2013-11-14 12:25:28

by Jeff Layton

Subject: [PATCH v4 0/3] sunrpc/nfs: more reliable detection of running gssd

v4:
- fix commit message in first patch (s/"dummy"/"gssd"/)

- remove #if IS_ENABLED(CONFIG_SUNRPC_GSS) compile-time dependency

- added cleanup of warn_gssd() message

- fix minor checkpatch.pl warnings

v3:
- get rid of unneeded argument to init_client nfs op

- move gssd_running check into rpc_pipe.c to eliminate dependency on
auth_gss.ko

v2:
- change name of toplevel pipefs dir from "dummy" to "gssd" (per
Trond's suggestion)

- when gssd isn't running, don't bother to upcall (per Neil B.'s
suggestion)

- fix lifecycle of rpc_pipe data. Previously it would have leaked
after umount. With this set, it's created and destroyed along with
the netns, and just attached to the pipe inode on mount/unmount
of rpc_pipefs.

- patch has been added to skip attempting setclientid with krb5i
if gssd isn't running. This avoids the "AUTH_GSS upcall timed out"
message when gssd isn't running and you mount with sec=sys. It also
shortens the delay when gssd isn't up.

The original cover letter from the v1 posting follows. Note that this
set does address the warnings about the AUTH_GSS upcall timing out.

-------------------------[snip]-----------------------------

We've gotten a lot of complaints recently about the 15s delay when
doing a sec=sys mount without gssd running.

A large part of the problem is that the kernel isn't able to reliably
detect when rpc.gssd is running. What we currently have is a
gssd_running flag that is initially set to 1. When an upcall times out,
that gets set to 0, and subsequent upcalls get a much shorter timeout
(1/4s instead of 15s). It's reset back to '1' when a pipe is reopened.

The approach of using a flag like this is pretty inadequate. First, it
doesn't eliminate the long delay on the initial upcall attempt. Also,
if gssd spontaneously dies, then the flag will still be set to 1 until
the next upcall attempt times out. Finally, it currently requires that
the pipe be reopened in order to reset the flag back to true.
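
For anyone who wants the old heuristic spelled out, here is a minimal
userspace sketch of the flag behavior described above. It is not kernel
code: the flag_model names are made up for illustration, and only the
15s and 1/4s timeouts come from the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of the old gssd_running flag; not kernel code. */
#define MODEL_LONG_TIMEOUT_MS  15000	/* 15s while gssd is assumed to be up */
#define MODEL_SHORT_TIMEOUT_MS 250	/* 1/4s once an upcall has timed out */

struct flag_model {
	bool gssd_running;
};

/* The flag starts out set, so the very first upcall waits the full 15s. */
static inline void flag_model_init(struct flag_model *m)
{
	assert(m != NULL);
	m->gssd_running = true;
}

/* Timeout that the next upcall would use. */
static inline int flag_model_timeout_ms(const struct flag_model *m)
{
	return m->gssd_running ? MODEL_LONG_TIMEOUT_MS : MODEL_SHORT_TIMEOUT_MS;
}

/* An upcall timing out clears the flag. */
static inline void flag_model_upcall_timed_out(struct flag_model *m)
{
	m->gssd_running = false;
}

/* Reopening a pipe sets the flag again. */
static inline void flag_model_pipe_opened(struct flag_model *m)
{
	m->gssd_running = true;
}
```

Note that nothing in this model reacts to the daemon exiting, which is
the second problem listed above: the flag only changes on a timeout or
on a pipe open.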

This patchset replaces that flag with a more reliable mechanism for
detecting when gssd is running. When rpc_pipefs is mounted, it creates a
new "dummy" pipe that gssd will naturally find and hold open. We'll
never send an upcall down this pipe, and writing to it always fails.
But, since we can detect when something is holding it open, we can use
that to determine whether gssd is running.
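
As a rough illustration of the idea, the sketch below models the dummy
pipe as a pair of open counts in userspace. The pipe_model type and its
helpers are hypothetical; the nreaders/nwriters names are borrowed from
struct rpc_pipe as used by the gssd_running() check later in the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the dummy pipe's open counts; not kernel code. */
struct pipe_model {
	int nreaders;
	int nwriters;
};

/* Called when some process opens the pipe. */
static inline void pipe_model_open(struct pipe_model *p, bool for_read,
				   bool for_write)
{
	if (for_read)
		p->nreaders++;
	if (for_write)
		p->nwriters++;
}

/* Called when the file is released, including when the opener exits. */
static inline void pipe_model_release(struct pipe_model *p, bool for_read,
				      bool for_write)
{
	if (for_read) {
		assert(p->nreaders > 0);
		p->nreaders--;
	}
	if (for_write) {
		assert(p->nwriters > 0);
		p->nwriters--;
	}
}

/* The daemon counts as running while anything holds the pipe open. */
static inline bool model_gssd_running(const struct pipe_model *p)
{
	return p->nreaders > 0 || p->nwriters > 0;
}
```

Because the counts drop as soon as the opener goes away, this kind of
check needs no timeout to notice that the daemon has died.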

The current patch just uses this mechanism to replace the gssd_running
flag. This shortens the long delay when mounting without gssd running,
but does not silence these warnings:

RPC: AUTH_GSS upcall timed out.
Please check user daemon is running.

I'm willing to add a patch to do that, but I'm a little unclear on the
best way to do so. Those messages are generated by the auth_gss code. We
probably do want to print them if someone mounted with sec=krb5, but
suppress them when mounting with sec=sys.

Do we need to somehow pass down that intent to auth_gss? Another idea
would be to call gssd_running() from the nfs mount code and use that to
determine whether to try and use krb5 at all...

Discuss!

Jeff Layton (3):
sunrpc: create a new dummy pipe for gssd to hold open
sunrpc: replace sunrpc_net->gssd_running flag with a more reliable
check
nfs: check if gssd is running before attempting to use krb5i auth in
SETCLIENTID call

fs/nfs/nfs4client.c | 7 ++-
include/linux/sunrpc/rpc_pipe_fs.h | 5 +-
net/sunrpc/auth_gss/auth_gss.c | 17 +++---
net/sunrpc/netns.h | 3 +-
net/sunrpc/rpc_pipe.c | 107 ++++++++++++++++++++++++++++++++++---
net/sunrpc/sunrpc_syms.c | 8 ++-
6 files changed, 125 insertions(+), 22 deletions(-)

--
1.8.3.1



2013-11-14 12:25:31

by Jeff Layton

Subject: [PATCH v4 3/3] nfs: check if gssd is running before attempting to use krb5i auth in SETCLIENTID call

Currently, the client will attempt to use krb5i in the SETCLIENTID call
even if rpc.gssd isn't running. When that fails, it'll then fall back to
RPC_AUTH_UNIX. This introduces a delay when mounting if rpc.gssd isn't
running, and causes warning messages to pop up in the ring buffer.

Check to see if rpc.gssd is running before even attempting to use krb5i
auth, and just silently skip trying to do so if it isn't. In the event
that the admin is actually trying to mount with krb5*, it will still
fail at a later stage of the mount attempt.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfs/nfs4client.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
index b4a160a..c1b7a80 100644
--- a/fs/nfs/nfs4client.c
+++ b/fs/nfs/nfs4client.c
@@ -10,6 +10,7 @@
#include <linux/sunrpc/auth.h>
#include <linux/sunrpc/xprt.h>
#include <linux/sunrpc/bc_xprt.h>
+#include <linux/sunrpc/rpc_pipe_fs.h>
#include "internal.h"
#include "callback.h"
#include "delegation.h"
@@ -370,7 +371,11 @@ struct nfs_client *nfs4_init_client(struct nfs_client *clp,
__set_bit(NFS_CS_INFINITE_SLOTS, &clp->cl_flags);
__set_bit(NFS_CS_DISCRTRY, &clp->cl_flags);
__set_bit(NFS_CS_NO_RETRANS_TIMEOUT, &clp->cl_flags);
- error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_GSS_KRB5I);
+
+ error = -EINVAL;
+ if (gssd_running(clp->cl_net))
+ error = nfs_create_rpc_client(clp, timeparms,
+ RPC_AUTH_GSS_KRB5I);
if (error == -EINVAL)
error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_UNIX);
if (error < 0)
--
1.8.3.1
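
The control flow of the hunk above can be exercised outside the kernel
with a small model. This is only a sketch: model_create_rpc_client() is
a made-up stub standing in for nfs_create_rpc_client(), the attempt_log
struct exists purely to record which flavors were tried, and the stub
assumes the AUTH_UNIX attempt always succeeds.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the RPC_AUTH_* flavors; not the kernel values. */
enum model_flavor { MODEL_AUTH_UNIX, MODEL_AUTH_GSS_KRB5I };

struct attempt_log {
	int krb5i_result;	/* what the stubbed krb5i attempt returns */
	int krb5i_attempts;
	int unix_attempts;
};

/* Stub for nfs_create_rpc_client(): records the attempt and returns a result. */
static inline int model_create_rpc_client(struct attempt_log *log,
					  enum model_flavor flavor)
{
	assert(log != NULL);
	if (flavor == MODEL_AUTH_GSS_KRB5I) {
		log->krb5i_attempts++;
		return log->krb5i_result;
	}
	log->unix_attempts++;
	return 0;	/* assumption: AUTH_UNIX always works in this model */
}

/* Same shape as the patched code: skip krb5i entirely when gssd is down. */
static inline int model_init_client(struct attempt_log *log, bool gssd_up)
{
	int error = -EINVAL;

	if (gssd_up)
		error = model_create_rpc_client(log, MODEL_AUTH_GSS_KRB5I);
	if (error == -EINVAL)
		error = model_create_rpc_client(log, MODEL_AUTH_UNIX);
	return error;
}
```

Pre-seeding error with -EINVAL is what lets the existing fallback test
stay unchanged: a skipped krb5i attempt looks the same as a failed one.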


2013-11-14 12:25:28

by Jeff Layton

Subject: [PATCH v4 1/3] sunrpc: create a new dummy pipe for gssd to hold open

rpc.gssd will naturally hold open any pipe named */clnt*/gssd that shows
up under rpc_pipefs. That behavior gives us a reliable mechanism to tell
whether it's actually running or not.

Create a new toplevel "gssd" directory in rpc_pipefs when it's mounted.
Under that directory create another directory called "clntXX", and then
within that a pipe called "gssd".

We'll never send an upcall along that pipe, and any downcall written to
it will just return -EINVAL.

Signed-off-by: Jeff Layton <[email protected]>
---
include/linux/sunrpc/rpc_pipe_fs.h | 3 +-
net/sunrpc/netns.h | 1 +
net/sunrpc/rpc_pipe.c | 93 ++++++++++++++++++++++++++++++++++++--
net/sunrpc/sunrpc_syms.c | 8 +++-
4 files changed, 100 insertions(+), 5 deletions(-)

diff --git a/include/linux/sunrpc/rpc_pipe_fs.h b/include/linux/sunrpc/rpc_pipe_fs.h
index a353e03..85f1342 100644
--- a/include/linux/sunrpc/rpc_pipe_fs.h
+++ b/include/linux/sunrpc/rpc_pipe_fs.h
@@ -84,7 +84,8 @@ enum {

extern struct dentry *rpc_d_lookup_sb(const struct super_block *sb,
const unsigned char *dir_name);
-extern void rpc_pipefs_init_net(struct net *net);
+extern int rpc_pipefs_init_net(struct net *net);
+extern void rpc_pipefs_exit_net(struct net *net);
extern struct super_block *rpc_get_sb_net(const struct net *net);
extern void rpc_put_sb_net(const struct net *net);

diff --git a/net/sunrpc/netns.h b/net/sunrpc/netns.h
index 779742c..8a8e841 100644
--- a/net/sunrpc/netns.h
+++ b/net/sunrpc/netns.h
@@ -14,6 +14,7 @@ struct sunrpc_net {
struct cache_detail *rsi_cache;

struct super_block *pipefs_sb;
+ struct rpc_pipe *gssd_dummy;
struct mutex pipefs_sb_lock;

struct list_head all_clients;
diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index d0d14a0..34efdbf 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -38,7 +38,7 @@
#define NET_NAME(net) ((net == &init_net) ? " (init_net)" : "")

static struct file_system_type rpc_pipe_fs_type;
-
+static const struct rpc_pipe_ops gssd_dummy_pipe_ops;

static struct kmem_cache *rpc_inode_cachep __read_mostly;

@@ -1168,6 +1168,7 @@ enum {
RPCAUTH_nfsd4_cb,
RPCAUTH_cache,
RPCAUTH_nfsd,
+ RPCAUTH_gssd,
RPCAUTH_RootEOF
};

@@ -1204,6 +1205,10 @@ static const struct rpc_filelist files[] = {
.name = "nfsd",
.mode = S_IFDIR | S_IRUGO | S_IXUGO,
},
+ [RPCAUTH_gssd] = {
+ .name = "gssd",
+ .mode = S_IFDIR | S_IRUGO | S_IXUGO,
+ },
};

/*
@@ -1217,13 +1222,25 @@ struct dentry *rpc_d_lookup_sb(const struct super_block *sb,
}
EXPORT_SYMBOL_GPL(rpc_d_lookup_sb);

-void rpc_pipefs_init_net(struct net *net)
+int rpc_pipefs_init_net(struct net *net)
{
struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);

+ sn->gssd_dummy = rpc_mkpipe_data(&gssd_dummy_pipe_ops, 0);
+ if (IS_ERR(sn->gssd_dummy))
+ return PTR_ERR(sn->gssd_dummy);
+
mutex_init(&sn->pipefs_sb_lock);
sn->gssd_running = 1;
sn->pipe_version = -1;
+ return 0;
+}
+
+void rpc_pipefs_exit_net(struct net *net)
+{
+ struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+
+ rpc_destroy_pipe_data(sn->gssd_dummy);
}

/*
@@ -1253,11 +1270,73 @@ void rpc_put_sb_net(const struct net *net)
}
EXPORT_SYMBOL_GPL(rpc_put_sb_net);

+static const struct rpc_filelist gssd_dummy_clnt_dir[] = {
+ [0] = {
+ .name = "clntXX",
+ .mode = S_IFDIR | S_IRUGO | S_IXUGO,
+ },
+};
+
+static ssize_t
+dummy_downcall(struct file *filp, const char __user *src, size_t len)
+{
+ return -EINVAL;
+}
+
+static const struct rpc_pipe_ops gssd_dummy_pipe_ops = {
+ .upcall = rpc_pipe_generic_upcall,
+ .downcall = dummy_downcall,
+};
+
+/**
+ * rpc_gssd_dummy_populate - create a dummy gssd pipe
+ * @root: root of the rpc_pipefs filesystem
+ * @pipe_data: pipe data created when netns is initialized
+ *
+ * Create a dummy set of directories and a pipe that gssd can hold open to
+ * indicate that it is up and running.
+ */
+static struct dentry *
+rpc_gssd_dummy_populate(struct dentry *root, struct rpc_pipe *pipe_data)
+{
+ int ret = 0;
+ struct dentry *gssd_dentry;
+ struct dentry *clnt_dentry = NULL;
+ struct dentry *pipe_dentry = NULL;
+ struct qstr q = QSTR_INIT(files[RPCAUTH_gssd].name,
+ strlen(files[RPCAUTH_gssd].name));
+
+ /* We should never get this far if "gssd" doesn't exist */
+ gssd_dentry = d_hash_and_lookup(root, &q);
+ if (!gssd_dentry)
+ return ERR_PTR(-ENOENT);
+
+ ret = rpc_populate(gssd_dentry, gssd_dummy_clnt_dir, 0, 1, NULL);
+ if (ret) {
+ pipe_dentry = ERR_PTR(ret);
+ goto out;
+ }
+
+ q.name = gssd_dummy_clnt_dir[0].name;
+ q.len = strlen(gssd_dummy_clnt_dir[0].name);
+ clnt_dentry = d_hash_and_lookup(gssd_dentry, &q);
+ if (!clnt_dentry) {
+ pipe_dentry = ERR_PTR(-ENOENT);
+ goto out;
+ }
+
+ pipe_dentry = rpc_mkpipe_dentry(clnt_dentry, "gssd", NULL, pipe_data);
+out:
+ dput(clnt_dentry);
+ dput(gssd_dentry);
+ return pipe_dentry;
+}
+
static int
rpc_fill_super(struct super_block *sb, void *data, int silent)
{
struct inode *inode;
- struct dentry *root;
+ struct dentry *root, *gssd_dentry;
struct net *net = data;
struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
int err;
@@ -1275,6 +1354,13 @@ rpc_fill_super(struct super_block *sb, void *data, int silent)
return -ENOMEM;
if (rpc_populate(root, files, RPCAUTH_lockd, RPCAUTH_RootEOF, NULL))
return -ENOMEM;
+
+ gssd_dentry = rpc_gssd_dummy_populate(root, sn->gssd_dummy);
+ if (IS_ERR(gssd_dentry)) {
+ __rpc_depopulate(root, files, RPCAUTH_lockd, RPCAUTH_RootEOF);
+ return PTR_ERR(gssd_dentry);
+ }
+
dprintk("RPC: sending pipefs MOUNT notification for net %p%s\n",
net, NET_NAME(net));
mutex_lock(&sn->pipefs_sb_lock);
@@ -1289,6 +1375,7 @@ rpc_fill_super(struct super_block *sb, void *data, int silent)
return 0;

err_depopulate:
+ dput(gssd_dentry);
blocking_notifier_call_chain(&rpc_pipefs_notifier_list,
RPC_PIPEFS_UMOUNT,
sb);
diff --git a/net/sunrpc/sunrpc_syms.c b/net/sunrpc/sunrpc_syms.c
index 3d6498a..cd30120 100644
--- a/net/sunrpc/sunrpc_syms.c
+++ b/net/sunrpc/sunrpc_syms.c
@@ -44,12 +44,17 @@ static __net_init int sunrpc_init_net(struct net *net)
if (err)
goto err_unixgid;

- rpc_pipefs_init_net(net);
+ err = rpc_pipefs_init_net(net);
+ if (err)
+ goto err_pipefs;
+
INIT_LIST_HEAD(&sn->all_clients);
spin_lock_init(&sn->rpc_client_lock);
spin_lock_init(&sn->rpcb_clnt_lock);
return 0;

+err_pipefs:
+ unix_gid_cache_destroy(net);
err_unixgid:
ip_map_cache_destroy(net);
err_ipmap:
@@ -60,6 +65,7 @@ err_proc:

static __net_exit void sunrpc_exit_net(struct net *net)
{
+ rpc_pipefs_exit_net(net);
unix_gid_cache_destroy(net);
ip_map_cache_destroy(net);
rpc_proc_exit(net);
--
1.8.3.1
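
To make the naming convention from the commit message concrete, the
sketch below checks paths (relative to the rpc_pipefs root) against the
*/clnt*/gssd pattern using POSIX fnmatch(). It says nothing about how
rpc.gssd actually scans the tree; model_is_gssd_pipe() is a hypothetical
helper that only shows why gssd/clntXX/gssd fits the pattern.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/* Pattern quoted in the commit message, relative to the rpc_pipefs root. */
#define MODEL_GSSD_PIPE_PATTERN "*/clnt*/gssd"

/*
 * Hypothetical helper: true if relpath looks like a pipe gssd would hold
 * open. With flags == 0, '*' may also match '/', which is fine for this
 * illustration.
 */
static inline bool model_is_gssd_pipe(const char *relpath)
{
	assert(relpath != NULL);
	return fnmatch(MODEL_GSSD_PIPE_PATTERN, relpath, 0) == 0;
}
```

With this helper, the new "gssd/clntXX/gssd" path matches in the same
way a per-client path such as "nfs/clnt3/gssd" would (that second path
is just an example name, not taken from the patch).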


2013-11-14 12:25:29

by Jeff Layton

Subject: [PATCH v4 2/3] sunrpc: replace sunrpc_net->gssd_running flag with a more reliable check

Now that we have a more reliable method to tell if gssd is running, we
can replace the sn->gssd_running flag with a function that will query to
see if it's up and running.

There's also no need to attempt an upcall that we know will fail, so
just return -EACCES if gssd isn't running. Finally, fix the warn_gssd()
message not to claim that the upcall timed out since we don't
necessarily perform one now when gssd isn't running, and remove the
extraneous newline from the message.

Signed-off-by: Jeff Layton <[email protected]>
---
include/linux/sunrpc/rpc_pipe_fs.h | 2 ++
net/sunrpc/auth_gss/auth_gss.c | 17 +++++++----------
net/sunrpc/netns.h | 2 --
net/sunrpc/rpc_pipe.c | 14 ++++++++++----
4 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/include/linux/sunrpc/rpc_pipe_fs.h b/include/linux/sunrpc/rpc_pipe_fs.h
index 85f1342..7f490be 100644
--- a/include/linux/sunrpc/rpc_pipe_fs.h
+++ b/include/linux/sunrpc/rpc_pipe_fs.h
@@ -131,5 +131,7 @@ extern int rpc_unlink(struct dentry *);
extern int register_rpc_pipefs(void);
extern void unregister_rpc_pipefs(void);

+extern bool gssd_running(struct net *net);
+
#endif
#endif
diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
index 97912b4..0316b8d 100644
--- a/net/sunrpc/auth_gss/auth_gss.c
+++ b/net/sunrpc/auth_gss/auth_gss.c
@@ -536,8 +536,7 @@ static void warn_gssd(void)
unsigned long now = jiffies;

if (time_after(now, ratelimit)) {
- printk(KERN_WARNING "RPC: AUTH_GSS upcall timed out.\n"
- "Please check user daemon is running.\n");
+ pr_warn("RPC: AUTH_GSS upcall failed. Please check user daemon is running.\n");
ratelimit = now + 15*HZ;
}
}
@@ -600,7 +599,6 @@ gss_create_upcall(struct gss_auth *gss_auth, struct gss_cred *gss_cred)
struct rpc_pipe *pipe;
struct rpc_cred *cred = &gss_cred->gc_base;
struct gss_upcall_msg *gss_msg;
- unsigned long timeout;
DEFINE_WAIT(wait);
int err;

@@ -608,17 +606,16 @@ gss_create_upcall(struct gss_auth *gss_auth, struct gss_cred *gss_cred)
__func__, from_kuid(&init_user_ns, cred->cr_uid));
retry:
err = 0;
- /* Default timeout is 15s unless we know that gssd is not running */
- timeout = 15 * HZ;
- if (!sn->gssd_running)
- timeout = HZ >> 2;
+ /* if gssd is down, just skip upcalling altogether */
+ if (!gssd_running(net)) {
+ warn_gssd();
+ return -EACCES;
+ }
gss_msg = gss_setup_upcall(gss_auth, cred);
if (PTR_ERR(gss_msg) == -EAGAIN) {
err = wait_event_interruptible_timeout(pipe_version_waitqueue,
- sn->pipe_version >= 0, timeout);
+ sn->pipe_version >= 0, 15 * HZ);
if (sn->pipe_version < 0) {
- if (err == 0)
- sn->gssd_running = 0;
warn_gssd();
err = -EACCES;
}
diff --git a/net/sunrpc/netns.h b/net/sunrpc/netns.h
index 8a8e841..94e506f 100644
--- a/net/sunrpc/netns.h
+++ b/net/sunrpc/netns.h
@@ -33,8 +33,6 @@ struct sunrpc_net {
int pipe_version;
atomic_t pipe_users;
struct proc_dir_entry *use_gssp_proc;
-
- unsigned int gssd_running;
};

extern int sunrpc_net_id;
diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index 34efdbf..9ce0082 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -216,14 +216,11 @@ rpc_destroy_inode(struct inode *inode)
static int
rpc_pipe_open(struct inode *inode, struct file *filp)
{
- struct net *net = inode->i_sb->s_fs_info;
- struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
struct rpc_pipe *pipe;
int first_open;
int res = -ENXIO;

mutex_lock(&inode->i_mutex);
- sn->gssd_running = 1;
pipe = RPC_I(inode)->pipe;
if (pipe == NULL)
goto out;
@@ -1231,7 +1228,6 @@ int rpc_pipefs_init_net(struct net *net)
return PTR_ERR(sn->gssd_dummy);

mutex_init(&sn->pipefs_sb_lock);
- sn->gssd_running = 1;
sn->pipe_version = -1;
return 0;
}
@@ -1385,6 +1381,16 @@ err_depopulate:
return err;
}

+bool
+gssd_running(struct net *net)
+{
+ struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+ struct rpc_pipe *pipe = sn->gssd_dummy;
+
+ return pipe->nreaders || pipe->nwriters;
+}
+EXPORT_SYMBOL_GPL(gssd_running);
+
static struct dentry *
rpc_mount(struct file_system_type *fs_type,
int flags, const char *dev_name, void *data)
--
1.8.3.1


2013-11-15 17:49:03

by Jeff Layton

Subject: Re: [PATCH v4 1/3] sunrpc: create a new dummy pipe for gssd to hold open

On Fri, 15 Nov 2013 16:56:51 +0000
"Myklebust, Trond" <[email protected]> wrote:

> On Thu, 2013-11-14 at 07:25 -0500, Jeff Layton wrote:
> > rpc.gssd will naturally hold open any pipe named */clnt*/gssd that shows
> > up under rpc_pipefs. That behavior gives us a reliable mechanism to tell
> > whether it's actually running or not.
> >
> > Create a new toplevel "gssd" directory in rpc_pipefs when it's mounted.
> > Under that directory create another directory called "clntXX", and then
> > within that a pipe called "gssd".
> >
> > We'll never send an upcall along that pipe, and any downcall written to
> > it will just return -EINVAL.
> >
> > Signed-off-by: Jeff Layton <[email protected]>
>
> Hi Jeff,
>
> Don't you need something in rpc_kill_sb() in order to remove the pipe
> and the clntXX directory?
>
> Also please see rpc_mkdir_populate() and rpc_mkdir_depopulate() for how
> you can simplify the creation/destruction of the clntXX+clntXX/gssd.
>
> Cheers
> Trond

That concerned me too when I was working on this patch, but I don't think
that's the case. Note that rpc_kill_sb() doesn't call rpc_depopulate() or
anything for the top level directories in files[] either, which puzzled
me until I dove in to figure out why...

Basically what happens is that we take a dentry reference to those when
they are instantiated and that keeps them pinned in memory. The last
thing that rpc_kill_sb() does is call kill_litter_super(), which calls
d_genocide() to drop a reference on any dentry reachable from s->root.

After that, it calls generic_shutdown_super() which then calls
shrink_dcache_for_umount() to go through and purge all of the
dentries whose refcount is now 0. At that point you should see "Dentry ...
still in use" messages if any still have a non-zero refcount. I've
tested mounting and unmounting rpc_pipefs with this set and I've never
seen those warnings pop.

So, I don't think we need to do any cleanup of them, but I will admit
that none of this is particularly obvious. ;)

--
Jeff Layton <[email protected]>

2013-11-15 16:56:53

by Myklebust, Trond

Subject: Re: [PATCH v4 1/3] sunrpc: create a new dummy pipe for gssd to hold open

On Thu, 2013-11-14 at 07:25 -0500, Jeff Layton wrote:
> rpc.gssd will naturally hold open any pipe named */clnt*/gssd that shows
> up under rpc_pipefs. That behavior gives us a reliable mechanism to tell
> whether it's actually running or not.
>
> Create a new toplevel "gssd" directory in rpc_pipefs when it's mounted.
> Under that directory create another directory called "clntXX", and then
> within that a pipe called "gssd".
>
> We'll never send an upcall along that pipe, and any downcall written to
> it will just return -EINVAL.
>
> Signed-off-by: Jeff Layton <jlayton@redhat.com>

Hi Jeff,

Don't you need something in rpc_kill_sb() in order to remove the pipe
and the clntXX directory?

Also please see rpc_mkdir_populate() and rpc_mkdir_depopulate() for how
you can simplify the creation/destruction of the clntXX+clntXX/gssd.

Cheers
Trond