2013-05-15 19:51:12

by Myklebust, Trond

[permalink] [raw]
Subject: [PATCH 0/3] Speed up detection of whether or not rpc.gssd is running

Also fix the auth_gss pipe version detection so that it works correctly
in the presence of namespaces.

Trond Myklebust (3):
SUNRPC: Fix a bug in gss_create_upcall
SUNRPC: Faster detection if gssd is actually running
SUNRPC: Convert auth_gss pipe detection to work in namespaces

net/sunrpc/auth_gss/auth_gss.c | 62 ++++++++++++++++++++++++++++--------------
net/sunrpc/netns.h | 4 +++
net/sunrpc/rpc_pipe.c | 5 ++++
3 files changed, 50 insertions(+), 21 deletions(-)

--
1.8.1.4



2013-05-16 20:19:57

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 2/3] SUNRPC: Faster detection if gssd is actually running

On Wed, May 15, 2013 at 12:50:40PM -0700, Trond Myklebust wrote:
> Recent changes to the NFS security flavour negotiation mean that
> we have a stronger dependency on rpc.gssd. If the latter is not
> running, because the user failed to start it, then we time out
> and mark the container as not having an instance. We then
> use that information to time out faster the next time.
>
> If, on the other hand, the rpc.gssd successfully binds to an rpc_pipe,
> then we mark the container as having an rpc.gssd instance.

So it's still a 15 second delay on the first mount, then 7 on the
second, then 3, 1, and no delay thereafter. Is that right?

Why not be harsher and go straight to 0 after the first failure?

--b.

>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> net/sunrpc/auth_gss/auth_gss.c | 13 ++++++++++++-
> net/sunrpc/netns.h | 2 ++
> net/sunrpc/rpc_pipe.c | 4 ++++
> 3 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
> index f17f3c5..3aff72f 100644
> --- a/net/sunrpc/auth_gss/auth_gss.c
> +++ b/net/sunrpc/auth_gss/auth_gss.c
> @@ -52,6 +52,8 @@
> #include <linux/sunrpc/gss_api.h>
> #include <asm/uaccess.h>
>
> +#include "../netns.h"
> +
> static const struct rpc_authops authgss_ops;
>
> static const struct rpc_credops gss_credops;
> @@ -559,9 +561,12 @@ out:
> static inline int
> gss_create_upcall(struct gss_auth *gss_auth, struct gss_cred *gss_cred)
> {
> + struct net *net = rpc_net_ns(gss_auth->client);
> + struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
> struct rpc_pipe *pipe;
> struct rpc_cred *cred = &gss_cred->gc_base;
> struct gss_upcall_msg *gss_msg;
> + unsigned long timeout;
> DEFINE_WAIT(wait);
> int err;
>
> @@ -569,11 +574,17 @@ gss_create_upcall(struct gss_auth *gss_auth, struct gss_cred *gss_cred)
> __func__, from_kuid(&init_user_ns, cred->cr_uid));
> retry:
> err = 0;
> + /* Default timeout is 15s unless we know that gssd is not running */
> + timeout = 15 * HZ;
> + if (!sn->gssd_running)
> + timeout = HZ >> 2;
> gss_msg = gss_setup_upcall(gss_auth->client, gss_auth, cred);
> if (PTR_ERR(gss_msg) == -EAGAIN) {
> err = wait_event_interruptible_timeout(pipe_version_waitqueue,
> - pipe_version >= 0, 15*HZ);
> + pipe_version >= 0, timeout);
> if (pipe_version < 0) {
> + if (err == 0)
> + sn->gssd_running = 0;
> warn_gssd();
> err = -EACCES;
> }
> diff --git a/net/sunrpc/netns.h b/net/sunrpc/netns.h
> index 7111a4c..0827f64 100644
> --- a/net/sunrpc/netns.h
> +++ b/net/sunrpc/netns.h
> @@ -29,6 +29,8 @@ struct sunrpc_net {
> struct rpc_clnt *gssp_clnt;
> int use_gss_proxy;
> struct proc_dir_entry *use_gssp_proc;
> +
> + unsigned int gssd_running;
> };
>
> extern int sunrpc_net_id;
> diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
> index a9129f8..415b705 100644
> --- a/net/sunrpc/rpc_pipe.c
> +++ b/net/sunrpc/rpc_pipe.c
> @@ -216,11 +216,14 @@ rpc_destroy_inode(struct inode *inode)
> static int
> rpc_pipe_open(struct inode *inode, struct file *filp)
> {
> + struct net *net = inode->i_sb->s_fs_info;
> + struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
> struct rpc_pipe *pipe;
> int first_open;
> int res = -ENXIO;
>
> mutex_lock(&inode->i_mutex);
> + sn->gssd_running = 1;
> pipe = RPC_I(inode)->pipe;
> if (pipe == NULL)
> goto out;
> @@ -1069,6 +1072,7 @@ void rpc_pipefs_init_net(struct net *net)
> struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
>
> mutex_init(&sn->pipefs_sb_lock);
> + sn->gssd_running = -1;
> }
>
> /*
> --
> 1.8.1.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2013-05-15 19:51:13

by Myklebust, Trond

[permalink] [raw]
Subject: [PATCH 1/3] SUNRPC: Fix a bug in gss_create_upcall

If wait_event_interruptible_timeout() is successful, it returns
the number of seconds remaining until the timeout. In that
case, we should be retrying the upcall.

Signed-off-by: Trond Myklebust <[email protected]>
---
net/sunrpc/auth_gss/auth_gss.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
index 7da6b45..f17f3c5 100644
--- a/net/sunrpc/auth_gss/auth_gss.c
+++ b/net/sunrpc/auth_gss/auth_gss.c
@@ -563,11 +563,12 @@ gss_create_upcall(struct gss_auth *gss_auth, struct gss_cred *gss_cred)
struct rpc_cred *cred = &gss_cred->gc_base;
struct gss_upcall_msg *gss_msg;
DEFINE_WAIT(wait);
- int err = 0;
+ int err;

dprintk("RPC: %s for uid %u\n",
__func__, from_kuid(&init_user_ns, cred->cr_uid));
retry:
+ err = 0;
gss_msg = gss_setup_upcall(gss_auth->client, gss_auth, cred);
if (PTR_ERR(gss_msg) == -EAGAIN) {
err = wait_event_interruptible_timeout(pipe_version_waitqueue,
@@ -576,7 +577,7 @@ retry:
warn_gssd();
err = -EACCES;
}
- if (err)
+ if (err < 0)
goto out;
goto retry;
}
--
1.8.1.4


2013-05-15 19:51:14

by Myklebust, Trond

[permalink] [raw]
Subject: [PATCH 3/3] SUNRPC: Convert auth_gss pipe detection to work in namespaces

This seems to have been overlooked when we did the namespace
conversion. If a container is running a legacy version of rpc.gssd
then it will be disrupted if the global 'pipe_version' is set by a
container running the new version of rpc.gssd.

Signed-off-by: Trond Myklebust <[email protected]>
---
net/sunrpc/auth_gss/auth_gss.c | 46 +++++++++++++++++++++++++-----------------
net/sunrpc/netns.h | 2 ++
net/sunrpc/rpc_pipe.c | 1 +
3 files changed, 30 insertions(+), 19 deletions(-)

diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
index 3aff72f..fc2f78d 100644
--- a/net/sunrpc/auth_gss/auth_gss.c
+++ b/net/sunrpc/auth_gss/auth_gss.c
@@ -87,8 +87,6 @@ struct gss_auth {
};

/* pipe_version >= 0 if and only if someone has a pipe open. */
-static int pipe_version = -1;
-static atomic_t pipe_users = ATOMIC_INIT(0);
static DEFINE_SPINLOCK(pipe_version_lock);
static struct rpc_wait_queue pipe_version_rpc_waitqueue;
static DECLARE_WAIT_QUEUE_HEAD(pipe_version_waitqueue);
@@ -268,24 +266,27 @@ struct gss_upcall_msg {
char databuf[UPCALL_BUF_LEN];
};

-static int get_pipe_version(void)
+static int get_pipe_version(struct net *net)
{
+ struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
int ret;

spin_lock(&pipe_version_lock);
- if (pipe_version >= 0) {
- atomic_inc(&pipe_users);
- ret = pipe_version;
+ if (sn->pipe_version >= 0) {
+ atomic_inc(&sn->pipe_users);
+ ret = sn->pipe_version;
} else
ret = -EAGAIN;
spin_unlock(&pipe_version_lock);
return ret;
}

-static void put_pipe_version(void)
+static void put_pipe_version(struct net *net)
{
- if (atomic_dec_and_lock(&pipe_users, &pipe_version_lock)) {
- pipe_version = -1;
+ struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+
+ if (atomic_dec_and_lock(&sn->pipe_users, &pipe_version_lock)) {
+ sn->pipe_version = -1;
spin_unlock(&pipe_version_lock);
}
}
@@ -293,9 +294,10 @@ static void put_pipe_version(void)
static void
gss_release_msg(struct gss_upcall_msg *gss_msg)
{
+ struct net *net = rpc_net_ns(gss_msg->auth->client);
if (!atomic_dec_and_test(&gss_msg->count))
return;
- put_pipe_version();
+ put_pipe_version(net);
BUG_ON(!list_empty(&gss_msg->list));
if (gss_msg->ctx != NULL)
gss_put_ctx(gss_msg->ctx);
@@ -441,7 +443,10 @@ static void gss_encode_msg(struct gss_upcall_msg *gss_msg,
struct rpc_clnt *clnt,
const char *service_name)
{
- if (pipe_version == 0)
+ struct net *net = rpc_net_ns(clnt);
+ struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+
+ if (sn->pipe_version == 0)
gss_encode_v0_msg(gss_msg);
else /* pipe_version == 1 */
gss_encode_v1_msg(gss_msg, clnt, service_name);
@@ -457,7 +462,7 @@ gss_alloc_msg(struct gss_auth *gss_auth, struct rpc_clnt *clnt,
gss_msg = kzalloc(sizeof(*gss_msg), GFP_NOFS);
if (gss_msg == NULL)
return ERR_PTR(-ENOMEM);
- vers = get_pipe_version();
+ vers = get_pipe_version(rpc_net_ns(clnt));
if (vers < 0) {
kfree(gss_msg);
return ERR_PTR(vers);
@@ -581,8 +586,8 @@ retry:
gss_msg = gss_setup_upcall(gss_auth->client, gss_auth, cred);
if (PTR_ERR(gss_msg) == -EAGAIN) {
err = wait_event_interruptible_timeout(pipe_version_waitqueue,
- pipe_version >= 0, timeout);
- if (pipe_version < 0) {
+ sn->pipe_version >= 0, timeout);
+ if (sn->pipe_version < 0) {
if (err == 0)
sn->gssd_running = 0;
warn_gssd();
@@ -719,20 +724,22 @@ out:

static int gss_pipe_open(struct inode *inode, int new_version)
{
+ struct net *net = inode->i_sb->s_fs_info;
+ struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
int ret = 0;

spin_lock(&pipe_version_lock);
- if (pipe_version < 0) {
+ if (sn->pipe_version < 0) {
/* First open of any gss pipe determines the version: */
- pipe_version = new_version;
+ sn->pipe_version = new_version;
rpc_wake_up(&pipe_version_rpc_waitqueue);
wake_up(&pipe_version_waitqueue);
- } else if (pipe_version != new_version) {
+ } else if (sn->pipe_version != new_version) {
/* Trying to open a pipe of a different version */
ret = -EBUSY;
goto out;
}
- atomic_inc(&pipe_users);
+ atomic_inc(&sn->pipe_users);
out:
spin_unlock(&pipe_version_lock);
return ret;
@@ -752,6 +759,7 @@ static int gss_pipe_open_v1(struct inode *inode)
static void
gss_pipe_release(struct inode *inode)
{
+ struct net *net = inode->i_sb->s_fs_info;
struct rpc_pipe *pipe = RPC_I(inode)->pipe;
struct gss_upcall_msg *gss_msg;

@@ -770,7 +778,7 @@ restart:
}
spin_unlock(&pipe->lock);

- put_pipe_version();
+ put_pipe_version(net);
}

static void
diff --git a/net/sunrpc/netns.h b/net/sunrpc/netns.h
index 0827f64..f88b018 100644
--- a/net/sunrpc/netns.h
+++ b/net/sunrpc/netns.h
@@ -28,6 +28,8 @@ struct sunrpc_net {
wait_queue_head_t gssp_wq;
struct rpc_clnt *gssp_clnt;
int use_gss_proxy;
+ unsigned int pipe_version;
+ atomic_t pipe_users;
struct proc_dir_entry *use_gssp_proc;

unsigned int gssd_running;
diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index 415b705..4f59a3c 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -1073,6 +1073,7 @@ void rpc_pipefs_init_net(struct net *net)

mutex_init(&sn->pipefs_sb_lock);
sn->gssd_running = -1;
+ sn->pipe_version = -1;
}

/*
--
1.8.1.4


2013-05-17 17:56:23

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH 3/3] SUNRPC: Convert auth_gss pipe detection to work in namespaces

On Thu, 2013-05-16 at 16:21 -0400, J. Bruce Fields wrote:
> On Wed, May 15, 2013 at 12:50:41PM -0700, Trond Myklebust wrote:
> > This seems to have been overlooked when we did the namespace
> > conversion. If a container is running a legacy version of rpc.gssd
> > then it will be disrupted if the global 'pipe_version' is set by a
> > container running the new version of rpc.gssd.
>
> Might be worth deprecating the "legacy" version--add a warning in for
> any users and then remove it if we don't see any evidence anyone's hit
> it recently.

Possibly, however given that we're still seeing distros using the binary
mount interface that was replaced 6 years ago...

--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com

2013-05-16 20:22:31

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 0/3] Speed up detection of whether or not rpc.gssd is running

On Thu, May 16, 2013 at 09:55:21AM -0400, Chuck Lever wrote:
>
> On May 15, 2013, at 3:50 PM, Trond Myklebust
> <[email protected]> wrote:
>
> > Also fix the auth_gss pipe version detection so that it works
> > correctly in the presence of namespaces.
>
> Obviously my preference is to keep the existing krb5i -> sys logic and
> to try to address the upcall timeout, so this is a good direction,
> IMO.
>
> After some review, I'm still a little concerned about initialization
> races inappropriately preventing upcalls,

I don't understand what you're referring to--could you elaborate?

--b.

> but time (and testing) will
> tell. I'm happy with this solution if you and Bruce are.
>
> Speaking of testing, I can test this series if you think that would be
> valuable, but I assume that those who originally reported the timeout
> problem (Jeff and Steve) should be the ones to confirm whether this
> series addresses their concern.
>
>
> > Trond Myklebust (3): SUNRPC: Fix a bug in gss_create_upcall SUNRPC:
> > Faster detection if gssd is actually running SUNRPC: Convert
> > auth_gss pipe detection to work in namespaces
> >
> > net/sunrpc/auth_gss/auth_gss.c | 62
> > ++++++++++++++++++++++++++++-------------- net/sunrpc/netns.h
> > | 4 +++ net/sunrpc/rpc_pipe.c | 5 ++++ 3 files changed,
> > 50 insertions(+), 21 deletions(-)
>
> -- Chuck Lever chuck[dot]lever[at]oracle[dot]com
>
>
>
>
> -- To unsubscribe from this list: send the line "unsubscribe
> linux-nfs" in the body of a message to [email protected] More
> majordomo info at http://vger.kernel.org/majordomo-info.html

2013-05-17 01:03:46

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 2/3] SUNRPC: Faster detection if gssd is actually running

On Thu, May 16, 2013 at 04:19:54PM -0400, bfields wrote:
> On Wed, May 15, 2013 at 12:50:40PM -0700, Trond Myklebust wrote:
> > Recent changes to the NFS security flavour negotiation mean that
> > we have a stronger dependency on rpc.gssd. If the latter is not
> > running, because the user failed to start it, then we time out
> > and mark the container as not having an instance. We then
> > use that information to time out faster the next time.
> >
> > If, on the other hand, the rpc.gssd successfully binds to an rpc_pipe,
> > then we mark the container as having an rpc.gssd instance.
>
> So it's still a 15 second delay on the first mount, then 7 on the
> second, then 3, 1, and no delay thereafter. Is that right?
>
> Why not be harsher and go straight to 0 after the first failure?

Chuck points out I'm confused, it's actually 15s then 1/4s (why 1/4s?).

(Apologies, somehow I saw a ">> 2" in there and my brain shut down and
jumped to the "exponential delay" conclusion.)

Still sort of curious how we're choosing these delays, though.

Hard to imagine people won't still notice the delay on first mount.
Given there's a workaround (run gssd), maybe that's just good enough for
now, I don't know.

--b.


>
> --b.
>
> >
> > Signed-off-by: Trond Myklebust <[email protected]>
> > ---
> > net/sunrpc/auth_gss/auth_gss.c | 13 ++++++++++++-
> > net/sunrpc/netns.h | 2 ++
> > net/sunrpc/rpc_pipe.c | 4 ++++
> > 3 files changed, 18 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
> > index f17f3c5..3aff72f 100644
> > --- a/net/sunrpc/auth_gss/auth_gss.c
> > +++ b/net/sunrpc/auth_gss/auth_gss.c
> > @@ -52,6 +52,8 @@
> > #include <linux/sunrpc/gss_api.h>
> > #include <asm/uaccess.h>
> >
> > +#include "../netns.h"
> > +
> > static const struct rpc_authops authgss_ops;
> >
> > static const struct rpc_credops gss_credops;
> > @@ -559,9 +561,12 @@ out:
> > static inline int
> > gss_create_upcall(struct gss_auth *gss_auth, struct gss_cred *gss_cred)
> > {
> > + struct net *net = rpc_net_ns(gss_auth->client);
> > + struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
> > struct rpc_pipe *pipe;
> > struct rpc_cred *cred = &gss_cred->gc_base;
> > struct gss_upcall_msg *gss_msg;
> > + unsigned long timeout;
> > DEFINE_WAIT(wait);
> > int err;
> >
> > @@ -569,11 +574,17 @@ gss_create_upcall(struct gss_auth *gss_auth, struct gss_cred *gss_cred)
> > __func__, from_kuid(&init_user_ns, cred->cr_uid));
> > retry:
> > err = 0;
> > + /* Default timeout is 15s unless we know that gssd is not running */
> > + timeout = 15 * HZ;
> > + if (!sn->gssd_running)
> > + timeout = HZ >> 2;
> > gss_msg = gss_setup_upcall(gss_auth->client, gss_auth, cred);
> > if (PTR_ERR(gss_msg) == -EAGAIN) {
> > err = wait_event_interruptible_timeout(pipe_version_waitqueue,
> > - pipe_version >= 0, 15*HZ);
> > + pipe_version >= 0, timeout);
> > if (pipe_version < 0) {
> > + if (err == 0)
> > + sn->gssd_running = 0;
> > warn_gssd();
> > err = -EACCES;
> > }
> > diff --git a/net/sunrpc/netns.h b/net/sunrpc/netns.h
> > index 7111a4c..0827f64 100644
> > --- a/net/sunrpc/netns.h
> > +++ b/net/sunrpc/netns.h
> > @@ -29,6 +29,8 @@ struct sunrpc_net {
> > struct rpc_clnt *gssp_clnt;
> > int use_gss_proxy;
> > struct proc_dir_entry *use_gssp_proc;
> > +
> > + unsigned int gssd_running;
> > };
> >
> > extern int sunrpc_net_id;
> > diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
> > index a9129f8..415b705 100644
> > --- a/net/sunrpc/rpc_pipe.c
> > +++ b/net/sunrpc/rpc_pipe.c
> > @@ -216,11 +216,14 @@ rpc_destroy_inode(struct inode *inode)
> > static int
> > rpc_pipe_open(struct inode *inode, struct file *filp)
> > {
> > + struct net *net = inode->i_sb->s_fs_info;
> > + struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
> > struct rpc_pipe *pipe;
> > int first_open;
> > int res = -ENXIO;
> >
> > mutex_lock(&inode->i_mutex);
> > + sn->gssd_running = 1;
> > pipe = RPC_I(inode)->pipe;
> > if (pipe == NULL)
> > goto out;
> > @@ -1069,6 +1072,7 @@ void rpc_pipefs_init_net(struct net *net)
> > struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
> >
> > mutex_init(&sn->pipefs_sb_lock);
> > + sn->gssd_running = -1;
> > }
> >
> > /*
> > --
> > 1.8.1.4
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> > the body of a message to [email protected]
> > More majordomo info at http://vger.kernel.org/majordomo-info.html

2013-05-15 19:51:13

by Myklebust, Trond

[permalink] [raw]
Subject: [PATCH 2/3] SUNRPC: Faster detection if gssd is actually running

Recent changes to the NFS security flavour negotiation mean that
we have a stronger dependency on rpc.gssd. If the latter is not
running, because the user failed to start it, then we time out
and mark the container as not having an instance. We then
use that information to time out faster the next time.

If, on the other hand, the rpc.gssd successfully binds to an rpc_pipe,
then we mark the container as having an rpc.gssd instance.

Signed-off-by: Trond Myklebust <[email protected]>
---
net/sunrpc/auth_gss/auth_gss.c | 13 ++++++++++++-
net/sunrpc/netns.h | 2 ++
net/sunrpc/rpc_pipe.c | 4 ++++
3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
index f17f3c5..3aff72f 100644
--- a/net/sunrpc/auth_gss/auth_gss.c
+++ b/net/sunrpc/auth_gss/auth_gss.c
@@ -52,6 +52,8 @@
#include <linux/sunrpc/gss_api.h>
#include <asm/uaccess.h>

+#include "../netns.h"
+
static const struct rpc_authops authgss_ops;

static const struct rpc_credops gss_credops;
@@ -559,9 +561,12 @@ out:
static inline int
gss_create_upcall(struct gss_auth *gss_auth, struct gss_cred *gss_cred)
{
+ struct net *net = rpc_net_ns(gss_auth->client);
+ struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
struct rpc_pipe *pipe;
struct rpc_cred *cred = &gss_cred->gc_base;
struct gss_upcall_msg *gss_msg;
+ unsigned long timeout;
DEFINE_WAIT(wait);
int err;

@@ -569,11 +574,17 @@ gss_create_upcall(struct gss_auth *gss_auth, struct gss_cred *gss_cred)
__func__, from_kuid(&init_user_ns, cred->cr_uid));
retry:
err = 0;
+ /* Default timeout is 15s unless we know that gssd is not running */
+ timeout = 15 * HZ;
+ if (!sn->gssd_running)
+ timeout = HZ >> 2;
gss_msg = gss_setup_upcall(gss_auth->client, gss_auth, cred);
if (PTR_ERR(gss_msg) == -EAGAIN) {
err = wait_event_interruptible_timeout(pipe_version_waitqueue,
- pipe_version >= 0, 15*HZ);
+ pipe_version >= 0, timeout);
if (pipe_version < 0) {
+ if (err == 0)
+ sn->gssd_running = 0;
warn_gssd();
err = -EACCES;
}
diff --git a/net/sunrpc/netns.h b/net/sunrpc/netns.h
index 7111a4c..0827f64 100644
--- a/net/sunrpc/netns.h
+++ b/net/sunrpc/netns.h
@@ -29,6 +29,8 @@ struct sunrpc_net {
struct rpc_clnt *gssp_clnt;
int use_gss_proxy;
struct proc_dir_entry *use_gssp_proc;
+
+ unsigned int gssd_running;
};

extern int sunrpc_net_id;
diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index a9129f8..415b705 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -216,11 +216,14 @@ rpc_destroy_inode(struct inode *inode)
static int
rpc_pipe_open(struct inode *inode, struct file *filp)
{
+ struct net *net = inode->i_sb->s_fs_info;
+ struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
struct rpc_pipe *pipe;
int first_open;
int res = -ENXIO;

mutex_lock(&inode->i_mutex);
+ sn->gssd_running = 1;
pipe = RPC_I(inode)->pipe;
if (pipe == NULL)
goto out;
@@ -1069,6 +1072,7 @@ void rpc_pipefs_init_net(struct net *net)
struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);

mutex_init(&sn->pipefs_sb_lock);
+ sn->gssd_running = -1;
}

/*
--
1.8.1.4


2013-05-16 20:21:43

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 3/3] SUNRPC: Convert auth_gss pipe detection to work in namespaces

On Wed, May 15, 2013 at 12:50:41PM -0700, Trond Myklebust wrote:
> This seems to have been overlooked when we did the namespace
> conversion. If a container is running a legacy version of rpc.gssd
> then it will be disrupted if the global 'pipe_version' is set by a
> container running the new version of rpc.gssd.

Might be worth deprecating the "legacy" version--add a warning in for
any users and then remove it if we don't see any evidence anyone's hit
it recently.

--b.

>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> net/sunrpc/auth_gss/auth_gss.c | 46 +++++++++++++++++++++++++-----------------
> net/sunrpc/netns.h | 2 ++
> net/sunrpc/rpc_pipe.c | 1 +
> 3 files changed, 30 insertions(+), 19 deletions(-)
>
> diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
> index 3aff72f..fc2f78d 100644
> --- a/net/sunrpc/auth_gss/auth_gss.c
> +++ b/net/sunrpc/auth_gss/auth_gss.c
> @@ -87,8 +87,6 @@ struct gss_auth {
> };
>
> /* pipe_version >= 0 if and only if someone has a pipe open. */
> -static int pipe_version = -1;
> -static atomic_t pipe_users = ATOMIC_INIT(0);
> static DEFINE_SPINLOCK(pipe_version_lock);
> static struct rpc_wait_queue pipe_version_rpc_waitqueue;
> static DECLARE_WAIT_QUEUE_HEAD(pipe_version_waitqueue);
> @@ -268,24 +266,27 @@ struct gss_upcall_msg {
> char databuf[UPCALL_BUF_LEN];
> };
>
> -static int get_pipe_version(void)
> +static int get_pipe_version(struct net *net)
> {
> + struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
> int ret;
>
> spin_lock(&pipe_version_lock);
> - if (pipe_version >= 0) {
> - atomic_inc(&pipe_users);
> - ret = pipe_version;
> + if (sn->pipe_version >= 0) {
> + atomic_inc(&sn->pipe_users);
> + ret = sn->pipe_version;
> } else
> ret = -EAGAIN;
> spin_unlock(&pipe_version_lock);
> return ret;
> }
>
> -static void put_pipe_version(void)
> +static void put_pipe_version(struct net *net)
> {
> - if (atomic_dec_and_lock(&pipe_users, &pipe_version_lock)) {
> - pipe_version = -1;
> + struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
> +
> + if (atomic_dec_and_lock(&sn->pipe_users, &pipe_version_lock)) {
> + sn->pipe_version = -1;
> spin_unlock(&pipe_version_lock);
> }
> }
> @@ -293,9 +294,10 @@ static void put_pipe_version(void)
> static void
> gss_release_msg(struct gss_upcall_msg *gss_msg)
> {
> + struct net *net = rpc_net_ns(gss_msg->auth->client);
> if (!atomic_dec_and_test(&gss_msg->count))
> return;
> - put_pipe_version();
> + put_pipe_version(net);
> BUG_ON(!list_empty(&gss_msg->list));
> if (gss_msg->ctx != NULL)
> gss_put_ctx(gss_msg->ctx);
> @@ -441,7 +443,10 @@ static void gss_encode_msg(struct gss_upcall_msg *gss_msg,
> struct rpc_clnt *clnt,
> const char *service_name)
> {
> - if (pipe_version == 0)
> + struct net *net = rpc_net_ns(clnt);
> + struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
> +
> + if (sn->pipe_version == 0)
> gss_encode_v0_msg(gss_msg);
> else /* pipe_version == 1 */
> gss_encode_v1_msg(gss_msg, clnt, service_name);
> @@ -457,7 +462,7 @@ gss_alloc_msg(struct gss_auth *gss_auth, struct rpc_clnt *clnt,
> gss_msg = kzalloc(sizeof(*gss_msg), GFP_NOFS);
> if (gss_msg == NULL)
> return ERR_PTR(-ENOMEM);
> - vers = get_pipe_version();
> + vers = get_pipe_version(rpc_net_ns(clnt));
> if (vers < 0) {
> kfree(gss_msg);
> return ERR_PTR(vers);
> @@ -581,8 +586,8 @@ retry:
> gss_msg = gss_setup_upcall(gss_auth->client, gss_auth, cred);
> if (PTR_ERR(gss_msg) == -EAGAIN) {
> err = wait_event_interruptible_timeout(pipe_version_waitqueue,
> - pipe_version >= 0, timeout);
> - if (pipe_version < 0) {
> + sn->pipe_version >= 0, timeout);
> + if (sn->pipe_version < 0) {
> if (err == 0)
> sn->gssd_running = 0;
> warn_gssd();
> @@ -719,20 +724,22 @@ out:
>
> static int gss_pipe_open(struct inode *inode, int new_version)
> {
> + struct net *net = inode->i_sb->s_fs_info;
> + struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
> int ret = 0;
>
> spin_lock(&pipe_version_lock);
> - if (pipe_version < 0) {
> + if (sn->pipe_version < 0) {
> /* First open of any gss pipe determines the version: */
> - pipe_version = new_version;
> + sn->pipe_version = new_version;
> rpc_wake_up(&pipe_version_rpc_waitqueue);
> wake_up(&pipe_version_waitqueue);
> - } else if (pipe_version != new_version) {
> + } else if (sn->pipe_version != new_version) {
> /* Trying to open a pipe of a different version */
> ret = -EBUSY;
> goto out;
> }
> - atomic_inc(&pipe_users);
> + atomic_inc(&sn->pipe_users);
> out:
> spin_unlock(&pipe_version_lock);
> return ret;
> @@ -752,6 +759,7 @@ static int gss_pipe_open_v1(struct inode *inode)
> static void
> gss_pipe_release(struct inode *inode)
> {
> + struct net *net = inode->i_sb->s_fs_info;
> struct rpc_pipe *pipe = RPC_I(inode)->pipe;
> struct gss_upcall_msg *gss_msg;
>
> @@ -770,7 +778,7 @@ restart:
> }
> spin_unlock(&pipe->lock);
>
> - put_pipe_version();
> + put_pipe_version(net);
> }
>
> static void
> diff --git a/net/sunrpc/netns.h b/net/sunrpc/netns.h
> index 0827f64..f88b018 100644
> --- a/net/sunrpc/netns.h
> +++ b/net/sunrpc/netns.h
> @@ -28,6 +28,8 @@ struct sunrpc_net {
> wait_queue_head_t gssp_wq;
> struct rpc_clnt *gssp_clnt;
> int use_gss_proxy;
> + unsigned int pipe_version;
> + atomic_t pipe_users;
> struct proc_dir_entry *use_gssp_proc;
>
> unsigned int gssd_running;
> diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
> index 415b705..4f59a3c 100644
> --- a/net/sunrpc/rpc_pipe.c
> +++ b/net/sunrpc/rpc_pipe.c
> @@ -1073,6 +1073,7 @@ void rpc_pipefs_init_net(struct net *net)
>
> mutex_init(&sn->pipefs_sb_lock);
> sn->gssd_running = -1;
> + sn->pipe_version = -1;
> }
>
> /*
> --
> 1.8.1.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2013-05-16 13:55:26

by Chuck Lever III

[permalink] [raw]
Subject: Re: [PATCH 0/3] Speed up detection of whether or not rpc.gssd is running


On May 15, 2013, at 3:50 PM, Trond Myklebust <[email protected]> wrote:

> Also fix the auth_gss pipe version detection so that it works correctly
> in the presence of namespaces.

Obviously my preference is to keep the existing krb5i -> sys logic and to try to address the upcall timeout, so this is a good direction, IMO.

After some review, I'm still a little concerned about initialization races inappropriately preventing upcalls, but time (and testing) will tell. I'm happy with this solution if you and Bruce are.

Speaking of testing, I can test this series if you think that would be valuable, but I assume that those who originally reported the timeout problem (Jeff and Steve) should be the ones to confirm whether this series addresses their concern.


> Trond Myklebust (3):
> SUNRPC: Fix a bug in gss_create_upcall
> SUNRPC: Faster detection if gssd is actually running
> SUNRPC: Convert auth_gss pipe detection to work in namespaces
>
> net/sunrpc/auth_gss/auth_gss.c | 62 ++++++++++++++++++++++++++++--------------
> net/sunrpc/netns.h | 4 +++
> net/sunrpc/rpc_pipe.c | 5 ++++
> 3 files changed, 50 insertions(+), 21 deletions(-)

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





2013-05-17 17:56:49

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH 2/3] SUNRPC: Faster detection if gssd is actually running

On Thu, 2013-05-16 at 21:03 -0400, J. Bruce Fields wrote:
> On Thu, May 16, 2013 at 04:19:54PM -0400, bfields wrote:
> > On Wed, May 15, 2013 at 12:50:40PM -0700, Trond Myklebust wrote:
> > > Recent changes to the NFS security flavour negotiation mean that
> > > we have a stronger dependency on rpc.gssd. If the latter is not
> > > running, because the user failed to start it, then we time out
> > > and mark the container as not having an instance. We then
> > > use that information to time out faster the next time.
> > >
> > > If, on the other hand, the rpc.gssd successfully binds to an rpc_pipe,
> > > then we mark the container as having an rpc.gssd instance.
> >
> > So it's still a 15 second delay on the first mount, then 7 on the
> > second, then 3, 1, and no delay thereafter. Is that right?
> >
> > Why not be harsher and go straight to 0 after the first failure?
>
> Chuck points out I'm confused, it's actually 15s then 1/4s (why 1/4s?).

The timeout has to be non-zero, otherwise if you _do_ restart rpc.gssd,
it needs a certain time to actually connect to one of the gssd
rpc_pipes.

The 15 second initial timeout is there in order to deal with the fact
that it may take a moment or 2 for init to get round to starting
rpc.gssd. I didn't want to change that value right now.

> (Apologies, somehow I saw a ">> 2" in there and my brain shut down and
> jumped to the "exponential delay" conclusion.)
>
> Still sort of curious how we're choosing these delays, though.
>
> Hard to imagine people won't still notice the delay on first mount.
> Given there's a workaround (run gssd), maybe that's just good enough for
> now, I don't know.


--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com