2012-03-21 13:52:14

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v10 0/8] nfsd: overhaul the client name tracking code

This is the tenth iteration of this patchset. The primary motivation
of this respin is to fix up some merge conflicts with some fixes that
Bruce merged recently.

For those who haven't followed along on the last few iterations, this
patchset also begins the "containerization" of nfsd. It introduces a
per-ns object that I envision growing over time as we make more of the
nfsd code namespace aware.

I've also rolled in the patch to convert the cl_cb_flags to a generic
flags field since that's a prerequisite, and added a patch to ensure
that no one tries to use the legacy client tracking code in anything but
the init_net namespace.

I'd like to see this go into 3.4 if possible...

Thanks,

Jeff Layton (8):
nfsd: convert nfs4_client->cl_cb_flags to a generic flags field
nfsd: add nfsd4_client_tracking_ops struct and a way to set it
sunrpc: create nfsd dir in rpc_pipefs
nfsd: add a per-net-namespace struct for nfsd
nfsd: add a header describing upcall to nfsdcld
nfsd: add the infrastructure to handle the cld upcall
nfsd: add notifier to handle mount/unmount of rpc_pipefs sb
nfsd: don't allow legacy client tracker init for anything but
init_net

fs/nfsd/netns.h | 35 +++
fs/nfsd/nfs4callback.c | 14 +-
fs/nfsd/nfs4proc.c | 3 +-
fs/nfsd/nfs4recover.c | 636 +++++++++++++++++++++++++++++++++++++++++++++-
fs/nfsd/nfs4state.c | 74 +++---
fs/nfsd/nfsctl.c | 22 ++-
fs/nfsd/state.h | 26 ++-
include/linux/nfsd/cld.h | 56 ++++
net/sunrpc/rpc_pipe.c | 5 +
9 files changed, 796 insertions(+), 75 deletions(-)
create mode 100644 fs/nfsd/netns.h
create mode 100644 include/linux/nfsd/cld.h

--
1.7.7.6



2012-03-23 16:11:55

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v10 3/8] sunrpc: create nfsd dir in rpc_pipefs

On Fri, 23 Mar 2012 11:53:37 -0400
Jeff Layton <[email protected]> wrote:

> On Fri, 23 Mar 2012 15:34:21 +0000
> "Myklebust, Trond" <[email protected]> wrote:
>
> > On Fri, 2012-03-23 at 11:22 -0400, J. Bruce Fields wrote:
> > > On Fri, Mar 23, 2012 at 03:20:21PM +0000, Myklebust, Trond wrote:
> > > > On Fri, 2012-03-23 at 09:31 -0400, J. Bruce Fields wrote:
> > > > > On Fri, Mar 23, 2012 at 08:12:08AM -0400, J. Bruce Fields wrote:
> > > > > > On Wed, Mar 21, 2012 at 09:52:04AM -0400, Jeff Layton wrote:
> > > > > > > Add a new top-level dir in rpc_pipefs to hold the pipe for the clientid
> > > > > > > upcall.
> > > > > >
> > > > > > After applying this patch, my tests consistently hang. The hang happens
> > > > > > in excltest (of the special connectaton tests), over nfs4.1 and krb5.
> > > > > > Looking at the wire traffic, I'm seeing DELAY returned from a setattr
> > > > > > for mode on a newly-created (with EXCLUSIVE4_1) file. That open got a
> > > > > > delegation, so presumably that's what's causing the DELAY, though I'm
> > > > > > not seeing the server send a recall. That could be a krb5 bug.
> > > > > >
> > > > > > Whatever bug there is here, it's hard to tell why this patch in
> > > > > > particular would make it more likely.
> > > > > >
> > > > > > So, still investigating!
> > > > >
> > > > > Reproduceable by:
> > > > >
> > > > > mount -osec=krb5,minorversion=1 server:/export/ /mnt/
> > > > > cp cthon04/special/excltest /mnt/
> > > > > cd /mnt
> > > > > ./excltest
> > > >
> > > > Umm... When would you ever get a DELAY in the above scenario? I can see
> > > > getting an NFS4ERR_OPENMODE, but not DELAY.
> > >
> > > There's a setattr for mode right after the open. Is that unexpected?
> >
> > Well yes, it is. The NFSv4.1 exclusive open should always be sending a
> > full set of attributes as part of the OPEN operation. The session replay
> > cache is now supposed to guarantee the only-once semantics that the
> > verifier used to provide.
> >
> > > The server doesn't really have to recall the delegation in that case (it
> > > only needs to recall *other* clients' delegations) but I don't think
> > > it's wrong to.
> >
> > Then why isn't it allowing the operation? Any sane client would normally
> > interpret NFS4ERR_DELAY to mean that the server is doing something to
> > fix whatever situation is preventing the operation from completing
> > (presumably by recalling delegations in this case). Just replying DELAY
> > and doing nothing is not helpful...
> >
>
> Yeah, this seems like a clear bug in the server code. I think it's
> replying DELAY in order to recall the delegation, but the delegation
> isn't getting recalled for some reason. We arguably don't actually need
> to recall it here, but I don't see any recall go out at all either...
>
> As to why this patch seems to uncover this bug -- that's a complete
> mystery at this point...
>

...and contrary to what Bruce has seen, I can also reproduce this when
the server is running a stock (unpatched) 3.3.0 kernel from the Fedora
rawhide repos.

--
Jeff Layton <[email protected]>

2012-03-29 14:30:31

by Matt W. Benjamin

[permalink] [raw]
Subject: Re: [PATCH] nfsd4: use auth_unix unconditionally on backchannel

Er, I meant to type, only with respect to v41.

----- "Matt W. Benjamin" <[email protected]> wrote:

> Hi,
>
> Am I correct that this limitation is only with respect to v40 (that's
> how I read the comment and the code in fs/nfs/callback.c)?
>
> Thanks,
>
> Matt
>
> ----- "J. Bruce Fields" <[email protected]> wrote:
>
> > On Wed, Mar 28, 2012 at 11:16:49PM +0000, Myklebust, Trond wrote:
> > > On Wed, 2012-03-28 at 19:09 -0400, J. Bruce Fields wrote:
> > > > This is a bandaid.
> > > >
> > > > I have a series of patches that actually implement the correct
> > behavior,
> > > > but that may not quite be ready for 3.4.
> > > >
> > > > --b.
> > > >
> > > > commit 2f026867c76171d26f003b211063ff0562097d5e
> > > > Author: J. Bruce Fields <[email protected]>
> > > > Date: Wed Mar 28 14:18:16 2012 -0400
> > > >
> > > > nfsd4: use auth_unix unconditionally on backchannel
> > > >
> > > > This isn't actually correct, but it works with the Linux
> > client, and
> > > > agrees with the behavior we used to have before commit
> > 80fc015bdfe.
> > >
> > > Question: does the Linux client ever send you anything other than
> > > AUTH_SYS credentials for the csa_sec_parms argument in
> > CREATE_SESSION?
> > > Anything other than that would be a bug, since our client doesn't
> > > actually support RPCSEC_GSS in the callback channel.
> >
> > Right, I've never seen anything else, so I think the client's
> > behaving
> > as expected.
> >
> > But the server needs to be fixed to deal with the range of possible
> > csa_sec_parms possibilities regardless.
> >
> > The only thing I find odd about the client behavior is why it even
> > bothers with auth_sys when auth_null would work just as well and be
> > even
> > slightly simpler.
> >
> > --b.
> >
> > >
> > > > Later patches will implement the spec-mandated behavior
> (which
> > is to use
> > > > the security parameters explicitly given by the client in
> > create_session
> > > > or backchannel_ctl).
> > > >
> > >
> > >
> > > --
> > > Trond Myklebust
> > > Linux NFS client maintainer
> > >
> > > NetApp
> > > [email protected]
> > > http://www.netapp.com
> > >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-nfs"
> > in
> > the body of a message to [email protected]
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
>
> --
> Matt Benjamin
> The Linux Box
> 206 South Fifth Ave. Suite 150
> Ann Arbor, MI 48104
>
> http://linuxbox.com
>
> tel. 734-761-4689
> fax. 734-769-8938
> cel. 734-216-5309

--
Matt Benjamin
The Linux Box
206 South Fifth Ave. Suite 150
Ann Arbor, MI 48104

http://linuxbox.com

tel. 734-761-4689
fax. 734-769-8938
cel. 734-216-5309

2012-03-21 13:52:23

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v10 7/8] nfsd: add notifier to handle mount/unmount of rpc_pipefs sb

In the event that rpc_pipefs isn't mounted when nfsd starts, we
must register a notifier to handle creating the dentry once it
is mounted, and to remove the dentry on unmount.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/netns.h | 1 +
fs/nfsd/nfs4recover.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
fs/nfsd/nfsctl.c | 9 ++++++++-
3 files changed, 53 insertions(+), 1 deletions(-)

diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index 12e0cff..66eac33 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -31,4 +31,5 @@ struct nfsd_net {
};

extern int nfsd_net_id;
+extern struct notifier_block nfsd4_cld_block;
#endif /* __NFSD_NETNS_H__ */
diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index cec62ed..6f13281 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -38,6 +38,7 @@
#include <linux/crypto.h>
#include <linux/sched.h>
#include <linux/fs.h>
+#include <linux/module.h>
#include <net/net_namespace.h>
#include <linux/sunrpc/rpc_pipe_fs.h>
#include <linux/sunrpc/clnt.h>
@@ -982,3 +983,46 @@ nfsd4_record_grace_done(struct net *net, time_t boot_time)
if (client_tracking_ops)
client_tracking_ops->grace_done(net, boot_time);
}
+
+static int
+rpc_pipefs_event(struct notifier_block *nb, unsigned long event, void *ptr)
+{
+ struct super_block *sb = ptr;
+ struct net *net = sb->s_fs_info;
+ struct nfsd_net *nn = net_generic(net, nfsd_net_id);
+ struct cld_net *cn = nn->cld_net;
+ struct dentry *dentry;
+ int ret = 0;
+
+ if (!try_module_get(THIS_MODULE))
+ return 0;
+
+ if (!cn) {
+ module_put(THIS_MODULE);
+ return 0;
+ }
+
+ switch (event) {
+ case RPC_PIPEFS_MOUNT:
+ dentry = nfsd4_cld_register_sb(sb, cn->cn_pipe);
+ if (IS_ERR(dentry)) {
+ ret = PTR_ERR(dentry);
+ break;
+ }
+ cn->cn_pipe->dentry = dentry;
+ break;
+ case RPC_PIPEFS_UMOUNT:
+ if (cn->cn_pipe->dentry)
+ nfsd4_cld_unregister_sb(cn->cn_pipe);
+ break;
+ default:
+ ret = -ENOTSUPP;
+ break;
+ }
+ module_put(THIS_MODULE);
+ return ret;
+}
+
+struct notifier_block nfsd4_cld_block = {
+ .notifier_call = rpc_pipefs_event,
+};
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 141197e..dee6c1b 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -13,6 +13,7 @@
#include <linux/sunrpc/clnt.h>
#include <linux/sunrpc/gss_api.h>
#include <linux/sunrpc/gss_krb5_enctypes.h>
+#include <linux/sunrpc/rpc_pipe_fs.h>
#include <linux/module.h>

#include "idmap.h"
@@ -1136,9 +1137,12 @@ static int __init init_nfsd(void)
int retval;
printk(KERN_INFO "Installing knfsd (copyright (C) 1996 [email protected]).\n");

+ retval = rpc_pipefs_notifier_register(&nfsd4_cld_block);
+ if (retval)
+ return retval;
retval = register_pernet_subsys(&nfsd_net_ops);
if (retval < 0)
- return retval;
+ goto out_unregister_notifier;
retval = nfsd4_init_slabs();
if (retval)
goto out_unregister_pernet;
@@ -1181,6 +1185,8 @@ out_free_slabs:
nfsd4_free_slabs();
out_unregister_pernet:
unregister_pernet_subsys(&nfsd_net_ops);
+out_unregister_notifier:
+ rpc_pipefs_notifier_unregister(&nfsd4_cld_block);
return retval;
}

@@ -1197,6 +1203,7 @@ static void __exit exit_nfsd(void)
nfsd_fault_inject_cleanup();
unregister_filesystem(&nfsd_fs_type);
unregister_pernet_subsys(&nfsd_net_ops);
+ rpc_pipefs_notifier_unregister(&nfsd4_cld_block);
}

MODULE_AUTHOR("Olaf Kirch <[email protected]>");
--
1.7.7.6


2012-03-21 21:05:36

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v10 1/8] nfsd: convert nfs4_client->cl_cb_flags to a generic flags field

On Wed, Mar 21, 2012 at 04:52:44PM -0400, Jeff Layton wrote:
> On Wed, 21 Mar 2012 16:41:42 -0400
> "J. Bruce Fields" <[email protected]> wrote:
>
> > On Wed, Mar 21, 2012 at 09:52:02AM -0400, Jeff Layton wrote:
> > > @@ -2779,12 +2780,6 @@ static void
> > > nfs4_set_claim_prev(struct nfsd4_open *open, bool has_session)
> > > {
> > > open->op_openowner->oo_flags |= NFS4_OO_CONFIRMED;
> > > - /*
> > > - * On a 4.1+ client, we don't create a state record for a client
> > > - * until it performs RECLAIM_COMPLETE:
> > > - */
> > > - if (!has_session)
> > > - open->op_openowner->oo_owner.so_client->cl_firststate = 1;
> >
> > That doesn't look right. Sure you didn't mean to just make that a
> > set_bit, at least for now?
> >
> > --b.
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> > the body of a message to [email protected]
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
>
> No. I think that's correct. Once the RECLAIM_COMPLETE is done, we'll
> create a client record and then the NFSD4_CLIENT_STABLE flag will be
> set. We clearly don't want to set that flag until the client is
> actually recorded onto stable storage.
>
> One of the goals of this patch was to get rid of cl_firststate which
> had a very ambiguous meaning. NFSD4_CLIENT_STABLE now means that the
> client is recorded on stable storage, full stop.
>
> Did I miss some subtlety that we'll need to account for in the state
> model here?

Oh, I see, I missed that nfsd4_client_record_check is setting the flag
where I think it wouldn't have been before.

This does look cleaner, thanks!

--b.

2012-03-21 13:52:15

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v10 1/8] nfsd: convert nfs4_client->cl_cb_flags to a generic flags field

We'll need a way to flag the nfs4_client as already being recorded on
stable storage so that we don't continually upcall. Currently, that's
recorded in the cl_firststate field of the client struct. Using an
entire u32 to store a flag is rather wasteful though.

The cl_cb_flags field is only using 2 bits right now, so repurpose that
to a generic flags field. Rename NFSD4_CLIENT_KILL to
NFSD4_CLIENT_CB_KILL to make it evident that it's part of the callback
flags. Add a mask that we can use for existing checks that look to see
whether any flags are set, so that the new flags don't interfere.

Convert all references to cl_firstate to the NFSD4_CLIENT_STABLE flag,
and add a new NFSD4_CLIENT_RECLAIM_COMPLETE flag.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4callback.c | 14 +++++++-------
fs/nfsd/nfs4proc.c | 3 ++-
fs/nfsd/nfs4recover.c | 7 +++----
fs/nfsd/nfs4state.c | 45 ++++++++++++++++++++++++++++-----------------
fs/nfsd/state.h | 11 +++++++----
5 files changed, 47 insertions(+), 33 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 0e262f3..a09dcc4 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -754,9 +754,9 @@ static void do_probe_callback(struct nfs4_client *clp)
*/
void nfsd4_probe_callback(struct nfs4_client *clp)
{
- /* XXX: atomicity? Also, should we be using cl_cb_flags? */
+ /* XXX: atomicity? Also, should we be using cl_flags? */
clp->cl_cb_state = NFSD4_CB_UNKNOWN;
- set_bit(NFSD4_CLIENT_CB_UPDATE, &clp->cl_cb_flags);
+ set_bit(NFSD4_CLIENT_CB_UPDATE, &clp->cl_flags);
do_probe_callback(clp);
}

@@ -915,7 +915,7 @@ void nfsd4_destroy_callback_queue(void)
/* must be called under the state lock */
void nfsd4_shutdown_callback(struct nfs4_client *clp)
{
- set_bit(NFSD4_CLIENT_KILL, &clp->cl_cb_flags);
+ set_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags);
/*
* Note this won't actually result in a null callback;
* instead, nfsd4_do_callback_rpc() will detect the killed
@@ -966,15 +966,15 @@ static void nfsd4_process_cb_update(struct nfsd4_callback *cb)
svc_xprt_put(clp->cl_cb_conn.cb_xprt);
clp->cl_cb_conn.cb_xprt = NULL;
}
- if (test_bit(NFSD4_CLIENT_KILL, &clp->cl_cb_flags))
+ if (test_bit(NFSD4_CLIENT_CB_KILL, &clp->cl_flags))
return;
spin_lock(&clp->cl_lock);
/*
* Only serialized callback code is allowed to clear these
* flags; main nfsd code can only set them:
*/
- BUG_ON(!clp->cl_cb_flags);
- clear_bit(NFSD4_CLIENT_CB_UPDATE, &clp->cl_cb_flags);
+ BUG_ON(!(clp->cl_flags & NFSD4_CLIENT_CB_FLAG_MASK));
+ clear_bit(NFSD4_CLIENT_CB_UPDATE, &clp->cl_flags);
memcpy(&conn, &cb->cb_clp->cl_cb_conn, sizeof(struct nfs4_cb_conn));
c = __nfsd4_find_backchannel(clp);
if (c) {
@@ -1000,7 +1000,7 @@ void nfsd4_do_callback_rpc(struct work_struct *w)
struct nfs4_client *clp = cb->cb_clp;
struct rpc_clnt *clnt;

- if (clp->cl_cb_flags)
+ if (clp->cl_flags & NFSD4_CLIENT_CB_FLAG_MASK)
nfsd4_process_cb_update(cb);

clnt = clp->cl_cb_client;
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 896da74..af2c3e8 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -319,7 +319,8 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
* Before RECLAIM_COMPLETE done, server should deny new lock
*/
if (nfsd4_has_session(cstate) &&
- !cstate->session->se_client->cl_firststate &&
+ !test_bit(NFSD4_CLIENT_RECLAIM_COMPLETE,
+ &cstate->session->se_client->cl_flags) &&
open->op_claim_type != NFS4_OPEN_CLAIM_PREVIOUS)
return nfserr_grace;

diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index 0b3e875..6523809 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -126,9 +126,8 @@ void nfsd4_create_clid_dir(struct nfs4_client *clp)

dprintk("NFSD: nfsd4_create_clid_dir for \"%s\"\n", dname);

- if (clp->cl_firststate)
+ if (test_and_set_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
return;
- clp->cl_firststate = 1;
if (!rec_file)
return;
status = nfs4_save_creds(&original_cred);
@@ -271,13 +270,13 @@ nfsd4_remove_clid_dir(struct nfs4_client *clp)
const struct cred *original_cred;
int status;

- if (!rec_file || !clp->cl_firststate)
+ if (!rec_file || !test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
return;

status = mnt_want_write_file(rec_file);
if (status)
goto out;
- clp->cl_firststate = 0;
+ clear_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags);

status = nfs4_save_creds(&original_cred);
if (status < 0)
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 8b20707..534f82f1 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2030,7 +2030,8 @@ nfsd4_reclaim_complete(struct svc_rqst *rqstp, struct nfsd4_compound_state *csta

nfs4_lock_state();
status = nfserr_complete_already;
- if (cstate->session->se_client->cl_firststate)
+ if (test_and_set_bit(NFSD4_CLIENT_RECLAIM_COMPLETE,
+ &cstate->session->se_client->cl_flags))
goto out;

status = nfserr_stale_clientid;
@@ -2779,12 +2780,6 @@ static void
nfs4_set_claim_prev(struct nfsd4_open *open, bool has_session)
{
open->op_openowner->oo_flags |= NFS4_OO_CONFIRMED;
- /*
- * On a 4.1+ client, we don't create a state record for a client
- * until it performs RECLAIM_COMPLETE:
- */
- if (!has_session)
- open->op_openowner->oo_owner.so_client->cl_firststate = 1;
}

/* Should we give out recallable state?: */
@@ -4360,7 +4355,7 @@ nfs4_has_reclaimed_state(const char *name, bool use_exchange_id)
clp = find_confirmed_client_by_str(name, strhashval);
if (!clp)
return 0;
- return clp->cl_firststate;
+ return test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags);
}

/*
@@ -4405,18 +4400,11 @@ nfs4_release_reclaim(void)
/*
* called from OPEN, CLAIM_PREVIOUS with a new clientid. */
static struct nfs4_client_reclaim *
-nfs4_find_reclaim_client(clientid_t *clid)
+nfsd4_find_reclaim_client(struct nfs4_client *clp)
{
unsigned int strhashval;
- struct nfs4_client *clp;
struct nfs4_client_reclaim *crp = NULL;

-
- /* find clientid in conf_id_hashtbl */
- clp = find_confirmed_client(clid);
- if (clp == NULL)
- return NULL;
-
dprintk("NFSD: nfs4_find_reclaim_client for %.*s with recdir %s\n",
clp->cl_name.len, clp->cl_name.data,
clp->cl_recdir);
@@ -4431,13 +4419,36 @@ nfs4_find_reclaim_client(clientid_t *clid)
return NULL;
}

+static int
+nfsd4_client_record_check(struct nfs4_client *clp)
+{
+ /* did we already find that this client is stable? */
+ if (test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
+ return 0;
+
+ /* look for it in the reclaim hashtable otherwise */
+ if (nfsd4_find_reclaim_client(clp)) {
+ set_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags);
+ return 0;
+ }
+
+ return -ENOENT;
+}
+
/*
* Called from OPEN. Look for clientid in reclaim list.
*/
__be32
nfs4_check_open_reclaim(clientid_t *clid)
{
- return nfs4_find_reclaim_client(clid) ? nfs_ok : nfserr_reclaim_bad;
+ struct nfs4_client *clp;
+
+ /* find clientid in conf_id_hashtbl */
+ clp = find_confirmed_client(clid);
+ if (clp == NULL)
+ return nfserr_reclaim_bad;
+
+ return nfsd4_client_record_check(clp) ? nfserr_reclaim_bad : nfs_ok;
}

#ifdef CONFIG_NFSD_FAULT_INJECTION
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index ffb5df1..b9c036d 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -245,14 +245,17 @@ struct nfs4_client {
struct svc_cred cl_cred; /* setclientid principal */
clientid_t cl_clientid; /* generated by server */
nfs4_verifier cl_confirm; /* generated by server */
- u32 cl_firststate; /* recovery dir creation */
u32 cl_minorversion;

/* for v4.0 and v4.1 callbacks: */
struct nfs4_cb_conn cl_cb_conn;
-#define NFSD4_CLIENT_CB_UPDATE 1
-#define NFSD4_CLIENT_KILL 2
- unsigned long cl_cb_flags;
+#define NFSD4_CLIENT_CB_UPDATE (0)
+#define NFSD4_CLIENT_CB_KILL (1)
+#define NFSD4_CLIENT_STABLE (2) /* client on stable storage */
+#define NFSD4_CLIENT_RECLAIM_COMPLETE (3) /* reclaim_complete done */
+#define NFSD4_CLIENT_CB_FLAG_MASK (1 << NFSD4_CLIENT_CB_UPDATE | \
+ 1 << NFSD4_CLIENT_CB_KILL)
+ unsigned long cl_flags;
struct rpc_clnt *cl_cb_client;
u32 cl_cb_ident;
#define NFSD4_CB_UP 0
--
1.7.7.6


2012-03-23 16:00:06

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v10 3/8] sunrpc: create nfsd dir in rpc_pipefs

On Fri, Mar 23, 2012 at 03:34:21PM +0000, Myklebust, Trond wrote:
> On Fri, 2012-03-23 at 11:22 -0400, J. Bruce Fields wrote:
> > On Fri, Mar 23, 2012 at 03:20:21PM +0000, Myklebust, Trond wrote:
> > > On Fri, 2012-03-23 at 09:31 -0400, J. Bruce Fields wrote:
> > > > On Fri, Mar 23, 2012 at 08:12:08AM -0400, J. Bruce Fields wrote:
> > > > > On Wed, Mar 21, 2012 at 09:52:04AM -0400, Jeff Layton wrote:
> > > > > > Add a new top-level dir in rpc_pipefs to hold the pipe for the clientid
> > > > > > upcall.
> > > > >
> > > > > After applying this patch, my tests consistently hang. The hang happens
> > > > > in excltest (of the special connectaton tests), over nfs4.1 and krb5.
> > > > > Looking at the wire traffic, I'm seeing DELAY returned from a setattr
> > > > > for mode on a newly-created (with EXCLUSIVE4_1) file. That open got a
> > > > > delegation, so presumably that's what's causing the DELAY, though I'm
> > > > > not seeing the server send a recall. That could be a krb5 bug.
> > > > >
> > > > > Whatever bug there is here, it's hard to tell why this patch in
> > > > > particular would make it more likely.
> > > > >
> > > > > So, still investigating!
> > > >
> > > > Reproduceable by:
> > > >
> > > > mount -osec=krb5,minorversion=1 server:/export/ /mnt/
> > > > cp cthon04/special/excltest /mnt/
> > > > cd /mnt
> > > > ./excltest
> > >
> > > Umm... When would you ever get a DELAY in the above scenario? I can see
> > > getting an NFS4ERR_OPENMODE, but not DELAY.
> >
> > There's a setattr for mode right after the open. Is that unexpected?
>
> Well yes, it is. The NFSv4.1 exclusive open should always be sending a
> full set of attributes as part of the OPEN operation. The session replay
> cache is now supposed to guarantee the only-once semantics that the
> verifier used to provide.

Looking at the trace.... The client is passing a zero attribute set on
the EXCLUSIVE4_1 open.

Hm, I wonder if our support for suppattr_exclreat has a bug. On a quick
check, the code looks like it should do the right thing.

> > The server doesn't really have to recall the delegation in that case (it
> > only needs to recall *other* clients' delegations) but I don't think
> > it's wrong to.
>
> Then why isn't it allowing the operation? Any sane client would normally
> interpret NFS4ERR_DELAY to mean that the server is doing something to
> fix whatever situation is preventing the operation from completing
> (presumably by recalling delegations in this case). Just replying DELAY
> and doing nothing is not helpful...

Yes, there's a backchannel bug of some kind.

Actually I doubt the server's 4.1 krb5 implementation handles the
backchannel correctly at all.

--b.

2012-03-23 13:31:12

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v10 3/8] sunrpc: create nfsd dir in rpc_pipefs

On Fri, Mar 23, 2012 at 08:12:08AM -0400, J. Bruce Fields wrote:
> On Wed, Mar 21, 2012 at 09:52:04AM -0400, Jeff Layton wrote:
> > Add a new top-level dir in rpc_pipefs to hold the pipe for the clientid
> > upcall.
>
> After applying this patch, my tests consistently hang. The hang happens
> in excltest (of the special connectaton tests), over nfs4.1 and krb5.
> Looking at the wire traffic, I'm seeing DELAY returned from a setattr
> for mode on a newly-created (with EXCLUSIVE4_1) file. That open got a
> delegation, so presumably that's what's causing the DELAY, though I'm
> not seeing the server send a recall. That could be a krb5 bug.
>
> Whatever bug there is here, it's hard to tell why this patch in
> particular would make it more likely.
>
> So, still investigating!

Reproduceable by:

mount -osec=krb5,minorversion=1 server:/export/ /mnt/
cp cthon04/special/excltest /mnt/
cd /mnt
./excltest

--b.

2012-03-26 20:02:13

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v10 0/8] nfsd: overhaul the client name tracking code

On Fri, Mar 23, 2012 at 01:26:18PM -0400, Jeff Layton wrote:
> On Fri, 23 Mar 2012 13:06:30 -0400
> "J. Bruce Fields" <[email protected]> wrote:
>
> > On Wed, Mar 21, 2012 at 09:52:01AM -0400, Jeff Layton wrote:
> > > This is the tenth iteration of this patchset. The primary motivation
> > > of this respin is to fix up some merge conflicts with some fixes that
> > > Bruce merged recently.
> >
> > Thanks, these look fine. I'm intending to merge for 3.4. But I going
> > want to spend some more time investigating the callback bug that
> > excltest is triggering before committing to my for-3.4 branch.
> >
> > --b.
> >
>
> Thanks, and totally understood. No need to muddy the waters by merging new code...

Having looked at it longer: first, I can't see how 4.1/krb5 callbacks
ever really worked. That's a project for another day. (Soon, but
probably not for 3.4.)

Second, after further testing it turns out that the failure isn't always
reproduceable after applying these patches, and is sometimes
reproduceable without.

Also, the patches look fine to me.

So, I'm pushing them out soon and should be sending Linus a pull request
in the next couple days.

--b.

2012-03-28 23:46:20

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH] nfsd4: use auth_unix unconditionally on backchannel

On Wed, Mar 28, 2012 at 11:16:49PM +0000, Myklebust, Trond wrote:
> On Wed, 2012-03-28 at 19:09 -0400, J. Bruce Fields wrote:
> > This is a bandaid.
> >
> > I have a series of patches that actually implement the correct behavior,
> > but that may not quite be ready for 3.4.
> >
> > --b.
> >
> > commit 2f026867c76171d26f003b211063ff0562097d5e
> > Author: J. Bruce Fields <[email protected]>
> > Date: Wed Mar 28 14:18:16 2012 -0400
> >
> > nfsd4: use auth_unix unconditionally on backchannel
> >
> > This isn't actually correct, but it works with the Linux client, and
> > agrees with the behavior we used to have before commit 80fc015bdfe.
>
> Question: does the Linux client ever send you anything other than
> AUTH_SYS credentials for the csa_sec_parms argument in CREATE_SESSION?
> Anything other than that would be a bug, since our client doesn't
> actually support RPCSEC_GSS in the callback channel.

Right, I've never seen anything else, so I think the client's behaving
as expected.

But the server needs to be fixed to deal with the range of possible
csa_sec_parms possibilities regardless.

The only thing I find odd about the client behavior is why it even
bothers with auth_sys when auth_null would work just as well and be even
slightly simpler.

--b.

>
> > Later patches will implement the spec-mandated behavior (which is to use
> > the security parameters explicitly given by the client in create_session
> > or backchannel_ctl).
> >
>
>
> --
> Trond Myklebust
> Linux NFS client maintainer
>
> NetApp
> [email protected]
> http://www.netapp.com
>

2012-03-29 14:29:46

by Matt W. Benjamin

[permalink] [raw]
Subject: Re: [PATCH] nfsd4: use auth_unix unconditionally on backchannel

Hi,

Am I correct that this limitation is only with respect to v40 (that's how I read the comment and the code in fs/nfs/callback.c)?

Thanks,

Matt

----- "J. Bruce Fields" <[email protected]> wrote:

> On Wed, Mar 28, 2012 at 11:16:49PM +0000, Myklebust, Trond wrote:
> > On Wed, 2012-03-28 at 19:09 -0400, J. Bruce Fields wrote:
> > > This is a bandaid.
> > >
> > > I have a series of patches that actually implement the correct
> behavior,
> > > but that may not quite be ready for 3.4.
> > >
> > > --b.
> > >
> > > commit 2f026867c76171d26f003b211063ff0562097d5e
> > > Author: J. Bruce Fields <[email protected]>
> > > Date: Wed Mar 28 14:18:16 2012 -0400
> > >
> > > nfsd4: use auth_unix unconditionally on backchannel
> > >
> > > This isn't actually correct, but it works with the Linux
> client, and
> > > agrees with the behavior we used to have before commit
> 80fc015bdfe.
> >
> > Question: does the Linux client ever send you anything other than
> > AUTH_SYS credentials for the csa_sec_parms argument in
> CREATE_SESSION?
> > Anything other than that would be a bug, since our client doesn't
> > actually support RPCSEC_GSS in the callback channel.
>
> Right, I've never seen anything else, so I think the client's
> behaving
> as expected.
>
> But the server needs to be fixed to deal with the range of possible
> csa_sec_parms possibilities regardless.
>
> The only thing I find odd about the client behavior is why it even
> bothers with auth_sys when auth_null would work just as well and be
> even
> slightly simpler.
>
> --b.
>
> >
> > > Later patches will implement the spec-mandated behavior (which
> is to use
> > > the security parameters explicitly given by the client in
> create_session
> > > or backchannel_ctl).
> > >
> >
> >
> > --
> > Trond Myklebust
> > Linux NFS client maintainer
> >
> > NetApp
> > [email protected]
> > http://www.netapp.com
> >
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs"
> in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
Matt Benjamin
The Linux Box
206 South Fifth Ave. Suite 150
Ann Arbor, MI 48104

http://linuxbox.com

tel. 734-761-4689
fax. 734-769-8938
cel. 734-216-5309

2012-03-29 14:48:45

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH] nfsd4: use auth_unix unconditionally on backchannel

On Thu, Mar 29, 2012 at 10:29:32AM -0400, Matt W. Benjamin wrote:
> Am I correct that this limitation is only with respect to v40 (that's how I read the comment and the code in fs/nfs/callback.c)?

I'm not sure what "limitation" you mean exactly.... The way the spec
works is (from memory, someone correct me if I screw up):

- In the 4.0 case, the server's callbacks use the same flavor as
was used on the setclientid.
- In the 4.1 case, the server's callbacks use the flavor
specified in the csa_sec_parms field in a create_session or
backchannel_ctl.

In the 4.1 case the client always requests auth_unix on the backchannel.
That is the client's right, and is an implementation choice based on the
assumption that the amount of mischief somebody could perform by reading
(or spoofing) callbacks is limited.

The Linux server correctly implements the 4.0 case, but in the 4.1 case
(after this patch, and before my earlier mistake in 80fc015bdfe) it
always uses auth_unix. That happens to satisfy the linux client, but
isn't really correct, as it is perfectly legal for a client to request
something other than auth_unix, and the Linux server would currently
fail to interoperate with such a client.

--b.

>
> Thanks,
>
> Matt
>
> ----- "J. Bruce Fields" <[email protected]> wrote:
>
> > On Wed, Mar 28, 2012 at 11:16:49PM +0000, Myklebust, Trond wrote:
> > > On Wed, 2012-03-28 at 19:09 -0400, J. Bruce Fields wrote:
> > > > This is a bandaid.
> > > >
> > > > I have a series of patches that actually implement the correct
> > behavior,
> > > > but that may not quite be ready for 3.4.
> > > >
> > > > --b.
> > > >
> > > > commit 2f026867c76171d26f003b211063ff0562097d5e
> > > > Author: J. Bruce Fields <[email protected]>
> > > > Date: Wed Mar 28 14:18:16 2012 -0400
> > > >
> > > > nfsd4: use auth_unix unconditionally on backchannel
> > > >
> > > > This isn't actually correct, but it works with the Linux
> > client, and
> > > > agrees with the behavior we used to have before commit
> > 80fc015bdfe.
> > >
> > > Question: does the Linux client ever send you anything other than
> > > AUTH_SYS credentials for the csa_sec_parms argument in
> > CREATE_SESSION?
> > > Anything other than that would be a bug, since our client doesn't
> > > actually support RPCSEC_GSS in the callback channel.
> >
> > Right, I've never seen anything else, so I think the client's
> > behaving
> > as expected.
> >
> > But the server needs to be fixed to deal with the range of possible
> > csa_sec_parms possibilities regardless.
> >
> > The only thing I find odd about the client behavior is why it even
> > bothers with auth_sys when auth_null would work just as well and be
> > even
> > slightly simpler.
> >
> > --b.
> >
> > >
> > > > Later patches will implement the spec-mandated behavior (which
> > is to use
> > > > the security parameters explicitly given by the client in
> > create_session
> > > > or backchannel_ctl).
> > > >
> > >
> > >
> > > --
> > > Trond Myklebust
> > > Linux NFS client maintainer
> > >
> > > NetApp
> > > [email protected]
> > > http://www.netapp.com
> > >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-nfs"
> > in
> > the body of a message to [email protected]
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
>
> --
> Matt Benjamin
> The Linux Box
> 206 South Fifth Ave. Suite 150
> Ann Arbor, MI 48104
>
> http://linuxbox.com
>
> tel. 734-761-4689
> fax. 734-769-8938
> cel. 734-216-5309

2012-03-27 15:06:58

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v10 0/8] nfsd: overhaul the client name tracking code

On Mon, Mar 26, 2012 at 04:02:12PM -0400, J. Bruce Fields wrote:
> Having looked at it longer: first, I can't see how 4.1/krb5 callbacks
> ever really worked. That's a project for another day. (Soon, but
> probably not for 3.4.)

Bah, I'm stupid, I'd forgotten how 4.1 backchannel security works: the
client chooses which flavor(s) are acceptable in create_session (or the
mandatory but unimplemented backchannel_ct). The Linux client always
chooses auth_sys. We've never really paid much attention to the client.
Before we basically just used auth_sys no matter what. Now we're using
krb5 in the krb5 case. Both are wrong, but the latter also breaks in
practice against the Linux client.

I think I changed the behavior accidentally while overhauling the 4.1
server's callback and trunking behavior, probably with 80fc015bdfe
"nfsd4: use common rpc_cred for all callbacks".

I'll look into doing this a little more correctly....

--b.

2012-03-23 17:25:51

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v10 0/8] nfsd: overhaul the client name tracking code

On Fri, 23 Mar 2012 13:06:30 -0400
"J. Bruce Fields" <[email protected]> wrote:

> On Wed, Mar 21, 2012 at 09:52:01AM -0400, Jeff Layton wrote:
> > This is the tenth iteration of this patchset. The primary motivation
> > of this respin is to fix up some merge conflicts with some fixes that
> > Bruce merged recently.
>
> Thanks, these look fine. I'm intending to merge for 3.4. But I going
> want to spend some more time investigating the callback bug that
> excltest is triggering before committing to my for-3.4 branch.
>
> --b.
>

Thanks, and totally understood. No need to muddy the waters by merging new code...

> >
> > For those who haven't followed along on the last few iterations, this
> > patchset also begins the "containerization" of nfsd. It introduces a
> > per-ns object that I envision growing over time as we make more of the
> > nfsd code namespace aware.
> >
> > I've also rolled in the patch to convert the cl_cb_flags to a generic
> > flags field since that's a prerequisite, and added a patch to ensure
> > that no one tries to use the legacy client tracking code in anything but
> > the init_net namespace.
> >
> > I'd like to see this go into 3.4 if possible...
> >
> > Thanks,
> >
> > Jeff Layton (8):
> > nfsd: convert nfs4_client->cl_cb_flags to a generic flags field
> > nfsd: add nfsd4_client_tracking_ops struct and a way to set it
> > sunrpc: create nfsd dir in rpc_pipefs
> > nfsd: add a per-net-namespace struct for nfsd
> > nfsd: add a header describing upcall to nfsdcld
> > nfsd: add the infrastructure to handle the cld upcall
> > nfsd: add notifier to handle mount/unmount of rpc_pipefs sb
> > nfsd: don't allow legacy client tracker init for anything but
> > init_net
> >
> > fs/nfsd/netns.h | 35 +++
> > fs/nfsd/nfs4callback.c | 14 +-
> > fs/nfsd/nfs4proc.c | 3 +-
> > fs/nfsd/nfs4recover.c | 636 +++++++++++++++++++++++++++++++++++++++++++++-
> > fs/nfsd/nfs4state.c | 74 +++---
> > fs/nfsd/nfsctl.c | 22 ++-
> > fs/nfsd/state.h | 26 ++-
> > include/linux/nfsd/cld.h | 56 ++++
> > net/sunrpc/rpc_pipe.c | 5 +
> > 9 files changed, 796 insertions(+), 75 deletions(-)
> > create mode 100644 fs/nfsd/netns.h
> > create mode 100644 include/linux/nfsd/cld.h
> >
> > --
> > 1.7.7.6
> >


--
Jeff Layton <[email protected]>

2012-03-21 13:52:19

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v10 4/8] nfsd: add a per-net-namespace struct for nfsd

Eventually, we'll need this when nfsd gets containerized fully. For
now, create a struct on a per-net-namespace basis that will just hold
a pointer to the cld_net structure. That struct will hold all of the
per-net data that we need for the cld tracker.

Eventually we can add other pernet objects to struct nfsd_net.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/netns.h | 34 ++++++++++++++++++++++++++++++++++
fs/nfsd/nfsctl.c | 15 ++++++++++++++-
2 files changed, 48 insertions(+), 1 deletions(-)
create mode 100644 fs/nfsd/netns.h

diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
new file mode 100644
index 0000000..12e0cff
--- /dev/null
+++ b/fs/nfsd/netns.h
@@ -0,0 +1,34 @@
+/*
+ * per net namespace data structures for nfsd
+ *
+ * Copyright (C) 2012, Jeff Layton <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or (at your option)
+ * any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 51
+ * Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+
+#ifndef __NFSD_NETNS_H__
+#define __NFSD_NETNS_H__
+
+#include <net/net_namespace.h>
+#include <net/netns/generic.h>
+
+struct cld_net;
+
+struct nfsd_net {
+ struct cld_net *cld_net;
+};
+
+extern int nfsd_net_id;
+#endif /* __NFSD_NETNS_H__ */
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 64c24af..141197e 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -19,6 +19,7 @@
#include "nfsd.h"
#include "cache.h"
#include "fault_inject.h"
+#include "netns.h"

/*
* We have a single directory with several nodes in it.
@@ -1124,14 +1125,23 @@ static int create_proc_exports_entry(void)
}
#endif

+int nfsd_net_id;
+static struct pernet_operations nfsd_net_ops = {
+ .id = &nfsd_net_id,
+ .size = sizeof(struct nfsd_net),
+};
+
static int __init init_nfsd(void)
{
int retval;
printk(KERN_INFO "Installing knfsd (copyright (C) 1996 [email protected]).\n");

+ retval = register_pernet_subsys(&nfsd_net_ops);
+ if (retval < 0)
+ return retval;
retval = nfsd4_init_slabs();
if (retval)
- return retval;
+ goto out_unregister_pernet;
nfs4_state_init();
retval = nfsd_fault_inject_init(); /* nfsd fault injection controls */
if (retval)
@@ -1169,6 +1179,8 @@ out_free_stat:
nfsd_fault_inject_cleanup();
out_free_slabs:
nfsd4_free_slabs();
+out_unregister_pernet:
+ unregister_pernet_subsys(&nfsd_net_ops);
return retval;
}

@@ -1184,6 +1196,7 @@ static void __exit exit_nfsd(void)
nfsd4_free_slabs();
nfsd_fault_inject_cleanup();
unregister_filesystem(&nfsd_fs_type);
+ unregister_pernet_subsys(&nfsd_net_ops);
}

MODULE_AUTHOR("Olaf Kirch <[email protected]>");
--
1.7.7.6


2012-03-21 13:52:24

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v10 8/8] nfsd: don't allow legacy client tracker init for anything but init_net

This code isn't set up for containers, so don't allow it to be
used for anything but init_net.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4recover.c | 7 +++++++
1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index 6f13281..21b53d5 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -416,6 +416,13 @@ nfsd4_load_reboot_recovery_data(struct net *net)
{
int status;

+ /* XXX: The legacy code won't work in a container */
+ if (net != &init_net) {
+ WARN(1, KERN_ERR "NFSD: attempt to initialize legacy client "
+ "tracking in a container!\n");
+ return -EINVAL;
+ }
+
nfs4_lock_state();
status = nfsd4_init_recdir();
if (!status)
--
1.7.7.6


2012-03-21 13:52:22

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v10 6/8] nfsd: add the infrastructure to handle the cld upcall

...and add a mechanism for switching between the "legacy" tracker and
the new one. The decision is made by looking to see whether the
v4recoverydir exists. If it does, then the legacy client tracker is
used.

If it's not, then the kernel will create a "cld" pipe in rpc_pipefs.
That pipe is used to talk to a daemon for handling the upcall.

Most of the data structures for the new client tracker are handled on a
per-namespace basis, so this upcall should be essentially ready for
containerization. For now however, nfsd just starts it by calling the
initialization and exit functions for init_net.

I'm making the assumption that at some point in the future we'll be able
to determine the net namespace from the nfs4_client. Until then, this
patch hardcodes init_net in those places. I've sprinkled some "FIXME"
comments around that code to attempt to make it clear where we'll need
to fix that up later.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4recover.c | 444 ++++++++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 443 insertions(+), 1 deletions(-)

diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index e616f88..cec62ed 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -1,5 +1,6 @@
/*
* Copyright (c) 2004 The Regents of the University of Michigan.
+* Copyright (c) 2012 Jeff Layton <[email protected]>
* All rights reserved.
*
* Andy Adamson <[email protected]>
@@ -36,10 +37,16 @@
#include <linux/namei.h>
#include <linux/crypto.h>
#include <linux/sched.h>
+#include <linux/fs.h>
+#include <net/net_namespace.h>
+#include <linux/sunrpc/rpc_pipe_fs.h>
+#include <linux/sunrpc/clnt.h>
+#include <linux/nfsd/cld.h>

#include "nfsd.h"
#include "state.h"
#include "vfs.h"
+#include "netns.h"

#define NFSDDBG_FACILITY NFSDDBG_PROC

@@ -486,12 +493,447 @@ static struct nfsd4_client_tracking_ops nfsd4_legacy_tracking_ops = {
.grace_done = nfsd4_recdir_purge_old,
};

+/* Globals */
+#define NFSD_PIPE_DIR "nfsd"
+#define NFSD_CLD_PIPE "cld"
+
+/* per-net-ns structure for holding cld upcall info */
+struct cld_net {
+ struct rpc_pipe *cn_pipe;
+ spinlock_t cn_lock;
+ struct list_head cn_list;
+ unsigned int cn_xid;
+};
+
+struct cld_upcall {
+ struct list_head cu_list;
+ struct cld_net *cu_net;
+ struct task_struct *cu_task;
+ struct cld_msg cu_msg;
+};
+
+static int
+__cld_pipe_upcall(struct rpc_pipe *pipe, struct cld_msg *cmsg)
+{
+ int ret;
+ struct rpc_pipe_msg msg;
+
+ memset(&msg, 0, sizeof(msg));
+ msg.data = cmsg;
+ msg.len = sizeof(*cmsg);
+
+ /*
+ * Set task state before we queue the upcall. That prevents
+ * wake_up_process in the downcall from racing with schedule.
+ */
+ set_current_state(TASK_UNINTERRUPTIBLE);
+ ret = rpc_queue_upcall(pipe, &msg);
+ if (ret < 0) {
+ set_current_state(TASK_RUNNING);
+ goto out;
+ }
+
+ schedule();
+ set_current_state(TASK_RUNNING);
+
+ if (msg.errno < 0)
+ ret = msg.errno;
+out:
+ return ret;
+}
+
+static int
+cld_pipe_upcall(struct rpc_pipe *pipe, struct cld_msg *cmsg)
+{
+ int ret;
+
+ /*
+ * -EAGAIN occurs when pipe is closed and reopened while there are
+ * upcalls queued.
+ */
+ do {
+ ret = __cld_pipe_upcall(pipe, cmsg);
+ } while (ret == -EAGAIN);
+
+ return ret;
+}
+
+static ssize_t
+cld_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
+{
+ struct cld_upcall *tmp, *cup;
+ struct cld_msg *cmsg = (struct cld_msg *)src;
+ uint32_t xid;
+ struct nfsd_net *nn = net_generic(filp->f_dentry->d_sb->s_fs_info,
+ nfsd_net_id);
+ struct cld_net *cn = nn->cld_net;
+
+ if (mlen != sizeof(*cmsg)) {
+ dprintk("%s: got %lu bytes, expected %lu\n", __func__, mlen,
+ sizeof(*cmsg));
+ return -EINVAL;
+ }
+
+ /* copy just the xid so we can try to find that */
+ if (copy_from_user(&xid, &cmsg->cm_xid, sizeof(xid)) != 0) {
+ dprintk("%s: error when copying xid from userspace", __func__);
+ return -EFAULT;
+ }
+
+ /* walk the list and find corresponding xid */
+ cup = NULL;
+ spin_lock(&cn->cn_lock);
+ list_for_each_entry(tmp, &cn->cn_list, cu_list) {
+ if (get_unaligned(&tmp->cu_msg.cm_xid) == xid) {
+ cup = tmp;
+ list_del_init(&cup->cu_list);
+ break;
+ }
+ }
+ spin_unlock(&cn->cn_lock);
+
+ /* couldn't find upcall? */
+ if (!cup) {
+ dprintk("%s: couldn't find upcall -- xid=%u\n", __func__,
+ cup->cu_msg.cm_xid);
+ return -EINVAL;
+ }
+
+ if (copy_from_user(&cup->cu_msg, src, mlen) != 0)
+ return -EFAULT;
+
+ wake_up_process(cup->cu_task);
+ return mlen;
+}
+
+static void
+cld_pipe_destroy_msg(struct rpc_pipe_msg *msg)
+{
+ struct cld_msg *cmsg = msg->data;
+ struct cld_upcall *cup = container_of(cmsg, struct cld_upcall,
+ cu_msg);
+
+ /* errno >= 0 means we got a downcall */
+ if (msg->errno >= 0)
+ return;
+
+ wake_up_process(cup->cu_task);
+}
+
+static const struct rpc_pipe_ops cld_upcall_ops = {
+ .upcall = rpc_pipe_generic_upcall,
+ .downcall = cld_pipe_downcall,
+ .destroy_msg = cld_pipe_destroy_msg,
+};
+
+static struct dentry *
+nfsd4_cld_register_sb(struct super_block *sb, struct rpc_pipe *pipe)
+{
+ struct dentry *dir, *dentry;
+
+ dir = rpc_d_lookup_sb(sb, NFSD_PIPE_DIR);
+ if (dir == NULL)
+ return ERR_PTR(-ENOENT);
+ dentry = rpc_mkpipe_dentry(dir, NFSD_CLD_PIPE, NULL, pipe);
+ dput(dir);
+ return dentry;
+}
+
+static void
+nfsd4_cld_unregister_sb(struct rpc_pipe *pipe)
+{
+ if (pipe->dentry)
+ rpc_unlink(pipe->dentry);
+}
+
+static struct dentry *
+nfsd4_cld_register_net(struct net *net, struct rpc_pipe *pipe)
+{
+ struct super_block *sb;
+ struct dentry *dentry;
+
+ sb = rpc_get_sb_net(net);
+ if (!sb)
+ return NULL;
+ dentry = nfsd4_cld_register_sb(sb, pipe);
+ rpc_put_sb_net(net);
+ return dentry;
+}
+
+static void
+nfsd4_cld_unregister_net(struct net *net, struct rpc_pipe *pipe)
+{
+ struct super_block *sb;
+
+ sb = rpc_get_sb_net(net);
+ if (sb) {
+ nfsd4_cld_unregister_sb(pipe);
+ rpc_put_sb_net(net);
+ }
+}
+
+/* Initialize rpc_pipefs pipe for communication with client tracking daemon */
+static int
+nfsd4_init_cld_pipe(struct net *net)
+{
+ int ret;
+ struct dentry *dentry;
+ struct nfsd_net *nn = net_generic(net, nfsd_net_id);
+ struct cld_net *cn;
+
+ if (nn->cld_net)
+ return 0;
+
+ cn = kzalloc(sizeof(*cn), GFP_KERNEL);
+ if (!cn) {
+ ret = -ENOMEM;
+ goto err;
+ }
+
+ cn->cn_pipe = rpc_mkpipe_data(&cld_upcall_ops, RPC_PIPE_WAIT_FOR_OPEN);
+ if (IS_ERR(cn->cn_pipe)) {
+ ret = PTR_ERR(cn->cn_pipe);
+ goto err;
+ }
+ spin_lock_init(&cn->cn_lock);
+ INIT_LIST_HEAD(&cn->cn_list);
+
+ dentry = nfsd4_cld_register_net(net, cn->cn_pipe);
+ if (IS_ERR(dentry)) {
+ ret = PTR_ERR(dentry);
+ goto err_destroy_data;
+ }
+
+ cn->cn_pipe->dentry = dentry;
+ nn->cld_net = cn;
+ return 0;
+
+err_destroy_data:
+ rpc_destroy_pipe_data(cn->cn_pipe);
+err:
+ kfree(cn);
+ printk(KERN_ERR "NFSD: unable to create nfsdcld upcall pipe (%d)\n",
+ ret);
+ return ret;
+}
+
+static void
+nfsd4_remove_cld_pipe(struct net *net)
+{
+ struct nfsd_net *nn = net_generic(net, nfsd_net_id);
+ struct cld_net *cn = nn->cld_net;
+
+ nfsd4_cld_unregister_net(net, cn->cn_pipe);
+ rpc_destroy_pipe_data(cn->cn_pipe);
+ kfree(nn->cld_net);
+ nn->cld_net = NULL;
+}
+
+static struct cld_upcall *
+alloc_cld_upcall(struct cld_net *cn)
+{
+ struct cld_upcall *new, *tmp;
+
+ new = kzalloc(sizeof(*new), GFP_KERNEL);
+ if (!new)
+ return new;
+
+ /* FIXME: hard cap on number in flight? */
+restart_search:
+ spin_lock(&cn->cn_lock);
+ list_for_each_entry(tmp, &cn->cn_list, cu_list) {
+ if (tmp->cu_msg.cm_xid == cn->cn_xid) {
+ cn->cn_xid++;
+ spin_unlock(&cn->cn_lock);
+ goto restart_search;
+ }
+ }
+ new->cu_task = current;
+ new->cu_msg.cm_vers = CLD_UPCALL_VERSION;
+ put_unaligned(cn->cn_xid++, &new->cu_msg.cm_xid);
+ new->cu_net = cn;
+ list_add(&new->cu_list, &cn->cn_list);
+ spin_unlock(&cn->cn_lock);
+
+ dprintk("%s: allocated xid %u\n", __func__, new->cu_msg.cm_xid);
+
+ return new;
+}
+
+static void
+free_cld_upcall(struct cld_upcall *victim)
+{
+ struct cld_net *cn = victim->cu_net;
+
+ spin_lock(&cn->cn_lock);
+ list_del(&victim->cu_list);
+ spin_unlock(&cn->cn_lock);
+ kfree(victim);
+}
+
+/* Ask daemon to create a new record */
+static void
+nfsd4_cld_create(struct nfs4_client *clp)
+{
+ int ret;
+ struct cld_upcall *cup;
+ /* FIXME: determine net from clp */
+ struct nfsd_net *nn = net_generic(&init_net, nfsd_net_id);
+ struct cld_net *cn = nn->cld_net;
+
+ /* Don't upcall if it's already stored */
+ if (test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
+ return;
+
+ cup = alloc_cld_upcall(cn);
+ if (!cup) {
+ ret = -ENOMEM;
+ goto out_err;
+ }
+
+ cup->cu_msg.cm_cmd = Cld_Create;
+ cup->cu_msg.cm_u.cm_name.cn_len = clp->cl_name.len;
+ memcpy(cup->cu_msg.cm_u.cm_name.cn_id, clp->cl_name.data,
+ clp->cl_name.len);
+
+ ret = cld_pipe_upcall(cn->cn_pipe, &cup->cu_msg);
+ if (!ret) {
+ ret = cup->cu_msg.cm_status;
+ set_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags);
+ }
+
+ free_cld_upcall(cup);
+out_err:
+ if (ret)
+ printk(KERN_ERR "NFSD: Unable to create client "
+ "record on stable storage: %d\n", ret);
+}
+
+/* Ask daemon to create a new record */
+static void
+nfsd4_cld_remove(struct nfs4_client *clp)
+{
+ int ret;
+ struct cld_upcall *cup;
+ /* FIXME: determine net from clp */
+ struct nfsd_net *nn = net_generic(&init_net, nfsd_net_id);
+ struct cld_net *cn = nn->cld_net;
+
+ /* Don't upcall if it's already removed */
+ if (!test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
+ return;
+
+ cup = alloc_cld_upcall(cn);
+ if (!cup) {
+ ret = -ENOMEM;
+ goto out_err;
+ }
+
+ cup->cu_msg.cm_cmd = Cld_Remove;
+ cup->cu_msg.cm_u.cm_name.cn_len = clp->cl_name.len;
+ memcpy(cup->cu_msg.cm_u.cm_name.cn_id, clp->cl_name.data,
+ clp->cl_name.len);
+
+ ret = cld_pipe_upcall(cn->cn_pipe, &cup->cu_msg);
+ if (!ret) {
+ ret = cup->cu_msg.cm_status;
+ clear_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags);
+ }
+
+ free_cld_upcall(cup);
+out_err:
+ if (ret)
+ printk(KERN_ERR "NFSD: Unable to remove client "
+ "record from stable storage: %d\n", ret);
+}
+
+/* Check for presence of a record, and update its timestamp */
+static int
+nfsd4_cld_check(struct nfs4_client *clp)
+{
+ int ret;
+ struct cld_upcall *cup;
+ /* FIXME: determine net from clp */
+ struct nfsd_net *nn = net_generic(&init_net, nfsd_net_id);
+ struct cld_net *cn = nn->cld_net;
+
+ /* Don't upcall if one was already stored during this grace pd */
+ if (test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
+ return 0;
+
+ cup = alloc_cld_upcall(cn);
+ if (!cup) {
+ printk(KERN_ERR "NFSD: Unable to check client record on "
+ "stable storage: %d\n", -ENOMEM);
+ return -ENOMEM;
+ }
+
+ cup->cu_msg.cm_cmd = Cld_Check;
+ cup->cu_msg.cm_u.cm_name.cn_len = clp->cl_name.len;
+ memcpy(cup->cu_msg.cm_u.cm_name.cn_id, clp->cl_name.data,
+ clp->cl_name.len);
+
+ ret = cld_pipe_upcall(cn->cn_pipe, &cup->cu_msg);
+ if (!ret) {
+ ret = cup->cu_msg.cm_status;
+ set_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags);
+ }
+
+ free_cld_upcall(cup);
+ return ret;
+}
+
+static void
+nfsd4_cld_grace_done(struct net *net, time_t boot_time)
+{
+ int ret;
+ struct cld_upcall *cup;
+ struct nfsd_net *nn = net_generic(net, nfsd_net_id);
+ struct cld_net *cn = nn->cld_net;
+
+ cup = alloc_cld_upcall(cn);
+ if (!cup) {
+ ret = -ENOMEM;
+ goto out_err;
+ }
+
+ cup->cu_msg.cm_cmd = Cld_GraceDone;
+ cup->cu_msg.cm_u.cm_gracetime = (int64_t)boot_time;
+ ret = cld_pipe_upcall(cn->cn_pipe, &cup->cu_msg);
+ if (!ret)
+ ret = cup->cu_msg.cm_status;
+
+ free_cld_upcall(cup);
+out_err:
+ if (ret)
+ printk(KERN_ERR "NFSD: Unable to end grace period: %d\n", ret);
+}
+
+static struct nfsd4_client_tracking_ops nfsd4_cld_tracking_ops = {
+ .init = nfsd4_init_cld_pipe,
+ .exit = nfsd4_remove_cld_pipe,
+ .create = nfsd4_cld_create,
+ .remove = nfsd4_cld_remove,
+ .check = nfsd4_cld_check,
+ .grace_done = nfsd4_cld_grace_done,
+};
+
int
nfsd4_client_tracking_init(struct net *net)
{
int status;
+ struct path path;

- client_tracking_ops = &nfsd4_legacy_tracking_ops;
+ if (!client_tracking_ops) {
+ client_tracking_ops = &nfsd4_cld_tracking_ops;
+ status = kern_path(nfs4_recoverydir(), LOOKUP_FOLLOW, &path);
+ if (!status) {
+ if (S_ISDIR(path.dentry->d_inode->i_mode))
+ client_tracking_ops =
+ &nfsd4_legacy_tracking_ops;
+ path_put(&path);
+ }
+ }

status = client_tracking_ops->init(net);
if (status) {
--
1.7.7.6


2012-03-21 20:52:18

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v10 1/8] nfsd: convert nfs4_client->cl_cb_flags to a generic flags field

On Wed, 21 Mar 2012 16:41:42 -0400
"J. Bruce Fields" <[email protected]> wrote:

> On Wed, Mar 21, 2012 at 09:52:02AM -0400, Jeff Layton wrote:
> > @@ -2779,12 +2780,6 @@ static void
> > nfs4_set_claim_prev(struct nfsd4_open *open, bool has_session)
> > {
> > open->op_openowner->oo_flags |= NFS4_OO_CONFIRMED;
> > - /*
> > - * On a 4.1+ client, we don't create a state record for a client
> > - * until it performs RECLAIM_COMPLETE:
> > - */
> > - if (!has_session)
> > - open->op_openowner->oo_owner.so_client->cl_firststate = 1;
>
> That doesn't look right. Sure you didn't mean to just make that a
> set_bit, at least for now?
>
> --b.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

No. I think that's correct. Once the RECLAIM_COMPLETE is done, we'll
create a client record and then the NFSD4_CLIENT_STABLE flag will be
set. We clearly don't want to set that flag until the client is
actually recorded onto stable storage.

One of the goals of this patch was to get rid of cl_firststate which
had a very ambiguous meaning. NFSD4_CLIENT_STABLE now means that the
client is recorded on stable storage, full stop.

Did I miss some subtlety that we'll need to account for in the state
model here?

--
Jeff Layton <[email protected]>

2012-03-28 23:09:35

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH] nfsd4: use auth_unix unconditionally on backchannel

This is a bandaid.

I have a series of patches that actually implement the correct behavior,
but that may not quite be ready for 3.4.

--b.

commit 2f026867c76171d26f003b211063ff0562097d5e
Author: J. Bruce Fields <[email protected]>
Date: Wed Mar 28 14:18:16 2012 -0400

nfsd4: use auth_unix unconditionally on backchannel

This isn't actually correct, but it works with the Linux client, and
agrees with the behavior we used to have before commit 80fc015bdfe.

Later patches will implement the spec-mandated behavior (which is to use
the security parameters explicitly given by the client in create_session
or backchannel_ctl).

Signed-off-by: J. Bruce Fields <[email protected]>

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 0840fc4..c8e9f63 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -645,7 +645,6 @@ static int setup_callback_client(struct nfs4_client *clp, struct nfs4_cb_conn *c
.timeout = &timeparms,
.program = &cb_program,
.version = 0,
- .authflavor = clp->cl_flavor,
.flags = (RPC_CLNT_CREATE_NOPING | RPC_CLNT_CREATE_QUIET),
};
struct rpc_clnt *client;
@@ -656,6 +655,7 @@ static int setup_callback_client(struct nfs4_client *clp, struct nfs4_cb_conn *c
args.client_name = clp->cl_principal;
args.prognumber = conn->cb_prog,
args.protocol = XPRT_TRANSPORT_TCP;
+ args.authflavor = clp->cl_flavor;
clp->cl_cb_ident = conn->cb_ident;
} else {
if (!conn->cb_xprt)
@@ -665,6 +665,7 @@ static int setup_callback_client(struct nfs4_client *clp, struct nfs4_cb_conn *c
args.bc_xprt = conn->cb_xprt;
args.prognumber = clp->cl_cb_session->se_cb_prog;
args.protocol = XPRT_TRANSPORT_BC_TCP;
+ args.authflavor = RPC_AUTH_UNIX;
}
/* Create RPC client */
client = rpc_create(&args);

2012-03-23 15:34:24

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH v10 3/8] sunrpc: create nfsd dir in rpc_pipefs

T24gRnJpLCAyMDEyLTAzLTIzIGF0IDExOjIyIC0wNDAwLCBKLiBCcnVjZSBGaWVsZHMgd3JvdGU6
DQo+IE9uIEZyaSwgTWFyIDIzLCAyMDEyIGF0IDAzOjIwOjIxUE0gKzAwMDAsIE15a2xlYnVzdCwg
VHJvbmQgd3JvdGU6DQo+ID4gT24gRnJpLCAyMDEyLTAzLTIzIGF0IDA5OjMxIC0wNDAwLCBKLiBC
cnVjZSBGaWVsZHMgd3JvdGU6DQo+ID4gPiBPbiBGcmksIE1hciAyMywgMjAxMiBhdCAwODoxMjow
OEFNIC0wNDAwLCBKLiBCcnVjZSBGaWVsZHMgd3JvdGU6DQo+ID4gPiA+IE9uIFdlZCwgTWFyIDIx
LCAyMDEyIGF0IDA5OjUyOjA0QU0gLTA0MDAsIEplZmYgTGF5dG9uIHdyb3RlOg0KPiA+ID4gPiA+
IEFkZCBhIG5ldyB0b3AtbGV2ZWwgZGlyIGluIHJwY19waXBlZnMgdG8gaG9sZCB0aGUgcGlwZSBm
b3IgdGhlIGNsaWVudGlkDQo+ID4gPiA+ID4gdXBjYWxsLg0KPiA+ID4gPiANCj4gPiA+ID4gQWZ0
ZXIgYXBwbHlpbmcgdGhpcyBwYXRjaCwgbXkgdGVzdHMgY29uc2lzdGVudGx5IGhhbmcuICBUaGUg
aGFuZyBoYXBwZW5zDQo+ID4gPiA+IGluIGV4Y2x0ZXN0IChvZiB0aGUgc3BlY2lhbCBjb25uZWN0
YXRvbiB0ZXN0cyksIG92ZXIgbmZzNC4xIGFuZCBrcmI1Lg0KPiA+ID4gPiBMb29raW5nIGF0IHRo
ZSB3aXJlIHRyYWZmaWMsIEknbSBzZWVpbmcgREVMQVkgcmV0dXJuZWQgZnJvbSBhIHNldGF0dHIN
Cj4gPiA+ID4gZm9yIG1vZGUgb24gYSBuZXdseS1jcmVhdGVkICh3aXRoIEVYQ0xVU0lWRTRfMSkg
ZmlsZS4gIFRoYXQgb3BlbiBnb3QgYQ0KPiA+ID4gPiBkZWxlZ2F0aW9uLCBzbyBwcmVzdW1hYmx5
IHRoYXQncyB3aGF0J3MgY2F1c2luZyB0aGUgREVMQVksIHRob3VnaCBJJ20NCj4gPiA+ID4gbm90
IHNlZWluZyB0aGUgc2VydmVyIHNlbmQgYSByZWNhbGwuICBUaGF0IGNvdWxkIGJlIGEga3JiNSBi
dWcuDQo+ID4gPiA+IA0KPiA+ID4gPiBXaGF0ZXZlciBidWcgdGhlcmUgaXMgaGVyZSwgaXQncyBo
YXJkIHRvIHRlbGwgd2h5IHRoaXMgcGF0Y2ggaW4NCj4gPiA+ID4gcGFydGljdWxhciB3b3VsZCBt
YWtlIGl0IG1vcmUgbGlrZWx5Lg0KPiA+ID4gPiANCj4gPiA+ID4gU28sIHN0aWxsIGludmVzdGln
YXRpbmchDQo+ID4gPiANCj4gPiA+IFJlcHJvZHVjZWFibGUgYnk6DQo+ID4gPiANCj4gPiA+IAlt
b3VudCAtb3NlYz1rcmI1LG1pbm9ydmVyc2lvbj0xIHNlcnZlcjovZXhwb3J0LyAvbW50Lw0KPiA+
ID4gCWNwIGN0aG9uMDQvc3BlY2lhbC9leGNsdGVzdCAvbW50Lw0KPiA+ID4gCWNkIC9tbnQNCj4g
PiA+IAkuL2V4Y2x0ZXN0DQo+ID4gDQo+ID4gVW1tLi4uIFdoZW4gd291bGQgeW91IGV2ZXIgZ2V0
IGEgREVMQVkgaW4gdGhlIGFib3ZlIHNjZW5hcmlvPyBJIGNhbiBzZWUNCj4gPiBnZXR0aW5nIGFu
IE5GUzRFUlJfT1BFTk1PREUsIGJ1dCBub3QgREVMQVkuDQo+IA0KPiBUaGVyZSdzIGEgc2V0YXR0
ciBmb3IgbW9kZSByaWdodCBhZnRlciB0aGUgb3Blbi4gIElzIHRoYXQgdW5leHBlY3RlZD8NCg0K
V2VsbCB5ZXMsIGl0IGlzLiBUaGUgTkZTdjQuMSBleGNsdXNpdmUgb3BlbiBzaG91bGQgYWx3YXlz
IGJlIHNlbmRpbmcgYQ0KZnVsbCBzZXQgb2YgYXR0cmlidXRlcyBhcyBwYXJ0IG9mIHRoZSBPUEVO
IG9wZXJhdGlvbi4gVGhlIHNlc3Npb24gcmVwbGF5DQpjYWNoZSBpcyBub3cgc3VwcG9zZWQgdG8g
Z3VhcmFudGVlIHRoZSBvbmx5LW9uY2Ugc2VtYW50aWNzIHRoYXQgdGhlDQp2ZXJpZmllciB1c2Vk
IHRvIHByb3ZpZGUuDQoNCj4gVGhlIHNlcnZlciBkb2Vzbid0IHJlYWxseSBoYXZlIHRvIHJlY2Fs
bCB0aGUgZGVsZWdhdGlvbiBpbiB0aGF0IGNhc2UgKGl0DQo+IG9ubHkgbmVlZHMgdG8gcmVjYWxs
ICpvdGhlciogY2xpZW50cycgZGVsZWdhdGlvbnMpIGJ1dCBJIGRvbid0IHRoaW5rDQo+IGl0J3Mg
d3JvbmcgdG8uDQoNClRoZW4gd2h5IGlzbid0IGl0IGFsbG93aW5nIHRoZSBvcGVyYXRpb24/IEFu
eSBzYW5lIGNsaWVudCB3b3VsZCBub3JtYWxseQ0KaW50ZXJwcmV0IE5GUzRFUlJfREVMQVkgdG8g
bWVhbiB0aGF0IHRoZSBzZXJ2ZXIgaXMgZG9pbmcgc29tZXRoaW5nIHRvDQpmaXggd2hhdGV2ZXIg
c2l0dWF0aW9uIGlzIHByZXZlbnRpbmcgdGhlIG9wZXJhdGlvbiBmcm9tIGNvbXBsZXRpbmcNCihw
cmVzdW1hYmx5IGJ5IHJlY2FsbGluZyBkZWxlZ2F0aW9ucyBpbiB0aGlzIGNhc2UpLiBKdXN0IHJl
cGx5aW5nIERFTEFZDQphbmQgZG9pbmcgbm90aGluZyBpcyBub3QgaGVscGZ1bC4uLg0KDQotLSAN
ClRyb25kIE15a2xlYnVzdA0KTGludXggTkZTIGNsaWVudCBtYWludGFpbmVyDQoNCk5ldEFwcA0K
VHJvbmQuTXlrbGVidXN0QG5ldGFwcC5jb20NCnd3dy5uZXRhcHAuY29tDQoNCg==

2012-03-23 17:04:08

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v10 3/8] sunrpc: create nfsd dir in rpc_pipefs

On Fri, Mar 23, 2012 at 12:12:18PM -0400, Jeff Layton wrote:
> ...and contrary to what Bruce has seen, I can also reproduce this when
> the server is running a stock (unpatched) 3.3.0 kernel from the Fedora
> rawhide repos.

Yeah, there may just be some subtle timing difference that makes it
reproduce for me with that patch applied and not without it. It's
clearly a preexisting bug not directly relevant to this patch.

I'll keep looking.

--b.

2012-03-21 13:52:20

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v10 5/8] nfsd: add a header describing upcall to nfsdcld

The daemon takes a versioned binary struct. Hopefully this should allow
us to revise the struct later if it becomes necessary.

Signed-off-by: Jeff Layton <[email protected]>
---
include/linux/nfsd/cld.h | 56 ++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 56 insertions(+), 0 deletions(-)
create mode 100644 include/linux/nfsd/cld.h

diff --git a/include/linux/nfsd/cld.h b/include/linux/nfsd/cld.h
new file mode 100644
index 0000000..f14a9ab
--- /dev/null
+++ b/include/linux/nfsd/cld.h
@@ -0,0 +1,56 @@
+/*
+ * Upcall description for nfsdcld communication
+ *
+ * Copyright (c) 2012 Red Hat, Inc.
+ * Author(s): Jeff Layton <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#ifndef _NFSD_CLD_H
+#define _NFSD_CLD_H
+
+/* latest upcall version available */
+#define CLD_UPCALL_VERSION 1
+
+/* defined by RFC3530 */
+#define NFS4_OPAQUE_LIMIT 1024
+
+enum cld_command {
+ Cld_Create, /* create a record for this cm_id */
+ Cld_Remove, /* remove record of this cm_id */
+ Cld_Check, /* is this cm_id allowed? */
+ Cld_GraceDone, /* grace period is complete */
+};
+
+/* representation of long-form NFSv4 client ID */
+struct cld_name {
+ uint16_t cn_len; /* length of cm_id */
+ unsigned char cn_id[NFS4_OPAQUE_LIMIT]; /* client-provided */
+} __attribute__((packed));
+
+/* message struct for communication with userspace */
+struct cld_msg {
+ uint8_t cm_vers; /* upcall version */
+ uint8_t cm_cmd; /* upcall command */
+ int16_t cm_status; /* return code */
+ uint32_t cm_xid; /* transaction id */
+ union {
+ int64_t cm_gracetime; /* grace period start time */
+ struct cld_name cm_name;
+ } __attribute__((packed)) cm_u;
+} __attribute__((packed));
+
+#endif /* !_NFSD_CLD_H */
--
1.7.7.6


2012-03-23 12:12:09

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v10 3/8] sunrpc: create nfsd dir in rpc_pipefs

On Wed, Mar 21, 2012 at 09:52:04AM -0400, Jeff Layton wrote:
> Add a new top-level dir in rpc_pipefs to hold the pipe for the clientid
> upcall.

After applying this patch, my tests consistently hang. The hang happens
in excltest (of the special connectaton tests), over nfs4.1 and krb5.
Looking at the wire traffic, I'm seeing DELAY returned from a setattr
for mode on a newly-created (with EXCLUSIVE4_1) file. That open got a
delegation, so presumably that's what's causing the DELAY, though I'm
not seeing the server send a recall. That could be a krb5 bug.

Whatever bug there is here, it's hard to tell why this patch in
particular would make it more likely.

So, still investigating!

--b.

>
> Signed-off-by: Jeff Layton <[email protected]>
> ---
> net/sunrpc/rpc_pipe.c | 5 +++++
> 1 files changed, 5 insertions(+), 0 deletions(-)
>
> diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
> index 8584ec0..e651e1b 100644
> --- a/net/sunrpc/rpc_pipe.c
> +++ b/net/sunrpc/rpc_pipe.c
> @@ -1014,6 +1014,7 @@ enum {
> RPCAUTH_statd,
> RPCAUTH_nfsd4_cb,
> RPCAUTH_cache,
> + RPCAUTH_nfsd,
> RPCAUTH_RootEOF
> };
>
> @@ -1046,6 +1047,10 @@ static const struct rpc_filelist files[] = {
> .name = "cache",
> .mode = S_IFDIR | S_IRUGO | S_IXUGO,
> },
> + [RPCAUTH_nfsd] = {
> + .name = "nfsd",
> + .mode = S_IFDIR | S_IRUGO | S_IXUGO,
> + },
> };
>
> /*
> --
> 1.7.7.6
>

2012-03-21 23:59:26

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v10 2/8] nfsd: add nfsd4_client_tracking_ops struct and a way to set it

On Wed, 21 Mar 2012 09:52:03 -0400
Jeff Layton <[email protected]> wrote:

> Abstract out the mechanism that we use to track clients into a set of
> client name tracking functions.
>
> This gives us a mechanism to plug in a new set of client tracking
> functions without disturbing the callers. It also gives us a way to
> decide on what tracking scheme to use at runtime.
>
> For now, this just looks like pointless abstraction, but later we'll
> add a new alternate scheme for tracking clients on stable storage.
>
> Note too that this patch anticipates the eventual containerization
> of this code by passing in struct net pointers in places. No attempt
> is made to containerize the legacy client tracker however.
>
> Signed-off-by: Jeff Layton <[email protected]>
> ---
> fs/nfsd/nfs4recover.c | 136 +++++++++++++++++++++++++++++++++++++++++++++----
> fs/nfsd/nfs4state.c | 61 +++++++---------------
> fs/nfsd/state.h | 15 +++--
> 3 files changed, 155 insertions(+), 57 deletions(-)
>

Bruce pointed out that I missed converting a spot in the new fault
injection code, so I've followed this up with a v11 patch that fixes
that problem.

Cheers!
--
Jeff Layton <[email protected]>

2012-03-28 23:17:11

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH] nfsd4: use auth_unix unconditionally on backchannel

T24gV2VkLCAyMDEyLTAzLTI4IGF0IDE5OjA5IC0wNDAwLCBKLiBCcnVjZSBGaWVsZHMgd3JvdGU6
DQo+IFRoaXMgaXMgYSBiYW5kYWlkLg0KPiANCj4gSSBoYXZlIGEgc2VyaWVzIG9mIHBhdGNoZXMg
dGhhdCBhY3R1YWxseSBpbXBsZW1lbnQgdGhlIGNvcnJlY3QgYmVoYXZpb3IsDQo+IGJ1dCB0aGF0
IG1heSBub3QgcXVpdGUgYmUgcmVhZHkgZm9yIDMuNC4NCj4gDQo+IC0tYi4NCj4gDQo+IGNvbW1p
dCAyZjAyNjg2N2M3NjE3MWQyNmYwMDNiMjExMDYzZmYwNTYyMDk3ZDVlDQo+IEF1dGhvcjogSi4g
QnJ1Y2UgRmllbGRzIDxiZmllbGRzQHJlZGhhdC5jb20+DQo+IERhdGU6ICAgV2VkIE1hciAyOCAx
NDoxODoxNiAyMDEyIC0wNDAwDQo+IA0KPiAgICAgbmZzZDQ6IHVzZSBhdXRoX3VuaXggdW5jb25k
aXRpb25hbGx5IG9uIGJhY2tjaGFubmVsDQo+ICAgICANCj4gICAgIFRoaXMgaXNuJ3QgYWN0dWFs
bHkgY29ycmVjdCwgYnV0IGl0IHdvcmtzIHdpdGggdGhlIExpbnV4IGNsaWVudCwgYW5kDQo+ICAg
ICBhZ3JlZXMgd2l0aCB0aGUgYmVoYXZpb3Igd2UgdXNlZCB0byBoYXZlIGJlZm9yZSBjb21taXQg
ODBmYzAxNWJkZmUuDQoNClF1ZXN0aW9uOiBkb2VzIHRoZSBMaW51eCBjbGllbnQgZXZlciBzZW5k
IHlvdSBhbnl0aGluZyBvdGhlciB0aGFuDQpBVVRIX1NZUyBjcmVkZW50aWFscyBmb3IgdGhlIGNz
YV9zZWNfcGFybXMgYXJndW1lbnQgaW4gQ1JFQVRFX1NFU1NJT04/DQpBbnl0aGluZyBvdGhlciB0
aGFuIHRoYXQgd291bGQgYmUgYSBidWcsIHNpbmNlIG91ciBjbGllbnQgZG9lc24ndA0KYWN0dWFs
bHkgc3VwcG9ydCBSUENTRUNfR1NTIGluIHRoZSBjYWxsYmFjayBjaGFubmVsLg0KDQo+ICAgICBM
YXRlciBwYXRjaGVzIHdpbGwgaW1wbGVtZW50IHRoZSBzcGVjLW1hbmRhdGVkIGJlaGF2aW9yICh3
aGljaCBpcyB0byB1c2UNCj4gICAgIHRoZSBzZWN1cml0eSBwYXJhbWV0ZXJzIGV4cGxpY2l0bHkg
Z2l2ZW4gYnkgdGhlIGNsaWVudCBpbiBjcmVhdGVfc2Vzc2lvbg0KPiAgICAgb3IgYmFja2NoYW5u
ZWxfY3RsKS4NCj4gICAgIA0KDQoNCi0tIA0KVHJvbmQgTXlrbGVidXN0DQpMaW51eCBORlMgY2xp
ZW50IG1haW50YWluZXINCg0KTmV0QXBwDQpUcm9uZC5NeWtsZWJ1c3RAbmV0YXBwLmNvbQ0Kd3d3
Lm5ldGFwcC5jb20NCg0K

2012-03-23 15:20:41

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH v10 3/8] sunrpc: create nfsd dir in rpc_pipefs

T24gRnJpLCAyMDEyLTAzLTIzIGF0IDA5OjMxIC0wNDAwLCBKLiBCcnVjZSBGaWVsZHMgd3JvdGU6
DQo+IE9uIEZyaSwgTWFyIDIzLCAyMDEyIGF0IDA4OjEyOjA4QU0gLTA0MDAsIEouIEJydWNlIEZp
ZWxkcyB3cm90ZToNCj4gPiBPbiBXZWQsIE1hciAyMSwgMjAxMiBhdCAwOTo1MjowNEFNIC0wNDAw
LCBKZWZmIExheXRvbiB3cm90ZToNCj4gPiA+IEFkZCBhIG5ldyB0b3AtbGV2ZWwgZGlyIGluIHJw
Y19waXBlZnMgdG8gaG9sZCB0aGUgcGlwZSBmb3IgdGhlIGNsaWVudGlkDQo+ID4gPiB1cGNhbGwu
DQo+ID4gDQo+ID4gQWZ0ZXIgYXBwbHlpbmcgdGhpcyBwYXRjaCwgbXkgdGVzdHMgY29uc2lzdGVu
dGx5IGhhbmcuICBUaGUgaGFuZyBoYXBwZW5zDQo+ID4gaW4gZXhjbHRlc3QgKG9mIHRoZSBzcGVj
aWFsIGNvbm5lY3RhdG9uIHRlc3RzKSwgb3ZlciBuZnM0LjEgYW5kIGtyYjUuDQo+ID4gTG9va2lu
ZyBhdCB0aGUgd2lyZSB0cmFmZmljLCBJJ20gc2VlaW5nIERFTEFZIHJldHVybmVkIGZyb20gYSBz
ZXRhdHRyDQo+ID4gZm9yIG1vZGUgb24gYSBuZXdseS1jcmVhdGVkICh3aXRoIEVYQ0xVU0lWRTRf
MSkgZmlsZS4gIFRoYXQgb3BlbiBnb3QgYQ0KPiA+IGRlbGVnYXRpb24sIHNvIHByZXN1bWFibHkg
dGhhdCdzIHdoYXQncyBjYXVzaW5nIHRoZSBERUxBWSwgdGhvdWdoIEknbQ0KPiA+IG5vdCBzZWVp
bmcgdGhlIHNlcnZlciBzZW5kIGEgcmVjYWxsLiAgVGhhdCBjb3VsZCBiZSBhIGtyYjUgYnVnLg0K
PiA+IA0KPiA+IFdoYXRldmVyIGJ1ZyB0aGVyZSBpcyBoZXJlLCBpdCdzIGhhcmQgdG8gdGVsbCB3
aHkgdGhpcyBwYXRjaCBpbg0KPiA+IHBhcnRpY3VsYXIgd291bGQgbWFrZSBpdCBtb3JlIGxpa2Vs
eS4NCj4gPiANCj4gPiBTbywgc3RpbGwgaW52ZXN0aWdhdGluZyENCj4gDQo+IFJlcHJvZHVjZWFi
bGUgYnk6DQo+IA0KPiAJbW91bnQgLW9zZWM9a3JiNSxtaW5vcnZlcnNpb249MSBzZXJ2ZXI6L2V4
cG9ydC8gL21udC8NCj4gCWNwIGN0aG9uMDQvc3BlY2lhbC9leGNsdGVzdCAvbW50Lw0KPiAJY2Qg
L21udA0KPiAJLi9leGNsdGVzdA0KDQpVbW0uLi4gV2hlbiB3b3VsZCB5b3UgZXZlciBnZXQgYSBE
RUxBWSBpbiB0aGUgYWJvdmUgc2NlbmFyaW8/IEkgY2FuIHNlZQ0KZ2V0dGluZyBhbiBORlM0RVJS
X09QRU5NT0RFLCBidXQgbm90IERFTEFZLg0KDQotLSANClRyb25kIE15a2xlYnVzdA0KTGludXgg
TkZTIGNsaWVudCBtYWludGFpbmVyDQoNCk5ldEFwcA0KVHJvbmQuTXlrbGVidXN0QG5ldGFwcC5j
b20NCnd3dy5uZXRhcHAuY29tDQoNCg==

2012-03-23 15:53:11

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v10 3/8] sunrpc: create nfsd dir in rpc_pipefs

On Fri, 23 Mar 2012 15:34:21 +0000
"Myklebust, Trond" <[email protected]> wrote:

> On Fri, 2012-03-23 at 11:22 -0400, J. Bruce Fields wrote:
> > On Fri, Mar 23, 2012 at 03:20:21PM +0000, Myklebust, Trond wrote:
> > > On Fri, 2012-03-23 at 09:31 -0400, J. Bruce Fields wrote:
> > > > On Fri, Mar 23, 2012 at 08:12:08AM -0400, J. Bruce Fields wrote:
> > > > > On Wed, Mar 21, 2012 at 09:52:04AM -0400, Jeff Layton wrote:
> > > > > > Add a new top-level dir in rpc_pipefs to hold the pipe for the clientid
> > > > > > upcall.
> > > > >
> > > > > After applying this patch, my tests consistently hang. The hang happens
> > > > > in excltest (of the special connectaton tests), over nfs4.1 and krb5.
> > > > > Looking at the wire traffic, I'm seeing DELAY returned from a setattr
> > > > > for mode on a newly-created (with EXCLUSIVE4_1) file. That open got a
> > > > > delegation, so presumably that's what's causing the DELAY, though I'm
> > > > > not seeing the server send a recall. That could be a krb5 bug.
> > > > >
> > > > > Whatever bug there is here, it's hard to tell why this patch in
> > > > > particular would make it more likely.
> > > > >
> > > > > So, still investigating!
> > > >
> > > > Reproduceable by:
> > > >
> > > > mount -osec=krb5,minorversion=1 server:/export/ /mnt/
> > > > cp cthon04/special/excltest /mnt/
> > > > cd /mnt
> > > > ./excltest
> > >
> > > Umm... When would you ever get a DELAY in the above scenario? I can see
> > > getting an NFS4ERR_OPENMODE, but not DELAY.
> >
> > There's a setattr for mode right after the open. Is that unexpected?
>
> Well yes, it is. The NFSv4.1 exclusive open should always be sending a
> full set of attributes as part of the OPEN operation. The session replay
> cache is now supposed to guarantee the only-once semantics that the
> verifier used to provide.
>
> > The server doesn't really have to recall the delegation in that case (it
> > only needs to recall *other* clients' delegations) but I don't think
> > it's wrong to.
>
> Then why isn't it allowing the operation? Any sane client would normally
> interpret NFS4ERR_DELAY to mean that the server is doing something to
> fix whatever situation is preventing the operation from completing
> (presumably by recalling delegations in this case). Just replying DELAY
> and doing nothing is not helpful...
>

Yeah, this seems like a clear bug in the server code. I think it's
replying DELAY in order to recall the delegation, but the delegation
isn't getting recalled for some reason. We arguably don't actually need
to recall it here, but I don't see any recall go out at all either...

As to why this patch seems to uncover this bug -- that's a complete
mystery at this point...

--
Jeff Layton <[email protected]>

2012-03-21 20:41:44

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v10 1/8] nfsd: convert nfs4_client->cl_cb_flags to a generic flags field

On Wed, Mar 21, 2012 at 09:52:02AM -0400, Jeff Layton wrote:
> @@ -2779,12 +2780,6 @@ static void
> nfs4_set_claim_prev(struct nfsd4_open *open, bool has_session)
> {
> open->op_openowner->oo_flags |= NFS4_OO_CONFIRMED;
> - /*
> - * On a 4.1+ client, we don't create a state record for a client
> - * until it performs RECLAIM_COMPLETE:
> - */
> - if (!has_session)
> - open->op_openowner->oo_owner.so_client->cl_firststate = 1;

That doesn't look right. Sure you didn't mean to just make that a
set_bit, at least for now?

--b.

2012-03-21 20:42:53

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v11 2/8] nfsd: add nfsd4_client_tracking_ops struct and a way to set it

Abstract out the mechanism that we use to track clients into a set of
client name tracking functions.

This gives us a mechanism to plug in a new set of client tracking
functions without disturbing the callers. It also gives us a way to
decide on what tracking scheme to use at runtime.

For now, this just looks like pointless abstraction, but later we'll
add a new alternate scheme for tracking clients on stable storage.

Note too that this patch anticipates the eventual containerization
of this code by passing in struct net pointers in places. No attempt
is made to containerize the legacy client tracker however.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4recover.c | 136 +++++++++++++++++++++++++++++++++++++++++++++----
fs/nfsd/nfs4state.c | 63 ++++++++---------------
fs/nfsd/state.h | 15 +++--
3 files changed, 156 insertions(+), 58 deletions(-)

diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index 6523809..e616f88 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -43,9 +43,20 @@

#define NFSDDBG_FACILITY NFSDDBG_PROC

+/* Declarations */
+struct nfsd4_client_tracking_ops {
+ int (*init)(struct net *);
+ void (*exit)(struct net *);
+ void (*create)(struct nfs4_client *);
+ void (*remove)(struct nfs4_client *);
+ int (*check)(struct nfs4_client *);
+ void (*grace_done)(struct net *, time_t);
+};
+
/* Globals */
static struct file *rec_file;
static char user_recovery_dirname[PATH_MAX] = "/var/lib/nfs/v4recovery";
+static struct nfsd4_client_tracking_ops *client_tracking_ops;

static int
nfs4_save_creds(const struct cred **original_creds)
@@ -117,7 +128,8 @@ out_no_tfm:
return status;
}

-void nfsd4_create_clid_dir(struct nfs4_client *clp)
+static void
+nfsd4_create_clid_dir(struct nfs4_client *clp)
{
const struct cred *original_cred;
char *dname = clp->cl_recdir;
@@ -264,7 +276,7 @@ out_unlock:
return status;
}

-void
+static void
nfsd4_remove_clid_dir(struct nfs4_client *clp)
{
const struct cred *original_cred;
@@ -291,7 +303,6 @@ out:
if (status)
printk("NFSD: Failed to remove expired client state directory"
" %.*s\n", HEXDIR_LEN, clp->cl_recdir);
- return;
}

static int
@@ -310,8 +321,9 @@ purge_old(struct dentry *parent, struct dentry *child)
return 0;
}

-void
-nfsd4_recdir_purge_old(void) {
+static void
+nfsd4_recdir_purge_old(struct net *net, time_t boot_time)
+{
int status;

if (!rec_file)
@@ -342,7 +354,7 @@ load_recdir(struct dentry *parent, struct dentry *child)
return 0;
}

-int
+static int
nfsd4_recdir_load(void) {
int status;

@@ -360,8 +372,8 @@ nfsd4_recdir_load(void) {
* Hold reference to the recovery directory.
*/

-void
-nfsd4_init_recdir()
+static int
+nfsd4_init_recdir(void)
{
const struct cred *original_cred;
int status;
@@ -376,20 +388,37 @@ nfsd4_init_recdir()
printk("NFSD: Unable to change credentials to find recovery"
" directory: error %d\n",
status);
- return;
+ return status;
}

rec_file = filp_open(user_recovery_dirname, O_RDONLY | O_DIRECTORY, 0);
if (IS_ERR(rec_file)) {
printk("NFSD: unable to find recovery directory %s\n",
user_recovery_dirname);
+ status = PTR_ERR(rec_file);
rec_file = NULL;
}

nfs4_reset_creds(original_cred);
+ return status;
}

-void
+static int
+nfsd4_load_reboot_recovery_data(struct net *net)
+{
+ int status;
+
+ nfs4_lock_state();
+ status = nfsd4_init_recdir();
+ if (!status)
+ status = nfsd4_recdir_load();
+ nfs4_unlock_state();
+ if (status)
+ printk(KERN_ERR "NFSD: Failure reading reboot recovery data\n");
+ return status;
+}
+
+static void
nfsd4_shutdown_recdir(void)
{
if (!rec_file)
@@ -398,6 +427,13 @@ nfsd4_shutdown_recdir(void)
rec_file = NULL;
}

+static void
+nfsd4_legacy_tracking_exit(struct net *net)
+{
+ nfs4_release_reclaim();
+ nfsd4_shutdown_recdir();
+}
+
/*
* Change the NFSv4 recovery directory to recdir.
*/
@@ -424,3 +460,83 @@ nfs4_recoverydir(void)
{
return user_recovery_dirname;
}
+
+static int
+nfsd4_check_legacy_client(struct nfs4_client *clp)
+{
+ /* did we already find that this client is stable? */
+ if (test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
+ return 0;
+
+ /* look for it in the reclaim hashtable otherwise */
+ if (nfsd4_find_reclaim_client(clp)) {
+ set_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags);
+ return 0;
+ }
+
+ return -ENOENT;
+}
+
+static struct nfsd4_client_tracking_ops nfsd4_legacy_tracking_ops = {
+ .init = nfsd4_load_reboot_recovery_data,
+ .exit = nfsd4_legacy_tracking_exit,
+ .create = nfsd4_create_clid_dir,
+ .remove = nfsd4_remove_clid_dir,
+ .check = nfsd4_check_legacy_client,
+ .grace_done = nfsd4_recdir_purge_old,
+};
+
+int
+nfsd4_client_tracking_init(struct net *net)
+{
+ int status;
+
+ client_tracking_ops = &nfsd4_legacy_tracking_ops;
+
+ status = client_tracking_ops->init(net);
+ if (status) {
+ printk(KERN_WARNING "NFSD: Unable to initialize client "
+ "recovery tracking! (%d)\n", status);
+ client_tracking_ops = NULL;
+ }
+ return status;
+}
+
+void
+nfsd4_client_tracking_exit(struct net *net)
+{
+ if (client_tracking_ops) {
+ client_tracking_ops->exit(net);
+ client_tracking_ops = NULL;
+ }
+}
+
+void
+nfsd4_client_record_create(struct nfs4_client *clp)
+{
+ if (client_tracking_ops)
+ client_tracking_ops->create(clp);
+}
+
+void
+nfsd4_client_record_remove(struct nfs4_client *clp)
+{
+ if (client_tracking_ops)
+ client_tracking_ops->remove(clp);
+}
+
+int
+nfsd4_client_record_check(struct nfs4_client *clp)
+{
+ if (client_tracking_ops)
+ return client_tracking_ops->check(clp);
+
+ return -EOPNOTSUPP;
+}
+
+void
+nfsd4_record_grace_done(struct net *net, time_t boot_time)
+{
+ if (client_tracking_ops)
+ client_tracking_ops->grace_done(net, boot_time);
+}
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 534f82f1..67269a0 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2046,7 +2046,7 @@ nfsd4_reclaim_complete(struct svc_rqst *rqstp, struct nfsd4_compound_state *csta
goto out;

status = nfs_ok;
- nfsd4_create_clid_dir(cstate->session->se_client);
+ nfsd4_client_record_create(cstate->session->se_client);
out:
nfs4_unlock_state();
return status;
@@ -2241,7 +2241,7 @@ nfsd4_setclientid_confirm(struct svc_rqst *rqstp,
conf = find_confirmed_client_by_str(unconf->cl_recdir,
hash);
if (conf) {
- nfsd4_remove_clid_dir(conf);
+ nfsd4_client_record_remove(conf);
expire_client(conf);
}
move_to_confirmed(unconf);
@@ -3066,7 +3066,7 @@ static void
nfsd4_end_grace(void)
{
dprintk("NFSD: end of grace period\n");
- nfsd4_recdir_purge_old();
+ nfsd4_record_grace_done(&init_net, boot_time);
locks_end_grace(&nfsd4_manager);
/*
* Now that every NFSv4 client has had the chance to recover and
@@ -3115,7 +3115,7 @@ nfs4_laundromat(void)
clp = list_entry(pos, struct nfs4_client, cl_lru);
dprintk("NFSD: purging unused client (clientid %08x)\n",
clp->cl_clientid.cl_id);
- nfsd4_remove_clid_dir(clp);
+ nfsd4_client_record_remove(clp);
expire_client(clp);
}
spin_lock(&recall_lock);
@@ -3539,7 +3539,7 @@ nfsd4_open_confirm(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
dprintk("NFSD: %s: success, seqid=%d stateid=" STATEID_FMT "\n",
__func__, oc->oc_seqid, STATEID_VAL(&stp->st_stid.sc_stateid));

- nfsd4_create_clid_dir(oo->oo_owner.so_client);
+ nfsd4_client_record_create(oo->oo_owner.so_client);
status = nfs_ok;
out:
if (!cstate->replay_owner)
@@ -4379,7 +4379,7 @@ nfs4_client_to_reclaim(const char *name)
return 1;
}

-static void
+void
nfs4_release_reclaim(void)
{
struct nfs4_client_reclaim *crp = NULL;
@@ -4399,7 +4399,7 @@ nfs4_release_reclaim(void)

/*
* called from OPEN, CLAIM_PREVIOUS with a new clientid. */
-static struct nfs4_client_reclaim *
+struct nfs4_client_reclaim *
nfsd4_find_reclaim_client(struct nfs4_client *clp)
{
unsigned int strhashval;
@@ -4419,22 +4419,6 @@ nfsd4_find_reclaim_client(struct nfs4_client *clp)
return NULL;
}

-static int
-nfsd4_client_record_check(struct nfs4_client *clp)
-{
- /* did we already find that this client is stable? */
- if (test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
- return 0;
-
- /* look for it in the reclaim hashtable otherwise */
- if (nfsd4_find_reclaim_client(clp)) {
- set_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags);
- return 0;
- }
-
- return -ENOENT;
-}
-
/*
* Called from OPEN. Look for clientid in reclaim list.
*/
@@ -4460,7 +4444,7 @@ void nfsd_forget_clients(u64 num)

nfs4_lock_state();
list_for_each_entry_safe(clp, next, &client_lru, cl_lru) {
- nfsd4_remove_clid_dir(clp);
+ nfsd4_client_record_remove(clp);
expire_client(clp);
if (++count == num)
break;
@@ -4595,19 +4579,6 @@ nfs4_state_init(void)
reclaim_str_hashtbl_size = 0;
}

-static void
-nfsd4_load_reboot_recovery_data(void)
-{
- int status;
-
- nfs4_lock_state();
- nfsd4_init_recdir();
- status = nfsd4_recdir_load();
- nfs4_unlock_state();
- if (status)
- printk("NFSD: Failure reading reboot recovery data\n");
-}
-
/*
* Since the lifetime of a delegation isn't limited to that of an open, a
* client may quite reasonably hang on to a delegation as long as it has
@@ -4636,7 +4607,15 @@ nfs4_state_start(void)
{
int ret;

- nfsd4_load_reboot_recovery_data();
+ /*
+ * FIXME: For now, we hang most of the pernet global stuff off of
+ * init_net until nfsd is fully containerized. Eventually, we'll
+ * need to pass a net pointer into this function, take a reference
+ * to that instead and then do most of the rest of this on a per-net
+ * basis.
+ */
+ get_net(&init_net);
+ nfsd4_client_tracking_init(&init_net);
boot_time = get_seconds();
locks_start_grace(&nfsd4_manager);
printk(KERN_INFO "NFSD: starting %ld-second grace period\n",
@@ -4660,8 +4639,8 @@ nfs4_state_start(void)
out_free_laundry:
destroy_workqueue(laundry_wq);
out_recovery:
- nfs4_release_reclaim();
- nfsd4_shutdown_recdir();
+ nfsd4_client_tracking_exit(&init_net);
+ put_net(&init_net);
return ret;
}

@@ -4695,7 +4674,8 @@ __nfs4_state_shutdown(void)
unhash_delegation(dp);
}

- nfsd4_shutdown_recdir();
+ nfsd4_client_tracking_exit(&init_net);
+ put_net(&init_net);
}

void
@@ -4705,7 +4685,6 @@ nfs4_state_shutdown(void)
destroy_workqueue(laundry_wq);
locks_end_grace(&nfsd4_manager);
nfs4_lock_state();
- nfs4_release_reclaim();
__nfs4_state_shutdown();
nfs4_unlock_state();
nfsd4_destroy_callback_queue();
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index b9c036d..a9ab5e0 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -466,6 +466,8 @@ extern __be32 nfs4_preprocess_stateid_op(struct nfsd4_compound_state *cstate,
extern void nfs4_lock_state(void);
extern void nfs4_unlock_state(void);
extern int nfs4_in_grace(void);
+extern void nfs4_release_reclaim(void);
+extern struct nfs4_client_reclaim *nfsd4_find_reclaim_client(struct nfs4_client *crp);
extern __be32 nfs4_check_open_reclaim(clientid_t *clid);
extern void nfs4_free_openowner(struct nfs4_openowner *);
extern void nfs4_free_lockowner(struct nfs4_lockowner *);
@@ -480,16 +482,17 @@ extern void nfsd4_destroy_callback_queue(void);
extern void nfsd4_shutdown_callback(struct nfs4_client *);
extern void nfs4_put_delegation(struct nfs4_delegation *dp);
extern __be32 nfs4_make_rec_clidname(char *clidname, struct xdr_netobj *clname);
-extern void nfsd4_init_recdir(void);
-extern int nfsd4_recdir_load(void);
-extern void nfsd4_shutdown_recdir(void);
extern int nfs4_client_to_reclaim(const char *name);
extern int nfs4_has_reclaimed_state(const char *name, bool use_exchange_id);
-extern void nfsd4_recdir_purge_old(void);
-extern void nfsd4_create_clid_dir(struct nfs4_client *clp);
-extern void nfsd4_remove_clid_dir(struct nfs4_client *clp);
extern void release_session_client(struct nfsd4_session *);
extern __be32 nfs4_validate_stateid(struct nfs4_client *, stateid_t *);
extern void nfsd4_purge_closed_stateid(struct nfs4_stateowner *);

+/* nfs4recover operations */
+extern int nfsd4_client_tracking_init(struct net *net);
+extern void nfsd4_client_tracking_exit(struct net *net);
+extern void nfsd4_client_record_create(struct nfs4_client *clp);
+extern void nfsd4_client_record_remove(struct nfs4_client *clp);
+extern int nfsd4_client_record_check(struct nfs4_client *clp);
+extern void nfsd4_record_grace_done(struct net *net, time_t boot_time);
#endif /* NFSD4_STATE_H */
--
1.7.7.6


2012-03-21 13:52:18

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v10 3/8] sunrpc: create nfsd dir in rpc_pipefs

Add a new top-level dir in rpc_pipefs to hold the pipe for the clientid
upcall.

Signed-off-by: Jeff Layton <[email protected]>
---
net/sunrpc/rpc_pipe.c | 5 +++++
1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index 8584ec0..e651e1b 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -1014,6 +1014,7 @@ enum {
RPCAUTH_statd,
RPCAUTH_nfsd4_cb,
RPCAUTH_cache,
+ RPCAUTH_nfsd,
RPCAUTH_RootEOF
};

@@ -1046,6 +1047,10 @@ static const struct rpc_filelist files[] = {
.name = "cache",
.mode = S_IFDIR | S_IRUGO | S_IXUGO,
},
+ [RPCAUTH_nfsd] = {
+ .name = "nfsd",
+ .mode = S_IFDIR | S_IRUGO | S_IXUGO,
+ },
};

/*
--
1.7.7.6


2012-03-23 15:22:22

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v10 3/8] sunrpc: create nfsd dir in rpc_pipefs

On Fri, Mar 23, 2012 at 03:20:21PM +0000, Myklebust, Trond wrote:
> On Fri, 2012-03-23 at 09:31 -0400, J. Bruce Fields wrote:
> > On Fri, Mar 23, 2012 at 08:12:08AM -0400, J. Bruce Fields wrote:
> > > On Wed, Mar 21, 2012 at 09:52:04AM -0400, Jeff Layton wrote:
> > > > Add a new top-level dir in rpc_pipefs to hold the pipe for the clientid
> > > > upcall.
> > >
> > > After applying this patch, my tests consistently hang. The hang happens
> > > in excltest (of the special connectaton tests), over nfs4.1 and krb5.
> > > Looking at the wire traffic, I'm seeing DELAY returned from a setattr
> > > for mode on a newly-created (with EXCLUSIVE4_1) file. That open got a
> > > delegation, so presumably that's what's causing the DELAY, though I'm
> > > not seeing the server send a recall. That could be a krb5 bug.
> > >
> > > Whatever bug there is here, it's hard to tell why this patch in
> > > particular would make it more likely.
> > >
> > > So, still investigating!
> >
> > Reproduceable by:
> >
> > mount -osec=krb5,minorversion=1 server:/export/ /mnt/
> > cp cthon04/special/excltest /mnt/
> > cd /mnt
> > ./excltest
>
> Umm... When would you ever get a DELAY in the above scenario? I can see
> getting an NFS4ERR_OPENMODE, but not DELAY.

There's a setattr for mode right after the open. Is that unexpected?

The server doesn't really have to recall the delegation in that case (it
only needs to recall *other* clients' delegations) but I don't think
it's wrong to.

--b.

2012-03-21 13:52:17

by Jeff Layton

[permalink] [raw]
Subject: [PATCH v10 2/8] nfsd: add nfsd4_client_tracking_ops struct and a way to set it

Abstract out the mechanism that we use to track clients into a set of
client name tracking functions.

This gives us a mechanism to plug in a new set of client tracking
functions without disturbing the callers. It also gives us a way to
decide on what tracking scheme to use at runtime.

For now, this just looks like pointless abstraction, but later we'll
add a new alternate scheme for tracking clients on stable storage.

Note too that this patch anticipates the eventual containerization
of this code by passing in struct net pointers in places. No attempt
is made to containerize the legacy client tracker however.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4recover.c | 136 +++++++++++++++++++++++++++++++++++++++++++++----
fs/nfsd/nfs4state.c | 61 +++++++---------------
fs/nfsd/state.h | 15 +++--
3 files changed, 155 insertions(+), 57 deletions(-)

diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index 6523809..e616f88 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -43,9 +43,20 @@

#define NFSDDBG_FACILITY NFSDDBG_PROC

+/* Declarations */
+struct nfsd4_client_tracking_ops {
+ int (*init)(struct net *);
+ void (*exit)(struct net *);
+ void (*create)(struct nfs4_client *);
+ void (*remove)(struct nfs4_client *);
+ int (*check)(struct nfs4_client *);
+ void (*grace_done)(struct net *, time_t);
+};
+
/* Globals */
static struct file *rec_file;
static char user_recovery_dirname[PATH_MAX] = "/var/lib/nfs/v4recovery";
+static struct nfsd4_client_tracking_ops *client_tracking_ops;

static int
nfs4_save_creds(const struct cred **original_creds)
@@ -117,7 +128,8 @@ out_no_tfm:
return status;
}

-void nfsd4_create_clid_dir(struct nfs4_client *clp)
+static void
+nfsd4_create_clid_dir(struct nfs4_client *clp)
{
const struct cred *original_cred;
char *dname = clp->cl_recdir;
@@ -264,7 +276,7 @@ out_unlock:
return status;
}

-void
+static void
nfsd4_remove_clid_dir(struct nfs4_client *clp)
{
const struct cred *original_cred;
@@ -291,7 +303,6 @@ out:
if (status)
printk("NFSD: Failed to remove expired client state directory"
" %.*s\n", HEXDIR_LEN, clp->cl_recdir);
- return;
}

static int
@@ -310,8 +321,9 @@ purge_old(struct dentry *parent, struct dentry *child)
return 0;
}

-void
-nfsd4_recdir_purge_old(void) {
+static void
+nfsd4_recdir_purge_old(struct net *net, time_t boot_time)
+{
int status;

if (!rec_file)
@@ -342,7 +354,7 @@ load_recdir(struct dentry *parent, struct dentry *child)
return 0;
}

-int
+static int
nfsd4_recdir_load(void) {
int status;

@@ -360,8 +372,8 @@ nfsd4_recdir_load(void) {
* Hold reference to the recovery directory.
*/

-void
-nfsd4_init_recdir()
+static int
+nfsd4_init_recdir(void)
{
const struct cred *original_cred;
int status;
@@ -376,20 +388,37 @@ nfsd4_init_recdir()
printk("NFSD: Unable to change credentials to find recovery"
" directory: error %d\n",
status);
- return;
+ return status;
}

rec_file = filp_open(user_recovery_dirname, O_RDONLY | O_DIRECTORY, 0);
if (IS_ERR(rec_file)) {
printk("NFSD: unable to find recovery directory %s\n",
user_recovery_dirname);
+ status = PTR_ERR(rec_file);
rec_file = NULL;
}

nfs4_reset_creds(original_cred);
+ return status;
}

-void
+static int
+nfsd4_load_reboot_recovery_data(struct net *net)
+{
+ int status;
+
+ nfs4_lock_state();
+ status = nfsd4_init_recdir();
+ if (!status)
+ status = nfsd4_recdir_load();
+ nfs4_unlock_state();
+ if (status)
+ printk(KERN_ERR "NFSD: Failure reading reboot recovery data\n");
+ return status;
+}
+
+static void
nfsd4_shutdown_recdir(void)
{
if (!rec_file)
@@ -398,6 +427,13 @@ nfsd4_shutdown_recdir(void)
rec_file = NULL;
}

+static void
+nfsd4_legacy_tracking_exit(struct net *net)
+{
+ nfs4_release_reclaim();
+ nfsd4_shutdown_recdir();
+}
+
/*
* Change the NFSv4 recovery directory to recdir.
*/
@@ -424,3 +460,83 @@ nfs4_recoverydir(void)
{
return user_recovery_dirname;
}
+
+static int
+nfsd4_check_legacy_client(struct nfs4_client *clp)
+{
+ /* did we already find that this client is stable? */
+ if (test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
+ return 0;
+
+ /* look for it in the reclaim hashtable otherwise */
+ if (nfsd4_find_reclaim_client(clp)) {
+ set_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags);
+ return 0;
+ }
+
+ return -ENOENT;
+}
+
+static struct nfsd4_client_tracking_ops nfsd4_legacy_tracking_ops = {
+ .init = nfsd4_load_reboot_recovery_data,
+ .exit = nfsd4_legacy_tracking_exit,
+ .create = nfsd4_create_clid_dir,
+ .remove = nfsd4_remove_clid_dir,
+ .check = nfsd4_check_legacy_client,
+ .grace_done = nfsd4_recdir_purge_old,
+};
+
+int
+nfsd4_client_tracking_init(struct net *net)
+{
+ int status;
+
+ client_tracking_ops = &nfsd4_legacy_tracking_ops;
+
+ status = client_tracking_ops->init(net);
+ if (status) {
+ printk(KERN_WARNING "NFSD: Unable to initialize client "
+ "recovery tracking! (%d)\n", status);
+ client_tracking_ops = NULL;
+ }
+ return status;
+}
+
+void
+nfsd4_client_tracking_exit(struct net *net)
+{
+ if (client_tracking_ops) {
+ client_tracking_ops->exit(net);
+ client_tracking_ops = NULL;
+ }
+}
+
+void
+nfsd4_client_record_create(struct nfs4_client *clp)
+{
+ if (client_tracking_ops)
+ client_tracking_ops->create(clp);
+}
+
+void
+nfsd4_client_record_remove(struct nfs4_client *clp)
+{
+ if (client_tracking_ops)
+ client_tracking_ops->remove(clp);
+}
+
+int
+nfsd4_client_record_check(struct nfs4_client *clp)
+{
+ if (client_tracking_ops)
+ return client_tracking_ops->check(clp);
+
+ return -EOPNOTSUPP;
+}
+
+void
+nfsd4_record_grace_done(struct net *net, time_t boot_time)
+{
+ if (client_tracking_ops)
+ client_tracking_ops->grace_done(net, boot_time);
+}
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 534f82f1..fc65412 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2046,7 +2046,7 @@ nfsd4_reclaim_complete(struct svc_rqst *rqstp, struct nfsd4_compound_state *csta
goto out;

status = nfs_ok;
- nfsd4_create_clid_dir(cstate->session->se_client);
+ nfsd4_client_record_create(cstate->session->se_client);
out:
nfs4_unlock_state();
return status;
@@ -2241,7 +2241,7 @@ nfsd4_setclientid_confirm(struct svc_rqst *rqstp,
conf = find_confirmed_client_by_str(unconf->cl_recdir,
hash);
if (conf) {
- nfsd4_remove_clid_dir(conf);
+ nfsd4_client_record_remove(conf);
expire_client(conf);
}
move_to_confirmed(unconf);
@@ -3066,7 +3066,7 @@ static void
nfsd4_end_grace(void)
{
dprintk("NFSD: end of grace period\n");
- nfsd4_recdir_purge_old();
+ nfsd4_record_grace_done(&init_net, boot_time);
locks_end_grace(&nfsd4_manager);
/*
* Now that every NFSv4 client has had the chance to recover and
@@ -3115,7 +3115,7 @@ nfs4_laundromat(void)
clp = list_entry(pos, struct nfs4_client, cl_lru);
dprintk("NFSD: purging unused client (clientid %08x)\n",
clp->cl_clientid.cl_id);
- nfsd4_remove_clid_dir(clp);
+ nfsd4_client_record_remove(clp);
expire_client(clp);
}
spin_lock(&recall_lock);
@@ -3539,7 +3539,7 @@ nfsd4_open_confirm(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
dprintk("NFSD: %s: success, seqid=%d stateid=" STATEID_FMT "\n",
__func__, oc->oc_seqid, STATEID_VAL(&stp->st_stid.sc_stateid));

- nfsd4_create_clid_dir(oo->oo_owner.so_client);
+ nfsd4_client_record_create(oo->oo_owner.so_client);
status = nfs_ok;
out:
if (!cstate->replay_owner)
@@ -4379,7 +4379,7 @@ nfs4_client_to_reclaim(const char *name)
return 1;
}

-static void
+void
nfs4_release_reclaim(void)
{
struct nfs4_client_reclaim *crp = NULL;
@@ -4399,7 +4399,7 @@ nfs4_release_reclaim(void)

/*
* called from OPEN, CLAIM_PREVIOUS with a new clientid. */
-static struct nfs4_client_reclaim *
+struct nfs4_client_reclaim *
nfsd4_find_reclaim_client(struct nfs4_client *clp)
{
unsigned int strhashval;
@@ -4419,22 +4419,6 @@ nfsd4_find_reclaim_client(struct nfs4_client *clp)
return NULL;
}

-static int
-nfsd4_client_record_check(struct nfs4_client *clp)
-{
- /* did we already find that this client is stable? */
- if (test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
- return 0;
-
- /* look for it in the reclaim hashtable otherwise */
- if (nfsd4_find_reclaim_client(clp)) {
- set_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags);
- return 0;
- }
-
- return -ENOENT;
-}
-
/*
* Called from OPEN. Look for clientid in reclaim list.
*/
@@ -4595,19 +4579,6 @@ nfs4_state_init(void)
reclaim_str_hashtbl_size = 0;
}

-static void
-nfsd4_load_reboot_recovery_data(void)
-{
- int status;
-
- nfs4_lock_state();
- nfsd4_init_recdir();
- status = nfsd4_recdir_load();
- nfs4_unlock_state();
- if (status)
- printk("NFSD: Failure reading reboot recovery data\n");
-}
-
/*
* Since the lifetime of a delegation isn't limited to that of an open, a
* client may quite reasonably hang on to a delegation as long as it has
@@ -4636,7 +4607,15 @@ nfs4_state_start(void)
{
int ret;

- nfsd4_load_reboot_recovery_data();
+ /*
+ * FIXME: For now, we hang most of the pernet global stuff off of
+ * init_net until nfsd is fully containerized. Eventually, we'll
+ * need to pass a net pointer into this function, take a reference
+ * to that instead and then do most of the rest of this on a per-net
+ * basis.
+ */
+ get_net(&init_net);
+ nfsd4_client_tracking_init(&init_net);
boot_time = get_seconds();
locks_start_grace(&nfsd4_manager);
printk(KERN_INFO "NFSD: starting %ld-second grace period\n",
@@ -4660,8 +4639,8 @@ nfs4_state_start(void)
out_free_laundry:
destroy_workqueue(laundry_wq);
out_recovery:
- nfs4_release_reclaim();
- nfsd4_shutdown_recdir();
+ nfsd4_client_tracking_exit(&init_net);
+ put_net(&init_net);
return ret;
}

@@ -4695,7 +4674,8 @@ __nfs4_state_shutdown(void)
unhash_delegation(dp);
}

- nfsd4_shutdown_recdir();
+ nfsd4_client_tracking_exit(&init_net);
+ put_net(&init_net);
}

void
@@ -4705,7 +4685,6 @@ nfs4_state_shutdown(void)
destroy_workqueue(laundry_wq);
locks_end_grace(&nfsd4_manager);
nfs4_lock_state();
- nfs4_release_reclaim();
__nfs4_state_shutdown();
nfs4_unlock_state();
nfsd4_destroy_callback_queue();
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index b9c036d..a9ab5e0 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -466,6 +466,8 @@ extern __be32 nfs4_preprocess_stateid_op(struct nfsd4_compound_state *cstate,
extern void nfs4_lock_state(void);
extern void nfs4_unlock_state(void);
extern int nfs4_in_grace(void);
+extern void nfs4_release_reclaim(void);
+extern struct nfs4_client_reclaim *nfsd4_find_reclaim_client(struct nfs4_client *crp);
extern __be32 nfs4_check_open_reclaim(clientid_t *clid);
extern void nfs4_free_openowner(struct nfs4_openowner *);
extern void nfs4_free_lockowner(struct nfs4_lockowner *);
@@ -480,16 +482,17 @@ extern void nfsd4_destroy_callback_queue(void);
extern void nfsd4_shutdown_callback(struct nfs4_client *);
extern void nfs4_put_delegation(struct nfs4_delegation *dp);
extern __be32 nfs4_make_rec_clidname(char *clidname, struct xdr_netobj *clname);
-extern void nfsd4_init_recdir(void);
-extern int nfsd4_recdir_load(void);
-extern void nfsd4_shutdown_recdir(void);
extern int nfs4_client_to_reclaim(const char *name);
extern int nfs4_has_reclaimed_state(const char *name, bool use_exchange_id);
-extern void nfsd4_recdir_purge_old(void);
-extern void nfsd4_create_clid_dir(struct nfs4_client *clp);
-extern void nfsd4_remove_clid_dir(struct nfs4_client *clp);
extern void release_session_client(struct nfsd4_session *);
extern __be32 nfs4_validate_stateid(struct nfs4_client *, stateid_t *);
extern void nfsd4_purge_closed_stateid(struct nfs4_stateowner *);

+/* nfs4recover operations */
+extern int nfsd4_client_tracking_init(struct net *net);
+extern void nfsd4_client_tracking_exit(struct net *net);
+extern void nfsd4_client_record_create(struct nfs4_client *clp);
+extern void nfsd4_client_record_remove(struct nfs4_client *clp);
+extern int nfsd4_client_record_check(struct nfs4_client *clp);
+extern void nfsd4_record_grace_done(struct net *net, time_t boot_time);
#endif /* NFSD4_STATE_H */
--
1.7.7.6


2012-03-23 17:06:31

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v10 0/8] nfsd: overhaul the client name tracking code

On Wed, Mar 21, 2012 at 09:52:01AM -0400, Jeff Layton wrote:
> This is the tenth iteration of this patchset. The primary motivation
> of this respin is to fix up some merge conflicts with some fixes that
> Bruce merged recently.

Thanks, these look fine. I'm intending to merge for 3.4. But I going
want to spend some more time investigating the callback bug that
excltest is triggering before committing to my for-3.4 branch.

--b.

>
> For those who haven't followed along on the last few iterations, this
> patchset also begins the "containerization" of nfsd. It introduces a
> per-ns object that I envision growing over time as we make more of the
> nfsd code namespace aware.
>
> I've also rolled in the patch to convert the cl_cb_flags to a generic
> flags field since that's a prerequisite, and added a patch to ensure
> that no one tries to use the legacy client tracking code in anything but
> the init_net namespace.
>
> I'd like to see this go into 3.4 if possible...
>
> Thanks,
>
> Jeff Layton (8):
> nfsd: convert nfs4_client->cl_cb_flags to a generic flags field
> nfsd: add nfsd4_client_tracking_ops struct and a way to set it
> sunrpc: create nfsd dir in rpc_pipefs
> nfsd: add a per-net-namespace struct for nfsd
> nfsd: add a header describing upcall to nfsdcld
> nfsd: add the infrastructure to handle the cld upcall
> nfsd: add notifier to handle mount/unmount of rpc_pipefs sb
> nfsd: don't allow legacy client tracker init for anything but
> init_net
>
> fs/nfsd/netns.h | 35 +++
> fs/nfsd/nfs4callback.c | 14 +-
> fs/nfsd/nfs4proc.c | 3 +-
> fs/nfsd/nfs4recover.c | 636 +++++++++++++++++++++++++++++++++++++++++++++-
> fs/nfsd/nfs4state.c | 74 +++---
> fs/nfsd/nfsctl.c | 22 ++-
> fs/nfsd/state.h | 26 ++-
> include/linux/nfsd/cld.h | 56 ++++
> net/sunrpc/rpc_pipe.c | 5 +
> 9 files changed, 796 insertions(+), 75 deletions(-)
> create mode 100644 fs/nfsd/netns.h
> create mode 100644 include/linux/nfsd/cld.h
>
> --
> 1.7.7.6
>