2010-03-10 00:26:51

by J.Bruce Fields

Subject: allowing client to change callback path

The 4.0 spec allows clients to change the callback connection parameters
with an appropriate SETCLIENTID/SETCLIENTID_CONFIRM, but we've never
implemented that. Here's an attempt.

The real goal is to later use this to implement BIND_CONN_TO_SESSION.
Since that's the only way I see for a client to recover the backchannel
after a tcp connection goes down without losing the session state, I figure 4.1
clients may be more likely to need this than 4.0 clients?

The first 7 patches are probably ready to go. The first two are to
client-side code. Trond, if you could ACK them, I'll queue them up in
the server tree. (Actually, the first could go now if it looks right.)

The rest of the patches are closer to an RFC, and are barely tested. I
avoid sleeping in the cb_recall callback by deferring the real work to a
workqueue, and then use the same trick to requeue a callback after the
callback client changes.
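
For reference, the deferral amounts to roughly the following (a condensed
sketch of what patches 8 and 11 below actually do; locking and error
handling omitted):

	static struct workqueue_struct *callback_wq;

	/* Runs in workqueue (process) context, where the rpc machinery is
	 * free to sleep: */
	void nfsd4_do_callback_rpc(struct work_struct *w)
	{
		struct nfsd4_callback *c = container_of(w, struct nfsd4_callback, cb_work);
		struct nfs4_delegation *dp = container_of(c, struct nfs4_delegation, dl_recall);

		_nfsd4_cb_recall(dp);
	}

	/* Called from the lease-break path, which must not sleep: */
	void nfsd4_cb_recall(struct nfs4_delegation *dp)
	{
		queue_work(callback_wq, &dp->dl_recall.cb_work);
	}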

--b.


2010-03-18 15:06:35

by J.Bruce Fields

Subject: Re: [PATCH 02/11] sunrpc: take reference to bc_xprt in sunrpc layer

On Wed, Mar 10, 2010 at 02:05:45PM -0500, J. Bruce Fields wrote:
> On Tue, Mar 09, 2010 at 08:03:00PM -0500, Trond Myklebust wrote:
> > On Tue, 2010-03-09 at 19:28 -0500, J. Bruce Fields wrote:
> > > Currently we're requiring whoever holds the reference to a client set up
> > > on the backchannel to also hold a reference to the forechannel
> > > transport; but this should be the responsibility of the lower-level
> > > code.
...
> > > @@ -2552,6 +2555,7 @@ static struct rpc_xprt *xs_setup_bc_tcp(struct xprt_create *args)
> > > return xprt;
> > > ret = ERR_PTR(-EINVAL);
> > > out_err:
> > > + svc_xprt_put(xprt->bc_xprt);
> > > kfree(xprt->slot);
> > > kfree(xprt);
> > > return ret;
> >
> > OK... This makes my head spin.
> >
> > Why should the back channel socket hold a reference to the svc_sock? The
> > process that owns both of these things is an RPC server process. It
> > already holds a reference to the svc_sock and presumably also to the
> > back channel socket.
>
> Only while it's actually processing an rpc, I assume.
>
> I'll take a harder look at what should be the lifetime of the server
> socket that the backchannel requests use.
>
> By the way, we also need to make sure the nfs server code gets some kind
> of call when the client closes the connection, so we can set
> callback-down session flags appropriately.

One way to do that would be for nfsd to register a callback to be called
from svc_delete_xprt(). Then in that callback nfsd can set those
session flags appropriately, but it can also shut down the rpc client.
That ensures the rpc client (and its xprt) go away before the svc_xprt,
so we no longer need to hold a reference.

(Actually there's nothing preventing multiple backchannels from sharing
the same connection, so we'd need a list of callbacks.)
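
For concreteness, the nfsd side of that callback might end up looking
something like this (just a sketch; nfsd4_backchannel_gone is a made-up
name and none of this exists yet):

	/* Hypothetical hook, run when the svc_xprt carrying the backchannel
	 * is deleted: */
	static void nfsd4_backchannel_gone(struct nfs4_client *clp)
	{
		/* Mark the callback path down; for 4.1 this is what should
		 * eventually drive SEQ4_STATUS_CB_PATH_DOWN in SEQUENCE
		 * replies: */
		atomic_set(&clp->cl_cb_set, 0);

		/* Shut down the rpc client (and hence its rpc_xprt) before
		 * the svc_xprt is freed, so no extra reference is needed: */
		nfsd4_set_callback_client(clp, NULL);
	}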

--b.

2010-03-10 00:26:51

by J.Bruce Fields

Subject: [PATCH 05/11] nfsd4: shutdown callbacks on expiry

Once we've expired the client, there's no further purpose to the
callbacks; go ahead and shut down the callback client rather than
waiting for the last reference to go.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 2d1ee98..9762fc0 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -697,7 +697,6 @@ shutdown_callback_client(struct nfs4_client *clp)
static inline void
free_client(struct nfs4_client *clp)
{
- shutdown_callback_client(clp);
if (clp->cl_cred.cr_group_info)
put_group_info(clp->cl_cred.cr_group_info);
kfree(clp->cl_principal);
@@ -750,6 +749,7 @@ expire_client(struct nfs4_client *clp)
se_perclnt);
release_session(ses);
}
+ shutdown_callback_client(clp);
put_nfs4_client(clp);
}

--
1.6.3.3


2010-03-10 00:26:51

by J.Bruce Fields

Subject: [PATCH 03/11] nfsd4: don't store cb_xprt in client

We're storing this cb_xprt in the client just to pass it to the rpc
client creation code; we don't actually use it outside of the function
where we set cl_cb_xprt. So just treat it as an argument to the
backchannel creation code. Note we don't need to keep it around just to
hold a reference count, since the rpc code is doing that for us now.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4callback.c | 8 ++++----
fs/nfsd/nfs4state.c | 8 ++------
fs/nfsd/state.h | 3 +--
3 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index ed12ad4..17cd439 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -461,7 +461,7 @@ static int max_cb_time(void)
/* Reference counting, callback cleanup, etc., all look racy as heck.
* And why is cb_set an atomic? */

-int setup_callback_client(struct nfs4_client *clp)
+int setup_callback_client(struct nfs4_client *clp, struct svc_xprt *cb_xprt)
{
struct nfs4_cb_conn *cb = &clp->cl_cb_conn;
struct rpc_timeout timeparms = {
@@ -485,7 +485,7 @@ int setup_callback_client(struct nfs4_client *clp)
if (!clp->cl_principal && (clp->cl_flavor >= RPC_AUTH_GSS_KRB5))
return -EINVAL;
if (cb->cb_minorversion) {
- args.bc_xprt = clp->cl_cb_xprt;
+ args.bc_xprt = cb_xprt;
args.protocol = XPRT_TRANSPORT_BC_TCP;
}
/* Create RPC client */
@@ -557,13 +557,13 @@ void do_probe_callback(struct nfs4_client *clp)
* Set up the callback client and put a NFSPROC4_CB_NULL on the wire...
*/
void
-nfsd4_probe_callback(struct nfs4_client *clp)
+nfsd4_probe_callback(struct nfs4_client *clp, struct svc_xprt *cb_xprt)
{
int status;

BUG_ON(atomic_read(&clp->cl_cb_conn.cb_set));

- status = setup_callback_client(clp);
+ status = setup_callback_client(clp, cb_xprt);
if (status) {
warn_no_callback_path(clp, status);
return;
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index efef7f2..2d1ee98 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -698,8 +698,6 @@ static inline void
free_client(struct nfs4_client *clp)
{
shutdown_callback_client(clp);
- if (clp->cl_cb_xprt)
- svc_xprt_put(clp->cl_cb_xprt);
if (clp->cl_cred.cr_group_info)
put_group_info(clp->cl_cred.cr_group_info);
kfree(clp->cl_principal);
@@ -1333,8 +1331,6 @@ nfsd4_create_session(struct svc_rqst *rqstp,
cr_ses->flags &= ~SESSION4_RDMA;

if (cr_ses->flags & SESSION4_BACK_CHAN) {
- unconf->cl_cb_xprt = rqstp->rq_xprt;
- svc_xprt_get(unconf->cl_cb_xprt);
rpc_copy_addr(
(struct sockaddr *)&unconf->cl_cb_conn.cb_addr,
sa);
@@ -1343,7 +1339,7 @@ nfsd4_create_session(struct svc_rqst *rqstp,
cstate->minorversion;
unconf->cl_cb_conn.cb_prog = cr_ses->callback_prog;
unconf->cl_cb_seq_nr = 1;
- nfsd4_probe_callback(unconf);
+ nfsd4_probe_callback(unconf, rqstp->rq_xprt);
}
conf = unconf;
} else {
@@ -1666,7 +1662,7 @@ nfsd4_setclientid_confirm(struct svc_rqst *rqstp,
}
move_to_confirmed(unconf);
conf = unconf;
- nfsd4_probe_callback(conf);
+ nfsd4_probe_callback(conf, NULL);
status = nfs_ok;
}
} else if ((!conf || (conf && !same_verf(&conf->cl_confirm, &confirm)))
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index fefeae2..b09943a 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -226,7 +226,6 @@ struct nfs4_client {
/* We currently support a single back channel with a single slot */
unsigned long cl_cb_slot_busy;
u32 cl_cb_seq_nr;
- struct svc_xprt *cl_cb_xprt; /* 4.1 callback transport */
struct rpc_wait_queue cl_cb_waitq; /* backchannel callers may */
/* wait here for slots */
};
@@ -380,7 +379,7 @@ extern __be32 nfs4_check_open_reclaim(clientid_t *clid);
extern void put_nfs4_client(struct nfs4_client *clp);
extern void nfs4_free_stateowner(struct kref *kref);
extern int set_callback_cred(void);
-extern void nfsd4_probe_callback(struct nfs4_client *clp);
+extern void nfsd4_probe_callback(struct nfs4_client *clp, struct svc_xprt *);
extern void nfsd4_cb_recall(struct nfs4_delegation *dp);
extern void nfs4_put_delegation(struct nfs4_delegation *dp);
extern __be32 nfs4_make_rec_clidname(char *clidname, struct xdr_netobj *clname);
--
1.6.3.3


2010-03-10 00:26:51

by J.Bruce Fields

Subject: [PATCH 09/11] nfsd: cl_count is unused

Now that the shutdown sequence guarantees callbacks are shut down before
the client is destroyed, we no longer have a use for cl_count.

We'll probably reinstate a reference count on the client some day, but
it will be held by users other than callbacks.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4callback.c | 6 +-----
fs/nfsd/nfs4state.c | 11 +----------
fs/nfsd/state.h | 2 --
3 files changed, 2 insertions(+), 17 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index a8592be..15e2f60 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -681,10 +681,8 @@ static void nfsd4_cb_recall_done(struct rpc_task *task, void *calldata)
static void nfsd4_cb_recall_release(void *calldata)
{
struct nfs4_delegation *dp = calldata;
- struct nfs4_client *clp = dp->dl_client;

nfs4_put_delegation(dp);
- put_nfs4_client(clp);
}

static const struct rpc_call_ops nfsd4_cb_recall_ops = {
@@ -746,10 +744,8 @@ static void _nfsd4_cb_recall(struct nfs4_delegation *dp)
dp->dl_retries = 1;
status = rpc_call_async(clnt, &msg, RPC_TASK_SOFT,
&nfsd4_cb_recall_ops, dp);
- if (status) {
- put_nfs4_client(clp);
+ if (status)
nfs4_put_delegation(dp);
- }
}

void nfsd4_do_callback_rpc(struct work_struct *w)
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 239a90c..7c74af6 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -690,13 +690,6 @@ free_client(struct nfs4_client *clp)
kfree(clp);
}

-void
-put_nfs4_client(struct nfs4_client *clp)
-{
- if (atomic_dec_and_test(&clp->cl_count))
- free_client(clp);
-}
-
static void
expire_client(struct nfs4_client *clp)
{
@@ -733,7 +726,7 @@ expire_client(struct nfs4_client *clp)
release_session(ses);
}
nfsd4_set_callback_client(clp, NULL);
- put_nfs4_client(clp);
+ free_client(clp);
}

static void copy_verf(struct nfs4_client *target, nfs4_verifier *source)
@@ -819,7 +812,6 @@ static struct nfs4_client *create_client(struct xdr_netobj name, char *recdir,
}

memcpy(clp->cl_recdir, recdir, HEXDIR_LEN);
- atomic_set(&clp->cl_count, 1);
atomic_set(&clp->cl_cb_conn.cb_set, 0);
INIT_LIST_HEAD(&clp->cl_idhash);
INIT_LIST_HEAD(&clp->cl_strhash);
@@ -2006,7 +1998,6 @@ void nfsd_break_deleg_cb(struct file_lock *fl)
* lock) we know the server hasn't removed the lease yet, we know
* it's safe to take a reference: */
atomic_inc(&dp->dl_count);
- atomic_inc(&dp->dl_client->cl_count);

spin_lock(&recall_lock);
list_add_tail(&dp->dl_recall_lru, &del_recall_lru);
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 92b0bf0..284bcf9 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -224,7 +224,6 @@ struct nfs4_client {
clientid_t cl_clientid; /* generated by server */
nfs4_verifier cl_confirm; /* generated by server */
struct nfs4_cb_conn cl_cb_conn; /* callback info */
- atomic_t cl_count; /* ref count */
u32 cl_firststate; /* recovery dir creation */

/* for nfs41 */
@@ -387,7 +386,6 @@ extern void nfs4_lock_state(void);
extern void nfs4_unlock_state(void);
extern int nfs4_in_grace(void);
extern __be32 nfs4_check_open_reclaim(clientid_t *clid);
-extern void put_nfs4_client(struct nfs4_client *clp);
extern void nfs4_free_stateowner(struct kref *kref);
extern int set_callback_cred(void);
extern void nfsd4_probe_callback(struct nfs4_client *clp, struct svc_xprt *);
--
1.6.3.3


2010-03-10 00:26:51

by J.Bruce Fields

Subject: [PATCH 02/11] sunrpc: take reference to bc_xprt in sunrpc layer

Currently we're requiring whoever holds the reference to a client set up
on the backchannel to also hold a reference to the forechannel
transport; but this should be the responsibility of the lower-level
code.

Signed-off-by: J. Bruce Fields <[email protected]>
Cc: Trond Myklebust <[email protected]>
---
net/sunrpc/xprtsock.c | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index a91f0a4..5adc000 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -792,6 +792,8 @@ static void xs_destroy(struct rpc_xprt *xprt)

xs_close(xprt);
xs_free_peer_addresses(xprt);
+ if (xprt->bc_xprt)
+ svc_xprt_put(xprt->bc_xprt);
kfree(xprt->slot);
kfree(xprt);
module_put(THIS_MODULE);
@@ -2510,6 +2512,7 @@ static struct rpc_xprt *xs_setup_bc_tcp(struct xprt_create *args)
* forechannel
*/
xprt->bc_xprt = args->bc_xprt;
+ svc_xprt_get(xprt->bc_xprt);
bc_sock = container_of(args->bc_xprt, struct svc_sock, sk_xprt);
bc_sock->sk_bc_xprt = xprt;
transport->sock = bc_sock->sk_sock;
@@ -2552,6 +2555,7 @@ static struct rpc_xprt *xs_setup_bc_tcp(struct xprt_create *args)
return xprt;
ret = ERR_PTR(-EINVAL);
out_err:
+ svc_xprt_put(xprt->bc_xprt);
kfree(xprt->slot);
kfree(xprt);
return ret;
--
1.6.3.3


2010-03-10 00:26:51

by J.Bruce Fields

Subject: [PATCH 06/11] nfsd4: remove dprintk

I haven't found this useful.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 3 ---
1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 9762fc0..1b2637b 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -718,9 +718,6 @@ expire_client(struct nfs4_client *clp)
struct nfs4_delegation *dp;
struct list_head reaplist;

- dprintk("NFSD: expire_client cl_count %d\n",
- atomic_read(&clp->cl_count));
-
INIT_LIST_HEAD(&reaplist);
spin_lock(&recall_lock);
while (!list_empty(&clp->cl_delegations)) {
--
1.6.3.3


2010-03-10 00:26:51

by J.Bruce Fields

Subject: [PATCH 04/11] nfsd4: preallocate nfs4_rpc_args

Instead of allocating this small structure, just include it in the
delegation.

The nfsd4_callback structure isn't really necessary yet, but we plan to
add to it all the information necessary to perform a callback.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4callback.c | 18 +++---------------
fs/nfsd/state.h | 10 ++++++++++
2 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 17cd439..40a9c20 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -78,11 +78,6 @@ enum nfs_cb_opnum4 {
cb_sequence_dec_sz + \
op_dec_sz)

-struct nfs4_rpc_args {
- void *args_op;
- struct nfsd4_cb_sequence args_seq;
-};
-
/*
* Generic encode routines from fs/nfs/nfs4xdr.c
*/
@@ -676,7 +671,7 @@ static void nfsd4_cb_recall_done(struct rpc_task *task, void *calldata)
break;
default:
/* success, or error we can't handle */
- goto done;
+ return;
}
if (dp->dl_retries--) {
rpc_delay(task, 2*HZ);
@@ -687,8 +682,6 @@ static void nfsd4_cb_recall_done(struct rpc_task *task, void *calldata)
atomic_set(&clp->cl_cb_conn.cb_set, 0);
warn_no_callback_path(clp, task->tk_status);
}
-done:
- kfree(task->tk_msg.rpc_argp);
}

static void nfsd4_cb_recall_release(void *calldata)
@@ -714,24 +707,19 @@ nfsd4_cb_recall(struct nfs4_delegation *dp)
{
struct nfs4_client *clp = dp->dl_client;
struct rpc_clnt *clnt = clp->cl_cb_conn.cb_client;
- struct nfs4_rpc_args *args;
+ struct nfs4_rpc_args *args = &dp->dl_recall.cb_args;
struct rpc_message msg = {
.rpc_proc = &nfs4_cb_procedures[NFSPROC4_CLNT_CB_RECALL],
.rpc_cred = callback_cred
};
- int status = -ENOMEM;
+ int status;

- args = kzalloc(sizeof(*args), GFP_KERNEL);
- if (!args)
- goto out;
args->args_op = dp;
msg.rpc_argp = args;
dp->dl_retries = 1;
status = rpc_call_async(clnt, &msg, RPC_TASK_SOFT,
&nfsd4_cb_recall_ops, dp);
-out:
if (status) {
- kfree(args);
put_nfs4_client(clp);
nfs4_put_delegation(dp);
}
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index b09943a..6e5b005 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -70,6 +70,15 @@ struct nfsd4_cb_sequence {
struct nfs4_client *cbs_clp;
};

+struct nfs4_rpc_args {
+ void *args_op;
+ struct nfsd4_cb_sequence args_seq;
+};
+
+struct nfsd4_callback {
+ struct nfs4_rpc_args cb_args;
+};
+
struct nfs4_delegation {
struct list_head dl_perfile;
struct list_head dl_perclnt;
@@ -86,6 +95,7 @@ struct nfs4_delegation {
stateid_t dl_stateid;
struct knfsd_fh dl_fh;
int dl_retries;
+ struct nfsd4_callback dl_recall;
};

/* client delegation callback info */
--
1.6.3.3


2010-03-10 00:26:51

by J.Bruce Fields

Subject: [PATCH 07/11] nfsd4: remove probe task's reference on client

Any null probe rpc will be synchronously destroyed by the
rpc_shutdown_client() in expire_client(), so the rpc task cannot outlast
the nfs4 client. Therefore there's no need for that task to hold a
reference on the client.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4callback.c | 9 +--------
1 files changed, 1 insertions(+), 8 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 40a9c20..7f84686 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -509,7 +509,6 @@ static void nfsd4_cb_probe_done(struct rpc_task *task, void *calldata)
warn_no_callback_path(clp, task->tk_status);
else
atomic_set(&clp->cl_cb_conn.cb_set, 1);
- put_nfs4_client(clp);
}

static const struct rpc_call_ops nfsd4_cb_probe_ops = {
@@ -542,10 +541,8 @@ void do_probe_callback(struct nfs4_client *clp)
status = rpc_call_async(cb->cb_client, &msg,
RPC_TASK_SOFT | RPC_TASK_SOFTCONN,
&nfsd4_cb_probe_ops, (void *)clp);
- if (status) {
+ if (status)
warn_no_callback_path(clp, status);
- put_nfs4_client(clp);
- }
}

/*
@@ -563,10 +560,6 @@ nfsd4_probe_callback(struct nfs4_client *clp, struct svc_xprt *cb_xprt)
warn_no_callback_path(clp, status);
return;
}
-
- /* the task holds a reference to the nfs4_client struct */
- atomic_inc(&clp->cl_count);
-
do_probe_callback(clp);
}

--
1.6.3.3


2010-03-10 00:26:51

by J.Bruce Fields

Subject: [PATCH 10/11] nfsd4: rearrange cb data structures

Mainly I just want to separate the arguments used for setting up the tcp
client from the rest.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4callback.c | 23 +++++++++++------------
fs/nfsd/nfs4state.c | 12 ++++++------
fs/nfsd/state.h | 9 +++++----
3 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 15e2f60..1d83ac6 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -455,7 +455,7 @@ static int max_cb_time(void)
}

/* Reference counting, callback cleanup, etc., all look racy as heck.
- * And why is cb_set an atomic? */
+ * And why is cl_cb_set an atomic? */

int setup_callback_client(struct nfs4_client *clp, struct svc_xprt *cb_xprt)
{
@@ -491,7 +491,7 @@ int setup_callback_client(struct nfs4_client *clp, struct svc_xprt *cb_xprt)
PTR_ERR(client));
return PTR_ERR(client);
}
- cb->cb_client = client;
+ clp->cl_cb_client = client;
return 0;

}
@@ -509,7 +509,7 @@ static void nfsd4_cb_probe_done(struct rpc_task *task, void *calldata)
if (task->tk_status)
warn_no_callback_path(clp, task->tk_status);
else
- atomic_set(&clp->cl_cb_conn.cb_set, 1);
+ atomic_set(&clp->cl_cb_set, 1);
}

static const struct rpc_call_ops nfsd4_cb_probe_ops = {
@@ -531,7 +531,6 @@ int set_callback_cred(void)

void do_probe_callback(struct nfs4_client *clp)
{
- struct nfs4_cb_conn *cb = &clp->cl_cb_conn;
struct rpc_message msg = {
.rpc_proc = &nfs4_cb_procedures[NFSPROC4_CLNT_CB_NULL],
.rpc_argp = clp,
@@ -539,7 +538,7 @@ void do_probe_callback(struct nfs4_client *clp)
};
int status;

- status = rpc_call_async(cb->cb_client, &msg,
+ status = rpc_call_async(clp->cl_cb_client, &msg,
RPC_TASK_SOFT | RPC_TASK_SOFTCONN,
&nfsd4_cb_probe_ops, (void *)clp);
if (status)
@@ -554,7 +553,7 @@ nfsd4_probe_callback(struct nfs4_client *clp, struct svc_xprt *cb_xprt)
{
int status;

- BUG_ON(atomic_read(&clp->cl_cb_conn.cb_set));
+ BUG_ON(atomic_read(&clp->cl_cb_set));

status = setup_callback_client(clp, cb_xprt);
if (status) {
@@ -656,7 +655,7 @@ static void nfsd4_cb_recall_done(struct rpc_task *task, void *calldata)
switch (task->tk_status) {
case -EIO:
/* Network partition? */
- atomic_set(&clp->cl_cb_conn.cb_set, 0);
+ atomic_set(&clp->cl_cb_set, 0);
warn_no_callback_path(clp, task->tk_status);
case -EBADHANDLE:
case -NFS4ERR_BAD_STATEID:
@@ -673,7 +672,7 @@ static void nfsd4_cb_recall_done(struct rpc_task *task, void *calldata)
rpc_restart_call(task);
return;
} else {
- atomic_set(&clp->cl_cb_conn.cb_set, 0);
+ atomic_set(&clp->cl_cb_set, 0);
warn_no_callback_path(clp, task->tk_status);
}
}
@@ -709,11 +708,11 @@ void nfsd4_destroy_callback_queue(void)
void nfsd4_set_callback_client(struct nfs4_client *clp, struct rpc_clnt
*new)
{
- struct rpc_clnt *old = clp->cl_cb_conn.cb_client;
+ struct rpc_clnt *old = clp->cl_cb_client;

- clp->cl_cb_conn.cb_client = new;
+ clp->cl_cb_client = new;
/*
- * After this, any work that saw the old value of cb_client will
+ * After this, any work that saw the old value of cl_cb_client will
* be gone:
*/
flush_workqueue(callback_wq);
@@ -728,7 +727,7 @@ void nfsd4_set_callback_client(struct nfs4_client *clp, struct rpc_clnt
static void _nfsd4_cb_recall(struct nfs4_delegation *dp)
{
struct nfs4_client *clp = dp->dl_client;
- struct rpc_clnt *clnt = clp->cl_cb_conn.cb_client;
+ struct rpc_clnt *clnt = clp->cl_cb_client;
struct nfs4_rpc_args *args = &dp->dl_recall.cb_args;
struct rpc_message msg = {
.rpc_proc = &nfs4_cb_procedures[NFSPROC4_CLNT_CB_RECALL],
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 7c74af6..89efd5e 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -812,7 +812,7 @@ static struct nfs4_client *create_client(struct xdr_netobj name, char *recdir,
}

memcpy(clp->cl_recdir, recdir, HEXDIR_LEN);
- atomic_set(&clp->cl_cb_conn.cb_set, 0);
+ atomic_set(&clp->cl_cb_set, 0);
INIT_LIST_HEAD(&clp->cl_idhash);
INIT_LIST_HEAD(&clp->cl_strhash);
INIT_LIST_HEAD(&clp->cl_openowners);
@@ -1603,7 +1603,7 @@ nfsd4_setclientid_confirm(struct svc_rqst *rqstp,
else {
/* XXX: We just turn off callbacks until we can handle
* change request correctly. */
- atomic_set(&conf->cl_cb_conn.cb_set, 0);
+ atomic_set(&conf->cl_cb_set, 0);
expire_client(unconf);
status = nfs_ok;

@@ -2316,7 +2316,7 @@ nfs4_open_delegation(struct svc_fh *fh, struct nfsd4_open *open, struct nfs4_sta
{
struct nfs4_delegation *dp;
struct nfs4_stateowner *sop = stp->st_stateowner;
- struct nfs4_cb_conn *cb = &sop->so_client->cl_cb_conn;
+ int cb_up = atomic_read(&sop->so_client->cl_cb_set);
struct file_lock fl, *flp = &fl;
int status, flag = 0;

@@ -2324,7 +2324,7 @@ nfs4_open_delegation(struct svc_fh *fh, struct nfsd4_open *open, struct nfs4_sta
open->op_recall = 0;
switch (open->op_claim_type) {
case NFS4_OPEN_CLAIM_PREVIOUS:
- if (!atomic_read(&cb->cb_set))
+ if (!cb_up)
open->op_recall = 1;
flag = open->op_delegate_type;
if (flag == NFS4_OPEN_DELEGATE_NONE)
@@ -2335,7 +2335,7 @@ nfs4_open_delegation(struct svc_fh *fh, struct nfsd4_open *open, struct nfs4_sta
* had the chance to reclaim theirs.... */
if (locks_in_grace())
goto out;
- if (!atomic_read(&cb->cb_set) || !sop->so_confirmed)
+ if (!cb_up || !sop->so_confirmed)
goto out;
if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE)
flag = NFS4_OPEN_DELEGATE_WRITE;
@@ -2506,7 +2506,7 @@ nfsd4_renew(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
renew_client(clp);
status = nfserr_cb_path_down;
if (!list_empty(&clp->cl_delegations)
- && !atomic_read(&clp->cl_cb_conn.cb_set))
+ && !atomic_read(&clp->cl_cb_set))
goto out;
status = nfs_ok;
out:
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 284bcf9..8ecba4a 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -107,9 +107,6 @@ struct nfs4_cb_conn {
u32 cb_prog;
u32 cb_minorversion;
u32 cb_ident; /* minorversion 0 only */
- /* RPC client info */
- atomic_t cb_set; /* successful CB_NULL call */
- struct rpc_clnt * cb_client;
};

/* Maximum number of slots per session. 160 is useful for long haul TCP */
@@ -223,9 +220,13 @@ struct nfs4_client {
struct svc_cred cl_cred; /* setclientid principal */
clientid_t cl_clientid; /* generated by server */
nfs4_verifier cl_confirm; /* generated by server */
- struct nfs4_cb_conn cl_cb_conn; /* callback info */
u32 cl_firststate; /* recovery dir creation */

+ /* for v4.0 and v4.1 callbacks: */
+ struct nfs4_cb_conn cl_cb_conn;
+ struct rpc_clnt *cl_cb_client;
+ atomic_t cl_cb_set;
+
/* for nfs41 */
struct list_head cl_sessions;
struct nfsd4_clid_slot cl_cs_slot; /* create_session slot */
--
1.6.3.3


2010-03-10 00:26:51

by J.Bruce Fields

Subject: [PATCH 11/11] nfsd4: allow 4.0 clients to change callback path

The RFC allows a client to change the callback parameters, but we didn't
previously implement it.

Teach the callbacks to rerun themselves (by placing themselves on a
workqueue) when they recognize that their rpc task has been killed and
that the callback connection has changed.

Then we can change the callback connection by setting up a new rpc
client, modifying the nfs4 client to point at it, waiting for any work
in progress to complete, and then shutting down the old client.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4callback.c | 26 +++++++++++++++++++-------
fs/nfsd/nfs4state.c | 7 +++----
fs/nfsd/state.h | 2 +-
3 files changed, 23 insertions(+), 12 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 1d83ac6..151ea90 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -457,9 +457,8 @@ static int max_cb_time(void)
/* Reference counting, callback cleanup, etc., all look racy as heck.
* And why is cl_cb_set an atomic? */

-int setup_callback_client(struct nfs4_client *clp, struct svc_xprt *cb_xprt)
+int setup_callback_client(struct nfs4_client *clp, struct nfs4_cb_conn *cb, struct svc_xprt *cb_xprt)
{
- struct nfs4_cb_conn *cb = &clp->cl_cb_conn;
struct rpc_timeout timeparms = {
.to_initval = max_cb_time(),
.to_retries = 0,
@@ -491,7 +490,7 @@ int setup_callback_client(struct nfs4_client *clp, struct svc_xprt *cb_xprt)
PTR_ERR(client));
return PTR_ERR(client);
}
- clp->cl_cb_client = client;
+ nfsd4_set_callback_client(clp, client);
return 0;

}
@@ -549,13 +548,13 @@ void do_probe_callback(struct nfs4_client *clp)
* Set up the callback client and put a NFSPROC4_CB_NULL on the wire...
*/
void
-nfsd4_probe_callback(struct nfs4_client *clp, struct svc_xprt *cb_xprt)
+nfsd4_probe_callback(struct nfs4_client *clp, struct nfs4_cb_conn *cb, struct svc_xprt *cb_xprt)
{
int status;

BUG_ON(atomic_read(&clp->cl_cb_set));

- status = setup_callback_client(clp, cb_xprt);
+ status = setup_callback_client(clp, cb, cb_xprt);
if (status) {
warn_no_callback_path(clp, status);
return;
@@ -645,18 +644,32 @@ static void nfsd4_cb_done(struct rpc_task *task, void *calldata)
}
}

+
static void nfsd4_cb_recall_done(struct rpc_task *task, void *calldata)
{
struct nfs4_delegation *dp = calldata;
struct nfs4_client *clp = dp->dl_client;
+ struct rpc_clnt *current_rpc_client = clp->cl_cb_client;

nfsd4_cb_done(task, calldata);

+ if (current_rpc_client == NULL) {
+ /* We're shutting down; give up. */
+ /* XXX: err, or is it ok just to fall through
+ * and rpc_restart_call? */
+ return;
+ }
+
switch (task->tk_status) {
case -EIO:
/* Network partition? */
atomic_set(&clp->cl_cb_set, 0);
warn_no_callback_path(clp, task->tk_status);
+ if (current_rpc_client != task->tk_client) {
+ /* queue a callback on the new connection: */
+ nfsd4_cb_recall(dp);
+ return;
+ }
case -EBADHANDLE:
case -NFS4ERR_BAD_STATEID:
/* Race: client probably got cb_recall
@@ -705,8 +718,7 @@ void nfsd4_destroy_callback_queue(void)
destroy_workqueue(callback_wq);
}

-void nfsd4_set_callback_client(struct nfs4_client *clp, struct rpc_clnt
-*new)
+void nfsd4_set_callback_client(struct nfs4_client *clp, struct rpc_clnt *new)
{
struct rpc_clnt *old = clp->cl_cb_client;

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 89efd5e..d71d59c 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1314,7 +1314,7 @@ nfsd4_create_session(struct svc_rqst *rqstp,
cstate->minorversion;
unconf->cl_cb_conn.cb_prog = cr_ses->callback_prog;
unconf->cl_cb_seq_nr = 1;
- nfsd4_probe_callback(unconf, rqstp->rq_xprt);
+ nfsd4_probe_callback(unconf, &unconf->cl_cb_conn, rqstp->rq_xprt);
}
conf = unconf;
} else {
@@ -1601,9 +1601,8 @@ nfsd4_setclientid_confirm(struct svc_rqst *rqstp,
if (!same_creds(&conf->cl_cred, &unconf->cl_cred))
status = nfserr_clid_inuse;
else {
- /* XXX: We just turn off callbacks until we can handle
- * change request correctly. */
atomic_set(&conf->cl_cb_set, 0);
+ nfsd4_probe_callback(conf, &unconf->cl_cb_conn, NULL);
expire_client(unconf);
status = nfs_ok;

@@ -1637,7 +1636,7 @@ nfsd4_setclientid_confirm(struct svc_rqst *rqstp,
}
move_to_confirmed(unconf);
conf = unconf;
- nfsd4_probe_callback(conf, NULL);
+ nfsd4_probe_callback(conf, &conf->cl_cb_conn, NULL);
status = nfs_ok;
}
} else if ((!conf || (conf && !same_verf(&conf->cl_confirm, &confirm)))
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 8ecba4a..3b0e088 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -389,7 +389,7 @@ extern int nfs4_in_grace(void);
extern __be32 nfs4_check_open_reclaim(clientid_t *clid);
extern void nfs4_free_stateowner(struct kref *kref);
extern int set_callback_cred(void);
-extern void nfsd4_probe_callback(struct nfs4_client *clp, struct svc_xprt *);
+extern void nfsd4_probe_callback(struct nfs4_client *clp, struct nfs4_cb_conn *, struct svc_xprt *);
extern void nfsd4_do_callback_rpc(struct work_struct *);
extern void nfsd4_cb_recall(struct nfs4_delegation *dp);
extern int nfsd4_create_callback_queue(void);
--
1.6.3.3


2010-03-10 00:26:51

by J.Bruce Fields

Subject: [PATCH 08/11] nfsd4: don't sleep in lease-break callback

The NFSv4 server's fl_break callback can sleep (dropping the BKL), in
order to allocate a new rpc task to send a recall to the client.

As far as I can tell this doesn't cause any races in the current code,
but the analysis is difficult. Also, the sleep here may complicate the
move away from the BKL.

So, just schedule some work to do the job for us instead. The work will
later also prove useful for restarting a call after the callback
information is changed.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4callback.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++-
fs/nfsd/nfs4state.c | 34 ++++++++++++++----------------
fs/nfsd/state.h | 5 ++++
3 files changed, 73 insertions(+), 20 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 7f84686..a8592be 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -32,6 +32,7 @@
*/

#include <linux/sunrpc/clnt.h>
+#include <linux/sunrpc/svc_xprt.h>
#include "nfsd.h"
#include "state.h"

@@ -692,11 +693,41 @@ static const struct rpc_call_ops nfsd4_cb_recall_ops = {
.rpc_release = nfsd4_cb_recall_release,
};

+static struct workqueue_struct *callback_wq;
+
+int nfsd4_create_callback_queue(void)
+{
+ callback_wq = create_singlethread_workqueue("nfsd4_callbacks");
+ if (!callback_wq)
+ return -ENOMEM;
+ return 0;
+}
+
+void nfsd4_destroy_callback_queue(void)
+{
+ destroy_workqueue(callback_wq);
+}
+
+void nfsd4_set_callback_client(struct nfs4_client *clp, struct rpc_clnt
+*new)
+{
+ struct rpc_clnt *old = clp->cl_cb_conn.cb_client;
+
+ clp->cl_cb_conn.cb_client = new;
+ /*
+ * After this, any work that saw the old value of cb_client will
+ * be gone:
+ */
+ flush_workqueue(callback_wq);
+ /* So we can safely shut it down: */
+ if (old)
+ rpc_shutdown_client(old);
+}
+
/*
* called with dp->dl_count inc'ed.
*/
-void
-nfsd4_cb_recall(struct nfs4_delegation *dp)
+static void _nfsd4_cb_recall(struct nfs4_delegation *dp)
{
struct nfs4_client *clp = dp->dl_client;
struct rpc_clnt *clnt = clp->cl_cb_conn.cb_client;
@@ -707,6 +738,9 @@ nfsd4_cb_recall(struct nfs4_delegation *dp)
};
int status;

+ if (clnt == NULL)
+ return; /* Client is shutting down; give up. */
+
args->args_op = dp;
msg.rpc_argp = args;
dp->dl_retries = 1;
@@ -717,3 +751,19 @@ nfsd4_cb_recall(struct nfs4_delegation *dp)
nfs4_put_delegation(dp);
}
}
+
+void nfsd4_do_callback_rpc(struct work_struct *w)
+{
+ /* XXX: for now, just send off delegation recall. */
+ /* In future, generalize to handle any sort of callback. */
+ struct nfsd4_callback *c = container_of(w, struct nfsd4_callback, cb_work);
+ struct nfs4_delegation *dp = container_of(c, struct nfs4_delegation, dl_recall);
+
+ _nfsd4_cb_recall(dp);
+}
+
+
+void nfsd4_cb_recall(struct nfs4_delegation *dp)
+{
+ queue_work(callback_wq, &dp->dl_recall.cb_work);
+}
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 1b2637b..239a90c 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -198,6 +198,7 @@ alloc_init_deleg(struct nfs4_client *clp, struct nfs4_stateid *stp, struct svc_f
atomic_set(&dp->dl_count, 1);
list_add(&dp->dl_perfile, &fp->fi_delegations);
list_add(&dp->dl_perclnt, &clp->cl_delegations);
+ INIT_WORK(&dp->dl_recall.cb_work, nfsd4_do_callback_rpc);
return dp;
}

@@ -679,21 +680,6 @@ static struct nfs4_client *alloc_client(struct xdr_netobj name)
return clp;
}

-static void
-shutdown_callback_client(struct nfs4_client *clp)
-{
- struct rpc_clnt *clnt = clp->cl_cb_conn.cb_client;
-
- if (clnt) {
- /*
- * Callback threads take a reference on the client, so there
- * should be no outstanding callbacks at this point.
- */
- clp->cl_cb_conn.cb_client = NULL;
- rpc_shutdown_client(clnt);
- }
-}
-
static inline void
free_client(struct nfs4_client *clp)
{
@@ -746,7 +732,7 @@ expire_client(struct nfs4_client *clp)
se_perclnt);
release_session(ses);
}
- shutdown_callback_client(clp);
+ nfsd4_set_callback_client(clp, NULL);
put_nfs4_client(clp);
}

@@ -1388,7 +1374,7 @@ nfsd4_destroy_session(struct svc_rqst *r,
spin_unlock(&sessionid_lock);

/* wait for callbacks */
- shutdown_callback_client(ses->se_client);
+ nfsd4_set_callback_client(ses->se_client, NULL);
nfsd4_put_session(ses);
status = nfs_ok;
out:
@@ -4000,16 +3986,27 @@ set_max_delegations(void)
static int
__nfs4_state_start(void)
{
+ int ret;
+
boot_time = get_seconds();
locks_start_grace(&nfsd4_manager);
printk(KERN_INFO "NFSD: starting %ld-second grace period\n",
nfsd4_grace);
+ ret = set_callback_cred();
+ if (ret)
+ return -ENOMEM;
laundry_wq = create_singlethread_workqueue("nfsd4");
if (laundry_wq == NULL)
return -ENOMEM;
+ ret = nfsd4_create_callback_queue();
+ if (ret)
+ goto out_free_laundry;
queue_delayed_work(laundry_wq, &laundromat_work, nfsd4_grace * HZ);
set_max_delegations();
- return set_callback_cred();
+ return 0;
+out_free_laundry:
+ destroy_workqueue(laundry_wq);
+ return ret;
}

int
@@ -4071,6 +4068,7 @@ nfs4_state_shutdown(void)
nfs4_lock_state();
nfs4_release_reclaim();
__nfs4_state_shutdown();
+ nfsd4_destroy_callback_queue();
nfs4_unlock_state();
}

diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 6e5b005..92b0bf0 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -77,6 +77,7 @@ struct nfs4_rpc_args {

struct nfsd4_callback {
struct nfs4_rpc_args cb_args;
+ struct work_struct cb_work;
};

struct nfs4_delegation {
@@ -390,7 +391,11 @@ extern void put_nfs4_client(struct nfs4_client *clp);
extern void nfs4_free_stateowner(struct kref *kref);
extern int set_callback_cred(void);
extern void nfsd4_probe_callback(struct nfs4_client *clp, struct svc_xprt *);
+extern void nfsd4_do_callback_rpc(struct work_struct *);
extern void nfsd4_cb_recall(struct nfs4_delegation *dp);
+extern int nfsd4_create_callback_queue(void);
+extern void nfsd4_destroy_callback_queue(void);
+extern void nfsd4_set_callback_client(struct nfs4_client *, struct rpc_clnt *);
extern void nfs4_put_delegation(struct nfs4_delegation *dp);
extern __be32 nfs4_make_rec_clidname(char *clidname, struct xdr_netobj *clname);
extern void nfsd4_init_recdir(char *recdir_name);
--
1.6.3.3


2010-03-10 00:26:51

by J.Bruce Fields

Subject: [PATCH 01/11] sunrpc: fix leak on error on socket xprt setup

Also collect the error-exit code together while we're at it.

Signed-off-by: J. Bruce Fields <[email protected]>
Cc: Trond Myklebust <[email protected]>
---
net/sunrpc/xprtsock.c | 29 ++++++++++++++++++-----------
1 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 7124129..a91f0a4 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -2333,6 +2333,7 @@ static struct rpc_xprt *xs_setup_udp(struct xprt_create *args)
struct sockaddr *addr = args->dstaddr;
struct rpc_xprt *xprt;
struct sock_xprt *transport;
+ struct rpc_xprt *ret;

xprt = xs_setup_xprt(args, xprt_udp_slot_table_entries);
if (IS_ERR(xprt))
@@ -2371,8 +2372,8 @@ static struct rpc_xprt *xs_setup_udp(struct xprt_create *args)
xs_format_peer_addresses(xprt, "udp", RPCBIND_NETID_UDP6);
break;
default:
- kfree(xprt);
- return ERR_PTR(-EAFNOSUPPORT);
+ ret = ERR_PTR(-EAFNOSUPPORT);
+ goto out_err;
}

if (xprt_bound(xprt))
@@ -2387,10 +2388,11 @@ static struct rpc_xprt *xs_setup_udp(struct xprt_create *args)

if (try_module_get(THIS_MODULE))
return xprt;
-
+ ret = ERR_PTR(-EINVAL);
+out_err:
kfree(xprt->slot);
kfree(xprt);
- return ERR_PTR(-EINVAL);
+ return ret;
}

static const struct rpc_timeout xs_tcp_default_timeout = {
@@ -2409,6 +2411,7 @@ static struct rpc_xprt *xs_setup_tcp(struct xprt_create *args)
struct sockaddr *addr = args->dstaddr;
struct rpc_xprt *xprt;
struct sock_xprt *transport;
+ struct rpc_xprt *ret;

xprt = xs_setup_xprt(args, xprt_tcp_slot_table_entries);
if (IS_ERR(xprt))
@@ -2445,8 +2448,8 @@ static struct rpc_xprt *xs_setup_tcp(struct xprt_create *args)
xs_format_peer_addresses(xprt, "tcp", RPCBIND_NETID_TCP6);
break;
default:
- kfree(xprt);
- return ERR_PTR(-EAFNOSUPPORT);
+ ret = ERR_PTR(-EAFNOSUPPORT);
+ goto out_err;
}

if (xprt_bound(xprt))
@@ -2462,10 +2465,11 @@ static struct rpc_xprt *xs_setup_tcp(struct xprt_create *args)

if (try_module_get(THIS_MODULE))
return xprt;
-
+ ret = ERR_PTR(-EINVAL);
+out_err:
kfree(xprt->slot);
kfree(xprt);
- return ERR_PTR(-EINVAL);
+ return ret;
}

/**
@@ -2479,6 +2483,7 @@ static struct rpc_xprt *xs_setup_bc_tcp(struct xprt_create *args)
struct rpc_xprt *xprt;
struct sock_xprt *transport;
struct svc_sock *bc_sock;
+ struct rpc_xprt *ret;

if (!args->bc_xprt)
ERR_PTR(-EINVAL);
@@ -2522,8 +2527,8 @@ static struct rpc_xprt *xs_setup_bc_tcp(struct xprt_create *args)
RPCBIND_NETID_TCP6);
break;
default:
- kfree(xprt);
- return ERR_PTR(-EAFNOSUPPORT);
+ ret = ERR_PTR(-EAFNOSUPPORT);
+ goto out_err;
}

if (xprt_bound(xprt))
@@ -2545,9 +2550,11 @@ static struct rpc_xprt *xs_setup_bc_tcp(struct xprt_create *args)

if (try_module_get(THIS_MODULE))
return xprt;
+ ret = ERR_PTR(-EINVAL);
+out_err:
kfree(xprt->slot);
kfree(xprt);
- return ERR_PTR(-EINVAL);
+ return ret;
}

static struct xprt_class xs_udp_transport = {
--
1.6.3.3


2010-03-10 01:03:22

by Myklebust, Trond

Subject: Re: [PATCH 02/11] sunrpc: take reference to bc_xprt in sunrpc layer

On Tue, 2010-03-09 at 19:28 -0500, J. Bruce Fields wrote:
> Currently we're requiring whoever holds the reference to a client set up
> on the backchannel to also hold a reference to the forechannel
> transport; but this should be the responsibility of the lower-level
> code.
>
> Signed-off-by: J. Bruce Fields <[email protected]>
> Cc: Trond Myklebust <[email protected]>
> ---
> net/sunrpc/xprtsock.c | 4 ++++
> 1 files changed, 4 insertions(+), 0 deletions(-)
>
> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> index a91f0a4..5adc000 100644
> --- a/net/sunrpc/xprtsock.c
> +++ b/net/sunrpc/xprtsock.c
> @@ -792,6 +792,8 @@ static void xs_destroy(struct rpc_xprt *xprt)
>
> xs_close(xprt);
> xs_free_peer_addresses(xprt);
> + if (xprt->bc_xprt)
> + svc_xprt_put(xprt->bc_xprt);
> kfree(xprt->slot);
> kfree(xprt);
> module_put(THIS_MODULE);
> @@ -2510,6 +2512,7 @@ static struct rpc_xprt *xs_setup_bc_tcp(struct xprt_create *args)
> * forechannel
> */
> xprt->bc_xprt = args->bc_xprt;
> + svc_xprt_get(xprt->bc_xprt);
> bc_sock = container_of(args->bc_xprt, struct svc_sock, sk_xprt);
> bc_sock->sk_bc_xprt = xprt;
> transport->sock = bc_sock->sk_sock;
> @@ -2552,6 +2555,7 @@ static struct rpc_xprt *xs_setup_bc_tcp(struct xprt_create *args)
> return xprt;
> ret = ERR_PTR(-EINVAL);
> out_err:
> + svc_xprt_put(xprt->bc_xprt);
> kfree(xprt->slot);
> kfree(xprt);
> return ret;

OK... This makes my head spin.

Why should the back channel socket hold a reference to the svc_sock? The
process that owns both of these things is an RPC server process. It
already holds a reference to the svc_sock and presumably also to the
back channel socket.

No?

Trond

2010-03-10 19:04:18

by J.Bruce Fields

Subject: Re: [PATCH 02/11] sunrpc: take reference to bc_xprt in sunrpc layer

On Tue, Mar 09, 2010 at 08:03:00PM -0500, Trond Myklebust wrote:
> On Tue, 2010-03-09 at 19:28 -0500, J. Bruce Fields wrote:
> > Currently we're requiring whoever holds the reference to a client set up
> > on the backchannel to also hold a reference to the forechannel
> > transport; but this should be the responsibility of the lower-level
> > code.
> >
> > Signed-off-by: J. Bruce Fields <[email protected]>
> > Cc: Trond Myklebust <[email protected]>
> > ---
> > net/sunrpc/xprtsock.c | 4 ++++
> > 1 files changed, 4 insertions(+), 0 deletions(-)
> >
> > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> > index a91f0a4..5adc000 100644
> > --- a/net/sunrpc/xprtsock.c
> > +++ b/net/sunrpc/xprtsock.c
> > @@ -792,6 +792,8 @@ static void xs_destroy(struct rpc_xprt *xprt)
> >
> > xs_close(xprt);
> > xs_free_peer_addresses(xprt);
> > + if (xprt->bc_xprt)
> > + svc_xprt_put(xprt->bc_xprt);
> > kfree(xprt->slot);
> > kfree(xprt);
> > module_put(THIS_MODULE);
> > @@ -2510,6 +2512,7 @@ static struct rpc_xprt *xs_setup_bc_tcp(struct xprt_create *args)
> > * forechannel
> > */
> > xprt->bc_xprt = args->bc_xprt;
> > + svc_xprt_get(xprt->bc_xprt);
> > bc_sock = container_of(args->bc_xprt, struct svc_sock, sk_xprt);
> > bc_sock->sk_bc_xprt = xprt;
> > transport->sock = bc_sock->sk_sock;
> > @@ -2552,6 +2555,7 @@ static struct rpc_xprt *xs_setup_bc_tcp(struct xprt_create *args)
> > return xprt;
> > ret = ERR_PTR(-EINVAL);
> > out_err:
> > + svc_xprt_put(xprt->bc_xprt);
> > kfree(xprt->slot);
> > kfree(xprt);
> > return ret;
>
> OK... This makes my head spin.
>
> Why should the back channel socket hold a reference to the svc_sock? The
> process that owns both of these things is an RPC server process. It
> already holds a reference to the svc_sock and presumably also to the
> back channel socket.

Only while it's actually processing an rpc, I assume.

I'll take a harder look at what should be the lifetime of the server
socket that the backchannel requests use.

By the way, we also need to make sure the nfs server code gets some kind
of call when the client closes the connection, so we can set
callback-down session flags appropriately.

--b.

2010-04-08 16:05:51

by J.Bruce Fields

Subject: Re: [PATCH 02/11] sunrpc: take reference to bc_xprt in sunrpc layer

On Thu, Mar 18, 2010 at 11:08:18AM -0400, J. Bruce Fields wrote:
> On Wed, Mar 10, 2010 at 02:05:45PM -0500, J. Bruce Fields wrote:
> > On Tue, Mar 09, 2010 at 08:03:00PM -0500, Trond Myklebust wrote:
> > > Why should the back channel socket hold a reference to the svc_sock? The
> > > process that owns both of these things is an RPC server process. It
> > > already holds a reference to the svc_sock and presumably also to the
> > > back channel socket.
> >
> > Only while it's actually processing an rpc, I assume.
> >
> > I'll take a harder look at what should be the lifetime of the server
> > socket that the backchannel requests use.
> >
> > By the way, we also need to make sure the nfs server code gets some kind
> > of call when the client closes the connection, so we can set
> > callback-down session flags appropriately.
>
> One way to do that would be for nfsd to register a callback to be called
> from svc_delete_xprt(). Then in that callback nfsd can set those
> session flags appropriately, but it can also shut down the rpc client.
> That ensures the rpc client (and its xprt) go away before the svc_xprt,
> so we no longer need to hold a reference.
>
> (Actually there's nothing preventing multiple backchannels from sharing
> the same connection, so we'd need a list of callbacks.)

That would look like the below. The server would use it to register a
callback that cleans up the old rpc client, sets
SEQ4_STATUS_CB_PATH_DOWN if necessary, etc.

(Actually I think it may make the locking simplest just to keep these
callbacks under a spinlock, in which case the callback will need to
defer most of that work to a workqueue, which I don't see any problem
with so far.)
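
To make that concrete, the nfsd-side usage might look roughly like this
(only a sketch: the nfsd-side struct and function names are made up, and
the actual cleanup would live in the elided nfsd4_conn_close_work handler):

	struct nfsd4_conn_close {
		struct svc_xpt_user ncc_user;
		struct nfs4_client *ncc_clp;
		struct work_struct ncc_work;	/* sleeping cleanup runs here */
	};

	/* Called from call_xpt_users() under xpt_lock, so it must not
	 * sleep; it just hands off to the callback workqueue: */
	static void nfsd4_conn_lost(struct svc_xpt_user *u)
	{
		struct nfsd4_conn_close *ncc =
			container_of(u, struct nfsd4_conn_close, ncc_user);

		queue_work(callback_wq, &ncc->ncc_work);
	}

	/* ...and at backchannel setup time: */
	ncc->ncc_user.callback = nfsd4_conn_lost;
	INIT_WORK(&ncc->ncc_work, nfsd4_conn_close_work);
	register_xpt_user(rqstp->rq_xprt, &ncc->ncc_user);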

--b.

commit aecf73a3ef70426845deff43e4435a87ff9170ab
Author: J. Bruce Fields <[email protected]>
Date: Mon Mar 22 15:37:17 2010 -0400

nfsd: provide callbacks on svc_xprt deletion.

The NFSv4.1 server needs to be notified when a client tcp connection goes
down, if that connection is being used as a backchannel, so that it can
warn the client that it has lost the backchannel connection.

This also allows us to destroy the rpc_client, or any other structures
associated with the backchannel connection, before the svc_xprt is
destroyed, thus removing the need for some odd reference counting.

Signed-off-by: J. Bruce Fields <[email protected]>

diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index 5f4e18b..22229e4 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -32,6 +32,16 @@ struct svc_xprt_class {
u32 xcl_max_payload;
};

+/*
+ * This is embedded in an object that wants a callback before deleting
+ * an xprt; intended for use by NFSv4.1, which needs to know when a
+ * client's tcp connection (and hence possibly a backchannel) goes away.
+ */
+struct svc_xpt_user {
+ struct list_head list;
+ void (*callback)(struct svc_xpt_user *);
+};
+
struct svc_xprt {
struct svc_xprt_class *xpt_class;
struct svc_xprt_ops *xpt_ops;
@@ -66,8 +76,23 @@ struct svc_xprt {
struct sockaddr_storage xpt_remote; /* remote peer's address */
size_t xpt_remotelen; /* length of address */
struct rpc_wait_queue xpt_bc_pending; /* backchannel wait queue */
+ struct list_head xpt_users; /* callbacks on free */
};

+static inline void register_xpt_user(struct svc_xprt *xpt, struct svc_xpt_user *u)
+{
+ spin_lock(&xpt->xpt_lock);
+ list_add(&u->list, &xpt->xpt_users);
+ spin_unlock(&xpt->xpt_lock);
+}
+
+static inline void unregister_xpt_user(struct svc_xprt *xpt, struct svc_xpt_user *u)
+{
+ spin_lock(&xpt->xpt_lock);
+ list_del_init(&u->list);
+ spin_unlock(&xpt->xpt_lock);
+}
+
int svc_reg_xprt_class(struct svc_xprt_class *);
void svc_unreg_xprt_class(struct svc_xprt_class *);
void svc_xprt_init(struct svc_xprt_class *, struct svc_xprt *,
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index c334f54..82cd43c 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -155,6 +155,7 @@ void svc_xprt_init(struct svc_xprt_class *xcl, struct svc_xprt *xprt,
INIT_LIST_HEAD(&xprt->xpt_list);
INIT_LIST_HEAD(&xprt->xpt_ready);
INIT_LIST_HEAD(&xprt->xpt_deferred);
+ INIT_LIST_HEAD(&xprt->xpt_users);
mutex_init(&xprt->xpt_mutex);
spin_lock_init(&xprt->xpt_lock);
set_bit(XPT_BUSY, &xprt->xpt_flags);
@@ -865,6 +866,18 @@ static void svc_age_temp_xprts(unsigned long closure)
mod_timer(&serv->sv_temptimer, jiffies + svc_conn_age_period * HZ);
}

+static void call_xpt_users(struct svc_xprt *xprt)
+{
+ struct svc_xpt_user *u;
+
+ spin_lock(&xprt->xpt_lock);
+ while (!list_empty(&xprt->xpt_users)) {
+ u = list_first_entry(&xprt->xpt_users, struct svc_xpt_user, list);
+ list_del_init(&u->list);
+ u->callback(u);
+ }
+ spin_unlock(&xprt->xpt_lock);
+}
+
/*
* Remove a dead transport
*/
@@ -897,6 +910,7 @@ void svc_delete_xprt(struct svc_xprt *xprt)
while ((dr = svc_deferred_dequeue(xprt)) != NULL)
kfree(dr);

+ call_xpt_users(xprt);
svc_xprt_put(xprt);
}