2009-09-04 16:18:49

by Benny Halevy

Subject: [PATCH 0/10] nfsd41 backchannel patches for 2.6.32

Bruce,

Here's the updated patchset implementing the nfs41 backchannel
for the nfs server.

Changes from previous version:
- Rebase onto git://git.linux-nfs.org/~bfields/linux.git for-2.6.32

- bc_send_request does not block on the xpt_mutex
but rather uses rpc_sleep_on to wait on it.

- nfsd4_create_session initializes unconf->cl_cb_conn.cb_addr.

- cosmetic-only changes cleaned up.

[PATCH 01/10] nfsd41: sunrpc: move struct rpc_buffer def into sunrpc.h
[PATCH 02/10] nfsd41: sunrpc: Added rpc server-side backchannel handling
[PATCH 03/10] nfsd4: fix whitespace in NFSPROC4_CLNT_CB_NULL definition
[PATCH 04/10] nfsd41: Backchannel: callback infrastructure
[PATCH 05/10] nfsd41: Backchannel: Add sequence arguments to callback RPC arguments
[PATCH 06/10] nfsd41: Backchannel: Server backchannel RPC wait queue
[PATCH 07/10] nfsd41: Backchannel: Setup sequence information
[PATCH 08/10] nfsd41: Backchannel: cb_sequence callback
[PATCH 09/10] nfsd41: Backchannel: Implement cb_recall over NFSv4.1
[PATCH 10/10] nfsd41: Refactor create_client()

Benny


2009-09-10 16:11:50

by J. Bruce Fields

Subject: Re: [PATCH v2 01/12] nfsd41: sunrpc: move struct rpc_buffer def into sunrpc.h

On Thu, Sep 10, 2009 at 12:25:04PM +0300, Benny Halevy wrote:
> Move struct rpc_buffer's definition into sunrpc.h, a common, internal
> header file, in preparation for supporting the nfsv4.1 backchannel.

Applied, thanks.

--b.

>
> Signed-off-by: Benny Halevy <[email protected]>
> [nfs41: sunrpc: #include <linux/net.h> from sunrpc.h]
> Signed-off-by: Benny Halevy <[email protected]>
> ---
> net/sunrpc/sched.c | 7 ++-----
> net/sunrpc/sunrpc.h | 10 ++++++++++
> 2 files changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
> index 8f459ab..cef74ba 100644
> --- a/net/sunrpc/sched.c
> +++ b/net/sunrpc/sched.c
> @@ -21,6 +21,8 @@
>
> #include <linux/sunrpc/clnt.h>
>
> +#include "sunrpc.h"
> +
> #ifdef RPC_DEBUG
> #define RPCDBG_FACILITY RPCDBG_SCHED
> #define RPC_TASK_MAGIC_ID 0xf00baa
> @@ -711,11 +713,6 @@ static void rpc_async_schedule(struct work_struct *work)
> __rpc_execute(container_of(work, struct rpc_task, u.tk_work));
> }
>
> -struct rpc_buffer {
> - size_t len;
> - char data[];
> -};
> -
> /**
> * rpc_malloc - allocate an RPC buffer
> * @task: RPC task that will use this buffer
> diff --git a/net/sunrpc/sunrpc.h b/net/sunrpc/sunrpc.h
> index 5d9dd74..13171e6 100644
> --- a/net/sunrpc/sunrpc.h
> +++ b/net/sunrpc/sunrpc.h
> @@ -27,6 +27,16 @@ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> #ifndef _NET_SUNRPC_SUNRPC_H
> #define _NET_SUNRPC_SUNRPC_H
>
> +#include <linux/net.h>
> +
> +/*
> + * Header for dynamically allocated rpc buffers.
> + */
> +struct rpc_buffer {
> + size_t len;
> + char data[];
> +};
> +
> static inline int rpc_reply_expected(struct rpc_task *task)
> {
> return (task->tk_msg.rpc_proc != NULL) &&
> --
> 1.6.4
>

2009-09-11 20:58:20

by J. Bruce Fields

Subject: Re: [PATCH v3 03/12] nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel

On Thu, Sep 10, 2009 at 05:33:30PM +0300, Benny Halevy wrote:
> diff --git a/include/linux/sunrpc/xprtrdma.h b/include/linux/sunrpc/xprtrdma.h
> index 54a379c..c2f04e1 100644
> --- a/include/linux/sunrpc/xprtrdma.h
> +++ b/include/linux/sunrpc/xprtrdma.h
> @@ -41,11 +41,6 @@
> #define _LINUX_SUNRPC_XPRTRDMA_H
>
> /*
> - * RPC transport identifier for RDMA
> - */
> -#define XPRT_TRANSPORT_RDMA 256
> -
> -/*
> * rpcbind (v3+) RDMA netid.
> */
> #define RPCBIND_NETID_RDMA "rdma"
> diff --git a/include/linux/sunrpc/xprtsock.h b/include/linux/sunrpc/xprtsock.h
> index c2a46c4..d7c98d1 100644
> --- a/include/linux/sunrpc/xprtsock.h
> +++ b/include/linux/sunrpc/xprtsock.h
> @@ -20,8 +20,13 @@ void cleanup_socket_xprt(void);
> * values. No such restriction exists for new transports, except that
> * they may not collide with these values (17 and 6, respectively).
> */
> -#define XPRT_TRANSPORT_UDP IPPROTO_UDP
> -#define XPRT_TRANSPORT_TCP IPPROTO_TCP
> +#define XPRT_TRANSPORT_BC (1 << 31)
> +enum xprt_transports {
> + XPRT_TRANSPORT_UDP = IPPROTO_UDP,
> + XPRT_TRANSPORT_TCP = IPPROTO_TCP,
> + XPRT_TRANSPORT_BC_TCP = IPPROTO_TCP | XPRT_TRANSPORT_BC,
> + XPRT_TRANSPORT_RDMA = 256
> +};

This fails to compile when CONFIG_SUNRPC_XPRT_RDMA is set.

A minimal fix might be:

--- a/net/sunrpc/xprtrdma/transport.c
+++ b/net/sunrpc/xprtrdma/transport.c
@@ -50,6 +50,8 @@
#include <linux/module.h>
#include <linux/init.h>
#include <linux/seq_file.h>
+#include <linux/in.h>
+#include <linux/sunrpc/xprtsock.h>

#include "xprt_rdma.h"

Or maybe just ditch the enum and leave these as they were before.

--b.
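
To make the identifier scheme in the quoted hunk concrete, here is a minimal userspace sketch of the same idea: a backchannel flag bit ORed on top of the IP protocol number, with every identifier kept in one place so a collision is easy to spot. All SKETCH_/sketch_ names are hypothetical and not part of the patch. The sketch uses <netinet/in.h> instead of the kernel's <linux/in.h>, and unsigned constants so the flag does not shift into the sign bit of an int.

```c
#include <assert.h>
#include <stddef.h>
#include <netinet/in.h>	/* IPPROTO_UDP (17), IPPROTO_TCP (6) */

/* Hypothetical stand-ins for the XPRT_TRANSPORT_* identifiers. */
#define SKETCH_XPRT_BC		(1u << 31)	/* backchannel flag bit */
#define SKETCH_XPRT_UDP		((unsigned int)IPPROTO_UDP)
#define SKETCH_XPRT_TCP		((unsigned int)IPPROTO_TCP)
#define SKETCH_XPRT_BC_TCP	(SKETCH_XPRT_TCP | SKETCH_XPRT_BC)
#define SKETCH_XPRT_RDMA	256u

/* Return nonzero if the identifier names a backchannel transport. */
static int sketch_xprt_is_backchannel(unsigned int ident)
{
	return (ident & SKETCH_XPRT_BC) != 0;
}

/* Strip the backchannel flag, leaving the underlying protocol number. */
static unsigned int sketch_xprt_base(unsigned int ident)
{
	return ident & ~SKETCH_XPRT_BC;
}

/* Abort if any two identifiers in the table collide. */
static void sketch_xprt_check_ids(void)
{
	static const unsigned int ids[] = {
		SKETCH_XPRT_UDP, SKETCH_XPRT_TCP,
		SKETCH_XPRT_BC_TCP, SKETCH_XPRT_RDMA,
	};
	size_t i, j, n = sizeof(ids) / sizeof(ids[0]);

	for (i = 0; i < n; i++)
		for (j = i + 1; j < n; j++)
			assert(ids[i] != ids[j]);
}
```

A caller would run sketch_xprt_check_ids() once at startup and use sketch_xprt_base() to recover the protocol number from a backchannel identifier.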

2009-09-11 21:12:33

by Alexandros Batsakis

Subject: Re: [pnfs] [PATCH v3 03/12] nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel

On Fri, Sep 11, 2009 at 1:58 PM, J. Bruce Fields <[email protected]> wrote:
> On Thu, Sep 10, 2009 at 05:33:30PM +0300, Benny Halevy wrote:
>> diff --git a/include/linux/sunrpc/xprtrdma.h b/include/linux/sunrpc/xprtrdma.h
>> index 54a379c..c2f04e1 100644
>> --- a/include/linux/sunrpc/xprtrdma.h
>> +++ b/include/linux/sunrpc/xprtrdma.h
>> @@ -41,11 +41,6 @@
>> #define _LINUX_SUNRPC_XPRTRDMA_H
>>
>> /*
>> - * RPC transport identifier for RDMA
>> - */
>> -#define XPRT_TRANSPORT_RDMA 256
>> -
>> -/*
>> * rpcbind (v3+) RDMA netid.
>> */
>> #define RPCBIND_NETID_RDMA "rdma"
>> diff --git a/include/linux/sunrpc/xprtsock.h b/include/linux/sunrpc/xprtsock.h
>> index c2a46c4..d7c98d1 100644
>> --- a/include/linux/sunrpc/xprtsock.h
>> +++ b/include/linux/sunrpc/xprtsock.h
>> @@ -20,8 +20,13 @@ void cleanup_socket_xprt(void);
>> * values. No such restriction exists for new transports, except that
>> * they may not collide with these values (17 and 6, respectively).
>> */
>> -#define XPRT_TRANSPORT_UDP IPPROTO_UDP
>> -#define XPRT_TRANSPORT_TCP IPPROTO_TCP
>> +#define XPRT_TRANSPORT_BC (1 << 31)
>> +enum xprt_transports {
>> + XPRT_TRANSPORT_UDP = IPPROTO_UDP,
>> + XPRT_TRANSPORT_TCP = IPPROTO_TCP,
>> + XPRT_TRANSPORT_BC_TCP = IPPROTO_TCP | XPRT_TRANSPORT_BC,
>> + XPRT_TRANSPORT_RDMA = 256
>> +};
>
> This fails to compile when CONFIG_SUNRPC_XPRT_RDMA is set.
>
> A minimal fix might be:
>
> --- a/net/sunrpc/xprtrdma/transport.c
> +++ b/net/sunrpc/xprtrdma/transport.c
> @@ -50,6 +50,8 @@
> #include <linux/module.h>
> #include <linux/init.h>
> #include <linux/seq_file.h>
> +#include <linux/in.h>
> +#include <linux/sunrpc/xprtsock.h>
>
> #include "xprt_rdma.h"
>
> Or maybe just ditch the enum and leave these as they were before.
>

The incentive here was to have all the transport definitions together
so that nobody reuses a number by mistake (not very likely, of
course), so I think the fix you suggested is appropriate.

-alexandros

> --b.
> _______________________________________________
> pNFS mailing list
> [email protected]
> http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
>

2009-09-11 22:48:24

by Sun_Peixing

Subject: Build error of latest Linux-pnfs 2.6.31


I got the following error from include/linux/nfsd/state.h. It looks
like lines 481 and 499 are obviously duplicates. If I remove
line 481 from that file, I then get this error at the end:
"ERROR: "nfs4_reset_lease" [fs/nfsd/nfsd.ko] undefined!".

Could you tell me how to fix that? I downloaded the package using git
by doing:

git clone git://git.linux-nfs.org/projects/bhalevy/linux-pnfs.git

on 09/11/2009, so I have the latest patches.

When make started, I got the errors shown below.

Thanks for the help.

Peixing

CC [M] fs/nfsd/nfs4proc.o
In file included from fs/nfsd/nfs4proc.c:47:
include/linux/nfsd/state.h:483: warning: struct nfs4_layoutrecall
declared inside parameter list
include/linux/nfsd/state.h:483: warning: its scope is only this
definition or declaration, which is probably not what you want
include/linux/nfsd/state.h:484: warning: struct nfs4_layoutrecall
declared inside parameter list
include/linux/nfsd/state.h:485: warning: struct nfs4_layoutrecall
declared inside parameter list
include/linux/nfsd/state.h:487: warning: struct nfs4_layoutrecall
declared inside parameter list
include/linux/nfsd/state.h:518: error: static declaration of
release_pnfs_ds_dev_list follows non-static declaration
include/linux/nfsd/state.h:481: error: previous declaration of
release_pnfs_ds_dev_list was here
make[2]: *** [fs/nfsd/nfs4proc.o] Error 1

2009-09-13 09:30:26

by Benny Halevy

Subject: Re: Build error of latest Linux-pnfs 2.6.31

On 2009-09-12 01:29, [email protected] wrote:
> I got the following error from include/linux/nfsd/state.h. It looks
> like lines 481 and 499 are obviously duplicates. If I remove
> line 481 from that file, I then get this error at the end:
> "ERROR: "nfs4_reset_lease" [fs/nfsd/nfsd.ko] undefined!".
>
> Could you tell me how to fix that? I downloaded the package using git
> by doing:

It looks like CONFIG_PNFSD is configured as 'n', isn't it?
Anyway, does the following patch help?

From 9a0a09a7cb3116dcb2f1610eb7579e06e80a60f4 Mon Sep 17 00:00:00 2001
From: Benny Halevy <[email protected]>
Date: Sun, 13 Sep 2009 12:26:31 +0300
Subject: [PATCH] SQUASHME: pnfsd: fix compiler warnings when CONFIG_PNFSD is not defined

Signed-off-by: Benny Halevy <[email protected]>
---
include/linux/nfsd/state.h | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/nfsd/state.h b/include/linux/nfsd/state.h
index 9c0df5b..28e491d 100644
--- a/include/linux/nfsd/state.h
+++ b/include/linux/nfsd/state.h
@@ -478,14 +478,14 @@ extern struct nfs4_stateid * find_stateid(stateid_t *stid, int flags);
extern struct nfs4_delegation * find_delegation_stateid(struct inode *ino,
stateid_t *stid);
extern __be32 nfs4_check_stateid(stateid_t *stateid);
+extern void expire_client_lock(struct nfs4_client *clp);
+#if defined(CONFIG_PNFSD)
extern void release_pnfs_ds_dev_list(struct nfs4_stateid *stp);
extern void nfs4_pnfs_state_init(void);
extern int put_layoutrecall(struct nfs4_layoutrecall *);
extern void nomatching_layout(struct nfs4_layoutrecall *);
extern void *layoutrecall_done(struct nfs4_layoutrecall *);
-extern void expire_client_lock(struct nfs4_client *clp);
extern int nfsd4_cb_layout(struct nfs4_layoutrecall *lp);
-#if defined(CONFIG_PNFSD)
extern void nfsd4_free_pnfs_slabs(void);
extern void nfsd4_free_slab(struct kmem_cache **slab);
extern int nfsd4_init_pnfs_slabs(void);
@@ -515,8 +515,8 @@ extern void pnfs_expire_client(struct nfs4_client *clp);
#else /* CONFIG_PNFSD */
static inline void nfsd4_free_pnfs_slabs(void) {}
static inline int nfsd4_init_pnfs_slabs(void) { return 0; }
-static void release_pnfs_ds_dev_list(struct nfs4_stateid *stp) {}
-static void pnfs_expire_client(struct nfs4_client *clp) {}
+static inline void release_pnfs_ds_dev_list(struct nfs4_stateid *stp) {}
+static inline void pnfs_expire_client(struct nfs4_client *clp) {}
#endif /* CONFIG_PNFSD */

static inline void
--
1.6.4


>
> git clone git://git.linux-nfs.org/projects/bhalevy/linux-pnfs.git
>
> on 09/11/2009, so I have the latest patches.
>
> When make started, I got the errors shown below.
>
> Thanks for the help.
>
> Peixing
>
> CC [M] fs/nfsd/nfs4proc.o
> In file included from fs/nfsd/nfs4proc.c:47:
> include/linux/nfsd/state.h:483: warning: struct nfs4_layoutrecall
> declared inside parameter list
> include/linux/nfsd/state.h:483: warning: its scope is only this
> definition or declaration, which is probably not what you want
> include/linux/nfsd/state.h:484: warning: struct nfs4_layoutrecall
> declared inside parameter list
> include/linux/nfsd/state.h:485: warning: struct nfs4_layoutrecall
> declared inside parameter list
> include/linux/nfsd/state.h:487: warning: struct nfs4_layoutrecall
> declared inside parameter list
> include/linux/nfsd/state.h:518: error: static declaration of
> release_pnfs_ds_dev_list follows non-static declaration
> include/linux/nfsd/state.h:481: error: previous declaration of
> release_pnfs_ds_dev_list was here
> make[2]: *** [fs/nfsd/nfs4proc.o] Error 1
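
The patch above relies on a common header idiom: real prototypes when the feature is configured in, and empty `static inline` stubs otherwise, so callers need no #ifdefs and no prototype conflicts with a stub. The following is a minimal userspace sketch of that idiom, not the nfsd code. SKETCH_CONFIG_FEATURE stands in for CONFIG_PNFSD, and every sketch_ name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_client {
	int expired;	/* set by the real implementation only */
};

#if defined(SKETCH_CONFIG_FEATURE)
/* Feature built in: prototypes only, definitions live in a .c file. */
void sketch_expire_client(struct sketch_client *clp);
int sketch_init_slabs(void);
#else /* SKETCH_CONFIG_FEATURE */
/*
 * Feature compiled out: static inline no-op stubs. Marking them inline
 * avoids "defined but not used" warnings in files that include the
 * header without calling them, and keeping the prototypes inside the
 * #if branch avoids a static definition following a non-static
 * declaration.
 */
static inline void sketch_expire_client(struct sketch_client *clp)
{
	(void)clp;
}
static inline int sketch_init_slabs(void)
{
	return 0;
}
#endif /* SKETCH_CONFIG_FEATURE */

/* A caller that builds unchanged with or without the feature. */
static int sketch_startup(struct sketch_client *clp)
{
	int err;

	assert(clp != NULL);
	err = sketch_init_slabs();
	if (err)
		return err;
	sketch_expire_client(clp);
	return 0;
}
```

With SKETCH_CONFIG_FEATURE undefined, sketch_startup() compiles down to the stubs and leaves the client untouched.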

2009-09-13 20:27:18

by J. Bruce Fields

Subject: Re: [PATCH v2 09/12] nfsd41: Backchannel: cb_sequence callback

On Thu, Sep 10, 2009 at 12:26:51PM +0300, Benny Halevy wrote:
> Implement the cb_sequence callback conforming to draft-ietf-nfsv4-minorversion1
>
> Note: highest slot id and target highest slot id do not have to be 0
> as was previously implemented. They can be greater than what the
> nfs server sent if the client supports a larger slot table on the
> backchannel. At this point we just ignore that.

Minor point (applying as is), but, in future: a changelog that says how
this version of the patch differs from a previous version won't be
useful to someone reading the git history (and lacking the previous
versions). If you think the above mistake is one that someone might
risk making again, a comment in the appropriate spot in the code might
be more useful.

--b.

>
> Signed-off-by: Benny Halevy <[email protected]>
> Signed-off-by: Ricardo Labiaga <[email protected]>
> [Rework the back channel xdr using the shared v4.0 and v4.1 framework.]
> Signed-off-by: Andy Adamson <[email protected]>
> [fixed indentation]
> Signed-off-by: Benny Halevy <[email protected]>
> [nfsd41: use nfsd4_cb_sequence for callback minorversion]
> Signed-off-by: Benny Halevy <[email protected]>
> [nfsd41: fix verification of CB_SEQUENCE highest slot id]
> Signed-off-by: Benny Halevy <[email protected]>
> [nfsd41: Backchannel: Remove old backchannel serialization]
> [nfsd41: Backchannel: First callback sequence ID should be 1]
> Signed-off-by: Ricardo Labiaga <[email protected]>
> Signed-off-by: Benny Halevy <[email protected]>
> [nfsd41: decode_cb_sequence does not need to actually decode ignored fields]
> Signed-off-by: Benny Halevy <[email protected]>
> Signed-off-by: Ricardo Labiaga <[email protected]>
> Signed-off-by: Benny Halevy <[email protected]>
> ---
> fs/nfsd/nfs4callback.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++++
> 1 files changed, 72 insertions(+), 0 deletions(-)
>
> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> index e79e3a4..5e9659c 100644
> --- a/fs/nfsd/nfs4callback.c
> +++ b/fs/nfsd/nfs4callback.c
> @@ -256,6 +256,27 @@ encode_cb_recall(struct xdr_stream *xdr, struct nfs4_delegation *dp,
> hdr->nops++;
> }
>
> +static void
> +encode_cb_sequence(struct xdr_stream *xdr, struct nfsd4_cb_sequence *args,
> + struct nfs4_cb_compound_hdr *hdr)
> +{
> + __be32 *p;
> +
> + if (hdr->minorversion == 0)
> + return;
> +
> + RESERVE_SPACE(1 + NFS4_MAX_SESSIONID_LEN + 20);
> +
> + WRITE32(OP_CB_SEQUENCE);
> + WRITEMEM(args->cbs_clp->cl_sessionid.data, NFS4_MAX_SESSIONID_LEN);
> + WRITE32(args->cbs_clp->cl_cb_seq_nr);
> + WRITE32(0); /* slotid, always 0 */
> + WRITE32(0); /* highest slotid always 0 */
> + WRITE32(0); /* cachethis always 0 */
> + WRITE32(0); /* FIXME: support referring_call_lists */
> + hdr->nops++;
> +}
> +
> static int
> nfs4_xdr_enc_cb_null(struct rpc_rqst *req, __be32 *p)
> {
> @@ -317,6 +338,57 @@ decode_cb_op_hdr(struct xdr_stream *xdr, enum nfs_opnum4 expected)
> return 0;
> }
>
> +/*
> + * Our current back channel implementation supports a single backchannel
> + * with a single slot.
> + */
> +static int
> +decode_cb_sequence(struct xdr_stream *xdr, struct nfsd4_cb_sequence *res,
> + struct rpc_rqst *rqstp)
> +{
> + struct nfs4_sessionid id;
> + int status;
> + u32 dummy;
> + __be32 *p;
> +
> + if (res->cbs_minorversion == 0)
> + return 0;
> +
> + status = decode_cb_op_hdr(xdr, OP_CB_SEQUENCE);
> + if (status)
> + return status;
> +
> + /*
> + * If the server returns different values for sessionID, slotID or
> + * sequence number, the server is looney tunes.
> + */
> + status = -ESERVERFAULT;
> +
> + READ_BUF(NFS4_MAX_SESSIONID_LEN + 16);
> + memcpy(id.data, p, NFS4_MAX_SESSIONID_LEN);
> + p += XDR_QUADLEN(NFS4_MAX_SESSIONID_LEN);
> + if (memcmp(id.data, res->cbs_clp->cl_sessionid.data,
> + NFS4_MAX_SESSIONID_LEN)) {
> + dprintk("%s Invalid session id\n", __func__);
> + goto out;
> + }
> + READ32(dummy);
> + if (dummy != res->cbs_clp->cl_cb_seq_nr) {
> + dprintk("%s Invalid sequence number\n", __func__);
> + goto out;
> + }
> + READ32(dummy); /* slotid must be 0 */
> + if (dummy != 0) {
> + dprintk("%s Invalid slotid\n", __func__);
> + goto out;
> + }
> + /* FIXME: process highest slotid and target highest slotid */
> + status = 0;
> +out:
> + return status;
> +}
> +
> +
> static int
> nfs4_xdr_dec_cb_null(struct rpc_rqst *req, __be32 *p)
> {
> --
> 1.6.4
>

2009-09-13 20:39:46

by J. Bruce Fields

Subject: Re: [PATCH v2 10/12] nfsd41: Backchannel: Implement cb_recall over NFSv4.1

On Thu, Sep 10, 2009 at 12:27:04PM +0300, Benny Halevy wrote:
> @@ -668,16 +701,19 @@ static void nfsd4_cb_recall_done(struct rpc_task *task, void *calldata)
> break;
> default:
> /* success, or error we can't handle */
> - return;
> + goto done;
> }
> if (dp->dl_retries--) {
> rpc_delay(task, 2*HZ);
> task->tk_status = 0;
> rpc_restart_call(task);
> + return;
> } else {
> atomic_set(&clp->cl_cb_conn.cb_set, 0);
> warn_no_callback_path(clp, task->tk_status);
> }
> +done:
> + kfree(task->tk_msg.rpc_argp);
> }
>
> static void nfsd4_cb_recall_release(void *calldata)
> @@ -703,16 +739,24 @@ nfsd4_cb_recall(struct nfs4_delegation *dp)
> {
> struct nfs4_client *clp = dp->dl_client;
> struct rpc_clnt *clnt = clp->cl_cb_conn.cb_client;
> + struct nfs4_rpc_args *args;
> struct rpc_message msg = {
> .rpc_proc = &nfs4_cb_procedures[NFSPROC4_CLNT_CB_RECALL],
> - .rpc_argp = dp,
> .rpc_cred = clp->cl_cb_conn.cb_cred
> };
> int status;
>
> + args = kzalloc(sizeof(*args), GFP_KERNEL);
> + if (!args) {
> + status = -ENOMEM;
> + goto out;
> + }
> + args->args_op = dp;
> + msg.rpc_argp = args;
> dp->dl_retries = 1;
> status = rpc_call_async(clnt, &msg, RPC_TASK_SOFT,
> &nfsd4_cb_recall_ops, dp);
> +out:
> if (status) {
> put_nfs4_client(clp);
> nfs4_put_delegation(dp);
> --
> 1.6.4
>

Tracing down through rpc_call_async.... It doesn't look to me like it
will call rpc_call_done on returning an error, so you're leaking args in
that case. Probably just need a kfree(args); applying as below.

--b.
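
The ownership rule behind that kfree(args) can be sketched in plain userspace C: if the asynchronous submit fails, the completion callback never runs, so the caller still owns the argument buffer and has to free it. This is only an illustration under that assumption, not the sunrpc code. sketch_submit_async() is a hypothetical stand-in for rpc_call_async(), and its `fail` parameter exists only to exercise the error path.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct sketch_args {
	void *op;	/* the operation-specific argument */
};

typedef void (*sketch_done_fn)(struct sketch_args *args);

static int sketch_frees;	/* counts buffers released, for checking */

static void sketch_free_args(struct sketch_args *args)
{
	if (args) {
		free(args);
		sketch_frees++;
	}
}

/* Completion callback: runs only for submits that were accepted. */
static void sketch_done(struct sketch_args *args)
{
	sketch_free_args(args);
}

/* Hypothetical submit: on failure, 'done' is never invoked. */
static int sketch_submit_async(struct sketch_args *args,
			       sketch_done_fn done, int fail)
{
	assert(done != NULL);
	if (fail)
		return -EIO;
	done(args);	/* in this sketch, completion runs immediately */
	return 0;
}

static int sketch_recall(void *op, int fail)
{
	struct sketch_args *args;
	int status = -ENOMEM;

	args = calloc(1, sizeof(*args));
	if (!args)
		goto out;
	args->op = op;
	status = sketch_submit_async(args, sketch_done, fail);
out:
	if (status)
		sketch_free_args(args);	/* callback will not free it */
	return status;
}
```

Either way, each call to sketch_recall() releases its buffer exactly once: in the callback on success, or in the error path on failure.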

commit 18d8774a149c28b8c9ac66592873e78bbc66330a
Author: Ricardo Labiaga <[email protected]>
Date: Thu Sep 10 12:27:04 2009 +0300

nfsd41: Backchannel: Implement cb_recall over NFSv4.1

Signed-off-by: Ricardo Labiaga <[email protected]>
[nfsd41: cb_recall callback]
[Share v4.0 and v4.1 back channel xdr]
Signed-off-by: Andy Adamson <[email protected]>
Signed-off-by: Ricardo Labiaga <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
[Share v4.0 and v4.1 back channel xdr]
Signed-off-by: Andy Adamson <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
[nfsd41: use nfsd4_cb_sequence for callback minorversion]
[nfsd41: conditionally decode_sequence in nfs4_xdr_dec_cb_recall]
Signed-off-by: Benny Halevy <[email protected]>
[nfsd41: Backchannel: Add sequence arguments to callback RPC arguments]
Signed-off-by: Ricardo Labiaga <[email protected]>
[pulled-in definition of nfsd4_cb_done]
Signed-off-by: Benny Halevy <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 74c8b33..fabfe12 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -267,15 +267,19 @@ nfs4_xdr_enc_cb_null(struct rpc_rqst *req, __be32 *p)
}

static int
-nfs4_xdr_enc_cb_recall(struct rpc_rqst *req, __be32 *p, struct nfs4_delegation *args)
+nfs4_xdr_enc_cb_recall(struct rpc_rqst *req, __be32 *p,
+ struct nfs4_rpc_args *rpc_args)
{
struct xdr_stream xdr;
+ struct nfs4_delegation *args = rpc_args->args_op;
struct nfs4_cb_compound_hdr hdr = {
.ident = args->dl_ident,
+ .minorversion = rpc_args->args_seq.cbs_minorversion,
};

xdr_init_encode(&xdr, &req->rq_snd_buf, p);
encode_cb_compound_hdr(&xdr, &hdr);
+ encode_cb_sequence(&xdr, &rpc_args->args_seq, &hdr);
encode_cb_recall(&xdr, args, &hdr);
encode_cb_nops(&hdr);
return 0;
@@ -324,7 +328,8 @@ nfs4_xdr_dec_cb_null(struct rpc_rqst *req, __be32 *p)
}

static int
-nfs4_xdr_dec_cb_recall(struct rpc_rqst *rqstp, __be32 *p)
+nfs4_xdr_dec_cb_recall(struct rpc_rqst *rqstp, __be32 *p,
+ struct nfsd4_cb_sequence *seq)
{
struct xdr_stream xdr;
struct nfs4_cb_compound_hdr hdr;
@@ -334,6 +339,11 @@ nfs4_xdr_dec_cb_recall(struct rpc_rqst *rqstp, __be32 *p)
status = decode_cb_compound_hdr(&xdr, &hdr);
if (status)
goto out;
+ if (seq) {
+ status = decode_cb_sequence(&xdr, seq, rqstp);
+ if (status)
+ goto out;
+ }
status = decode_cb_op_hdr(&xdr, OP_CB_RECALL);
out:
return status;
@@ -575,11 +585,34 @@ static void nfsd4_cb_prepare(struct rpc_task *task, void *calldata)
rpc_call_start(task);
}

+static void nfsd4_cb_done(struct rpc_task *task, void *calldata)
+{
+ struct nfs4_delegation *dp = calldata;
+ struct nfs4_client *clp = dp->dl_client;
+
+ dprintk("%s: minorversion=%d\n", __func__,
+ clp->cl_cb_conn.cb_minorversion);
+
+ if (clp->cl_cb_conn.cb_minorversion) {
+ /* No need for lock, access serialized in nfsd4_cb_prepare */
+ ++clp->cl_cb_seq_nr;
+ clear_bit(0, &clp->cl_cb_slot_busy);
+ rpc_wake_up_next(&clp->cl_cb_waitq);
+ dprintk("%s: freed slot, new seqid=%d\n", __func__,
+ clp->cl_cb_seq_nr);
+
+ /* We're done looking into the sequence information */
+ task->tk_msg.rpc_resp = NULL;
+ }
+}
+
static void nfsd4_cb_recall_done(struct rpc_task *task, void *calldata)
{
struct nfs4_delegation *dp = calldata;
struct nfs4_client *clp = dp->dl_client;

+ nfsd4_cb_done(task, calldata);
+
switch (task->tk_status) {
case -EIO:
/* Network partition? */
@@ -592,16 +625,19 @@ static void nfsd4_cb_recall_done(struct rpc_task *task, void *calldata)
break;
default:
/* success, or error we can't handle */
- return;
+ goto done;
}
if (dp->dl_retries--) {
rpc_delay(task, 2*HZ);
task->tk_status = 0;
rpc_restart_call(task);
+ return;
} else {
atomic_set(&clp->cl_cb_conn.cb_set, 0);
warn_no_callback_path(clp, task->tk_status);
}
+done:
+ kfree(task->tk_msg.rpc_argp);
}

static void nfsd4_cb_recall_release(void *calldata)
@@ -627,17 +663,24 @@ nfsd4_cb_recall(struct nfs4_delegation *dp)
{
struct nfs4_client *clp = dp->dl_client;
struct rpc_clnt *clnt = clp->cl_cb_conn.cb_client;
+ struct nfs4_rpc_args *args;
struct rpc_message msg = {
.rpc_proc = &nfs4_cb_procedures[NFSPROC4_CLNT_CB_RECALL],
- .rpc_argp = dp,
.rpc_cred = clp->cl_cb_conn.cb_cred
};
- int status;
+ int status = -ENOMEM;

+ args = kzalloc(sizeof(*args), GFP_KERNEL);
+ if (!args)
+ goto out;
+ args->args_op = dp;
+ msg.rpc_argp = args;
dp->dl_retries = 1;
status = rpc_call_async(clnt, &msg, RPC_TASK_SOFT,
&nfsd4_cb_recall_ops, dp);
+out:
if (status) {
+ kfree(args);
put_nfs4_client(clp);
nfs4_put_delegation(dp);
}

2009-09-14 07:21:47

by Boaz Harrosh

Subject: Re: [pnfs] [PATCH v2 09/12] nfsd41: Backchannel: cb_sequence callback

On 09/13/2009 11:27 PM, J. Bruce Fields wrote:
> On Thu, Sep 10, 2009 at 12:26:51PM +0300, Benny Halevy wrote:
>> Implement the cb_sequence callback conforming to draft-ietf-nfsv4-minorversion1
>>
>> Note: highest slot id and target highest slot id do not have to be 0
>> as was previously implemented. They can be greater than what the
>> nfs server sent if the client supports a larger slot table on the
>> backchannel. At this point we just ignore that.
>
> Minor point (applying as is), but, in future: a changelog that says how
> this version of the patch differs from a previous version won't be
> useful to someone reading the git history (and lacking the previous
> versions). If you think the above mistake is one that someone might
> risk making again, a comment in the appropriate spot in the code might
> be more useful.
>
> --b.
>

I disagree. The paragraph above is perfectly understandable if you just
remove the "as was previously implemented". There is no missing information.
Then if so, the "as was previously implemented" is useful extra information
that says that "we tried that before and it was bad". This is the kind of
information that is important to a reader, even without reading the old version.

Just my $0.017

Boaz

2009-09-14 08:27:25

by Benny Halevy

Subject: Re: [PATCH v2 10/12] nfsd41: Backchannel: Implement cb_recall over NFSv4.1

On Sep. 13, 2009, 23:39 +0300, "J. Bruce Fields" <[email protected]> wrote:
> On Thu, Sep 10, 2009 at 12:27:04PM +0300, Benny Halevy wrote:
>> @@ -668,16 +701,19 @@ static void nfsd4_cb_recall_done(struct rpc_task *task, void *calldata)
>> break;
>> default:
>> /* success, or error we can't handle */
>> - return;
>> + goto done;
>> }
>> if (dp->dl_retries--) {
>> rpc_delay(task, 2*HZ);
>> task->tk_status = 0;
>> rpc_restart_call(task);
>> + return;
>> } else {
>> atomic_set(&clp->cl_cb_conn.cb_set, 0);
>> warn_no_callback_path(clp, task->tk_status);
>> }
>> +done:
>> + kfree(task->tk_msg.rpc_argp);
>> }
>>
>> static void nfsd4_cb_recall_release(void *calldata)
>> @@ -703,16 +739,24 @@ nfsd4_cb_recall(struct nfs4_delegation *dp)
>> {
>> struct nfs4_client *clp = dp->dl_client;
>> struct rpc_clnt *clnt = clp->cl_cb_conn.cb_client;
>> + struct nfs4_rpc_args *args;
>> struct rpc_message msg = {
>> .rpc_proc = &nfs4_cb_procedures[NFSPROC4_CLNT_CB_RECALL],
>> - .rpc_argp = dp,
>> .rpc_cred = clp->cl_cb_conn.cb_cred
>> };
>> int status;
>>
>> + args = kzalloc(sizeof(*args), GFP_KERNEL);
>> + if (!args) {
>> + status = -ENOMEM;
>> + goto out;
>> + }
>> + args->args_op = dp;
>> + msg.rpc_argp = args;
>> dp->dl_retries = 1;
>> status = rpc_call_async(clnt, &msg, RPC_TASK_SOFT,
>> &nfsd4_cb_recall_ops, dp);
>> +out:
>> if (status) {
>> put_nfs4_client(clp);
>> nfs4_put_delegation(dp);
>> --
>> 1.6.4
>>
>
> Tracing down through rpc_call_async.... It doesn't look to me like it
> will call rpc_call_done on returning an error, so you're leaking args in
> that case. Probably just need a kfree(args); applying as below.

Right on the spot. Thanks!
(We could do that in the rpc_release method but then the
calldata would have to be a struct nfs4_rpc_args
which is somewhat cumbersome...)

Benny

>
> --b.
>
> commit 18d8774a149c28b8c9ac66592873e78bbc66330a
> Author: Ricardo Labiaga <[email protected]>
> Date: Thu Sep 10 12:27:04 2009 +0300
>
> nfsd41: Backchannel: Implement cb_recall over NFSv4.1
>
> Signed-off-by: Ricardo Labiaga <[email protected]>
> [nfsd41: cb_recall callback]
> [Share v4.0 and v4.1 back channel xdr]
> Signed-off-by: Andy Adamson <[email protected]>
> Signed-off-by: Ricardo Labiaga <[email protected]>
> Signed-off-by: Benny Halevy <[email protected]>
> [Share v4.0 and v4.1 back channel xdr]
> Signed-off-by: Andy Adamson <[email protected]>
> Signed-off-by: Benny Halevy <[email protected]>
> [nfsd41: use nfsd4_cb_sequence for callback minorversion]
> [nfsd41: conditionally decode_sequence in nfs4_xdr_dec_cb_recall]
> Signed-off-by: Benny Halevy <[email protected]>
> [nfsd41: Backchannel: Add sequence arguments to callback RPC arguments]
> Signed-off-by: Ricardo Labiaga <[email protected]>
> [pulled-in definition of nfsd4_cb_done]
> Signed-off-by: Benny Halevy <[email protected]>
> Signed-off-by: J. Bruce Fields <[email protected]>
>
> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> index 74c8b33..fabfe12 100644
> --- a/fs/nfsd/nfs4callback.c
> +++ b/fs/nfsd/nfs4callback.c
> @@ -267,15 +267,19 @@ nfs4_xdr_enc_cb_null(struct rpc_rqst *req, __be32 *p)
> }
>
> static int
> -nfs4_xdr_enc_cb_recall(struct rpc_rqst *req, __be32 *p, struct nfs4_delegation *args)
> +nfs4_xdr_enc_cb_recall(struct rpc_rqst *req, __be32 *p,
> + struct nfs4_rpc_args *rpc_args)
> {
> struct xdr_stream xdr;
> + struct nfs4_delegation *args = rpc_args->args_op;
> struct nfs4_cb_compound_hdr hdr = {
> .ident = args->dl_ident,
> + .minorversion = rpc_args->args_seq.cbs_minorversion,
> };
>
> xdr_init_encode(&xdr, &req->rq_snd_buf, p);
> encode_cb_compound_hdr(&xdr, &hdr);
> + encode_cb_sequence(&xdr, &rpc_args->args_seq, &hdr);
> encode_cb_recall(&xdr, args, &hdr);
> encode_cb_nops(&hdr);
> return 0;
> @@ -324,7 +328,8 @@ nfs4_xdr_dec_cb_null(struct rpc_rqst *req, __be32 *p)
> }
>
> static int
> -nfs4_xdr_dec_cb_recall(struct rpc_rqst *rqstp, __be32 *p)
> +nfs4_xdr_dec_cb_recall(struct rpc_rqst *rqstp, __be32 *p,
> + struct nfsd4_cb_sequence *seq)
> {
> struct xdr_stream xdr;
> struct nfs4_cb_compound_hdr hdr;
> @@ -334,6 +339,11 @@ nfs4_xdr_dec_cb_recall(struct rpc_rqst *rqstp, __be32 *p)
> status = decode_cb_compound_hdr(&xdr, &hdr);
> if (status)
> goto out;
> + if (seq) {
> + status = decode_cb_sequence(&xdr, seq, rqstp);
> + if (status)
> + goto out;
> + }
> status = decode_cb_op_hdr(&xdr, OP_CB_RECALL);
> out:
> return status;
> @@ -575,11 +585,34 @@ static void nfsd4_cb_prepare(struct rpc_task *task, void *calldata)
> rpc_call_start(task);
> }
>
> +static void nfsd4_cb_done(struct rpc_task *task, void *calldata)
> +{
> + struct nfs4_delegation *dp = calldata;
> + struct nfs4_client *clp = dp->dl_client;
> +
> + dprintk("%s: minorversion=%d\n", __func__,
> + clp->cl_cb_conn.cb_minorversion);
> +
> + if (clp->cl_cb_conn.cb_minorversion) {
> + /* No need for lock, access serialized in nfsd4_cb_prepare */
> + ++clp->cl_cb_seq_nr;
> + clear_bit(0, &clp->cl_cb_slot_busy);
> + rpc_wake_up_next(&clp->cl_cb_waitq);
> + dprintk("%s: freed slot, new seqid=%d\n", __func__,
> + clp->cl_cb_seq_nr);
> +
> + /* We're done looking into the sequence information */
> + task->tk_msg.rpc_resp = NULL;
> + }
> +}
> +
> static void nfsd4_cb_recall_done(struct rpc_task *task, void *calldata)
> {
> struct nfs4_delegation *dp = calldata;
> struct nfs4_client *clp = dp->dl_client;
>
> + nfsd4_cb_done(task, calldata);
> +
> switch (task->tk_status) {
> case -EIO:
> /* Network partition? */
> @@ -592,16 +625,19 @@ static void nfsd4_cb_recall_done(struct rpc_task *task, void *calldata)
> break;
> default:
> /* success, or error we can't handle */
> - return;
> + goto done;
> }
> if (dp->dl_retries--) {
> rpc_delay(task, 2*HZ);
> task->tk_status = 0;
> rpc_restart_call(task);
> + return;
> } else {
> atomic_set(&clp->cl_cb_conn.cb_set, 0);
> warn_no_callback_path(clp, task->tk_status);
> }
> +done:
> + kfree(task->tk_msg.rpc_argp);
> }
>
> static void nfsd4_cb_recall_release(void *calldata)
> @@ -627,17 +663,24 @@ nfsd4_cb_recall(struct nfs4_delegation *dp)
> {
> struct nfs4_client *clp = dp->dl_client;
> struct rpc_clnt *clnt = clp->cl_cb_conn.cb_client;
> + struct nfs4_rpc_args *args;
> struct rpc_message msg = {
> .rpc_proc = &nfs4_cb_procedures[NFSPROC4_CLNT_CB_RECALL],
> - .rpc_argp = dp,
> .rpc_cred = clp->cl_cb_conn.cb_cred
> };
> - int status;
> + int status = -ENOMEM;
>
> + args = kzalloc(sizeof(*args), GFP_KERNEL);
> + if (!args)
> + goto out;
> + args->args_op = dp;
> + msg.rpc_argp = args;
> dp->dl_retries = 1;
> status = rpc_call_async(clnt, &msg, RPC_TASK_SOFT,
> &nfsd4_cb_recall_ops, dp);
> +out:
> if (status) {
> + kfree(args);
> put_nfs4_client(clp);
> nfs4_put_delegation(dp);
> }

2009-09-14 16:35:34

by J. Bruce Fields

Subject: Re: [PATCH v2 05/12] nfsd41: Backchannel: callback infrastructure

On Thu, Sep 10, 2009 at 12:25:59PM +0300, Benny Halevy wrote:
> From: Andy Adamson <[email protected]>
>
> Keep the xprt used for create_session in cl_cb_xprt.
> Mark cl_callback.cb_minorversion = 1 and remember
> the client provided cl_callback.cb_prog rpc program number.
> Use it to probe the callback path.
>
> Use the client's network address to initialize the
> callback's address, as expected by the xprt creation
> routines.
>
> Define xdr sizes and code nfs4_cb_compound header to be able
> to send a null callback rpc.
>
> Signed-off-by: Andy Adamson<[email protected]>
> Signed-off-by: Benny Halevy <[email protected]>
> Signed-off-by: Ricardo Labiaga <[email protected]>
> [get callback minorversion from fore channel's]
> Signed-off-by: Benny Halevy <[email protected]>
> [nfsd41: change bc_sock to bc_xprt]
> Signed-off-by: Benny Halevy <[email protected]>
> [pulled definition for cl_cb_xprt]
> Signed-off-by: Benny Halevy <[email protected]>
> [nfsd41: set up backchannel's cb_addr]
> [moved rpc_create_args init to "nfsd: modify nfsd4.1 backchannel to use new xprt class"]
> Signed-off-by: Benny Halevy <[email protected]>
> ---
> fs/nfsd/nfs4callback.c | 21 +++++++++++++++++++--
> fs/nfsd/nfs4state.c | 14 ++++++++++++++
> include/linux/nfsd/state.h | 3 +++
> 3 files changed, 36 insertions(+), 2 deletions(-)
>
> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> index 63bb384..3e3e15b 100644
> --- a/fs/nfsd/nfs4callback.c
> +++ b/fs/nfsd/nfs4callback.c
> @@ -43,6 +43,7 @@
> #include <linux/sunrpc/xdr.h>
> #include <linux/sunrpc/svc.h>
> #include <linux/sunrpc/clnt.h>
> +#include <linux/sunrpc/svcsock.h>
> #include <linux/nfsd/nfsd.h>
> #include <linux/nfsd/state.h>
> #include <linux/sunrpc/sched.h>
> @@ -52,16 +53,19 @@
>
> #define NFSPROC4_CB_NULL 0
> #define NFSPROC4_CB_COMPOUND 1
> +#define NFS4_STATEID_SIZE 16
>
> /* Index of predefined Linux callback client operations */
>
> enum {
> NFSPROC4_CLNT_CB_NULL = 0,
> NFSPROC4_CLNT_CB_RECALL,
> + NFSPROC4_CLNT_CB_SEQUENCE,
> };
>
> enum nfs_cb_opnum4 {
> OP_CB_RECALL = 4,
> + OP_CB_SEQUENCE = 11,
> };
>
> #define NFS4_MAXTAGLEN 20
> @@ -70,15 +74,22 @@ enum nfs_cb_opnum4 {
> #define NFS4_dec_cb_null_sz 0
> #define cb_compound_enc_hdr_sz 4
> #define cb_compound_dec_hdr_sz (3 + (NFS4_MAXTAGLEN >> 2))
> +#define sessionid_sz (NFS4_MAX_SESSIONID_LEN >> 2)
> +#define cb_sequence_enc_sz (sessionid_sz + 4 + \
> + 1 /* no referring calls list yet */)
> +#define cb_sequence_dec_sz (op_dec_sz + sessionid_sz + 4)
> +
> #define op_enc_sz 1
> #define op_dec_sz 2
> #define enc_nfs4_fh_sz (1 + (NFS4_FHSIZE >> 2))
> #define enc_stateid_sz (NFS4_STATEID_SIZE >> 2)
> #define NFS4_enc_cb_recall_sz (cb_compound_enc_hdr_sz + \
> + cb_sequence_enc_sz + \
> 1 + enc_stateid_sz + \
> enc_nfs4_fh_sz)
>
> #define NFS4_dec_cb_recall_sz (cb_compound_dec_hdr_sz + \
> + cb_sequence_dec_sz + \
> op_dec_sz)
>
> /*
> @@ -137,11 +148,13 @@ xdr_error: \
> } while (0)
>
> struct nfs4_cb_compound_hdr {
> - int status;
> - u32 ident;
> + /* args */
> + u32 ident; /* minorversion 0 only */
> u32 nops;
> __be32 *nops_p;
> u32 minorversion;
> + /* res */
> + int status;
> u32 taglen;
> char *tag;
> };
> @@ -399,6 +412,10 @@ int setup_callback_client(struct nfs4_client *clp)
> if (!clp->cl_principal && (clp->cl_flavor >= RPC_AUTH_GSS_KRB5))
> return -EINVAL;
>
> + dprintk("%s: program %s 0x%x nrvers %u version %u minorversion %u\n",
> + __func__, args.program->name, args.prognumber,
> + args.program->nrvers, args.version, cb->cb_minorversion);
> +
> /* Create RPC client */
> client = rpc_create(&args);
> if (IS_ERR(client)) {
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 46e9ac5..e4c3223 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -706,6 +706,8 @@ static inline void
> free_client(struct nfs4_client *clp)
> {
> shutdown_callback_client(clp);
> + if (clp->cl_cb_xprt)
> + svc_xprt_put(clp->cl_cb_xprt);
> if (clp->cl_cred.cr_group_info)
> put_group_info(clp->cl_cred.cr_group_info);
> kfree(clp->cl_principal);
> @@ -1321,6 +1323,18 @@ nfsd4_create_session(struct svc_rqst *rqstp,
> cr_ses->flags &= ~SESSION4_PERSIST;
> cr_ses->flags &= ~SESSION4_RDMA;
>
> + if (cr_ses->flags & SESSION4_BACK_CHAN) {
> + unconf->cl_cb_xprt = rqstp->rq_xprt;
> + svc_xprt_get(unconf->cl_cb_xprt);
> + rpc_copy_addr(
> + (struct sockaddr *)&unconf->cl_cb_conn.cb_addr,
> + sa);
> + unconf->cl_cb_conn.cb_addrlen = svc_addr_len(sa);
> + unconf->cl_cb_conn.cb_minorversion =
> + cstate->minorversion;
> + unconf->cl_cb_conn.cb_prog = cr_ses->callback_prog;
> + nfsd4_probe_callback(unconf);

This results in a NULL dereference in rpcauth_lookup_credcache(), probably
because some callback parameters aren't set up right yet.

--b.

> + }
> conf = unconf;
> } else {
> status = nfserr_stale_clientid;
> diff --git a/include/linux/nfsd/state.h b/include/linux/nfsd/state.h
> index 70ef5f4..243277b 100644
> --- a/include/linux/nfsd/state.h
> +++ b/include/linux/nfsd/state.h
> @@ -212,6 +212,9 @@ struct nfs4_client {
> struct nfsd4_clid_slot cl_cs_slot; /* create_session slot */
> u32 cl_exchange_flags;
> struct nfs4_sessionid cl_sessionid;
> +
> + /* for nfs41 callbacks */
> + struct svc_xprt *cl_cb_xprt; /* 4.1 callback transport */
> };
>
> /* struct nfs4_client_reset
> --
> 1.6.4
>
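
For reference, the XDR size macros in the patch quoted above count 4-byte XDR words, which is why byte lengths are shifted right by 2. The following is only a sketch that works through the arithmetic. It assumes NFS4_MAX_SESSIONID_LEN is 16 and NFS4_FHSIZE is 128 (neither value appears in this patch), and xdr_words_to_bytes() and check_cb_recall_sizes() are hypothetical helpers added for illustration.

```c
#include <assert.h>

/* Assumed values; neither constant is defined in the quoted patch. */
#define NFS4_MAX_SESSIONID_LEN 16
#define NFS4_FHSIZE 128

/* Values taken from the quoted patch. */
#define NFS4_STATEID_SIZE 16
#define NFS4_MAXTAGLEN 20

/* All sizes below are counted in 4-byte XDR words, hence the ">> 2". */
#define op_enc_sz 1
#define op_dec_sz 2
#define cb_compound_enc_hdr_sz 4
#define cb_compound_dec_hdr_sz (3 + (NFS4_MAXTAGLEN >> 2))
#define sessionid_sz (NFS4_MAX_SESSIONID_LEN >> 2)
#define cb_sequence_enc_sz (sessionid_sz + 4 + \
			    1 /* no referring calls list yet */)
#define cb_sequence_dec_sz (op_dec_sz + sessionid_sz + 4)
#define enc_nfs4_fh_sz (1 + (NFS4_FHSIZE >> 2))
#define enc_stateid_sz (NFS4_STATEID_SIZE >> 2)
#define NFS4_enc_cb_recall_sz (cb_compound_enc_hdr_sz + \
			       cb_sequence_enc_sz + \
			       1 + enc_stateid_sz + \
			       enc_nfs4_fh_sz)
#define NFS4_dec_cb_recall_sz (cb_compound_dec_hdr_sz + \
			       cb_sequence_dec_sz + \
			       op_dec_sz)

/* Hypothetical helper: convert a size in XDR words to bytes. */
static unsigned int xdr_words_to_bytes(unsigned int words)
{
	return words << 2;
}

/* Hypothetical helper: check the word counts under the assumptions above. */
static void check_cb_recall_sizes(void)
{
	assert(cb_sequence_enc_sz == 9);      /* 4 + 4 + 1 */
	assert(cb_sequence_dec_sz == 10);     /* 2 + 4 + 4 */
	assert(NFS4_enc_cb_recall_sz == 51);  /* 4 + 9 + 1 + 4 + 33 */
	assert(NFS4_dec_cb_recall_sz == 20);  /* 8 + 10 + 2 */
}
```

Under those assumptions the encoded CB_RECALL compound is at most 51 words (204 bytes), of which the new cb_sequence part accounts for 9 words.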

2009-09-14 16:49:49

by J. Bruce Fields

Subject: Re: [PATCH v2 05/12] nfsd41: Backchannel: callback infrastructure

On Mon, Sep 14, 2009 at 12:35:35PM -0400, bfields wrote:
> On Thu, Sep 10, 2009 at 12:25:59PM +0300, Benny Halevy wrote:
> > From: Andy Adamson <[email protected]>
> >
> > Keep the xprt used for create_session in cl_cb_xprt.
> > Mark cl_callback.cb_minorversion = 1 and remember
> > the client provided cl_callback.cb_prog rpc program number.
> > Use it to probe the callback path.
> >
> > Use the client's network address to initialize the callback's
> > address, as expected by the xprt creation routines.
> >
> > Define xdr sizes and code nfs4_cb_compound header to be able
> > to send a null callback rpc.
> >
> > Signed-off-by: Andy Adamson<[email protected]>
> > Signed-off-by: Benny Halevy <[email protected]>
> > Signed-off-by: Ricardo Labiaga <[email protected]>
> > [get callback minorversion from fore channel's]
> > Signed-off-by: Benny Halevy <[email protected]>
> > [nfsd41: change bc_sock to bc_xprt]
> > Signed-off-by: Benny Halevy <[email protected]>
> > [pulled definition for cl_cb_xprt]
> > Signed-off-by: Benny Halevy <[email protected]>
> > [nfsd41: set up backchannel's cb_addr]
> > [moved rpc_create_args init to "nfsd: modify nfsd4.1 backchannel to use new xprt class"]
> > Signed-off-by: Benny Halevy <[email protected]>
> > ---
> > fs/nfsd/nfs4callback.c | 21 +++++++++++++++++++--
> > fs/nfsd/nfs4state.c | 14 ++++++++++++++
> > include/linux/nfsd/state.h | 3 +++
> > 3 files changed, 36 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > index 63bb384..3e3e15b 100644
> > --- a/fs/nfsd/nfs4callback.c
> > +++ b/fs/nfsd/nfs4callback.c
> > @@ -43,6 +43,7 @@
> > #include <linux/sunrpc/xdr.h>
> > #include <linux/sunrpc/svc.h>
> > #include <linux/sunrpc/clnt.h>
> > +#include <linux/sunrpc/svcsock.h>
> > #include <linux/nfsd/nfsd.h>
> > #include <linux/nfsd/state.h>
> > #include <linux/sunrpc/sched.h>
> > @@ -52,16 +53,19 @@
> >
> > #define NFSPROC4_CB_NULL 0
> > #define NFSPROC4_CB_COMPOUND 1
> > +#define NFS4_STATEID_SIZE 16
> >
> > /* Index of predefined Linux callback client operations */
> >
> > enum {
> > NFSPROC4_CLNT_CB_NULL = 0,
> > NFSPROC4_CLNT_CB_RECALL,
> > + NFSPROC4_CLNT_CB_SEQUENCE,
> > };
> >
> > enum nfs_cb_opnum4 {
> > OP_CB_RECALL = 4,
> > + OP_CB_SEQUENCE = 11,
> > };
> >
> > #define NFS4_MAXTAGLEN 20
> > @@ -70,15 +74,22 @@ enum nfs_cb_opnum4 {
> > #define NFS4_dec_cb_null_sz 0
> > #define cb_compound_enc_hdr_sz 4
> > #define cb_compound_dec_hdr_sz (3 + (NFS4_MAXTAGLEN >> 2))
> > +#define sessionid_sz (NFS4_MAX_SESSIONID_LEN >> 2)
> > +#define cb_sequence_enc_sz (sessionid_sz + 4 + \
> > + 1 /* no referring calls list yet */)
> > +#define cb_sequence_dec_sz (op_dec_sz + sessionid_sz + 4)
> > +
> > #define op_enc_sz 1
> > #define op_dec_sz 2
> > #define enc_nfs4_fh_sz (1 + (NFS4_FHSIZE >> 2))
> > #define enc_stateid_sz (NFS4_STATEID_SIZE >> 2)
> > #define NFS4_enc_cb_recall_sz (cb_compound_enc_hdr_sz + \
> > + cb_sequence_enc_sz + \
> > 1 + enc_stateid_sz + \
> > enc_nfs4_fh_sz)
> >
> > #define NFS4_dec_cb_recall_sz (cb_compound_dec_hdr_sz + \
> > + cb_sequence_dec_sz + \
> > op_dec_sz)
> >
> > /*
> > @@ -137,11 +148,13 @@ xdr_error: \
> > } while (0)
> >
> > struct nfs4_cb_compound_hdr {
> > - int status;
> > - u32 ident;
> > + /* args */
> > + u32 ident; /* minorversion 0 only */
> > u32 nops;
> > __be32 *nops_p;
> > u32 minorversion;
> > + /* res */
> > + int status;
> > u32 taglen;
> > char *tag;
> > };
> > @@ -399,6 +412,10 @@ int setup_callback_client(struct nfs4_client *clp)
> > if (!clp->cl_principal && (clp->cl_flavor >= RPC_AUTH_GSS_KRB5))
> > return -EINVAL;
> >
> > + dprintk("%s: program %s 0x%x nrvers %u version %u minorversion %u\n",
> > + __func__, args.program->name, args.prognumber,
> > + args.program->nrvers, args.version, cb->cb_minorversion);
> > +
> > /* Create RPC client */
> > client = rpc_create(&args);
> > if (IS_ERR(client)) {
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index 46e9ac5..e4c3223 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -706,6 +706,8 @@ static inline void
> > free_client(struct nfs4_client *clp)
> > {
> > shutdown_callback_client(clp);
> > + if (clp->cl_cb_xprt)
> > + svc_xprt_put(clp->cl_cb_xprt);
> > if (clp->cl_cred.cr_group_info)
> > put_group_info(clp->cl_cred.cr_group_info);
> > kfree(clp->cl_principal);
> > @@ -1321,6 +1323,18 @@ nfsd4_create_session(struct svc_rqst *rqstp,
> > cr_ses->flags &= ~SESSION4_PERSIST;
> > cr_ses->flags &= ~SESSION4_RDMA;
> >
> > + if (cr_ses->flags & SESSION4_BACK_CHAN) {
> > + unconf->cl_cb_xprt = rqstp->rq_xprt;
> > + svc_xprt_get(unconf->cl_cb_xprt);
> > + rpc_copy_addr(
> > + (struct sockaddr *)&unconf->cl_cb_conn.cb_addr,
> > + sa);
> > + unconf->cl_cb_conn.cb_addrlen = svc_addr_len(sa);
> > + unconf->cl_cb_conn.cb_minorversion =
> > + cstate->minorversion;
> > + unconf->cl_cb_conn.cb_prog = cr_ses->callback_prog;
> > + nfsd4_probe_callback(unconf);
>
> This results in a NULL dereference in rpcauth_lookup_credcache(), probably
> because some callback parameters aren't set up right yet.

Note: that's fixed 7 patches later in "nfsd41: Refactor create_client()",
but I don't actually understand how yet.

--b.

>
> --b.
>
> > + }
> > conf = unconf;
> > } else {
> > status = nfserr_stale_clientid;
> > diff --git a/include/linux/nfsd/state.h b/include/linux/nfsd/state.h
> > index 70ef5f4..243277b 100644
> > --- a/include/linux/nfsd/state.h
> > +++ b/include/linux/nfsd/state.h
> > @@ -212,6 +212,9 @@ struct nfs4_client {
> > struct nfsd4_clid_slot cl_cs_slot; /* create_session slot */
> > u32 cl_exchange_flags;
> > struct nfs4_sessionid cl_sessionid;
> > +
> > + /* for nfs41 callbacks */
> > + struct svc_xprt *cl_cb_xprt; /* 4.1 callback transport */
> > };
> >
> > /* struct nfs4_client_reset
> > --
> > 1.6.4
> >

2009-09-14 17:22:29

by Benny Halevy

Subject: Re: [PATCH v2 05/12] nfsd41: Backchannel: callback infrastructure

On Sep. 14, 2009, 19:49 +0300, "J. Bruce Fields" <[email protected]> wrote:
> On Mon, Sep 14, 2009 at 12:35:35PM -0400, bfields wrote:
>> On Thu, Sep 10, 2009 at 12:25:59PM +0300, Benny Halevy wrote:
>>> From: Andy Adamson <[email protected]>
>>>
>>> Keep the xprt used for create_session in cl_cb_xprt.
>>> Mark cl_callback.cb_minorversion = 1 and remember
>>> the client provided cl_callback.cb_prog rpc program number.
>>> Use it to probe the callback path.
>>>
>>> Use the client's network address to initialize the callback's
>>> address, as expected by the xprt creation routines.
>>>
>>> Define xdr sizes and code nfs4_cb_compound header to be able
>>> to send a null callback rpc.
>>>
>>> Signed-off-by: Andy Adamson<[email protected]>
>>> Signed-off-by: Benny Halevy <[email protected]>
>>> Signed-off-by: Ricardo Labiaga <[email protected]>
>>> [get callback minorversion from fore channel's]
>>> Signed-off-by: Benny Halevy <[email protected]>
>>> [nfsd41: change bc_sock to bc_xprt]
>>> Signed-off-by: Benny Halevy <[email protected]>
>>> [pulled definition for cl_cb_xprt]
>>> Signed-off-by: Benny Halevy <[email protected]>
>>> [nfsd41: set up backchannel's cb_addr]
>>> [moved rpc_create_args init to "nfsd: modify nfsd4.1 backchannel to use new xprt class"]
>>> Signed-off-by: Benny Halevy <[email protected]>
>>> ---
>>> fs/nfsd/nfs4callback.c | 21 +++++++++++++++++++--
>>> fs/nfsd/nfs4state.c | 14 ++++++++++++++
>>> include/linux/nfsd/state.h | 3 +++
>>> 3 files changed, 36 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
>>> index 63bb384..3e3e15b 100644
>>> --- a/fs/nfsd/nfs4callback.c
>>> +++ b/fs/nfsd/nfs4callback.c
>>> @@ -43,6 +43,7 @@
>>> #include <linux/sunrpc/xdr.h>
>>> #include <linux/sunrpc/svc.h>
>>> #include <linux/sunrpc/clnt.h>
>>> +#include <linux/sunrpc/svcsock.h>
>>> #include <linux/nfsd/nfsd.h>
>>> #include <linux/nfsd/state.h>
>>> #include <linux/sunrpc/sched.h>
>>> @@ -52,16 +53,19 @@
>>>
>>> #define NFSPROC4_CB_NULL 0
>>> #define NFSPROC4_CB_COMPOUND 1
>>> +#define NFS4_STATEID_SIZE 16
>>>
>>> /* Index of predefined Linux callback client operations */
>>>
>>> enum {
>>> NFSPROC4_CLNT_CB_NULL = 0,
>>> NFSPROC4_CLNT_CB_RECALL,
>>> + NFSPROC4_CLNT_CB_SEQUENCE,
>>> };
>>>
>>> enum nfs_cb_opnum4 {
>>> OP_CB_RECALL = 4,
>>> + OP_CB_SEQUENCE = 11,
>>> };
>>>
>>> #define NFS4_MAXTAGLEN 20
>>> @@ -70,15 +74,22 @@ enum nfs_cb_opnum4 {
>>> #define NFS4_dec_cb_null_sz 0
>>> #define cb_compound_enc_hdr_sz 4
>>> #define cb_compound_dec_hdr_sz (3 + (NFS4_MAXTAGLEN >> 2))
>>> +#define sessionid_sz (NFS4_MAX_SESSIONID_LEN >> 2)
>>> +#define cb_sequence_enc_sz (sessionid_sz + 4 + \
>>> + 1 /* no referring calls list yet */)
>>> +#define cb_sequence_dec_sz (op_dec_sz + sessionid_sz + 4)
>>> +
>>> #define op_enc_sz 1
>>> #define op_dec_sz 2
>>> #define enc_nfs4_fh_sz (1 + (NFS4_FHSIZE >> 2))
>>> #define enc_stateid_sz (NFS4_STATEID_SIZE >> 2)
>>> #define NFS4_enc_cb_recall_sz (cb_compound_enc_hdr_sz + \
>>> + cb_sequence_enc_sz + \
>>> 1 + enc_stateid_sz + \
>>> enc_nfs4_fh_sz)
>>>
>>> #define NFS4_dec_cb_recall_sz (cb_compound_dec_hdr_sz + \
>>> + cb_sequence_dec_sz + \
>>> op_dec_sz)
>>>
>>> /*
>>> @@ -137,11 +148,13 @@ xdr_error: \
>>> } while (0)
>>>
>>> struct nfs4_cb_compound_hdr {
>>> - int status;
>>> - u32 ident;
>>> + /* args */
>>> + u32 ident; /* minorversion 0 only */
>>> u32 nops;
>>> __be32 *nops_p;
>>> u32 minorversion;
>>> + /* res */
>>> + int status;
>>> u32 taglen;
>>> char *tag;
>>> };
>>> @@ -399,6 +412,10 @@ int setup_callback_client(struct nfs4_client *clp)
>>> if (!clp->cl_principal && (clp->cl_flavor >= RPC_AUTH_GSS_KRB5))
>>> return -EINVAL;
>>>
>>> + dprintk("%s: program %s 0x%x nrvers %u version %u minorversion %u\n",
>>> + __func__, args.program->name, args.prognumber,
>>> + args.program->nrvers, args.version, cb->cb_minorversion);
>>> +
>>> /* Create RPC client */
>>> client = rpc_create(&args);
>>> if (IS_ERR(client)) {
>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>>> index 46e9ac5..e4c3223 100644
>>> --- a/fs/nfsd/nfs4state.c
>>> +++ b/fs/nfsd/nfs4state.c
>>> @@ -706,6 +706,8 @@ static inline void
>>> free_client(struct nfs4_client *clp)
>>> {
>>> shutdown_callback_client(clp);
>>> + if (clp->cl_cb_xprt)
>>> + svc_xprt_put(clp->cl_cb_xprt);
>>> if (clp->cl_cred.cr_group_info)
>>> put_group_info(clp->cl_cred.cr_group_info);
>>> kfree(clp->cl_principal);
>>> @@ -1321,6 +1323,18 @@ nfsd4_create_session(struct svc_rqst *rqstp,
>>> cr_ses->flags &= ~SESSION4_PERSIST;
>>> cr_ses->flags &= ~SESSION4_RDMA;
>>>
>>> + if (cr_ses->flags & SESSION4_BACK_CHAN) {
>>> + unconf->cl_cb_xprt = rqstp->rq_xprt;
>>> + svc_xprt_get(unconf->cl_cb_xprt);
>>> + rpc_copy_addr(
>>> + (struct sockaddr *)&unconf->cl_cb_conn.cb_addr,
>>> + sa);
>>> + unconf->cl_cb_conn.cb_addrlen = svc_addr_len(sa);
>>> + unconf->cl_cb_conn.cb_minorversion =
>>> + cstate->minorversion;
>>> + unconf->cl_cb_conn.cb_prog = cr_ses->callback_prog;
>>> + nfsd4_probe_callback(unconf);
>> This results in a NULL dereference in rpcauth_lookup_credcache(), probably
>> because some callback parameters aren't set up right yet.

Where exactly is the NULL deref?

>
> Note: that's fixed 7 patches later in "nfsd41: Refactor create_client()",
> but I don't actually understand how yet.

unconf's cl_flavor initialization was moved in the latter patch
from nfsd4_setclientid to create_client, so maybe this could
be the culprit (though, assuming it is initialized to 0,
it will implicitly choose authnull_ops in rpcauth_create(),
which _should_ work...)

Benny

>
> --b.
>
>> --b.
>>
>>> + }
>>> conf = unconf;
>>> } else {
>>> status = nfserr_stale_clientid;
>>> diff --git a/include/linux/nfsd/state.h b/include/linux/nfsd/state.h
>>> index 70ef5f4..243277b 100644
>>> --- a/include/linux/nfsd/state.h
>>> +++ b/include/linux/nfsd/state.h
>>> @@ -212,6 +212,9 @@ struct nfs4_client {
>>> struct nfsd4_clid_slot cl_cs_slot; /* create_session slot */
>>> u32 cl_exchange_flags;
>>> struct nfs4_sessionid cl_sessionid;
>>> +
>>> + /* for nfs41 callbacks */
>>> + struct svc_xprt *cl_cb_xprt; /* 4.1 callback transport */
>>> };
>>>
>>> /* struct nfs4_client_reset
>>> --
>>> 1.6.4
>>>

2009-09-14 20:04:22

by J. Bruce Fields

Subject: Re: [PATCH v2 05/12] nfsd41: Backchannel: callback infrastructure

On Mon, Sep 14, 2009 at 08:23:37PM +0300, Benny Halevy wrote:
> Where exactly is the NULL deref?
>
> >
> > Note: that's fixed 7 patches later in "nfsd41: Refactor create_client()",
> > but I don't actually understand how yet.
>
> unconf's cl_flavor initialization was moved in the latter patch
> from nfsd4_setclientid to create_client, so maybe this could
> be the culprit (though, assuming it is initialized to 0,
> it will implicitly choose authnull_ops in rpcauth_create(),
> which _should_ work...)

Oog, yes, turns out auth_null doesn't initialize the cred hashtable. So
it's also reproducible by mounting with "mount -tnfs4 -osec=null", then
touching a file. So either we should be using some other interface, or
rpcauth_lookupcred should be checking au_credcache, or something.

--b.
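
To make the second option concrete (a lookup path that checks for a missing cache before touching it), here is a minimal userspace sketch. It is not the sunrpc code: struct auth_model, struct cred_cache_model, lookup_cred_checked() and the bucket arithmetic are all hypothetical stand-ins, and only the idea of a NULL check on the per-flavor cache pointer is taken from the message above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-flavor credential cache. */
struct cred_cache_model {
	unsigned int nr_buckets;
};

/* Hypothetical stand-in for an auth flavor instance. */
struct auth_model {
	const char *name;
	struct cred_cache_model *credcache; /* NULL when the flavor keeps no cache */
};

enum lookup_result {
	LOOKUP_OK = 0,
	LOOKUP_NO_CACHE = -1
};

/*
 * Pick the hash bucket for a uid, but refuse to dereference a missing
 * cache. On failure *bucket is left untouched.
 */
static int lookup_cred_checked(const struct auth_model *auth,
			       unsigned int uid, unsigned int *bucket)
{
	if (auth->credcache == NULL)
		return LOOKUP_NO_CACHE;
	*bucket = uid % auth->credcache->nr_buckets;
	return LOOKUP_OK;
}

/* Usage: one flavor with a cache, one (like auth_null here) without. */
static void demo_lookup(void)
{
	struct cred_cache_model cache = { 16 };
	struct auth_model with_cache = { "unix", &cache };
	struct auth_model without_cache = { "null", NULL };
	unsigned int bucket = 0;

	assert(lookup_cred_checked(&with_cache, 35, &bucket) == LOOKUP_OK);
	assert(bucket == 3); /* 35 % 16 */
	assert(lookup_cred_checked(&without_cache, 35, &bucket) == LOOKUP_NO_CACHE);
}
```

Whether the check belongs in the lookup helper or the caller should avoid the cache lookup entirely for auth_null is exactly the open question in this thread; the sketch only shows the first variant.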

2009-09-14 20:17:37

by Trond Myklebust

Subject: Re: [PATCH v2 05/12] nfsd41: Backchannel: callback infrastructure

On Mon, 2009-09-14 at 16:04 -0400, J. Bruce Fields wrote:
> On Mon, Sep 14, 2009 at 08:23:37PM +0300, Benny Halevy wrote:
> > Where exactly is the NULL deref?
> >
> > >
> > > Note: that's fixed 7 patches later in "nfsd41: Refactor create_client()",
> > > but I don't actually understand how yet.
> >
> > unconf's cl_flavor initialization was moved in the latter patch
> > from nfsd4_setclientid to create_client, so maybe this could
> > be the culprit (though, assuming it is initialized to 0,
> > it will implicitly choose authnull_ops in rpcauth_create(),
> > which _should_ work...)
>
> Oog, yes, turns out auth_null doesn't initialize the cred hashtable. So
> it's also reproducible by mounting with "mount -tnfs4 -osec=null", then
> touching a file. So either we should be using some other interface, or
> rpcauth_lookupcred should be checking au_credcache, or something.

There shouldn't be a need for an auth_null hashtable. It isn't a
credential...

Trond


2009-09-17 19:39:10

by J. Bruce Fields

Subject: Re: [PATCH v2 0/12] nfsd41 backchannel patches for 2.6.32

On Thu, Sep 10, 2009 at 12:23:35PM +0300, Benny Halevy wrote:
> Bruce,
>
> This version incorporates the latest fixes from Alexandros
> that address Trond's comments:
> http://linux-nfs.org/pipermail/pnfs/2009-September/009052.html
> http://linux-nfs.org/pipermail/pnfs/2009-September/009053.html
> http://linux-nfs.org/pipermail/pnfs/2009-September/009059.html
>
> This version introduces a new xprt class for the nfsv4.1 backchannel
> with its own setup routine to keep xs_setup_tcp clean.
>
> I also cleaned up the patches a little further by removing
> a bit of dead code and fixing checkpatch whitespace-related
> warnings in Alexandros' squashme patches.

These should all be in my for-2.6.32 branch; any testing would be
appreciated to make sure I didn't introduce any problems.

--b.

>
> Benny
>
> On Sep. 04, 2009, 19:18 +0300, Benny Halevy <[email protected]> wrote:
> > Bruce,
> >
> > Here's the updated patchset implementing the nfs41 backchannel
> > for the nfs server.
> >
> > Changes from previous version:
> > - Rebase onto git://git.linux-nfs.org/~bfields/linux.git for-2.6.32
> >
> > - bc_send_request does not block on the xpt_mutex
> > but rather uses the rpc_sleep_on to wait on it.
> >
> > - nfsd4_create_session initializes unconf->cl_cb_conn.cb_addr.
> >
> > - cosmetic-only changes cleaned up.
> >
> > [PATCH 01/10] nfsd41: sunrpc: move struct rpc_buffer def into sunrpc.h
> > [PATCH 02/10] nfsd41: sunrpc: Added rpc server-side backchannel handling
> > [PATCH 03/10] nfsd4: fix whitespace in NFSPROC4_CLNT_CB_NULL definition
> > [PATCH 04/10] nfsd41: Backchannel: callback infrastructure
> > [PATCH 05/10] nfsd41: Backchannel: Add sequence arguments to callback RPC arguments
> > [PATCH 06/10] nfsd41: Backchannel: Server backchannel RPC wait queue
> > [PATCH 07/10] nfsd41: Backchannel: Setup sequence information
> > [PATCH 08/10] nfsd41: Backchannel: cb_sequence callback
> > [PATCH 09/10] nfsd41: Backchannel: Implement cb_recall over NFSv4.1
> > [PATCH 10/10] nfsd41: Refactor create_client()
> >
> > Benny
> > _______________________________________________
> > pNFS mailing list
> > [email protected]
> > http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
>

2009-09-17 19:47:22

by Benny Halevy

Subject: Re: [PATCH v2 0/12] nfsd41 backchannel patches for 2.6.32

On Sep. 17, 2009, 22:39 +0300, "J. Bruce Fields" <[email protected]> wrote:
> On Thu, Sep 10, 2009 at 12:23:35PM +0300, Benny Halevy wrote:
>> Bruce,
>>
>> This version incorporates the latest fixes from Alexandros
>> that address Trond's comments:
>> http://linux-nfs.org/pipermail/pnfs/2009-September/009052.html
>> http://linux-nfs.org/pipermail/pnfs/2009-September/009053.html
>> http://linux-nfs.org/pipermail/pnfs/2009-September/009059.html
>>
>> This version introduces a new xprt class for the nfsv4.1 backchannel
>> with its own setup routine to keep xs_setup_tcp clean.
>>
>> I also cleaned up the patches a little further by removing
>> a bit of dead code and fixing checkpatch whitespace-related
>> warnings in Alexandros' squashme patches.
>
> These should all be in my for-2.6.32 branch; any testing would be
> appreciated to make sure I didn't introduce any problems.

Thanks!
I'll sync up with your branch and start pounding on it.

Benny

>
> --b.
>
>> Benny
>>
>> On Sep. 04, 2009, 19:18 +0300, Benny Halevy <[email protected]> wrote:
>>> Bruce,
>>>
>>> Here's the updated patchset implementing the nfs41 backchannel
>>> for the nfs server.
>>>
>>> Changes from previous version:
>>> - Rebase onto git://git.linux-nfs.org/~bfields/linux.git for-2.6.32
>>>
>>> - bc_send_request does not block on the xpt_mutex
>>> but rather uses the rpc_sleep_on to wait on it.
>>>
>>> - nfsd4_create_session initializes unconf->cl_cb_conn.cb_addr.
>>>
>>> - cosmetic-only changes cleaned up.
>>>
>>> [PATCH 01/10] nfsd41: sunrpc: move struct rpc_buffer def into sunrpc.h
>>> [PATCH 02/10] nfsd41: sunrpc: Added rpc server-side backchannel handling
>>> [PATCH 03/10] nfsd4: fix whitespace in NFSPROC4_CLNT_CB_NULL definition
>>> [PATCH 04/10] nfsd41: Backchannel: callback infrastructure
>>> [PATCH 05/10] nfsd41: Backchannel: Add sequence arguments to callback RPC arguments
>>> [PATCH 06/10] nfsd41: Backchannel: Server backchannel RPC wait queue
>>> [PATCH 07/10] nfsd41: Backchannel: Setup sequence information
>>> [PATCH 08/10] nfsd41: Backchannel: cb_sequence callback
>>> [PATCH 09/10] nfsd41: Backchannel: Implement cb_recall over NFSv4.1
>>> [PATCH 10/10] nfsd41: Refactor create_client()
>>>
>>> Benny
>>> _______________________________________________
>>> pNFS mailing list
>>> [email protected]
>>> http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs

2009-09-10 09:23:07

by Benny Halevy

Subject: [PATCH v2 0/12] nfsd41 backchannel patches for 2.6.32

Bruce,

This version incorporates the latest fixes from Alexandros
that address Trond's comments:
http://linux-nfs.org/pipermail/pnfs/2009-September/009052.html
http://linux-nfs.org/pipermail/pnfs/2009-September/009053.html
http://linux-nfs.org/pipermail/pnfs/2009-September/009059.html

This version introduces a new xprt class for the nfsv4.1 backchannel
with its own setup routine to keep xs_setup_tcp clean.

I also cleaned up the patches a little further by removing
a bit of dead code and fixing checkpatch whitespace-related
warnings in Alexandros' squashme patches.

Benny

On Sep. 04, 2009, 19:18 +0300, Benny Halevy <[email protected]> wrote:
> Bruce,
>
> Here's the updated patchset implementing the nfs41 backchannel
> for the nfs server.
>
> Changes from previous version:
> - Rebase onto git://git.linux-nfs.org/~bfields/linux.git for-2.6.32
>
> - bc_send_request does not block on the xpt_mutex
> but rather uses the rpc_sleep_on to wait on it.
>
> - nfsd4_create_session initializes unconf->cl_cb_conn.cb_addr.
>
> - cosmetic-only changes cleaned up.
>
> [PATCH 01/10] nfsd41: sunrpc: move struct rpc_buffer def into sunrpc.h
> [PATCH 02/10] nfsd41: sunrpc: Added rpc server-side backchannel handling
> [PATCH 03/10] nfsd4: fix whitespace in NFSPROC4_CLNT_CB_NULL definition
> [PATCH 04/10] nfsd41: Backchannel: callback infrastructure
> [PATCH 05/10] nfsd41: Backchannel: Add sequence arguments to callback RPC arguments
> [PATCH 06/10] nfsd41: Backchannel: Server backchannel RPC wait queue
> [PATCH 07/10] nfsd41: Backchannel: Setup sequence information
> [PATCH 08/10] nfsd41: Backchannel: cb_sequence callback
> [PATCH 09/10] nfsd41: Backchannel: Implement cb_recall over NFSv4.1
> [PATCH 10/10] nfsd41: Refactor create_client()
>
> Benny
> _______________________________________________
> pNFS mailing list
> [email protected]
> http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
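
As an illustration of the "does not block on the xpt_mutex" item in the change list above, here is a single-threaded userspace model of the pattern: a backchannel sender that finds the transport busy parks itself and returns, and the fore channel wakes parked senders once it releases the transport. This is only a sketch under stated assumptions. struct xprt_model and all three functions are hypothetical, the busy flag stands in for xpt_mutex, the pending counter stands in for an RPC wait queue, and none of it uses the kernel's rpc_sleep_on() API.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical single-threaded model of a transport shared by both channels. */
struct xprt_model {
	int busy;    /* nonzero while a sender holds the transport */
	int pending; /* backchannel senders parked until the transport is free */
	int sent;    /* number of completed backchannel sends */
};

/* Fore channel starts sending a reply and takes the transport. */
static void fore_send_begin(struct xprt_model *x)
{
	x->busy = 1;
}

/*
 * Fore channel is done: release the transport, then wake every parked
 * sender. Returns how many senders were woken.
 */
static int fore_send_done(struct xprt_model *x)
{
	int woken = x->pending;

	x->busy = 0;
	x->pending = 0;
	return woken;
}

/*
 * Backchannel send that never blocks: if the transport is busy, park
 * and return -EAGAIN so the caller can retry after being woken.
 */
static int bc_send(struct xprt_model *x)
{
	if (x->busy) {
		x->pending++;
		return -EAGAIN;
	}
	x->busy = 1;
	x->sent++; /* the actual transmission would happen here */
	x->busy = 0;
	return 0;
}

/* Usage: a send that succeeds, one that has to wait, and the retry. */
static void demo_bc_send(void)
{
	struct xprt_model x = { 0, 0, 0 };

	assert(bc_send(&x) == 0);
	fore_send_begin(&x);
	assert(bc_send(&x) == -EAGAIN);
	assert(fore_send_done(&x) == 1);
	assert(bc_send(&x) == 0);
	assert(x.sent == 2);
}
```

The point of the design is that the RPC task gives up its thread instead of sleeping on a mutex, so an rpciod worker is never stalled behind a long fore channel reply.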

2009-09-10 09:24:35

by Benny Halevy

Subject: [PATCH v2 01/12] nfsd41: sunrpc: move struct rpc_buffer def into sunrpc.h

Move struct rpc_buffer's definition into sunrpc.h, a common internal
header file, in preparation for supporting the nfsv4.1 backchannel.

Signed-off-by: Benny Halevy <[email protected]>
[nfs41: sunrpc: #include <linux/net.h> from sunrpc.h]
Signed-off-by: Benny Halevy <[email protected]>
---
net/sunrpc/sched.c | 7 ++-----
net/sunrpc/sunrpc.h | 10 ++++++++++
2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
index 8f459ab..cef74ba 100644
--- a/net/sunrpc/sched.c
+++ b/net/sunrpc/sched.c
@@ -21,6 +21,8 @@

#include <linux/sunrpc/clnt.h>

+#include "sunrpc.h"
+
#ifdef RPC_DEBUG
#define RPCDBG_FACILITY RPCDBG_SCHED
#define RPC_TASK_MAGIC_ID 0xf00baa
@@ -711,11 +713,6 @@ static void rpc_async_schedule(struct work_struct *work)
__rpc_execute(container_of(work, struct rpc_task, u.tk_work));
}

-struct rpc_buffer {
- size_t len;
- char data[];
-};
-
/**
* rpc_malloc - allocate an RPC buffer
* @task: RPC task that will use this buffer
diff --git a/net/sunrpc/sunrpc.h b/net/sunrpc/sunrpc.h
index 5d9dd74..13171e6 100644
--- a/net/sunrpc/sunrpc.h
+++ b/net/sunrpc/sunrpc.h
@@ -27,6 +27,16 @@ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#ifndef _NET_SUNRPC_SUNRPC_H
#define _NET_SUNRPC_SUNRPC_H

+#include <linux/net.h>
+
+/*
+ * Header for dynamically allocated rpc buffers.
+ */
+struct rpc_buffer {
+ size_t len;
+ char data[];
+};
+
static inline int rpc_reply_expected(struct rpc_task *task)
{
return (task->tk_msg.rpc_proc != NULL) &&
--
1.6.4


2009-09-10 09:24:50

by Benny Halevy

Subject: [PATCH v2 02/12] nfsd41: sunrpc: Added rpc server-side backchannel handling

From: Rahul Iyer <[email protected]>

When the call direction is a reply, copy the xid and call direction into
req->rq_private_buf.head[0].iov_base; otherwise rpc_verify_header returns
rpc_garbage.

Signed-off-by: Rahul Iyer <[email protected]>
Signed-off-by: Mike Sager <[email protected]>
Signed-off-by: Marc Eshel <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
Signed-off-by: Ricardo Labiaga <[email protected]>
Signed-off-by: Andy Adamson <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
[get rid of CONFIG_NFSD_V4_1]
[sunrpc: refactoring of svc_tcp_recvfrom]
[nfsd41: sunrpc: create common send routine for the fore and the back channels]
[nfsd41: sunrpc: Use free_page() to free server backchannel pages]
[nfsd41: sunrpc: Document server backchannel locking]
[nfsd41: sunrpc: remove bc_connect_worker()]
[nfsd41: sunrpc: Define xprt_server_backchannel()]
[nfsd41: sunrpc: remove bc_close and bc_init_auto_disconnect dummy functions]
[nfsd41: sunrpc: eliminate unneeded switch statement in xs_setup_tcp()]
[nfsd41: sunrpc: Don't auto close the server backchannel connection]
[nfsd41: sunrpc: Remove unused functions]
Signed-off-by: Alexandros Batsakis <[email protected]>
Signed-off-by: Ricardo Labiaga <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
[nfsd41: change bc_sock to bc_xprt]
[nfsd41: sunrpc: move struct rpc_buffer def into a common header file]
[nfsd41: sunrpc: use rpc_sleep in bc_send_request so not to block on mutex]
[removed cosmetic changes]
Signed-off-by: Benny Halevy <[email protected]>
[sunrpc: add new xprt class for nfsv4.1 backchannel]
[sunrpc: v2.1 change handling of auto_close and init_auto_disconnect operations for the nfsv4.1 backchannel]
Signed-off-by: Alexandros Batsakis <[email protected]>
[reverted more cosmetic leftovers]
[got rid of xprt_server_backchannel]
[separated "nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel"]
Signed-off-by: Benny Halevy <[email protected]>
Cc: Trond Myklebust <[email protected]>
---
include/linux/sunrpc/svc_xprt.h | 1 +
include/linux/sunrpc/svcsock.h | 1 +
include/linux/sunrpc/xprt.h | 1 +
net/sunrpc/sunrpc.h | 4 +
net/sunrpc/svc_xprt.c | 2 +
net/sunrpc/svcsock.c | 172 +++++++++++++++++++++++++++++++--------
net/sunrpc/xprt.c | 15 +++-
net/sunrpc/xprtsock.c | 146 +++++++++++++++++++++++++++++++++
8 files changed, 303 insertions(+), 39 deletions(-)

diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index 2223ae0..5f4e18b 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -65,6 +65,7 @@ struct svc_xprt {
size_t xpt_locallen; /* length of address */
struct sockaddr_storage xpt_remote; /* remote peer's address */
size_t xpt_remotelen; /* length of address */
+ struct rpc_wait_queue xpt_bc_pending; /* backchannel wait queue */
};

int svc_reg_xprt_class(struct svc_xprt_class *);
diff --git a/include/linux/sunrpc/svcsock.h b/include/linux/sunrpc/svcsock.h
index 04dba23..1b353a7 100644
--- a/include/linux/sunrpc/svcsock.h
+++ b/include/linux/sunrpc/svcsock.h
@@ -28,6 +28,7 @@ struct svc_sock {
/* private TCP part */
u32 sk_reclen; /* length of record */
u32 sk_tcplen; /* current read length */
+ struct rpc_xprt *sk_bc_xprt; /* NFSv4.1 backchannel xprt */
};

/*
diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index c090df4..228d694 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -179,6 +179,7 @@ struct rpc_xprt {
spinlock_t reserve_lock; /* lock slot table */
u32 xid; /* Next XID value to use */
struct rpc_task * snd_task; /* Task blocked in send */
+ struct svc_xprt *bc_xprt; /* NFSv4.1 backchannel */
#if defined(CONFIG_NFS_V4_1)
struct svc_serv *bc_serv; /* The RPC service which will */
/* process the callback */
diff --git a/net/sunrpc/sunrpc.h b/net/sunrpc/sunrpc.h
index 13171e6..90c292e 100644
--- a/net/sunrpc/sunrpc.h
+++ b/net/sunrpc/sunrpc.h
@@ -43,5 +43,9 @@ static inline int rpc_reply_expected(struct rpc_task *task)
(task->tk_msg.rpc_proc->p_decode != NULL);
}

+int svc_send_common(struct socket *sock, struct xdr_buf *xdr,
+ struct page *headpage, unsigned long headoffset,
+ struct page *tailpage, unsigned long tailoffset);
+
#endif /* _NET_SUNRPC_SUNRPC_H */

diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 912dea5..df124f7 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -160,6 +160,7 @@ void svc_xprt_init(struct svc_xprt_class *xcl, struct svc_xprt *xprt,
mutex_init(&xprt->xpt_mutex);
spin_lock_init(&xprt->xpt_lock);
set_bit(XPT_BUSY, &xprt->xpt_flags);
+ rpc_init_wait_queue(&xprt->xpt_bc_pending, "xpt_bc_pending");
}
EXPORT_SYMBOL_GPL(svc_xprt_init);

@@ -810,6 +811,7 @@ int svc_send(struct svc_rqst *rqstp)
else
len = xprt->xpt_ops->xpo_sendto(rqstp);
mutex_unlock(&xprt->xpt_mutex);
+ rpc_wake_up(&xprt->xpt_bc_pending);
svc_xprt_release(rqstp);

if (len == -ECONNREFUSED || len == -ENOTCONN || len == -EAGAIN)
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 76a380d..ccc5e83 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -49,6 +49,7 @@
#include <linux/sunrpc/msg_prot.h>
#include <linux/sunrpc/svcsock.h>
#include <linux/sunrpc/stats.h>
+#include <linux/sunrpc/xprt.h>

#define RPCDBG_FACILITY RPCDBG_SVCXPRT

@@ -153,49 +154,27 @@ static void svc_set_cmsg_data(struct svc_rqst *rqstp, struct cmsghdr *cmh)
}

/*
- * Generic sendto routine
+ * send routine intended to be shared by the fore- and back-channel
*/
-static int svc_sendto(struct svc_rqst *rqstp, struct xdr_buf *xdr)
+int svc_send_common(struct socket *sock, struct xdr_buf *xdr,
+ struct page *headpage, unsigned long headoffset,
+ struct page *tailpage, unsigned long tailoffset)
{
- struct svc_sock *svsk =
- container_of(rqstp->rq_xprt, struct svc_sock, sk_xprt);
- struct socket *sock = svsk->sk_sock;
- int slen;
- union {
- struct cmsghdr hdr;
- long all[SVC_PKTINFO_SPACE / sizeof(long)];
- } buffer;
- struct cmsghdr *cmh = &buffer.hdr;
- int len = 0;
int result;
int size;
struct page **ppage = xdr->pages;
size_t base = xdr->page_base;
unsigned int pglen = xdr->page_len;
unsigned int flags = MSG_MORE;
- RPC_IFDEBUG(char buf[RPC_MAX_ADDRBUFLEN]);
+ int slen;
+ int len = 0;

slen = xdr->len;

- if (rqstp->rq_prot == IPPROTO_UDP) {
- struct msghdr msg = {
- .msg_name = &rqstp->rq_addr,
- .msg_namelen = rqstp->rq_addrlen,
- .msg_control = cmh,
- .msg_controllen = sizeof(buffer),
- .msg_flags = MSG_MORE,
- };
-
- svc_set_cmsg_data(rqstp, cmh);
-
- if (sock_sendmsg(sock, &msg, 0) < 0)
- goto out;
- }
-
/* send head */
if (slen == xdr->head[0].iov_len)
flags = 0;
- len = kernel_sendpage(sock, rqstp->rq_respages[0], 0,
+ len = kernel_sendpage(sock, headpage, headoffset,
xdr->head[0].iov_len, flags);
if (len != xdr->head[0].iov_len)
goto out;
@@ -219,16 +198,58 @@ static int svc_sendto(struct svc_rqst *rqstp, struct xdr_buf *xdr)
base = 0;
ppage++;
}
+
/* send tail */
if (xdr->tail[0].iov_len) {
- result = kernel_sendpage(sock, rqstp->rq_respages[0],
- ((unsigned long)xdr->tail[0].iov_base)
- & (PAGE_SIZE-1),
- xdr->tail[0].iov_len, 0);
-
+ result = kernel_sendpage(sock, tailpage, tailoffset,
+ xdr->tail[0].iov_len, 0);
if (result > 0)
len += result;
}
+
+out:
+ return len;
+}
+
+
+/*
+ * Generic sendto routine
+ */
+static int svc_sendto(struct svc_rqst *rqstp, struct xdr_buf *xdr)
+{
+ struct svc_sock *svsk =
+ container_of(rqstp->rq_xprt, struct svc_sock, sk_xprt);
+ struct socket *sock = svsk->sk_sock;
+ union {
+ struct cmsghdr hdr;
+ long all[SVC_PKTINFO_SPACE / sizeof(long)];
+ } buffer;
+ struct cmsghdr *cmh = &buffer.hdr;
+ int len = 0;
+ unsigned long tailoff;
+ unsigned long headoff;
+ RPC_IFDEBUG(char buf[RPC_MAX_ADDRBUFLEN]);
+
+ if (rqstp->rq_prot == IPPROTO_UDP) {
+ struct msghdr msg = {
+ .msg_name = &rqstp->rq_addr,
+ .msg_namelen = rqstp->rq_addrlen,
+ .msg_control = cmh,
+ .msg_controllen = sizeof(buffer),
+ .msg_flags = MSG_MORE,
+ };
+
+ svc_set_cmsg_data(rqstp, cmh);
+
+ if (sock_sendmsg(sock, &msg, 0) < 0)
+ goto out;
+ }
+
+ tailoff = ((unsigned long)xdr->tail[0].iov_base) & (PAGE_SIZE-1);
+ headoff = 0;
+ len = svc_send_common(sock, xdr, rqstp->rq_respages[0], headoff,
+ rqstp->rq_respages[0], tailoff);
+
out:
dprintk("svc: socket %p sendto([%p %Zu... ], %d) = %d (addr %s)\n",
svsk, xdr->head[0].iov_base, xdr->head[0].iov_len,
@@ -951,6 +972,57 @@ static int svc_tcp_recv_record(struct svc_sock *svsk, struct svc_rqst *rqstp)
return -EAGAIN;
}

+static int svc_process_calldir(struct svc_sock *svsk, struct svc_rqst *rqstp,
+ struct rpc_rqst **reqpp, struct kvec *vec)
+{
+ struct rpc_rqst *req = NULL;
+ u32 *p;
+ u32 xid;
+ u32 calldir;
+ int len;
+
+ len = svc_recvfrom(rqstp, vec, 1, 8);
+ if (len < 0)
+ goto error;
+
+ p = (u32 *)rqstp->rq_arg.head[0].iov_base;
+ xid = *p++;
+ calldir = *p;
+
+ if (calldir == 0) {
+ /* REQUEST is the most common case */
+ vec[0] = rqstp->rq_arg.head[0];
+ } else {
+ /* REPLY */
+ if (svsk->sk_bc_xprt)
+ req = xprt_lookup_rqst(svsk->sk_bc_xprt, xid);
+
+ if (!req) {
+ printk(KERN_NOTICE
+ "%s: Got unrecognized reply: "
+ "calldir 0x%x sk_bc_xprt %p xid %08x\n",
+ __func__, ntohl(calldir),
+ svsk->sk_bc_xprt, xid);
+ vec[0] = rqstp->rq_arg.head[0];
+ goto out;
+ }
+
+ memcpy(&req->rq_private_buf, &req->rq_rcv_buf,
+ sizeof(struct xdr_buf));
+ /* copy the xid and call direction */
+ memcpy(req->rq_private_buf.head[0].iov_base,
+ rqstp->rq_arg.head[0].iov_base, 8);
+ vec[0] = req->rq_private_buf.head[0];
+ }
+ out:
+ vec[0].iov_base += 8;
+ vec[0].iov_len -= 8;
+ len = svsk->sk_reclen - 8;
+ error:
+ *reqpp = req;
+ return len;
+}
+
/*
* Receive data from a TCP socket.
*/
@@ -962,6 +1034,7 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
int len;
struct kvec *vec;
int pnum, vlen;
+ struct rpc_rqst *req = NULL;

dprintk("svc: tcp_recv %p data %d conn %d close %d\n",
svsk, test_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags),
@@ -975,9 +1048,27 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
vec = rqstp->rq_vec;
vec[0] = rqstp->rq_arg.head[0];
vlen = PAGE_SIZE;
+
+ /*
+ * We have enough data for the whole tcp record. Let's try and read the
+ * first 8 bytes to get the xid and the call direction. We can use this
+ * to figure out if this is a call or a reply to a callback. If
+ * sk_reclen is < 8 (xid and calldir), then this is a malformed packet.
+ * In that case, don't bother with the calldir and just read the data.
+ * It will be rejected in svc_process.
+ */
+ if (len >= 8) {
+ len = svc_process_calldir(svsk, rqstp, &req, vec);
+ if (len < 0)
+ goto err_again;
+ vlen -= 8;
+ }
+
pnum = 1;
while (vlen < len) {
- vec[pnum].iov_base = page_address(rqstp->rq_pages[pnum]);
+ vec[pnum].iov_base = (req) ?
+ page_address(req->rq_private_buf.pages[pnum - 1]) :
+ page_address(rqstp->rq_pages[pnum]);
vec[pnum].iov_len = PAGE_SIZE;
pnum++;
vlen += PAGE_SIZE;
@@ -989,6 +1080,16 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
if (len < 0)
goto err_again;

+ /*
+ * Account for the 8 bytes we read earlier
+ */
+ len += 8;
+
+ if (req) {
+ xprt_complete_rqst(req->rq_task, len);
+ len = 0;
+ goto out;
+ }
dprintk("svc: TCP complete record (%d bytes)\n", len);
rqstp->rq_arg.len = len;
rqstp->rq_arg.page_base = 0;
@@ -1002,6 +1103,7 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
rqstp->rq_xprt_ctxt = NULL;
rqstp->rq_prot = IPPROTO_TCP;

+out:
/* Reset TCP read info */
svsk->sk_reclen = 0;
svsk->sk_tcplen = 0;
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index f412a85..f577e5a 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -832,6 +832,11 @@ static void xprt_timer(struct rpc_task *task)
spin_unlock_bh(&xprt->transport_lock);
}

+static inline int xprt_has_timer(struct rpc_xprt *xprt)
+{
+ return xprt->idle_timeout != (~0);
+}
+
/**
* xprt_prepare_transmit - reserve the transport before sending a request
* @task: RPC task about to send a request
@@ -1013,7 +1018,7 @@ void xprt_release(struct rpc_task *task)
if (!list_empty(&req->rq_list))
list_del(&req->rq_list);
xprt->last_used = jiffies;
- if (list_empty(&xprt->recv))
+ if (list_empty(&xprt->recv) && xprt_has_timer(xprt))
mod_timer(&xprt->timer,
xprt->last_used + xprt->idle_timeout);
spin_unlock_bh(&xprt->transport_lock);
@@ -1082,8 +1087,11 @@ found:
#endif /* CONFIG_NFS_V4_1 */

INIT_WORK(&xprt->task_cleanup, xprt_autoclose);
- setup_timer(&xprt->timer, xprt_init_autodisconnect,
- (unsigned long)xprt);
+ if (xprt_has_timer(xprt))
+ setup_timer(&xprt->timer, xprt_init_autodisconnect,
+ (unsigned long)xprt);
+ else
+ init_timer(&xprt->timer);
xprt->last_used = jiffies;
xprt->cwnd = RPC_INITCWND;
xprt->bind_index = 0;
@@ -1102,7 +1110,6 @@ found:

dprintk("RPC: created transport %p with %u slots\n", xprt,
xprt->max_reqs);
-
return xprt;
}

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 62438f3..d9a2b81 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -32,6 +32,7 @@
#include <linux/tcp.h>
#include <linux/sunrpc/clnt.h>
#include <linux/sunrpc/sched.h>
+#include <linux/sunrpc/svcsock.h>
#include <linux/sunrpc/xprtsock.h>
#include <linux/file.h>
#ifdef CONFIG_NFS_V4_1
@@ -43,6 +44,7 @@
#include <net/udp.h>
#include <net/tcp.h>

+#include "sunrpc.h"
/*
* xprtsock tunables
*/
@@ -2098,6 +2100,134 @@ static void xs_tcp_print_stats(struct rpc_xprt *xprt, struct seq_file *seq)
xprt->stat.bklog_u);
}

+/*
+ * Allocate a page as a scratch buffer for the rpc code. The reason we
+ * allocate a page instead of doing a kmalloc like rpc_malloc is that we want
+ * to use the server side send routines.
+ */
+void *bc_malloc(struct rpc_task *task, size_t size)
+{
+ struct page *page;
+ struct rpc_buffer *buf;
+
+ BUG_ON(size > PAGE_SIZE - sizeof(struct rpc_buffer));
+ page = alloc_page(GFP_KERNEL);
+
+ if (!page)
+ return NULL;
+
+ buf = page_address(page);
+ buf->len = PAGE_SIZE;
+
+ return buf->data;
+}
+
+/*
+ * Free the space allocated in the bc_malloc routine
+ */
+void bc_free(void *buffer)
+{
+ struct rpc_buffer *buf;
+
+ if (!buffer)
+ return;
+
+ buf = container_of(buffer, struct rpc_buffer, data);
+ free_page((unsigned long)buf);
+}
+
+/*
+ * Use the svc_sock to send the callback. Must be called with xpt_mutex held.
+ * Borrows heavily from svc_tcp_sendto and xs_tcp_send_request.
+ */
+static int bc_sendto(struct rpc_rqst *req)
+{
+ int len;
+ struct xdr_buf *xbufp = &req->rq_snd_buf;
+ struct rpc_xprt *xprt = req->rq_xprt;
+ struct sock_xprt *transport =
+ container_of(xprt, struct sock_xprt, xprt);
+ struct socket *sock = transport->sock;
+ unsigned long headoff;
+ unsigned long tailoff;
+
+ /*
+ * Set up the rpc header and record marker stuff
+ */
+ xs_encode_tcp_record_marker(xbufp);
+
+ tailoff = (unsigned long)xbufp->tail[0].iov_base & ~PAGE_MASK;
+ headoff = (unsigned long)xbufp->head[0].iov_base & ~PAGE_MASK;
+ len = svc_send_common(sock, xbufp,
+ virt_to_page(xbufp->head[0].iov_base), headoff,
+ virt_to_page(xbufp->tail[0].iov_base), tailoff);
+
+ if (len != xbufp->len) {
+ printk(KERN_NOTICE "Error sending entire callback!\n");
+ len = -EAGAIN;
+ }
+
+ return len;
+}
+
+/*
+ * The send routine. Borrows from svc_send
+ */
+static int bc_send_request(struct rpc_task *task)
+{
+ struct rpc_rqst *req = task->tk_rqstp;
+ struct svc_xprt *xprt;
+ struct svc_sock *svsk;
+ int len;
+
+ dprintk("sending request with xid: %08x\n", ntohl(req->rq_xid));
+ /*
+ * Get the server socket associated with this callback xprt
+ */
+ xprt = req->rq_xprt->bc_xprt;
+ svsk = container_of(xprt, struct svc_sock, sk_xprt);
+
+ /*
+ * Grab the mutex to serialize data as the connection is shared
+ * with the fore channel
+ */
+ if (!mutex_trylock(&xprt->xpt_mutex)) {
+ rpc_sleep_on(&xprt->xpt_bc_pending, task, NULL);
+ if (!mutex_trylock(&xprt->xpt_mutex))
+ return -EAGAIN;
+ rpc_wake_up_queued_task(&xprt->xpt_bc_pending, task);
+ }
+ if (test_bit(XPT_DEAD, &xprt->xpt_flags))
+ len = -ENOTCONN;
+ else
+ len = bc_sendto(req);
+ mutex_unlock(&xprt->xpt_mutex);
+
+ if (len > 0)
+ len = 0;
+
+ return len;
+}
+
+/*
+ * The close routine. Since this is client initiated, we do nothing
+ */
+
+static void bc_close(struct rpc_xprt *xprt)
+{
+ return;
+}
+
+/*
+ * The xprt destroy routine. Again, because this connection is client
+ * initiated, we do nothing
+ */
+
+static void bc_destroy(struct rpc_xprt *xprt)
+{
+ return;
+}
+
static struct rpc_xprt_ops xs_udp_ops = {
.set_buffer_size = xs_udp_set_buffer_size,
.reserve_xprt = xprt_reserve_xprt_cong,
@@ -2134,6 +2264,22 @@ static struct rpc_xprt_ops xs_tcp_ops = {
.print_stats = xs_tcp_print_stats,
};

+/*
+ * The rpc_xprt_ops for the server backchannel
+ */
+
+static struct rpc_xprt_ops bc_tcp_ops = {
+ .reserve_xprt = xprt_reserve_xprt,
+ .release_xprt = xprt_release_xprt,
+ .buf_alloc = bc_malloc,
+ .buf_free = bc_free,
+ .send_request = bc_send_request,
+ .set_retrans_timeout = xprt_set_retrans_timeout_def,
+ .close = bc_close,
+ .destroy = bc_destroy,
+ .print_stats = xs_tcp_print_stats,
+};
+
static struct rpc_xprt *xs_setup_xprt(struct xprt_create *args,
unsigned int slot_table_size)
{
--
1.6.4
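
Note for readers of svc_process_calldir() in the patch above: the demultiplexing step can be tried outside the kernel. What follows is a minimal user-space sketch, not the patch's code; `rpc_prefix`, `load_be32` and `rpc_peek_prefix` are hypothetical names introduced only for illustration. It assumes the standard ONC RPC layout, in which the first two 32-bit big-endian words of a message are the xid and the message direction (0 for a call, 1 for a reply).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Message direction values as they appear on the wire (assumed ONC RPC). */
enum rpc_dir { RPC_DIR_CALL = 0, RPC_DIR_REPLY = 1 };

/* Hypothetical holder for the two words peeked from the record. */
struct rpc_prefix {
	uint32_t xid;	/* transaction id, host byte order */
	uint32_t dir;	/* RPC_DIR_CALL or RPC_DIR_REPLY */
};

/* Decode one big-endian 32-bit word without relying on alignment. */
uint32_t load_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Peek at the first 8 bytes of a record.
 * Returns 0 on success, -1 if the record is too short to hold
 * both the xid and the direction word.
 */
int rpc_peek_prefix(const unsigned char *buf, size_t len,
		    struct rpc_prefix *out)
{
	assert(buf != NULL && out != NULL);
	if (len < 8)
		return -1;
	out->xid = load_be32(buf);
	out->dir = load_be32(buf + 4);
	return 0;
}
```

A receiver built on this would route a record whose direction is a reply to the pending backchannel request with the matching xid and hand everything else to the normal server path. Records shorter than 8 bytes skip the peek entirely, as the comment in svc_tcp_recvfrom() describes.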


2009-09-10 09:25:17

by Benny Halevy

Subject: [PATCH v2 04/12] nfsd4: fix whitespace in NFSPROC4_CLNT_CB_NULL definition

Signed-off-by: Benny Halevy <[email protected]>
---
fs/nfsd/nfs4callback.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 81d1c52..63bb384 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -56,7 +56,7 @@
/* Index of predefined Linux callback client operations */

enum {
- NFSPROC4_CLNT_CB_NULL = 0,
+ NFSPROC4_CLNT_CB_NULL = 0,
NFSPROC4_CLNT_CB_RECALL,
};

--
1.6.4


2009-09-10 09:25:04

by Benny Halevy

Subject: [PATCH v2 03/12] nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel

From: Alexandros Batsakis <[email protected]>

Signed-off-by: Alexandros Batsakis <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
---
include/linux/sunrpc/clnt.h | 1 +
include/linux/sunrpc/xprt.h | 1 +
include/linux/sunrpc/xprtrdma.h | 5 --
include/linux/sunrpc/xprtsock.h | 9 +++-
net/sunrpc/clnt.c | 1 +
net/sunrpc/xprtsock.c | 96 ++++++++++++++++++++++++++++++++++++++-
6 files changed, 104 insertions(+), 9 deletions(-)

diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h
index 3d02558..8ed9642 100644
--- a/include/linux/sunrpc/clnt.h
+++ b/include/linux/sunrpc/clnt.h
@@ -114,6 +114,7 @@ struct rpc_create_args {
rpc_authflavor_t authflavor;
unsigned long flags;
char *client_name;
+ struct svc_xprt *bc_xprt; /* NFSv4.1 backchannel */
};

/* Values for "flags" field */
diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index 228d694..7cc42af 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -232,6 +232,7 @@ struct xprt_create {
struct sockaddr * srcaddr; /* optional local address */
struct sockaddr * dstaddr; /* remote peer address */
size_t addrlen;
+ struct svc_xprt *bc_xprt; /* NFSv4.1 backchannel */
};

struct xprt_class {
diff --git a/include/linux/sunrpc/xprtrdma.h b/include/linux/sunrpc/xprtrdma.h
index 54a379c..c2f04e1 100644
--- a/include/linux/sunrpc/xprtrdma.h
+++ b/include/linux/sunrpc/xprtrdma.h
@@ -41,11 +41,6 @@
#define _LINUX_SUNRPC_XPRTRDMA_H

/*
- * RPC transport identifier for RDMA
- */
-#define XPRT_TRANSPORT_RDMA 256
-
-/*
* rpcbind (v3+) RDMA netid.
*/
#define RPCBIND_NETID_RDMA "rdma"
diff --git a/include/linux/sunrpc/xprtsock.h b/include/linux/sunrpc/xprtsock.h
index c2a46c4..d7c98d1 100644
--- a/include/linux/sunrpc/xprtsock.h
+++ b/include/linux/sunrpc/xprtsock.h
@@ -20,8 +20,13 @@ void cleanup_socket_xprt(void);
* values. No such restriction exists for new transports, except that
* they may not collide with these values (17 and 6, respectively).
*/
-#define XPRT_TRANSPORT_UDP IPPROTO_UDP
-#define XPRT_TRANSPORT_TCP IPPROTO_TCP
+#define XPRT_TRANSPORT_BC (1 << 31)
+enum xprt_transports {
+ XPRT_TRANSPORT_UDP = IPPROTO_UDP,
+ XPRT_TRANSPORT_TCP = IPPROTO_TCP,
+ XPRT_TRANSPORT_BC_TCP = IPPROTO_TCP | XPRT_TRANSPORT_BC,
+ XPRT_TRANSPORT_RDMA = 256
+};

/*
* RPC slot table sizes for UDP, TCP transports
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index c1e467e..7389804 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -288,6 +288,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args)
.srcaddr = args->saddress,
.dstaddr = args->address,
.addrlen = args->addrsize,
+ .bc_xprt = args->bc_xprt,
};
char servername[48];

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index d9a2b81..6edcd5c 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -2468,11 +2468,93 @@ static struct rpc_xprt *xs_setup_tcp(struct xprt_create *args)
return ERR_PTR(-EINVAL);
}

+/**
+ * xs_setup_bc_tcp - Set up transport to use a TCP backchannel socket
+ * @args: rpc transport creation arguments
+ *
+ */
+static struct rpc_xprt *xs_setup_bc_tcp(struct xprt_create *args)
+{
+ struct sockaddr *addr = args->dstaddr;
+ struct rpc_xprt *xprt;
+ struct sock_xprt *transport;
+ struct svc_sock *bc_sock;
+
+ if (!args->bc_xprt)
+ return ERR_PTR(-EINVAL);
+
+ xprt = xs_setup_xprt(args, xprt_tcp_slot_table_entries);
+ if (IS_ERR(xprt))
+ return xprt;
+ transport = container_of(xprt, struct sock_xprt, xprt);
+
+ xprt->prot = IPPROTO_TCP;
+ xprt->tsh_size = sizeof(rpc_fraghdr) / sizeof(u32);
+ xprt->max_payload = RPC_MAX_FRAGMENT_SIZE;
+ xprt->timeout = &xs_tcp_default_timeout;
+
+ /* backchannel */
+ xprt_set_bound(xprt);
+ xprt->bind_timeout = 0;
+ xprt->connect_timeout = 0;
+ xprt->reestablish_timeout = 0;
+ xprt->idle_timeout = (~0);
+
+ /*
+ * The backchannel uses the same socket connection as the
+ * forechannel
+ */
+ xprt->bc_xprt = args->bc_xprt;
+ bc_sock = container_of(args->bc_xprt, struct svc_sock, sk_xprt);
+ bc_sock->sk_bc_xprt = xprt;
+ transport->sock = bc_sock->sk_sock;
+ transport->inet = bc_sock->sk_sk;
+
+ xprt->ops = &bc_tcp_ops;
+
+ switch (addr->sa_family) {
+ case AF_INET:
+ xs_format_peer_addresses(xprt, "tcp",
+ RPCBIND_NETID_TCP);
+ break;
+ case AF_INET6:
+ xs_format_peer_addresses(xprt, "tcp",
+ RPCBIND_NETID_TCP6);
+ break;
+ default:
+ kfree(xprt);
+ return ERR_PTR(-EAFNOSUPPORT);
+ }
+
+ if (xprt_bound(xprt))
+ dprintk("RPC: set up xprt to %s (port %s) via %s\n",
+ xprt->address_strings[RPC_DISPLAY_ADDR],
+ xprt->address_strings[RPC_DISPLAY_PORT],
+ xprt->address_strings[RPC_DISPLAY_PROTO]);
+ else
+ dprintk("RPC: set up xprt to %s (autobind) via %s\n",
+ xprt->address_strings[RPC_DISPLAY_ADDR],
+ xprt->address_strings[RPC_DISPLAY_PROTO]);
+
+ /*
+ * Since we don't want connections for the backchannel, we set
+ * the xprt status to connected
+ */
+ xprt_set_connected(xprt);
+
+
+ if (try_module_get(THIS_MODULE))
+ return xprt;
+ kfree(xprt->slot);
+ kfree(xprt);
+ return ERR_PTR(-EINVAL);
+}
+
static struct xprt_class xs_udp_transport = {
.list = LIST_HEAD_INIT(xs_udp_transport.list),
.name = "udp",
.owner = THIS_MODULE,
- .ident = IPPROTO_UDP,
+ .ident = XPRT_TRANSPORT_UDP,
.setup = xs_setup_udp,
};

@@ -2480,10 +2562,18 @@ static struct xprt_class xs_tcp_transport = {
.list = LIST_HEAD_INIT(xs_tcp_transport.list),
.name = "tcp",
.owner = THIS_MODULE,
- .ident = IPPROTO_TCP,
+ .ident = XPRT_TRANSPORT_TCP,
.setup = xs_setup_tcp,
};

+static struct xprt_class xs_bc_tcp_transport = {
+ .list = LIST_HEAD_INIT(xs_bc_tcp_transport.list),
+ .name = "tcp NFSv4.1 backchannel",
+ .owner = THIS_MODULE,
+ .ident = XPRT_TRANSPORT_BC_TCP,
+ .setup = xs_setup_bc_tcp,
+};
+
/**
* init_socket_xprt - set up xprtsock's sysctls, register with RPC client
*
@@ -2497,6 +2587,7 @@ int init_socket_xprt(void)

xprt_register_transport(&xs_udp_transport);
xprt_register_transport(&xs_tcp_transport);
+ xprt_register_transport(&xs_bc_tcp_transport);

return 0;
}
@@ -2516,6 +2607,7 @@ void cleanup_socket_xprt(void)

xprt_unregister_transport(&xs_udp_transport);
xprt_unregister_transport(&xs_tcp_transport);
+ xprt_unregister_transport(&xs_bc_tcp_transport);
}

static int param_set_uint_minmax(const char *val, struct kernel_param *kp,
--
1.6.4
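
The identifier scheme in the xprtsock.h hunk above tags a backchannel transport by OR-ing a high flag bit into the protocol number. The sketch below reproduces that idea in user space under stated assumptions: every `SK_`/`sk_` name is hypothetical, the protocol numbers 6 and 17 are taken from the comment in the hunk, and the flag is written as an unsigned constant so the example does not depend on how a compiler treats shifting 1 into the sign bit of a signed int.

```c
#include <assert.h>
#include <stdint.h>

/* Protocol numbers quoted in the xprtsock.h comment (TCP 6, UDP 17). */
#define SK_IPPROTO_TCP		UINT32_C(6)
#define SK_IPPROTO_UDP		UINT32_C(17)
#define SK_XPRT_TRANSPORT_RDMA	UINT32_C(256)

/* High bit marks "backchannel variant of this protocol". */
#define SK_XPRT_TRANSPORT_BC	(UINT32_C(1) << 31)

/* Build the backchannel identifier for a base protocol. */
uint32_t sk_bc_ident(uint32_t proto)
{
	return proto | SK_XPRT_TRANSPORT_BC;
}

/* Non-zero if the identifier carries the backchannel flag. */
int sk_is_backchannel(uint32_t ident)
{
	return (ident & SK_XPRT_TRANSPORT_BC) != 0;
}

/* Strip the flag to recover the underlying protocol number. */
uint32_t sk_base_proto(uint32_t ident)
{
	return ident & ~SK_XPRT_TRANSPORT_BC;
}
```

Because the flag occupies a bit no protocol number uses, the backchannel identifier cannot collide with the plain TCP, UDP or RDMA identifiers, which is what lets a transport lookup by identifier select a separate setup routine.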


2009-09-10 09:25:31

by Benny Halevy

Subject: [PATCH v2 05/12] nfsd41: Backchannel: callback infrastructure

From: Andy Adamson <[email protected]>

Keep the xprt used for create_session in cl_cb_xprt.
Mark cl_callback.cb_minorversion = 1 and remember
the client-provided cl_callback.cb_prog rpc program number.
Use it to probe the callback path.

Use the client's network address to initialize the
callback's address, as expected by the xprt creation
routines.

Define xdr sizes and code the nfs4_cb_compound header so that
a null callback rpc can be sent.

Signed-off-by: Andy Adamson <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
Signed-off-by: Ricardo Labiaga <[email protected]>
[get callback minorversion from fore channel's]
Signed-off-by: Benny Halevy <[email protected]>
[nfsd41: change bc_sock to bc_xprt]
Signed-off-by: Benny Halevy <[email protected]>
[pulled definition for cl_cb_xprt]
Signed-off-by: Benny Halevy <[email protected]>
[nfsd41: set up backchannel's cb_addr]
[moved rpc_create_args init to "nfsd: modify nfsd4.1 backchannel to use new xprt class"]
Signed-off-by: Benny Halevy <[email protected]>
---
fs/nfsd/nfs4callback.c | 21 +++++++++++++++++++--
fs/nfsd/nfs4state.c | 14 ++++++++++++++
include/linux/nfsd/state.h | 3 +++
3 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 63bb384..3e3e15b 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -43,6 +43,7 @@
#include <linux/sunrpc/xdr.h>
#include <linux/sunrpc/svc.h>
#include <linux/sunrpc/clnt.h>
+#include <linux/sunrpc/svcsock.h>
#include <linux/nfsd/nfsd.h>
#include <linux/nfsd/state.h>
#include <linux/sunrpc/sched.h>
@@ -52,16 +53,19 @@

#define NFSPROC4_CB_NULL 0
#define NFSPROC4_CB_COMPOUND 1
+#define NFS4_STATEID_SIZE 16

/* Index of predefined Linux callback client operations */

enum {
NFSPROC4_CLNT_CB_NULL = 0,
NFSPROC4_CLNT_CB_RECALL,
+ NFSPROC4_CLNT_CB_SEQUENCE,
};

enum nfs_cb_opnum4 {
OP_CB_RECALL = 4,
+ OP_CB_SEQUENCE = 11,
};

#define NFS4_MAXTAGLEN 20
@@ -70,15 +74,22 @@ enum nfs_cb_opnum4 {
#define NFS4_dec_cb_null_sz 0
#define cb_compound_enc_hdr_sz 4
#define cb_compound_dec_hdr_sz (3 + (NFS4_MAXTAGLEN >> 2))
+#define sessionid_sz (NFS4_MAX_SESSIONID_LEN >> 2)
+#define cb_sequence_enc_sz (sessionid_sz + 4 + \
+ 1 /* no referring calls list yet */)
+#define cb_sequence_dec_sz (op_dec_sz + sessionid_sz + 4)
+
#define op_enc_sz 1
#define op_dec_sz 2
#define enc_nfs4_fh_sz (1 + (NFS4_FHSIZE >> 2))
#define enc_stateid_sz (NFS4_STATEID_SIZE >> 2)
#define NFS4_enc_cb_recall_sz (cb_compound_enc_hdr_sz + \
+ cb_sequence_enc_sz + \
1 + enc_stateid_sz + \
enc_nfs4_fh_sz)

#define NFS4_dec_cb_recall_sz (cb_compound_dec_hdr_sz + \
+ cb_sequence_dec_sz + \
op_dec_sz)

/*
@@ -137,11 +148,13 @@ xdr_error: \
} while (0)

struct nfs4_cb_compound_hdr {
- int status;
- u32 ident;
+ /* args */
+ u32 ident; /* minorversion 0 only */
u32 nops;
__be32 *nops_p;
u32 minorversion;
+ /* res */
+ int status;
u32 taglen;
char *tag;
};
@@ -399,6 +412,10 @@ int setup_callback_client(struct nfs4_client *clp)
if (!clp->cl_principal && (clp->cl_flavor >= RPC_AUTH_GSS_KRB5))
return -EINVAL;

+ dprintk("%s: program %s 0x%x nrvers %u version %u minorversion %u\n",
+ __func__, args.program->name, args.prognumber,
+ args.program->nrvers, args.version, cb->cb_minorversion);
+
/* Create RPC client */
client = rpc_create(&args);
if (IS_ERR(client)) {
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 46e9ac5..e4c3223 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -706,6 +706,8 @@ static inline void
free_client(struct nfs4_client *clp)
{
shutdown_callback_client(clp);
+ if (clp->cl_cb_xprt)
+ svc_xprt_put(clp->cl_cb_xprt);
if (clp->cl_cred.cr_group_info)
put_group_info(clp->cl_cred.cr_group_info);
kfree(clp->cl_principal);
@@ -1321,6 +1323,18 @@ nfsd4_create_session(struct svc_rqst *rqstp,
cr_ses->flags &= ~SESSION4_PERSIST;
cr_ses->flags &= ~SESSION4_RDMA;

+ if (cr_ses->flags & SESSION4_BACK_CHAN) {
+ unconf->cl_cb_xprt = rqstp->rq_xprt;
+ svc_xprt_get(unconf->cl_cb_xprt);
+ rpc_copy_addr(
+ (struct sockaddr *)&unconf->cl_cb_conn.cb_addr,
+ sa);
+ unconf->cl_cb_conn.cb_addrlen = svc_addr_len(sa);
+ unconf->cl_cb_conn.cb_minorversion =
+ cstate->minorversion;
+ unconf->cl_cb_conn.cb_prog = cr_ses->callback_prog;
+ nfsd4_probe_callback(unconf);
+ }
conf = unconf;
} else {
status = nfserr_stale_clientid;
diff --git a/include/linux/nfsd/state.h b/include/linux/nfsd/state.h
index 70ef5f4..243277b 100644
--- a/include/linux/nfsd/state.h
+++ b/include/linux/nfsd/state.h
@@ -212,6 +212,9 @@ struct nfs4_client {
struct nfsd4_clid_slot cl_cs_slot; /* create_session slot */
u32 cl_exchange_flags;
struct nfs4_sessionid cl_sessionid;
+
+ /* for nfs41 callbacks */
+ struct svc_xprt *cl_cb_xprt; /* 4.1 callback transport */
};

/* struct nfs4_client_reset
--
1.6.4
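
The XDR size macros added to nfs4callback.c above count 32-bit words, and the arithmetic is easy to check by hand. The sketch below mirrors those macros in user space. The `SK_`/`sk_` names are hypothetical, and two constants are assumptions that do not appear in this patch: a 16-byte session id (NFS4_MAX_SESSIONID_LEN) and a 128-byte maximum file handle (NFS4_FHSIZE). The "+ 4" in the sequence size is read here as the sequence id, slot id, highest slot id and cachethis words, which is also an interpretation rather than something the patch states.

```c
#include <assert.h>

#define SK_NFS4_MAX_SESSIONID_LEN	16	/* assumed, bytes */
#define SK_NFS4_FHSIZE			128	/* assumed, bytes */
#define SK_NFS4_STATEID_SIZE		16	/* from the patch, bytes */
#define SK_NFS4_MAXTAGLEN		20	/* from the patch, bytes */

/* All sizes below are in 32-bit XDR words, following the patch's macros. */
enum {
	sk_op_dec_sz		  = 2,
	sk_cb_compound_enc_hdr_sz = 4,
	sk_cb_compound_dec_hdr_sz = 3 + (SK_NFS4_MAXTAGLEN >> 2),
	sk_sessionid_sz		  = SK_NFS4_MAX_SESSIONID_LEN >> 2,
	/* session id + 4 fixed words + empty referring-call list count */
	sk_cb_sequence_enc_sz	  = sk_sessionid_sz + 4 + 1,
	sk_cb_sequence_dec_sz	  = sk_op_dec_sz + sk_sessionid_sz + 4,
	sk_enc_nfs4_fh_sz	  = 1 + (SK_NFS4_FHSIZE >> 2),
	sk_enc_stateid_sz	  = SK_NFS4_STATEID_SIZE >> 2,
	sk_enc_cb_recall_sz	  = sk_cb_compound_enc_hdr_sz +
				    sk_cb_sequence_enc_sz +
				    1 + sk_enc_stateid_sz +
				    sk_enc_nfs4_fh_sz,
	sk_dec_cb_recall_sz	  = sk_cb_compound_dec_hdr_sz +
				    sk_cb_sequence_dec_sz +
				    sk_op_dec_sz
};

/* Convert a word count into a byte count for buffer sizing. */
unsigned int sk_words_to_bytes(unsigned int words)
{
	assert(words <= 0x3fffffffu);	/* keep the multiply in range */
	return words * 4;
}
```

Under those assumed constants the recall request budget works out to 51 words and the reply budget to 20 words. The sequence operation accounts for 9 of the request words, which is the extra room a v4.1 callback needs compared with v4.0.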


2009-09-10 09:25:43

by Benny Halevy

Subject: [PATCH v2 06/12] nfsd41: Backchannel: Add sequence arguments to callback RPC arguments

From: Ricardo Labiaga <[email protected]>

Follow the model we use in the client. Make the sequence arguments
part of the regular RPC arguments. None of the callbacks that are
soon to be implemented expect results that need to be passed back
to the caller, so we don't define a separate RPC results structure.
For session validation, the cb_sequence decoding will use a pointer
to the sequence arguments that are part of the RPC argument.

Signed-off-by: Ricardo Labiaga <[email protected]>
[define struct nfsd4_cb_sequence here]
Signed-off-by: Benny Halevy <[email protected]>
---
fs/nfsd/nfs4callback.c | 5 +++++
include/linux/nfsd/state.h | 6 ++++++
2 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 3e3e15b..ff59a61 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -92,6 +92,11 @@ enum nfs_cb_opnum4 {
cb_sequence_dec_sz + \
op_dec_sz)

+struct nfs4_rpc_args {
+ void *args_op;
+ struct nfsd4_cb_sequence args_seq;
+};
+
/*
* Generic encode routines from fs/nfs/nfs4xdr.c
*/
diff --git a/include/linux/nfsd/state.h b/include/linux/nfsd/state.h
index 243277b..f69ea48 100644
--- a/include/linux/nfsd/state.h
+++ b/include/linux/nfsd/state.h
@@ -60,6 +60,12 @@ typedef struct {
#define si_stateownerid si_opaque.so_stateownerid
#define si_fileid si_opaque.so_fileid

+struct nfsd4_cb_sequence {
+ /* args/res */
+ u32 cbs_minorversion;
+ struct nfs4_client *cbs_clp;
+};
+
struct nfs4_delegation {
struct list_head dl_perfile;
struct list_head dl_perclnt;
--
1.6.4


2009-09-10 09:25:56

by Benny Halevy

Subject: [PATCH v2 07/12] nfsd41: Backchannel: Server backchannel RPC wait queue

From: Ricardo Labiaga <[email protected]>

RPC callback requests will wait on this wait queue if the backchannel
is out of slots.

Signed-off-by: Ricardo Labiaga <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
---
fs/nfsd/nfs4state.c | 2 ++
include/linux/nfsd/state.h | 4 ++++
2 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index e4c3223..957f6e5 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -779,6 +779,8 @@ static struct nfs4_client *create_client(struct xdr_netobj name, char *recdir)
INIT_LIST_HEAD(&clp->cl_delegations);
INIT_LIST_HEAD(&clp->cl_sessions);
INIT_LIST_HEAD(&clp->cl_lru);
+ clear_bit(0, &clp->cl_cb_slot_busy);
+ rpc_init_wait_queue(&clp->cl_cb_waitq, "Backchannel slot table");
return clp;
}

diff --git a/include/linux/nfsd/state.h b/include/linux/nfsd/state.h
index f69ea48..234e9af 100644
--- a/include/linux/nfsd/state.h
+++ b/include/linux/nfsd/state.h
@@ -220,7 +220,11 @@ struct nfs4_client {
struct nfs4_sessionid cl_sessionid;

/* for nfs41 callbacks */
+ /* We currently support a single back channel with a single slot */
+ unsigned long cl_cb_slot_busy;
struct svc_xprt *cl_cb_xprt; /* 4.1 callback transport */
+ struct rpc_wait_queue cl_cb_waitq; /* backchannel callers may */
+ /* wait here for slots */
};

/* struct nfs4_client_reset
--
1.6.4


2009-09-10 09:26:09

by Benny Halevy

Subject: [PATCH v2 08/12] nfsd41: Backchannel: Setup sequence information

From: Ricardo Labiaga <[email protected]>

Follows the model used by the NFS client. Set up the RPC prepare and done
function pointers so that we can populate the sequence information if
minorversion == 1. rpc_run_task() is then invoked directly, just as
existing NFS client operations do.

nfsd4_cb_prepare() determines whether the sequence information needs to be
set up. If the slot is in use, it adds itself to the wait queue.

nfsd4_cb_done() wakes anyone sleeping on the callback channel wait queue
after our RPC reply has been received. It also sets the task message
result pointer to NULL to clearly indicate we're done using it.

Signed-off-by: Ricardo Labiaga <[email protected]>
[define and initialize cl_cb_seq_nr here]
[pulled out unused definition of nfsd4_cb_done]
Signed-off-by: Benny Halevy <[email protected]>
---
fs/nfsd/nfs4callback.c | 62 ++++++++++++++++++++++++++++++++++++++++++++
fs/nfsd/nfs4state.c | 1 +
include/linux/nfsd/state.h | 1 +
3 files changed, 64 insertions(+), 0 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index ff59a61..e79e3a4 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -518,6 +518,67 @@ nfsd4_probe_callback(struct nfs4_client *clp)
do_probe_callback(clp);
}

+/*
+ * There's currently a single callback channel slot.
+ * If the slot is available, then mark it busy. Otherwise, put the
+ * task to sleep on the callback RPC wait queue.
+ */
+static int nfsd41_cb_setup_sequence(struct nfs4_client *clp,
+ struct rpc_task *task)
+{
+ struct nfs4_rpc_args *args = task->tk_msg.rpc_argp;
+ u32 *ptr = (u32 *)clp->cl_sessionid.data;
+ int status = 0;
+
+ dprintk("%s: %u:%u:%u:%u\n", __func__,
+ ptr[0], ptr[1], ptr[2], ptr[3]);
+
+ if (test_and_set_bit(0, &clp->cl_cb_slot_busy) != 0) {
+ rpc_sleep_on(&clp->cl_cb_waitq, task, NULL);
+ dprintk("%s slot is busy\n", __func__);
+ status = -EAGAIN;
+ goto out;
+ }
+
+ /*
+ * We'll need the clp during XDR encoding and decoding,
+ * and the sequence during decoding to verify the reply
+ */
+ args->args_seq.cbs_clp = clp;
+ task->tk_msg.rpc_resp = &args->args_seq;
+
+out:
+ dprintk("%s status=%d\n", __func__, status);
+ return status;
+}
+
+/*
+ * TODO: cb_sequence should support referring call lists, cachethis, multiple
+ * slots, and mark callback channel down on communication errors.
+ */
+static void nfsd4_cb_prepare(struct rpc_task *task, void *calldata)
+{
+ struct nfs4_delegation *dp = calldata;
+ struct nfs4_client *clp = dp->dl_client;
+ struct nfs4_rpc_args *args = task->tk_msg.rpc_argp;
+ u32 minorversion = clp->cl_cb_conn.cb_minorversion;
+ int status = 0;
+
+ args->args_seq.cbs_minorversion = minorversion;
+ if (minorversion) {
+ status = nfsd41_cb_setup_sequence(clp, task);
+ if (status) {
+ if (status != -EAGAIN) {
+ /* terminate rpc task */
+ task->tk_status = status;
+ task->tk_action = NULL;
+ }
+ return;
+ }
+ }
+ rpc_call_start(task);
+}
+
static void nfsd4_cb_recall_done(struct rpc_task *task, void *calldata)
{
struct nfs4_delegation *dp = calldata;
@@ -557,6 +618,7 @@ static void nfsd4_cb_recall_release(void *calldata)
}

static const struct rpc_call_ops nfsd4_cb_recall_ops = {
+ .rpc_call_prepare = nfsd4_cb_prepare,
.rpc_call_done = nfsd4_cb_recall_done,
.rpc_release = nfsd4_cb_recall_release,
};
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 957f6e5..b2ffa3b 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1335,6 +1335,7 @@ nfsd4_create_session(struct svc_rqst *rqstp,
unconf->cl_cb_conn.cb_minorversion =
cstate->minorversion;
unconf->cl_cb_conn.cb_prog = cr_ses->callback_prog;
+ unconf->cl_cb_seq_nr = 1;
nfsd4_probe_callback(unconf);
}
conf = unconf;
diff --git a/include/linux/nfsd/state.h b/include/linux/nfsd/state.h
index 234e9af..b621428 100644
--- a/include/linux/nfsd/state.h
+++ b/include/linux/nfsd/state.h
@@ -222,6 +222,7 @@ struct nfs4_client {
/* for nfs41 callbacks */
/* We currently support a single back channel with a single slot */
unsigned long cl_cb_slot_busy;
+ u32 cl_cb_seq_nr;
struct svc_xprt *cl_cb_xprt; /* 4.1 callback transport */
struct rpc_wait_queue cl_cb_waitq; /* backchannel callers may */
/* wait here for slots */
--
1.6.4


2009-09-10 09:26:39

by Benny Halevy

Subject: [PATCH v2 10/12] nfsd41: Backchannel: Implement cb_recall over NFSv4.1

From: Ricardo Labiaga <[email protected]>

Signed-off-by: Ricardo Labiaga <[email protected]>
[nfsd41: cb_recall callback]
[Share v4.0 and v4.1 back channel xdr]
Signed-off-by: Andy Adamson <[email protected]>
Signed-off-by: Ricardo Labiaga <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
[Share v4.0 and v4.1 back channel xdr]
Signed-off-by: Andy Adamson <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
[nfsd41: use nfsd4_cb_sequence for callback minorversion]
[nfsd41: conditionally decode_sequence in nfs4_xdr_dec_cb_recall]
Signed-off-by: Benny Halevy <[email protected]>
[nfsd41: Backchannel: Add sequence arguments to callback RPC arguments]
Signed-off-by: Ricardo Labiaga <[email protected]>
[pulled-in definition of nfsd4_cb_done]
Signed-off-by: Benny Halevy <[email protected]>
---
fs/nfsd/nfs4callback.c | 52 ++++++++++++++++++++++++++++++++++++++++++++---
1 files changed, 48 insertions(+), 4 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 5e9659c..30dc375 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -288,15 +288,19 @@ nfs4_xdr_enc_cb_null(struct rpc_rqst *req, __be32 *p)
}

static int
-nfs4_xdr_enc_cb_recall(struct rpc_rqst *req, __be32 *p, struct nfs4_delegation *args)
+nfs4_xdr_enc_cb_recall(struct rpc_rqst *req, __be32 *p,
+ struct nfs4_rpc_args *rpc_args)
{
struct xdr_stream xdr;
+ struct nfs4_delegation *args = rpc_args->args_op;
struct nfs4_cb_compound_hdr hdr = {
.ident = args->dl_ident,
+ .minorversion = rpc_args->args_seq.cbs_minorversion,
};

xdr_init_encode(&xdr, &req->rq_snd_buf, p);
encode_cb_compound_hdr(&xdr, &hdr);
+ encode_cb_sequence(&xdr, &rpc_args->args_seq, &hdr);
encode_cb_recall(&xdr, args, &hdr);
encode_cb_nops(&hdr);
return 0;
@@ -396,7 +400,8 @@ nfs4_xdr_dec_cb_null(struct rpc_rqst *req, __be32 *p)
}

static int
-nfs4_xdr_dec_cb_recall(struct rpc_rqst *rqstp, __be32 *p)
+nfs4_xdr_dec_cb_recall(struct rpc_rqst *rqstp, __be32 *p,
+ struct nfsd4_cb_sequence *seq)
{
struct xdr_stream xdr;
struct nfs4_cb_compound_hdr hdr;
@@ -406,6 +411,11 @@ nfs4_xdr_dec_cb_recall(struct rpc_rqst *rqstp, __be32 *p)
status = decode_cb_compound_hdr(&xdr, &hdr);
if (status)
goto out;
+ if (seq) {
+ status = decode_cb_sequence(&xdr, seq, rqstp);
+ if (status)
+ goto out;
+ }
status = decode_cb_op_hdr(&xdr, OP_CB_RECALL);
out:
return status;
@@ -651,11 +661,34 @@ static void nfsd4_cb_prepare(struct rpc_task *task, void *calldata)
rpc_call_start(task);
}

+static void nfsd4_cb_done(struct rpc_task *task, void *calldata)
+{
+ struct nfs4_delegation *dp = calldata;
+ struct nfs4_client *clp = dp->dl_client;
+
+ dprintk("%s: minorversion=%d\n", __func__,
+ clp->cl_cb_conn.cb_minorversion);
+
+ if (clp->cl_cb_conn.cb_minorversion) {
+ /* No need for lock, access serialized in nfsd4_cb_prepare */
+ ++clp->cl_cb_seq_nr;
+ clear_bit(0, &clp->cl_cb_slot_busy);
+ rpc_wake_up_next(&clp->cl_cb_waitq);
+ dprintk("%s: freed slot, new seqid=%d\n", __func__,
+ clp->cl_cb_seq_nr);
+
+ /* We're done looking into the sequence information */
+ task->tk_msg.rpc_resp = NULL;
+ }
+}
+
static void nfsd4_cb_recall_done(struct rpc_task *task, void *calldata)
{
struct nfs4_delegation *dp = calldata;
struct nfs4_client *clp = dp->dl_client;

+ nfsd4_cb_done(task, calldata);
+
switch (task->tk_status) {
case -EIO:
/* Network partition? */
@@ -668,16 +701,19 @@ static void nfsd4_cb_recall_done(struct rpc_task *task, void *calldata)
break;
default:
/* success, or error we can't handle */
- return;
+ goto done;
}
if (dp->dl_retries--) {
rpc_delay(task, 2*HZ);
task->tk_status = 0;
rpc_restart_call(task);
+ return;
} else {
atomic_set(&clp->cl_cb_conn.cb_set, 0);
warn_no_callback_path(clp, task->tk_status);
}
+done:
+ kfree(task->tk_msg.rpc_argp);
}

static void nfsd4_cb_recall_release(void *calldata)
@@ -703,16 +739,24 @@ nfsd4_cb_recall(struct nfs4_delegation *dp)
{
struct nfs4_client *clp = dp->dl_client;
struct rpc_clnt *clnt = clp->cl_cb_conn.cb_client;
+ struct nfs4_rpc_args *args;
struct rpc_message msg = {
.rpc_proc = &nfs4_cb_procedures[NFSPROC4_CLNT_CB_RECALL],
- .rpc_argp = dp,
.rpc_cred = clp->cl_cb_conn.cb_cred
};
int status;

+ args = kzalloc(sizeof(*args), GFP_KERNEL);
+ if (!args) {
+ status = -ENOMEM;
+ goto out;
+ }
+ args->args_op = dp;
+ msg.rpc_argp = args;
dp->dl_retries = 1;
status = rpc_call_async(clnt, &msg, RPC_TASK_SOFT,
&nfsd4_cb_recall_ops, dp);
+out:
if (status) {
put_nfs4_client(clp);
nfs4_put_delegation(dp);
--
1.6.4
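
For readers following the slot handling in nfsd4_cb_prepare() and nfsd4_cb_done(), here is a minimal userspace sketch of the single-slot protocol; it is not the kernel code. The names cb_slot_model, cb_slot_init, cb_slot_try_acquire and cb_slot_release are hypothetical. The test-and-set acquire is an assumption about what nfsd41_cb_setup_sequence() does (its body is not shown in this patch), and C11 atomic_flag stands in for the kernel bit operations on cl_cb_slot_busy.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the per-client backchannel state. */
struct cb_slot_model {
	atomic_flag busy;	/* models bit 0 of cl_cb_slot_busy */
	uint32_t seq_nr;	/* models cl_cb_seq_nr */
};

/* Models create_session: slot free, first callback sequence ID is 1. */
static void cb_slot_init(struct cb_slot_model *s)
{
	atomic_flag_clear(&s->busy);
	s->seq_nr = 1;
}

/*
 * Returns true if the caller now owns the slot. A false return models
 * the -EAGAIN case, where the RPC task would sleep on cl_cb_waitq.
 */
static bool cb_slot_try_acquire(struct cb_slot_model *s)
{
	return !atomic_flag_test_and_set(&s->busy);
}

/*
 * Models nfsd4_cb_done(): only the slot owner gets here, so the
 * sequence number can be bumped without a lock before the slot is freed.
 */
static void cb_slot_release(struct cb_slot_model *s)
{
	++s->seq_nr;
	atomic_flag_clear(&s->busy);
}
```

In this model a second caller is refused until the first one releases the slot, and each completed callback advances the sequence number by one, which is the invariant the "No need for lock" comment relies on.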


2009-09-10 09:26:22

by Benny Halevy

Subject: [PATCH v2 09/12] nfsd41: Backchannel: cb_sequence callback

Implement the cb_sequence callback conforming to draft-ietf-nfsv4-minorversion1.

Note: highest slot id and target highest slot id do not have to be 0
as was previously implemented. They can be greater than what the
nfs server sent if the client supports a larger slot table on the
backchannel. At this point we just ignore that.

Signed-off-by: Benny Halevy <[email protected]>
Signed-off-by: Ricardo Labiaga <[email protected]>
[Rework the back channel xdr using the shared v4.0 and v4.1 framework.]
Signed-off-by: Andy Adamson <[email protected]>
[fixed indentation]
Signed-off-by: Benny Halevy <[email protected]>
[nfsd41: use nfsd4_cb_sequence for callback minorversion]
Signed-off-by: Benny Halevy <[email protected]>
[nfsd41: fix verification of CB_SEQUENCE highest slot id]
Signed-off-by: Benny Halevy <[email protected]>
[nfsd41: Backchannel: Remove old backchannel serialization]
[nfsd41: Backchannel: First callback sequence ID should be 1]
Signed-off-by: Ricardo Labiaga <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
[nfsd41: decode_cb_sequence does not need to actually decode ignored fields]
Signed-off-by: Benny Halevy <[email protected]>
Signed-off-by: Ricardo Labiaga <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
---
fs/nfsd/nfs4callback.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 72 insertions(+), 0 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index e79e3a4..5e9659c 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -256,6 +256,27 @@ encode_cb_recall(struct xdr_stream *xdr, struct nfs4_delegation *dp,
hdr->nops++;
}

+static void
+encode_cb_sequence(struct xdr_stream *xdr, struct nfsd4_cb_sequence *args,
+ struct nfs4_cb_compound_hdr *hdr)
+{
+ __be32 *p;
+
+ if (hdr->minorversion == 0)
+ return;
+
+ RESERVE_SPACE(1 + NFS4_MAX_SESSIONID_LEN + 20);
+
+ WRITE32(OP_CB_SEQUENCE);
+ WRITEMEM(args->cbs_clp->cl_sessionid.data, NFS4_MAX_SESSIONID_LEN);
+ WRITE32(args->cbs_clp->cl_cb_seq_nr);
+ WRITE32(0); /* slotid, always 0 */
+ WRITE32(0); /* highest slotid always 0 */
+ WRITE32(0); /* cachethis always 0 */
+ WRITE32(0); /* FIXME: support referring_call_lists */
+ hdr->nops++;
+}
+
static int
nfs4_xdr_enc_cb_null(struct rpc_rqst *req, __be32 *p)
{
@@ -317,6 +338,57 @@ decode_cb_op_hdr(struct xdr_stream *xdr, enum nfs_opnum4 expected)
return 0;
}

+/*
+ * Our current back channel implementation supports a single backchannel
+ * with a single slot.
+ */
+static int
+decode_cb_sequence(struct xdr_stream *xdr, struct nfsd4_cb_sequence *res,
+ struct rpc_rqst *rqstp)
+{
+ struct nfs4_sessionid id;
+ int status;
+ u32 dummy;
+ __be32 *p;
+
+ if (res->cbs_minorversion == 0)
+ return 0;
+
+ status = decode_cb_op_hdr(xdr, OP_CB_SEQUENCE);
+ if (status)
+ return status;
+
+ /*
+ * If the server returns different values for sessionID, slotID or
+ * sequence number, the server is looney tunes.
+ */
+ status = -ESERVERFAULT;
+
+ READ_BUF(NFS4_MAX_SESSIONID_LEN + 16);
+ memcpy(id.data, p, NFS4_MAX_SESSIONID_LEN);
+ p += XDR_QUADLEN(NFS4_MAX_SESSIONID_LEN);
+ if (memcmp(id.data, res->cbs_clp->cl_sessionid.data,
+ NFS4_MAX_SESSIONID_LEN)) {
+ dprintk("%s Invalid session id\n", __func__);
+ goto out;
+ }
+ READ32(dummy);
+ if (dummy != res->cbs_clp->cl_cb_seq_nr) {
+ dprintk("%s Invalid sequence number\n", __func__);
+ goto out;
+ }
+ READ32(dummy); /* slotid must be 0 */
+ if (dummy != 0) {
+ dprintk("%s Invalid slotid\n", __func__);
+ goto out;
+ }
+ /* FIXME: process highest slotid and target highest slotid */
+ status = 0;
+out:
+ return status;
+}
+
+
static int
nfs4_xdr_dec_cb_null(struct rpc_rqst *req, __be32 *p)
{
--
1.6.4
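
To make the wire layout produced by encode_cb_sequence() easier to check, here is a small userspace sketch of the same field order; it is an illustration, not the kernel encoder. The names put_be32 and model_encode_cb_sequence are hypothetical, and a plain byte buffer stands in for the xdr_stream. The opcode value 11 for OP_CB_SEQUENCE and the 16-byte session ID length are taken from the NFSv4.1 protocol definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_SESSIONID_LEN	16	/* NFS4_MAX_SESSIONID_LEN */
#define MODEL_OP_CB_SEQUENCE	11	/* OP_CB_SEQUENCE */

/* Store a 32-bit value in big-endian (XDR) order; return the next position. */
static uint8_t *put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
	return p + 4;
}

/*
 * Encode the CB_SEQUENCE operation in the field order used by the patch.
 * Returns the number of bytes written, or 0 if the buffer is too small.
 */
static size_t model_encode_cb_sequence(uint8_t *buf, size_t buflen,
				       const uint8_t *sessionid,
				       uint32_t seq_nr)
{
	uint8_t *p = buf;

	if (buflen < 4 + MODEL_SESSIONID_LEN + 5 * 4)
		return 0;

	p = put_be32(p, MODEL_OP_CB_SEQUENCE);
	memcpy(p, sessionid, MODEL_SESSIONID_LEN);
	p += MODEL_SESSIONID_LEN;
	p = put_be32(p, seq_nr);	/* sequence ID */
	p = put_be32(p, 0);		/* slotid, always 0 */
	p = put_be32(p, 0);		/* highest slotid, always 0 */
	p = put_be32(p, 0);		/* cachethis, always 0 */
	p = put_be32(p, 0);		/* empty referring call list */
	return (size_t)(p - buf);
}
```

Under these assumptions the operation occupies 4 + 16 + 20 = 40 bytes: the opcode, the session ID, and five 32-bit words.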


2009-09-10 09:26:52

by Benny Halevy

Subject: [PATCH v2 11/12] nfsd41: modify nfsd4.1 backchannel to use new xprt class

From: Alexandros Batsakis <[email protected]>

This patch enables the use of the nfsv4.1 backchannel.

Signed-off-by: Alexandros Batsakis <[email protected]>
[initialize rpc_create_args.bc_xprt too]
Signed-off-by: Benny Halevy <[email protected]>
---
fs/nfsd/nfs4callback.c | 8 +++++++-
1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 30dc375..236e7ee 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -48,6 +48,7 @@
#include <linux/nfsd/state.h>
#include <linux/sunrpc/sched.h>
#include <linux/nfs4.h>
+#include <linux/sunrpc/xprtsock.h>

#define NFSDDBG_FACILITY NFSDDBG_PROC

@@ -483,7 +484,7 @@ int setup_callback_client(struct nfs4_client *clp)
.to_retries = 0,
};
struct rpc_create_args args = {
- .protocol = IPPROTO_TCP,
+ .protocol = XPRT_TRANSPORT_TCP,
.address = (struct sockaddr *) &cb->cb_addr,
.addrsize = cb->cb_addrlen,
.timeout = &timeparms,
@@ -499,6 +500,11 @@ int setup_callback_client(struct nfs4_client *clp)
if (!clp->cl_principal && (clp->cl_flavor >= RPC_AUTH_GSS_KRB5))
return -EINVAL;

+ if (cb->cb_minorversion) {
+ args.bc_xprt = clp->cl_cb_xprt;
+ args.protocol = XPRT_TRANSPORT_BC_TCP;
+ }
+
dprintk("%s: program %s 0x%x nrvers %u version %u minorversion %u\n",
__func__, args.program->name, args.prognumber,
args.program->nrvers, args.version, cb->cb_minorversion);
--
1.6.4


2009-09-10 09:27:05

by Benny Halevy

Subject: [PATCH v2 12/12] nfsd41: Refactor create_client()

From: Ricardo Labiaga <[email protected]>

Move common initialization of 'struct nfs4_client' inside create_client().

Signed-off-by: Ricardo Labiaga <[email protected]>

[nfsd41: Remember the auth flavor to use for callbacks]
Signed-off-by: Ricardo Labiaga <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
---
fs/nfsd/nfs4state.c | 89 ++++++++++++++++++++++++++-------------------------
1 files changed, 45 insertions(+), 44 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index b2ffa3b..f344de2 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -763,27 +763,6 @@ expire_client(struct nfs4_client *clp)
put_nfs4_client(clp);
}

-static struct nfs4_client *create_client(struct xdr_netobj name, char *recdir)
-{
- struct nfs4_client *clp;
-
- clp = alloc_client(name);
- if (clp == NULL)
- return NULL;
- memcpy(clp->cl_recdir, recdir, HEXDIR_LEN);
- atomic_set(&clp->cl_count, 1);
- atomic_set(&clp->cl_cb_conn.cb_set, 0);
- INIT_LIST_HEAD(&clp->cl_idhash);
- INIT_LIST_HEAD(&clp->cl_strhash);
- INIT_LIST_HEAD(&clp->cl_openowners);
- INIT_LIST_HEAD(&clp->cl_delegations);
- INIT_LIST_HEAD(&clp->cl_sessions);
- INIT_LIST_HEAD(&clp->cl_lru);
- clear_bit(0, &clp->cl_cb_slot_busy);
- rpc_init_wait_queue(&clp->cl_cb_waitq, "Backchannel slot table");
- return clp;
-}
-
static void copy_verf(struct nfs4_client *target, nfs4_verifier *source)
{
memcpy(target->cl_verifier.data, source->data,
@@ -846,6 +825,46 @@ static void gen_confirm(struct nfs4_client *clp)
*p++ = i++;
}

+static struct nfs4_client *create_client(struct xdr_netobj name, char *recdir,
+ struct svc_rqst *rqstp, nfs4_verifier *verf)
+{
+ struct nfs4_client *clp;
+ struct sockaddr *sa = svc_addr(rqstp);
+ char *princ;
+
+ clp = alloc_client(name);
+ if (clp == NULL)
+ return NULL;
+
+ princ = svc_gss_principal(rqstp);
+ if (princ) {
+ clp->cl_principal = kstrdup(princ, GFP_KERNEL);
+ if (clp->cl_principal == NULL) {
+ free_client(clp);
+ return NULL;
+ }
+ }
+
+ memcpy(clp->cl_recdir, recdir, HEXDIR_LEN);
+ atomic_set(&clp->cl_count, 1);
+ atomic_set(&clp->cl_cb_conn.cb_set, 0);
+ INIT_LIST_HEAD(&clp->cl_idhash);
+ INIT_LIST_HEAD(&clp->cl_strhash);
+ INIT_LIST_HEAD(&clp->cl_openowners);
+ INIT_LIST_HEAD(&clp->cl_delegations);
+ INIT_LIST_HEAD(&clp->cl_sessions);
+ INIT_LIST_HEAD(&clp->cl_lru);
+ clear_bit(0, &clp->cl_cb_slot_busy);
+ rpc_init_wait_queue(&clp->cl_cb_waitq, "Backchannel slot table");
+ copy_verf(clp, verf);
+ rpc_copy_addr((struct sockaddr *) &clp->cl_addr, sa);
+ clp->cl_flavor = rqstp->rq_flavor;
+ copy_cred(&clp->cl_cred, &rqstp->rq_cred);
+ gen_confirm(clp);
+
+ return clp;
+}
+
static int check_name(struct xdr_netobj name)
{
if (name.len == 0)
@@ -1193,17 +1212,13 @@ nfsd4_exchange_id(struct svc_rqst *rqstp,

out_new:
/* Normal case */
- new = create_client(exid->clname, dname);
+ new = create_client(exid->clname, dname, rqstp, &verf);
if (new == NULL) {
status = nfserr_serverfault;
goto out;
}

- copy_verf(new, &verf);
- copy_cred(&new->cl_cred, &rqstp->rq_cred);
- rpc_copy_addr((struct sockaddr *) &new->cl_addr, sa);
gen_clid(new);
- gen_confirm(new);
add_to_unconfirmed(new, strhashval);
out_copy:
exid->clientid.cl_boot = new->cl_clientid.cl_boot;
@@ -1477,7 +1492,6 @@ nfsd4_setclientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
unsigned int strhashval;
struct nfs4_client *conf, *unconf, *new;
__be32 status;
- char *princ;
char dname[HEXDIR_LEN];

if (!check_name(clname))
@@ -1522,7 +1536,7 @@ nfsd4_setclientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
*/
if (unconf)
expire_client(unconf);
- new = create_client(clname, dname);
+ new = create_client(clname, dname, rqstp, &clverifier);
if (new == NULL)
goto out;
gen_clid(new);
@@ -1539,7 +1553,7 @@ nfsd4_setclientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
*/
expire_client(unconf);
}
- new = create_client(clname, dname);
+ new = create_client(clname, dname, rqstp, &clverifier);
if (new == NULL)
goto out;
copy_clid(new, conf);
@@ -1549,7 +1563,7 @@ nfsd4_setclientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
* probable client reboot; state will be removed if
* confirmed.
*/
- new = create_client(clname, dname);
+ new = create_client(clname, dname, rqstp, &clverifier);
if (new == NULL)
goto out;
gen_clid(new);
@@ -1560,24 +1574,11 @@ nfsd4_setclientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
* confirmed.
*/
expire_client(unconf);
- new = create_client(clname, dname);
+ new = create_client(clname, dname, rqstp, &clverifier);
if (new == NULL)
goto out;
gen_clid(new);
}
- copy_verf(new, &clverifier);
- rpc_copy_addr((struct sockaddr *) &new->cl_addr, sa);
- new->cl_flavor = rqstp->rq_flavor;
- princ = svc_gss_principal(rqstp);
- if (princ) {
- new->cl_principal = kstrdup(princ, GFP_KERNEL);
- if (new->cl_principal == NULL) {
- free_client(new);
- goto out;
- }
- }
- copy_cred(&new->cl_cred, &rqstp->rq_cred);
- gen_confirm(new);
gen_callback(new, setclid, rpc_get_scope_id(sa));
add_to_unconfirmed(new, strhashval);
setclid->se_clientid.cl_boot = new->cl_clientid.cl_boot;
--
1.6.4


2009-09-10 11:49:57

by Myklebust, Trond

Subject: Re: [PATCH v2 02/12] nfsd41: sunrpc: Added rpc server-side backchannel handling

On Thu, 2009-09-10 at 12:25 +0300, Benny Halevy wrote:
> From: Rahul Iyer <[email protected]>
>
> When the call direction is a reply, copy the xid and call direction into the
> req->rq_private_buf.head[0].iov_base otherwise rpc_verify_header returns
> rpc_garbage.
>
> Signed-off-by: Rahul Iyer <[email protected]>
> Signed-off-by: Mike Sager <[email protected]>
> Signed-off-by: Marc Eshel <[email protected]>
> Signed-off-by: Benny Halevy <[email protected]>
> Signed-off-by: Ricardo Labiaga <[email protected]>
> Signed-off-by: Andy Adamson <[email protected]>
> Signed-off-by: Benny Halevy <[email protected]>
> [get rid of CONFIG_NFSD_V4_1]
> [sunrpc: refactoring of svc_tcp_recvfrom]
> [nfsd41: sunrpc: create common send routine for the fore and the back channels]
> [nfsd41: sunrpc: Use free_page() to free server backchannel pages]
> [nfsd41: sunrpc: Document server backchannel locking]
> [nfsd41: sunrpc: remove bc_connect_worker()]
> [nfsd41: sunrpc: Define xprt_server_backchannel()]
> [nfsd41: sunrpc: remove bc_close and bc_init_auto_disconnect dummy functions]
> [nfsd41: sunrpc: eliminate unneeded switch statement in xs_setup_tcp()]
> [nfsd41: sunrpc: Don't auto close the server backchannel connection]
> [nfsd41: sunrpc: Remove unused functions]
> Signed-off-by: Alexandros Batsakis <[email protected]>
> Signed-off-by: Ricardo Labiaga <[email protected]>
> Signed-off-by: Benny Halevy <[email protected]>
> [nfsd41: change bc_sock to bc_xprt]
> [nfsd41: sunrpc: move struct rpc_buffer def into a common header file]
> [nfsd41: sunrpc: use rpc_sleep in bc_send_request so not to block on mutex]
> [removed cosmetic changes]
> Signed-off-by: Benny Halevy <[email protected]>
> [sunrpc: add new xprt class for nfsv4.1 backchannel]
> [sunrpc: v2.1 change handling of auto_close and init_auto_disconnect operations for the nfsv4.1 backchannel]
> Signed-off-by: Alexandros Batsakis <[email protected]>
> [reverted more cosmetic leftovers]
> [got rid of xprt_server_backchannel]
> [separated "nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel"]
> Signed-off-by: Benny Halevy <[email protected]>
> Cc: Trond Myklebust <[email protected]>
> ---
> include/linux/sunrpc/svc_xprt.h | 1 +
> include/linux/sunrpc/svcsock.h | 1 +
> include/linux/sunrpc/xprt.h | 1 +
> net/sunrpc/sunrpc.h | 4 +
> net/sunrpc/svc_xprt.c | 2 +
> net/sunrpc/svcsock.c | 172 +++++++++++++++++++++++++++++++--------
> net/sunrpc/xprt.c | 15 +++-
> net/sunrpc/xprtsock.c | 146 +++++++++++++++++++++++++++++++++
> 8 files changed, 303 insertions(+), 39 deletions(-)
>
> diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
> index 2223ae0..5f4e18b 100644
> --- a/include/linux/sunrpc/svc_xprt.h
> +++ b/include/linux/sunrpc/svc_xprt.h
> @@ -65,6 +65,7 @@ struct svc_xprt {
> size_t xpt_locallen; /* length of address */
> struct sockaddr_storage xpt_remote; /* remote peer's address */
> size_t xpt_remotelen; /* length of address */
> + struct rpc_wait_queue xpt_bc_pending; /* backchannel wait queue */
> };
>
> int svc_reg_xprt_class(struct svc_xprt_class *);
> diff --git a/include/linux/sunrpc/svcsock.h b/include/linux/sunrpc/svcsock.h
> index 04dba23..1b353a7 100644
> --- a/include/linux/sunrpc/svcsock.h
> +++ b/include/linux/sunrpc/svcsock.h
> @@ -28,6 +28,7 @@ struct svc_sock {
> /* private TCP part */
> u32 sk_reclen; /* length of record */
> u32 sk_tcplen; /* current read length */
> + struct rpc_xprt *sk_bc_xprt; /* NFSv4.1 backchannel xprt */
> };
>
> /*
> diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
> index c090df4..228d694 100644
> --- a/include/linux/sunrpc/xprt.h
> +++ b/include/linux/sunrpc/xprt.h
> @@ -179,6 +179,7 @@ struct rpc_xprt {
> spinlock_t reserve_lock; /* lock slot table */
> u32 xid; /* Next XID value to use */
> struct rpc_task * snd_task; /* Task blocked in send */
> + struct svc_xprt *bc_xprt; /* NFSv4.1 backchannel */
> #if defined(CONFIG_NFS_V4_1)
> struct svc_serv *bc_serv; /* The RPC service which will */
> /* process the callback */
> diff --git a/net/sunrpc/sunrpc.h b/net/sunrpc/sunrpc.h
> index 13171e6..90c292e 100644
> --- a/net/sunrpc/sunrpc.h
> +++ b/net/sunrpc/sunrpc.h
> @@ -43,5 +43,9 @@ static inline int rpc_reply_expected(struct rpc_task *task)
> (task->tk_msg.rpc_proc->p_decode != NULL);
> }
>
> +int svc_send_common(struct socket *sock, struct xdr_buf *xdr,
> + struct page *headpage, unsigned long headoffset,
> + struct page *tailpage, unsigned long tailoffset);
> +
> #endif /* _NET_SUNRPC_SUNRPC_H */
>
> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> index 912dea5..df124f7 100644
> --- a/net/sunrpc/svc_xprt.c
> +++ b/net/sunrpc/svc_xprt.c
> @@ -160,6 +160,7 @@ void svc_xprt_init(struct svc_xprt_class *xcl, struct svc_xprt *xprt,
> mutex_init(&xprt->xpt_mutex);
> spin_lock_init(&xprt->xpt_lock);
> set_bit(XPT_BUSY, &xprt->xpt_flags);
> + rpc_init_wait_queue(&xprt->xpt_bc_pending, "xpt_bc_pending");
> }
> EXPORT_SYMBOL_GPL(svc_xprt_init);
>
> @@ -810,6 +811,7 @@ int svc_send(struct svc_rqst *rqstp)
> else
> len = xprt->xpt_ops->xpo_sendto(rqstp);
> mutex_unlock(&xprt->xpt_mutex);
> + rpc_wake_up(&xprt->xpt_bc_pending);
> svc_xprt_release(rqstp);
>
> if (len == -ECONNREFUSED || len == -ENOTCONN || len == -EAGAIN)
> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> index 76a380d..ccc5e83 100644
> --- a/net/sunrpc/svcsock.c
> +++ b/net/sunrpc/svcsock.c
> @@ -49,6 +49,7 @@
> #include <linux/sunrpc/msg_prot.h>
> #include <linux/sunrpc/svcsock.h>
> #include <linux/sunrpc/stats.h>
> +#include <linux/sunrpc/xprt.h>
>
> #define RPCDBG_FACILITY RPCDBG_SVCXPRT
>
> @@ -153,49 +154,27 @@ static void svc_set_cmsg_data(struct svc_rqst *rqstp, struct cmsghdr *cmh)
> }
>
> /*
> - * Generic sendto routine
> + * send routine intended to be shared by the fore- and back-channel
> */
> -static int svc_sendto(struct svc_rqst *rqstp, struct xdr_buf *xdr)
> +int svc_send_common(struct socket *sock, struct xdr_buf *xdr,
> + struct page *headpage, unsigned long headoffset,
> + struct page *tailpage, unsigned long tailoffset)
> {
> - struct svc_sock *svsk =
> - container_of(rqstp->rq_xprt, struct svc_sock, sk_xprt);
> - struct socket *sock = svsk->sk_sock;
> - int slen;
> - union {
> - struct cmsghdr hdr;
> - long all[SVC_PKTINFO_SPACE / sizeof(long)];
> - } buffer;
> - struct cmsghdr *cmh = &buffer.hdr;
> - int len = 0;
> int result;
> int size;
> struct page **ppage = xdr->pages;
> size_t base = xdr->page_base;
> unsigned int pglen = xdr->page_len;
> unsigned int flags = MSG_MORE;
> - RPC_IFDEBUG(char buf[RPC_MAX_ADDRBUFLEN]);
> + int slen;
> + int len = 0;
>
> slen = xdr->len;
>
> - if (rqstp->rq_prot == IPPROTO_UDP) {
> - struct msghdr msg = {
> - .msg_name = &rqstp->rq_addr,
> - .msg_namelen = rqstp->rq_addrlen,
> - .msg_control = cmh,
> - .msg_controllen = sizeof(buffer),
> - .msg_flags = MSG_MORE,
> - };
> -
> - svc_set_cmsg_data(rqstp, cmh);
> -
> - if (sock_sendmsg(sock, &msg, 0) < 0)
> - goto out;
> - }
> -
> /* send head */
> if (slen == xdr->head[0].iov_len)
> flags = 0;
> - len = kernel_sendpage(sock, rqstp->rq_respages[0], 0,
> + len = kernel_sendpage(sock, headpage, headoffset,
> xdr->head[0].iov_len, flags);
> if (len != xdr->head[0].iov_len)
> goto out;
> @@ -219,16 +198,58 @@ static int svc_sendto(struct svc_rqst *rqstp, struct xdr_buf *xdr)
> base = 0;
> ppage++;
> }
> +
> /* send tail */
> if (xdr->tail[0].iov_len) {
> - result = kernel_sendpage(sock, rqstp->rq_respages[0],
> - ((unsigned long)xdr->tail[0].iov_base)
> - & (PAGE_SIZE-1),
> - xdr->tail[0].iov_len, 0);
> -
> + result = kernel_sendpage(sock, tailpage, tailoffset,
> + xdr->tail[0].iov_len, 0);
> if (result > 0)
> len += result;
> }
> +
> +out:
> + return len;
> +}
> +
> +
> +/*
> + * Generic sendto routine
> + */
> +static int svc_sendto(struct svc_rqst *rqstp, struct xdr_buf *xdr)
> +{
> + struct svc_sock *svsk =
> + container_of(rqstp->rq_xprt, struct svc_sock, sk_xprt);
> + struct socket *sock = svsk->sk_sock;
> + union {
> + struct cmsghdr hdr;
> + long all[SVC_PKTINFO_SPACE / sizeof(long)];
> + } buffer;
> + struct cmsghdr *cmh = &buffer.hdr;
> + int len = 0;
> + unsigned long tailoff;
> + unsigned long headoff;
> + RPC_IFDEBUG(char buf[RPC_MAX_ADDRBUFLEN]);
> +
> + if (rqstp->rq_prot == IPPROTO_UDP) {
> + struct msghdr msg = {
> + .msg_name = &rqstp->rq_addr,
> + .msg_namelen = rqstp->rq_addrlen,
> + .msg_control = cmh,
> + .msg_controllen = sizeof(buffer),
> + .msg_flags = MSG_MORE,
> + };
> +
> + svc_set_cmsg_data(rqstp, cmh);
> +
> + if (sock_sendmsg(sock, &msg, 0) < 0)
> + goto out;
> + }
> +
> + tailoff = ((unsigned long)xdr->tail[0].iov_base) & (PAGE_SIZE-1);
> + headoff = 0;
> + len = svc_send_common(sock, xdr, rqstp->rq_respages[0], headoff,
> + rqstp->rq_respages[0], tailoff);
> +
> out:
> dprintk("svc: socket %p sendto([%p %Zu... ], %d) = %d (addr %s)\n",
> svsk, xdr->head[0].iov_base, xdr->head[0].iov_len,
> @@ -951,6 +972,57 @@ static int svc_tcp_recv_record(struct svc_sock *svsk, struct svc_rqst *rqstp)
> return -EAGAIN;
> }
>
> +static int svc_process_calldir(struct svc_sock *svsk, struct svc_rqst *rqstp,
> + struct rpc_rqst **reqpp, struct kvec *vec)
> +{
> + struct rpc_rqst *req = NULL;
> + u32 *p;
> + u32 xid;
> + u32 calldir;
> + int len;
> +
> + len = svc_recvfrom(rqstp, vec, 1, 8);
> + if (len < 0)
> + goto error;
> +
> + p = (u32 *)rqstp->rq_arg.head[0].iov_base;
> + xid = *p++;
> + calldir = *p;
> +
> + if (calldir == 0) {
> + /* REQUEST is the most common case */
> + vec[0] = rqstp->rq_arg.head[0];
> + } else {
> + /* REPLY */
> + if (svsk->sk_bc_xprt)
> + req = xprt_lookup_rqst(svsk->sk_bc_xprt, xid);
> +
> + if (!req) {
> + printk(KERN_NOTICE
> + "%s: Got unrecognized reply: "
> + "calldir 0x%x sk_bc_xprt %p xid %08x\n",
> + __func__, ntohl(calldir),
> + svsk->sk_bc_xprt, xid);
> + vec[0] = rqstp->rq_arg.head[0];
> + goto out;
> + }
> +
> + memcpy(&req->rq_private_buf, &req->rq_rcv_buf,
> + sizeof(struct xdr_buf));
> + /* copy the xid and call direction */
> + memcpy(req->rq_private_buf.head[0].iov_base,
> + rqstp->rq_arg.head[0].iov_base, 8);
> + vec[0] = req->rq_private_buf.head[0];
> + }
> + out:
> + vec[0].iov_base += 8;
> + vec[0].iov_len -= 8;
> + len = svsk->sk_reclen - 8;
> + error:
> + *reqpp = req;
> + return len;
> +}
> +
> /*
> * Receive data from a TCP socket.
> */
> @@ -962,6 +1034,7 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
> int len;
> struct kvec *vec;
> int pnum, vlen;
> + struct rpc_rqst *req = NULL;
>
> dprintk("svc: tcp_recv %p data %d conn %d close %d\n",
> svsk, test_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags),
> @@ -975,9 +1048,27 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
> vec = rqstp->rq_vec;
> vec[0] = rqstp->rq_arg.head[0];
> vlen = PAGE_SIZE;
> +
> + /*
> + * We have enough data for the whole tcp record. Let's try and read the
> + * first 8 bytes to get the xid and the call direction. We can use this
> + * to figure out if this is a call or a reply to a callback. If
> + * sk_reclen is < 8 (xid and calldir), then this is a malformed packet.
> + * In that case, don't bother with the calldir and just read the data.
> + * It will be rejected in svc_process.
> + */
> + if (len >= 8) {
> + len = svc_process_calldir(svsk, rqstp, &req, vec);
> + if (len < 0)
> + goto err_again;
> + vlen -= 8;
> + }
> +
> pnum = 1;
> while (vlen < len) {
> - vec[pnum].iov_base = page_address(rqstp->rq_pages[pnum]);
> + vec[pnum].iov_base = (req) ?
> + page_address(req->rq_private_buf.pages[pnum - 1]) :
> + page_address(rqstp->rq_pages[pnum]);
> vec[pnum].iov_len = PAGE_SIZE;
> pnum++;
> vlen += PAGE_SIZE;
> @@ -989,6 +1080,16 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
> if (len < 0)
> goto err_again;
>
> + /*
> + * Account for the 8 bytes we read earlier
> + */
> + len += 8;
> +
> + if (req) {
> + xprt_complete_rqst(req->rq_task, len);
> + len = 0;
> + goto out;
> + }
> dprintk("svc: TCP complete record (%d bytes)\n", len);
> rqstp->rq_arg.len = len;
> rqstp->rq_arg.page_base = 0;
> @@ -1002,6 +1103,7 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
> rqstp->rq_xprt_ctxt = NULL;
> rqstp->rq_prot = IPPROTO_TCP;
>
> +out:
> /* Reset TCP read info */
> svsk->sk_reclen = 0;
> svsk->sk_tcplen = 0;
> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> index f412a85..f577e5a 100644
> --- a/net/sunrpc/xprt.c
> +++ b/net/sunrpc/xprt.c
> @@ -832,6 +832,11 @@ static void xprt_timer(struct rpc_task *task)
> spin_unlock_bh(&xprt->transport_lock);
> }
>
> +static inline int xprt_has_timer(struct rpc_xprt *xprt)
> +{
> + return xprt->idle_timeout != (~0);
> +}

Why did this change again?

It's a disconnect timer, and the idle_timeout sets the timeout period. A
test for whether or not that period is 0 therefore makes sense (a zero
timeout being a nonsense value for a timer).

Testing for arbitrary non-zero values is more dubious, and forces the
backchannel to explicitly set a non-zero value. What value does that
add?
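
The zero-based test argued for above could look roughly like the following userspace sketch. It is only an illustration: struct xprt_model and both helper functions are hypothetical stand-ins, modelling just the idle_timeout field of struct rpc_xprt.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the one field of struct rpc_xprt used here. */
struct xprt_model {
	unsigned long idle_timeout;	/* disconnect timer period; 0 = none */
};

/*
 * A zero period is meaningless for a timer, so it can double as
 * "no idle-disconnect timer" without reserving a sentinel such as ~0.
 */
static bool xprt_model_has_timer(const struct xprt_model *xprt)
{
	return xprt->idle_timeout != 0;
}

/* Models a backchannel setup routine: leave the timer disabled. */
static void xprt_model_setup_backchannel(struct xprt_model *xprt)
{
	xprt->idle_timeout = 0;
}
```

With this convention a transport that never sets idle_timeout (and is zero-initialized) has no timer by default, so the backchannel does not need to know about any special value.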


--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com

2009-09-10 12:33:16

by Benny Halevy

Subject: Re: [PATCH v2 02/12] nfsd41: sunrpc: Added rpc server-side backchannel handling

On Sep. 10, 2009, 14:49 +0300, Trond Myklebust <[email protected]> wrote:
> > diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> > > index f412a85..f577e5a 100644
> > > --- a/net/sunrpc/xprt.c
> > > +++ b/net/sunrpc/xprt.c
> > > @@ -832,6 +832,11 @@ static void xprt_timer(struct rpc_task *task)
> > > spin_unlock_bh(&xprt->transport_lock);
> > > }
> > >
> > > +static inline int xprt_has_timer(struct rpc_xprt *xprt)
> > > +{
> > > + return xprt->idle_timeout != (~0);
> > > +}
>
> Why did this change again?
>
> It's a disconnect timer, and the idle_timeout sets the timeout period. A
> test for whether or not that period is 0 therefore makes sense (a zero
> timeout being a nonsense value for a timer).
>
> Testing for arbitrary non-zero values is more dubious, and forces the
> backchannel to explicitly set a non-zero value. What value does that
> add?
>

Good question. I agree with your direction.

Alexandros, why was this != 0 in PATCH 3/3 v2:
http://linux-nfs.org/pipermail/pnfs/2009-September/009057.html
but changed back to ~0 in PATCH 3/3 v2.1?
http://linux-nfs.org/pipermail/pnfs/2009-September/009059.html

With this in mind, xs_setup_bc_tcp can simply initialize idle_timeout
to zero, right?
xprt->bind_timeout = 0;
xprt->connect_timeout = 0;
xprt->reestablish_timeout = 0;
- xprt->idle_timeout = (~0);
+ xprt->idle_timeout = 0;

/*
* The backchannel uses the same socket connection as the

Benny

>
> --
> Trond Myklebust
> Linux NFS client maintainer
>
> NetApp
> [email protected]
> http://www.netapp.com
>

2009-09-10 13:19:32

by Alexandros Batsakis

Subject: Re: [pnfs] [PATCH v2 02/12] nfsd41: sunrpc: Added rpc server-side backchannel handling

Trond,

you are right. It was changed to be similar to the pre-existing code,
but there is no need. The timeout value should be set explicitly to
0 as Benny suggested.

-alexandros

On Thu, Sep 10, 2009 at 5:33 AM, Benny Halevy <[email protected]> wrote:
> On Sep. 10, 2009, 14:49 +0300, Trond Myklebust <[email protected]> wrote:
>> > diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
>> > > index f412a85..f577e5a 100644
>> > > --- a/net/sunrpc/xprt.c
>> > > +++ b/net/sunrpc/xprt.c
>> > > @@ -832,6 +832,11 @@ static void xprt_timer(struct rpc_task *task)
>> > > spin_unlock_bh(&xprt->transport_lock);
>> > > }
>> > >
>> > > +static inline int xprt_has_timer(struct rpc_xprt *xprt)
>> > > +{
>> > > + return xprt->idle_timeout != (~0);
>> > > +}
>>
>> Why did this change again?
>>
>> It's a disconnect timer, and the idle_timeout sets the timeout period. A
>> test for whether or not that period is 0 therefore makes sense (a zero
>> timeout being a nonsense value for a timer).
>>
>> Testing for arbitrary non-zero values is more dubious, and forces the
>> backchannel to explicitly set a non-zero value. What value does that
>> add?
>>
>
> Good question. I agree with your direction.
>
> Alexandros, why was this != 0 in PATCH 3/3 v2:
> http://linux-nfs.org/pipermail/pnfs/2009-September/009057.html
> but changed back to ~0 in PATCH 3/3 v2.1?
> http://linux-nfs.org/pipermail/pnfs/2009-September/009059.html
>
> With this in mind, xs_setup_bc_tcp can simply initialize idle_timeout
> to zero, right?
> xprt->bind_timeout = 0;
> xprt->connect_timeout = 0;
> xprt->reestablish_timeout = 0;
> - xprt->idle_timeout = (~0);
> + xprt->idle_timeout = 0;
>
> /*
> * The backchannel uses the same socket connection as the
>
> Benny
>
>>
>> --
>> Trond Myklebust
>> Linux NFS client maintainer
>>
>> NetApp
>> [email protected]
>> http://www.netapp.com
>>
> _______________________________________________
> pNFS mailing list
> [email protected]
> http://linux-nfs.org/cgi-bin/mailman/listinfo/pnfs
>
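
Note: the convention settled on in this exchange (a zero idle_timeout
means "no disconnect timer") can be illustrated with a small standalone
sketch. This is not the kernel code: struct model_xprt and the model_*
helpers are hypothetical stand-ins invented for illustration, and only
the field name idle_timeout and the "!= 0" test come from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the single rpc_xprt field under discussion. */
struct model_xprt {
	unsigned long idle_timeout;	/* idle period; 0 means "no timer" */
};

/* The test Trond argues for: a zero period is treated as "no timer". */
static bool model_xprt_has_timer(const struct model_xprt *xprt)
{
	assert(xprt != NULL);
	return xprt->idle_timeout != 0;
}

/* With that test, backchannel setup only has to leave the period at 0. */
static void model_setup_backchannel(struct model_xprt *xprt)
{
	assert(xprt != NULL);
	xprt->idle_timeout = 0;
}
```

With a test against zero, a transport that never sets idle_timeout is
automatically treated as having no timer, which is why no sentinel such
as ~0 has to be chosen and assigned explicitly.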

2009-09-10 14:32:59

by Benny Halevy

[permalink] [raw]
Subject: [PATCH v3 03/12] nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel

From: Alexandros Batsakis <[email protected]>

[sunrpc: change idle timeout value for the backchannel]
Signed-off-by: Alexandros Batsakis <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
---

Changes from v2:

xs_setup_bc_tcp sets xprt->idle_timeout = 0 rather than ~0

---
include/linux/sunrpc/clnt.h | 1 +
include/linux/sunrpc/xprt.h | 1 +
include/linux/sunrpc/xprtrdma.h | 5 --
include/linux/sunrpc/xprtsock.h | 9 +++-
net/sunrpc/clnt.c | 1 +
net/sunrpc/xprtsock.c | 96 ++++++++++++++++++++++++++++++++++++++-
6 files changed, 104 insertions(+), 9 deletions(-)

diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h
index 3d02558..8ed9642 100644
--- a/include/linux/sunrpc/clnt.h
+++ b/include/linux/sunrpc/clnt.h
@@ -114,6 +114,7 @@ struct rpc_create_args {
rpc_authflavor_t authflavor;
unsigned long flags;
char *client_name;
+ struct svc_xprt *bc_xprt; /* NFSv4.1 backchannel */
};

/* Values for "flags" field */
diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index 228d694..7cc42af 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -232,6 +232,7 @@ struct xprt_create {
struct sockaddr * srcaddr; /* optional local address */
struct sockaddr * dstaddr; /* remote peer address */
size_t addrlen;
+ struct svc_xprt *bc_xprt; /* NFSv4.1 backchannel */
};

struct xprt_class {
diff --git a/include/linux/sunrpc/xprtrdma.h b/include/linux/sunrpc/xprtrdma.h
index 54a379c..c2f04e1 100644
--- a/include/linux/sunrpc/xprtrdma.h
+++ b/include/linux/sunrpc/xprtrdma.h
@@ -41,11 +41,6 @@
#define _LINUX_SUNRPC_XPRTRDMA_H

/*
- * RPC transport identifier for RDMA
- */
-#define XPRT_TRANSPORT_RDMA 256
-
-/*
* rpcbind (v3+) RDMA netid.
*/
#define RPCBIND_NETID_RDMA "rdma"
diff --git a/include/linux/sunrpc/xprtsock.h b/include/linux/sunrpc/xprtsock.h
index c2a46c4..d7c98d1 100644
--- a/include/linux/sunrpc/xprtsock.h
+++ b/include/linux/sunrpc/xprtsock.h
@@ -20,8 +20,13 @@ void cleanup_socket_xprt(void);
* values. No such restriction exists for new transports, except that
* they may not collide with these values (17 and 6, respectively).
*/
-#define XPRT_TRANSPORT_UDP IPPROTO_UDP
-#define XPRT_TRANSPORT_TCP IPPROTO_TCP
+#define XPRT_TRANSPORT_BC (1 << 31)
+enum xprt_transports {
+ XPRT_TRANSPORT_UDP = IPPROTO_UDP,
+ XPRT_TRANSPORT_TCP = IPPROTO_TCP,
+ XPRT_TRANSPORT_BC_TCP = IPPROTO_TCP | XPRT_TRANSPORT_BC,
+ XPRT_TRANSPORT_RDMA = 256
+};

/*
* RPC slot table sizes for UDP, TCP transports
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index c1e467e..7389804 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -288,6 +288,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args)
.srcaddr = args->saddress,
.dstaddr = args->address,
.addrlen = args->addrsize,
+ .bc_xprt = args->bc_xprt,
};
char servername[48];

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index d9a2b81..bee4154 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -2468,11 +2468,93 @@ static struct rpc_xprt *xs_setup_tcp(struct xprt_create *args)
return ERR_PTR(-EINVAL);
}

+/**
+ * xs_setup_bc_tcp - Set up transport to use a TCP backchannel socket
+ * @args: rpc transport creation arguments
+ *
+ */
+static struct rpc_xprt *xs_setup_bc_tcp(struct xprt_create *args)
+{
+ struct sockaddr *addr = args->dstaddr;
+ struct rpc_xprt *xprt;
+ struct sock_xprt *transport;
+ struct svc_sock *bc_sock;
+
+ if (!args->bc_xprt)
+ return ERR_PTR(-EINVAL);
+
+ xprt = xs_setup_xprt(args, xprt_tcp_slot_table_entries);
+ if (IS_ERR(xprt))
+ return xprt;
+ transport = container_of(xprt, struct sock_xprt, xprt);
+
+ xprt->prot = IPPROTO_TCP;
+ xprt->tsh_size = sizeof(rpc_fraghdr) / sizeof(u32);
+ xprt->max_payload = RPC_MAX_FRAGMENT_SIZE;
+ xprt->timeout = &xs_tcp_default_timeout;
+
+ /* backchannel */
+ xprt_set_bound(xprt);
+ xprt->bind_timeout = 0;
+ xprt->connect_timeout = 0;
+ xprt->reestablish_timeout = 0;
+ xprt->idle_timeout = 0;
+
+ /*
+ * The backchannel uses the same socket connection as the
+ * forechannel
+ */
+ xprt->bc_xprt = args->bc_xprt;
+ bc_sock = container_of(args->bc_xprt, struct svc_sock, sk_xprt);
+ bc_sock->sk_bc_xprt = xprt;
+ transport->sock = bc_sock->sk_sock;
+ transport->inet = bc_sock->sk_sk;
+
+ xprt->ops = &bc_tcp_ops;
+
+ switch (addr->sa_family) {
+ case AF_INET:
+ xs_format_peer_addresses(xprt, "tcp",
+ RPCBIND_NETID_TCP);
+ break;
+ case AF_INET6:
+ xs_format_peer_addresses(xprt, "tcp",
+ RPCBIND_NETID_TCP6);
+ break;
+ default:
+ kfree(xprt);
+ return ERR_PTR(-EAFNOSUPPORT);
+ }
+
+ if (xprt_bound(xprt))
+ dprintk("RPC: set up xprt to %s (port %s) via %s\n",
+ xprt->address_strings[RPC_DISPLAY_ADDR],
+ xprt->address_strings[RPC_DISPLAY_PORT],
+ xprt->address_strings[RPC_DISPLAY_PROTO]);
+ else
+ dprintk("RPC: set up xprt to %s (autobind) via %s\n",
+ xprt->address_strings[RPC_DISPLAY_ADDR],
+ xprt->address_strings[RPC_DISPLAY_PROTO]);
+
+ /*
+ * Since we don't want connections for the backchannel, we set
+ * the xprt status to connected
+ */
+ xprt_set_connected(xprt);
+
+
+ if (try_module_get(THIS_MODULE))
+ return xprt;
+ kfree(xprt->slot);
+ kfree(xprt);
+ return ERR_PTR(-EINVAL);
+}
+
static struct xprt_class xs_udp_transport = {
.list = LIST_HEAD_INIT(xs_udp_transport.list),
.name = "udp",
.owner = THIS_MODULE,
- .ident = IPPROTO_UDP,
+ .ident = XPRT_TRANSPORT_UDP,
.setup = xs_setup_udp,
};

@@ -2480,10 +2562,18 @@ static struct xprt_class xs_tcp_transport = {
.list = LIST_HEAD_INIT(xs_tcp_transport.list),
.name = "tcp",
.owner = THIS_MODULE,
- .ident = IPPROTO_TCP,
+ .ident = XPRT_TRANSPORT_TCP,
.setup = xs_setup_tcp,
};

+static struct xprt_class xs_bc_tcp_transport = {
+ .list = LIST_HEAD_INIT(xs_bc_tcp_transport.list),
+ .name = "tcp NFSv4.1 backchannel",
+ .owner = THIS_MODULE,
+ .ident = XPRT_TRANSPORT_BC_TCP,
+ .setup = xs_setup_bc_tcp,
+};
+
/**
* init_socket_xprt - set up xprtsock's sysctls, register with RPC client
*
@@ -2497,6 +2587,7 @@ int init_socket_xprt(void)

xprt_register_transport(&xs_udp_transport);
xprt_register_transport(&xs_tcp_transport);
+ xprt_register_transport(&xs_bc_tcp_transport);

return 0;
}
@@ -2516,6 +2607,7 @@ void cleanup_socket_xprt(void)

xprt_unregister_transport(&xs_udp_transport);
xprt_unregister_transport(&xs_tcp_transport);
+ xprt_unregister_transport(&xs_bc_tcp_transport);
}

static int param_set_uint_minmax(const char *val, struct kernel_param *kp,
--
1.6.4
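
Note: the transport identifiers introduced by this patch combine an IP
protocol number with a high "backchannel" flag bit. The sketch below
shows that composition in standalone C. The MODEL_* names and model_*
helpers are hypothetical and exist only for illustration; the values 6,
17 and bit 31 are taken from the patch. The sketch uses an unsigned
shift, which is an assumption made here to keep the example free of
signed-overflow concerns, not something the patch itself does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Protocol numbers written out so the sketch needs no network headers. */
#define MODEL_IPPROTO_TCP	6u
#define MODEL_IPPROTO_UDP	17u
/* Flag bit marking a backchannel transport (unsigned shift, see note). */
#define MODEL_TRANSPORT_BC	(UINT32_C(1) << 31)

/* Build a backchannel identifier from a plain protocol number. */
static uint32_t model_bc_ident(uint32_t proto)
{
	assert((proto & MODEL_TRANSPORT_BC) == 0);	/* flag not already set */
	return proto | MODEL_TRANSPORT_BC;
}

/* True if the identifier names a backchannel transport. */
static bool model_is_backchannel(uint32_t ident)
{
	return (ident & MODEL_TRANSPORT_BC) != 0;
}

/* Strip the flag to recover the underlying protocol number. */
static uint32_t model_base_proto(uint32_t ident)
{
	return ident & ~MODEL_TRANSPORT_BC;
}
```

Because the flag occupies the top bit, the backchannel identifier cannot
collide with the existing UDP, TCP or RDMA (256) values, which is the
only constraint the header comment places on new transports.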



2009-09-10 14:36:55

by Benny Halevy

[permalink] [raw]
Subject: Re: [pnfs] [PATCH v2 0/12] nfsd41 backchannel patches for 2.6.32

Bruce, I've updated patches 2 and 3 based on Trond's comment to
PATCH v2 02/12 and squashed Alexandros fix for it.

Here's the latest list of patches:
[PATCH v2 01/12] nfsd41: sunrpc: move struct rpc_buffer def into sunrpc.h
[PATCH v3 02/12] nfsd41: sunrpc: Added rpc server-side backchannel handling
[PATCH v3 03/12] nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel
[PATCH v2 04/12] nfsd4: fix whitespace in NFSPROC4_CLNT_CB_NULL definition
[PATCH v2 05/12] nfsd41: Backchannel: callback infrastructure
[PATCH v2 06/12] nfsd41: Backchannel: Add sequence arguments to callback RPC arguments
[PATCH v2 07/12] nfsd41: Backchannel: Server backchannel RPC wait queue
[PATCH v2 08/12] nfsd41: Backchannel: Setup sequence information
[PATCH v2 09/12] nfsd41: Backchannel: cb_sequence callback
[PATCH v2 10/12] nfsd41: Backchannel: Implement cb_recall over NFSv4.1
[PATCH v2 11/12] nfsd41: modify nfsd4.1 backchannel to use new xprt class
[PATCH v2 12/12] nfsd41: Refactor create_client()

Thanks,

Benny

On Sep. 10, 2009, 12:23 +0300, Benny Halevy <[email protected]> wrote:
> Bruce,
>
> This version incorporates the latest fixes from Alexandros
> that address Trond's comments:
> http://linux-nfs.org/pipermail/pnfs/2009-September/009052.html
> http://linux-nfs.org/pipermail/pnfs/2009-September/009053.html
> http://linux-nfs.org/pipermail/pnfs/2009-September/009059.html
>
> This version introduces a new xprt class for the nfsv4.1 backchannel
> with its own setup routine to keep xs_setup_tcp clean.
>
> I also cleaned up the patches a little further by removing
> a bit of dead code and fixing checkpatch whitespace related
> warnings in Alexandros squashme patches.
>
> Benny
>
> On Sep. 04, 2009, 19:18 +0300, Benny Halevy <[email protected]> wrote:
>> Bruce,
>>
>> Here's the updated patchset implementing the nfs41 backchannel
>> for the nfs server.
>>
>> Changes from previous version:
>> - Rebase onto git://git.linux-nfs.org/~bfields/linux.git for-2.6.32
>>
>> - bc_send_request does not block on the xpt_mutex
>> but rather uses the rpc_sleep_on to wait on it.
>>
>> - nfsd4_create_session initializes unconf->cl_cb_conn.cb_addr.
>>
>> - cosmetic-only changes cleaned up.
>>
>> [PATCH 01/10] nfsd41: sunrpc: move struct rpc_buffer def into sunrpc.h
>> [PATCH 02/10] nfsd41: sunrpc: Added rpc server-side backchannel handling
>> [PATCH 03/10] nfsd4: fix whitespace in NFSPROC4_CLNT_CB_NULL definition
>> [PATCH 04/10] nfsd41: Backchannel: callback infrastructure
>> [PATCH 05/10] nfsd41: Backchannel: Add sequence arguments to callback RPC arguments
>> [PATCH 06/10] nfsd41: Backchannel: Server backchannel RPC wait queue
>> [PATCH 07/10] nfsd41: Backchannel: Setup sequence information
>> [PATCH 08/10] nfsd41: Backchannel: cb_sequence callback
>> [PATCH 09/10] nfsd41: Backchannel: Implement cb_recall over NFSv4.1
>> [PATCH 10/10] nfsd41: Refactor create_client()
>>
>> Benny

2009-09-10 14:45:25

by Benny Halevy

[permalink] [raw]
Subject: Re: [pnfs] [PATCH v2 0/12] nfsd41 backchannel patches for 2.6.32

On Sep. 10, 2009, 17:37 +0300, Benny Halevy <[email protected]> wrote:
> Bruce, I've updated patches 2 and 3 based on Trond's comment to
> PATCH v2 02/12 and squashed Alexandros fix for it.

I forgot to mention that this patchset is also available on
git://linux-nfs.org/~bhalevy/linux-pnfs.git nfsd41-for-2.6.32

Benny


2009-09-10 16:28:05

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [pnfs] [PATCH v2 0/12] nfsd41 backchannel patches for 2.6.32

On Thu, Sep 10, 2009 at 05:37:26PM +0300, Benny Halevy wrote:
> Bruce, I've updated patches 2 and 3 based on Trond's comment to
> PATCH v2 02/12 and squashed Alexandros fix for it.

Thanks! Trond, did you have any more comments?

--b.


2009-09-10 17:11:02

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [pnfs] [PATCH v2 0/12] nfsd41 backchannel patches for 2.6.32

On Thu, 2009-09-10 at 12:28 -0400, J. Bruce Fields wrote:
> On Thu, Sep 10, 2009 at 05:37:26PM +0300, Benny Halevy wrote:
> > Bruce, I've updated patches 2 and 3 based on Trond's comment to
> > PATCH v2 02/12 and squashed Alexandros fix for it.
>
> Thanks! Trond, did you have any more comments?

No.

Acked-by: Trond Myklebust <[email protected]>

--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com

2009-09-04 16:31:21

by Benny Halevy

[permalink] [raw]
Subject: [PATCH 01/10] nfsd41: sunrpc: move struct rpc_buffer def into sunrpc.h

Move struct rpc_buffer's definition into sunrpc.h, a common, internal
header file, in preparation for supporting the nfsv4.1 backchannel.

Signed-off-by: Benny Halevy <[email protected]>
[nfs41: sunrpc: #include <linux/net.h> from sunrpc.h]
Signed-off-by: Benny Halevy <[email protected]>
---
net/sunrpc/sched.c | 7 ++-----
net/sunrpc/sunrpc.h | 10 ++++++++++
2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
index 8f459ab..cef74ba 100644
--- a/net/sunrpc/sched.c
+++ b/net/sunrpc/sched.c
@@ -21,6 +21,8 @@

#include <linux/sunrpc/clnt.h>

+#include "sunrpc.h"
+
#ifdef RPC_DEBUG
#define RPCDBG_FACILITY RPCDBG_SCHED
#define RPC_TASK_MAGIC_ID 0xf00baa
@@ -711,11 +713,6 @@ static void rpc_async_schedule(struct work_struct *work)
__rpc_execute(container_of(work, struct rpc_task, u.tk_work));
}

-struct rpc_buffer {
- size_t len;
- char data[];
-};
-
/**
* rpc_malloc - allocate an RPC buffer
* @task: RPC task that will use this buffer
diff --git a/net/sunrpc/sunrpc.h b/net/sunrpc/sunrpc.h
index 5d9dd74..13171e6 100644
--- a/net/sunrpc/sunrpc.h
+++ b/net/sunrpc/sunrpc.h
@@ -27,6 +27,16 @@ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#ifndef _NET_SUNRPC_SUNRPC_H
#define _NET_SUNRPC_SUNRPC_H

+#include <linux/net.h>
+
+/*
+ * Header for dynamically allocated rpc buffers.
+ */
+struct rpc_buffer {
+ size_t len;
+ char data[];
+};
+
static inline int rpc_reply_expected(struct rpc_task *task)
{
return (task->tk_msg.rpc_proc != NULL) &&
--
1.6.4
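
Note: struct rpc_buffer is a length header followed by a flexible array
member, so one allocation holds both the header and the payload. The
userspace sketch below shows the general pattern of handing out the
payload pointer and stepping back to the header later. The model_*
functions are hypothetical; this patch only moves the struct definition,
and exactly what the kernel allocator stores in len is not shown in it,
so the sketch simply assumes len holds the payload size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Same layout as the struct moved into sunrpc.h by the patch. */
struct rpc_buffer {
	size_t len;
	char data[];
};

/* Allocate header and payload as one block; return the payload. */
static void *model_rpc_malloc(size_t size)
{
	struct rpc_buffer *buf = malloc(sizeof(*buf) + size);

	if (buf == NULL)
		return NULL;
	buf->len = size;	/* assumption: len records the payload size */
	return buf->data;
}

/* Step back from a payload pointer to the header stored before it. */
static struct rpc_buffer *model_rpc_header(void *payload)
{
	assert(payload != NULL);
	return (struct rpc_buffer *)((char *)payload -
				     offsetof(struct rpc_buffer, data));
}

/* Free the whole block given only the payload pointer. */
static void model_rpc_free(void *payload)
{
	if (payload != NULL)
		free(model_rpc_header(payload));
}
```

Any code that needs to recover the header from a payload pointer must
see the struct definition, which is presumably why it has to live in a
shared internal header once the backchannel code needs it as well.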


2009-09-04 16:31:34

by Benny Halevy

[permalink] [raw]
Subject: [PATCH 02/10] nfsd41: sunrpc: Added rpc server-side backchannel handling

From: Rahul Iyer <[email protected]>

Signed-off-by: Rahul Iyer <[email protected]>
Signed-off-by: Mike Sager <[email protected]>
Signed-off-by: Marc Eshel <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
Signed-off-by: Ricardo Labiaga <[email protected]>

When the call direction is a reply, copy the xid and call direction into the
req->rq_private_buf.head[0].iov_base otherwise rpc_verify_header returns
rpc_garbage.

Signed-off-by: Andy Adamson <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
[get rid of CONFIG_NFSD_V4_1]
Signed-off-by: Benny Halevy <[email protected]>
[sunrpc: refactoring of svc_tcp_recvfrom]
Signed-off-by: Alexandros Batsakis <[email protected]>
Signed-off-by: Ricardo Labiaga <[email protected]>
[nfsd41: sunrpc: create common send routine for the fore and the back channels]
Signed-off-by: Alexandros Batsakis <[email protected]>
Signed-off-by: Ricardo Labiaga <[email protected]>
[nfsd41: sunrpc: Use free_page() to free server backchannel pages]
Signed-off-by: Alexandros Batsakis <[email protected]>
Signed-off-by: Ricardo Labiaga <[email protected]>
[nfsd41: sunrpc: Document server backchannel locking]
Signed-off-by: Alexandros Batsakis <[email protected]>
Signed-off-by: Ricardo Labiaga <[email protected]>
[nfsd41: sunrpc: remove bc_connect_worker()]
Signed-off-by: Alexandros Batsakis <[email protected]>
Signed-off-by: Ricardo Labiaga <[email protected]>
[nfsd41: sunrpc: Define xprt_server_backchannel()]
Signed-off-by: Ricardo Labiaga <[email protected]>
[nfsd41: sunrpc: remove bc_close and bc_init_auto_disconnect dummy functions]
Signed-off-by: Alexandros Batsakis <[email protected]>
Signed-off-by: Ricardo Labiaga <[email protected]>
[nfsd41: sunrpc: eliminate unneeded switch statement in xs_setup_tcp()]
Signed-off-by: Alexandros Batsakis <[email protected]>
Signed-off-by: Ricardo Labiaga <[email protected]>
[nfsd41: sunrpc: Don't auto close the server backchannel connection]
Signed-off-by: Ricardo Labiaga <[email protected]>
[nfsd41: sunrpc: Remove unused functions]
Signed-off-by: Ricardo Labiaga <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
[nfsd41: change bc_sock to bc_xprt]
[nfsd41: sunrpc: move struct rpc_buffer def into a common header file]
Signed-off-by: Benny Halevy <[email protected]>
[nfsd41: sunrpc: use rpc_sleep in bc_send_request so not to block on mutex]
[removed cosmetic changes]
Signed-off-by: Benny Halevy <[email protected]>
Cc: Trond Myklebust <[email protected]>
---
include/linux/sunrpc/clnt.h | 1 +
include/linux/sunrpc/svc_xprt.h | 1 +
include/linux/sunrpc/svcsock.h | 1 +
include/linux/sunrpc/xprt.h | 7 ++
net/sunrpc/clnt.c | 1 +
net/sunrpc/sunrpc.h | 4 +
net/sunrpc/svc_xprt.c | 2 +
net/sunrpc/svcsock.c | 172 +++++++++++++++++++++++++++-------
net/sunrpc/xprt.c | 13 +++
net/sunrpc/xprtsock.c | 198 +++++++++++++++++++++++++++++++++++++-
10 files changed, 359 insertions(+), 41 deletions(-)

diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h
index 3d02558..8ed9642 100644
--- a/include/linux/sunrpc/clnt.h
+++ b/include/linux/sunrpc/clnt.h
@@ -114,6 +114,7 @@ struct rpc_create_args {
rpc_authflavor_t authflavor;
unsigned long flags;
char *client_name;
+ struct svc_xprt *bc_xprt; /* NFSv4.1 backchannel */
};

/* Values for "flags" field */
diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index 2223ae0..5f4e18b 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -65,6 +65,7 @@ struct svc_xprt {
size_t xpt_locallen; /* length of address */
struct sockaddr_storage xpt_remote; /* remote peer's address */
size_t xpt_remotelen; /* length of address */
+ struct rpc_wait_queue xpt_bc_pending; /* backchannel wait queue */
};

int svc_reg_xprt_class(struct svc_xprt_class *);
diff --git a/include/linux/sunrpc/svcsock.h b/include/linux/sunrpc/svcsock.h
index 04dba23..4b854e2 100644
--- a/include/linux/sunrpc/svcsock.h
+++ b/include/linux/sunrpc/svcsock.h
@@ -28,6 +28,7 @@ struct svc_sock {
/* private TCP part */
u32 sk_reclen; /* length of record */
u32 sk_tcplen; /* current read length */
+ struct rpc_xprt *sk_bc_xprt; /* NFSv4.1 backchannel xprt */
};

/*
diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index c090df4..cfad635 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -179,6 +179,7 @@ struct rpc_xprt {
spinlock_t reserve_lock; /* lock slot table */
u32 xid; /* Next XID value to use */
struct rpc_task * snd_task; /* Task blocked in send */
+ struct svc_xprt *bc_xprt; /* NFSv4.1 backchannel */
#if defined(CONFIG_NFS_V4_1)
struct svc_serv *bc_serv; /* The RPC service which will */
/* process the callback */
@@ -231,6 +232,7 @@ struct xprt_create {
struct sockaddr * srcaddr; /* optional local address */
struct sockaddr * dstaddr; /* remote peer address */
size_t addrlen;
+ struct svc_xprt *bc_xprt; /* NFSv4.1 backchannel */
};

struct xprt_class {
@@ -366,6 +368,11 @@ static inline int xprt_test_and_set_binding(struct rpc_xprt *xprt)
return test_and_set_bit(XPRT_BINDING, &xprt->state);
}

+static inline int xprt_server_backchannel(struct rpc_xprt *xprt)
+{
+ return xprt->bc_xprt != NULL;
+}
+
#endif /* __KERNEL__*/

#endif /* _LINUX_SUNRPC_XPRT_H */
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index c1e467e..7389804 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -288,6 +288,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args)
.srcaddr = args->saddress,
.dstaddr = args->address,
.addrlen = args->addrsize,
+ .bc_xprt = args->bc_xprt,
};
char servername[48];

diff --git a/net/sunrpc/sunrpc.h b/net/sunrpc/sunrpc.h
index 13171e6..90c292e 100644
--- a/net/sunrpc/sunrpc.h
+++ b/net/sunrpc/sunrpc.h
@@ -43,5 +43,9 @@ static inline int rpc_reply_expected(struct rpc_task *task)
(task->tk_msg.rpc_proc->p_decode != NULL);
}

+int svc_send_common(struct socket *sock, struct xdr_buf *xdr,
+ struct page *headpage, unsigned long headoffset,
+ struct page *tailpage, unsigned long tailoffset);
+
#endif /* _NET_SUNRPC_SUNRPC_H */

diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 912dea5..df124f7 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -160,6 +160,7 @@ void svc_xprt_init(struct svc_xprt_class *xcl, struct svc_xprt *xprt,
mutex_init(&xprt->xpt_mutex);
spin_lock_init(&xprt->xpt_lock);
set_bit(XPT_BUSY, &xprt->xpt_flags);
+ rpc_init_wait_queue(&xprt->xpt_bc_pending, "xpt_bc_pending");
}
EXPORT_SYMBOL_GPL(svc_xprt_init);

@@ -810,6 +811,7 @@ int svc_send(struct svc_rqst *rqstp)
else
len = xprt->xpt_ops->xpo_sendto(rqstp);
mutex_unlock(&xprt->xpt_mutex);
+ rpc_wake_up(&xprt->xpt_bc_pending);
svc_xprt_release(rqstp);

if (len == -ECONNREFUSED || len == -ENOTCONN || len == -EAGAIN)
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 76a380d..ccc5e83 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -49,6 +49,7 @@
#include <linux/sunrpc/msg_prot.h>
#include <linux/sunrpc/svcsock.h>
#include <linux/sunrpc/stats.h>
+#include <linux/sunrpc/xprt.h>

#define RPCDBG_FACILITY RPCDBG_SVCXPRT

@@ -153,49 +154,27 @@ static void svc_set_cmsg_data(struct svc_rqst *rqstp, struct cmsghdr *cmh)
}

/*
- * Generic sendto routine
+ * send routine intended to be shared by the fore- and back-channel
*/
-static int svc_sendto(struct svc_rqst *rqstp, struct xdr_buf *xdr)
+int svc_send_common(struct socket *sock, struct xdr_buf *xdr,
+ struct page *headpage, unsigned long headoffset,
+ struct page *tailpage, unsigned long tailoffset)
{
- struct svc_sock *svsk =
- container_of(rqstp->rq_xprt, struct svc_sock, sk_xprt);
- struct socket *sock = svsk->sk_sock;
- int slen;
- union {
- struct cmsghdr hdr;
- long all[SVC_PKTINFO_SPACE / sizeof(long)];
- } buffer;
- struct cmsghdr *cmh = &buffer.hdr;
- int len = 0;
int result;
int size;
struct page **ppage = xdr->pages;
size_t base = xdr->page_base;
unsigned int pglen = xdr->page_len;
unsigned int flags = MSG_MORE;
- RPC_IFDEBUG(char buf[RPC_MAX_ADDRBUFLEN]);
+ int slen;
+ int len = 0;

slen = xdr->len;

- if (rqstp->rq_prot == IPPROTO_UDP) {
- struct msghdr msg = {
- .msg_name = &rqstp->rq_addr,
- .msg_namelen = rqstp->rq_addrlen,
- .msg_control = cmh,
- .msg_controllen = sizeof(buffer),
- .msg_flags = MSG_MORE,
- };
-
- svc_set_cmsg_data(rqstp, cmh);
-
- if (sock_sendmsg(sock, &msg, 0) < 0)
- goto out;
- }
-
/* send head */
if (slen == xdr->head[0].iov_len)
flags = 0;
- len = kernel_sendpage(sock, rqstp->rq_respages[0], 0,
+ len = kernel_sendpage(sock, headpage, headoffset,
xdr->head[0].iov_len, flags);
if (len != xdr->head[0].iov_len)
goto out;
@@ -219,16 +198,58 @@ static int svc_sendto(struct svc_rqst *rqstp, struct xdr_buf *xdr)
base = 0;
ppage++;
}
+
/* send tail */
if (xdr->tail[0].iov_len) {
- result = kernel_sendpage(sock, rqstp->rq_respages[0],
- ((unsigned long)xdr->tail[0].iov_base)
- & (PAGE_SIZE-1),
- xdr->tail[0].iov_len, 0);
-
+ result = kernel_sendpage(sock, tailpage, tailoffset,
+ xdr->tail[0].iov_len, 0);
if (result > 0)
len += result;
}
+
+out:
+ return len;
+}
+
+
+/*
+ * Generic sendto routine
+ */
+static int svc_sendto(struct svc_rqst *rqstp, struct xdr_buf *xdr)
+{
+ struct svc_sock *svsk =
+ container_of(rqstp->rq_xprt, struct svc_sock, sk_xprt);
+ struct socket *sock = svsk->sk_sock;
+ union {
+ struct cmsghdr hdr;
+ long all[SVC_PKTINFO_SPACE / sizeof(long)];
+ } buffer;
+ struct cmsghdr *cmh = &buffer.hdr;
+ int len = 0;
+ unsigned long tailoff;
+ unsigned long headoff;
+ RPC_IFDEBUG(char buf[RPC_MAX_ADDRBUFLEN]);
+
+ if (rqstp->rq_prot == IPPROTO_UDP) {
+ struct msghdr msg = {
+ .msg_name = &rqstp->rq_addr,
+ .msg_namelen = rqstp->rq_addrlen,
+ .msg_control = cmh,
+ .msg_controllen = sizeof(buffer),
+ .msg_flags = MSG_MORE,
+ };
+
+ svc_set_cmsg_data(rqstp, cmh);
+
+ if (sock_sendmsg(sock, &msg, 0) < 0)
+ goto out;
+ }
+
+ tailoff = ((unsigned long)xdr->tail[0].iov_base) & (PAGE_SIZE-1);
+ headoff = 0;
+ len = svc_send_common(sock, xdr, rqstp->rq_respages[0], headoff,
+ rqstp->rq_respages[0], tailoff);
+
out:
dprintk("svc: socket %p sendto([%p %Zu... ], %d) = %d (addr %s)\n",
svsk, xdr->head[0].iov_base, xdr->head[0].iov_len,
@@ -951,6 +972,57 @@ static int svc_tcp_recv_record(struct svc_sock *svsk, struct svc_rqst *rqstp)
return -EAGAIN;
}

+static int svc_process_calldir(struct svc_sock *svsk, struct svc_rqst *rqstp,
+ struct rpc_rqst **reqpp, struct kvec *vec)
+{
+ struct rpc_rqst *req = NULL;
+ u32 *p;
+ u32 xid;
+ u32 calldir;
+ int len;
+
+ len = svc_recvfrom(rqstp, vec, 1, 8);
+ if (len < 0)
+ goto error;
+
+ p = (u32 *)rqstp->rq_arg.head[0].iov_base;
+ xid = *p++;
+ calldir = *p;
+
+ if (calldir == 0) {
+ /* REQUEST is the most common case */
+ vec[0] = rqstp->rq_arg.head[0];
+ } else {
+ /* REPLY */
+ if (svsk->sk_bc_xprt)
+ req = xprt_lookup_rqst(svsk->sk_bc_xprt, xid);
+
+ if (!req) {
+ printk(KERN_NOTICE
+ "%s: Got unrecognized reply: "
+ "calldir 0x%x sk_bc_xprt %p xid %08x\n",
+ __func__, ntohl(calldir),
+ svsk->sk_bc_xprt, xid);
+ vec[0] = rqstp->rq_arg.head[0];
+ goto out;
+ }
+
+ memcpy(&req->rq_private_buf, &req->rq_rcv_buf,
+ sizeof(struct xdr_buf));
+ /* copy the xid and call direction */
+ memcpy(req->rq_private_buf.head[0].iov_base,
+ rqstp->rq_arg.head[0].iov_base, 8);
+ vec[0] = req->rq_private_buf.head[0];
+ }
+ out:
+ vec[0].iov_base += 8;
+ vec[0].iov_len -= 8;
+ len = svsk->sk_reclen - 8;
+ error:
+ *reqpp = req;
+ return len;
+}
+
/*
* Receive data from a TCP socket.
*/
@@ -962,6 +1034,7 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
int len;
struct kvec *vec;
int pnum, vlen;
+ struct rpc_rqst *req = NULL;

dprintk("svc: tcp_recv %p data %d conn %d close %d\n",
svsk, test_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags),
@@ -975,9 +1048,27 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
vec = rqstp->rq_vec;
vec[0] = rqstp->rq_arg.head[0];
vlen = PAGE_SIZE;
+
+ /*
+ * We have enough data for the whole tcp record. Let's try and read the
+ * first 8 bytes to get the xid and the call direction. We can use this
+ * to figure out if this is a call or a reply to a callback. If
+ * sk_reclen is < 8 (xid and calldir), then this is a malformed packet.
+ * In that case, don't bother with the calldir and just read the data.
+ * It will be rejected in svc_process.
+ */
+ if (len >= 8) {
+ len = svc_process_calldir(svsk, rqstp, &req, vec);
+ if (len < 0)
+ goto err_again;
+ vlen -= 8;
+ }
+
pnum = 1;
while (vlen < len) {
- vec[pnum].iov_base = page_address(rqstp->rq_pages[pnum]);
+ vec[pnum].iov_base = (req) ?
+ page_address(req->rq_private_buf.pages[pnum - 1]) :
+ page_address(rqstp->rq_pages[pnum]);
vec[pnum].iov_len = PAGE_SIZE;
pnum++;
vlen += PAGE_SIZE;
@@ -989,6 +1080,16 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
if (len < 0)
goto err_again;

+ /*
+ * Account for the 8 bytes we read earlier
+ */
+ len += 8;
+
+ if (req) {
+ xprt_complete_rqst(req->rq_task, len);
+ len = 0;
+ goto out;
+ }
dprintk("svc: TCP complete record (%d bytes)\n", len);
rqstp->rq_arg.len = len;
rqstp->rq_arg.page_base = 0;
@@ -1002,6 +1103,7 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
rqstp->rq_xprt_ctxt = NULL;
rqstp->rq_prot = IPPROTO_TCP;

+out:
/* Reset TCP read info */
svsk->sk_reclen = 0;
svsk->sk_tcplen = 0;
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index f412a85..7b0cf70 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -599,6 +599,9 @@ static void xprt_autoclose(struct work_struct *work)
struct rpc_xprt *xprt =
container_of(work, struct rpc_xprt, task_cleanup);

+ if (xprt_server_backchannel(xprt))
+ return;
+
xprt->ops->close(xprt);
clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
xprt_release_write(xprt, NULL);
@@ -669,6 +672,9 @@ xprt_init_autodisconnect(unsigned long data)
{
struct rpc_xprt *xprt = (struct rpc_xprt *)data;

+ if (xprt_server_backchannel(xprt))
+ return;
+
spin_lock(&xprt->transport_lock);
if (!list_empty(&xprt->recv) || xprt->shutdown)
goto out_abort;
@@ -1103,6 +1109,13 @@ found:
dprintk("RPC: created transport %p with %u slots\n", xprt,
xprt->max_reqs);

+ /*
+ * Since we don't want connections for the backchannel, we set
+ * the xprt status to connected
+ */
+ if (args->bc_xprt)
+ xprt_set_connected(xprt);
+
return xprt;
}

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 62438f3..592681c 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -32,6 +32,7 @@
#include <linux/tcp.h>
#include <linux/sunrpc/clnt.h>
#include <linux/sunrpc/sched.h>
+#include <linux/sunrpc/svcsock.h>
#include <linux/sunrpc/xprtsock.h>
#include <linux/file.h>
#ifdef CONFIG_NFS_V4_1
@@ -43,6 +44,7 @@
#include <net/udp.h>
#include <net/tcp.h>

+#include "sunrpc.h"
/*
* xprtsock tunables
*/
@@ -2098,6 +2100,134 @@ static void xs_tcp_print_stats(struct rpc_xprt *xprt, struct seq_file *seq)
xprt->stat.bklog_u);
}

+/*
+ * Allocate a page to use as a scratch buffer for the rpc code. We allocate
+ * a page instead of doing a kmalloc like rpc_malloc because we want to use
+ * the server side send routines.
+ */
+void *bc_malloc(struct rpc_task *task, size_t size)
+{
+ struct page *page;
+ struct rpc_buffer *buf;
+
+ BUG_ON(size > PAGE_SIZE - sizeof(struct rpc_buffer));
+ page = alloc_page(GFP_KERNEL);
+
+ if (!page)
+ return NULL;
+
+ buf = page_address(page);
+ buf->len = PAGE_SIZE;
+
+ return buf->data;
+}
+
+/*
+ * Free the space allocated in the bc_malloc routine
+ */
+void bc_free(void *buffer)
+{
+ struct rpc_buffer *buf;
+
+ if (!buffer)
+ return;
+
+ buf = container_of(buffer, struct rpc_buffer, data);
+ free_page((unsigned long)buf);
+}
+
+/*
+ * Use the svc_sock to send the callback. Must be called with the svc_xprt's
+ * xpt_mutex held. Borrows heavily from svc_tcp_sendto and xs_tcp_send_request.
+ */
+static int bc_sendto(struct rpc_rqst *req)
+{
+ int len;
+ struct xdr_buf *xbufp = &req->rq_snd_buf;
+ struct rpc_xprt *xprt = req->rq_xprt;
+ struct sock_xprt *transport =
+ container_of(xprt, struct sock_xprt, xprt);
+ struct socket *sock = transport->sock;
+ unsigned long headoff;
+ unsigned long tailoff;
+
+ /*
+ * Set up the rpc header and record marker stuff
+ */
+ xs_encode_tcp_record_marker(xbufp);
+
+ tailoff = (unsigned long)xbufp->tail[0].iov_base & ~PAGE_MASK;
+ headoff = (unsigned long)xbufp->head[0].iov_base & ~PAGE_MASK;
+ len = svc_send_common(sock, xbufp,
+ virt_to_page(xbufp->head[0].iov_base), headoff,
+ xbufp->tail[0].iov_base, tailoff);
+
+ if (len != xbufp->len) {
+ printk(KERN_NOTICE "Error sending entire callback!\n");
+ len = -EAGAIN;
+ }
+
+ return len;
+}
+
+/*
+ * The send routine. Borrows from svc_send
+ */
+static int bc_send_request(struct rpc_task *task)
+{
+ struct rpc_rqst *req = task->tk_rqstp;
+ struct svc_xprt *xprt;
+ struct svc_sock *svsk;
+ u32 len;
+
+ dprintk("sending request with xid: %08x\n", ntohl(req->rq_xid));
+ /*
+ * Get the server socket associated with this callback xprt
+ */
+ xprt = req->rq_xprt->bc_xprt;
+ svsk = container_of(xprt, struct svc_sock, sk_xprt);
+
+ /*
+ * Grab the mutex to serialize data as the connection is shared
+ * with the fore channel
+ */
+ if (!mutex_trylock(&xprt->xpt_mutex)) {
+ rpc_sleep_on(&xprt->xpt_bc_pending, task, NULL);
+ if (!mutex_trylock(&xprt->xpt_mutex))
+ return -EAGAIN;
+ rpc_wake_up_queued_task(&xprt->xpt_bc_pending, task);
+ }
+ if (test_bit(XPT_DEAD, &xprt->xpt_flags))
+ len = -ENOTCONN;
+ else
+ len = bc_sendto(req);
+ mutex_unlock(&xprt->xpt_mutex);
+
+ if (len > 0)
+ len = 0;
+
+ return len;
+}
+
+/*
+ * The close routine. Since this is client initiated, we do nothing
+ */
+
+static void bc_close(struct rpc_xprt *xprt)
+{
+ return;
+}
+
+/*
+ * The xprt destroy routine. Again, because this connection is client
+ * initiated, we do nothing
+ */
+
+static void bc_destroy(struct rpc_xprt *xprt)
+{
+ return;
+}
+
static struct rpc_xprt_ops xs_udp_ops = {
.set_buffer_size = xs_udp_set_buffer_size,
.reserve_xprt = xprt_reserve_xprt_cong,
@@ -2134,6 +2264,22 @@ static struct rpc_xprt_ops xs_tcp_ops = {
.print_stats = xs_tcp_print_stats,
};

+/*
+ * The rpc_xprt_ops for the server backchannel
+ */
+
+static struct rpc_xprt_ops bc_tcp_ops = {
+ .reserve_xprt = xprt_reserve_xprt,
+ .release_xprt = xprt_release_xprt,
+ .buf_alloc = bc_malloc,
+ .buf_free = bc_free,
+ .send_request = bc_send_request,
+ .set_retrans_timeout = xprt_set_retrans_timeout_def,
+ .close = bc_close,
+ .destroy = bc_destroy,
+ .print_stats = xs_tcp_print_stats,
+};
+
static struct rpc_xprt *xs_setup_xprt(struct xprt_create *args,
unsigned int slot_table_size)
{
@@ -2272,14 +2418,46 @@ static struct rpc_xprt *xs_setup_tcp(struct xprt_create *args)
xprt->prot = IPPROTO_TCP;
xprt->tsh_size = sizeof(rpc_fraghdr) / sizeof(u32);
xprt->max_payload = RPC_MAX_FRAGMENT_SIZE;
+ xprt->timeout = &xs_tcp_default_timeout;

- xprt->bind_timeout = XS_BIND_TO;
- xprt->connect_timeout = XS_TCP_CONN_TO;
- xprt->reestablish_timeout = XS_TCP_INIT_REEST_TO;
- xprt->idle_timeout = XS_IDLE_DISC_TO;
+ if (args->bc_xprt) {
+ struct svc_sock *bc_sock;

- xprt->ops = &xs_tcp_ops;
- xprt->timeout = &xs_tcp_default_timeout;
+ /* backchannel */
+ xprt_set_bound(xprt);
+ xprt->bind_timeout = 0;
+ xprt->connect_timeout = 0;
+ xprt->reestablish_timeout = 0;
+ xprt->idle_timeout = (~0);
+
+ /*
+ * The backchannel uses the same socket connection as the
+ * forechannel
+ */
+ xprt->bc_xprt = args->bc_xprt;
+ bc_sock = container_of(args->bc_xprt, struct svc_sock, sk_xprt);
+ bc_sock->sk_bc_xprt = xprt;
+ transport->sock = bc_sock->sk_sock;
+ transport->inet = bc_sock->sk_sk;
+
+ xprt->ops = &bc_tcp_ops;
+
+ switch (addr->sa_family) {
+ case AF_INET:
+ xs_format_peer_addresses(xprt, "tcp",
+ RPCBIND_NETID_TCP);
+ break;
+ case AF_INET6:
+ xs_format_peer_addresses(xprt, "tcp",
+ RPCBIND_NETID_TCP6);
+ break;
+ default:
+ kfree(xprt);
+ return ERR_PTR(-EAFNOSUPPORT);
+ }
+
+ goto out;
+ }

switch (addr->sa_family) {
case AF_INET:
@@ -2303,6 +2481,14 @@ static struct rpc_xprt *xs_setup_tcp(struct xprt_create *args)
return ERR_PTR(-EAFNOSUPPORT);
}

+ xprt->bind_timeout = XS_BIND_TO;
+ xprt->connect_timeout = XS_TCP_CONN_TO;
+ xprt->reestablish_timeout = XS_TCP_INIT_REEST_TO;
+ xprt->idle_timeout = XS_IDLE_DISC_TO;
+
+ xprt->ops = &xs_tcp_ops;
+
+out:
if (xprt_bound(xprt))
dprintk("RPC: set up xprt to %s (port %s) via %s\n",
xprt->address_strings[RPC_DISPLAY_ADDR],
--
1.6.4
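
A note on the receive path in this patch: svc_process_calldir() peeks at the first 8 bytes of a TCP record (the xid, then the call direction) to decide whether the record is a fore channel call or a reply to a backchannel callback. The following is a minimal user-space sketch of that peek, not the kernel code. The names rpc_peek_header, struct rpc_peek and the RPC_DIR_* constants are hypothetical and do not appear in the patch. The sketch converts both words to host order, whereas the patch keeps the xid in wire order for the lookup.

```c
#include <stddef.h>
#include <stdint.h>

/* RPC message direction as carried on the wire (RFC 5531: CALL = 0, REPLY = 1). */
enum rpc_dir { RPC_DIR_CALL = 0, RPC_DIR_REPLY = 1, RPC_DIR_INVALID = 2 };

/* Hypothetical result of peeking at a record; not a kernel structure. */
struct rpc_peek {
	uint32_t xid;		/* transaction id, host byte order */
	enum rpc_dir dir;	/* call, reply, or anything else */
};

/* Read one big-endian (XDR) 32-bit word. */
static uint32_t get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Peek at the xid and call direction of an RPC record.
 * Returns 0 on success, -1 if the record is shorter than 8 bytes.
 */
static int rpc_peek_header(const unsigned char *rec, size_t len,
			   struct rpc_peek *out)
{
	uint32_t dir;

	if (len < 8)
		return -1;

	out->xid = get_be32(rec);
	dir = get_be32(rec + 4);
	if (dir == 0)
		out->dir = RPC_DIR_CALL;
	else if (dir == 1)
		out->dir = RPC_DIR_REPLY;
	else
		out->dir = RPC_DIR_INVALID;
	return 0;
}
```

A caller would hand RPC_DIR_REPLY records to the pending callback request that matches the xid and pass everything else down the normal request path. The patch itself is less strict than this sketch: it treats any non-zero call direction as a reply and falls back to the normal path when xprt_lookup_rqst() finds no match.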


2009-09-04 16:31:46

by Benny Halevy

Subject: [PATCH 03/10] nfsd4: fix whitespace in NFSPROC4_CLNT_CB_NULL definition

Signed-off-by: Benny Halevy <[email protected]>
---
fs/nfsd/nfs4callback.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 81d1c52..63bb384 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -56,7 +56,7 @@
/* Index of predefined Linux callback client operations */

enum {
- NFSPROC4_CLNT_CB_NULL = 0,
+ NFSPROC4_CLNT_CB_NULL = 0,
NFSPROC4_CLNT_CB_RECALL,
};

--
1.6.4


2009-09-04 16:32:00

by Benny Halevy

Subject: [PATCH 04/10] nfsd41: Backchannel: callback infrastructure

From: Andy Adamson <[email protected]>

Keep the xprt used for create_session in cl_cb_xprt.
Mark cl_callback.cb_minorversion = 1 and remember
the client provided cl_callback.cb_prog rpc program number.
Use it to probe the callback path.

Use the client's network address to initialize the callback's
address, as expected by the xprt creation routines.

Define xdr sizes and code nfs4_cb_compound header to be able
to send a null callback rpc.

Signed-off-by: Andy Adamson <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
Signed-off-by: Ricardo Labiaga <[email protected]>
[get callback minorversion from fore channel's]
Signed-off-by: Benny Halevy <[email protected]>
[nfsd41: change bc_sock to bc_xprt]
Signed-off-by: Benny Halevy <[email protected]>
[pulled definition for cl_cb_xprt]
Signed-off-by: Benny Halevy <[email protected]>
[nfsd41: set up backchannel's cb_addr]
Signed-off-by: Benny Halevy <[email protected]>
---
fs/nfsd/nfs4callback.c | 24 ++++++++++++++++++++++--
fs/nfsd/nfs4state.c | 14 ++++++++++++++
include/linux/nfsd/state.h | 3 +++
3 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 63bb384..f25ef3c 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -43,6 +43,7 @@
#include <linux/sunrpc/xdr.h>
#include <linux/sunrpc/svc.h>
#include <linux/sunrpc/clnt.h>
+#include <linux/sunrpc/svcsock.h>
#include <linux/nfsd/nfsd.h>
#include <linux/nfsd/state.h>
#include <linux/sunrpc/sched.h>
@@ -52,16 +53,19 @@

#define NFSPROC4_CB_NULL 0
#define NFSPROC4_CB_COMPOUND 1
+#define NFS4_STATEID_SIZE 16

/* Index of predefined Linux callback client operations */

enum {
NFSPROC4_CLNT_CB_NULL = 0,
NFSPROC4_CLNT_CB_RECALL,
+ NFSPROC4_CLNT_CB_SEQUENCE,
};

enum nfs_cb_opnum4 {
OP_CB_RECALL = 4,
+ OP_CB_SEQUENCE = 11,
};

#define NFS4_MAXTAGLEN 20
@@ -70,15 +74,22 @@ enum nfs_cb_opnum4 {
#define NFS4_dec_cb_null_sz 0
#define cb_compound_enc_hdr_sz 4
#define cb_compound_dec_hdr_sz (3 + (NFS4_MAXTAGLEN >> 2))
+#define sessionid_sz (NFS4_MAX_SESSIONID_LEN >> 2)
+#define cb_sequence_enc_sz (sessionid_sz + 4 + \
+ 1 /* no referring calls list yet */)
+#define cb_sequence_dec_sz (op_dec_sz + sessionid_sz + 4)
+
#define op_enc_sz 1
#define op_dec_sz 2
#define enc_nfs4_fh_sz (1 + (NFS4_FHSIZE >> 2))
#define enc_stateid_sz (NFS4_STATEID_SIZE >> 2)
#define NFS4_enc_cb_recall_sz (cb_compound_enc_hdr_sz + \
+ cb_sequence_enc_sz + \
1 + enc_stateid_sz + \
enc_nfs4_fh_sz)

#define NFS4_dec_cb_recall_sz (cb_compound_dec_hdr_sz + \
+ cb_sequence_dec_sz + \
op_dec_sz)

/*
@@ -137,11 +148,13 @@ xdr_error: \
} while (0)

struct nfs4_cb_compound_hdr {
- int status;
- u32 ident;
+ /* args */
+ u32 ident; /* minorversion 0 only */
u32 nops;
__be32 *nops_p;
u32 minorversion;
+ /* res */
+ int status;
u32 taglen;
char *tag;
};
@@ -399,6 +412,13 @@ int setup_callback_client(struct nfs4_client *clp)
if (!clp->cl_principal && (clp->cl_flavor >= RPC_AUTH_GSS_KRB5))
return -EINVAL;

+ if (cb->cb_minorversion)
+ args.bc_xprt = clp->cl_cb_xprt;
+
+ dprintk("%s: program %s 0x%x nrvers %u version %u minorversion %u\n",
+ __func__, args.program->name, args.prognumber,
+ args.program->nrvers, args.version, cb->cb_minorversion);
+
/* Create RPC client */
client = rpc_create(&args);
if (IS_ERR(client)) {
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 46e9ac5..e4c3223 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -706,6 +706,8 @@ static inline void
free_client(struct nfs4_client *clp)
{
shutdown_callback_client(clp);
+ if (clp->cl_cb_xprt)
+ svc_xprt_put(clp->cl_cb_xprt);
if (clp->cl_cred.cr_group_info)
put_group_info(clp->cl_cred.cr_group_info);
kfree(clp->cl_principal);
@@ -1321,6 +1323,18 @@ nfsd4_create_session(struct svc_rqst *rqstp,
cr_ses->flags &= ~SESSION4_PERSIST;
cr_ses->flags &= ~SESSION4_RDMA;

+ if (cr_ses->flags & SESSION4_BACK_CHAN) {
+ unconf->cl_cb_xprt = rqstp->rq_xprt;
+ svc_xprt_get(unconf->cl_cb_xprt);
+ rpc_copy_addr(
+ (struct sockaddr *)&unconf->cl_cb_conn.cb_addr,
+ sa);
+ unconf->cl_cb_conn.cb_addrlen = svc_addr_len(sa);
+ unconf->cl_cb_conn.cb_minorversion =
+ cstate->minorversion;
+ unconf->cl_cb_conn.cb_prog = cr_ses->callback_prog;
+ nfsd4_probe_callback(unconf);
+ }
conf = unconf;
} else {
status = nfserr_stale_clientid;
diff --git a/include/linux/nfsd/state.h b/include/linux/nfsd/state.h
index 70ef5f4..243277b 100644
--- a/include/linux/nfsd/state.h
+++ b/include/linux/nfsd/state.h
@@ -212,6 +212,9 @@ struct nfs4_client {
struct nfsd4_clid_slot cl_cs_slot; /* create_session slot */
u32 cl_exchange_flags;
struct nfs4_sessionid cl_sessionid;
+
+ /* for nfs41 callbacks */
+ struct svc_xprt *cl_cb_xprt; /* 4.1 callback transport */
};

/* struct nfs4_client_reset
--
1.6.4
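
The XDR size macros added in this patch are counted in 32-bit words, which makes the totals hard to read off the diff. The sketch below mirrors the macro arithmetic in plain C so the numbers can be checked. It is only an illustration: the values used for NFS4_MAX_SESSIONID_LEN (16) and NFS4_FHSIZE (128) are assumptions taken from kernel headers of that era and are not shown in this patch, and the upper-case names and xdr_words_to_bytes() are hypothetical.

```c
#include <stdint.h>

/* Input constants. The first two are assumptions, not shown in the patch. */
enum {
	SESSIONID_LEN = 16,	/* NFS4_MAX_SESSIONID_LEN (assumed) */
	FHSIZE = 128,		/* NFS4_FHSIZE (assumed) */
	STATEID_SIZE = 16,	/* NFS4_STATEID_SIZE, defined in the patch */
	MAXTAGLEN = 20		/* NFS4_MAXTAGLEN, from the patch context */
};

/* Same arithmetic as the macros in nfs4callback.c, in 32-bit XDR words. */
enum {
	OP_ENC_SZ = 1,
	OP_DEC_SZ = 2,
	CB_COMPOUND_ENC_HDR_SZ = 4,
	CB_COMPOUND_DEC_HDR_SZ = 3 + (MAXTAGLEN >> 2),
	SESSIONID_SZ = SESSIONID_LEN >> 2,
	CB_SEQUENCE_ENC_SZ = SESSIONID_SZ + 4 + 1,
	CB_SEQUENCE_DEC_SZ = OP_DEC_SZ + SESSIONID_SZ + 4,
	ENC_NFS4_FH_SZ = 1 + (FHSIZE >> 2),
	ENC_STATEID_SZ = STATEID_SIZE >> 2,
	CB_RECALL_ENC_SZ = CB_COMPOUND_ENC_HDR_SZ + CB_SEQUENCE_ENC_SZ +
			   1 + ENC_STATEID_SZ + ENC_NFS4_FH_SZ,
	CB_RECALL_DEC_SZ = CB_COMPOUND_DEC_HDR_SZ + CB_SEQUENCE_DEC_SZ +
			   OP_DEC_SZ
};

/* XDR words are 4 bytes each. */
static uint32_t xdr_words_to_bytes(uint32_t words)
{
	return words * 4;
}
```

Under those assumed constants the recall request budget comes to 51 words (204 bytes) and the reply budget to 20 words (80 bytes).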


2009-09-04 16:32:13

by Benny Halevy

Subject: [PATCH 05/10] nfsd41: Backchannel: Add sequence arguments to callback RPC arguments

From: Ricardo Labiaga <[email protected]>

Follow the model we use in the client. Make the sequence arguments
part of the regular RPC arguments. None of the callbacks that are
soon to be implemented expect results that need to be passed back
to the caller, so we don't define a separate RPC results structure.
For session validation, the cb_sequence decoding will use a pointer
to the sequence arguments that are part of the RPC argument.

Signed-off-by: Ricardo Labiaga <[email protected]>
[define struct nfsd4_cb_sequence here]
Signed-off-by: Benny Halevy <[email protected]>
---
fs/nfsd/nfs4callback.c | 5 +++++
include/linux/nfsd/state.h | 6 ++++++
2 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index f25ef3c..32ea3f5 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -92,6 +92,11 @@ enum nfs_cb_opnum4 {
cb_sequence_dec_sz + \
op_dec_sz)

+struct nfs4_rpc_args {
+ void *args_op;
+ struct nfsd4_cb_sequence args_seq;
+};
+
/*
* Generic encode routines from fs/nfs/nfs4xdr.c
*/
diff --git a/include/linux/nfsd/state.h b/include/linux/nfsd/state.h
index 243277b..f69ea48 100644
--- a/include/linux/nfsd/state.h
+++ b/include/linux/nfsd/state.h
@@ -60,6 +60,12 @@ typedef struct {
#define si_stateownerid si_opaque.so_stateownerid
#define si_fileid si_opaque.so_fileid

+struct nfsd4_cb_sequence {
+ /* args/res */
+ u32 cbs_minorversion;
+ struct nfs4_client *cbs_clp;
+};
+
struct nfs4_delegation {
struct list_head dl_perfile;
struct list_head dl_perclnt;
--
1.6.4
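
To make the layout described in this patch concrete: the per-operation arguments and the sequence arguments travel together in one wrapper, so the XDR encoder and decoder can reach both through a single pointer. Below is a minimal user-space sketch of that wrapper. struct cb_sequence, struct cb_rpc_args and cb_rpc_args_new() are hypothetical stand-ins for the kernel types, and calloc() stands in for a zeroing kernel allocation.

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for struct nfsd4_cb_sequence: used for both args and results. */
struct cb_sequence {
	uint32_t minorversion;
	void *client;		/* would be a struct nfs4_client pointer */
};

/* Stand-in for struct nfs4_rpc_args: op arguments plus sequence arguments. */
struct cb_rpc_args {
	void *args_op;		/* operation-specific arguments */
	struct cb_sequence args_seq;
};

/*
 * Allocate a zeroed wrapper and attach the operation arguments.
 * Returns NULL on allocation failure; the caller frees the wrapper.
 */
static struct cb_rpc_args *cb_rpc_args_new(void *op_args)
{
	struct cb_rpc_args *args = calloc(1, sizeof(*args));

	if (args)
		args->args_op = op_args;
	return args;
}
```

Because no separate results structure exists, a decoder that needs to validate the reply can simply be handed a pointer to args_seq inside the same wrapper.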


2009-09-04 16:32:26

by Benny Halevy

Subject: [PATCH 06/10] nfsd41: Backchannel: Server backchannel RPC wait queue

From: Ricardo Labiaga <[email protected]>

RPC callback requests will wait on this wait queue if the backchannel
is out of slots.

Signed-off-by: Ricardo Labiaga <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
---
fs/nfsd/nfs4state.c | 2 ++
include/linux/nfsd/state.h | 4 ++++
2 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index e4c3223..957f6e5 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -779,6 +779,8 @@ static struct nfs4_client *create_client(struct xdr_netobj name, char *recdir)
INIT_LIST_HEAD(&clp->cl_delegations);
INIT_LIST_HEAD(&clp->cl_sessions);
INIT_LIST_HEAD(&clp->cl_lru);
+ clear_bit(0, &clp->cl_cb_slot_busy);
+ rpc_init_wait_queue(&clp->cl_cb_waitq, "Backchannel slot table");
return clp;
}

diff --git a/include/linux/nfsd/state.h b/include/linux/nfsd/state.h
index f69ea48..234e9af 100644
--- a/include/linux/nfsd/state.h
+++ b/include/linux/nfsd/state.h
@@ -220,7 +220,11 @@ struct nfs4_client {
struct nfs4_sessionid cl_sessionid;

/* for nfs41 callbacks */
+ /* We currently support a single back channel with a single slot */
+ unsigned long cl_cb_slot_busy;
struct svc_xprt *cl_cb_xprt; /* 4.1 callback transport */
+ struct rpc_wait_queue cl_cb_waitq; /* backchannel callers may */
+ /* wait here for slots */
};

/* struct nfs4_client_reset
--
1.6.4


2009-09-04 16:32:39

by Benny Halevy

Subject: [PATCH 07/10] nfsd41: Backchannel: Setup sequence information

From: Ricardo Labiaga <[email protected]>

Follows the model used by the NFS client. Set up the RPC prepare and done
function pointers so that we can populate the sequence information if
minorversion == 1. rpc_run_task() is then invoked directly just like
existing NFS client operations do.

nfsd4_cb_prepare() determines if the sequence information needs to be set up.
If the slot is in use, it adds itself to the wait queue.

nfsd4_cb_done() wakes anyone sleeping on the callback channel wait queue
after our RPC reply has been received. It also sets the task message
result pointer to NULL to clearly indicate we're done using it.

Signed-off-by: Ricardo Labiaga <[email protected]>
[define and initialize cl_cb_seq_nr here]
[pulled out unused definition of nfsd4_cb_done]
Signed-off-by: Benny Halevy <[email protected]>
---
fs/nfsd/nfs4callback.c | 62 ++++++++++++++++++++++++++++++++++++++++++++
fs/nfsd/nfs4state.c | 1 +
include/linux/nfsd/state.h | 1 +
3 files changed, 64 insertions(+), 0 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 32ea3f5..da36a46 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -521,6 +521,67 @@ nfsd4_probe_callback(struct nfs4_client *clp)
do_probe_callback(clp);
}

+/*
+ * There's currently a single callback channel slot.
+ * If the slot is available, then mark it busy. Otherwise, set the
+ * thread for sleeping on the callback RPC wait queue.
+ */
+static int nfsd41_cb_setup_sequence(struct nfs4_client *clp,
+ struct rpc_task *task)
+{
+ struct nfs4_rpc_args *args = task->tk_msg.rpc_argp;
+ u32 *ptr = (u32 *)clp->cl_sessionid.data;
+ int status = 0;
+
+ dprintk("%s: %u:%u:%u:%u\n", __func__,
+ ptr[0], ptr[1], ptr[2], ptr[3]);
+
+ if (test_and_set_bit(0, &clp->cl_cb_slot_busy) != 0) {
+ rpc_sleep_on(&clp->cl_cb_waitq, task, NULL);
+ dprintk("%s slot is busy\n", __func__);
+ status = -EAGAIN;
+ goto out;
+ }
+
+ /*
+ * We'll need the clp during XDR encoding and decoding,
+ * and the sequence during decoding to verify the reply
+ */
+ args->args_seq.cbs_clp = clp;
+ task->tk_msg.rpc_resp = &args->args_seq;
+
+out:
+ dprintk("%s status=%d\n", __func__, status);
+ return status;
+}
+
+/*
+ * TODO: cb_sequence should support referring call lists, cachethis, multiple
+ * slots, and mark callback channel down on communication errors.
+ */
+static void nfsd4_cb_prepare(struct rpc_task *task, void *calldata)
+{
+ struct nfs4_delegation *dp = calldata;
+ struct nfs4_client *clp = dp->dl_client;
+ struct nfs4_rpc_args *args = task->tk_msg.rpc_argp;
+ u32 minorversion = clp->cl_cb_conn.cb_minorversion;
+ int status = 0;
+
+ args->args_seq.cbs_minorversion = minorversion;
+ if (minorversion) {
+ status = nfsd41_cb_setup_sequence(clp, task);
+ if (status) {
+ if (status != -EAGAIN) {
+ /* terminate rpc task */
+ task->tk_status = status;
+ task->tk_action = NULL;
+ }
+ return;
+ }
+ }
+ rpc_call_start(task);
+}
+
static void nfsd4_cb_recall_done(struct rpc_task *task, void *calldata)
{
struct nfs4_delegation *dp = calldata;
@@ -560,6 +621,7 @@ static void nfsd4_cb_recall_release(void *calldata)
}

static const struct rpc_call_ops nfsd4_cb_recall_ops = {
+ .rpc_call_prepare = nfsd4_cb_prepare,
.rpc_call_done = nfsd4_cb_recall_done,
.rpc_release = nfsd4_cb_recall_release,
};
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 957f6e5..b2ffa3b 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1335,6 +1335,7 @@ nfsd4_create_session(struct svc_rqst *rqstp,
unconf->cl_cb_conn.cb_minorversion =
cstate->minorversion;
unconf->cl_cb_conn.cb_prog = cr_ses->callback_prog;
+ unconf->cl_cb_seq_nr = 1;
nfsd4_probe_callback(unconf);
}
conf = unconf;
diff --git a/include/linux/nfsd/state.h b/include/linux/nfsd/state.h
index 234e9af..b621428 100644
--- a/include/linux/nfsd/state.h
+++ b/include/linux/nfsd/state.h
@@ -222,6 +222,7 @@ struct nfs4_client {
/* for nfs41 callbacks */
/* We currently support a single back channel with a single slot */
unsigned long cl_cb_slot_busy;
+ u32 cl_cb_seq_nr;
struct svc_xprt *cl_cb_xprt; /* 4.1 callback transport */
struct rpc_wait_queue cl_cb_waitq; /* backchannel callers may */
/* wait here for slots */
--
1.6.4
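
The single-slot scheme in this patch boils down to an atomic test-and-set on a busy flag plus a sequence number that starts at 1. The following user-space sketch shows that claim and release cycle with C11 atomics. It is an illustration only: struct cb_slot and the cb_slot_* helpers are hypothetical names, the wait queue is left out, and the release step mirrors what nfsd4_cb_done() is described as doing rather than any code in this patch.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* One backchannel slot: a busy flag and the next sequence number to use. */
struct cb_slot {
	atomic_flag busy;
	uint32_t seq_nr;
};

/* Mark the slot free and start the sequence at 1, as the patch does. */
static void cb_slot_init(struct cb_slot *s)
{
	atomic_flag_clear(&s->busy);
	s->seq_nr = 1;
}

/*
 * Try to take the slot. Returns true if the caller now owns it,
 * false if it was already busy (the kernel code would sleep on the
 * wait queue and report -EAGAIN in that case).
 */
static bool cb_slot_try_claim(struct cb_slot *s)
{
	return !atomic_flag_test_and_set(&s->busy);
}

/*
 * Give the slot back after the reply arrived. Only the owner touches
 * seq_nr, so the increment needs no extra locking.
 */
static void cb_slot_release(struct cb_slot *s)
{
	s->seq_nr++;
	atomic_flag_clear(&s->busy);
}
```

The design choice worth noting is that ownership of the busy flag is what serializes access to the sequence number, so a waiter that is woken must still retry the claim instead of assuming it holds the slot.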


2009-09-04 16:32:52

by Benny Halevy

Subject: [PATCH 08/10] nfsd41: Backchannel: cb_sequence callback

Implement the cb_sequence callback conforming to draft-ietf-nfsv4-minorversion1

Note: highest slot id and target highest slot id do not have to be 0
as was previously implemented. They can be greater than what the
nfs server sent if the client supports a larger slot table on the
backchannel. At this point we just ignore that.

Signed-off-by: Benny Halevy <[email protected]>
Signed-off-by: Ricardo Labiaga <[email protected]>
[Rework the back channel xdr using the shared v4.0 and v4.1 framework.]
Signed-off-by: Andy Adamson <[email protected]>
[fixed indentation]
Signed-off-by: Benny Halevy <[email protected]>
[nfsd41: use nfsd4_cb_sequence for callback minorversion]
Signed-off-by: Benny Halevy <[email protected]>
[nfsd41: fix verification of CB_SEQUENCE highest slot id]
Signed-off-by: Benny Halevy <[email protected]>
[nfsd41: Backchannel: Remove old backchannel serialization]
[nfsd41: Backchannel: First callback sequence ID should be 1]
Signed-off-by: Ricardo Labiaga <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
[nfsd41: decode_cb_sequence does not need to actually decode ignored fields]
Signed-off-by: Benny Halevy <[email protected]>
Signed-off-by: Ricardo Labiaga <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
---
fs/nfsd/nfs4callback.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 72 insertions(+), 0 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index da36a46..2282594 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -256,6 +256,27 @@ encode_cb_recall(struct xdr_stream *xdr, struct nfs4_delegation *dp,
hdr->nops++;
}

+static void
+encode_cb_sequence(struct xdr_stream *xdr, struct nfsd4_cb_sequence *args,
+ struct nfs4_cb_compound_hdr *hdr)
+{
+ __be32 *p;
+
+ if (hdr->minorversion == 0)
+ return;
+
+ RESERVE_SPACE(1 + NFS4_MAX_SESSIONID_LEN + 20);
+
+ WRITE32(OP_CB_SEQUENCE);
+ WRITEMEM(args->cbs_clp->cl_sessionid.data, NFS4_MAX_SESSIONID_LEN);
+ WRITE32(args->cbs_clp->cl_cb_seq_nr);
+ WRITE32(0); /* slotid, always 0 */
+ WRITE32(0); /* highest slotid always 0 */
+ WRITE32(0); /* cachethis always 0 */
+ WRITE32(0); /* FIXME: support referring_call_lists */
+ hdr->nops++;
+}
+
static int
nfs4_xdr_enc_cb_null(struct rpc_rqst *req, __be32 *p)
{
@@ -317,6 +338,57 @@ decode_cb_op_hdr(struct xdr_stream *xdr, enum nfs_opnum4 expected)
return 0;
}

+/*
+ * Our current back channel implementation supports a single backchannel
+ * with a single slot.
+ */
+static int
+decode_cb_sequence(struct xdr_stream *xdr, struct nfsd4_cb_sequence *res,
+ struct rpc_rqst *rqstp)
+{
+ struct nfs4_sessionid id;
+ int status;
+ u32 dummy;
+ __be32 *p;
+
+ if (res->cbs_minorversion == 0)
+ return 0;
+
+ status = decode_cb_op_hdr(xdr, OP_CB_SEQUENCE);
+ if (status)
+ return status;
+
+ /*
+ * If the client returns different values for sessionID, slotID or
+ * sequence number, the client is looney tunes.
+ */
+ status = -ESERVERFAULT;
+
+ READ_BUF(NFS4_MAX_SESSIONID_LEN + 16);
+ memcpy(id.data, p, NFS4_MAX_SESSIONID_LEN);
+ p += XDR_QUADLEN(NFS4_MAX_SESSIONID_LEN);
+ if (memcmp(id.data, res->cbs_clp->cl_sessionid.data,
+ NFS4_MAX_SESSIONID_LEN)) {
+ dprintk("%s Invalid session id\n", __func__);
+ goto out;
+ }
+ READ32(dummy);
+ if (dummy != res->cbs_clp->cl_cb_seq_nr) {
+ dprintk("%s Invalid sequence number\n", __func__);
+ goto out;
+ }
+ READ32(dummy); /* slotid must be 0 */
+ if (dummy != 0) {
+ dprintk("%s Invalid slotid\n", __func__);
+ goto out;
+ }
+ /* FIXME: process highest slotid and target highest slotid */
+ status = 0;
+out:
+ return status;
+}
+
+
static int
nfs4_xdr_dec_cb_null(struct rpc_rqst *req, __be32 *p)
{
--
1.6.4
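
For reference, the wire layout produced by encode_cb_sequence() and the checks made by decode_cb_sequence() can be sketched in user space as below. This is not the kernel XDR code: the function names are hypothetical, SESSIONID_LEN assumes NFS4_MAX_SESSIONID_LEN is 16, and the reply check starts after the operation header (opcode and status) that decode_cb_op_hdr() consumes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SESSIONID_LEN 16	/* assumed value of NFS4_MAX_SESSIONID_LEN */
#define OP_CB_SEQUENCE 11

/* Write one big-endian (XDR) word and return the advanced pointer. */
static unsigned char *put_be32(unsigned char *p, uint32_t v)
{
	p[0] = (unsigned char)(v >> 24);
	p[1] = (unsigned char)(v >> 16);
	p[2] = (unsigned char)(v >> 8);
	p[3] = (unsigned char)v;
	return p + 4;
}

/* Read one big-endian (XDR) word. */
static uint32_t get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Encode CB_SEQUENCE arguments: opcode, session id, sequence number,
 * then slotid, highest slotid, cachethis and an empty referring call
 * list, all zero. Returns bytes written, or 0 if buf is too small.
 */
static size_t encode_cb_sequence_args(unsigned char *buf, size_t buflen,
				      const unsigned char *sessionid,
				      uint32_t seq_nr)
{
	const size_t need = 4 + SESSIONID_LEN + 5 * 4;
	unsigned char *p = buf;

	if (buflen < need)
		return 0;
	p = put_be32(p, OP_CB_SEQUENCE);
	memcpy(p, sessionid, SESSIONID_LEN);
	p += SESSIONID_LEN;
	p = put_be32(p, seq_nr);
	p = put_be32(p, 0);	/* slotid */
	p = put_be32(p, 0);	/* highest slotid */
	p = put_be32(p, 0);	/* cachethis */
	p = put_be32(p, 0);	/* referring call list count */
	return (size_t)(p - buf);
}

/*
 * Check the CB_SEQUENCE reply body: session id, sequence number and
 * slotid must echo what was sent. The two trailing slot id words are
 * present but ignored. Returns 0 if the reply matches, -1 otherwise.
 */
static int check_cb_sequence_reply(const unsigned char *body, size_t len,
				   const unsigned char *sessionid,
				   uint32_t seq_nr)
{
	if (len < SESSIONID_LEN + 16)
		return -1;
	if (memcmp(body, sessionid, SESSIONID_LEN) != 0)
		return -1;
	if (get_be32(body + SESSIONID_LEN) != seq_nr)
		return -1;
	if (get_be32(body + SESSIONID_LEN + 4) != 0)
		return -1;
	return 0;
}
```

With a 16-byte session id the encoded arguments occupy 40 bytes: one opcode word, the session id, and five further words.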


2009-09-04 16:33:06

by Benny Halevy

Subject: [PATCH 09/10] nfsd41: Backchannel: Implement cb_recall over NFSv4.1

From: Ricardo Labiaga <[email protected]>

Signed-off-by: Ricardo Labiaga <[email protected]>
[nfsd41: cb_recall callback]
[Share v4.0 and v4.1 back channel xdr]
Signed-off-by: Andy Adamson <[email protected]>
Signed-off-by: Ricardo Labiaga <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
[Share v4.0 and v4.1 back channel xdr]
Signed-off-by: Andy Adamson <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
[nfsd41: use nfsd4_cb_sequence for callback minorversion]
[nfsd41: conditionally decode_sequence in nfs4_xdr_dec_cb_recall]
Signed-off-by: Benny Halevy <[email protected]>
[nfsd41: Backchannel: Add sequence arguments to callback RPC arguments]
Signed-off-by: Ricardo Labiaga <[email protected]>
[pulled-in definition of nfsd4_cb_done]
Signed-off-by: Benny Halevy <[email protected]>
---
fs/nfsd/nfs4callback.c | 52 ++++++++++++++++++++++++++++++++++++++++++++---
1 files changed, 48 insertions(+), 4 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 2282594..7295af3 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -288,15 +288,19 @@ nfs4_xdr_enc_cb_null(struct rpc_rqst *req, __be32 *p)
}

static int
-nfs4_xdr_enc_cb_recall(struct rpc_rqst *req, __be32 *p, struct nfs4_delegation *args)
+nfs4_xdr_enc_cb_recall(struct rpc_rqst *req, __be32 *p,
+ struct nfs4_rpc_args *rpc_args)
{
struct xdr_stream xdr;
+ struct nfs4_delegation *args = rpc_args->args_op;
struct nfs4_cb_compound_hdr hdr = {
.ident = args->dl_ident,
+ .minorversion = rpc_args->args_seq.cbs_minorversion,
};

xdr_init_encode(&xdr, &req->rq_snd_buf, p);
encode_cb_compound_hdr(&xdr, &hdr);
+ encode_cb_sequence(&xdr, &rpc_args->args_seq, &hdr);
encode_cb_recall(&xdr, args, &hdr);
encode_cb_nops(&hdr);
return 0;
@@ -396,7 +400,8 @@ nfs4_xdr_dec_cb_null(struct rpc_rqst *req, __be32 *p)
}

static int
-nfs4_xdr_dec_cb_recall(struct rpc_rqst *rqstp, __be32 *p)
+nfs4_xdr_dec_cb_recall(struct rpc_rqst *rqstp, __be32 *p,
+ struct nfsd4_cb_sequence *seq)
{
struct xdr_stream xdr;
struct nfs4_cb_compound_hdr hdr;
@@ -406,6 +411,11 @@ nfs4_xdr_dec_cb_recall(struct rpc_rqst *rqstp, __be32 *p)
status = decode_cb_compound_hdr(&xdr, &hdr);
if (status)
goto out;
+ if (seq) {
+ status = decode_cb_sequence(&xdr, seq, rqstp);
+ if (status)
+ goto out;
+ }
status = decode_cb_op_hdr(&xdr, OP_CB_RECALL);
out:
return status;
@@ -654,11 +664,34 @@ static void nfsd4_cb_prepare(struct rpc_task *task, void *calldata)
rpc_call_start(task);
}

+static void nfsd4_cb_done(struct rpc_task *task, void *calldata)
+{
+ struct nfs4_delegation *dp = calldata;
+ struct nfs4_client *clp = dp->dl_client;
+
+ dprintk("%s: minorversion=%d\n", __func__,
+ clp->cl_cb_conn.cb_minorversion);
+
+ if (clp->cl_cb_conn.cb_minorversion) {
+ /* No need for lock, access serialized in nfsd4_cb_prepare */
+ ++clp->cl_cb_seq_nr;
+ clear_bit(0, &clp->cl_cb_slot_busy);
+ rpc_wake_up_next(&clp->cl_cb_waitq);
+ dprintk("%s: freed slot, new seqid=%d\n", __func__,
+ clp->cl_cb_seq_nr);
+
+ /* We're done looking into the sequence information */
+ task->tk_msg.rpc_resp = NULL;
+ }
+}
+
static void nfsd4_cb_recall_done(struct rpc_task *task, void *calldata)
{
struct nfs4_delegation *dp = calldata;
struct nfs4_client *clp = dp->dl_client;

+ nfsd4_cb_done(task, calldata);
+
switch (task->tk_status) {
case -EIO:
/* Network partition? */
@@ -671,16 +704,19 @@ static void nfsd4_cb_recall_done(struct rpc_task *task, void *calldata)
break;
default:
/* success, or error we can't handle */
- return;
+ goto done;
}
if (dp->dl_retries--) {
rpc_delay(task, 2*HZ);
task->tk_status = 0;
rpc_restart_call(task);
+ return;
} else {
atomic_set(&clp->cl_cb_conn.cb_set, 0);
warn_no_callback_path(clp, task->tk_status);
}
+done:
+ kfree(task->tk_msg.rpc_argp);
}

static void nfsd4_cb_recall_release(void *calldata)
@@ -706,16 +742,24 @@ nfsd4_cb_recall(struct nfs4_delegation *dp)
{
struct nfs4_client *clp = dp->dl_client;
struct rpc_clnt *clnt = clp->cl_cb_conn.cb_client;
+ struct nfs4_rpc_args *args;
struct rpc_message msg = {
.rpc_proc = &nfs4_cb_procedures[NFSPROC4_CLNT_CB_RECALL],
- .rpc_argp = dp,
.rpc_cred = clp->cl_cb_conn.cb_cred
};
int status;

+ args = kzalloc(sizeof(*args), GFP_KERNEL);
+ if (!args) {
+ status = -ENOMEM;
+ goto out;
+ }
+ args->args_op = dp;
+ msg.rpc_argp = args;
dp->dl_retries = 1;
status = rpc_call_async(clnt, &msg, RPC_TASK_SOFT,
&nfsd4_cb_recall_ops, dp);
+out:
if (status) {
put_nfs4_client(clp);
nfs4_put_delegation(dp);
--
1.6.4
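
The completion path in nfsd4_cb_recall_done() has three outcomes, and which one is taken decides whether the argument wrapper may be freed. The sketch below restates that decision as a small pure function. It is an approximation under stated assumptions: enum cb_action and cb_recall_next() are hypothetical, and only -EIO is treated as retryable because it is the only case visible in the hunk above; the other cases in the kernel switch are elided in the diff and are not guessed at here.

```c
#include <errno.h>

/* What the completion handler should do next. */
enum cb_action {
	CB_DONE,	/* success or an error we can't handle: free the args */
	CB_RETRY,	/* restart the call: the args must stay allocated */
	CB_GIVE_UP	/* out of retries: mark the callback path down, free */
};

/*
 * Decide the next step after a recall completes. 'retries' is the
 * remaining retry budget and is decremented on each retryable failure,
 * like dl_retries in the patch.
 */
static enum cb_action cb_recall_next(int status, int *retries)
{
	if (status != -EIO)	/* only -EIO is shown as retryable above */
		return CB_DONE;
	if ((*retries)-- > 0)
		return CB_RETRY;
	return CB_GIVE_UP;
}
```

With the budget of one retry that nfsd4_cb_recall() sets, a first -EIO leads to a restart and a second one gives up. Only the CB_RETRY case returns early in the patch, which is why the kfree() of the argument wrapper sits behind the done label.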


2009-09-04 16:33:19

by Benny Halevy

Subject: [PATCH 10/10] nfsd41: Refactor create_client()

From: Ricardo Labiaga <[email protected]>

Move common initialization of 'struct nfs4_client' inside create_client().

Signed-off-by: Ricardo Labiaga <[email protected]>

[nfsd41: Remember the auth flavor to use for callbacks]
Signed-off-by: Ricardo Labiaga <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
---
fs/nfsd/nfs4state.c | 89 ++++++++++++++++++++++++++-------------------------
1 files changed, 45 insertions(+), 44 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index b2ffa3b..f344de2 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -763,27 +763,6 @@ expire_client(struct nfs4_client *clp)
put_nfs4_client(clp);
}

-static struct nfs4_client *create_client(struct xdr_netobj name, char *recdir)
-{
- struct nfs4_client *clp;
-
- clp = alloc_client(name);
- if (clp == NULL)
- return NULL;
- memcpy(clp->cl_recdir, recdir, HEXDIR_LEN);
- atomic_set(&clp->cl_count, 1);
- atomic_set(&clp->cl_cb_conn.cb_set, 0);
- INIT_LIST_HEAD(&clp->cl_idhash);
- INIT_LIST_HEAD(&clp->cl_strhash);
- INIT_LIST_HEAD(&clp->cl_openowners);
- INIT_LIST_HEAD(&clp->cl_delegations);
- INIT_LIST_HEAD(&clp->cl_sessions);
- INIT_LIST_HEAD(&clp->cl_lru);
- clear_bit(0, &clp->cl_cb_slot_busy);
- rpc_init_wait_queue(&clp->cl_cb_waitq, "Backchannel slot table");
- return clp;
-}
-
static void copy_verf(struct nfs4_client *target, nfs4_verifier *source)
{
memcpy(target->cl_verifier.data, source->data,
@@ -846,6 +825,46 @@ static void gen_confirm(struct nfs4_client *clp)
*p++ = i++;
}

+static struct nfs4_client *create_client(struct xdr_netobj name, char *recdir,
+ struct svc_rqst *rqstp, nfs4_verifier *verf)
+{
+ struct nfs4_client *clp;
+ struct sockaddr *sa = svc_addr(rqstp);
+ char *princ;
+
+ clp = alloc_client(name);
+ if (clp == NULL)
+ return NULL;
+
+ princ = svc_gss_principal(rqstp);
+ if (princ) {
+ clp->cl_principal = kstrdup(princ, GFP_KERNEL);
+ if (clp->cl_principal == NULL) {
+ free_client(clp);
+ return NULL;
+ }
+ }
+
+ memcpy(clp->cl_recdir, recdir, HEXDIR_LEN);
+ atomic_set(&clp->cl_count, 1);
+ atomic_set(&clp->cl_cb_conn.cb_set, 0);
+ INIT_LIST_HEAD(&clp->cl_idhash);
+ INIT_LIST_HEAD(&clp->cl_strhash);
+ INIT_LIST_HEAD(&clp->cl_openowners);
+ INIT_LIST_HEAD(&clp->cl_delegations);
+ INIT_LIST_HEAD(&clp->cl_sessions);
+ INIT_LIST_HEAD(&clp->cl_lru);
+ clear_bit(0, &clp->cl_cb_slot_busy);
+ rpc_init_wait_queue(&clp->cl_cb_waitq, "Backchannel slot table");
+ copy_verf(clp, verf);
+ rpc_copy_addr((struct sockaddr *) &clp->cl_addr, sa);
+ clp->cl_flavor = rqstp->rq_flavor;
+ copy_cred(&clp->cl_cred, &rqstp->rq_cred);
+ gen_confirm(clp);
+
+ return clp;
+}
+
static int check_name(struct xdr_netobj name)
{
if (name.len == 0)
@@ -1193,17 +1212,13 @@ nfsd4_exchange_id(struct svc_rqst *rqstp,

out_new:
/* Normal case */
- new = create_client(exid->clname, dname);
+ new = create_client(exid->clname, dname, rqstp, &verf);
if (new == NULL) {
status = nfserr_serverfault;
goto out;
}

- copy_verf(new, &verf);
- copy_cred(&new->cl_cred, &rqstp->rq_cred);
- rpc_copy_addr((struct sockaddr *) &new->cl_addr, sa);
gen_clid(new);
- gen_confirm(new);
add_to_unconfirmed(new, strhashval);
out_copy:
exid->clientid.cl_boot = new->cl_clientid.cl_boot;
@@ -1477,7 +1492,6 @@ nfsd4_setclientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
unsigned int strhashval;
struct nfs4_client *conf, *unconf, *new;
__be32 status;
- char *princ;
char dname[HEXDIR_LEN];

if (!check_name(clname))
@@ -1522,7 +1536,7 @@ nfsd4_setclientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
*/
if (unconf)
expire_client(unconf);
- new = create_client(clname, dname);
+ new = create_client(clname, dname, rqstp, &clverifier);
if (new == NULL)
goto out;
gen_clid(new);
@@ -1539,7 +1553,7 @@ nfsd4_setclientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
*/
expire_client(unconf);
}
- new = create_client(clname, dname);
+ new = create_client(clname, dname, rqstp, &clverifier);
if (new == NULL)
goto out;
copy_clid(new, conf);
@@ -1549,7 +1563,7 @@ nfsd4_setclientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
* probable client reboot; state will be removed if
* confirmed.
*/
- new = create_client(clname, dname);
+ new = create_client(clname, dname, rqstp, &clverifier);
if (new == NULL)
goto out;
gen_clid(new);
@@ -1560,24 +1574,11 @@ nfsd4_setclientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
* confirmed.
*/
expire_client(unconf);
- new = create_client(clname, dname);
+ new = create_client(clname, dname, rqstp, &clverifier);
if (new == NULL)
goto out;
gen_clid(new);
}
- copy_verf(new, &clverifier);
- rpc_copy_addr((struct sockaddr *) &new->cl_addr, sa);
- new->cl_flavor = rqstp->rq_flavor;
- princ = svc_gss_principal(rqstp);
- if (princ) {
- new->cl_principal = kstrdup(princ, GFP_KERNEL);
- if (new->cl_principal == NULL) {
- free_client(new);
- goto out;
- }
- }
- copy_cred(&new->cl_cred, &rqstp->rq_cred);
- gen_confirm(new);
gen_callback(new, setclid, rpc_get_scope_id(sa));
add_to_unconfirmed(new, strhashval);
setclid->se_clientid.cl_boot = new->cl_clientid.cl_boot;
--
1.6.4


2009-09-04 17:01:00

by Myklebust, Trond

Subject: Re: [PATCH 02/10] nfsd41: sunrpc: Added rpc server-side backchannel handling

On Fri, 2009-09-04 at 19:31 +0300, Benny Halevy wrote:
> From: Rahul Iyer <[email protected]>
>
> Signed-off-by: Rahul Iyer <[email protected]>
> Signed-off-by: Mike Sager <[email protected]>
> Signed-off-by: Marc Eshel <[email protected]>
> Signed-off-by: Benny Halevy <[email protected]>
> Signed-off-by: Ricardo Labiaga <[email protected]>
>
> When the call direction is a reply, copy the xid and call direction into the
> req->rq_private_buf.head[0].iov_base otherwise rpc_verify_header returns
> rpc_garbage.
>
> Signed-off-by: Andy Adamson <[email protected]>
> Signed-off-by: Benny Halevy <[email protected]>
> [get rid of CONFIG_NFSD_V4_1]
> Signed-off-by: Benny Halevy <[email protected]>
> [sunrpc: refactoring of svc_tcp_recvfrom]
> Signed-off-by: Alexandros Batsakis <[email protected]>
> Signed-off-by: Ricardo Labiaga <[email protected]>
> [nfsd41: sunrpc: create common send routine for the fore and the back channels]
> Signed-off-by: Alexandros Batsakis <[email protected]>
> Signed-off-by: Ricardo Labiaga <[email protected]>
> [nfsd41: sunrpc: Use free_page() to free server backchannel pages]
> Signed-off-by: Alexandros Batsakis <[email protected]>
> Signed-off-by: Ricardo Labiaga <[email protected]>

OK... Let's not go overboard with all the 'Signed-off-by's here... I'm
sure this ping-ponging between Alexandros and Ricardo can be
consolidated a bit. Particularly given that they work for the same
company...

> [nfsd41: sunrpc: Document server backchannel locking]
> Signed-off-by: Alexandros Batsakis <[email protected]>
> Signed-off-by: Ricardo Labiaga <[email protected]>
> [nfsd41: sunrpc: remove bc_connect_worker()]
> Signed-off-by: Alexandros Batsakis <[email protected]>
> Signed-off-by: Ricardo Labiaga <[email protected]>
> [nfsd41: sunrpc: Define xprt_server_backchannel()]
> Signed-off-by: Ricardo Labiaga <[email protected]>
> [nfsd41: sunrpc: remove bc_close and bc_init_auto_disconnect dummy functions]
> Signed-off-by: Alexandros Batsakis <[email protected]>
> Signed-off-by: Ricardo Labiaga <[email protected]>
> [nfsd41: sunrpc: eliminate unneeded switch statement in xs_setup_tcp()]
> Signed-off-by: Alexandros Batsakis <[email protected]>
> Signed-off-by: Ricardo Labiaga <[email protected]>
> [nfsd41: sunrpc: Don't auto close the server backchannel connection]
> Signed-off-by: Ricardo Labiaga <[email protected]>
> [nfsd41: sunrpc: Remove unused functions]
> Signed-off-by: Ricardo Labiaga <[email protected]>
> Signed-off-by: Benny Halevy <[email protected]>
> [nfsd41: change bc_sock to bc_xprt]
> [nfsd41: sunrpc: move struct rpc_buffer def into a common header file]
> Signed-off-by: Benny Halevy <[email protected]>
> [nfsd41: sunrpc: use rpc_sleep in bc_send_request so not to block on mutex]
> [removed cosmetic changes]
> Signed-off-by: Benny Halevy <[email protected]>
> Cc: Trond Myklebust <[email protected]>
> ---
> include/linux/sunrpc/clnt.h | 1 +
> include/linux/sunrpc/svc_xprt.h | 1 +
> include/linux/sunrpc/svcsock.h | 1 +
> include/linux/sunrpc/xprt.h | 7 ++
> net/sunrpc/clnt.c | 1 +
> net/sunrpc/sunrpc.h | 4 +
> net/sunrpc/svc_xprt.c | 2 +
> net/sunrpc/svcsock.c | 172 +++++++++++++++++++++++++++-------
> net/sunrpc/xprt.c | 13 +++
> net/sunrpc/xprtsock.c | 198 +++++++++++++++++++++++++++++++++++++-
> 10 files changed, 359 insertions(+), 41 deletions(-)
>
> diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h
> index 3d02558..8ed9642 100644
> --- a/include/linux/sunrpc/clnt.h
> +++ b/include/linux/sunrpc/clnt.h
> @@ -114,6 +114,7 @@ struct rpc_create_args {
> rpc_authflavor_t authflavor;
> unsigned long flags;
> char *client_name;
> + struct svc_xprt *bc_xprt; /* NFSv4.1 backchannel */
> };
>
> /* Values for "flags" field */
> diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
> index 2223ae0..5f4e18b 100644
> --- a/include/linux/sunrpc/svc_xprt.h
> +++ b/include/linux/sunrpc/svc_xprt.h
> @@ -65,6 +65,7 @@ struct svc_xprt {
> size_t xpt_locallen; /* length of address */
> struct sockaddr_storage xpt_remote; /* remote peer's address */
> size_t xpt_remotelen; /* length of address */
> + struct rpc_wait_queue xpt_bc_pending; /* backchannel wait queue */
> };
>
> int svc_reg_xprt_class(struct svc_xprt_class *);
> diff --git a/include/linux/sunrpc/svcsock.h b/include/linux/sunrpc/svcsock.h
> index 04dba23..4b854e2 100644
> --- a/include/linux/sunrpc/svcsock.h
> +++ b/include/linux/sunrpc/svcsock.h
> @@ -28,6 +28,7 @@ struct svc_sock {
> /* private TCP part */
> u32 sk_reclen; /* length of record */
> u32 sk_tcplen; /* current read length */
> + struct rpc_xprt *sk_bc_xprt; /* NFSv4.1 backchannel xprt */
> };
>
> /*
> diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
> index c090df4..cfad635 100644
> --- a/include/linux/sunrpc/xprt.h
> +++ b/include/linux/sunrpc/xprt.h
> @@ -179,6 +179,7 @@ struct rpc_xprt {
> spinlock_t reserve_lock; /* lock slot table */
> u32 xid; /* Next XID value to use */
> struct rpc_task * snd_task; /* Task blocked in send */
> + struct svc_xprt *bc_xprt; /* NFSv4.1 backchannel */
> #if defined(CONFIG_NFS_V4_1)
> struct svc_serv *bc_serv; /* The RPC service which will */
> /* process the callback */
> @@ -231,6 +232,7 @@ struct xprt_create {
> struct sockaddr * srcaddr; /* optional local address */
> struct sockaddr * dstaddr; /* remote peer address */
> size_t addrlen;
> + struct svc_xprt *bc_xprt; /* NFSv4.1 backchannel */
> };
>
> struct xprt_class {
> @@ -366,6 +368,11 @@ static inline int xprt_test_and_set_binding(struct rpc_xprt *xprt)
> return test_and_set_bit(XPRT_BINDING, &xprt->state);
> }
>
> +static inline int xprt_server_backchannel(struct rpc_xprt *xprt)
> +{
> + return xprt->bc_xprt != NULL;
> +}

xprt_is_server_backchannel()? When I see a function with a name like the
above, I tend to assume it will actually return a backchannel object.
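
A minimal sketch of the suggested rename, using stand-in struct definitions rather than the real kernel headers. Only the bc_xprt field and the NULL test come from the patch; the bool return type and the const qualifier are assumptions made for illustration:

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in declarations; the real ones live in the sunrpc headers. */
struct svc_xprt;                        /* opaque server-side transport */

struct rpc_xprt {
	struct svc_xprt *bc_xprt;       /* NFSv4.1 backchannel, or NULL */
};

/*
 * The "is" in the name signals a yes/no predicate, so a reader does not
 * expect a backchannel object to be returned.
 */
static inline bool xprt_is_server_backchannel(const struct rpc_xprt *xprt)
{
	return xprt->bc_xprt != NULL;
}
```

Callers would then read as "if (xprt_is_server_backchannel(xprt)) return;", which matches how the helper is used in xprt.c.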

> +
> #endif /* __KERNEL__*/
>
> #endif /* _LINUX_SUNRPC_XPRT_H */
> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
> index c1e467e..7389804 100644
> --- a/net/sunrpc/clnt.c
> +++ b/net/sunrpc/clnt.c
> @@ -288,6 +288,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args)
> .srcaddr = args->saddress,
> .dstaddr = args->address,
> .addrlen = args->addrsize,
> + .bc_xprt = args->bc_xprt,
> };
> char servername[48];
>
> diff --git a/net/sunrpc/sunrpc.h b/net/sunrpc/sunrpc.h
> index 13171e6..90c292e 100644
> --- a/net/sunrpc/sunrpc.h
> +++ b/net/sunrpc/sunrpc.h
> @@ -43,5 +43,9 @@ static inline int rpc_reply_expected(struct rpc_task *task)
> (task->tk_msg.rpc_proc->p_decode != NULL);
> }
>
> +int svc_send_common(struct socket *sock, struct xdr_buf *xdr,
> + struct page *headpage, unsigned long headoffset,
> + struct page *tailpage, unsigned long tailoffset);
> +
> #endif /* _NET_SUNRPC_SUNRPC_H */
>
> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> index 912dea5..df124f7 100644
> --- a/net/sunrpc/svc_xprt.c
> +++ b/net/sunrpc/svc_xprt.c
> @@ -160,6 +160,7 @@ void svc_xprt_init(struct svc_xprt_class *xcl, struct svc_xprt *xprt,
> mutex_init(&xprt->xpt_mutex);
> spin_lock_init(&xprt->xpt_lock);
> set_bit(XPT_BUSY, &xprt->xpt_flags);
> + rpc_init_wait_queue(&xprt->xpt_bc_pending, "xpt_bc_pending");
> }
> EXPORT_SYMBOL_GPL(svc_xprt_init);
>
> @@ -810,6 +811,7 @@ int svc_send(struct svc_rqst *rqstp)
> else
> len = xprt->xpt_ops->xpo_sendto(rqstp);
> mutex_unlock(&xprt->xpt_mutex);
> + rpc_wake_up(&xprt->xpt_bc_pending);
> svc_xprt_release(rqstp);
>
> if (len == -ECONNREFUSED || len == -ENOTCONN || len == -EAGAIN)
> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> index 76a380d..ccc5e83 100644
> --- a/net/sunrpc/svcsock.c
> +++ b/net/sunrpc/svcsock.c
> @@ -49,6 +49,7 @@
> #include <linux/sunrpc/msg_prot.h>
> #include <linux/sunrpc/svcsock.h>
> #include <linux/sunrpc/stats.h>
> +#include <linux/sunrpc/xprt.h>
>
> #define RPCDBG_FACILITY RPCDBG_SVCXPRT
>
> @@ -153,49 +154,27 @@ static void svc_set_cmsg_data(struct svc_rqst *rqstp, struct cmsghdr *cmh)
> }
>
> /*
> - * Generic sendto routine
> + * send routine intended to be shared by the fore- and back-channel
> */
> -static int svc_sendto(struct svc_rqst *rqstp, struct xdr_buf *xdr)
> +int svc_send_common(struct socket *sock, struct xdr_buf *xdr,
> + struct page *headpage, unsigned long headoffset,
> + struct page *tailpage, unsigned long tailoffset)
> {
> - struct svc_sock *svsk =
> - container_of(rqstp->rq_xprt, struct svc_sock, sk_xprt);
> - struct socket *sock = svsk->sk_sock;
> - int slen;
> - union {
> - struct cmsghdr hdr;
> - long all[SVC_PKTINFO_SPACE / sizeof(long)];
> - } buffer;
> - struct cmsghdr *cmh = &buffer.hdr;
> - int len = 0;
> int result;
> int size;
> struct page **ppage = xdr->pages;
> size_t base = xdr->page_base;
> unsigned int pglen = xdr->page_len;
> unsigned int flags = MSG_MORE;
> - RPC_IFDEBUG(char buf[RPC_MAX_ADDRBUFLEN]);
> + int slen;
> + int len = 0;
>
> slen = xdr->len;
>
> - if (rqstp->rq_prot == IPPROTO_UDP) {
> - struct msghdr msg = {
> - .msg_name = &rqstp->rq_addr,
> - .msg_namelen = rqstp->rq_addrlen,
> - .msg_control = cmh,
> - .msg_controllen = sizeof(buffer),
> - .msg_flags = MSG_MORE,
> - };
> -
> - svc_set_cmsg_data(rqstp, cmh);
> -
> - if (sock_sendmsg(sock, &msg, 0) < 0)
> - goto out;
> - }
> -
> /* send head */
> if (slen == xdr->head[0].iov_len)
> flags = 0;
> - len = kernel_sendpage(sock, rqstp->rq_respages[0], 0,
> + len = kernel_sendpage(sock, headpage, headoffset,
> xdr->head[0].iov_len, flags);
> if (len != xdr->head[0].iov_len)
> goto out;
> @@ -219,16 +198,58 @@ static int svc_sendto(struct svc_rqst *rqstp, struct xdr_buf *xdr)
> base = 0;
> ppage++;
> }
> +
> /* send tail */
> if (xdr->tail[0].iov_len) {
> - result = kernel_sendpage(sock, rqstp->rq_respages[0],
> - ((unsigned long)xdr->tail[0].iov_base)
> - & (PAGE_SIZE-1),
> - xdr->tail[0].iov_len, 0);
> -
> + result = kernel_sendpage(sock, tailpage, tailoffset,
> + xdr->tail[0].iov_len, 0);
> if (result > 0)
> len += result;
> }
> +
> +out:
> + return len;
> +}
> +
> +
> +/*
> + * Generic sendto routine
> + */
> +static int svc_sendto(struct svc_rqst *rqstp, struct xdr_buf *xdr)
> +{
> + struct svc_sock *svsk =
> + container_of(rqstp->rq_xprt, struct svc_sock, sk_xprt);
> + struct socket *sock = svsk->sk_sock;
> + union {
> + struct cmsghdr hdr;
> + long all[SVC_PKTINFO_SPACE / sizeof(long)];
> + } buffer;
> + struct cmsghdr *cmh = &buffer.hdr;
> + int len = 0;
> + unsigned long tailoff;
> + unsigned long headoff;
> + RPC_IFDEBUG(char buf[RPC_MAX_ADDRBUFLEN]);
> +
> + if (rqstp->rq_prot == IPPROTO_UDP) {
> + struct msghdr msg = {
> + .msg_name = &rqstp->rq_addr,
> + .msg_namelen = rqstp->rq_addrlen,
> + .msg_control = cmh,
> + .msg_controllen = sizeof(buffer),
> + .msg_flags = MSG_MORE,
> + };
> +
> + svc_set_cmsg_data(rqstp, cmh);
> +
> + if (sock_sendmsg(sock, &msg, 0) < 0)
> + goto out;
> + }
> +
> + tailoff = ((unsigned long)xdr->tail[0].iov_base) & (PAGE_SIZE-1);
> + headoff = 0;
> + len = svc_send_common(sock, xdr, rqstp->rq_respages[0], headoff,
> + rqstp->rq_respages[0], tailoff);
> +
> out:
> dprintk("svc: socket %p sendto([%p %Zu... ], %d) = %d (addr %s)\n",
> svsk, xdr->head[0].iov_base, xdr->head[0].iov_len,
> @@ -951,6 +972,57 @@ static int svc_tcp_recv_record(struct svc_sock *svsk, struct svc_rqst *rqstp)
> return -EAGAIN;
> }
>
> +static int svc_process_calldir(struct svc_sock *svsk, struct svc_rqst *rqstp,
> + struct rpc_rqst **reqpp, struct kvec *vec)
> +{
> + struct rpc_rqst *req = NULL;
> + u32 *p;
> + u32 xid;
> + u32 calldir;
> + int len;
> +
> + len = svc_recvfrom(rqstp, vec, 1, 8);
> + if (len < 0)
> + goto error;
> +
> + p = (u32 *)rqstp->rq_arg.head[0].iov_base;
> + xid = *p++;
> + calldir = *p;
> +
> + if (calldir == 0) {
> + /* REQUEST is the most common case */
> + vec[0] = rqstp->rq_arg.head[0];
> + } else {
> + /* REPLY */
> + if (svsk->sk_bc_xprt)
> + req = xprt_lookup_rqst(svsk->sk_bc_xprt, xid);
> +
> + if (!req) {
> + printk(KERN_NOTICE
> + "%s: Got unrecognized reply: "
> + "calldir 0x%x sk_bc_xprt %p xid %08x\n",
> + __func__, ntohl(calldir),
> + svsk->sk_bc_xprt, xid);
> + vec[0] = rqstp->rq_arg.head[0];
> + goto out;
> + }
> +
> + memcpy(&req->rq_private_buf, &req->rq_rcv_buf,
> + sizeof(struct xdr_buf));
> + /* copy the xid and call direction */
> + memcpy(req->rq_private_buf.head[0].iov_base,
> + rqstp->rq_arg.head[0].iov_base, 8);
> + vec[0] = req->rq_private_buf.head[0];
> + }
> + out:
> + vec[0].iov_base += 8;
> + vec[0].iov_len -= 8;
> + len = svsk->sk_reclen - 8;
> + error:
> + *reqpp = req;
> + return len;
> +}
> +
> /*
> * Receive data from a TCP socket.
> */
> @@ -962,6 +1034,7 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
> int len;
> struct kvec *vec;
> int pnum, vlen;
> + struct rpc_rqst *req = NULL;
>
> dprintk("svc: tcp_recv %p data %d conn %d close %d\n",
> svsk, test_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags),
> @@ -975,9 +1048,27 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
> vec = rqstp->rq_vec;
> vec[0] = rqstp->rq_arg.head[0];
> vlen = PAGE_SIZE;
> +
> + /*
> + * We have enough data for the whole tcp record. Let's try and read the
> + * first 8 bytes to get the xid and the call direction. We can use this
> + * to figure out if this is a call or a reply to a callback. If
> + * sk_reclen is < 8 (xid and calldir), then this is a malformed packet.
> + * In that case, don't bother with the calldir and just read the data.
> + * It will be rejected in svc_process.
> + */
> + if (len >= 8) {
> + len = svc_process_calldir(svsk, rqstp, &req, vec);
> + if (len < 0)
> + goto err_again;
> + vlen -= 8;
> + }
> +
> pnum = 1;
> while (vlen < len) {
> - vec[pnum].iov_base = page_address(rqstp->rq_pages[pnum]);
> + vec[pnum].iov_base = (req) ?
> + page_address(req->rq_private_buf.pages[pnum - 1]) :
> + page_address(rqstp->rq_pages[pnum]);
> vec[pnum].iov_len = PAGE_SIZE;
> pnum++;
> vlen += PAGE_SIZE;
> @@ -989,6 +1080,16 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
> if (len < 0)
> goto err_again;
>
> + /*
> + * Account for the 8 bytes we read earlier
> + */
> + len += 8;
> +
> + if (req) {
> + xprt_complete_rqst(req->rq_task, len);
> + len = 0;
> + goto out;
> + }
> dprintk("svc: TCP complete record (%d bytes)\n", len);
> rqstp->rq_arg.len = len;
> rqstp->rq_arg.page_base = 0;
> @@ -1002,6 +1103,7 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
> rqstp->rq_xprt_ctxt = NULL;
> rqstp->rq_prot = IPPROTO_TCP;
>
> +out:
> /* Reset TCP read info */
> svsk->sk_reclen = 0;
> svsk->sk_tcplen = 0;
> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> index f412a85..7b0cf70 100644
> --- a/net/sunrpc/xprt.c
> +++ b/net/sunrpc/xprt.c
> @@ -599,6 +599,9 @@ static void xprt_autoclose(struct work_struct *work)
> struct rpc_xprt *xprt =
> container_of(work, struct rpc_xprt, task_cleanup);
>
> + if (xprt_server_backchannel(xprt))
> + return;

Why do you need this? For one thing, it means that XPRT_CLOSE_WAIT never
gets cleared...
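
The concern can be illustrated with a small userspace model. This is a sketch under stated assumptions, not kernel code: model_xprt, MODEL_CLOSE_WAIT and model_autoclose() are hypothetical stand-ins for struct rpc_xprt, XPRT_CLOSE_WAIT and the patched xprt_autoclose():

```c
#include <stdbool.h>

#define MODEL_CLOSE_WAIT (1UL << 0)     /* stand-in for XPRT_CLOSE_WAIT */

struct model_xprt {
	unsigned long state;            /* bit flags, as in xprt->state */
	bool is_backchannel;            /* stand-in for bc_xprt != NULL */
	int close_calls;                /* counts calls to ->close() */
};

/* Mirrors the control flow of the patched xprt_autoclose(). */
static void model_autoclose(struct model_xprt *x)
{
	if (x->is_backchannel)
		return;                 /* MODEL_CLOSE_WAIT is still set here */

	x->close_calls++;               /* models xprt->ops->close(xprt) */
	x->state &= ~MODEL_CLOSE_WAIT;  /* models clear_bit(XPRT_CLOSE_WAIT, ...) */
}
```

In this model, a backchannel transport that enters model_autoclose() with MODEL_CLOSE_WAIT set leaves with the bit still set, whereas a forechannel transport leaves with it cleared.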

> +
> xprt->ops->close(xprt);
> clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
> xprt_release_write(xprt, NULL);
> @@ -669,6 +672,9 @@ xprt_init_autodisconnect(unsigned long data)
> {
> struct rpc_xprt *xprt = (struct rpc_xprt *)data;
>
> + if (xprt_server_backchannel(xprt))
> + return;

Hmm... Do you need this? Why would you want to set up an autodisconnect
timer in the first place?

> +
> spin_lock(&xprt->transport_lock);
> if (!list_empty(&xprt->recv) || xprt->shutdown)
> goto out_abort;
> @@ -1103,6 +1109,13 @@ found:
> dprintk("RPC: created transport %p with %u slots\n", xprt,
> xprt->max_reqs);
>
> + /*
> + * Since we don't want connections for the backchannel, we set
> + * the xprt status to connected
> + */
> + if (args->bc_xprt)
> + xprt_set_connected(xprt);
> +

Why don't you do this in the ->setup() callback?

> return xprt;
> }
>
> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> index 62438f3..592681c 100644
> --- a/net/sunrpc/xprtsock.c
> +++ b/net/sunrpc/xprtsock.c
> @@ -32,6 +32,7 @@
> #include <linux/tcp.h>
> #include <linux/sunrpc/clnt.h>
> #include <linux/sunrpc/sched.h>
> +#include <linux/sunrpc/svcsock.h>
> #include <linux/sunrpc/xprtsock.h>
> #include <linux/file.h>
> #ifdef CONFIG_NFS_V4_1
> @@ -43,6 +44,7 @@
> #include <net/udp.h>
> #include <net/tcp.h>
>
> +#include "sunrpc.h"
> /*
> * xprtsock tunables
> */
> @@ -2098,6 +2100,134 @@ static void xs_tcp_print_stats(struct rpc_xprt *xprt, struct seq_file *seq)
> xprt->stat.bklog_u);
> }
>
> +/*
> + * Allocate a bunch of pages for a scratch buffer for the rpc code. The reason
> + * we allocate pages instead of doing a kmalloc like rpc_malloc is because we want
> + * to use the server side send routines.
> + */
> +void *bc_malloc(struct rpc_task *task, size_t size)
> +{
> + struct page *page;
> + struct rpc_buffer *buf;
> +
> + BUG_ON(size > PAGE_SIZE - sizeof(struct rpc_buffer));
> + page = alloc_page(GFP_KERNEL);
> +
> + if (!page)
> + return NULL;
> +
> + buf = page_address(page);
> + buf->len = PAGE_SIZE;
> +
> + return buf->data;
> +}
> +
> +/*
> + * Free the space allocated in the bc_malloc routine
> + */
> +void bc_free(void *buffer)
> +{
> + struct rpc_buffer *buf;
> +
> + if (!buffer)
> + return;
> +
> + buf = container_of(buffer, struct rpc_buffer, data);
> + free_page((unsigned long)buf);
> +}
> +
> +/*
> + * Use the svc_sock to send the callback. Must be called with svsk->sk_mutex
> + * held. Borrows heavily from svc_tcp_sendto and xs_tcp_send_request.
> + */
> +static int bc_sendto(struct rpc_rqst *req)
> +{
> + int len;
> + struct xdr_buf *xbufp = &req->rq_snd_buf;
> + struct rpc_xprt *xprt = req->rq_xprt;
> + struct sock_xprt *transport =
> + container_of(xprt, struct sock_xprt, xprt);
> + struct socket *sock = transport->sock;
> + unsigned long headoff;
> + unsigned long tailoff;
> +
> + /*
> + * Set up the rpc header and record marker stuff
> + */
> + xs_encode_tcp_record_marker(xbufp);
> +
> + tailoff = (unsigned long)xbufp->tail[0].iov_base & ~PAGE_MASK;
> + headoff = (unsigned long)xbufp->head[0].iov_base & ~PAGE_MASK;
> + len = svc_send_common(sock, xbufp,
> + virt_to_page(xbufp->head[0].iov_base), headoff,
> + xbufp->tail[0].iov_base, tailoff);
> +
> + if (len != xbufp->len) {
> + printk(KERN_NOTICE "Error sending entire callback!\n");
> + len = -EAGAIN;
> + }
> +
> + return len;
> +}
> +
> +/*
> + * The send routine. Borrows from svc_send
> + */
> +static int bc_send_request(struct rpc_task *task)
> +{
> + struct rpc_rqst *req = task->tk_rqstp;
> + struct svc_xprt *xprt;
> + struct svc_sock *svsk;
> + u32 len;
> +
> + dprintk("sending request with xid: %08x\n", ntohl(req->rq_xid));
> + /*
> + * Get the server socket associated with this callback xprt
> + */
> + xprt = req->rq_xprt->bc_xprt;
> + svsk = container_of(xprt, struct svc_sock, sk_xprt);
> +
> + /*
> + * Grab the mutex to serialize data as the connection is shared
> + * with the fore channel
> + */
> + if (!mutex_trylock(&xprt->xpt_mutex)) {
> + rpc_sleep_on(&xprt->xpt_bc_pending, task, NULL);
> + if (!mutex_trylock(&xprt->xpt_mutex))
> + return -EAGAIN;
> + rpc_wake_up_queued_task(&xprt->xpt_bc_pending, task);
> + }
> + if (test_bit(XPT_DEAD, &xprt->xpt_flags))
> + len = -ENOTCONN;
> + else
> + len = bc_sendto(req);
> + mutex_unlock(&xprt->xpt_mutex);
> +
> + if (len > 0)
> + len = 0;
> +
> + return len;
> +}
> +
> +/*
> + * The close routine. Since this is client initiated, we do nothing
> + */
> +
> +static void bc_close(struct rpc_xprt *xprt)
> +{
> + return;
> +}
> +
> +/*
> + * The xprt destroy routine. Again, because this connection is client
> + * initiated, we do nothing
> + */
> +
> +static void bc_destroy(struct rpc_xprt *xprt)
> +{
> + return;
> +}
> +
> static struct rpc_xprt_ops xs_udp_ops = {
> .set_buffer_size = xs_udp_set_buffer_size,
> .reserve_xprt = xprt_reserve_xprt_cong,
> @@ -2134,6 +2264,22 @@ static struct rpc_xprt_ops xs_tcp_ops = {
> .print_stats = xs_tcp_print_stats,
> };
>
> +/*
> + * The rpc_xprt_ops for the server backchannel
> + */
> +
> +static struct rpc_xprt_ops bc_tcp_ops = {
> + .reserve_xprt = xprt_reserve_xprt,
> + .release_xprt = xprt_release_xprt,
> + .buf_alloc = bc_malloc,
> + .buf_free = bc_free,
> + .send_request = bc_send_request,
> + .set_retrans_timeout = xprt_set_retrans_timeout_def,
> + .close = bc_close,
> + .destroy = bc_destroy,
> + .print_stats = xs_tcp_print_stats,
> +};
> +
> static struct rpc_xprt *xs_setup_xprt(struct xprt_create *args,
> unsigned int slot_table_size)
> {
> @@ -2272,14 +2418,46 @@ static struct rpc_xprt *xs_setup_tcp(struct xprt_create *args)
> xprt->prot = IPPROTO_TCP;
> xprt->tsh_size = sizeof(rpc_fraghdr) / sizeof(u32);
> xprt->max_payload = RPC_MAX_FRAGMENT_SIZE;
> + xprt->timeout = &xs_tcp_default_timeout;
>
> - xprt->bind_timeout = XS_BIND_TO;
> - xprt->connect_timeout = XS_TCP_CONN_TO;
> - xprt->reestablish_timeout = XS_TCP_INIT_REEST_TO;
> - xprt->idle_timeout = XS_IDLE_DISC_TO;
> + if (args->bc_xprt) {
> + struct svc_sock *bc_sock;

Why are you embedding this inside the forward channel setup code? Just
make a TCP backchannel class with its own ->setup().
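
A rough userspace sketch of that structure follows. Every name prefixed with model_ is hypothetical and the timeout value is a placeholder; the point is only that each transport class carries its own setup callback, so the backchannel can mark itself bound and connected there without branching inside the forward channel setup:

```c
#include <stddef.h>

/* Stand-ins for struct rpc_xprt and struct xprt_create. */
struct model_xprt {
	int bound;
	int connected;
	unsigned long idle_timeout;
	const char *ops_name;           /* which ops table was selected */
};

struct model_create_args {
	void *bc_xprt;                  /* non-NULL for a backchannel */
};

/* Stand-in for struct xprt_class: one setup callback per class. */
struct model_xprt_class {
	const char *name;
	int (*setup)(struct model_xprt *xprt,
		     const struct model_create_args *args);
};

/* Forward channel: starts unbound and unconnected, idle timer enabled. */
static int model_setup_tcp(struct model_xprt *xprt,
			   const struct model_create_args *args)
{
	(void)args;
	xprt->bound = 0;
	xprt->connected = 0;
	xprt->idle_timeout = 300;       /* placeholder value */
	xprt->ops_name = "xs_tcp_ops";
	return 0;
}

/* Backchannel: reuses an existing connection, so it is born connected. */
static int model_setup_bc_tcp(struct model_xprt *xprt,
			      const struct model_create_args *args)
{
	if (args->bc_xprt == NULL)
		return -1;              /* a backchannel needs a svc_xprt */
	xprt->bound = 1;
	xprt->connected = 1;
	xprt->idle_timeout = ~0UL;      /* never auto-disconnect */
	xprt->ops_name = "bc_tcp_ops";
	return 0;
}

static const struct model_xprt_class model_tcp_class = {
	"tcp", model_setup_tcp
};
static const struct model_xprt_class model_bc_tcp_class = {
	"tcp-bc", model_setup_bc_tcp
};
```

With this split, the generic transport creation code needs no knowledge of bc_xprt: the caller picks the class, and the class's setup callback establishes the initial state.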

> - xprt->ops = &xs_tcp_ops;
> - xprt->timeout = &xs_tcp_default_timeout;
> + /* backchannel */
> + xprt_set_bound(xprt);
> + xprt->bind_timeout = 0;
> + xprt->connect_timeout = 0;
> + xprt->reestablish_timeout = 0;
> + xprt->idle_timeout = (~0);
> +
> + /*
> + * The backchannel uses the same socket connection as the
> + * forechannel
> + */
> + xprt->bc_xprt = args->bc_xprt;
> + bc_sock = container_of(args->bc_xprt, struct svc_sock, sk_xprt);
> + bc_sock->sk_bc_xprt = xprt;
> + transport->sock = bc_sock->sk_sock;
> + transport->inet = bc_sock->sk_sk;
> +
> + xprt->ops = &bc_tcp_ops;
> +
> + switch (addr->sa_family) {
> + case AF_INET:
> + xs_format_peer_addresses(xprt, "tcp",
> + RPCBIND_NETID_TCP);
> + break;
> + case AF_INET6:
> + xs_format_peer_addresses(xprt, "tcp",
> + RPCBIND_NETID_TCP6);
> + break;
> + default:
> + kfree(xprt);
> + return ERR_PTR(-EAFNOSUPPORT);
> + }
> +
> + goto out;
> + }
>
> switch (addr->sa_family) {
> case AF_INET:
> @@ -2303,6 +2481,14 @@ static struct rpc_xprt *xs_setup_tcp(struct xprt_create *args)
> return ERR_PTR(-EAFNOSUPPORT);
> }
>
> + xprt->bind_timeout = XS_BIND_TO;
> + xprt->connect_timeout = XS_TCP_CONN_TO;
> + xprt->reestablish_timeout = XS_TCP_INIT_REEST_TO;
> + xprt->idle_timeout = XS_IDLE_DISC_TO;
> +
> + xprt->ops = &xs_tcp_ops;
> +
> +out:
> if (xprt_bound(xprt))
> dprintk("RPC: set up xprt to %s (port %s) via %s\n",
> xprt->address_strings[RPC_DISPLAY_ADDR],

--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com