2007-05-16 19:20:49

by Greg Banks

[permalink] [raw]
Subject: [RFC,PATCH 3/14] knfsd: prepare reply per transport


Move the code at the beginning of svc_process() that sets up
page buffers for the reply, into a new sko_prepape_reply
method in svc_sock_ops.

Signed-off-by: Greg Banks <[email protected]>
Signed-off-by: Peter Leckie <[email protected]>
---

include/linux/sunrpc/svcsock.h | 6 +++
net/sunrpc/svc.c | 22 ++------------
net/sunrpc/svcsock.c | 46 ++++++++++++++++++++++++++++--
3 files changed, 54 insertions(+), 20 deletions(-)

Index: linux/net/sunrpc/svcsock.c
===================================================================
--- linux.orig/net/sunrpc/svcsock.c 2007-05-17 00:16:39.911496313 +1000
+++ linux/net/sunrpc/svcsock.c 2007-05-17 00:36:20.381217499 +1000
@@ -880,12 +880,37 @@ svc_udp_sendto(struct svc_rqst *rqstp)
return error;
}

+/*
+ * Setup response xdr_buf. Initially it has just one page.
+ */
+static int
+svc_tcpip_prepare_reply(struct svc_rqst *rqstp)
+{
+ struct kvec *resv = &rqstp->rq_res.head[0];
+
+ rqstp->rq_resused = 1;
+ resv->iov_base = page_address(rqstp->rq_respages[0]);
+ resv->iov_len = 0;
+ rqstp->rq_res.pages = rqstp->rq_respages + 1;
+ rqstp->rq_res.len = 0;
+ rqstp->rq_res.page_base = 0;
+ rqstp->rq_res.page_len = 0;
+ rqstp->rq_res.buflen = PAGE_SIZE;
+ rqstp->rq_res.tail[0].iov_base = NULL;
+ rqstp->rq_res.tail[0].iov_len = 0;
+ /* Will be turned off only in gss privacy case: */
+ rqstp->rq_sendfile_ok = 1;
+
+ return 0;
+}
+
static const struct svc_sock_ops svc_udp_ops = {
.sko_name = "udp",
.sko_recvfrom = svc_udp_recvfrom,
.sko_sendto = svc_udp_sendto,
.sko_detach = svc_tcpip_detach,
- .sko_free = svc_tcpip_free
+ .sko_free = svc_tcpip_free,
+ .sko_prepare_reply = svc_tcpip_prepare_reply
};

static void
@@ -1324,12 +1349,29 @@ svc_tcp_sendto(struct svc_rqst *rqstp)
return sent;
}

+/*
+ * Setup response xdr_buf. Initially it has just one page.
+ */
+static int
+svc_tcp_prepare_reply(struct svc_rqst *rqstp)
+{
+ struct kvec *resv = &rqstp->rq_res.head[0];
+
+ svc_tcpip_prepare_reply(rqstp);
+
+ /* tcp needs a space for the record length... */
+ svc_putnl(resv, 0);
+
+ return 0;
+}
+
static const struct svc_sock_ops svc_tcp_ops = {
.sko_name = "tcp",
.sko_recvfrom = svc_tcp_recvfrom,
.sko_sendto = svc_tcp_sendto,
.sko_detach = svc_tcpip_detach,
- .sko_free = svc_tcpip_free
+ .sko_free = svc_tcpip_free,
+ .sko_prepare_reply = svc_tcp_prepare_reply
};

static void
Index: linux/include/linux/sunrpc/svcsock.h
===================================================================
--- linux.orig/include/linux/sunrpc/svcsock.h 2007-05-17 00:12:50.074342601 +1000
+++ linux/include/linux/sunrpc/svcsock.h 2007-05-17 00:36:20.553194321 +1000
@@ -27,6 +27,12 @@ struct svc_sock_ops {
* destruction of a svc_sock.
*/
void (*sko_free)(struct svc_sock *);
+ /*
+ * Perform any transport-specific work necessary to setup
+ * the reply buffer before the reply is encoded. May
+ * fail, e.g. due to memory allocation.
+ */
+ int (*sko_prepare_reply)(struct svc_rqst *);
};

/*
Index: linux/net/sunrpc/svc.c
===================================================================
--- linux.orig/net/sunrpc/svc.c 2007-04-26 13:08:32.000000000 +1000
+++ linux/net/sunrpc/svc.c 2007-05-17 00:36:20.557193782 +1000
@@ -800,24 +800,10 @@ svc_process(struct svc_rqst *rqstp)
if (argv->iov_len < 6*4)
goto err_short_len;

- /* setup response xdr_buf.
- * Initially it has just one page
- */
- rqstp->rq_resused = 1;
- resv->iov_base = page_address(rqstp->rq_respages[0]);
- resv->iov_len = 0;
- rqstp->rq_res.pages = rqstp->rq_respages + 1;
- rqstp->rq_res.len = 0;
- rqstp->rq_res.page_base = 0;
- rqstp->rq_res.page_len = 0;
- rqstp->rq_res.buflen = PAGE_SIZE;
- rqstp->rq_res.tail[0].iov_base = NULL;
- rqstp->rq_res.tail[0].iov_len = 0;
- /* Will be turned off only in gss privacy case: */
- rqstp->rq_sendfile_ok = 1;
- /* tcp needs a space for the record length... */
- if (rqstp->rq_prot == IPPROTO_TCP)
- svc_putnl(resv, 0);
+ /* setup response xdr_buf. */
+ if (rqstp->rq_sock->sk_ops->sko_prepare_reply &&
+ rqstp->rq_sock->sk_ops->sko_prepare_reply(rqstp))
+ goto dropit;

rqstp->rq_xid = svc_getu32(argv);
svc_putu32(resv, rqstp->rq_xid);
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
Apparently, I'm Bedevere. Which MPHG character are you?
I don't speak for SGI.

-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs
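
The dispatch introduced by this patch can be modelled in user space. The
following is a minimal sketch and not kernel code: every model_* name is
hypothetical, struct model_rqst is a heavily reduced stand-in for
struct svc_rqst, and model_putnl() only imitates what svc_putnl() does
(append a 32-bit big-endian word to the reply head). It shows the optional
per-transport hook, the TCP variant building on the common one, and the
caller dropping the request when the hook fails.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096	/* stand-in for PAGE_SIZE */

struct model_rqst;

/* Hypothetical, reduced analogue of struct svc_sock_ops. */
struct model_ops {
	const char *name;
	/* Optional (may be NULL); returns nonzero on failure. */
	int (*prepare_reply)(struct model_rqst *);
};

/* Hypothetical, reduced analogue of the reply state in struct svc_rqst. */
struct model_rqst {
	unsigned char page[MODEL_PAGE_SIZE];	/* first reply page */
	unsigned char *head_base;		/* like rq_res.head[0].iov_base */
	size_t head_len;			/* like rq_res.head[0].iov_len */
	size_t buflen;				/* like rq_res.buflen */
	const struct model_ops *ops;
};

/* Append a 32-bit value in network byte order to the reply head. */
static void model_putnl(struct model_rqst *rq, uint32_t val)
{
	assert(rq->head_len + 4 <= rq->buflen);
	unsigned char *p = rq->head_base + rq->head_len;

	p[0] = (unsigned char)(val >> 24);
	p[1] = (unsigned char)(val >> 16);
	p[2] = (unsigned char)(val >> 8);
	p[3] = (unsigned char)val;
	rq->head_len += 4;
}

/* Common setup: the reply starts out empty in a single page. */
static int model_tcpip_prepare_reply(struct model_rqst *rq)
{
	rq->head_base = rq->page;
	rq->head_len = 0;
	rq->buflen = MODEL_PAGE_SIZE;
	return 0;
}

/* TCP adds a 4-byte placeholder for the record length. */
static int model_tcp_prepare_reply(struct model_rqst *rq)
{
	int err = model_tcpip_prepare_reply(rq);

	if (err)
		return err;
	model_putnl(rq, 0);
	return 0;
}

static const struct model_ops model_udp_ops = {
	.name = "udp",
	.prepare_reply = model_tcpip_prepare_reply,
};

static const struct model_ops model_tcp_ops = {
	.name = "tcp",
	.prepare_reply = model_tcp_prepare_reply,
};

/* Caller side: returns 0 to continue, -1 for the "goto dropit" case. */
static int model_process(struct model_rqst *rq, uint32_t xid)
{
	if (rq->ops->prepare_reply && rq->ops->prepare_reply(rq))
		return -1;
	model_putnl(rq, xid);	/* the XID follows whatever the hook reserved */
	return 0;
}
```

In this model the XID lands at offset 0 for the UDP ops and at offset 4 for
the TCP ops, which is the only transport-visible difference at this stage.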


2007-05-16 20:53:51

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [RFC,PATCH 3/14] knfsd: prepare reply per transport

On Thu, May 17, 2007 at 05:20:47AM +1000, Greg Banks wrote:
>
> Move the code at the beginning of svc_process() that sets up
> page buffers for the reply, into a new sko_prepape_reply

s/prepape/prepare/.

--b.


2007-05-16 21:30:13

by Tom Tucker

[permalink] [raw]
Subject: Re: [RFC,PATCH 3/14] knfsd: prepare reply per transport


Greg:

I like this patch organization. I'll replicate this in the integrated
tree...

See comment below.

On 5/16/07 2:20 PM, "Greg Banks" <[email protected]> wrote:

>
> Move the code at the beginning of svc_process() that sets up
> page buffers for the reply, into a new sko_prepape_reply
> method in svc_sock_ops.
>
> Signed-off-by: Greg Banks <[email protected]>
> Signed-off-by: Peter Leckie <[email protected]>
> ---
>
> include/linux/sunrpc/svcsock.h | 6 +++
> net/sunrpc/svc.c | 22 ++------------
> net/sunrpc/svcsock.c | 46 ++++++++++++++++++++++++++++--
> 3 files changed, 54 insertions(+), 20 deletions(-)
>
> Index: linux/net/sunrpc/svcsock.c
> ===================================================================
> --- linux.orig/net/sunrpc/svcsock.c 2007-05-17 00:16:39.911496313 +1000
> +++ linux/net/sunrpc/svcsock.c 2007-05-17 00:36:20.381217499 +1000
> @@ -880,12 +880,37 @@ svc_udp_sendto(struct svc_rqst *rqstp)
> return error;
> }
>
> +/*
> + * Setup response xdr_buf. Initially it has just one page.
> + */
> +static int
> +svc_tcpip_prepare_reply(struct svc_rqst *rqstp)
> +{
> + struct kvec *resv = &rqstp->rq_res.head[0];
> +
> + rqstp->rq_resused = 1;
> + resv->iov_base = page_address(rqstp->rq_respages[0]);
> + resv->iov_len = 0;
> + rqstp->rq_res.pages = rqstp->rq_respages + 1;
> + rqstp->rq_res.len = 0;
> + rqstp->rq_res.page_base = 0;
> + rqstp->rq_res.page_len = 0;
> + rqstp->rq_res.buflen = PAGE_SIZE;
> + rqstp->rq_res.tail[0].iov_base = NULL;
> + rqstp->rq_res.tail[0].iov_len = 0;
> + /* Will be turned off only in gss privacy case: */
> + rqstp->rq_sendfile_ok = 1;

I think this belongs in the svc_process logic. It doesn't have anything to
do with the buffer, but rather whether or not GSS is turned on.

> +
> + return 0;
> +}
> +
> static const struct svc_sock_ops svc_udp_ops = {
> .sko_name = "udp",
> .sko_recvfrom = svc_udp_recvfrom,
> .sko_sendto = svc_udp_sendto,
> .sko_detach = svc_tcpip_detach,
> - .sko_free = svc_tcpip_free
> + .sko_free = svc_tcpip_free,
> + .sko_prepare_reply = svc_tcpip_prepare_reply
> };
>
> static void
> @@ -1324,12 +1349,29 @@ svc_tcp_sendto(struct svc_rqst *rqstp)
> return sent;
> }
>
> +/*
> + * Setup response xdr_buf. Initially it has just one page.
> + */
> +static int
> +svc_tcp_prepare_reply(struct svc_rqst *rqstp)
> +{
> + struct kvec *resv = &rqstp->rq_res.head[0];
> +
> + svc_tcpip_prepare_reply(rqstp);
> +
> + /* tcp needs a space for the record length... */
> + svc_putnl(resv, 0);
> +
> + return 0;
> +}
> +
> static const struct svc_sock_ops svc_tcp_ops = {
> .sko_name = "tcp",
> .sko_recvfrom = svc_tcp_recvfrom,
> .sko_sendto = svc_tcp_sendto,
> .sko_detach = svc_tcpip_detach,
> - .sko_free = svc_tcpip_free
> + .sko_free = svc_tcpip_free,
> + .sko_prepare_reply = svc_tcp_prepare_reply
> };
>
> static void
> Index: linux/include/linux/sunrpc/svcsock.h
> ===================================================================
> --- linux.orig/include/linux/sunrpc/svcsock.h 2007-05-17 00:12:50.074342601 +1000
> +++ linux/include/linux/sunrpc/svcsock.h 2007-05-17 00:36:20.553194321 +1000
> @@ -27,6 +27,12 @@ struct svc_sock_ops {
> * destruction of a svc_sock.
> */
> void (*sko_free)(struct svc_sock *);
> + /*
> + * Perform any transport-specific work necessary to setup
> + * the reply buffer before the reply is encoded. May
> + * fail, e.g. due to memory allocation.
> + */
> + int (*sko_prepare_reply)(struct svc_rqst *);
> };
>
> /*
> Index: linux/net/sunrpc/svc.c
> ===================================================================
> --- linux.orig/net/sunrpc/svc.c 2007-04-26 13:08:32.000000000 +1000
> +++ linux/net/sunrpc/svc.c 2007-05-17 00:36:20.557193782 +1000
> @@ -800,24 +800,10 @@ svc_process(struct svc_rqst *rqstp)
> if (argv->iov_len < 6*4)
> goto err_short_len;
>
> - /* setup response xdr_buf.
> - * Initially it has just one page
> - */
> - rqstp->rq_resused = 1;
> - resv->iov_base = page_address(rqstp->rq_respages[0]);
> - resv->iov_len = 0;
> - rqstp->rq_res.pages = rqstp->rq_respages + 1;
> - rqstp->rq_res.len = 0;
> - rqstp->rq_res.page_base = 0;
> - rqstp->rq_res.page_len = 0;
> - rqstp->rq_res.buflen = PAGE_SIZE;
> - rqstp->rq_res.tail[0].iov_base = NULL;
> - rqstp->rq_res.tail[0].iov_len = 0;
> - /* Will be turned off only in gss privacy case: */
> - rqstp->rq_sendfile_ok = 1;
> - /* tcp needs a space for the record length... */
> - if (rqstp->rq_prot == IPPROTO_TCP)
> - svc_putnl(resv, 0);
> + /* setup response xdr_buf. */
> + if (rqstp->rq_sock->sk_ops->sko_prepare_reply &&
> + rqstp->rq_sock->sk_ops->sko_prepare_reply(rqstp))
> + goto dropit;
>
> rqstp->rq_xid = svc_getu32(argv);
> svc_putu32(resv, rqstp->rq_xid);




2007-05-17 07:01:17

by Greg Banks

[permalink] [raw]
Subject: Re: [RFC,PATCH 3/14] knfsd: prepare reply per transport

On Wed, May 16, 2007 at 04:53:46PM -0400, J. Bruce Fields wrote:
> On Thu, May 17, 2007 at 05:20:47AM +1000, Greg Banks wrote:
> >
> > Move the code at the beginning of svc_process() that sets up
> > page buffers for the reply, into a new sko_prepape_reply
>
> s/prepape/prepare/.

Fixed, thanks.

Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
Apparently, I'm Bedevere. Which MPHG character are you?
I don't speak for SGI.


2007-05-17 07:53:46

by Greg Banks

[permalink] [raw]
Subject: Re: [RFC,PATCH 3/14] knfsd: prepare reply per transport

On Wed, May 16, 2007 at 04:35:07PM -0500, Tom Tucker wrote:
>
> Greg:
>
> I like this patch organization. I'll replicate this in the integrated
> tree...

Great.

> > + /* Will be turned off only in gss privacy case: */
> > + rqstp->rq_sendfile_ok = 1;
>
> I think this belongs in the svc_process logic. It doesn't have anything to
> do with the buffer, but rather whether or not GSS is turned on.

Fixed.

I'll push a revised patchset including feedback from yourself
and Bruce, to you only, in a few minutes.

Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
Apparently, I'm Bedevere. Which MPHG character are you?
I don't speak for SGI.
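
The revised patch is not posted in this thread, so the following is only a
guess at the split Tom asked for, written as a user-space sketch. All
sketch_* names are hypothetical; sendfile_ok stands in for rq_sendfile_ok
and reply_len for the reply buffer state. The transport hook touches buffer
state only, and the generic path sets the sendfile flag for every transport
after the hook has succeeded.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced request: just enough state to show the split. */
struct sketch_rqst {
	size_t reply_len;	/* stands in for the reply xdr_buf state */
	bool sendfile_ok;	/* stands in for rq_sendfile_ok */
	/* Transport hook; may be NULL; returns nonzero on failure. */
	int (*prepare_reply)(struct sketch_rqst *);
};

/* Transport hook: resets buffer state and nothing else. */
static int sketch_prepare_reply(struct sketch_rqst *rq)
{
	rq->reply_len = 0;
	return 0;
}

/* A transport whose setup fails, e.g. because an allocation failed. */
static int sketch_failing_prepare_reply(struct sketch_rqst *rq)
{
	(void)rq;
	return -1;
}

/* Generic path: returns 0 to proceed, -1 to drop the request. */
static int sketch_process(struct sketch_rqst *rq)
{
	if (rq->prepare_reply && rq->prepare_reply(rq))
		return -1;
	/*
	 * Not a buffer property: set here for every transport, so that
	 * only the GSS privacy path needs to clear it later.
	 */
	rq->sendfile_ok = true;
	return 0;
}
```

With this arrangement a new transport cannot forget to set the flag, because
its hook never sees it.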


2007-05-17 09:16:52

by Iyer, Rahul

[permalink] [raw]
Subject: Re: [RFC,PATCH 3/14] knfsd: prepare reply per transport

Hi Greg,
I like the idea of a transport switch for the server side. The way I see
it, it's not just the server, it's also the "callback server" on the
client that can benefit.

I had a question. Maybe it's a dumb question, and I may be way off base,
but I don't see why we need 2 transport switches - one for the client,
one for the server? Why not have one?
I've written some code to do NFSv4.1 callbacks and wound up implementing
another "transport". Since in v4.1 it's actually possible for clients to
send requests (fore-channel) and receive callbacks (back channel) over
the same connection, I had to "make" another transport that essentially
did TCP, but used server side conventions for the rpc_xprt_ops send
routines so that the server could use the existing rpc_call_(a)sync()
mechanisms. Similarly, on the client side, I had to implement the
equivalent of the svc_send() routines using the client side xs_tcp_*
routines. The resultant code is reasonably clean, but screams out
"inefficiency" because of the small amount of code sharing that was
actually possible.

Given this, a unified transport switch would really rock. Both the
client and the server treat the connection as a "full duplex" (I mean
can send calls and receive replies on the same connection). One set of
tcp/udp read/write methods for both the client and server. No more
svc_send for the server and xs_*_send_request on the client. The one
true networking way to rule them both! Everything would be much cleaner,
and since we're going down this path, I assumed it's a feasibility study
worth doing. Maybe I'm oversimplifying, but this seems doable. After
all, what we're doing in both cases (client and server) is shoving bits
down a socket. Is my question justified or am I *way* off base and
ignorant of some fundamental issue(s)?
Thanks
Regards
Rahul


> -----Original Message-----
> From: Greg Banks [mailto:[email protected]]
> Sent: Thursday, May 17, 2007 12:54 AM
> To: Tom Tucker
> Cc: Linux NFS Mailing List; Talpey, Thomas; Peter Leckie; Greg Banks
> Subject: Re: [NFS] [RFC,PATCH 3/14] knfsd: prepare reply per transport
>
> On Wed, May 16, 2007 at 04:35:07PM -0500, Tom Tucker wrote:
> >
> > Greg:
> >
> > I like this patch organization. I'll replicate this in the
> integrated
> > tree...
>
> Great.
>
> > > + /* Will be turned off only in gss privacy case: */
> > > + rqstp->rq_sendfile_ok = 1;
> >
> > I think this belongs in the svc_process logic. It doesn't have
> > anything to do with the buffer, but rather whether or not
> GSS is turned on.
>
> Fixed.
>
> I'll push a revised patchset including feedback from yourself
> and Bruce, to you only, in a few minutes.
>
> Greg.


2007-05-17 10:48:58

by NeilBrown

[permalink] [raw]
Subject: Re: [RFC,PATCH 3/14] knfsd: prepare reply per transport

On Thursday May 17, [email protected] wrote:
>
> Move the code at the beginning of svc_process() that sets up
> page buffers for the reply, into a new sko_prepape_reply
> method in svc_sock_ops.
>

> +static int
> +svc_tcp_prepare_reply(struct svc_rqst *rqstp)
> +{
> + struct kvec *resv = &rqstp->rq_res.head[0];
> +
> + svc_tcpip_prepare_reply(rqstp);
> +
> + /* tcp needs a space for the record length... */
^^

I appreciate that you are copying a comment verbatim, but can we drop
the 'a'?

Thanks,
NeilBrown


2007-05-17 15:26:33

by Tom Tucker

[permalink] [raw]
Subject: Re: [RFC,PATCH 3/14] knfsd: prepare reply per transport


I think this is worth considering. Since the client can receive requests
in 4.1, the distinction becomes arbitrary.

On Thu, 2007-05-17 at 02:16 -0700, Iyer, Rahul wrote:
> Hi Greg,
> I like the idea of a transport switch for the server side. The way I see
> it, it's not just the server, it's also the "callback server" on the
> client that can benefit.
>
> I had a question. Maybe it's a dumb question, and I may be way off base,
> but I don't see why we need 2 transport switches - one for the client,
> one for the server? Why not have one?
> I've written some code to do NFSv4.1 callbacks and wound up implementing
> another "transport". Since in v4.1 it's actually possible for clients to
> send requests (fore-channel) and receive callbacks (back channel) over
> the same connection, I had to "make" another transport that essentially
> did TCP, but used server side conventions for the rpc_xprt_ops send
> routines so that the server could use the existing rpc_call_(a)sync()
> mechanisms. Similarly, on the client side, I had to implement the
> equivalent of the svc_send() routines using the client side xs_tcp_*
> routines. The resultant code is reasonably clean, but screams out
> "inefficiency" because of the small amount of code sharing that was
> actually possible.
>
> Given this, a unified transport switch would really rock. Both the
> client and the server treat the connection as a "full duplex" (I mean
> can send calls and receive replies on the same connection). One set of
> tcp/udp read/write methods for both the client and server. No more
> svc_send for the server and xs_*_send_request on the client. The one
> true networking way to rule them both! Everything would be much cleaner,
> and since we're going down this path, I assumed it's a feasibility study
> worth doing. Maybe I'm oversimplifying, but this seems doable. After
> all, what we're doing in both cases (client and server) is shoving bits
> down a socket. Is my question justified or am I *way* off base and
> ignorant of some fundamental issue(s)?
> Thanks
> Regards
> Rahul
>
>
> > -----Original Message-----
> > From: Greg Banks [mailto:[email protected]]
> > Sent: Thursday, May 17, 2007 12:54 AM
> > To: Tom Tucker
> > Cc: Linux NFS Mailing List; Talpey, Thomas; Peter Leckie; Greg Banks
> > Subject: Re: [NFS] [RFC,PATCH 3/14] knfsd: prepare reply per transport
> >
> > On Wed, May 16, 2007 at 04:35:07PM -0500, Tom Tucker wrote:
> > >
> > > Greg:
> > >
> > > I like this patch organization. I'll replicate this in the
> > integrated
> > > tree...
> >
> > Great.
> >
> > > > + /* Will be turned off only in gss privacy case: */
> > > > + rqstp->rq_sendfile_ok = 1;
> > >
> > > I think this belongs in the svc_process logic. It doesn't have
> > > anything to do with the buffer, but rather whether or not
> > GSS is turned on.
> >
> > Fixed.
> >
> > I'll push a revised patchset including feedback from yourself
> > and Bruce, to you only, in a few minutes.
> >
> > Greg.



2007-05-18 03:16:58

by Greg Banks

[permalink] [raw]
Subject: Re: [RFC,PATCH 3/14] knfsd: prepare reply per transport

On Thu, May 17, 2007 at 02:16:46AM -0700, Iyer, Rahul wrote:
> Hi Greg,
> I like the idea of a transport switch for the server side. The way I see
> it, it's not just the server, it's also the "callback server" on the
> client that can benefit.
>
> I had a question. Maybe it's a dumb question, and I may be way off base,
> but I don't see why we need 2 transport switches - one for the client,
> one for the server? Why not have one?

Two separate sets of code with different requirements.

> I've written some code to do NFSv4.1 callbacks and wound up implementing
> another "transport". Since in v4.1 it's actually possible for clients to
> send requests (fore-channel) and receive callbacks (back channel) over
> the same connection,

A very sensible thing to do, but hard to retrofit to the existing Linux
code without significant surgery. As you've discovered, the server and
client code are two mostly-separate code bases (with mostly-separate
authors and maintainers) that happen to link into the same module.
Unifying these would be a major job. I'd love to see it happen, for
example to unify the XDR buffering code, but it would be an uphill
battle technically and possibly politically also. For some reason,
code (like lockd) that lives in both the server and client side tends
to be neglected by both camps.

> Given this, a unified transport switch would really rock.
> [...] Maybe I'm oversimplifying, but this seems doable.

Yes...eventually.

Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
Apparently, I'm Bedevere. Which MPHG character are you?
I don't speak for SGI.


2007-05-18 04:01:40

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [RFC,PATCH 3/14] knfsd: prepare reply per transport

On Fri, May 18, 2007 at 01:16:30PM +1000, Greg Banks wrote:
> On Thu, May 17, 2007 at 02:16:46AM -0700, Iyer, Rahul wrote:
> > I've written some code to do NFSv4.1 callbacks and wound up implementing
> > another "transport". Since in v4.1 it's actually possible for clients to
> > send requests (fore-channel) and receive callbacks (back channel) over
> > the same connection,
>
> A very sensible thing to do, but hard to retrofit to the existing Linux
> code without significant surgery. As you've discovered, the server and
> client code are two mostly-separate code bases (with mostly-separate
> authors and maintainers) that happen to link into the same module.
> Unifying these would be a major job. I'd love to see it happen, for
> example to unify the XDR buffering code, but it would be an uphill
> battle technically and possibly politically also. For some reason,
> code (like lockd) that lives in both the server and client side tends
> to be neglected by both camps.
>
> > Given this, a unified transport switch would really rock.
> > [...] Maybe I'm oversimplifying, but this seems doable.
>
> Yes...eventually.

And the easiest approach may be to go ahead and introduce the completely
separate server-side transport switch and then later find ways to increase
code sharing between the two in incremental steps....

--b.


2007-05-18 04:08:07

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [RFC,PATCH 3/14] knfsd: prepare reply per transport

On Fri, May 18, 2007 at 12:01:37AM -0400, bfields wrote:
> On Fri, May 18, 2007 at 01:16:30PM +1000, Greg Banks wrote:
> > On Thu, May 17, 2007 at 02:16:46AM -0700, Iyer, Rahul wrote:
> > > I've written some code to do NFSv4.1 callbacks and wound up implementing
> > > another "transport". Since in v4.1 it's actually possible for clients to
> > > send requests (fore-channel) and receive callbacks (back channel) over
> > > the same connection,
> >
> > A very sensible thing to do, but hard to retrofit to the existing Linux
> > code without significant surgery. As you've discovered, the server and
> > client code are two mostly-separate code bases (with mostly-separate
> > authors and maintainers) that happen to link into the same module.
> > Unifying these would be a major job. I'd love to see it happen, for
> > example to unify the XDR buffering code, but it would be an uphill
> > battle technically and possibly politically also. For some reason,
> > code (like lockd) that lives in both the server and client side tends
> > to be neglected by both camps.
> >
> > > Given this, a unified transport switch would really rock.
> > > [...] Maybe I'm oversimplifying, but this seems doable.
> >
> > Yes...eventually.
>
> And the easiest approach may be to go ahead and introduce the completely
> separate server-side transport switch and then later find ways to increase
> code sharing between the two in incremental steps....

(But I don't think people who are interested in doing this should be
scared off by "politics". I'm sure careful patches would be welcomed by
everyone. Care is required, though--small differences between the
assumptions made in the two code bases can be tricky. Anyway, that's my
lame excuse for why I ended up with such messy rpcsec_gss privacy code.)

--b.


2007-05-18 06:01:02

by Greg Banks

[permalink] [raw]
Subject: Re: [RFC,PATCH 3/14] knfsd: prepare reply per transport

On Thu, May 17, 2007 at 08:48:51PM +1000, Neil Brown wrote:
> On Thursday May 17, [email protected] wrote:
> >
> > Move the code at the beginning of svc_process() that sets up
> > page buffers for the reply, into a new sko_prepape_reply
> > method in svc_sock_ops.
> >
>
> > +static int
> > +svc_tcp_prepare_reply(struct svc_rqst *rqstp)
> > +{
> > + struct kvec *resv = &rqstp->rq_res.head[0];
> > +
> > + svc_tcpip_prepare_reply(rqstp);
> > +
> > + /* tcp needs a space for the record length... */
> ^^
>
> I appreciate that you are copying a comment verbatim, but can we drop
> the 'a'?
>

Fixed. It now reads

+ /* tcp needs room for the record length... */


Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
Apparently, I'm Bedevere. Which MPHG character are you?
I don't speak for SGI.
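
For context on what that reserved room is for: ONC RPC over TCP prefixes
each record fragment with a 4-byte big-endian marker whose top bit flags the
last fragment and whose low 31 bits give the fragment length. A minimal
user-space sketch of filling in the placeholder once the reply length is
known follows. The names fill_record_marker and RM_LAST_FRAGMENT are
hypothetical, and the sketch assumes the whole reply goes out as a single
fragment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RM_LAST_FRAGMENT 0x80000000u	/* top bit: final fragment of record */

/*
 * Overwrite the 4-byte placeholder at the start of buf.
 * total_len counts the placeholder itself, so the fragment
 * length written to the wire is total_len - 4.
 */
static void fill_record_marker(unsigned char *buf, size_t total_len)
{
	assert(total_len >= 4 && total_len - 4 < RM_LAST_FRAGMENT);
	uint32_t marker = RM_LAST_FRAGMENT | (uint32_t)(total_len - 4);

	buf[0] = (unsigned char)(marker >> 24);
	buf[1] = (unsigned char)(marker >> 16);
	buf[2] = (unsigned char)(marker >> 8);
	buf[3] = (unsigned char)marker;
}
```

Reserving the word up front lets the encoder write the reply body
contiguously and patch the length in afterwards, instead of shifting data
or sending the marker separately.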


2007-05-18 14:42:53

by Trond Myklebust

[permalink] [raw]
Subject: Re: [RFC,PATCH 3/14] knfsd: prepare reply per transport

On Fri, 2007-05-18 at 13:16 +1000, Greg Banks wrote:
> A very sensible thing to do, but hard to retrofit to the existing Linux
> code without significant surgery. As you've discovered, the server and
> client code are two mostly-separate code bases (with mostly-separate
> authors and maintainers) that happen to link into the same module.
> Unifying these would be a major job. I'd love to see it happen, for
> example to unify the XDR buffering code, but it would be an uphill
> battle technically and possibly politically also. For some reason,
> code (like lockd) that lives in both the server and client side tends
> to be neglected by both camps.
>
> > Given this, a unified transport switch would really rock.
> > [...] Maybe I'm oversimplifying, but this seems doable.
>
> Yes...eventually.

I'm not sure that politics is really the problem here. I think the
biggest issue is that the client and server have very different
workloads.

The job of the client is to pump as much data as fast as possible
through a single socket, to place incoming data into the correct reply
buffers as quickly and efficiently as possible, and to handle exceptions
such as dropped connections, or socket buffer starvation by resending
the request after reconnecting/waiting for the socket buffer to empty.
It uses a single workqueue, and non-blocking I/O in order to achieve
this goal.

The job of the server is to listen for new connections, to round-robin
through several sockets, to read incoming data into pre-allocated
anonymous buffers, to process the RPC call, then to pump out the result
as quickly as possible. It handles exceptions like dropped connections
by dropping the request and moving on. Resource starvation is handled by
deferring handling the request and/or possibly dropping it altogether.

Apart from the tasks of 'reading data' and 'writing data', it is hard to
see much that could be shared. Even for the case of reading/writing, the
code differs due to the client's need to identify the incoming request
or the server's need to provision socket write resources.
So if anyone is serious about wanting to share transport switches
between client and server, then what is first needed is a detailed
analysis of what actually _can_ be shared. After that we can discuss
what it actually makes sense to share.

Trond

