These patches fix a few bugs and add a few enhancements that were
found while porting the RDMA transport driver to the new switch design.
--
Signed-off-by: Tom Tucker <[email protected]>
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs
On Fri, Oct 19, 2007 at 04:45:23PM -0500, Tom Tucker wrote:
>
> The rq_arg.len includes the size of the transport header. The computations
> assumed that it did not.
ok, assuming that your latest svc_rdma_xdr_decode_req(), which I
haven't read yet, doesn't change rq_arg.len as it parses the RDMA
chunking header.
The reason I mention that is that one set of patches I was working
with some time ago had different semantics for rq_arg.len; when
parsing the chunking header, rq_arg.len was adjusted downward as
rq_head[0].iov_base was adjusted forward. This kept rq_arg.len equal
to the sum of rq_{head[0],pages[...],tail[0]}.iov_len, which seemed
like a good idea.
But your way seems valid too.
Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
Apparently, I'm Bedevere. Which MPHG character are you?
I don't speak for SGI.
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs
Greg Banks wrote:
> On Fri, Oct 19, 2007 at 04:45:23PM -0500, Tom Tucker wrote:
>
>> The rq_arg.len includes the size of the transport header. The computations
>> assumed that it did not.
>>
>
> ok, assuming that your latest svc_rdma_xdr_decode_req(), which I
> haven't read yet, doesn't change rq_arg.len as it parses the RDMA
> chunking header.
>
> The reason I mention that is that one set of patches I was working
> with some time ago had different semantics for rq_arg.len; when
> parsing the chunking header, rq_arg.len was adjusted downward as
> rq_head[0].iov_base was adjusted forward. This kept rq_arg.len equal
> to the sum of rq_{head[0],pages[...],tail[0]}.iov_len, which seemed
> like a good idea.
>
> But your way seems valid too.
>
>
Well to be honest, I'm somewhat conflicted. The TCP transport driver
doesn't include the transport header in rq_arg.len. For the RDMA
transport, however, there was a lot more parsing involved with the
transport header and I wanted to keep rq_arg.len inclusive in order to
enable more robust error checking for short messages, etc...
So right now, it's mixed, but since TCP doesn't save it's transport
header, it doesn't matter. So to cut to the chase here, the design
decision being made is that any transport that needs to save it's
transport header _must_ include the transport header in the rq_arg.len
Tom
> Greg.
>
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs
On Mon, Oct 22, 2007 at 10:11:44PM -0500, Tom Tucker wrote:
> [...] So to cut to the chase here, the design
> decision being made is that any transport that needs to save it's
> transport header _must_ include the transport header in the rq_arg.len
Works for me, just remember to document it.
Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
Apparently, I'm Bedevere. Which MPHG character are you?
I don't speak for SGI.
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs
On Tue, Oct 23, 2007 at 12:45:30PM +1000, Greg Banks wrote:
> On Fri, Oct 19, 2007 at 04:45:23PM -0500, Tom Tucker wrote:
> >
> > The rq_arg.len includes the size of the transport header. The computations
> > assumed that it did not.
>
> ok, assuming that your latest svc_rdma_xdr_decode_req(), which I
> haven't read yet, doesn't change rq_arg.len as it parses the RDMA
> chunking header.
>
> The reason I mention that is that one set of patches I was working
> with some time ago had different semantics for rq_arg.len; when
> parsing the chunking header, rq_arg.len was adjusted downward as
> rq_head[0].iov_base was adjusted forward. This kept rq_arg.len equal
> to the sum of rq_{head[0],pages[...],tail[0]}.iov_len, which seemed
> like a good idea.
>
> But your way seems valid too.
The integrity and privacy code may be a particularly unpleasant source
of assumptions about the various length fields; see e.g.
unwrap_integ_data() and unwrap_priv_data(). (But maybe nobody cares
about krb5i/krb5p over rdma?)
--b.
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs
On Tue, Oct 23, 2007 at 02:03:55PM -0400, J. Bruce Fields wrote:
> On Tue, Oct 23, 2007 at 12:45:30PM +1000, Greg Banks wrote:
> > On Fri, Oct 19, 2007 at 04:45:23PM -0500, Tom Tucker wrote:
> > >
> > > The rq_arg.len includes the size of the transport header. The computations
> > > assumed that it did not.
> >
> > ok, assuming that your latest svc_rdma_xdr_decode_req(), which I
> > haven't read yet, doesn't change rq_arg.len as it parses the RDMA
> > chunking header.
> >
> > The reason I mention that is that one set of patches I was working
> > with some time ago had different semantics for rq_arg.len; when
> > parsing the chunking header, rq_arg.len was adjusted downward as
> > rq_head[0].iov_base was adjusted forward. This kept rq_arg.len equal
> > to the sum of rq_{head[0],pages[...],tail[0]}.iov_len, which seemed
> > like a good idea.
> >
> > But your way seems valid too.
>
> The integrity and privacy code may be a particularly unpleasant source
> of assumptions about the various length fields; see e.g.
> unwrap_integ_data() and unwrap_priv_data(). (But maybe nobody cares
> about krb5i/krb5p over rdma?)
More or less the entire point of NFS/RDMA is that the CPU never
has to see the data that goes on the wire; no memory copies (*),
no message digest calculations, no encryptions. This makes krb5[ip]
over RDMA rather an unhelpful combination.
But if there's some code there that makes actual assumptions about
the xdr_buf.len field, we need to find it, document it and ensure
we're maintaining that assumption in the new code.
(*) Last time I looked the write path still had an unnecessary memcpy.
Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
Apparently, I'm Bedevere. Which MPHG character are you?
I don't speak for SGI.
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs
On Wed, Oct 24, 2007 at 12:45:09PM +1000, Greg Banks wrote:
> On Tue, Oct 23, 2007 at 02:03:55PM -0400, J. Bruce Fields wrote:
> > On Tue, Oct 23, 2007 at 12:45:30PM +1000, Greg Banks wrote:
> > > On Fri, Oct 19, 2007 at 04:45:23PM -0500, Tom Tucker wrote:
> > > >
> > > > The rq_arg.len includes the size of the transport header. The computations
> > > > assumed that it did not.
> > >
> > > ok, assuming that your latest svc_rdma_xdr_decode_req(), which I
> > > haven't read yet, doesn't change rq_arg.len as it parses the RDMA
> > > chunking header.
> > >
> > > The reason I mention that is that one set of patches I was working
> > > with some time ago had different semantics for rq_arg.len; when
> > > parsing the chunking header, rq_arg.len was adjusted downward as
> > > rq_head[0].iov_base was adjusted forward. This kept rq_arg.len equal
> > > to the sum of rq_{head[0],pages[...],tail[0]}.iov_len, which seemed
> > > like a good idea.
> > >
> > > But your way seems valid too.
> >
> > The integrity and privacy code may be a particularly unpleasant source
> > of assumptions about the various length fields; see e.g.
> > unwrap_integ_data() and unwrap_priv_data(). (But maybe nobody cares
> > about krb5i/krb5p over rdma?)
>
> More or less the entire point of NFS/RDMA is that the CPU never
> has to see the data that goes on the wire; no memory copies (*),
> no message digest calculations, no encryptions. This makes krb5[ip]
> over RDMA rather an unhelpful combination.
Yeah, I figured.
But obviously we have to at least make sure that combination doesn't
cause a crash.
--b.
> But if there's some code there that makes actual assumptions about
> the xdr_buf.len field, we need to find it, document it and ensure
> we're maintaining that assumption in the new code.
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs
The rq_arg.len includes the size of the transport header. The computations
assumed that it did not.
Signed-off-by: Tom Tucker <[email protected]>
---
net/sunrpc/svc_xprt.c | 16 +++++++---------
1 files changed, 7 insertions(+), 9 deletions(-)
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index b56bf05..13aacec 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -29,7 +29,6 @@ #include <asm/ioctls.h>
#include <linux/sunrpc/types.h>
#include <linux/sunrpc/clnt.h>
#include <linux/sunrpc/xdr.h>
-#include <linux/sunrpc/svcsock.h>
#include <linux/sunrpc/stats.h>
#include <linux/sunrpc/svc_xprt.h>
@@ -162,7 +161,7 @@ EXPORT_SYMBOL_GPL(svc_xprt_init);
static int svc_local_port(struct svc_xprt *xprt)
{
- int ret = 0;
+ int ret = -1;
switch (xprt->xpt_local.ss_family) {
case AF_INET:
ret = ntohs(((struct sockaddr_in *)
@@ -891,8 +890,7 @@ static struct cache_deferred_req *svc_de
int skip;
int size;
/* FIXME maybe discard if size too large */
- size = sizeof(struct svc_deferred_req) + rqstp->rq_arg.len +
- rqstp->rq_xprt_hlen;
+ size = sizeof(struct svc_deferred_req) + rqstp->rq_arg.len;
dr = kmalloc(size, GFP_KERNEL);
if (dr == NULL)
return NULL;
@@ -902,12 +900,12 @@ static struct cache_deferred_req *svc_de
memcpy(&dr->addr, &rqstp->rq_addr, rqstp->rq_addrlen);
dr->addrlen = rqstp->rq_addrlen;
dr->daddr = rqstp->rq_daddr;
- dr->argslen = (rqstp->rq_arg.len + rqstp->rq_xprt_hlen) >> 2;
+ dr->argslen = rqstp->rq_arg.len>>2;
dr->xprt_hlen = rqstp->rq_xprt_hlen;
/* back up head to the start of the buffer and copy */
- skip = (rqstp->rq_arg.len + rqstp->rq_xprt_hlen) -
- rqstp->rq_arg.head[0].iov_len;
+ skip = rqstp->rq_arg.len - rqstp->rq_arg.head[0].iov_len;
+
memcpy(dr->args, rqstp->rq_arg.head[0].iov_base - skip,
dr->argslen << 2);
}
@@ -930,8 +928,8 @@ static int svc_deferred_recv(struct svc_
/* The iov_len doesn't include the transport header bytes */
rqstp->rq_arg.head[0].iov_len = (dr->argslen<<2) - dr->xprt_hlen;
rqstp->rq_arg.page_len = 0;
- /* The rq_arg len does not include the transport header bytes */
- rqstp->rq_arg.len = (dr->argslen<<2) - dr->xprt_hlen;
+ /* The rq_arg len includes the transport header bytes */
+ rqstp->rq_arg.len = dr->argslen<<2;
rqstp->rq_prot = dr->prot;
memcpy(&rqstp->rq_addr, &dr->addr, dr->addrlen);
rqstp->rq_addrlen = dr->addrlen;
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs
Create a transport independent version of the svc_sock_names function.
The toclose capability of the svc_sock_names service can be implemented
using the svc_xprt_find and svc_xprt_close services.
Signed-off-by: Tom Tucker <[email protected]>
---
fs/nfsd/nfsctl.c | 2 +-
include/linux/sunrpc/svc_xprt.h | 5 +++--
net/sunrpc/svc_xprt.c | 35 +++++++++++++++++++++++++++++++++++
3 files changed, 39 insertions(+), 3 deletions(-)
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 0dfc130..3335270 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -503,7 +503,7 @@ static ssize_t write_ports(struct file *
int len = 0;
lock_kernel();
if (nfsd_serv)
- len = svc_sock_names(buf, nfsd_serv, NULL);
+ len = svc_xprt_names(nfsd_serv, buf, 0);
unlock_kernel();
return len;
}
diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index 5dba9a5..b7d94ef 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -82,6 +82,7 @@ static inline void svc_xprt_get(struct s
void svc_delete_xprt(struct svc_xprt *xprt);
void svc_close_xprt(struct svc_xprt *xprt);
int svc_print_xprts(char *buf, int maxlen);
-struct svc_xprt *
-svc_find_xprt(struct svc_serv *serv, char *xprt_class, int af, int port);
+struct svc_xprt *svc_find_xprt(struct svc_serv *serv, char *xprt_class,
+ int af, int port);
+int svc_xprt_names(struct svc_serv *serv, char *buf, int buflen);
#endif /* SUNRPC_SVC_XPRT_H */
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 13aacec..6b2c73c 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -993,3 +993,38 @@ struct svc_xprt *svc_find_xprt(struct sv
return found;
}
EXPORT_SYMBOL_GPL(svc_find_xprt);
+
+/*
+ * Format a buffer with a list of the active transports. A zero for
+ * the buflen parameter disables target buffer overflow checking.
+ */
+int svc_xprt_names(struct svc_serv *serv, char *buf, int buflen)
+{
+ struct svc_xprt *xprt;
+ char xprt_str[64];
+ int totlen = 0;
+ int len;
+
+ /* Sanity check args */
+ if (!serv)
+ return 0;
+
+ spin_lock_bh(&serv->sv_lock);
+ list_for_each_entry(xprt, &serv->sv_permsocks, xpt_list) {
+ len = snprintf(xprt_str, sizeof(xprt_str),
+ "%s %d\n", xprt->xpt_class->xcl_name,
+ svc_local_port(xprt));
+ /* If the string was truncated, replace with error string */
+ if (len >= sizeof(xprt_str))
+ strcpy(xprt_str, "name-too-long\n");
+ /* Don't overflow buffer */
+ len = strlen(xprt_str);
+ if (buflen && (len + totlen >= buflen))
+ break;
+ strcpy(buf+totlen, xprt_str);
+ totlen += len;
+ }
+ spin_unlock_bh(&serv->sv_lock);
+ return totlen;
+}
+EXPORT_SYMBOL_GPL(svc_xprt_names);
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs
The call to svc_xprt_received function should be called after the transport
is completely initialized and added to the tempsocks list.
Signed-off-by: Tom Tucker <[email protected]>
---
net/sunrpc/svc_xprt.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 6b2c73c..bb007e7 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -615,7 +615,6 @@ int svc_recv(struct svc_rqst *rqstp, lon
struct svc_xprt *newxpt;
newxpt = xprt->xpt_ops->xpo_accept(xprt);
if (newxpt) {
- svc_xprt_received(newxpt);
/*
* We know this module_get will succeed because the
* listener holds a reference too
@@ -634,6 +633,7 @@ int svc_recv(struct svc_rqst *rqstp, lon
jiffies + svc_conn_age_period * HZ);
}
spin_unlock_bh(&serv->sv_lock);
+ svc_xprt_received(newxpt);
}
svc_xprt_received(xprt);
} else {
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs
The svc_addsock function needs to set the local address in the
svc_xprt structure.
Signed-off-by: Tom Tucker <[email protected]>
---
net/sunrpc/svcsock.c | 5 +++++
1 files changed, 5 insertions(+), 0 deletions(-)
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index cc752b7..e1a27ee 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -1249,6 +1249,11 @@ int svc_addsock(struct svc_serv *serv,
else {
svsk = svc_setup_socket(serv, so, &err, SVC_SOCK_DEFAULTS);
if (svsk) {
+ int salen;
+ (void)kernel_getsockname(svsk->sk_sock,
+ (struct sockaddr *)
+ &svsk->sk_xprt.xpt_local,
+ &salen);
svc_xprt_received(&svsk->sk_xprt);
err = 0;
}
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs
On Fri, Oct 19, 2007 at 04:45:26PM -0500, Tom Tucker wrote:
>
> The svc_addsock function needs to set the local address in the
> svc_xprt structure.
>
> Signed-off-by: Tom Tucker <[email protected]>
> ---
>
> net/sunrpc/svcsock.c | 5 +++++
> 1 files changed, 5 insertions(+), 0 deletions(-)
>
> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> index cc752b7..e1a27ee 100644
> --- a/net/sunrpc/svcsock.c
> +++ b/net/sunrpc/svcsock.c
> @@ -1249,6 +1249,11 @@ int svc_addsock(struct svc_serv *serv,
> else {
> svsk = svc_setup_socket(serv, so, &err, SVC_SOCK_DEFAULTS);
> if (svsk) {
> + int salen;
> + (void)kernel_getsockname(svsk->sk_sock,
> + (struct sockaddr *)
> + &svsk->sk_xprt.xpt_local,
> + &salen);
> svc_xprt_received(&svsk->sk_xprt);
> err = 0;
> }
I presume you did this to make UDP work again after you broke it in
patch "Move the sockaddr information to svc_xprt" ? I think this is
subtly wrong.
If you read svc_udp_recvfrom() more closely you'll see why. That code
uses struct in_pktinfo (in6_pktinfo on ipv6) to fill in rqstp->rq_daddr
with the destination address that each incoming RPC has on the wire.
The other patch plus this one instead fill in rq_daddr with the UDP
socket's idea of it's own address, which does not vary for each call
and will usually be 0.0.0.0:2049. So later svc_sendto() will attempt
to send the reply with a source address of 0.0.0.0:2049.
That will result in the networking layer choosing a source address
based on the interface through which the datagram is routed.
Most of the time that works fine but there are setups where this will
completely break NFS. For example, an HA configuration (where all NFS
traffic is sent to IP aliases, not the interface's primary address)
combined with careful firewall rules.
Perhaps I wasn't clear in my response to patch "Move the sockaddr
information to svc_xprt". The assumption in that patch that
svsk->sk_local is meaningful for the UDP socket, and can be used to
generate rqstp->rq_daddr, is incorrect.
Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
Apparently, I'm Bedevere. Which MPHG character are you?
I don't speak for SGI.
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs