2007-09-27 05:01:37

by Tom Tucker

Subject: [RFC,PATCH 00/33] SVC Transport Switch

The following series implements a pluggable transport switch for
RPC servers. The biggest changes in this latest incarnation
are as follows:

- The overall design of the switch has been modified to be more similar
  to the client side, e.g.
    - There is a transport class structure, svc_xprt_class, and
    - A transport independent structure (svc_xprt) is manipulated by
      transport independent code
- Further consolidation of transport independent logic out of
  transport providers and into transport independent code
- Transport independent code has been broken out into a separate file
- Transport independent functions previously adorned with _sock_ have
  had their names changed, e.g. svc_sock_enqueue is now svc_xprt_enqueue
- Atomic refcounts have been changed to krefs

The patchset is large (33 patches). There are some things that I would like to
do but left out because the patchset is already big, for example normalizing
the creation of nfsd listening endpoints using writes to the portlist file.

I've attempted to organize the patchset such that logical changes are
clearly reviewable without too much clutter from functionally empty name
changes. This was somewhat awkward since intermediate patches may look
ugly/broken/incomplete to some reviewers. This was to avoid losing the
context of a change while keeping each patch a reasonable size. For example,
making svc_recv transport independent and moving it to the svc_xprt file
cannot be done in the same patch without losing the diffs to the svc_recv
function.

This patchset has had limited testing with TCP/UDP. In this case, the tests
included connectathon and building the kernel on an NFS mount running on the
transport switch.

This patchset is against the 2.6.23-rc8 kernel tree.

--
Signed-off-by: Tom Tucker <[email protected]>

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2007-09-28 16:07:23

by Tom Tucker

Subject: Re: [RFC, PATCH 06/33] svc: Add transport specific xpo_release function

On Fri, 2007-09-28 at 12:58 +1000, Neil Brown wrote:
> On Thursday September 27, [email protected] wrote:
> >
> > The svc_sock_release function releases pages allocated to a thread. For
> > UDP, this also returns the receive skb to the stack. For RDMA it will
> > post a receive WR and bump the client credit count.
> >
> ..
> > diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
> > index 37f7448..cfb2652 100644
> > --- a/include/linux/sunrpc/svc.h
> > +++ b/include/linux/sunrpc/svc.h
> > @@ -217,7 +217,7 @@ struct svc_rqst {
> > struct auth_ops * rq_authop; /* authentication flavour */
> > u32 rq_flavor; /* pseudoflavor */
> > struct svc_cred rq_cred; /* auth info */
> > - struct sk_buff * rq_skbuff; /* fast recv inet buffer */
> > + void * rq_xprt_ctxt; /* transport specific context ptr */
> > struct svc_deferred_req*rq_deferred; /* deferred request we are replaying */
> >
> > struct xdr_buf rq_arg;
> ..
> > diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> > index cc8c7ce..e7d203a 100644
> > --- a/net/sunrpc/svcsock.c
> > +++ b/net/sunrpc/svcsock.c
> > @@ -184,14 +184,14 @@ svc_thread_dequeue(struct svc_pool *pool
> > /*
> > * Release an skbuff after use
> > */
> > -static inline void
> > +static void
> > svc_release_skb(struct svc_rqst *rqstp)
> > {
> > - struct sk_buff *skb = rqstp->rq_skbuff;
> > + struct sk_buff *skb = (struct sk_buff *)rqstp->rq_xprt_ctxt;
>
> Minor style point: We don't cast void* in the kernel.

Oops, no we don't.
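
For readers less familiar with the style point: C converts void * to any
object pointer type implicitly, so the cast adds nothing. A minimal
userspace sketch, assuming hypothetical stand-in types (fake_skb and
fake_rqst are not kernel structures):

```c
#include <assert.h>
#include <stddef.h>

struct fake_skb { int len; };		/* hypothetical stand-in for sk_buff */

struct fake_rqst {
	void *rq_xprt_ctxt;		/* transport specific context ptr */
};

/* Take the context out of the request, roughly as svc_release_skb does. */
static int take_ctxt_len(struct fake_rqst *rqstp)
{
	struct fake_skb *skb = rqstp->rq_xprt_ctxt;	/* no cast needed in C */

	if (!skb)
		return 0;
	rqstp->rq_xprt_ctxt = NULL;
	return skb->len;
}
```

The same assignment would need a cast in C++, which is one reason the
habit creeps into C code.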

>
> > struct svc_deferred_req *dr = rqstp->rq_deferred;
> >
> > if (skb) {
> > - rqstp->rq_skbuff = NULL;
> > + rqstp->rq_xprt_ctxt = NULL;
> >
> > dprintk("svc: service %p, releasing skb %p\n", rqstp, skb);
> > skb_free_datagram(rqstp->rq_sock->sk_sk, skb);
> > @@ -394,7 +394,7 @@ svc_sock_release(struct svc_rqst *rqstp)
> > {
> > struct svc_sock *svsk = rqstp->rq_sock;
> >
> > - svc_release_skb(rqstp);
> > + rqstp->rq_xprt->xpt_ops.xpo_release(rqstp);
>
> These are somewhat ugly, aren't they?
> What would you think of giving rqstp a pointer directly to xpt_ops to
> avoid the double indirection?
>

Maybe I'm missing something, but does it actually save anything? The
xpt_ops structure is copied into the transport instance. I think you'd
end up with this:

rqstp->rq_ops->xpo_release(rqstp)

More aesthetically pleasing perhaps, but you'd still have "double
indirection". I think this is the same as we have today actually for the
sk_release/sk_sendto functions.
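
To make the comparison concrete, here is a minimal userspace sketch of
the two call shapes. The structures are simplified stand-ins, and rq_ops
is the hypothetical field from the suggestion above, not something in
the patchset. Because the ops table is embedded in the transport by
value, both variants load one pointer from the request and then one
function pointer at a fixed offset.

```c
#include <assert.h>
#include <stddef.h>

struct rqst;

/* Simplified stand-in for svc_xprt_ops: a single operation. */
struct xprt_ops {
	void (*xpo_release)(struct rqst *rqstp);
};

/* The ops table is copied into the transport instance by value. */
struct xprt {
	struct xprt_ops xpt_ops;
};

struct rqst {
	struct xprt *rq_xprt;		/* as in the patch */
	struct xprt_ops *rq_ops;	/* hypothetical shortcut field */
	int released;
};

static void demo_release(struct rqst *rqstp)
{
	rqstp->released++;
}

/* Load rq_xprt, then load the function pointer inside it. */
static void release_via_xprt(struct rqst *rqstp)
{
	rqstp->rq_xprt->xpt_ops.xpo_release(rqstp);
}

/* Load rq_ops, then load the function pointer: also two loads. */
static void release_via_ops(struct rqst *rqstp)
{
	rqstp->rq_ops->xpo_release(rqstp);
}

/* Returns how many times the release hook ran. */
static int demo_double_indirection(void)
{
	struct xprt xprt = { .xpt_ops = { .xpo_release = demo_release } };
	struct rqst rq = { .rq_xprt = &xprt, .rq_ops = &xprt.xpt_ops };

	assert(rq.rq_ops == &rq.rq_xprt->xpt_ops);
	release_via_xprt(&rq);
	release_via_ops(&rq);
	return rq.released;
}
```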

> NeilBrown



2007-09-28 16:10:58

by Tom Tucker

Subject: Re: [RFC, PATCH 09/33] svc: Add a transport function that checks for write space

On Fri, 2007-09-28 at 13:03 +1000, Neil Brown wrote:
> On Thursday September 27, [email protected] wrote:
> > @@ -898,6 +900,25 @@ svc_udp_prep_reply_hdr(struct svc_rqst *
> > {
> > }
> >
> > +static int
> > +svc_udp_has_wspace(struct svc_xprt *xprt)
> > +{
> > + struct svc_sock *svsk = (struct svc_sock*)xprt;
> > + struct svc_serv *serv = svsk->sk_server;
> > + int required;
> > +
> > + /*
> > + * Set the SOCK_NOSPACE flag before checking the available
> > + * sock space.
> > + */
> > + set_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
> > + required = atomic_read(&svsk->sk_reserved) + serv->sv_max_mesg;
> > + if (required*2 > sock_wspace(svsk->sk_sk))
> > + return 0;
> > + clear_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
> > + return 1;
> > +}
> > +
> > static struct svc_xprt_ops svc_udp_ops = {
> > .xpo_recvfrom = svc_udp_recvfrom,
> > .xpo_sendto = svc_udp_sendto,
> > @@ -1368,6 +1390,25 @@ svc_tcp_prep_reply_hdr(struct svc_rqst *
> > svc_putnl(resv, 0);
> > }
> >
> > +static int
> > +svc_tcp_has_wspace(struct svc_xprt *xprt)
> > +{
> > + struct svc_sock *svsk = (struct svc_sock*)xprt;
> > + struct svc_serv *serv = svsk->sk_server;
> > + int required;
> > +
> > + /*
> > + * Set the SOCK_NOSPACE flag before checking the available
> > + * sock space.
> > + */
> > + set_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
> > + required = atomic_read(&svsk->sk_reserved) + serv->sv_max_mesg;
> > + if (required*2 > sk_stream_wspace(svsk->sk_sk))
> > + return 0;
> > + clear_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
> > + return 1;
> > +}
> > +
> > static struct svc_xprt_ops svc_tcp_ops = {
> > .xpo_recvfrom = svc_tcp_recvfrom,
> > .xpo_sendto = svc_tcp_sendto,
>
> As these two functions are identical, could we just have one called
> "svc_sock_has_wspace" or similar?
>

They are not quite identical. One calls sk_stream_wspace(...) to get the
socket space and the other calls sock_wspace(...). Maybe I should add
this to the comment so it's more obvious? I also kept them as separate
functions rather than combining them to make the importance of the
ordering of the setting/resetting of the NOSPACE bit obvious.
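
One possible middle ground, sketched here in userspace C under loud
assumptions, is to keep the two per-transport entry points but share the
ordering-sensitive body. Everything named fake_* is a hypothetical
stand-in (fake_sock is not struct sock, and the two space functions only
model sock_wspace and sk_stream_wspace). The comment about why the flag
is set first is an interpretation, not something stated in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a socket; not the kernel's struct sock. */
struct fake_sock {
	bool nospace;	/* models the SOCK_NOSPACE flag bit */
	int wspace;	/* bytes of send space currently free */
	int reserved;	/* bytes already reserved for replies */
};

typedef int (*wspace_fn)(const struct fake_sock *sk);

/* Stand-ins for sock_wspace() and sk_stream_wspace(). */
static int fake_dgram_wspace(const struct fake_sock *sk)  { return sk->wspace; }
static int fake_stream_wspace(const struct fake_sock *sk) { return sk->wspace; }

/*
 * Shared body. The flag is set before the space is sampled and is
 * cleared only when there is room, presumably so that a write-space
 * event arriving in between is not lost.
 */
static int has_wspace_common(struct fake_sock *sk, int max_mesg,
			     wspace_fn wspace)
{
	int required;

	sk->nospace = true;
	required = sk->reserved + max_mesg;
	if (required * 2 > wspace(sk))
		return 0;
	sk->nospace = false;
	return 1;
}

static int fake_udp_has_wspace(struct fake_sock *sk, int max_mesg)
{
	return has_wspace_common(sk, max_mesg, fake_dgram_wspace);
}

static int fake_tcp_has_wspace(struct fake_sock *sk, int max_mesg)
{
	return has_wspace_common(sk, max_mesg, fake_stream_wspace);
}
```

The cost is an extra indirect call per check; whether that is worth the
shared code is a judgment call for the maintainers.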

> Makes maintenance a little easier.
>
> NeilBrown



2007-09-28 16:11:34

by Tom Tucker

Subject: Re: [RFC,PATCH 11/33] svc: Add xpo_accept transport function

On Fri, 2007-09-28 at 13:21 +1000, Neil Brown wrote:
> On Thursday September 27, [email protected] wrote:
> > @@ -1046,9 +1054,10 @@ static inline int svc_port_is_privileged
> > /*
> > * Accept a TCP connection
> > */
> > -static void
> > -svc_tcp_accept(struct svc_sock *svsk)
> > +static struct svc_xprt *
> > +svc_tcp_accept(struct svc_xprt *xprt)
> > {
> > + struct svc_sock *svsk = (struct svc_sock *)xprt;
>
> This cast should use container_of
>
> struct svc_sock *svsk = container_of(xprt, struct svc_sock, sk_xprt);
>
> That makes it clearer what is happening.

Good suggestion. Thanks,
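
As a reminder of what container_of buys over a plain cast, here is a
small userspace sketch. The macro below is a simplified equivalent of
the kernel one (it omits the kernel's type checking), and sock_xprt is
a hypothetical structure whose embedded member is deliberately not at
offset zero. Note that the macro's second argument is the containing
structure type itself, not a pointer type.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace equivalent of the kernel macro, without its type check. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct xprt {
	int xpt_flags;
};

struct sock_xprt {
	int sk_reclen;		/* deliberately placed first */
	struct xprt sk_xprt;	/* embedded, not at offset 0 */
};

/* Recover the containing structure from a pointer to its member. */
static struct sock_xprt *to_sock_xprt(struct xprt *xprt)
{
	return container_of(xprt, struct sock_xprt, sk_xprt);
}
```

A plain cast would silently break here, while container_of keeps
working wherever the member sits in the structure.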

>
>
>
> NeilBrown



2007-09-28 16:16:46

by Tom Tucker

Subject: Re: [RFC, PATCH 12/33] svc: Add a generic transport svc_create_xprt function

On Fri, 2007-09-28 at 13:21 +1000, Neil Brown wrote:
> On Thursday September 27, [email protected] wrote:
> > diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> > index 6151db5..fab0ce3 100644
> > --- a/net/sunrpc/svc_xprt.c
> > +++ b/net/sunrpc/svc_xprt.c
> > @@ -93,3 +93,41 @@ void svc_xprt_init(struct svc_xprt_class
> > xpt->xpt_max_payload = xcl->xcl_max_payload;
> > }
> > EXPORT_SYMBOL_GPL(svc_xprt_init);
> > +
> > +int svc_create_xprt(struct svc_serv *serv, char *xprt_name, unsigned short port,
> > + int flags)
> > +{
> > + int ret = -ENOENT;
> > + struct list_head *le;
> > + struct sockaddr_in sin = {
> > + .sin_family = AF_INET,
> > + .sin_addr.s_addr = INADDR_ANY,
> > + .sin_port = htons(port),
> > + };
> > + dprintk("svc: creating transport %s[%d]\n", xprt_name, port);
> > + spin_lock(&svc_xprt_class_lock);
> > + list_for_each(le, &svc_xprt_class_list) {
> > + struct svc_xprt_class *xcl =
> > + list_entry(le, struct svc_xprt_class, xcl_list);
>
> list_for_each_entry is preferred.
>

Good suggestion,

> > + if (strcmp(xprt_name, xcl->xcl_name)==0) {
> > + spin_unlock(&svc_xprt_class_lock);
> > + if (try_module_get(xcl->xcl_owner)) {
> > + struct svc_xprt *newxprt;
> > + ret = 0;
> > + newxprt = xcl->xcl_ops->xpo_create
> > + (serv, (struct sockaddr*)&sin, flags);
> > + if (IS_ERR(newxprt)) {
> > + module_put(xcl->xcl_owner);
> > + ret = PTR_ERR(newxprt);
> > + }
> > + goto out;
> > + }
> > + }
> > + }
> > + spin_unlock(&svc_xprt_class_lock);
>
> if try_module_get fails, you spin_unlock twice. the "goto out;"
> needs to be moved down one line.
>

Yikes, bug. Good catch,
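
For reference, the control flow with the fix applied can be modelled in
userspace as below. This is only a sketch: fake_lock and fake_unlock
count lock depth instead of taking a spinlock, the class table and its
module_available field are hypothetical, and a plain return stands in
for "goto out".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

static int lock_depth;		/* 1 while "held", 0 otherwise */
static int unlock_errors;	/* counts unlocks of an unheld lock */

static void fake_lock(void)
{
	lock_depth++;
}

static void fake_unlock(void)
{
	if (--lock_depth < 0)
		unlock_errors++;
}

struct fake_class {
	const char *name;
	int module_available;	/* models try_module_get() succeeding */
};

static struct fake_class classes[] = {
	{ "tcp", 1 },
	{ "rdma", 0 },		/* module reference cannot be taken */
};

/* Returns 0 on success, -ENOENT if no usable class was found. */
static int fake_create_xprt(const char *xprt_name)
{
	size_t i;

	fake_lock();
	for (i = 0; i < sizeof(classes) / sizeof(classes[0]); i++) {
		struct fake_class *xcl = &classes[i];

		if (strcmp(xprt_name, xcl->name) != 0)
			continue;
		fake_unlock();
		if (xcl->module_available)
			return 0;	/* real code would call xpo_create */
		/* Lock already dropped: leave now, do not fall through. */
		return -ENOENT;
	}
	fake_unlock();
	return -ENOENT;
}
```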

> And I'm confused as to why xpo_create returns a pointer which you
> never use. xpo_accept does the same thing: a pointer is returned,
> but only the success status is used. Why not just return
> 0-or-negative-error ??
>

It will be used later when we centralize the create logic.

> NeilBrown



2007-09-28 16:17:32

by Tom Tucker

Subject: Re: [RFC,PATCH 22/33] svc: Move sk_lastrecv to svc_xprt

On Fri, 2007-09-28 at 14:25 +1000, Neil Brown wrote:
> On Thursday September 27, [email protected] wrote:
> >
> > This functionally trivial change moves the transport independent sk_lastrecv
> > field to the svc_xprt structure.
>
> It would seem that sk_lastrecv is entirely unused (Well, a dprintk
> prints it, but that isn't very interesting).

agreed.

> I think it used to be used to time out idle connections, but Greg's
> mark/sweep does a better job without needing this field. Shall we
> just remove it?

I think so. Greg?

>
> NeilBrown
>
>
> >
> > Signed-off-by: Tom Tucker <[email protected]>
> > ---
> >
> > include/linux/sunrpc/svc_xprt.h | 1 +
> > include/linux/sunrpc/svcsock.h | 1 -
> > net/sunrpc/svcsock.c | 6 +++---
> > 3 files changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
> > index 5b2aef4..edb7ad2 100644
> > --- a/include/linux/sunrpc/svc_xprt.h
> > +++ b/include/linux/sunrpc/svc_xprt.h
> > @@ -56,6 +56,7 @@ #define XPT_LISTENER 11 /* listening e
> > struct svc_serv * xpt_server; /* service for this transport */
> > atomic_t xpt_reserved; /* space on outq that is reserved */
> > struct mutex xpt_mutex; /* to serialize sending data */
> > + time_t xpt_lastrecv; /* time of last received request */
> > };
> >
> > int svc_reg_xprt_class(struct svc_xprt_class *);
> > diff --git a/include/linux/sunrpc/svcsock.h b/include/linux/sunrpc/svcsock.h
> > index 41c2dfa..406d003 100644
> > --- a/include/linux/sunrpc/svcsock.h
> > +++ b/include/linux/sunrpc/svcsock.h
> > @@ -33,7 +33,6 @@ struct svc_sock {
> > /* private TCP part */
> > int sk_reclen; /* length of record */
> > int sk_tcplen; /* current read length */
> > - time_t sk_lastrecv; /* time of last received request */
> >
> > /* cache of various info for TCP sockets */
> > void *sk_info_authunix;
> > diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> > index 71b7f86..04155aa 100644
> > --- a/net/sunrpc/svcsock.c
> > +++ b/net/sunrpc/svcsock.c
> > @@ -1622,7 +1622,7 @@ svc_recv(struct svc_rqst *rqstp, long ti
> > svc_sock_release(rqstp);
> > return -EAGAIN;
> > }
> > - svsk->sk_lastrecv = get_seconds();
> > + svsk->sk_xprt.xpt_lastrecv = get_seconds();
> > clear_bit(XPT_OLD, &svsk->sk_xprt.xpt_flags);
> >
> > rqstp->rq_secure = svc_port_is_privileged(svc_addr(rqstp));
> > @@ -1725,7 +1725,7 @@ svc_age_temp_sockets(unsigned long closu
> > svsk = list_entry(le, struct svc_sock, sk_xprt.xpt_list);
> >
> > dprintk("queuing svsk %p for closing, %lu seconds old\n",
> > - svsk, get_seconds() - svsk->sk_lastrecv);
> > + svsk, get_seconds() - svsk->sk_xprt.xpt_lastrecv);
> >
> > /* a thread will dequeue and close it soon */
> > svc_xprt_enqueue(&svsk->sk_xprt);
> > @@ -1773,7 +1773,7 @@ static struct svc_sock *svc_setup_socket
> > svsk->sk_ostate = inet->sk_state_change;
> > svsk->sk_odata = inet->sk_data_ready;
> > svsk->sk_owspace = inet->sk_write_space;
> > - svsk->sk_lastrecv = get_seconds();
> > + svsk->sk_xprt.xpt_lastrecv = get_seconds();
> > spin_lock_init(&svsk->sk_lock);
> > INIT_LIST_HEAD(&svsk->sk_deferred);
> >



2007-09-28 16:46:07

by Tom Tucker

Subject: Re: [RFC, PATCH 25/33] svc: Move the sockaddr information to svc_xprt

On Fri, 2007-09-28 at 14:36 +1000, Neil Brown wrote:
> On Thursday September 27, [email protected] wrote:
> >
> > Move the IP address fields to the svc_xprt structure. Note that this
> > assumes that _all_ RPC transports must have IP based 4-tuples. This
> > seems reasonable given the tight coupling with the portmapper etc...
> > Thoughts?
>
> I don't think NFSv4 requires portmapper (or rpcbind) ... does it?
>
> "Everything uses IP addresses" sounds a lot like "Everything is a
> socket". I would have supported the latter strongly until RDMA came
> along. Now I'm even less sure about the former.
>
> How much cost would there be in leaving the address in the
> per-transport data?

Very little. The original patchset had it in the per-transport data.

>
> NeilBrown



2007-09-28 17:35:13

by Tom Tucker

Subject: Re: [RFC, PATCH 33/33] knfsd: Support adding transports by writing portlist file

On Fri, 2007-09-28 at 14:48 +1000, Neil Brown wrote:
> On Thursday September 27, [email protected] wrote:
> >
> > Update the write handler for the portlist file to allow creating new
> > listening endpoints on a transport. The general form of the string is:
> >
> > <transport_name><space><port number>
> >
> > For example:
> >
> > tcp 2049
> >
> > This is intended to support the creation of a listening endpoint for
> > RDMA transports without adding #ifdef code to the nfssvc.c file.
> > The general idea is that the rpc.nfsd program would read the transports
> > file and then write the portlist file to create listening endpoints
> > for all or selected transports. The current mechanism of writing an
> > fd would become obsolete.
>
> Nuh.
> I'll only accept
> rdma 2049
> (or whatever) because there seems to be no other way to do it.
> Writing an 'fd' is the *preferred* way.
>
> There is more to binding an endpoint than protocol and port number.
> There is also local address and I'm not convinced that someone won't
> come up with some other way they want to pre-condition a socket.

Agreed.

Forgive me for deferring the it-should-be-socket question for a
paragraph or two, but...

This version was much less ambitious than what I wanted to do, which was
extend the syntax and consolidate listener creation.

The string would be something like:

addr-qualifier addr port xprt-string xprt-specific-ops

For example,
ipv4 0.0.0.0 2049 udp
ipv4 10.2.1.5 2049 tcp
ipv6 ff:ff:ff:ff:10.4.1.5 2049 tcp
ipv4 10.3.1.5 2050 rdma

The upside is that all listeners would be created in the same way. The
downside is that the string processing and endpoint creation are all
done in the kernel in the proc-fs handler.
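
To give a feel for the string processing involved, here is a userspace
sketch of parsing one such line. It is an illustration only: the
listener_spec structure, the field sizes, and the port range check are
assumptions, the address is not validated, and the optional
xprt-specific-ops field is ignored.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical parsed form of one portlist line. */
struct listener_spec {
	char family[8];		/* "ipv4" or "ipv6" */
	char addr[64];		/* textual address, not validated here */
	int port;
	char xprt[16];		/* transport class name, e.g. "tcp" */
};

/* Returns 0 on success, -1 if the line does not match the format. */
static int parse_listener(const char *line, struct listener_spec *out)
{
	if (sscanf(line, "%7s %63s %d %15s",
		   out->family, out->addr, &out->port, out->xprt) != 4)
		return -1;
	if (strcmp(out->family, "ipv4") != 0 &&
	    strcmp(out->family, "ipv6") != 0)
		return -1;
	if (out->port < 1 || out->port > 65535)
		return -1;
	return 0;
}
```

Even this small amount of parsing hints at the downside mentioned above:
address validation and per-transport options would add considerably
more of it to the kernel.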

> If there was any way to associate an RDMA endpoint with a
> filedescriptor, I would much prefer that 'rpc.nfsd' does that and passes
> down the filedescriptor.

With regard to associating a socket with an RDMA endpoint, IMO it is
technically feasible, but politically impossible at this point. Various
degrees of integration have been discussed on both the OpenFabrics and
netdev mailing lists. To summarize the viewpoints:

- Key "netdev/core people" do not want _any_ core changes to enable RDMA
under any circumstances. They believe iWARP(RDMA/TOE) and IB (RDMA) are
"point in time technologies" that are doomed to the bone pile and that
integrating into the core would complicate and destabilize the stack.

- Most "RDMA people" see no benefit in sockets because the I/O model is
so different, the sockets connection model is not asynchronous, and
there is no way to specify the RDMA transport connection and route
qualifiers.

- A few "RDMA people" like a sockets approach. It was actually
implemented as part of an early SDP IB implementation.

Obviously, there are as many variants of the above as there are dogs in
the fight, but this is my sense of the fundamental issues.

> If RDMA is so no-Unix-like (rant rant..)
> that there is no such file descriptor, then I guess we can live with
> getting the kernel to open the connection.
>

At this point, I believe this is the way to do it.

Thanks,

Tom
>
> >
> > Signed-off-by: Tom Tucker <[email protected]>
> > ---
> >
> > fs/nfsd/nfsctl.c | 16 ++++++++++++++++
> > 1 files changed, 16 insertions(+), 0 deletions(-)
> >
> > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > index baac89d..923b817 100644
> > --- a/fs/nfsd/nfsctl.c
> > +++ b/fs/nfsd/nfsctl.c
> > @@ -554,6 +554,22 @@ static ssize_t write_ports(struct file *
> > kfree(toclose);
> > return len;
> > }
> > + /*
> > + * Add a transport listener by writing it's transport name
> > + */
> > + if (isalnum(buf[0])) {
>
> Should really be "isalpha" as we already know it isn't isdigit.
>
> NeilBrown
>
>
> > + int err;
> > + char transport[16];
> > + int port;
> > + if (sscanf(buf, "%15s %4d", transport, &port) == 2) {
> > + err = nfsd_create_serv();
> > + if (!err)
> > + err = svc_create_xprt(nfsd_serv,
> > + transport, port,
> > + SVC_SOCK_ANONYMOUS);
> > + return err < 0 ? err : 0;
> > + }
> > + }
> > return -EINVAL;
> > }
> >



2007-09-28 17:40:26

by Tom Tucker

Subject: Re: [RFC,PATCH 00/33] SVC Transport Switch

On Fri, 2007-09-28 at 14:51 +1000, Neil Brown wrote:
> On Wednesday September 26, [email protected] wrote:
> >
> > I've attempted to organize the patchset such that logical changes are
> > clearly reviewable without too much clutter from functionally empty name
> > changes.
>
> And you did a very thorough job, thanks!
>

Thanks,

> Just a few minor issues as noted in previous emails. Most of them can
> be addressed by incremental patches rather than respinning the whole
> series. I'm just not sure about where the IP-address info should
> live. Maybe other people have opinions???
>

I've already redone the patchset for whitespace cleanup and to handle a
few checkpatch.pl style issues. How about if I roll in your suggestions,
repost the whole thing and then go to incremental after that?

Thoughts?
Tom
> NeilBrown



2007-09-27 05:01:37

by Tom Tucker

Subject: [RFC,PATCH 01/33] svc: Add an svc transport class


The transport class (svc_xprt_class) represents a type of transport, e.g.
udp, tcp, rdma. A transport class has a unique name and a set of transport
operations kept in the svc_xprt_ops structure.

A transport class can be dynamically registered and unregistered. The
svc_xprt_class represents the module that implements the transport
type and keeps reference counts on the module to avoid unloading while
there are active users.

The endpoint (svc_xprt) is a generic, transport independent endpoint that can
be used to send and receive data for an RPC service. It inherits its
operations from the transport class.

A transport driver module registers and unregisters itself with the sunrpc
server code by calling svc_reg_xprt_class and svc_unreg_xprt_class, respectively.
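
The lifecycle described above can be modelled in userspace roughly as
follows. This is a sketch, not the patch: the names drop the svc_
prefix on purpose, a singly linked list stands in for the kernel
list_head, there is no locking or module reference counting, and the
single xpo_sendto operation and the "demo" class are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct xprt_ops {
	int (*xpo_sendto)(const char *msg);
};

struct xprt_class {
	const char *xcl_name;
	struct xprt_ops *xcl_ops;
	struct xprt_class *next;	/* stand-in for the kernel list_head */
};

struct xprt {
	struct xprt_class *xpt_class;
	struct xprt_ops xpt_ops;	/* copied from the class */
};

static struct xprt_class *class_list;

/* Returns 0, or -1 if this class is already registered. */
static int reg_xprt_class(struct xprt_class *xcl)
{
	struct xprt_class *cl;

	for (cl = class_list; cl; cl = cl->next)
		if (cl == xcl)
			return -1;
	xcl->next = class_list;
	class_list = xcl;
	return 0;
}

/* Returns 0, or -1 if the class was not registered. */
static int unreg_xprt_class(struct xprt_class *xcl)
{
	struct xprt_class **pp;

	for (pp = &class_list; *pp; pp = &(*pp)->next) {
		if (*pp == xcl) {
			*pp = xcl->next;
			xcl->next = NULL;
			return 0;
		}
	}
	return -1;
}

/* The endpoint inherits its operations by copying the class's table. */
static void xprt_init(struct xprt_class *xcl, struct xprt *xpt)
{
	xpt->xpt_class = xcl;
	xpt->xpt_ops = *xcl->xcl_ops;
}

static int demo_sendto(const char *msg)
{
	return (int)strlen(msg);
}

static struct xprt_ops demo_ops = { .xpo_sendto = demo_sendto };
static struct xprt_class demo_class = {
	.xcl_name = "demo",
	.xcl_ops = &demo_ops,
};
```

A transport module would call the register function from its init
routine and the unregister function from its exit routine.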

Signed-off-by: Tom Tucker <[email protected]>
---

include/linux/sunrpc/debug.h | 1
include/linux/sunrpc/svc_xprt.h | 31 +++++++++++++
net/sunrpc/Makefile | 3 +
net/sunrpc/svc_xprt.c | 94 +++++++++++++++++++++++++++++++++++++++
4 files changed, 128 insertions(+), 1 deletions(-)

diff --git a/include/linux/sunrpc/debug.h b/include/linux/sunrpc/debug.h
index 3912cf1..092fcfa 100644
--- a/include/linux/sunrpc/debug.h
+++ b/include/linux/sunrpc/debug.h
@@ -21,6 +21,7 @@ #define RPCDBG_BIND 0x0020
#define RPCDBG_SCHED 0x0040
#define RPCDBG_TRANS 0x0080
#define RPCDBG_SVCSOCK 0x0100
+#define RPCDBG_SVCXPRT 0x0100
#define RPCDBG_SVCDSP 0x0200
#define RPCDBG_MISC 0x0400
#define RPCDBG_CACHE 0x0800
diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
new file mode 100644
index 0000000..a9a3afe
--- /dev/null
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -0,0 +1,31 @@
+/*
+ * linux/include/linux/sunrpc/svc_xprt.h
+ *
+ * RPC server transport I/O
+ */
+
+#ifndef SUNRPC_SVC_XPRT_H
+#define SUNRPC_SVC_XPRT_H
+
+#include <linux/sunrpc/svc.h>
+
+struct svc_xprt_ops {
+};
+
+struct svc_xprt_class {
+ const char *xcl_name;
+ struct module *xcl_owner;
+ struct svc_xprt_ops *xcl_ops;
+ struct list_head xcl_list;
+};
+
+struct svc_xprt {
+ struct svc_xprt_class *xpt_class;
+ struct svc_xprt_ops xpt_ops;
+};
+
+int svc_reg_xprt_class(struct svc_xprt_class *);
+int svc_unreg_xprt_class(struct svc_xprt_class *);
+void svc_xprt_init(struct svc_xprt_class *, struct svc_xprt *);
+
+#endif /* SUNRPC_SVC_XPRT_H */
diff --git a/net/sunrpc/Makefile b/net/sunrpc/Makefile
index 8ebfc4d..e37aa99 100644
--- a/net/sunrpc/Makefile
+++ b/net/sunrpc/Makefile
@@ -10,6 +10,7 @@ sunrpc-y := clnt.o xprt.o socklib.o xprt
auth.o auth_null.o auth_unix.o \
svc.o svcsock.o svcauth.o svcauth_unix.o \
rpcb_clnt.o timer.o xdr.o \
- sunrpc_syms.o cache.o rpc_pipe.o
+ sunrpc_syms.o cache.o rpc_pipe.o \
+ svc_xprt.o
sunrpc-$(CONFIG_PROC_FS) += stats.o
sunrpc-$(CONFIG_SYSCTL) += sysctl.o
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
new file mode 100644
index 0000000..f868192
--- /dev/null
+++ b/net/sunrpc/svc_xprt.c
@@ -0,0 +1,94 @@
+/*
+ * linux/net/sunrpc/svc_xprt.c
+ *
+ * Author: Tom Tucker <[email protected]>
+ */
+
+#include <linux/sched.h>
+#include <linux/errno.h>
+#include <linux/fcntl.h>
+#include <linux/net.h>
+#include <linux/in.h>
+#include <linux/inet.h>
+#include <linux/udp.h>
+#include <linux/tcp.h>
+#include <linux/unistd.h>
+#include <linux/slab.h>
+#include <linux/netdevice.h>
+#include <linux/skbuff.h>
+#include <linux/file.h>
+#include <linux/freezer.h>
+#include <net/sock.h>
+#include <net/checksum.h>
+#include <net/ip.h>
+#include <net/ipv6.h>
+#include <net/tcp_states.h>
+#include <asm/uaccess.h>
+#include <asm/ioctls.h>
+
+#include <linux/sunrpc/types.h>
+#include <linux/sunrpc/clnt.h>
+#include <linux/sunrpc/xdr.h>
+#include <linux/sunrpc/svcsock.h>
+#include <linux/sunrpc/stats.h>
+#include <linux/sunrpc/svc_xprt.h>
+
+#define RPCDBG_FACILITY RPCDBG_SVCXPRT
+
+/* List of registered transport classes */
+static spinlock_t svc_xprt_class_lock = SPIN_LOCK_UNLOCKED;
+static LIST_HEAD(svc_xprt_class_list);
+
+int svc_reg_xprt_class(struct svc_xprt_class *xcl)
+{
+ struct svc_xprt_class *cl;
+ int res = -EEXIST;
+
+ dprintk("svc: Adding svc transport class '%s'\n",
+ xcl->xcl_name);
+
+ INIT_LIST_HEAD(&xcl->xcl_list);
+ spin_lock(&svc_xprt_class_lock);
+ list_for_each_entry(cl, &svc_xprt_class_list, xcl_list) {
+ if (xcl == cl)
+ goto out;
+ }
+ list_add_tail(&xcl->xcl_list, &svc_xprt_class_list);
+ res = 0;
+out:
+ spin_unlock(&svc_xprt_class_lock);
+ return res;
+}
+EXPORT_SYMBOL_GPL(svc_reg_xprt_class);
+
+int svc_unreg_xprt_class(struct svc_xprt_class *xcl)
+{
+ struct svc_xprt_class *cl;
+ int res = 0;
+
+ dprintk("svc: Removing svc transport class '%s'\n", xcl->xcl_name);
+
+ spin_lock(&svc_xprt_class_lock);
+ list_for_each_entry(cl, &svc_xprt_class_list, xcl_list) {
+ if (xcl == cl) {
+ list_del_init(&cl->xcl_list);
+ goto out;
+ }
+ }
+ res = -ENOENT;
+ out:
+ spin_unlock(&svc_xprt_class_lock);
+ return res;
+}
+EXPORT_SYMBOL_GPL(svc_unreg_xprt_class);
+
+/*
+ * Called by transport drivers to initialize the transport independent
+ * portion of the transport instance.
+ */
+void svc_xprt_init(struct svc_xprt_class *xcl, struct svc_xprt *xpt)
+{
+ xpt->xpt_class = xcl;
+ xpt->xpt_ops = *xcl->xcl_ops;
+}
+EXPORT_SYMBOL_GPL(svc_xprt_init);


2007-09-27 05:01:40

by Tom Tucker

Subject: [RFC,PATCH 02/33] svc: Make svc_sock the tcp/udp transport


Make TCP and UDP svc_sock transports, and register them
with the svc transport core.

A transport type (svc_sock) has an svc_xprt as its first member,
and calls svc_xprt_init to initialize this field.

Signed-off-by: Tom Tucker <[email protected]>
---

include/linux/sunrpc/debug.h | 1 -
include/linux/sunrpc/svcsock.h | 4 ++++
net/sunrpc/sunrpc_syms.c | 4 +++-
net/sunrpc/svcsock.c | 33 ++++++++++++++++++++++++++++++++-
4 files changed, 39 insertions(+), 3 deletions(-)

diff --git a/include/linux/sunrpc/debug.h b/include/linux/sunrpc/debug.h
index 092fcfa..10709cb 100644
--- a/include/linux/sunrpc/debug.h
+++ b/include/linux/sunrpc/debug.h
@@ -20,7 +20,6 @@ #define RPCDBG_AUTH 0x0010
#define RPCDBG_BIND 0x0020
#define RPCDBG_SCHED 0x0040
#define RPCDBG_TRANS 0x0080
-#define RPCDBG_SVCSOCK 0x0100
#define RPCDBG_SVCXPRT 0x0100
#define RPCDBG_SVCDSP 0x0200
#define RPCDBG_MISC 0x0400
diff --git a/include/linux/sunrpc/svcsock.h b/include/linux/sunrpc/svcsock.h
index a53e0fa..1878cbe 100644
--- a/include/linux/sunrpc/svcsock.h
+++ b/include/linux/sunrpc/svcsock.h
@@ -10,11 +10,13 @@ #ifndef SUNRPC_SVCSOCK_H
#define SUNRPC_SVCSOCK_H

#include <linux/sunrpc/svc.h>
+#include <linux/sunrpc/svc_xprt.h>

/*
* RPC server socket.
*/
struct svc_sock {
+ struct svc_xprt sk_xprt;
struct list_head sk_ready; /* list of ready sockets */
struct list_head sk_list; /* list of all sockets */
struct socket * sk_sock; /* berkeley socket layer */
@@ -78,6 +80,8 @@ int svc_addsock(struct svc_serv *serv,
int fd,
char *name_return,
int *proto);
+void svc_init_xprt_sock(void);
+void svc_cleanup_xprt_sock(void);

/*
* svc_makesock socket characteristics
diff --git a/net/sunrpc/sunrpc_syms.c b/net/sunrpc/sunrpc_syms.c
index 384c4ad..a62ce47 100644
--- a/net/sunrpc/sunrpc_syms.c
+++ b/net/sunrpc/sunrpc_syms.c
@@ -151,7 +151,8 @@ #ifdef CONFIG_PROC_FS
#endif
cache_register(&ip_map_cache);
cache_register(&unix_gid_cache);
- init_socket_xprt();
+ svc_init_xprt_sock(); /* svc sock transport */
+ init_socket_xprt(); /* clnt sock transport */
rpcauth_init_module();
out:
return err;
@@ -162,6 +163,7 @@ cleanup_sunrpc(void)
{
rpcauth_remove_module();
cleanup_socket_xprt();
+ svc_cleanup_xprt_sock();
unregister_rpc_pipefs();
rpc_destroy_mempool();
if (cache_unregister(&ip_map_cache))
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 036ab52..2a56697 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -74,7 +74,7 @@ #include <linux/sunrpc/stats.h>
*
*/

-#define RPCDBG_FACILITY RPCDBG_SVCSOCK
+#define RPCDBG_FACILITY RPCDBG_SVCXPRT


static struct svc_sock *svc_setup_socket(struct svc_serv *, struct socket *,
@@ -899,12 +899,21 @@ svc_udp_sendto(struct svc_rqst *rqstp)
return error;
}

+static struct svc_xprt_ops svc_udp_ops = {
+};
+
+static struct svc_xprt_class svc_udp_class = {
+ .xcl_name = "udp",
+ .xcl_ops = &svc_udp_ops,
+};
+
static void
svc_udp_init(struct svc_sock *svsk)
{
int one = 1;
mm_segment_t oldfs;

+ svc_xprt_init(&svc_udp_class, &svsk->sk_xprt);
svsk->sk_sk->sk_data_ready = svc_udp_data_ready;
svsk->sk_sk->sk_write_space = svc_write_space;
svsk->sk_recvfrom = svc_udp_recvfrom;
@@ -1343,12 +1352,33 @@ svc_tcp_sendto(struct svc_rqst *rqstp)
return sent;
}

+static struct svc_xprt_ops svc_tcp_ops = {
+};
+
+static struct svc_xprt_class svc_tcp_class = {
+ .xcl_name = "tcp",
+ .xcl_ops = &svc_tcp_ops,
+};
+
+void svc_init_xprt_sock(void)
+{
+ svc_reg_xprt_class(&svc_tcp_class);
+ svc_reg_xprt_class(&svc_udp_class);
+}
+
+void svc_cleanup_xprt_sock(void)
+{
+ svc_unreg_xprt_class(&svc_tcp_class);
+ svc_unreg_xprt_class(&svc_udp_class);
+}
+
static void
svc_tcp_init(struct svc_sock *svsk)
{
struct sock *sk = svsk->sk_sk;
struct tcp_sock *tp = tcp_sk(sk);

+ svc_xprt_init(&svc_tcp_class, &svsk->sk_xprt);
svsk->sk_recvfrom = svc_tcp_recvfrom;
svsk->sk_sendto = svc_tcp_sendto;

@@ -1964,3 +1994,4 @@ static struct svc_deferred_req *svc_defe
spin_unlock(&svsk->sk_lock);
return dr;
}
+


2007-09-27 05:01:42

by Tom Tucker

Subject: [RFC, PATCH 03/33] svc: Change the svc_sock in the rqstp structure to a transport


The rqstp structure contains a pointer to the transport for the
RPC request. This functionally trivial patch adds an unnamed union
with pointers to both svc_sock and svc_xprt. Ultimately the
union will be removed and only the rq_xprt field will remain. This
allows incrementally extracting transport independent interfaces without
one gigundo patch.
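
The aliasing described above can be sketched in userspace C. This is only
an illustration under stated assumptions: the *_sketch types and the two
helper functions are hypothetical stand-ins, not the kernel structures,
and the sketch assumes the transport structure is embedded as the first
member of the socket structure so that both pointers refer to the same
address.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the transport independent structure. */
struct xprt_sketch {
	unsigned int max_payload;
};

/* Hypothetical stand-in for the socket based transport. */
struct sock_sketch {
	struct xprt_sketch sk_xprt;	/* assumed to be the first member */
	int sk_fd;
};

struct rqst_sketch {
	union {				/* unnamed union (C11) */
		struct xprt_sketch *rq_xprt;	/* transport view */
		struct sock_sketch *rq_sock;	/* socket view */
	};
};

/* Old code keeps storing the pointer through the socket field. */
static void rqst_attach(struct rqst_sketch *rq, struct sock_sketch *sk)
{
	rq->rq_sock = sk;
}

/* New code reads the same pointer through the transport field. */
static unsigned int rqst_max_payload(const struct rqst_sketch *rq)
{
	assert(rq->rq_xprt != NULL);
	return rq->rq_xprt->max_payload;
}
```

Because both union members occupy the same storage, callers can be
converted from rq_sock to rq_xprt one at a time.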

Signed-off-by: Tom Tucker <[email protected]>
---

include/linux/sunrpc/svc.h | 5 ++++-
1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index 8531a70..37f7448 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -204,7 +204,10 @@ union svc_addr_u {
struct svc_rqst {
struct list_head rq_list; /* idle list */
struct list_head rq_all; /* all threads list */
- struct svc_sock * rq_sock; /* socket */
+ union {
+ struct svc_xprt * rq_xprt; /* transport ptr */
+ struct svc_sock * rq_sock; /* socket ptr */
+ };
struct sockaddr_storage rq_addr; /* peer address */
size_t rq_addrlen;



2007-09-27 05:01:44

by Tom Tucker

Subject: [RFC,PATCH 07/33] svc: Add per-transport delete functions


Add transport specific xpo_detach and xpo_free functions. The xpo_detach
function causes the transport to stop delivering data-ready events
and enqueuing the transport for I/O.

The xpo_free function frees all resources associated with the particular
transport instance.
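
A minimal userspace sketch of the detach/free split follows. All names
are hypothetical, a plain int replaces the atomic reference count, and
the caller is assumed to drop its own reference after xprt_delete. It is
meant to show the ordering only: detach quiesces the transport at delete
time, while free runs when the last reference is dropped.

```c
#include <assert.h>
#include <stdlib.h>

struct xprt_sketch;

struct xprt_ops_sketch {
	void (*detach)(struct xprt_sketch *);	/* stop event delivery */
	void (*free)(struct xprt_sketch *);	/* release all resources */
};

struct xprt_sketch {
	const struct xprt_ops_sketch *ops;
	int refcount;		/* plain int; the kernel uses an atomic */
	int dead;		/* stands in for the SK_DEAD flag */
	int callbacks_active;	/* stands in for installed socket callbacks */
};

static int freed_count;	/* lets a caller observe that free ran */

static void sock_detach(struct xprt_sketch *x)
{
	x->callbacks_active = 0;
}

static void sock_free(struct xprt_sketch *x)
{
	freed_count++;
	free(x);
}

static const struct xprt_ops_sketch sock_ops = {
	.detach = sock_detach,
	.free = sock_free,
};

static struct xprt_sketch *xprt_create(void)
{
	struct xprt_sketch *x = calloc(1, sizeof(*x));

	if (!x)
		return NULL;
	x->ops = &sock_ops;
	x->refcount = 1;
	x->callbacks_active = 1;
	return x;
}

/* Transport independent delete: quiesce first, then mark dead. */
static void xprt_delete(struct xprt_sketch *x)
{
	x->ops->detach(x);
	x->dead = 1;
}

/* Transport independent put: the last reference triggers free. */
static void xprt_put(struct xprt_sketch *x)
{
	assert(x->refcount > 0);
	if (--x->refcount == 0) {
		assert(x->dead);	/* mirrors the BUG_ON in the patch */
		x->ops->free(x);
	}
}
```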

Signed-off-by: Tom Tucker <[email protected]>
---

include/linux/sunrpc/svc_xprt.h | 2 +
net/sunrpc/svcsock.c | 58 ++++++++++++++++++++++++++++++---------
2 files changed, 47 insertions(+), 13 deletions(-)

diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index 5871faa..85d84b2 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -13,6 +13,8 @@ struct svc_xprt_ops {
int (*xpo_recvfrom)(struct svc_rqst *);
int (*xpo_sendto)(struct svc_rqst *);
void (*xpo_release)(struct svc_rqst *);
+ void (*xpo_detach)(struct svc_xprt *);
+ void (*xpo_free)(struct svc_xprt *);
};

struct svc_xprt_class {
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index e7d203a..0db0a26 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -84,6 +84,8 @@ static void svc_udp_data_ready(struct s
static int svc_udp_recvfrom(struct svc_rqst *);
static int svc_udp_sendto(struct svc_rqst *);
static void svc_close_socket(struct svc_sock *svsk);
+static void svc_sock_detach(struct svc_xprt *);
+static void svc_sock_free(struct svc_xprt *);

static struct svc_deferred_req *svc_deferred_dequeue(struct svc_sock *svsk);
static int svc_deferred_recv(struct svc_rqst *rqstp);
@@ -376,16 +378,8 @@ static inline void
svc_sock_put(struct svc_sock *svsk)
{
if (atomic_dec_and_test(&svsk->sk_inuse)) {
- BUG_ON(! test_bit(SK_DEAD, &svsk->sk_flags));
-
- dprintk("svc: releasing dead socket\n");
- if (svsk->sk_sock->file)
- sockfd_put(svsk->sk_sock);
- else
- sock_release(svsk->sk_sock);
- if (svsk->sk_info_authunix != NULL)
- svcauth_unix_info_release(svsk->sk_info_authunix);
- kfree(svsk);
+ BUG_ON(!test_bit(SK_DEAD, &svsk->sk_flags));
+ svsk->sk_xprt.xpt_ops.xpo_free(&svsk->sk_xprt);
}
}

@@ -903,6 +897,8 @@ static struct svc_xprt_ops svc_udp_ops =
.xpo_recvfrom = svc_udp_recvfrom,
.xpo_sendto = svc_udp_sendto,
.xpo_release = svc_release_skb,
+ .xpo_detach = svc_sock_detach,
+ .xpo_free = svc_sock_free,
};

static struct svc_xprt_class svc_udp_class = {
@@ -1358,6 +1354,8 @@ static struct svc_xprt_ops svc_tcp_ops =
.xpo_recvfrom = svc_tcp_recvfrom,
.xpo_sendto = svc_tcp_sendto,
.xpo_release = svc_release_skb,
+ .xpo_detach = svc_sock_detach,
+ .xpo_free = svc_sock_free,
};

static struct svc_xprt_class svc_tcp_class = {
@@ -1815,6 +1813,42 @@ bummer:
}

/*
+ * Detach the svc_sock from the socket so that no
+ * more callbacks occur.
+ */
+static void
+svc_sock_detach(struct svc_xprt *xprt)
+{
+ struct svc_sock *svsk = (struct svc_sock *)xprt;
+ struct sock *sk = svsk->sk_sk;
+
+ dprintk("svc: svc_sock_detach(%p)\n", svsk);
+
+ /* put back the old socket callbacks */
+ sk->sk_state_change = svsk->sk_ostate;
+ sk->sk_data_ready = svsk->sk_odata;
+ sk->sk_write_space = svsk->sk_owspace;
+}
+
+/*
+ * Free the svc_sock's socket resources and the svc_sock itself.
+ */
+static void
+svc_sock_free(struct svc_xprt *xprt)
+{
+ struct svc_sock *svsk = (struct svc_sock *)xprt;
+ dprintk("svc: svc_sock_free(%p)\n", svsk);
+
+ if (svsk->sk_info_authunix != NULL)
+ svcauth_unix_info_release(svsk->sk_info_authunix);
+ if (svsk->sk_sock->file)
+ sockfd_put(svsk->sk_sock);
+ else
+ sock_release(svsk->sk_sock);
+ kfree(svsk);
+}
+
+/*
* Remove a dead socket
*/
static void
@@ -1828,9 +1862,7 @@ svc_delete_socket(struct svc_sock *svsk)
serv = svsk->sk_server;
sk = svsk->sk_sk;

- sk->sk_state_change = svsk->sk_ostate;
- sk->sk_data_ready = svsk->sk_odata;
- sk->sk_write_space = svsk->sk_owspace;
+ svsk->sk_xprt.xpt_ops.xpo_detach(&svsk->sk_xprt);

spin_lock_bh(&serv->sv_lock);



2007-09-27 05:01:46

by Tom Tucker

Subject: [RFC, PATCH 05/33] svc: Move sk_sendto and sk_recvfrom to svc_xprt_class


The sk_sendto and sk_recvfrom fields are function pointers that allow svc_sock
to be used for both UDP and TCP. Move these function pointers to the
svc_xprt_ops structure.

Signed-off-by: Tom Tucker <[email protected]>
---

include/linux/sunrpc/svc_xprt.h | 2 ++
include/linux/sunrpc/svcsock.h | 3 ---
net/sunrpc/svcsock.c | 12 ++++++------
3 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index 827f0fe..f0ba052 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -10,6 +10,8 @@ #define SUNRPC_SVC_XPRT_H
#include <linux/sunrpc/svc.h>

struct svc_xprt_ops {
+ int (*xpo_recvfrom)(struct svc_rqst *);
+ int (*xpo_sendto)(struct svc_rqst *);
};

struct svc_xprt_class {
diff --git a/include/linux/sunrpc/svcsock.h b/include/linux/sunrpc/svcsock.h
index 1878cbe..08e78d0 100644
--- a/include/linux/sunrpc/svcsock.h
+++ b/include/linux/sunrpc/svcsock.h
@@ -45,9 +45,6 @@ #define SK_DETACHED 10 /* detached fro
* be revisted */
struct mutex sk_mutex; /* to serialize sending data */

- int (*sk_recvfrom)(struct svc_rqst *rqstp);
- int (*sk_sendto)(struct svc_rqst *rqstp);
-
/* We keep the old state_change and data_ready CB's here */
void (*sk_ostate)(struct sock *);
void (*sk_odata)(struct sock *, int bytes);
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index dd5c6fb..cc8c7ce 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -900,6 +900,8 @@ svc_udp_sendto(struct svc_rqst *rqstp)
}

static struct svc_xprt_ops svc_udp_ops = {
+ .xpo_recvfrom = svc_udp_recvfrom,
+ .xpo_sendto = svc_udp_sendto,
};

static struct svc_xprt_class svc_udp_class = {
@@ -917,8 +919,6 @@ svc_udp_init(struct svc_sock *svsk)
svc_xprt_init(&svc_udp_class, &svsk->sk_xprt);
svsk->sk_sk->sk_data_ready = svc_udp_data_ready;
svsk->sk_sk->sk_write_space = svc_write_space;
- svsk->sk_recvfrom = svc_udp_recvfrom;
- svsk->sk_sendto = svc_udp_sendto;

/* initialise setting must have enough space to
* receive and respond to one request.
@@ -1354,6 +1354,8 @@ svc_tcp_sendto(struct svc_rqst *rqstp)
}

static struct svc_xprt_ops svc_tcp_ops = {
+ .xpo_recvfrom = svc_tcp_recvfrom,
+ .xpo_sendto = svc_tcp_sendto,
};

static struct svc_xprt_class svc_tcp_class = {
@@ -1381,8 +1383,6 @@ svc_tcp_init(struct svc_sock *svsk)
struct tcp_sock *tp = tcp_sk(sk);

svc_xprt_init(&svc_tcp_class, &svsk->sk_xprt);
- svsk->sk_recvfrom = svc_tcp_recvfrom;
- svsk->sk_sendto = svc_tcp_sendto;

if (sk->sk_state == TCP_LISTEN) {
dprintk("setting up TCP socket for listening\n");
@@ -1530,7 +1530,7 @@ svc_recv(struct svc_rqst *rqstp, long ti

dprintk("svc: server %p, pool %u, socket %p, inuse=%d\n",
rqstp, pool->sp_id, svsk, atomic_read(&svsk->sk_inuse));
- len = svsk->sk_recvfrom(rqstp);
+ len = svsk->sk_xprt.xpt_ops.xpo_recvfrom(rqstp);
dprintk("svc: got len=%d\n", len);

/* No data, incomplete (TCP) read, or accept() */
@@ -1590,7 +1590,7 @@ svc_send(struct svc_rqst *rqstp)
if (test_bit(SK_DEAD, &svsk->sk_flags))
len = -ENOTCONN;
else
- len = svsk->sk_sendto(rqstp);
+ len = svsk->sk_xprt.xpt_ops.xpo_sendto(rqstp);
mutex_unlock(&svsk->sk_mutex);
svc_sock_release(rqstp);



2007-09-27 05:01:46

by Tom Tucker

Subject: [RFC,PATCH 08/33] svc: Add xpo_prep_reply_hdr


Some transports add fields to the RPC header for replies, e.g. the TCP
record length. The xpo_prep_reply_hdr function is called when preparing the reply header
to allow each transport to add whatever fields it requires.
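
As an illustration of what such a hook does, the userspace sketch below
reserves a 4-byte placeholder for the stream case and nothing for the
datagram case, then fills the placeholder in once the reply length is
known. The names and the fixed-size buffer are hypothetical; the marker
layout (top bit set for the last fragment, low 31 bits holding the length
that follows the marker) is standard ONC RPC record marking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct reply_sketch {
	unsigned char buf[64];
	size_t len;
};

/* Store a 32-bit value in network byte order at p. */
static void store_be32(unsigned char *p, uint32_t v)
{
	p[0] = (unsigned char)(v >> 24);
	p[1] = (unsigned char)(v >> 16);
	p[2] = (unsigned char)(v >> 8);
	p[3] = (unsigned char)v;
}

/* Append a 32-bit value to the reply. */
static void put_be32(struct reply_sketch *r, uint32_t v)
{
	assert(r->len + 4 <= sizeof(r->buf));
	store_be32(r->buf + r->len, v);
	r->len += 4;
}

/* Datagram transport: no extra header fields. */
static void udp_prep_reply_hdr(struct reply_sketch *r)
{
	(void)r;
}

/* Stream transport: leave room for the record marker. */
static void tcp_prep_reply_hdr(struct reply_sketch *r)
{
	put_be32(r, 0);
}

/* Transport independent start of a reply. */
static void start_reply(struct reply_sketch *r,
			void (*prep)(struct reply_sketch *), uint32_t xid)
{
	r->len = 0;
	prep(r);
	put_be32(r, xid);
}

/* Stream send path: overwrite the placeholder with the real marker. */
static void tcp_finish_record(struct reply_sketch *r)
{
	assert(r->len >= 4);
	store_be32(r->buf, 0x80000000u | (uint32_t)(r->len - 4));
}
```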

Signed-off-by: Tom Tucker <[email protected]>
---

include/linux/sunrpc/svc_xprt.h | 1 +
net/sunrpc/svc.c | 6 +++---
net/sunrpc/svcsock.c | 19 +++++++++++++++++++
3 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index 85d84b2..1cd86fe 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -11,6 +11,7 @@ #include <linux/sunrpc/svc.h>

struct svc_xprt_ops {
int (*xpo_recvfrom)(struct svc_rqst *);
+ void (*xpo_prep_reply_hdr)(struct svc_rqst *);
int (*xpo_sendto)(struct svc_rqst *);
void (*xpo_release)(struct svc_rqst *);
void (*xpo_detach)(struct svc_xprt *);
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 2a4b3c6..ee68117 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -815,9 +815,9 @@ svc_process(struct svc_rqst *rqstp)
rqstp->rq_res.tail[0].iov_len = 0;
/* Will be turned off only in gss privacy case: */
rqstp->rq_splice_ok = 1;
- /* tcp needs a space for the record length... */
- if (rqstp->rq_prot == IPPROTO_TCP)
- svc_putnl(resv, 0);
+
+ /* Setup reply header */
+ rqstp->rq_xprt->xpt_ops.xpo_prep_reply_hdr(rqstp);

rqstp->rq_xid = svc_getu32(argv);
svc_putu32(resv, rqstp->rq_xid);
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 0db0a26..99f5faf 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -893,12 +893,18 @@ svc_udp_sendto(struct svc_rqst *rqstp)
return error;
}

+static void
+svc_udp_prep_reply_hdr(struct svc_rqst *rqstp)
+{
+}
+
static struct svc_xprt_ops svc_udp_ops = {
.xpo_recvfrom = svc_udp_recvfrom,
.xpo_sendto = svc_udp_sendto,
.xpo_release = svc_release_skb,
.xpo_detach = svc_sock_detach,
.xpo_free = svc_sock_free,
+ .xpo_prep_reply_hdr = svc_udp_prep_reply_hdr,
};

static struct svc_xprt_class svc_udp_class = {
@@ -1350,12 +1356,25 @@ svc_tcp_sendto(struct svc_rqst *rqstp)
return sent;
}

+/*
+ * Setup response header. TCP has a 4B record length field.
+ */
+static void
+svc_tcp_prep_reply_hdr(struct svc_rqst *rqstp)
+{
+ struct kvec *resv = &rqstp->rq_res.head[0];
+
+ /* tcp needs a space for the record length... */
+ svc_putnl(resv, 0);
+}
+
static struct svc_xprt_ops svc_tcp_ops = {
.xpo_recvfrom = svc_tcp_recvfrom,
.xpo_sendto = svc_tcp_sendto,
.xpo_release = svc_release_skb,
.xpo_detach = svc_sock_detach,
.xpo_free = svc_sock_free,
+ .xpo_prep_reply_hdr = svc_tcp_prep_reply_hdr,
};

static struct svc_xprt_class svc_tcp_class = {


2007-09-27 05:01:43

by Tom Tucker

Subject: [RFC, PATCH 04/33] svc: Add a max payload value to the transport


The svc_max_payload function currently looks at the socket type
to determine the max payload. Add a max payload value to svc_xprt_class
so it can be returned directly.
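
A small userspace sketch of the lookup, assuming hypothetical *_sketch
names and illustrative limits rather than the kernel's
RPCSVC_MAXPAYLOAD_* constants: the class carries the limit, the instance
copies it at init time, and the generic code clamps it to the server's
own limit without inspecting the socket type.

```c
#include <assert.h>
#include <stddef.h>

struct xprt_class_sketch {
	const char *name;
	unsigned int max_payload;	/* per-class limit in bytes */
};

struct xprt_sketch {
	const struct xprt_class_sketch *cls;
	unsigned int max_payload;	/* copied from the class at init */
};

/* Illustrative limits only; not taken from the kernel headers. */
static const struct xprt_class_sketch dgram_class = {
	.name = "dgram",
	.max_payload = 32u * 1024u,
};

static const struct xprt_class_sketch stream_class = {
	.name = "stream",
	.max_payload = 1024u * 1024u,
};

static void xprt_init(const struct xprt_class_sketch *cls, struct xprt_sketch *x)
{
	assert(cls != NULL);
	x->cls = cls;
	x->max_payload = cls->max_payload;
}

/* Smaller of the transport limit and the server-wide limit. */
static unsigned int max_payload(const struct xprt_sketch *x,
				unsigned int server_limit)
{
	return x->max_payload < server_limit ? x->max_payload : server_limit;
}
```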

Signed-off-by: Tom Tucker <[email protected]>
---

include/linux/sunrpc/svc_xprt.h | 2 ++
net/sunrpc/svc.c | 4 +---
net/sunrpc/svc_xprt.c | 1 +
net/sunrpc/svcsock.c | 2 ++
4 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index a9a3afe..827f0fe 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -17,11 +17,13 @@ struct svc_xprt_class {
struct module *xcl_owner;
struct svc_xprt_ops *xcl_ops;
struct list_head xcl_list;
+ u32 xcl_max_payload;
};

struct svc_xprt {
struct svc_xprt_class *xpt_class;
struct svc_xprt_ops xpt_ops;
+ u32 xpt_max_payload;
};

int svc_reg_xprt_class(struct svc_xprt_class *);
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 55ea6df..2a4b3c6 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -1034,10 +1034,8 @@ err_bad:
*/
u32 svc_max_payload(const struct svc_rqst *rqstp)
{
- int max = RPCSVC_MAXPAYLOAD_TCP;
+ int max = rqstp->rq_xprt->xpt_max_payload;

- if (rqstp->rq_sock->sk_sock->type == SOCK_DGRAM)
- max = RPCSVC_MAXPAYLOAD_UDP;
if (rqstp->rq_server->sv_max_payload < max)
max = rqstp->rq_server->sv_max_payload;
return max;
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index f868192..6151db5 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -90,5 +90,6 @@ void svc_xprt_init(struct svc_xprt_class
{
xpt->xpt_class = xcl;
xpt->xpt_ops = *xcl->xcl_ops;
+ xpt->xpt_max_payload = xcl->xcl_max_payload;
}
EXPORT_SYMBOL_GPL(svc_xprt_init);
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 2a56697..dd5c6fb 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -905,6 +905,7 @@ static struct svc_xprt_ops svc_udp_ops =
static struct svc_xprt_class svc_udp_class = {
.xcl_name = "udp",
.xcl_ops = &svc_udp_ops,
+ .xcl_max_payload = RPCSVC_MAXPAYLOAD_UDP,
};

static void
@@ -1358,6 +1359,7 @@ static struct svc_xprt_ops svc_tcp_ops =
static struct svc_xprt_class svc_tcp_class = {
.xcl_name = "tcp",
.xcl_ops = &svc_tcp_ops,
+ .xcl_max_payload = RPCSVC_MAXPAYLOAD_TCP,
};

void svc_init_xprt_sock(void)


2007-09-27 05:01:48

by Tom Tucker

Subject: [RFC, PATCH 06/33] svc: Add transport specific xpo_release function


The svc_sock_release function releases pages allocated to a thread. For
UDP, this also returns the receive skb to the stack. For RDMA it will
post a receive WR and bump the client credit count.
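
The opaque context pointer can be sketched in userspace as follows. The
names are hypothetical and a malloc'd buffer stands in for the receive
skb. The point of the sketch is that release clears the pointer before
freeing, so it is safe for both the send path and the final release path
to call it.

```c
#include <assert.h>
#include <stdlib.h>

struct rqst_sketch {
	void *xprt_ctxt;	/* transport specific context, may be NULL */
};

static int released_buffers;	/* lets a caller count real releases */

/* Datagram receive: remember the buffer that backs the request. */
static int dgram_recv(struct rqst_sketch *rq)
{
	assert(rq->xprt_ctxt == NULL);	/* previous buffer must be released */
	rq->xprt_ctxt = malloc(128);
	return rq->xprt_ctxt ? 128 : -1;
}

/* Stream receive: data was copied out, so there is nothing to hold. */
static int stream_recv(struct rqst_sketch *rq)
{
	rq->xprt_ctxt = NULL;
	return 128;
}

/* Release hook: idempotent, a second call finds nothing to free. */
static void buffer_release(struct rqst_sketch *rq)
{
	void *buf = rq->xprt_ctxt;

	if (buf) {
		rq->xprt_ctxt = NULL;
		free(buf);
		released_buffers++;
	}
}
```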

Signed-off-by: Tom Tucker <[email protected]>
---

include/linux/sunrpc/svc.h | 2 +-
include/linux/sunrpc/svc_xprt.h | 1 +
net/sunrpc/svcsock.c | 16 +++++++++-------
3 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index 37f7448..cfb2652 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -217,7 +217,7 @@ struct svc_rqst {
struct auth_ops * rq_authop; /* authentication flavour */
u32 rq_flavor; /* pseudoflavor */
struct svc_cred rq_cred; /* auth info */
- struct sk_buff * rq_skbuff; /* fast recv inet buffer */
+ void * rq_xprt_ctxt; /* transport specific context ptr */
struct svc_deferred_req*rq_deferred; /* deferred request we are replaying */

struct xdr_buf rq_arg;
diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index f0ba052..5871faa 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -12,6 +12,7 @@ #include <linux/sunrpc/svc.h>
struct svc_xprt_ops {
int (*xpo_recvfrom)(struct svc_rqst *);
int (*xpo_sendto)(struct svc_rqst *);
+ void (*xpo_release)(struct svc_rqst *);
};

struct svc_xprt_class {
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index cc8c7ce..e7d203a 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -184,14 +184,14 @@ svc_thread_dequeue(struct svc_pool *pool
/*
* Release an skbuff after use
*/
-static inline void
+static void
svc_release_skb(struct svc_rqst *rqstp)
{
- struct sk_buff *skb = rqstp->rq_skbuff;
+ struct sk_buff *skb = (struct sk_buff *)rqstp->rq_xprt_ctxt;
struct svc_deferred_req *dr = rqstp->rq_deferred;

if (skb) {
- rqstp->rq_skbuff = NULL;
+ rqstp->rq_xprt_ctxt = NULL;

dprintk("svc: service %p, releasing skb %p\n", rqstp, skb);
skb_free_datagram(rqstp->rq_sock->sk_sk, skb);
@@ -394,7 +394,7 @@ svc_sock_release(struct svc_rqst *rqstp)
{
struct svc_sock *svsk = rqstp->rq_sock;

- svc_release_skb(rqstp);
+ rqstp->rq_xprt->xpt_ops.xpo_release(rqstp);

svc_free_res_pages(rqstp);
rqstp->rq_res.page_len = 0;
@@ -866,7 +866,7 @@ svc_udp_recvfrom(struct svc_rqst *rqstp)
skb_free_datagram(svsk->sk_sk, skb);
return 0;
}
- rqstp->rq_skbuff = skb;
+ rqstp->rq_xprt_ctxt = skb;
}

rqstp->rq_arg.page_base = 0;
@@ -902,6 +902,7 @@ svc_udp_sendto(struct svc_rqst *rqstp)
static struct svc_xprt_ops svc_udp_ops = {
.xpo_recvfrom = svc_udp_recvfrom,
.xpo_sendto = svc_udp_sendto,
+ .xpo_release = svc_release_skb,
};

static struct svc_xprt_class svc_udp_class = {
@@ -1290,7 +1291,7 @@ svc_tcp_recvfrom(struct svc_rqst *rqstp)
rqstp->rq_arg.page_len = len - rqstp->rq_arg.head[0].iov_len;
}

- rqstp->rq_skbuff = NULL;
+ rqstp->rq_xprt_ctxt = NULL;
rqstp->rq_prot = IPPROTO_TCP;

/* Reset TCP read info */
@@ -1356,6 +1357,7 @@ svc_tcp_sendto(struct svc_rqst *rqstp)
static struct svc_xprt_ops svc_tcp_ops = {
.xpo_recvfrom = svc_tcp_recvfrom,
.xpo_sendto = svc_tcp_sendto,
+ .xpo_release = svc_release_skb,
};

static struct svc_xprt_class svc_tcp_class = {
@@ -1577,7 +1579,7 @@ svc_send(struct svc_rqst *rqstp)
}

/* release the receive skb before sending the reply */
- svc_release_skb(rqstp);
+ rqstp->rq_xprt->xpt_ops.xpo_release(rqstp);

/* calculate over-all length */
xb = & rqstp->rq_res;


2007-09-27 05:01:56

by Tom Tucker

Subject: [RFC, PATCH 09/33] svc: Add a transport function that checks for write space


In order to avoid blocking a service thread, the receive side checks
to see if there is sufficient write space to reply to the request.
Each transport has a different mechanism for determining if there is
enough write space to reply.

The code that checked for write space was coupled with code that
checked for CLOSE and CONN. These checks have been broken out into
separate statements to make the code easier to read.
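
The separated checks can be sketched in userspace as below. The names
and flag values are hypothetical, and the available write space is passed
in as a plain field instead of being queried from a socket. The predicate
keeps the same rule as the patch: twice the reserved space plus one
maximum-sized message must fit in the available write space.

```c
#include <assert.h>
#include <stdbool.h>

enum {
	FLAG_CONN  = 1 << 0,	/* pending connection */
	FLAG_CLOSE = 1 << 1,	/* close in progress */
};

struct xprt_sketch {
	unsigned int flags;
	long reserved;	/* bytes already promised to replies */
	long max_mesg;	/* largest single reply */
	long wspace;	/* bytes currently available for writing */
};

/* Per-transport predicate: is there room for one more reply? */
static bool has_wspace(const struct xprt_sketch *x)
{
	long required = x->reserved + x->max_mesg;

	assert(x->reserved >= 0 && x->max_mesg >= 0);
	return required * 2 <= x->wspace;
}

/* Generic decision: connect and close bypass the write space check. */
static bool should_enqueue(const struct xprt_sketch *x)
{
	if (x->flags & FLAG_CONN)
		return true;
	if (x->flags & FLAG_CLOSE)
		return true;
	return has_wspace(x);
}
```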

Signed-off-by: Tom Tucker <[email protected]>
---

include/linux/sunrpc/svc_xprt.h | 1 +
net/sunrpc/svcsock.c | 62 +++++++++++++++++++++++++++++++++------
2 files changed, 53 insertions(+), 10 deletions(-)

diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index 1cd86fe..47bedfa 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -10,6 +10,7 @@ #define SUNRPC_SVC_XPRT_H
#include <linux/sunrpc/svc.h>

struct svc_xprt_ops {
+ int (*xpo_has_wspace)(struct svc_xprt *);
int (*xpo_recvfrom)(struct svc_rqst *);
void (*xpo_prep_reply_hdr)(struct svc_rqst *);
int (*xpo_sendto)(struct svc_rqst *);
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 99f5faf..1df9933 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -269,22 +269,24 @@ svc_sock_enqueue(struct svc_sock *svsk)
BUG_ON(svsk->sk_pool != NULL);
svsk->sk_pool = pool;

- set_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
- if (((atomic_read(&svsk->sk_reserved) + serv->sv_max_mesg)*2
- > svc_sock_wspace(svsk))
- && !test_bit(SK_CLOSE, &svsk->sk_flags)
- && !test_bit(SK_CONN, &svsk->sk_flags)) {
+ /* Handle pending connection */
+ if (test_bit(SK_CONN, &svsk->sk_flags))
+ goto process;
+
+ /* Handle close in-progress */
+ if (test_bit(SK_CLOSE, &svsk->sk_flags))
+ goto process;
+
+ /* Check if we have space to reply to a request */
+ if (!svsk->sk_xprt.xpt_ops.xpo_has_wspace(&svsk->sk_xprt)) {
/* Don't enqueue while not enough space for reply */
- dprintk("svc: socket %p no space, %d*2 > %ld, not enqueued\n",
- svsk->sk_sk, atomic_read(&svsk->sk_reserved)+serv->sv_max_mesg,
- svc_sock_wspace(svsk));
+ dprintk("svc: no write space, socket %p not enqueued\n", svsk);
svsk->sk_pool = NULL;
clear_bit(SK_BUSY, &svsk->sk_flags);
goto out_unlock;
}
- clear_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
-

+ process:
if (!list_empty(&pool->sp_threads)) {
rqstp = list_entry(pool->sp_threads.next,
struct svc_rqst,
@@ -898,6 +900,25 @@ svc_udp_prep_reply_hdr(struct svc_rqst *
{
}

+static int
+svc_udp_has_wspace(struct svc_xprt *xprt)
+{
+ struct svc_sock *svsk = (struct svc_sock*)xprt;
+ struct svc_serv *serv = svsk->sk_server;
+ int required;
+
+ /*
+ * Set the SOCK_NOSPACE flag before checking the available
+ * sock space.
+ */
+ set_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
+ required = atomic_read(&svsk->sk_reserved) + serv->sv_max_mesg;
+ if (required*2 > sock_wspace(svsk->sk_sk))
+ return 0;
+ clear_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
+ return 1;
+}
+
static struct svc_xprt_ops svc_udp_ops = {
.xpo_recvfrom = svc_udp_recvfrom,
.xpo_sendto = svc_udp_sendto,
@@ -905,6 +926,7 @@ static struct svc_xprt_ops svc_udp_ops =
.xpo_detach = svc_sock_detach,
.xpo_free = svc_sock_free,
.xpo_prep_reply_hdr = svc_udp_prep_reply_hdr,
+ .xpo_has_wspace = svc_udp_has_wspace,
};

static struct svc_xprt_class svc_udp_class = {
@@ -1368,6 +1390,25 @@ svc_tcp_prep_reply_hdr(struct svc_rqst *
svc_putnl(resv, 0);
}

+static int
+svc_tcp_has_wspace(struct svc_xprt *xprt)
+{
+ struct svc_sock *svsk = (struct svc_sock*)xprt;
+ struct svc_serv *serv = svsk->sk_server;
+ int required;
+
+ /*
+ * Set the SOCK_NOSPACE flag before checking the available
+ * sock space.
+ */
+ set_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
+ required = atomic_read(&svsk->sk_reserved) + serv->sv_max_mesg;
+ if (required*2 > sk_stream_wspace(svsk->sk_sk))
+ return 0;
+ clear_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
+ return 1;
+}
+
static struct svc_xprt_ops svc_tcp_ops = {
.xpo_recvfrom = svc_tcp_recvfrom,
.xpo_sendto = svc_tcp_sendto,
@@ -1375,6 +1416,7 @@ static struct svc_xprt_ops svc_tcp_ops =
.xpo_detach = svc_sock_detach,
.xpo_free = svc_sock_free,
.xpo_prep_reply_hdr = svc_tcp_prep_reply_hdr,
+ .xpo_has_wspace = svc_tcp_has_wspace,
};

static struct svc_xprt_class svc_tcp_class = {


2007-09-27 05:01:56

by Tom Tucker

Subject: [RFC, PATCH 10/33] svc: Move close processing to a single place


Close handling was duplicated in the UDP and TCP recvfrom
methods. This code has been moved to the transport independent
svc_recv function.
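
In outline, the consolidated check looks like the userspace sketch below.
The names are hypothetical and a flag field replaces the real deletion
path. The close test runs once in the generic receive routine, so the
per-transport receive methods never see a closing transport.

```c
#include <assert.h>
#include <stddef.h>

enum { FLAG_CLOSE = 1 << 0 };

struct xprt_sketch {
	unsigned int flags;
	int deleted;	/* stands in for the real delete path */
	int (*recvfrom)(struct xprt_sketch *);
};

/* Placeholder receive method that must never run on a closing transport. */
static int fake_recvfrom(struct xprt_sketch *x)
{
	assert(!(x->flags & FLAG_CLOSE));
	return 100;
}

/* Generic receive: handle close here, otherwise ask the transport. */
static int generic_recv(struct xprt_sketch *x)
{
	int len = 0;

	assert(x->recvfrom != NULL);
	if (x->flags & FLAG_CLOSE)
		x->deleted = 1;
	else
		len = x->recvfrom(x);
	return len;
}
```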

Signed-off-by: Tom Tucker <[email protected]>
---

net/sunrpc/svcsock.c | 24 ++++++++++--------------
1 files changed, 10 insertions(+), 14 deletions(-)

diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 1df9933..93c5fb6 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -792,11 +792,6 @@ svc_udp_recvfrom(struct svc_rqst *rqstp)
return svc_deferred_recv(rqstp);
}

- if (test_bit(SK_CLOSE, &svsk->sk_flags)) {
- svc_delete_socket(svsk);
- return 0;
- }
-
clear_bit(SK_DATA, &svsk->sk_flags);
skb = NULL;
err = kernel_recvmsg(svsk->sk_sock, &msg, NULL,
@@ -1199,11 +1194,6 @@ svc_tcp_recvfrom(struct svc_rqst *rqstp)
return svc_deferred_recv(rqstp);
}

- if (test_bit(SK_CLOSE, &svsk->sk_flags)) {
- svc_delete_socket(svsk);
- return 0;
- }
-
if (svsk->sk_sk->sk_state == TCP_LISTEN) {
svc_tcp_accept(svsk);
svc_sock_received(svsk);
@@ -1589,10 +1579,16 @@ svc_recv(struct svc_rqst *rqstp, long ti
}
spin_unlock_bh(&pool->sp_lock);

- dprintk("svc: server %p, pool %u, socket %p, inuse=%d\n",
- rqstp, pool->sp_id, svsk, atomic_read(&svsk->sk_inuse));
- len = svsk->sk_xprt.xpt_ops.xpo_recvfrom(rqstp);
- dprintk("svc: got len=%d\n", len);
+ len = 0;
+ if (test_bit(SK_CLOSE, &svsk->sk_flags)) {
+ dprintk("svc_recv: found SK_CLOSE\n");
+ svc_delete_socket(svsk);
+ } else {
+ dprintk("svc: server %p, pool %u, socket %p, inuse=%d\n",
+ rqstp, pool->sp_id, svsk, atomic_read(&svsk->sk_inuse));
+ len = svsk->sk_xprt.xpt_ops.xpo_recvfrom(rqstp);
+ dprintk("svc: got len=%d\n", len);
+ }

/* No data, incomplete (TCP) read, or accept() */
if (len == 0 || len == -EAGAIN) {


2007-09-27 05:01:58

by Tom Tucker

Subject: [RFC,PATCH 11/33] svc: Add xpo_accept transport function


Previously, the accept logic looked into the socket state to determine
whether to call accept or recv when data-ready was indicated on an endpoint.
Since some transports don't use sockets, this logic was changed to use a flag
bit (SK_LISTENER) to identify listening endpoints. A transport function
(xpo_accept) was added to allow each transport to define its own accept
processing. A transport's initialization logic is responsible for setting the
SK_LISTENER bit. I didn't see any way to do this in transport independent
logic since the passive side of a UDP connection doesn't listen and
always recv's.

In the svc_recv function, if the SK_LISTENER bit is set, the transport
xpo_accept function is called to handle accept processing.

Note that all functions are defined even if they don't make sense
for a given transport. For example, accept doesn't mean anything for
UDP. The function is defined anyway and bug checks if called. The
UDP transport should never set the SK_LISTENER bit.

The code that poaches connections when the connection
limit is hit was moved to a subroutine to make the accept logic path
easier to follow. Since this is in the new connection path, it should
not be a performance issue.
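
The flag-driven dispatch can be sketched in userspace as below. The
names are hypothetical, an int field replaces the SK_LISTENER bit, and
calloc stands in for creating the new transport. The datagram accept
method aborts if it is ever reached, as the UDP method in this patch does
with BUG().

```c
#include <assert.h>
#include <stdlib.h>

struct xprt_sketch;

struct xprt_ops_sketch {
	struct xprt_sketch *(*accept)(struct xprt_sketch *);
	int (*recvfrom)(struct xprt_sketch *);
};

struct xprt_sketch {
	const struct xprt_ops_sketch *ops;
	int listener;	/* stands in for the SK_LISTENER flag bit */
};

static int stream_recvfrom(struct xprt_sketch *x) { (void)x; return 64; }
static int dgram_recvfrom(struct xprt_sketch *x) { (void)x; return 32; }

/* Stream accept: hand back a new, non-listening transport. */
static struct xprt_sketch *stream_accept(struct xprt_sketch *listener)
{
	struct xprt_sketch *child = calloc(1, sizeof(*child));

	if (child)
		child->ops = listener->ops;
	return child;
}

/* Datagram accept: defined for completeness, must never be called. */
static struct xprt_sketch *dgram_accept(struct xprt_sketch *x)
{
	(void)x;
	abort();
	return NULL;
}

static const struct xprt_ops_sketch stream_ops = {
	.accept = stream_accept,
	.recvfrom = stream_recvfrom,
};

static const struct xprt_ops_sketch dgram_ops = {
	.accept = dgram_accept,
	.recvfrom = dgram_recvfrom,
};

/* Generic ready handler: the flag decides, not the socket state. */
static int handle_ready(struct xprt_sketch *x, struct xprt_sketch **newp)
{
	assert(x->ops != NULL);
	*newp = NULL;
	if (x->listener) {
		*newp = x->ops->accept(x);
		return 0;
	}
	return x->ops->recvfrom(x);
}
```

Only the transport's own setup code sets the listener field, which is why
a datagram transport in this sketch never reaches dgram_accept.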

Signed-off-by: Tom Tucker <[email protected]>
---

include/linux/sunrpc/svc_xprt.h | 1
include/linux/sunrpc/svcsock.h | 1
net/sunrpc/svcsock.c | 130 ++++++++++++++++++++++-----------------
3 files changed, 75 insertions(+), 57 deletions(-)

diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index 47bedfa..4c1a650 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -10,6 +10,7 @@ #define SUNRPC_SVC_XPRT_H
#include <linux/sunrpc/svc.h>

struct svc_xprt_ops {
+ struct svc_xprt *(*xpo_accept)(struct svc_xprt *);
int (*xpo_has_wspace)(struct svc_xprt *);
int (*xpo_recvfrom)(struct svc_rqst *);
void (*xpo_prep_reply_hdr)(struct svc_rqst *);
diff --git a/include/linux/sunrpc/svcsock.h b/include/linux/sunrpc/svcsock.h
index 08e78d0..9882ce0 100644
--- a/include/linux/sunrpc/svcsock.h
+++ b/include/linux/sunrpc/svcsock.h
@@ -36,6 +36,7 @@ #define SK_CHNGBUF 7 /* need to change
#define SK_DEFERRED 8 /* request on sk_deferred */
#define SK_OLD 9 /* used for temp socket aging mark+sweep */
#define SK_DETACHED 10 /* detached from tempsocks list */
+#define SK_LISTENER 11 /* listening endpoint */

atomic_t sk_reserved; /* space on outq that is reserved */

diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 93c5fb6..19f7bbc 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -914,6 +914,13 @@ svc_udp_has_wspace(struct svc_xprt *xprt
return 1;
}

+static struct svc_xprt *
+svc_udp_accept(struct svc_xprt *xprt)
+{
+ BUG();
+ return NULL;
+}
+
static struct svc_xprt_ops svc_udp_ops = {
.xpo_recvfrom = svc_udp_recvfrom,
.xpo_sendto = svc_udp_sendto,
@@ -922,6 +929,7 @@ static struct svc_xprt_ops svc_udp_ops =
.xpo_free = svc_sock_free,
.xpo_prep_reply_hdr = svc_udp_prep_reply_hdr,
.xpo_has_wspace = svc_udp_has_wspace,
+ .xpo_accept = svc_udp_accept,
};

static struct svc_xprt_class svc_udp_class = {
@@ -1046,9 +1054,10 @@ static inline int svc_port_is_privileged
/*
* Accept a TCP connection
*/
-static void
-svc_tcp_accept(struct svc_sock *svsk)
+static struct svc_xprt *
+svc_tcp_accept(struct svc_xprt *xprt)
{
+ struct svc_sock *svsk = (struct svc_sock *)xprt;
struct sockaddr_storage addr;
struct sockaddr *sin = (struct sockaddr *) &addr;
struct svc_serv *serv = svsk->sk_server;
@@ -1060,7 +1069,7 @@ svc_tcp_accept(struct svc_sock *svsk)

dprintk("svc: tcp_accept %p sock %p\n", svsk, sock);
if (!sock)
- return;
+ return NULL;

clear_bit(SK_CONN, &svsk->sk_flags);
err = kernel_accept(sock, &newsock, O_NONBLOCK);
@@ -1071,7 +1080,7 @@ svc_tcp_accept(struct svc_sock *svsk)
else if (err != -EAGAIN && net_ratelimit())
printk(KERN_WARNING "%s: accept failed (err %d)!\n",
serv->sv_name, -err);
- return;
+ return NULL;
}

set_bit(SK_CONN, &svsk->sk_flags);
@@ -1117,59 +1126,14 @@ svc_tcp_accept(struct svc_sock *svsk)

svc_sock_received(newsvsk);

- /* make sure that we don't have too many active connections.
- * If we have, something must be dropped.
- *
- * There's no point in trying to do random drop here for
- * DoS prevention. The NFS clients does 1 reconnect in 15
- * seconds. An attacker can easily beat that.
- *
- * The only somewhat efficient mechanism would be if drop
- * old connections from the same IP first. But right now
- * we don't even record the client IP in svc_sock.
- */
- if (serv->sv_tmpcnt > (serv->sv_nrthreads+3)*20) {
- struct svc_sock *svsk = NULL;
- spin_lock_bh(&serv->sv_lock);
- if (!list_empty(&serv->sv_tempsocks)) {
- if (net_ratelimit()) {
- /* Try to help the admin */
- printk(KERN_NOTICE "%s: too many open TCP "
- "sockets, consider increasing the "
- "number of nfsd threads\n",
- serv->sv_name);
- printk(KERN_NOTICE
- "%s: last TCP connect from %s\n",
- serv->sv_name, __svc_print_addr(sin,
- buf, sizeof(buf)));
- }
- /*
- * Always select the oldest socket. It's not fair,
- * but so is life
- */
- svsk = list_entry(serv->sv_tempsocks.prev,
- struct svc_sock,
- sk_list);
- set_bit(SK_CLOSE, &svsk->sk_flags);
- atomic_inc(&svsk->sk_inuse);
- }
- spin_unlock_bh(&serv->sv_lock);
-
- if (svsk) {
- svc_sock_enqueue(svsk);
- svc_sock_put(svsk);
- }
-
- }
-
if (serv->sv_stats)
serv->sv_stats->nettcpconn++;

- return;
+ return &newsvsk->sk_xprt;

failed:
sock_release(newsock);
- return;
+ return NULL;
}

/*
@@ -1194,12 +1158,6 @@ svc_tcp_recvfrom(struct svc_rqst *rqstp)
return svc_deferred_recv(rqstp);
}

- if (svsk->sk_sk->sk_state == TCP_LISTEN) {
- svc_tcp_accept(svsk);
- svc_sock_received(svsk);
- return 0;
- }
-
if (test_and_clear_bit(SK_CHNGBUF, &svsk->sk_flags))
/* sndbuf needs to have room for one request
* per thread, otherwise we can stall even when the
@@ -1407,6 +1365,7 @@ static struct svc_xprt_ops svc_tcp_ops =
.xpo_free = svc_sock_free,
.xpo_prep_reply_hdr = svc_tcp_prep_reply_hdr,
.xpo_has_wspace = svc_tcp_has_wspace,
+ .xpo_accept = svc_tcp_accept,
};

static struct svc_xprt_class svc_tcp_class = {
@@ -1488,6 +1447,55 @@ svc_sock_update_bufs(struct svc_serv *se
spin_unlock_bh(&serv->sv_lock);
}

+static void
+svc_check_conn_limits(struct svc_serv *serv)
+{
+ char buf[RPC_MAX_ADDRBUFLEN];
+
+ /* make sure that we don't have too many active connections.
+ * If we have, something must be dropped.
+ *
+ * There's no point in trying to do random drop here for
+ * DoS prevention. The NFS clients does 1 reconnect in 15
+ * seconds. An attacker can easily beat that.
+ *
+ * The only somewhat efficient mechanism would be if drop
+ * old connections from the same IP first. But right now
+ * we don't even record the client IP in svc_sock.
+ */
+ if (serv->sv_tmpcnt > (serv->sv_nrthreads+3)*20) {
+ struct svc_sock *svsk = NULL;
+ spin_lock_bh(&serv->sv_lock);
+ if (!list_empty(&serv->sv_tempsocks)) {
+ if (net_ratelimit()) {
+ /* Try to help the admin */
+ printk(KERN_NOTICE "%s: too many open TCP "
+ "sockets, consider increasing the "
+ "number of nfsd threads\n",
+ serv->sv_name);
+ printk(KERN_NOTICE
+ "%s: last TCP connect from %s\n",
+ serv->sv_name, buf);
+ }
+ /*
+ * Always select the oldest socket. It's not fair,
+ * but so is life
+ */
+ svsk = list_entry(serv->sv_tempsocks.prev,
+ struct svc_sock,
+ sk_list);
+ set_bit(SK_CLOSE, &svsk->sk_flags);
+ atomic_inc(&svsk->sk_inuse);
+ }
+ spin_unlock_bh(&serv->sv_lock);
+
+ if (svsk) {
+ svc_sock_enqueue(svsk);
+ svc_sock_put(svsk);
+ }
+ }
+}
+
/*
* Receive the next request on any socket. This code is carefully
* organised not to touch any cachelines in the shared svc_serv
@@ -1583,6 +1591,12 @@ svc_recv(struct svc_rqst *rqstp, long ti
if (test_bit(SK_CLOSE, &svsk->sk_flags)) {
dprintk("svc_recv: found SK_CLOSE\n");
svc_delete_socket(svsk);
+ } else if (test_bit(SK_LISTENER, &svsk->sk_flags)) {
+ struct svc_xprt *newxpt;
+ newxpt = svsk->sk_xprt.xpt_ops.xpo_accept(&svsk->sk_xprt);
+ if (newxpt)
+ svc_check_conn_limits(svsk->sk_server);
+ svc_sock_received(svsk);
} else {
dprintk("svc: server %p, pool %u, socket %p, inuse=%d\n",
rqstp, pool->sp_id, svsk, atomic_read(&svsk->sk_inuse));
@@ -1859,6 +1873,8 @@ static int svc_create_socket(struct svc_
}

if ((svsk = svc_setup_socket(serv, sock, &error, flags)) != NULL) {
+ if (protocol == IPPROTO_TCP)
+ set_bit(SK_LISTENER, &svsk->sk_flags);
svc_sock_received(svsk);
return ntohs(inet_sk(svsk->sk_sk)->sport);
}


2007-09-27 05:02:00

by Tom Tucker

Subject: [RFC, PATCH 12/33] svc: Add a generic transport svc_create_xprt function


The svc_create_xprt function is a transport-independent version
of the svc_makesock function.

Since transport instance creation contains transport-dependent and
transport-independent components, add an xpo_create transport function. The
transport implementation of this function allocates the memory for the
endpoint, implements the transport-dependent initialization logic, and
calls svc_xprt_init to initialize the transport-independent field (svc_xprt)
in its data structure.
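
The lookup-and-delegate pattern described above can be sketched in plain
userspace C. This is only an illustration under stated assumptions: every
demo_* name is hypothetical, the class table is a fixed array rather than
the locked svc_xprt_class_list, and module reference counting is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel's types. */
struct demo_xprt {
	const char *class_name;
	unsigned short port;
};

struct demo_xprt_class {
	const char *name;
	/* Transport-dependent constructor: 0 on success, -errno on failure. */
	int (*create)(struct demo_xprt *out, unsigned short port);
};

static int demo_tcp_create(struct demo_xprt *out, unsigned short port)
{
	out->class_name = "tcp";
	out->port = port;
	return 0;
}

static int demo_udp_create(struct demo_xprt *out, unsigned short port)
{
	out->class_name = "udp";
	out->port = port;
	return 0;
}

/* Fixed table standing in for a list of registered transport classes. */
static const struct demo_xprt_class demo_classes[] = {
	{ "tcp", demo_tcp_create },
	{ "udp", demo_udp_create },
};

/*
 * Transport-independent entry point: find the class by name and hand
 * the transport-dependent work to its constructor.
 */
static int demo_create_xprt(const char *name, unsigned short port,
			    struct demo_xprt *out)
{
	size_t i;

	assert(name != NULL && out != NULL);
	for (i = 0; i < sizeof(demo_classes) / sizeof(demo_classes[0]); i++) {
		if (strcmp(name, demo_classes[i].name) == 0)
			return demo_classes[i].create(out, port);
	}
	return -ENOENT;	/* no transport class with that name */
}
```

As in the patch, an unknown transport name yields -ENOENT, and a
constructor failure is passed back to the caller unchanged.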

Signed-off-by: Tom Tucker <[email protected]>
---

include/linux/sunrpc/svc_xprt.h | 4 +++
net/sunrpc/svc_xprt.c | 38 ++++++++++++++++++++++++++
net/sunrpc/svcsock.c | 58 +++++++++++++++++++++++++++++----------
3 files changed, 85 insertions(+), 15 deletions(-)

diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index 4c1a650..6a34bb4 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -10,6 +10,9 @@ #define SUNRPC_SVC_XPRT_H
#include <linux/sunrpc/svc.h>

struct svc_xprt_ops {
+ struct svc_xprt *(*xpo_create)(struct svc_serv *,
+ struct sockaddr *,
+ int);
struct svc_xprt *(*xpo_accept)(struct svc_xprt *);
int (*xpo_has_wspace)(struct svc_xprt *);
int (*xpo_recvfrom)(struct svc_rqst *);
@@ -37,5 +40,6 @@ struct svc_xprt {
int svc_reg_xprt_class(struct svc_xprt_class *);
int svc_unreg_xprt_class(struct svc_xprt_class *);
void svc_xprt_init(struct svc_xprt_class *, struct svc_xprt *);
+int svc_create_xprt(struct svc_serv *, char *, unsigned short, int);

#endif /* SUNRPC_SVC_XPRT_H */
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 6151db5..fab0ce3 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -93,3 +93,41 @@ void svc_xprt_init(struct svc_xprt_class
xpt->xpt_max_payload = xcl->xcl_max_payload;
}
EXPORT_SYMBOL_GPL(svc_xprt_init);
+
+int svc_create_xprt(struct svc_serv *serv, char *xprt_name, unsigned short port,
+ int flags)
+{
+ int ret = -ENOENT;
+ struct list_head *le;
+ struct sockaddr_in sin = {
+ .sin_family = AF_INET,
+ .sin_addr.s_addr = INADDR_ANY,
+ .sin_port = htons(port),
+ };
+ dprintk("svc: creating transport %s[%d]\n", xprt_name, port);
+ spin_lock(&svc_xprt_class_lock);
+ list_for_each(le, &svc_xprt_class_list) {
+ struct svc_xprt_class *xcl =
+ list_entry(le, struct svc_xprt_class, xcl_list);
+ if (strcmp(xprt_name, xcl->xcl_name)==0) {
+ spin_unlock(&svc_xprt_class_lock);
+ if (try_module_get(xcl->xcl_owner)) {
+ struct svc_xprt *newxprt;
+ ret = 0;
+ newxprt = xcl->xcl_ops->xpo_create
+ (serv, (struct sockaddr*)&sin, flags);
+ if (IS_ERR(newxprt)) {
+ module_put(xcl->xcl_owner);
+ ret = PTR_ERR(newxprt);
+ }
+ goto out;
+ }
+ }
+ }
+ spin_unlock(&svc_xprt_class_lock);
+ dprintk("svc: transport %s not found\n", xprt_name);
+ out:
+ return ret;
+}
+EXPORT_SYMBOL_GPL(svc_create_xprt);
+
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 19f7bbc..d37f773 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -90,6 +90,8 @@ static void svc_sock_free(struct svc_xp
static struct svc_deferred_req *svc_deferred_dequeue(struct svc_sock *svsk);
static int svc_deferred_recv(struct svc_rqst *rqstp);
static struct cache_deferred_req *svc_defer(struct cache_req *req);
+static struct svc_xprt *
+svc_create_socket(struct svc_serv *, int, struct sockaddr *, int, int);

/* apparently the "standard" is that clients close
* idle connections after 5 minutes, servers after
@@ -381,6 +383,7 @@ svc_sock_put(struct svc_sock *svsk)
{
if (atomic_dec_and_test(&svsk->sk_inuse)) {
BUG_ON(!test_bit(SK_DEAD, &svsk->sk_flags));
+ module_put(svsk->sk_xprt.xpt_class->xcl_owner);
svsk->sk_xprt.xpt_ops.xpo_free(&svsk->sk_xprt);
}
}
@@ -921,7 +924,15 @@ svc_udp_accept(struct svc_xprt *xprt)
return NULL;
}

+static struct svc_xprt *
+svc_udp_create(struct svc_serv *serv, struct sockaddr *sa, int flags)
+{
+ return svc_create_socket(serv, IPPROTO_UDP, sa,
+ sizeof(struct sockaddr_in), flags);
+}
+
static struct svc_xprt_ops svc_udp_ops = {
+ .xpo_create = svc_udp_create,
.xpo_recvfrom = svc_udp_recvfrom,
.xpo_sendto = svc_udp_sendto,
.xpo_release = svc_release_skb,
@@ -934,6 +945,7 @@ static struct svc_xprt_ops svc_udp_ops =

static struct svc_xprt_class svc_udp_class = {
.xcl_name = "udp",
+ .xcl_owner = THIS_MODULE,
.xcl_ops = &svc_udp_ops,
.xcl_max_payload = RPCSVC_MAXPAYLOAD_UDP,
};
@@ -1357,7 +1369,15 @@ svc_tcp_has_wspace(struct svc_xprt *xprt
return 1;
}

+static struct svc_xprt *
+svc_tcp_create(struct svc_serv *serv, struct sockaddr *sa, int flags)
+{
+ return svc_create_socket(serv, IPPROTO_TCP, sa,
+ sizeof(struct sockaddr_in), flags);
+}
+
static struct svc_xprt_ops svc_tcp_ops = {
+ .xpo_create = svc_tcp_create,
.xpo_recvfrom = svc_tcp_recvfrom,
.xpo_sendto = svc_tcp_sendto,
.xpo_release = svc_release_skb,
@@ -1370,6 +1390,7 @@ static struct svc_xprt_ops svc_tcp_ops =

static struct svc_xprt_class svc_tcp_class = {
.xcl_name = "tcp",
+ .xcl_owner = THIS_MODULE,
.xcl_ops = &svc_tcp_ops,
.xcl_max_payload = RPCSVC_MAXPAYLOAD_TCP,
};
@@ -1594,8 +1615,14 @@ svc_recv(struct svc_rqst *rqstp, long ti
} else if (test_bit(SK_LISTENER, &svsk->sk_flags)) {
struct svc_xprt *newxpt;
newxpt = svsk->sk_xprt.xpt_ops.xpo_accept(&svsk->sk_xprt);
- if (newxpt)
+ if (newxpt) {
+ /*
+ * We know this module_get will succeed because the
+ * listener holds a reference too
+ */
+ __module_get(newxpt->xpt_class->xcl_owner);
svc_check_conn_limits(svsk->sk_server);
+ }
svc_sock_received(svsk);
} else {
dprintk("svc: server %p, pool %u, socket %p, inuse=%d\n",
@@ -1835,8 +1862,9 @@ EXPORT_SYMBOL_GPL(svc_addsock);
/*
* Create socket for RPC service.
*/
-static int svc_create_socket(struct svc_serv *serv, int protocol,
- struct sockaddr *sin, int len, int flags)
+static struct svc_xprt *
+svc_create_socket(struct svc_serv *serv, int protocol,
+ struct sockaddr *sin, int len, int flags)
{
struct svc_sock *svsk;
struct socket *sock;
@@ -1851,13 +1879,13 @@ static int svc_create_socket(struct svc_
if (protocol != IPPROTO_UDP && protocol != IPPROTO_TCP) {
printk(KERN_WARNING "svc: only UDP and TCP "
"sockets supported\n");
- return -EINVAL;
+ return ERR_PTR(-EINVAL);
}
type = (protocol == IPPROTO_UDP)? SOCK_DGRAM : SOCK_STREAM;

error = sock_create_kern(sin->sa_family, type, protocol, &sock);
if (error < 0)
- return error;
+ return ERR_PTR(error);

svc_reclassify_socket(sock);

@@ -1876,13 +1904,13 @@ static int svc_create_socket(struct svc_
if (protocol == IPPROTO_TCP)
set_bit(SK_LISTENER, &svsk->sk_flags);
svc_sock_received(svsk);
- return ntohs(inet_sk(svsk->sk_sk)->sport);
+ return (struct svc_xprt *)svsk;
}

bummer:
dprintk("svc: svc_create_socket error = %d\n", -error);
sock_release(sock);
- return error;
+ return ERR_PTR(error);
}

/*
@@ -1995,15 +2023,15 @@ void svc_force_close_socket(struct svc_s
int svc_makesock(struct svc_serv *serv, int protocol, unsigned short port,
int flags)
{
- struct sockaddr_in sin = {
- .sin_family = AF_INET,
- .sin_addr.s_addr = INADDR_ANY,
- .sin_port = htons(port),
- };
-
dprintk("svc: creating socket proto = %d\n", protocol);
- return svc_create_socket(serv, protocol, (struct sockaddr *) &sin,
- sizeof(sin), flags);
+ switch (protocol) {
+ case IPPROTO_TCP:
+ return svc_create_xprt(serv, "tcp", port, flags);
+ case IPPROTO_UDP:
+ return svc_create_xprt(serv, "udp", port, flags);
+ default:
+ return -EINVAL;
+ }
}

/*


2007-09-27 05:02:02

by Tom Tucker

Subject: [RFC, PATCH 15/33] svc: Move sk_flags to the svc_xprt structure


This functionally trivial change moves the transport-independent sk_flags
field to the transport-independent svc_xprt structure as xpt_flags and
renames the SK_* flag bits to XPT_*.
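
The flag word holds bit numbers rather than masks, so a "does this
transport have work" check builds a mask from several bit numbers. A
minimal userspace sketch follows, assuming non-atomic helpers and
hypothetical demo_* names (the kernel uses atomic set_bit/test_bit on
xpt_flags). The bit numbers mirror the XPT_* values in this patch.

```c
#include <assert.h>
#include <limits.h>

/* Bit numbers copied from the XPT_* defines; DEMO_ names are illustrative. */
enum {
	DEMO_XPT_BUSY     = 0,	/* enqueued/receiving */
	DEMO_XPT_CONN     = 1,	/* conn pending */
	DEMO_XPT_CLOSE    = 2,	/* dead or dying */
	DEMO_XPT_DATA     = 3,	/* data pending */
	DEMO_XPT_DEFERRED = 8,	/* deferred request pending */
};

/* Non-atomic stand-ins for the kernel bit operations (an assumption). */
static void demo_set_bit(int nr, unsigned long *word)
{
	assert(nr >= 0 && nr < (int)(sizeof(*word) * CHAR_BIT));
	*word |= 1UL << nr;
}

static void demo_clear_bit(int nr, unsigned long *word)
{
	assert(nr >= 0 && nr < (int)(sizeof(*word) * CHAR_BIT));
	*word &= ~(1UL << nr);
}

static int demo_test_bit(int nr, const unsigned long *word)
{
	return (int)((*word >> nr) & 1UL);
}

/* Same mask test as the early return at the top of svc_sock_enqueue. */
static int demo_has_work(const unsigned long *flags)
{
	const unsigned long mask = (1UL << DEMO_XPT_CONN) |
				   (1UL << DEMO_XPT_DATA) |
				   (1UL << DEMO_XPT_CLOSE) |
				   (1UL << DEMO_XPT_DEFERRED);

	return (*flags & mask) != 0;
}
```

Note that the busy bit alone does not count as pending work; it only
guards against enqueuing the same transport twice.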

Signed-off-by: Tom Tucker <[email protected]>
---

include/linux/sunrpc/svc_xprt.h | 12 +++
include/linux/sunrpc/svcsock.h | 13 ----
net/sunrpc/svcsock.c | 144 ++++++++++++++++++++-------------------
3 files changed, 84 insertions(+), 85 deletions(-)

diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index c9196bc..935726e 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -37,6 +37,18 @@ struct svc_xprt {
struct svc_xprt_ops xpt_ops;
u32 xpt_max_payload;
struct kref xpt_ref;
+ unsigned long xpt_flags;
+#define XPT_BUSY 0 /* enqueued/receiving */
+#define XPT_CONN 1 /* conn pending */
+#define XPT_CLOSE 2 /* dead or dying */
+#define XPT_DATA 3 /* data pending */
+#define XPT_TEMP 4 /* connected transport */
+#define XPT_DEAD 6 /* transport closed */
+#define XPT_CHNGBUF 7 /* need to change snd/rcv buffer sizes */
+#define XPT_DEFERRED 8 /* deferred request pending */
+#define XPT_OLD 9 /* used for transport aging mark+sweep */
+#define XPT_DETACHED 10 /* detached from tempsocks list */
+#define XPT_LISTENER 11 /* listening endpoint */
};

int svc_reg_xprt_class(struct svc_xprt_class *);
diff --git a/include/linux/sunrpc/svcsock.h b/include/linux/sunrpc/svcsock.h
index ba07d50..b8a8496 100644
--- a/include/linux/sunrpc/svcsock.h
+++ b/include/linux/sunrpc/svcsock.h
@@ -24,19 +24,6 @@ struct svc_sock {

struct svc_pool * sk_pool; /* current pool iff queued */
struct svc_serv * sk_server; /* service for this socket */
- unsigned long sk_flags;
-#define SK_BUSY 0 /* enqueued/receiving */
-#define SK_CONN 1 /* conn pending */
-#define SK_CLOSE 2 /* dead or dying */
-#define SK_DATA 3 /* data pending */
-#define SK_TEMP 4 /* temp (TCP) socket */
-#define SK_DEAD 6 /* socket closed */
-#define SK_CHNGBUF 7 /* need to change snd/rcv buffer sizes */
-#define SK_DEFERRED 8 /* request on sk_deferred */
-#define SK_OLD 9 /* used for temp socket aging mark+sweep */
-#define SK_DETACHED 10 /* detached from tempsocks list */
-#define SK_LISTENER 11 /* listening endpoint */
-
atomic_t sk_reserved; /* space on outq that is reserved */

spinlock_t sk_lock; /* protects sk_deferred and
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index a92b000..c6b521d 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -55,22 +55,22 @@ #include <linux/sunrpc/stats.h>
* BKL protects svc_serv->sv_nrthread.
* svc_sock->sk_lock protects the svc_sock->sk_deferred list
* and the ->sk_info_authunix cache.
- * svc_sock->sk_flags.SK_BUSY prevents a svc_sock being enqueued multiply.
+ * svc_sock->sk_xprt.xpt_flags.XPT_BUSY prevents a svc_sock being enqueued multiply.
*
* Some flags can be set to certain values at any time
* providing that certain rules are followed:
*
- * SK_CONN, SK_DATA, can be set or cleared at any time.
+ * XPT_CONN, XPT_DATA, can be set or cleared at any time.
* after a set, svc_sock_enqueue must be called.
* after a clear, the socket must be read/accepted
* if this succeeds, it must be set again.
- * SK_CLOSE can set at any time. It is never cleared.
- * xpt_ref contains a bias of '1' until SK_DEAD is set.
+ * XPT_CLOSE can set at any time. It is never cleared.
+ * xpt_ref contains a bias of '1' until XPT_DEAD is set.
* so when xprt_ref hits zero, we know the transport is dead
* and no-one is using it.
- * SK_DEAD can only be set while SK_BUSY is held which ensures
+ * XPT_DEAD can only be set while XPT_BUSY is held which ensures
* no other thread will be using the socket or will try to
- * set SK_DEAD.
+ * set XPT_DEAD.
*
*/

@@ -235,10 +235,10 @@ svc_sock_enqueue(struct svc_sock *svsk)
struct svc_rqst *rqstp;
int cpu;

- if (!(svsk->sk_flags &
- ( (1<<SK_CONN)|(1<<SK_DATA)|(1<<SK_CLOSE)|(1<<SK_DEFERRED)) ))
+ if (!(svsk->sk_xprt.xpt_flags &
+ ( (1<<XPT_CONN)|(1<<XPT_DATA)|(1<<XPT_CLOSE)|(1<<XPT_DEFERRED)) ))
return;
- if (test_bit(SK_DEAD, &svsk->sk_flags))
+ if (test_bit(XPT_DEAD, &svsk->sk_xprt.xpt_flags))
return;

cpu = get_cpu();
@@ -252,7 +252,7 @@ svc_sock_enqueue(struct svc_sock *svsk)
printk(KERN_ERR
"svc_sock_enqueue: threads and sockets both waiting??\n");

- if (test_bit(SK_DEAD, &svsk->sk_flags)) {
+ if (test_bit(XPT_DEAD, &svsk->sk_xprt.xpt_flags)) {
/* Don't enqueue dead sockets */
dprintk("svc: socket %p is dead, not enqueued\n", svsk->sk_sk);
goto out_unlock;
@@ -260,10 +260,10 @@ svc_sock_enqueue(struct svc_sock *svsk)

/* Mark socket as busy. It will remain in this state until the
* server has processed all pending data and put the socket back
- * on the idle list. We update SK_BUSY atomically because
+ * on the idle list. We update XPT_BUSY atomically because
* it also guards against trying to enqueue the svc_sock twice.
*/
- if (test_and_set_bit(SK_BUSY, &svsk->sk_flags)) {
+ if (test_and_set_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags)) {
/* Don't enqueue socket while already enqueued */
dprintk("svc: socket %p busy, not enqueued\n", svsk->sk_sk);
goto out_unlock;
@@ -272,11 +272,11 @@ svc_sock_enqueue(struct svc_sock *svsk)
svsk->sk_pool = pool;

/* Handle pending connection */
- if (test_bit(SK_CONN, &svsk->sk_flags))
+ if (test_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags))
goto process;

/* Handle close in-progress */
- if (test_bit(SK_CLOSE, &svsk->sk_flags))
+ if (test_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags))
goto process;

/* Check if we have space to reply to a request */
@@ -284,7 +284,7 @@ svc_sock_enqueue(struct svc_sock *svsk)
/* Don't enqueue while not enough space for reply */
dprintk("svc: no write space, socket %p not enqueued\n", svsk);
svsk->sk_pool = NULL;
- clear_bit(SK_BUSY, &svsk->sk_flags);
+ clear_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags);
goto out_unlock;
}

@@ -340,14 +340,14 @@ svc_sock_dequeue(struct svc_pool *pool)
/*
* Having read something from a socket, check whether it
* needs to be re-enqueued.
- * Note: SK_DATA only gets cleared when a read-attempt finds
+ * Note: XPT_DATA only gets cleared when a read-attempt finds
* no (or insufficient) data.
*/
static inline void
svc_sock_received(struct svc_sock *svsk)
{
svsk->sk_pool = NULL;
- clear_bit(SK_BUSY, &svsk->sk_flags);
+ clear_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags);
svc_sock_enqueue(svsk);
}

@@ -696,8 +696,8 @@ svc_udp_data_ready(struct sock *sk, int

if (svsk) {
dprintk("svc: socket %p(inet %p), count=%d, busy=%d\n",
- svsk, sk, count, test_bit(SK_BUSY, &svsk->sk_flags));
- set_bit(SK_DATA, &svsk->sk_flags);
+ svsk, sk, count, test_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags));
+ set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);
svc_sock_enqueue(svsk);
}
if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
@@ -714,7 +714,7 @@ svc_write_space(struct sock *sk)

if (svsk) {
dprintk("svc: socket %p(inet %p), write_space busy=%d\n",
- svsk, sk, test_bit(SK_BUSY, &svsk->sk_flags));
+ svsk, sk, test_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags));
svc_sock_enqueue(svsk);
}

@@ -764,7 +764,7 @@ svc_udp_recvfrom(struct svc_rqst *rqstp)
.msg_flags = MSG_DONTWAIT,
};

- if (test_and_clear_bit(SK_CHNGBUF, &svsk->sk_flags))
+ if (test_and_clear_bit(XPT_CHNGBUF, &svsk->sk_xprt.xpt_flags))
/* udp sockets need large rcvbuf as all pending
* requests are still in that buffer. sndbuf must
* also be large enough that there is enough space
@@ -782,7 +782,7 @@ svc_udp_recvfrom(struct svc_rqst *rqstp)
return svc_deferred_recv(rqstp);
}

- clear_bit(SK_DATA, &svsk->sk_flags);
+ clear_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);
skb = NULL;
err = kernel_recvmsg(svsk->sk_sock, &msg, NULL,
0, 0, MSG_PEEK | MSG_DONTWAIT);
@@ -793,7 +793,7 @@ svc_udp_recvfrom(struct svc_rqst *rqstp)
if (err != -EAGAIN) {
/* possibly an icmp error */
dprintk("svc: recvfrom returned error %d\n", -err);
- set_bit(SK_DATA, &svsk->sk_flags);
+ set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);
}
svc_sock_received(svsk);
return -EAGAIN;
@@ -805,7 +805,7 @@ svc_udp_recvfrom(struct svc_rqst *rqstp)
need that much accuracy */
}
svsk->sk_sk->sk_stamp = skb->tstamp;
- set_bit(SK_DATA, &svsk->sk_flags); /* there may be more data... */
+ set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags); /* there may be more data... */

/*
* Maybe more packets - kick another thread ASAP.
@@ -955,8 +955,8 @@ svc_udp_init(struct svc_sock *svsk)
3 * svsk->sk_server->sv_max_mesg,
3 * svsk->sk_server->sv_max_mesg);

- set_bit(SK_DATA, &svsk->sk_flags); /* might have come in before data_ready set up */
- set_bit(SK_CHNGBUF, &svsk->sk_flags);
+ set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags); /* might have come in before data_ready set up */
+ set_bit(XPT_CHNGBUF, &svsk->sk_xprt.xpt_flags);

oldfs = get_fs();
set_fs(KERNEL_DS);
@@ -990,7 +990,7 @@ svc_tcp_listen_data_ready(struct sock *s
*/
if (sk->sk_state == TCP_LISTEN) {
if (svsk) {
- set_bit(SK_CONN, &svsk->sk_flags);
+ set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags);
svc_sock_enqueue(svsk);
} else
printk("svc: socket %p: no user data\n", sk);
@@ -1014,7 +1014,7 @@ svc_tcp_state_change(struct sock *sk)
if (!svsk)
printk("svc: socket %p: no user data\n", sk);
else {
- set_bit(SK_CLOSE, &svsk->sk_flags);
+ set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags);
svc_sock_enqueue(svsk);
}
if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
@@ -1029,7 +1029,7 @@ svc_tcp_data_ready(struct sock *sk, int
dprintk("svc: socket %p TCP data ready (svsk %p)\n",
sk, sk->sk_user_data);
if (svsk) {
- set_bit(SK_DATA, &svsk->sk_flags);
+ set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);
svc_sock_enqueue(svsk);
}
if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
@@ -1070,7 +1070,7 @@ svc_tcp_accept(struct svc_xprt *xprt)
if (!sock)
return NULL;

- clear_bit(SK_CONN, &svsk->sk_flags);
+ clear_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags);
err = kernel_accept(sock, &newsock, O_NONBLOCK);
if (err < 0) {
if (err == -ENOMEM)
@@ -1082,7 +1082,7 @@ svc_tcp_accept(struct svc_xprt *xprt)
return NULL;
}

- set_bit(SK_CONN, &svsk->sk_flags);
+ set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags);
svc_sock_enqueue(svsk);

err = kernel_getpeername(newsock, sin, &slen);
@@ -1148,16 +1148,16 @@ svc_tcp_recvfrom(struct svc_rqst *rqstp)
int pnum, vlen;

dprintk("svc: tcp_recv %p data %d conn %d close %d\n",
- svsk, test_bit(SK_DATA, &svsk->sk_flags),
- test_bit(SK_CONN, &svsk->sk_flags),
- test_bit(SK_CLOSE, &svsk->sk_flags));
+ svsk, test_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags),
+ test_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags),
+ test_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags));

if ((rqstp->rq_deferred = svc_deferred_dequeue(svsk))) {
svc_sock_received(svsk);
return svc_deferred_recv(rqstp);
}

- if (test_and_clear_bit(SK_CHNGBUF, &svsk->sk_flags))
+ if (test_and_clear_bit(XPT_CHNGBUF, &svsk->sk_xprt.xpt_flags))
/* sndbuf needs to have room for one request
* per thread, otherwise we can stall even when the
* network isn't a bottleneck.
@@ -1174,7 +1174,7 @@ svc_tcp_recvfrom(struct svc_rqst *rqstp)
(serv->sv_nrthreads+3) * serv->sv_max_mesg,
3 * serv->sv_max_mesg);

- clear_bit(SK_DATA, &svsk->sk_flags);
+ clear_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);

/* Receive data. If we haven't got the record length yet, get
* the next four bytes. Otherwise try to gobble up as much as
@@ -1233,7 +1233,7 @@ svc_tcp_recvfrom(struct svc_rqst *rqstp)
return -EAGAIN; /* record not complete */
}
len = svsk->sk_reclen;
- set_bit(SK_DATA, &svsk->sk_flags);
+ set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);

vec = rqstp->rq_vec;
vec[0] = rqstp->rq_arg.head[0];
@@ -1309,7 +1309,7 @@ svc_tcp_sendto(struct svc_rqst *rqstp)
reclen = htonl(0x80000000|((xbufp->len ) - 4));
memcpy(xbufp->head[0].iov_base, &reclen, 4);

- if (test_bit(SK_DEAD, &rqstp->rq_sock->sk_flags))
+ if (test_bit(XPT_DEAD, &rqstp->rq_sock->sk_xprt.xpt_flags))
return -ENOTCONN;

sent = svc_sendto(rqstp, &rqstp->rq_res);
@@ -1318,7 +1318,7 @@ svc_tcp_sendto(struct svc_rqst *rqstp)
rqstp->rq_sock->sk_server->sv_name,
(sent<0)?"got error":"sent only",
sent, xbufp->len);
- set_bit(SK_CLOSE, &rqstp->rq_sock->sk_flags);
+ set_bit(XPT_CLOSE, &rqstp->rq_sock->sk_xprt.xpt_flags);
svc_sock_enqueue(rqstp->rq_sock);
sent = -EAGAIN;
}
@@ -1405,7 +1405,7 @@ svc_tcp_init(struct svc_sock *svsk)
if (sk->sk_state == TCP_LISTEN) {
dprintk("setting up TCP socket for listening\n");
sk->sk_data_ready = svc_tcp_listen_data_ready;
- set_bit(SK_CONN, &svsk->sk_flags);
+ set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags);
} else {
dprintk("setting up TCP socket for reading\n");
sk->sk_state_change = svc_tcp_state_change;
@@ -1425,10 +1425,10 @@ svc_tcp_init(struct svc_sock *svsk)
3 * svsk->sk_server->sv_max_mesg,
3 * svsk->sk_server->sv_max_mesg);

- set_bit(SK_CHNGBUF, &svsk->sk_flags);
- set_bit(SK_DATA, &svsk->sk_flags);
+ set_bit(XPT_CHNGBUF, &svsk->sk_xprt.xpt_flags);
+ set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);
if (sk->sk_state != TCP_ESTABLISHED)
- set_bit(SK_CLOSE, &svsk->sk_flags);
+ set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags);
}
}

@@ -1445,12 +1445,12 @@ svc_sock_update_bufs(struct svc_serv *se
list_for_each(le, &serv->sv_permsocks) {
struct svc_sock *svsk =
list_entry(le, struct svc_sock, sk_list);
- set_bit(SK_CHNGBUF, &svsk->sk_flags);
+ set_bit(XPT_CHNGBUF, &svsk->sk_xprt.xpt_flags);
}
list_for_each(le, &serv->sv_tempsocks) {
struct svc_sock *svsk =
list_entry(le, struct svc_sock, sk_list);
- set_bit(SK_CHNGBUF, &svsk->sk_flags);
+ set_bit(XPT_CHNGBUF, &svsk->sk_xprt.xpt_flags);
}
spin_unlock_bh(&serv->sv_lock);
}
@@ -1492,7 +1492,7 @@ svc_check_conn_limits(struct svc_serv *s
svsk = list_entry(serv->sv_tempsocks.prev,
struct svc_sock,
sk_list);
- set_bit(SK_CLOSE, &svsk->sk_flags);
+ set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags);
svc_xprt_get(&svsk->sk_xprt);
}
spin_unlock_bh(&serv->sv_lock);
@@ -1596,10 +1596,10 @@ svc_recv(struct svc_rqst *rqstp, long ti
spin_unlock_bh(&pool->sp_lock);

len = 0;
- if (test_bit(SK_CLOSE, &svsk->sk_flags)) {
- dprintk("svc_recv: found SK_CLOSE\n");
+ if (test_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags)) {
+ dprintk("svc_recv: found XPT_CLOSE\n");
svc_delete_socket(svsk);
- } else if (test_bit(SK_LISTENER, &svsk->sk_flags)) {
+ } else if (test_bit(XPT_LISTENER, &svsk->sk_xprt.xpt_flags)) {
struct svc_xprt *newxpt;
newxpt = svsk->sk_xprt.xpt_ops.xpo_accept(&svsk->sk_xprt);
if (newxpt) {
@@ -1626,7 +1626,7 @@ svc_recv(struct svc_rqst *rqstp, long ti
return -EAGAIN;
}
svsk->sk_lastrecv = get_seconds();
- clear_bit(SK_OLD, &svsk->sk_flags);
+ clear_bit(XPT_OLD, &svsk->sk_xprt.xpt_flags);

rqstp->rq_secure = svc_port_is_privileged(svc_addr(rqstp));
rqstp->rq_chandle.defer = svc_defer;
@@ -1673,7 +1673,7 @@ svc_send(struct svc_rqst *rqstp)

/* Grab svsk->sk_mutex to serialize outgoing data. */
mutex_lock(&svsk->sk_mutex);
- if (test_bit(SK_DEAD, &svsk->sk_flags))
+ if (test_bit(XPT_DEAD, &svsk->sk_xprt.xpt_flags))
len = -ENOTCONN;
else
len = svsk->sk_xprt.xpt_ops.xpo_sendto(rqstp);
@@ -1709,21 +1709,21 @@ svc_age_temp_sockets(unsigned long closu
list_for_each_safe(le, next, &serv->sv_tempsocks) {
svsk = list_entry(le, struct svc_sock, sk_list);

- if (!test_and_set_bit(SK_OLD, &svsk->sk_flags))
+ if (!test_and_set_bit(XPT_OLD, &svsk->sk_xprt.xpt_flags))
continue;
if (atomic_read(&svsk->sk_xprt.xpt_ref.refcount) > 1
- || test_bit(SK_BUSY, &svsk->sk_flags))
+ || test_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags))
continue;
svc_xprt_get(&svsk->sk_xprt);
list_move(le, &to_be_aged);
- set_bit(SK_CLOSE, &svsk->sk_flags);
- set_bit(SK_DETACHED, &svsk->sk_flags);
+ set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags);
+ set_bit(XPT_DETACHED, &svsk->sk_xprt.xpt_flags);
}
spin_unlock_bh(&serv->sv_lock);

while (!list_empty(&to_be_aged)) {
le = to_be_aged.next;
- /* fiddling the sk_list node is safe 'cos we're SK_DETACHED */
+ /* fiddling the sk_list node is safe 'cos we're XPT_DETACHED */
list_del_init(le);
svsk = list_entry(le, struct svc_sock, sk_list);

@@ -1769,7 +1769,7 @@ static struct svc_sock *svc_setup_socket
return NULL;
}

- set_bit(SK_BUSY, &svsk->sk_flags);
+ set_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags);
inet->sk_user_data = svsk;
svsk->sk_sock = sock;
svsk->sk_sk = inet;
@@ -1791,7 +1791,7 @@ static struct svc_sock *svc_setup_socket

spin_lock_bh(&serv->sv_lock);
if (is_temporary) {
- set_bit(SK_TEMP, &svsk->sk_flags);
+ set_bit(XPT_TEMP, &svsk->sk_xprt.xpt_flags);
list_add(&svsk->sk_list, &serv->sv_tempsocks);
serv->sv_tmpcnt++;
if (serv->sv_temptimer.function == NULL) {
@@ -1802,7 +1802,7 @@ static struct svc_sock *svc_setup_socket
jiffies + svc_conn_age_period * HZ);
}
} else {
- clear_bit(SK_TEMP, &svsk->sk_flags);
+ clear_bit(XPT_TEMP, &svsk->sk_xprt.xpt_flags);
list_add(&svsk->sk_list, &serv->sv_permsocks);
}
spin_unlock_bh(&serv->sv_lock);
@@ -1890,7 +1890,7 @@ svc_create_socket(struct svc_serv *serv,

if ((svsk = svc_setup_socket(serv, sock, &error, flags)) != NULL) {
if (protocol == IPPROTO_TCP)
- set_bit(SK_LISTENER, &svsk->sk_flags);
+ set_bit(XPT_LISTENER, &svsk->sk_xprt.xpt_flags);
svc_sock_received(svsk);
return (struct svc_xprt *)svsk;
}
@@ -1955,7 +1955,7 @@ svc_delete_socket(struct svc_sock *svsk)

spin_lock_bh(&serv->sv_lock);

- if (!test_and_set_bit(SK_DETACHED, &svsk->sk_flags))
+ if (!test_and_set_bit(XPT_DETACHED, &svsk->sk_xprt.xpt_flags))
list_del_init(&svsk->sk_list);
/*
* We used to delete the svc_sock from whichever list
@@ -1964,10 +1964,10 @@ svc_delete_socket(struct svc_sock *svsk)
* while still attached to a queue, the queue itself
* is about to be destroyed (in svc_destroy).
*/
- if (!test_and_set_bit(SK_DEAD, &svsk->sk_flags)) {
+ if (!test_and_set_bit(XPT_DEAD, &svsk->sk_xprt.xpt_flags)) {
BUG_ON(atomic_read(&svsk->sk_xprt.xpt_ref.refcount)<2);
svc_xprt_put(&svsk->sk_xprt);
- if (test_bit(SK_TEMP, &svsk->sk_flags))
+ if (test_bit(XPT_TEMP, &svsk->sk_xprt.xpt_flags))
serv->sv_tmpcnt--;
}

@@ -1976,26 +1976,26 @@ svc_delete_socket(struct svc_sock *svsk)

static void svc_close_socket(struct svc_sock *svsk)
{
- set_bit(SK_CLOSE, &svsk->sk_flags);
- if (test_and_set_bit(SK_BUSY, &svsk->sk_flags))
+ set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags);
+ if (test_and_set_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags))
/* someone else will have to effect the close */
return;

svc_xprt_get(&svsk->sk_xprt);
svc_delete_socket(svsk);
- clear_bit(SK_BUSY, &svsk->sk_flags);
+ clear_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags);
svc_xprt_put(&svsk->sk_xprt);
}

void svc_force_close_socket(struct svc_sock *svsk)
{
- set_bit(SK_CLOSE, &svsk->sk_flags);
- if (test_bit(SK_BUSY, &svsk->sk_flags)) {
+ set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags);
+ if (test_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags)) {
/* Waiting to be processed, but no threads left,
* So just remove it from the waiting list
*/
list_del_init(&svsk->sk_ready);
- clear_bit(SK_BUSY, &svsk->sk_flags);
+ clear_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags);
}
svc_close_socket(svsk);
}
@@ -2020,7 +2020,7 @@ static void svc_revisit(struct cache_def
spin_lock(&svsk->sk_lock);
list_add(&dr->handle.recent, &svsk->sk_deferred);
spin_unlock(&svsk->sk_lock);
- set_bit(SK_DEFERRED, &svsk->sk_flags);
+ set_bit(XPT_DEFERRED, &svsk->sk_xprt.xpt_flags);
svc_sock_enqueue(svsk);
svc_xprt_put(&svsk->sk_xprt);
}
@@ -2083,16 +2083,16 @@ static struct svc_deferred_req *svc_defe
{
struct svc_deferred_req *dr = NULL;

- if (!test_bit(SK_DEFERRED, &svsk->sk_flags))
+ if (!test_bit(XPT_DEFERRED, &svsk->sk_xprt.xpt_flags))
return NULL;
spin_lock(&svsk->sk_lock);
- clear_bit(SK_DEFERRED, &svsk->sk_flags);
+ clear_bit(XPT_DEFERRED, &svsk->sk_xprt.xpt_flags);
if (!list_empty(&svsk->sk_deferred)) {
dr = list_entry(svsk->sk_deferred.next,
struct svc_deferred_req,
handle.recent);
list_del_init(&dr->handle.recent);
- set_bit(SK_DEFERRED, &svsk->sk_flags);
+ set_bit(XPT_DEFERRED, &svsk->sk_xprt.xpt_flags);
}
spin_unlock(&svsk->sk_lock);
return dr;


2007-09-27 05:02:03

by Tom Tucker

Subject: [RFC, PATCH 16/33] svc: Move sk_server and sk_pool to svc_xprt


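This is another incremental change that moves the transport-independent
sk_server and sk_pool fields from svc_sock to the svc_xprt structure,
where they become xpt_server and xpt_pool. svc_xprt_init gains a
svc_serv argument so it can set xpt_server. The changes should not
alter behavior.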
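
The embedding that makes this move possible can be sketched in userspace
C: a generic structure lives inside the socket-specific one, generic code
touches only the embedded part, and socket code recovers its own
structure from a pointer to it. All demo_* names are hypothetical, and
the container_of-style macro is an assumption for illustration; this
patch itself reaches the embedded structure through svsk->sk_xprt.

```c
#include <assert.h>
#include <stddef.h>

struct demo_serv {
	const char *name;
};

/* Transport-independent part, analogous to svc_xprt. */
struct demo_xprt {
	struct demo_serv *server;	/* service for this transport */
	void *pool;			/* current pool iff queued */
};

/* Transport-specific wrapper, analogous to svc_sock. */
struct demo_sock {
	int fd;
	struct demo_xprt xprt;		/* embedded generic part */
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Generic init now takes the server, as svc_xprt_init does in this patch. */
static void demo_xprt_init(struct demo_xprt *xprt, struct demo_serv *serv)
{
	assert(xprt != NULL);
	xprt->server = serv;
	xprt->pool = NULL;
}

static struct demo_sock *demo_sock_from_xprt(struct demo_xprt *xprt)
{
	return demo_container_of(xprt, struct demo_sock, xprt);
}
```

Because the lookup uses the member offset, it keeps working even when
the embedded structure is not the first member of the wrapper.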

Signed-off-by: Tom Tucker <[email protected]>
---

include/linux/sunrpc/svc_xprt.h | 5 +++-
include/linux/sunrpc/svcsock.h | 2 -
net/sunrpc/svc_xprt.c | 3 +-
net/sunrpc/svcsock.c | 53 +++++++++++++++++++--------------------
4 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index 935726e..b850922 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -49,11 +49,14 @@ #define XPT_DEFERRED 8 /* deferred req
#define XPT_OLD 9 /* used for transport aging mark+sweep */
#define XPT_DETACHED 10 /* detached from tempsocks list */
#define XPT_LISTENER 11 /* listening endpoint */
+
+ struct svc_pool * xpt_pool; /* current pool iff queued */
+ struct svc_serv * xpt_server; /* service for this transport */
};

int svc_reg_xprt_class(struct svc_xprt_class *);
int svc_unreg_xprt_class(struct svc_xprt_class *);
-void svc_xprt_init(struct svc_xprt_class *, struct svc_xprt *);
+void svc_xprt_init(struct svc_xprt_class *, struct svc_xprt *, struct svc_serv *);
int svc_create_xprt(struct svc_serv *, char *, unsigned short, int);
void svc_xprt_put(struct svc_xprt *xprt);

diff --git a/include/linux/sunrpc/svcsock.h b/include/linux/sunrpc/svcsock.h
index b8a8496..92d4cc9 100644
--- a/include/linux/sunrpc/svcsock.h
+++ b/include/linux/sunrpc/svcsock.h
@@ -22,8 +22,6 @@ struct svc_sock {
struct socket * sk_sock; /* berkeley socket layer */
struct sock * sk_sk; /* INET layer */

- struct svc_pool * sk_pool; /* current pool iff queued */
- struct svc_serv * sk_server; /* service for this socket */
atomic_t sk_reserved; /* space on outq that is reserved */

spinlock_t sk_lock; /* protects sk_deferred and
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 9a9670f..e1a9378 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -102,12 +102,13 @@ EXPORT_SYMBOL_GPL(svc_xprt_put);
* Called by transport drivers to initialize the transport independent
* portion of the transport instance.
*/
-void svc_xprt_init(struct svc_xprt_class *xcl, struct svc_xprt *xpt)
+void svc_xprt_init(struct svc_xprt_class *xcl, struct svc_xprt *xpt, struct svc_serv *serv)
{
xpt->xpt_class = xcl;
xpt->xpt_ops = *xcl->xcl_ops;
xpt->xpt_max_payload = xcl->xcl_max_payload;
kref_init(&xpt->xpt_ref);
+ xpt->xpt_server = serv;
}
EXPORT_SYMBOL_GPL(svc_xprt_init);

diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index c6b521d..625e31c 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -230,7 +230,7 @@ svc_sock_wspace(struct svc_sock *svsk)
static void
svc_sock_enqueue(struct svc_sock *svsk)
{
- struct svc_serv *serv = svsk->sk_server;
+ struct svc_serv *serv = svsk->sk_xprt.xpt_server;
struct svc_pool *pool;
struct svc_rqst *rqstp;
int cpu;
@@ -242,7 +242,7 @@ svc_sock_enqueue(struct svc_sock *svsk)
return;

cpu = get_cpu();
- pool = svc_pool_for_cpu(svsk->sk_server, cpu);
+ pool = svc_pool_for_cpu(svsk->sk_xprt.xpt_server, cpu);
put_cpu();

spin_lock_bh(&pool->sp_lock);
@@ -268,8 +268,8 @@ svc_sock_enqueue(struct svc_sock *svsk)
dprintk("svc: socket %p busy, not enqueued\n", svsk->sk_sk);
goto out_unlock;
}
- BUG_ON(svsk->sk_pool != NULL);
- svsk->sk_pool = pool;
+ BUG_ON(svsk->sk_xprt.xpt_pool != NULL);
+ svsk->sk_xprt.xpt_pool = pool;

/* Handle pending connection */
if (test_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags))
@@ -283,7 +283,7 @@ svc_sock_enqueue(struct svc_sock *svsk)
if (!svsk->sk_xprt.xpt_ops.xpo_has_wspace(&svsk->sk_xprt)) {
/* Don't enqueue while not enough space for reply */
dprintk("svc: no write space, socket %p not enqueued\n", svsk);
- svsk->sk_pool = NULL;
+ svsk->sk_xprt.xpt_pool = NULL;
clear_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags);
goto out_unlock;
}
@@ -304,12 +304,12 @@ svc_sock_enqueue(struct svc_sock *svsk)
svc_xprt_get(&svsk->sk_xprt);
rqstp->rq_reserved = serv->sv_max_mesg;
atomic_add(rqstp->rq_reserved, &svsk->sk_reserved);
- BUG_ON(svsk->sk_pool != pool);
+ BUG_ON(svsk->sk_xprt.xpt_pool != pool);
wake_up(&rqstp->rq_wait);
} else {
dprintk("svc: socket %p put into queue\n", svsk->sk_sk);
list_add_tail(&svsk->sk_ready, &pool->sp_sockets);
- BUG_ON(svsk->sk_pool != pool);
+ BUG_ON(svsk->sk_xprt.xpt_pool != pool);
}

out_unlock:
@@ -346,7 +346,7 @@ svc_sock_dequeue(struct svc_pool *pool)
static inline void
svc_sock_received(struct svc_sock *svsk)
{
- svsk->sk_pool = NULL;
+ svsk->sk_xprt.xpt_pool = NULL;
clear_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags);
svc_sock_enqueue(svsk);
}
@@ -749,7 +749,7 @@ static int
svc_udp_recvfrom(struct svc_rqst *rqstp)
{
struct svc_sock *svsk = rqstp->rq_sock;
- struct svc_serv *serv = svsk->sk_server;
+ struct svc_serv *serv = svsk->sk_xprt.xpt_server;
struct sk_buff *skb;
union {
struct cmsghdr hdr;
@@ -889,7 +889,7 @@ static int
svc_udp_has_wspace(struct svc_xprt *xprt)
{
struct svc_sock *svsk = (struct svc_sock*)xprt;
- struct svc_serv *serv = svsk->sk_server;
+ struct svc_serv *serv = svsk->sk_xprt.xpt_server;
int required;

/*
@@ -943,7 +943,6 @@ svc_udp_init(struct svc_sock *svsk)
int one = 1;
mm_segment_t oldfs;

- svc_xprt_init(&svc_udp_class, &svsk->sk_xprt);
svsk->sk_sk->sk_data_ready = svc_udp_data_ready;
svsk->sk_sk->sk_write_space = svc_write_space;

@@ -952,8 +951,8 @@ svc_udp_init(struct svc_sock *svsk)
* svc_udp_recvfrom will re-adjust if necessary
*/
svc_sock_setbufsize(svsk->sk_sock,
- 3 * svsk->sk_server->sv_max_mesg,
- 3 * svsk->sk_server->sv_max_mesg);
+ 3 * svsk->sk_xprt.xpt_server->sv_max_mesg,
+ 3 * svsk->sk_xprt.xpt_server->sv_max_mesg);

set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags); /* might have come in before data_ready set up */
set_bit(XPT_CHNGBUF, &svsk->sk_xprt.xpt_flags);
@@ -1059,7 +1058,7 @@ svc_tcp_accept(struct svc_xprt *xprt)
struct svc_sock *svsk = (struct svc_sock *)xprt;
struct sockaddr_storage addr;
struct sockaddr *sin = (struct sockaddr *) &addr;
- struct svc_serv *serv = svsk->sk_server;
+ struct svc_serv *serv = svsk->sk_xprt.xpt_server;
struct socket *sock = svsk->sk_sock;
struct socket *newsock;
struct svc_sock *newsvsk;
@@ -1142,7 +1141,7 @@ static int
svc_tcp_recvfrom(struct svc_rqst *rqstp)
{
struct svc_sock *svsk = rqstp->rq_sock;
- struct svc_serv *serv = svsk->sk_server;
+ struct svc_serv *serv = svsk->sk_xprt.xpt_server;
int len;
struct kvec *vec;
int pnum, vlen;
@@ -1285,7 +1284,7 @@ svc_tcp_recvfrom(struct svc_rqst *rqstp)
svc_sock_received(svsk);
} else {
printk(KERN_NOTICE "%s: recvfrom returned errno %d\n",
- svsk->sk_server->sv_name, -len);
+ svsk->sk_xprt.xpt_server->sv_name, -len);
goto err_delete;
}

@@ -1315,7 +1314,7 @@ svc_tcp_sendto(struct svc_rqst *rqstp)
sent = svc_sendto(rqstp, &rqstp->rq_res);
if (sent != xbufp->len) {
printk(KERN_NOTICE "rpc-srv/tcp: %s: %s %d when sending %d bytes - shutting down socket\n",
- rqstp->rq_sock->sk_server->sv_name,
+ rqstp->rq_sock->sk_xprt.xpt_server->sv_name,
(sent<0)?"got error":"sent only",
sent, xbufp->len);
set_bit(XPT_CLOSE, &rqstp->rq_sock->sk_xprt.xpt_flags);
@@ -1341,7 +1340,7 @@ static int
svc_tcp_has_wspace(struct svc_xprt *xprt)
{
struct svc_sock *svsk = (struct svc_sock*)xprt;
- struct svc_serv *serv = svsk->sk_server;
+ struct svc_serv *serv = svsk->sk_xprt.xpt_server;
int required;

/*
@@ -1400,8 +1399,6 @@ svc_tcp_init(struct svc_sock *svsk)
struct sock *sk = svsk->sk_sk;
struct tcp_sock *tp = tcp_sk(sk);

- svc_xprt_init(&svc_tcp_class, &svsk->sk_xprt);
-
if (sk->sk_state == TCP_LISTEN) {
dprintk("setting up TCP socket for listening\n");
sk->sk_data_ready = svc_tcp_listen_data_ready;
@@ -1422,8 +1419,8 @@ svc_tcp_init(struct svc_sock *svsk)
* svc_tcp_recvfrom will re-adjust if necessary
*/
svc_sock_setbufsize(svsk->sk_sock,
- 3 * svsk->sk_server->sv_max_mesg,
- 3 * svsk->sk_server->sv_max_mesg);
+ 3 * svsk->sk_xprt.xpt_server->sv_max_mesg,
+ 3 * svsk->sk_xprt.xpt_server->sv_max_mesg);

set_bit(XPT_CHNGBUF, &svsk->sk_xprt.xpt_flags);
set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);
@@ -1608,7 +1605,7 @@ svc_recv(struct svc_rqst *rqstp, long ti
* listener holds a reference too
*/
__module_get(newxpt->xpt_class->xcl_owner);
- svc_check_conn_limits(svsk->sk_server);
+ svc_check_conn_limits(svsk->sk_xprt.xpt_server);
}
svc_sock_received(svsk);
} else {
@@ -1776,7 +1773,6 @@ static struct svc_sock *svc_setup_socket
svsk->sk_ostate = inet->sk_state_change;
svsk->sk_odata = inet->sk_data_ready;
svsk->sk_owspace = inet->sk_write_space;
- svsk->sk_server = serv;
svsk->sk_lastrecv = get_seconds();
spin_lock_init(&svsk->sk_lock);
INIT_LIST_HEAD(&svsk->sk_deferred);
@@ -1784,10 +1780,13 @@ static struct svc_sock *svc_setup_socket
mutex_init(&svsk->sk_mutex);

/* Initialize the socket */
- if (sock->type == SOCK_DGRAM)
+ if (sock->type == SOCK_DGRAM) {
+ svc_xprt_init(&svc_udp_class, &svsk->sk_xprt, serv);
svc_udp_init(svsk);
- else
+ } else {
+ svc_xprt_init(&svc_tcp_class, &svsk->sk_xprt, serv);
svc_tcp_init(svsk);
+ }

spin_lock_bh(&serv->sv_lock);
if (is_temporary) {
@@ -1948,7 +1947,7 @@ svc_delete_socket(struct svc_sock *svsk)

dprintk("svc: svc_delete_socket(%p)\n", svsk);

- serv = svsk->sk_server;
+ serv = svsk->sk_xprt.xpt_server;
sk = svsk->sk_sk;

svsk->sk_xprt.xpt_ops.xpo_detach(&svsk->sk_xprt);


2007-09-27 05:02:04

by Tom Tucker

Subject: [RFC, PATCH 13/33] svc: Change services to use new svc_create_xprt service


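Modify the in-kernel RPC services (lockd, the NFS callback service, and
nfsd) to call svc_create_xprt directly with a transport name, and remove
the now-unused svc_makesock wrapper.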
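
For readers unfamiliar with name-based transport selection, here is a
minimal userspace sketch of the idea: classes register under a string name
and callers create a transport by name instead of by IPPROTO_* constant.
Everything in it (struct xprt_class, reg_xprt_class, create_xprt, the fixed
table size, and the -ENOENT result for an unknown name) is an assumption
made for illustration, not the svc_xprt code itself.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical transport class: a name plus a create hook. */
struct xprt_class {
	const char *name;
	int (*create)(unsigned short port);  /* returns bound port or -errno */
};

#define MAX_CLASSES 8
static struct xprt_class *classes[MAX_CLASSES];

/* Register a class; duplicate names are rejected. */
static int reg_xprt_class(struct xprt_class *xcl)
{
	size_t i;

	for (i = 0; i < MAX_CLASSES; i++)
		if (classes[i] && strcmp(classes[i]->name, xcl->name) == 0)
			return -EEXIST;
	for (i = 0; i < MAX_CLASSES; i++)
		if (!classes[i]) {
			classes[i] = xcl;
			return 0;
		}
	return -ENOMEM;
}

/* Look the class up by name and delegate creation to it. */
static int create_xprt(const char *name, unsigned short port)
{
	size_t i;

	for (i = 0; i < MAX_CLASSES; i++)
		if (classes[i] && strcmp(classes[i]->name, name) == 0)
			return classes[i]->create(port);
	return -ENOENT;  /* assumed error for an unknown transport name */
}

/* Toy create hooks: pretend the bind succeeded on the requested port. */
static int tcp_create(unsigned short port) { return port; }
static int udp_create(unsigned short port) { return port; }

static struct xprt_class tcp_class = { "tcp", tcp_create };
static struct xprt_class udp_class = { "udp", udp_create };
```

With this shape a new transport only has to register a class; callers such
as create_xprt("tcp", 2049) do not change.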

Signed-off-by: Tom Tucker <[email protected]>
---

fs/lockd/svc.c | 21 ++++++++++-----------
fs/nfs/callback.c | 4 ++--
fs/nfsd/nfssvc.c | 4 ++--
include/linux/sunrpc/svcsock.h | 1 -
net/sunrpc/sunrpc_syms.c | 1 -
net/sunrpc/svcsock.c | 22 ----------------------
6 files changed, 14 insertions(+), 39 deletions(-)

diff --git a/fs/lockd/svc.c b/fs/lockd/svc.c
index 82e2192..e71cc37 100644
--- a/fs/lockd/svc.c
+++ b/fs/lockd/svc.c
@@ -219,13 +219,12 @@ lockd(struct svc_rqst *rqstp)
module_put_and_exit(0);
}

-
-static int find_socket(struct svc_serv *serv, int proto)
+static int find_xprt(struct svc_serv *serv, char *proto)
{
- struct svc_sock *svsk;
+ struct svc_xprt *xprt;
int found = 0;
- list_for_each_entry(svsk, &serv->sv_permsocks, sk_list)
- if (svsk->sk_sk->sk_protocol == proto) {
+ list_for_each_entry(xprt, &serv->sv_permsocks, xpt_list)
+ if (strcmp(xprt->xpt_class->xcl_name, proto)==0) {
found = 1;
break;
}
@@ -243,13 +242,13 @@ static int make_socks(struct svc_serv *s
int err = 0;

if (proto == IPPROTO_UDP || nlm_udpport)
- if (!find_socket(serv, IPPROTO_UDP))
- err = svc_makesock(serv, IPPROTO_UDP, nlm_udpport,
- SVC_SOCK_DEFAULTS);
+ if (!find_xprt(serv,"udp"))
+ err = svc_create_xprt(serv, "udp", nlm_udpport,
+ SVC_SOCK_DEFAULTS);
if (err >= 0 && (proto == IPPROTO_TCP || nlm_tcpport))
- if (!find_socket(serv, IPPROTO_TCP))
- err = svc_makesock(serv, IPPROTO_TCP, nlm_tcpport,
- SVC_SOCK_DEFAULTS);
+ if (!find_xprt(serv,"tcp"))
+ err = svc_create_xprt(serv, "tcp", nlm_tcpport,
+ SVC_SOCK_DEFAULTS);

if (err >= 0) {
warned = 0;
diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c
index a796be5..e27ca14 100644
--- a/fs/nfs/callback.c
+++ b/fs/nfs/callback.c
@@ -123,8 +123,8 @@ int nfs_callback_up(void)
if (!serv)
goto out_err;

- ret = svc_makesock(serv, IPPROTO_TCP, nfs_callback_set_tcpport,
- SVC_SOCK_ANONYMOUS);
+ ret = svc_create_xprt(serv, "tcp", nfs_callback_set_tcpport,
+ SVC_SOCK_ANONYMOUS);
if (ret <= 0)
goto out_destroy;
nfs_callback_tcpport = ret;
diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index a8c89ae..bf70b06 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -236,7 +236,7 @@ static int nfsd_init_socks(int port)

error = lockd_up(IPPROTO_UDP);
if (error >= 0) {
- error = svc_makesock(nfsd_serv, IPPROTO_UDP, port,
+ error = svc_create_xprt(nfsd_serv, "udp", port,
SVC_SOCK_DEFAULTS);
if (error < 0)
lockd_down();
@@ -247,7 +247,7 @@ static int nfsd_init_socks(int port)
#ifdef CONFIG_NFSD_TCP
error = lockd_up(IPPROTO_TCP);
if (error >= 0) {
- error = svc_makesock(nfsd_serv, IPPROTO_TCP, port,
+ error = svc_create_xprt(nfsd_serv, "tcp", port,
SVC_SOCK_DEFAULTS);
if (error < 0)
lockd_down();
diff --git a/include/linux/sunrpc/svcsock.h b/include/linux/sunrpc/svcsock.h
index 9882ce0..3181d9d 100644
--- a/include/linux/sunrpc/svcsock.h
+++ b/include/linux/sunrpc/svcsock.h
@@ -67,7 +67,6 @@ #define SK_LISTENER 11 /* listening en
/*
* Function prototypes.
*/
-int svc_makesock(struct svc_serv *, int, unsigned short, int flags);
void svc_force_close_socket(struct svc_sock *);
int svc_recv(struct svc_rqst *, long);
int svc_send(struct svc_rqst *);
diff --git a/net/sunrpc/sunrpc_syms.c b/net/sunrpc/sunrpc_syms.c
index a62ce47..e4cad0f 100644
--- a/net/sunrpc/sunrpc_syms.c
+++ b/net/sunrpc/sunrpc_syms.c
@@ -72,7 +72,6 @@ EXPORT_SYMBOL(svc_drop);
EXPORT_SYMBOL(svc_process);
EXPORT_SYMBOL(svc_recv);
EXPORT_SYMBOL(svc_wake_up);
-EXPORT_SYMBOL(svc_makesock);
EXPORT_SYMBOL(svc_reserve);
EXPORT_SYMBOL(svc_auth_register);
EXPORT_SYMBOL(auth_domain_lookup);
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index d37f773..f6a6e57 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -2012,28 +2012,6 @@ void svc_force_close_socket(struct svc_s
svc_close_socket(svsk);
}

-/**
- * svc_makesock - Make a socket for nfsd and lockd
- * @serv: RPC server structure
- * @protocol: transport protocol to use
- * @port: port to use
- * @flags: requested socket characteristics
- *
- */
-int svc_makesock(struct svc_serv *serv, int protocol, unsigned short port,
- int flags)
-{
- dprintk("svc: creating socket proto = %d\n", protocol);
- switch (protocol) {
- case IPPROTO_TCP:
- return svc_create_xprt(serv, "tcp", port, flags);
- case IPPROTO_UDP:
- return svc_create_xprt(serv, "udp", port, flags);
- default:
- return -EINVAL;
- }
-}
-
/*
* Handle defer and revisit of requests
*/


2007-09-27 05:02:05

by Tom Tucker

Subject: [RFC,PATCH 14/33] svc: Change sk_inuse to a kref


Change the atomic_t reference count to a kref and move it to the
transport-independent svc_xprt structure. Change the reference count
wrapper names to be generic.
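
The get/put pattern introduced here can be sketched in userspace as follows.
This is a single-threaded illustration only: struct ref is a hypothetical,
non-atomic stand-in for kref, and the xprt_* helpers are made-up names. It
shows the initial bias of 1 and the release callback that runs when the last
reference is dropped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, non-atomic stand-in for struct kref. */
struct ref { int count; };

static void ref_init(struct ref *r) { r->count = 1; }  /* bias of 1 */

static void ref_get(struct ref *r)
{
	assert(r->count > 0);  /* taking a reference on a dead object is a bug */
	r->count++;
}

/* Drop one reference; call release() and return 1 if it was the last. */
static int ref_put(struct ref *r, void (*release)(struct ref *))
{
	assert(r->count > 0);
	if (--r->count == 0) {
		release(r);
		return 1;
	}
	return 0;
}

/* Generic transport carrying the embedded reference count. */
struct xprt {
	struct ref ref;
	int freed;  /* set by the release callback, for demonstration */
};

/* Release callback: recover the containing transport and tear it down. */
static void xprt_free(struct ref *r)
{
	struct xprt *x = (struct xprt *)((char *)r - offsetof(struct xprt, ref));

	x->freed = 1;
}

static void xprt_init(struct xprt *x) { ref_init(&x->ref); x->freed = 0; }
static void xprt_get(struct xprt *x)  { ref_get(&x->ref); }
static int  xprt_put(struct xprt *x)  { return ref_put(&x->ref, xprt_free); }
```

A caller pairs every xprt_get() with an xprt_put(); the put that removes the
initial bias is the one that triggers xprt_free().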

Signed-off-by: Tom Tucker <[email protected]>
---

include/linux/sunrpc/svc_xprt.h | 7 +++++
include/linux/sunrpc/svcsock.h | 1 -
net/sunrpc/svc_xprt.c | 17 ++++++++++++
net/sunrpc/svcsock.c | 54 +++++++++++++++------------------------
4 files changed, 45 insertions(+), 34 deletions(-)

diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index 6a34bb4..c9196bc 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -8,6 +8,7 @@ #ifndef SUNRPC_SVC_XPRT_H
#define SUNRPC_SVC_XPRT_H

#include <linux/sunrpc/svc.h>
+#include <linux/module.h>

struct svc_xprt_ops {
struct svc_xprt *(*xpo_create)(struct svc_serv *,
@@ -35,11 +36,17 @@ struct svc_xprt {
struct svc_xprt_class *xpt_class;
struct svc_xprt_ops xpt_ops;
u32 xpt_max_payload;
+ struct kref xpt_ref;
};

int svc_reg_xprt_class(struct svc_xprt_class *);
int svc_unreg_xprt_class(struct svc_xprt_class *);
void svc_xprt_init(struct svc_xprt_class *, struct svc_xprt *);
int svc_create_xprt(struct svc_serv *, char *, unsigned short, int);
+void svc_xprt_put(struct svc_xprt *xprt);
+
+static inline void svc_xprt_get(struct svc_xprt *xprt) {
+ kref_get(&xprt->xpt_ref);
+}

#endif /* SUNRPC_SVC_XPRT_H */
diff --git a/include/linux/sunrpc/svcsock.h b/include/linux/sunrpc/svcsock.h
index 3181d9d..ba07d50 100644
--- a/include/linux/sunrpc/svcsock.h
+++ b/include/linux/sunrpc/svcsock.h
@@ -24,7 +24,6 @@ struct svc_sock {

struct svc_pool * sk_pool; /* current pool iff queued */
struct svc_serv * sk_server; /* service for this socket */
- atomic_t sk_inuse; /* use count */
unsigned long sk_flags;
#define SK_BUSY 0 /* enqueued/receiving */
#define SK_CONN 1 /* conn pending */
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index fab0ce3..9a9670f 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -82,6 +82,22 @@ int svc_unreg_xprt_class(struct svc_xprt
}
EXPORT_SYMBOL_GPL(svc_unreg_xprt_class);

+static inline void svc_xprt_free(struct kref *kref)
+{
+ struct svc_xprt *xprt =
+ container_of(kref, struct svc_xprt, xpt_ref);
+ struct module *owner = xprt->xpt_class->xcl_owner;
+ BUG_ON(atomic_read(&kref->refcount));
+ xprt->xpt_ops.xpo_free(xprt);
+ module_put(owner);
+}
+
+void svc_xprt_put(struct svc_xprt *xprt)
+{
+ kref_put(&xprt->xpt_ref, svc_xprt_free);
+}
+EXPORT_SYMBOL_GPL(svc_xprt_put);
+
/*
* Called by transport drivers to initialize the transport independent
* portion of the transport instance.
@@ -91,6 +107,7 @@ void svc_xprt_init(struct svc_xprt_class
xpt->xpt_class = xcl;
xpt->xpt_ops = *xcl->xcl_ops;
xpt->xpt_max_payload = xcl->xcl_max_payload;
+ kref_init(&xpt->xpt_ref);
}
EXPORT_SYMBOL_GPL(svc_xprt_init);

diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index f6a6e57..a92b000 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -65,8 +65,8 @@ #include <linux/sunrpc/stats.h>
* after a clear, the socket must be read/accepted
* if this succeeds, it must be set again.
* SK_CLOSE can set at any time. It is never cleared.
- * sk_inuse contains a bias of '1' until SK_DEAD is set.
- * so when sk_inuse hits zero, we know the socket is dead
+ * xpt_ref contains a bias of '1' until SK_DEAD is set.
+ * so when xprt_ref hits zero, we know the transport is dead
* and no-one is using it.
* SK_DEAD can only be set while SK_BUSY is held which ensures
* no other thread will be using the socket or will try to
@@ -301,7 +301,7 @@ svc_sock_enqueue(struct svc_sock *svsk)
"svc_sock_enqueue: server %p, rq_sock=%p!\n",
rqstp, rqstp->rq_sock);
rqstp->rq_sock = svsk;
- atomic_inc(&svsk->sk_inuse);
+ svc_xprt_get(&svsk->sk_xprt);
rqstp->rq_reserved = serv->sv_max_mesg;
atomic_add(rqstp->rq_reserved, &svsk->sk_reserved);
BUG_ON(svsk->sk_pool != pool);
@@ -332,7 +332,7 @@ svc_sock_dequeue(struct svc_pool *pool)
list_del_init(&svsk->sk_ready);

dprintk("svc: socket %p dequeued, inuse=%d\n",
- svsk->sk_sk, atomic_read(&svsk->sk_inuse));
+ svsk->sk_sk, atomic_read(&svsk->sk_xprt.xpt_ref.refcount));

return svsk;
}
@@ -375,19 +375,6 @@ void svc_reserve(struct svc_rqst *rqstp,
}
}

-/*
- * Release a socket after use.
- */
-static inline void
-svc_sock_put(struct svc_sock *svsk)
-{
- if (atomic_dec_and_test(&svsk->sk_inuse)) {
- BUG_ON(!test_bit(SK_DEAD, &svsk->sk_flags));
- module_put(svsk->sk_xprt.xpt_class->xcl_owner);
- svsk->sk_xprt.xpt_ops.xpo_free(&svsk->sk_xprt);
- }
-}
-
static void
svc_sock_release(struct svc_rqst *rqstp)
{
@@ -414,7 +401,7 @@ svc_sock_release(struct svc_rqst *rqstp)
svc_reserve(rqstp, 0);
rqstp->rq_sock = NULL;

- svc_sock_put(svsk);
+ svc_xprt_put(&svsk->sk_xprt);
}

/*
@@ -1506,13 +1493,13 @@ svc_check_conn_limits(struct svc_serv *s
struct svc_sock,
sk_list);
set_bit(SK_CLOSE, &svsk->sk_flags);
- atomic_inc(&svsk->sk_inuse);
+ svc_xprt_get(&svsk->sk_xprt);
}
spin_unlock_bh(&serv->sv_lock);

if (svsk) {
svc_sock_enqueue(svsk);
- svc_sock_put(svsk);
+ svc_xprt_put(&svsk->sk_xprt);
}
}
}
@@ -1577,7 +1564,7 @@ svc_recv(struct svc_rqst *rqstp, long ti
spin_lock_bh(&pool->sp_lock);
if ((svsk = svc_sock_dequeue(pool)) != NULL) {
rqstp->rq_sock = svsk;
- atomic_inc(&svsk->sk_inuse);
+ svc_xprt_get(&svsk->sk_xprt);
rqstp->rq_reserved = serv->sv_max_mesg;
atomic_add(rqstp->rq_reserved, &svsk->sk_reserved);
} else {
@@ -1626,7 +1613,8 @@ svc_recv(struct svc_rqst *rqstp, long ti
svc_sock_received(svsk);
} else {
dprintk("svc: server %p, pool %u, socket %p, inuse=%d\n",
- rqstp, pool->sp_id, svsk, atomic_read(&svsk->sk_inuse));
+ rqstp, pool->sp_id, svsk,
+ atomic_read(&svsk->sk_xprt.xpt_ref.refcount));
len = svsk->sk_xprt.xpt_ops.xpo_recvfrom(rqstp);
dprintk("svc: got len=%d\n", len);
}
@@ -1723,9 +1711,10 @@ svc_age_temp_sockets(unsigned long closu

if (!test_and_set_bit(SK_OLD, &svsk->sk_flags))
continue;
- if (atomic_read(&svsk->sk_inuse) > 1 || test_bit(SK_BUSY, &svsk->sk_flags))
+ if (atomic_read(&svsk->sk_xprt.xpt_ref.refcount) > 1
+ || test_bit(SK_BUSY, &svsk->sk_flags))
continue;
- atomic_inc(&svsk->sk_inuse);
+ svc_xprt_get(&svsk->sk_xprt);
list_move(le, &to_be_aged);
set_bit(SK_CLOSE, &svsk->sk_flags);
set_bit(SK_DETACHED, &svsk->sk_flags);
@@ -1743,7 +1732,7 @@ svc_age_temp_sockets(unsigned long closu

/* a thread will dequeue and close it soon */
svc_sock_enqueue(svsk);
- svc_sock_put(svsk);
+ svc_xprt_put(&svsk->sk_xprt);
}

mod_timer(&serv->sv_temptimer, jiffies + svc_conn_age_period * HZ);
@@ -1788,7 +1777,6 @@ static struct svc_sock *svc_setup_socket
svsk->sk_odata = inet->sk_data_ready;
svsk->sk_owspace = inet->sk_write_space;
svsk->sk_server = serv;
- atomic_set(&svsk->sk_inuse, 1);
svsk->sk_lastrecv = get_seconds();
spin_lock_init(&svsk->sk_lock);
INIT_LIST_HEAD(&svsk->sk_deferred);
@@ -1977,8 +1965,8 @@ svc_delete_socket(struct svc_sock *svsk)
* is about to be destroyed (in svc_destroy).
*/
if (!test_and_set_bit(SK_DEAD, &svsk->sk_flags)) {
- BUG_ON(atomic_read(&svsk->sk_inuse)<2);
- atomic_dec(&svsk->sk_inuse);
+ BUG_ON(atomic_read(&svsk->sk_xprt.xpt_ref.refcount)<2);
+ svc_xprt_put(&svsk->sk_xprt);
if (test_bit(SK_TEMP, &svsk->sk_flags))
serv->sv_tmpcnt--;
}
@@ -1993,10 +1981,10 @@ static void svc_close_socket(struct svc_
/* someone else will have to effect the close */
return;

- atomic_inc(&svsk->sk_inuse);
+ svc_xprt_get(&svsk->sk_xprt);
svc_delete_socket(svsk);
clear_bit(SK_BUSY, &svsk->sk_flags);
- svc_sock_put(svsk);
+ svc_xprt_put(&svsk->sk_xprt);
}

void svc_force_close_socket(struct svc_sock *svsk)
@@ -2022,7 +2010,7 @@ static void svc_revisit(struct cache_def
struct svc_sock *svsk;

if (too_many) {
- svc_sock_put(dr->svsk);
+ svc_xprt_put(&dr->svsk->sk_xprt);
kfree(dr);
return;
}
@@ -2034,7 +2022,7 @@ static void svc_revisit(struct cache_def
spin_unlock(&svsk->sk_lock);
set_bit(SK_DEFERRED, &svsk->sk_flags);
svc_sock_enqueue(svsk);
- svc_sock_put(svsk);
+ svc_xprt_put(&svsk->sk_xprt);
}

static struct cache_deferred_req *
@@ -2064,7 +2052,7 @@ svc_defer(struct cache_req *req)
dr->argslen = rqstp->rq_arg.len >> 2;
memcpy(dr->args, rqstp->rq_arg.head[0].iov_base-skip, dr->argslen<<2);
}
- atomic_inc(&rqstp->rq_sock->sk_inuse);
+ svc_xprt_get(rqstp->rq_xprt);
dr->svsk = rqstp->rq_sock;

dr->handle.revisit = svc_revisit;


2007-09-27 05:02:06

by Tom Tucker

Subject: [RFC,PATCH 17/33] svc: Make close transport independent


Move sk_list and sk_ready to svc_xprt. These lists are walked by services
when closing all of their transports, so moving them is combined here with
making close transport independent.

svc_force_close_socket has been replaced by svc_close_all, which takes a
list as an argument. This removes some svc internals knowledge from the
services.
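
To illustrate the "close everything on a list" shape, here is a small
userspace sketch. It is an assumption about the general pattern, not the
body of svc_close_all: the list helpers, struct xprt, and close_all are
hypothetical names, and the close itself is reduced to setting a flag. The
point is that the caller hands over a list head and the walker saves the
next pointer before unlinking each entry.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }
static int list_is_empty(const struct list_head *h) { return h->next == h; }

static void list_add_front(struct list_head *n, struct list_head *h)
{
	n->next = h->next;
	n->prev = h;
	h->next->prev = n;
	h->next = n;
}

/* Unlink an entry and leave it pointing at itself. */
static void list_unlink(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Hypothetical generic transport with its list linkage. */
struct xprt {
	struct list_head list;
	int closed;
};

/* Close every transport on the list; returns how many were closed. */
static int close_all(struct list_head *head)
{
	struct list_head *le, *next;
	int n = 0;

	for (le = head->next; le != head; le = next) {
		struct xprt *x =
			(struct xprt *)((char *)le - offsetof(struct xprt, list));

		next = le->next;  /* save before the entry is unlinked */
		list_unlink(le);
		x->closed = 1;
		n++;
	}
	return n;
}
```

Because the walker only needs a list head, the caller does not have to know
which concrete transport type sits behind each entry.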

This code races with module removal and transport addition.

Signed-off-by: Tom Tucker <[email protected]>
---

fs/nfsd/nfssvc.c | 4 +-
include/linux/sunrpc/svc_xprt.h | 2 +
include/linux/sunrpc/svcsock.h | 4 --
net/sunrpc/svc.c | 9 +---
net/sunrpc/svc_xprt.c | 2 +
net/sunrpc/svcsock.c | 100 ++++++++++++++++++++-------------------
6 files changed, 60 insertions(+), 61 deletions(-)

diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index bf70b06..4f6d6fd 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -155,8 +155,8 @@ static int killsig; /* signal that was u
static void nfsd_last_thread(struct svc_serv *serv)
{
/* When last nfsd thread exits we need to do some clean-up */
- struct svc_sock *svsk;
- list_for_each_entry(svsk, &serv->sv_permsocks, sk_list)
+ struct svc_xprt *xprt;
+ list_for_each_entry(xprt, &serv->sv_permsocks, xpt_list)
lockd_down();
nfsd_serv = NULL;
nfsd_racache_shutdown();
diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index b850922..84e31bc 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -37,6 +37,8 @@ struct svc_xprt {
struct svc_xprt_ops xpt_ops;
u32 xpt_max_payload;
struct kref xpt_ref;
+ struct list_head xpt_list;
+ struct list_head xpt_ready;
unsigned long xpt_flags;
#define XPT_BUSY 0 /* enqueued/receiving */
#define XPT_CONN 1 /* conn pending */
diff --git a/include/linux/sunrpc/svcsock.h b/include/linux/sunrpc/svcsock.h
index 92d4cc9..060508b 100644
--- a/include/linux/sunrpc/svcsock.h
+++ b/include/linux/sunrpc/svcsock.h
@@ -17,8 +17,6 @@ #include <linux/sunrpc/svc_xprt.h>
*/
struct svc_sock {
struct svc_xprt sk_xprt;
- struct list_head sk_ready; /* list of ready sockets */
- struct list_head sk_list; /* list of all sockets */
struct socket * sk_sock; /* berkeley socket layer */
struct sock * sk_sk; /* INET layer */

@@ -51,7 +49,7 @@ struct svc_sock {
/*
* Function prototypes.
*/
-void svc_force_close_socket(struct svc_sock *);
+void svc_close_all(struct list_head *);
int svc_recv(struct svc_rqst *, long);
int svc_send(struct svc_rqst *);
void svc_drop(struct svc_rqst *);
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index ee68117..440ea59 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -458,9 +458,6 @@ svc_create_pooled(struct svc_program *pr
void
svc_destroy(struct svc_serv *serv)
{
- struct svc_sock *svsk;
- struct svc_sock *tmp;
-
dprintk("svc: svc_destroy(%s, %d)\n",
serv->sv_program->pg_name,
serv->sv_nrthreads);
@@ -475,14 +472,12 @@ svc_destroy(struct svc_serv *serv)

del_timer_sync(&serv->sv_temptimer);

- list_for_each_entry_safe(svsk, tmp, &serv->sv_tempsocks, sk_list)
- svc_force_close_socket(svsk);
+ svc_close_all(&serv->sv_tempsocks);

if (serv->sv_shutdown)
serv->sv_shutdown(serv);

- list_for_each_entry_safe(svsk, tmp, &serv->sv_permsocks, sk_list)
- svc_force_close_socket(svsk);
+ svc_close_all(&serv->sv_permsocks);

BUG_ON(!list_empty(&serv->sv_permsocks));
BUG_ON(!list_empty(&serv->sv_tempsocks));
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index e1a9378..c5eaf8b 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -109,6 +109,8 @@ void svc_xprt_init(struct svc_xprt_class
xpt->xpt_max_payload = xcl->xcl_max_payload;
kref_init(&xpt->xpt_ref);
xpt->xpt_server = serv;
+ INIT_LIST_HEAD(&xpt->xpt_list);
+ INIT_LIST_HEAD(&xpt->xpt_ready);
}
EXPORT_SYMBOL_GPL(svc_xprt_init);

diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 625e31c..be73044 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -79,11 +79,11 @@ #define RPCDBG_FACILITY RPCDBG_SVCXPRT

static struct svc_sock *svc_setup_socket(struct svc_serv *, struct socket *,
int *errp, int flags);
-static void svc_delete_socket(struct svc_sock *svsk);
+static void svc_delete_xprt(struct svc_xprt *xprt);
static void svc_udp_data_ready(struct sock *, int);
static int svc_udp_recvfrom(struct svc_rqst *);
static int svc_udp_sendto(struct svc_rqst *);
-static void svc_close_socket(struct svc_sock *svsk);
+static void svc_close_xprt(struct svc_xprt *xprt);
static void svc_sock_detach(struct svc_xprt *);
static void svc_sock_free(struct svc_xprt *);

@@ -308,7 +308,7 @@ svc_sock_enqueue(struct svc_sock *svsk)
wake_up(&rqstp->rq_wait);
} else {
dprintk("svc: socket %p put into queue\n", svsk->sk_sk);
- list_add_tail(&svsk->sk_ready, &pool->sp_sockets);
+ list_add_tail(&svsk->sk_xprt.xpt_ready, &pool->sp_sockets);
BUG_ON(svsk->sk_xprt.xpt_pool != pool);
}

@@ -328,8 +328,8 @@ svc_sock_dequeue(struct svc_pool *pool)
return NULL;

svsk = list_entry(pool->sp_sockets.next,
- struct svc_sock, sk_ready);
- list_del_init(&svsk->sk_ready);
+ struct svc_sock, sk_xprt.xpt_ready);
+ list_del_init(&svsk->sk_xprt.xpt_ready);

dprintk("svc: socket %p dequeued, inuse=%d\n",
svsk->sk_sk, atomic_read(&svsk->sk_xprt.xpt_ref.refcount));
@@ -587,7 +587,7 @@ svc_sock_names(char *buf, struct svc_ser
if (!serv)
return 0;
spin_lock_bh(&serv->sv_lock);
- list_for_each_entry(svsk, &serv->sv_permsocks, sk_list) {
+ list_for_each_entry(svsk, &serv->sv_permsocks, sk_xprt.xpt_list) {
int onelen = one_sock_name(buf+len, svsk);
if (toclose && strcmp(toclose, buf+len) == 0)
closesk = svsk;
@@ -599,7 +599,7 @@ svc_sock_names(char *buf, struct svc_ser
/* Should unregister with portmap, but you cannot
* unregister just one protocol...
*/
- svc_close_socket(closesk);
+ svc_close_xprt(&closesk->sk_xprt);
else if (toclose)
return -ENOENT;
return len;
@@ -1275,7 +1275,7 @@ svc_tcp_recvfrom(struct svc_rqst *rqstp)
return len;

err_delete:
- svc_delete_socket(svsk);
+ svc_delete_xprt(&svsk->sk_xprt);
return -EAGAIN;

error:
@@ -1441,12 +1441,12 @@ svc_sock_update_bufs(struct svc_serv *se
spin_lock_bh(&serv->sv_lock);
list_for_each(le, &serv->sv_permsocks) {
struct svc_sock *svsk =
- list_entry(le, struct svc_sock, sk_list);
+ list_entry(le, struct svc_sock, sk_xprt.xpt_list);
set_bit(XPT_CHNGBUF, &svsk->sk_xprt.xpt_flags);
}
list_for_each(le, &serv->sv_tempsocks) {
struct svc_sock *svsk =
- list_entry(le, struct svc_sock, sk_list);
+ list_entry(le, struct svc_sock, sk_xprt.xpt_list);
set_bit(XPT_CHNGBUF, &svsk->sk_xprt.xpt_flags);
}
spin_unlock_bh(&serv->sv_lock);
@@ -1488,7 +1488,7 @@ svc_check_conn_limits(struct svc_serv *s
*/
svsk = list_entry(serv->sv_tempsocks.prev,
struct svc_sock,
- sk_list);
+ sk_xprt.xpt_list);
set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags);
svc_xprt_get(&svsk->sk_xprt);
}
@@ -1595,7 +1595,7 @@ svc_recv(struct svc_rqst *rqstp, long ti
len = 0;
if (test_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags)) {
dprintk("svc_recv: found XPT_CLOSE\n");
- svc_delete_socket(svsk);
+ svc_delete_xprt(&svsk->sk_xprt);
} else if (test_bit(XPT_LISTENER, &svsk->sk_xprt.xpt_flags)) {
struct svc_xprt *newxpt;
newxpt = svsk->sk_xprt.xpt_ops.xpo_accept(&svsk->sk_xprt);
@@ -1704,7 +1704,7 @@ svc_age_temp_sockets(unsigned long closu
}

list_for_each_safe(le, next, &serv->sv_tempsocks) {
- svsk = list_entry(le, struct svc_sock, sk_list);
+ svsk = list_entry(le, struct svc_sock, sk_xprt.xpt_list);

if (!test_and_set_bit(XPT_OLD, &svsk->sk_xprt.xpt_flags))
continue;
@@ -1720,9 +1720,9 @@ svc_age_temp_sockets(unsigned long closu

while (!list_empty(&to_be_aged)) {
le = to_be_aged.next;
- /* fiddling the sk_list node is safe 'cos we're XPT_DETACHED */
+ /* fiddling the sk_xprt.xpt_list node is safe 'cos we're XPT_DETACHED */
list_del_init(le);
- svsk = list_entry(le, struct svc_sock, sk_list);
+ svsk = list_entry(le, struct svc_sock, sk_xprt.xpt_list);

dprintk("queuing svsk %p for closing, %lu seconds old\n",
svsk, get_seconds() - svsk->sk_lastrecv);
@@ -1776,7 +1776,6 @@ static struct svc_sock *svc_setup_socket
svsk->sk_lastrecv = get_seconds();
spin_lock_init(&svsk->sk_lock);
INIT_LIST_HEAD(&svsk->sk_deferred);
- INIT_LIST_HEAD(&svsk->sk_ready);
mutex_init(&svsk->sk_mutex);

/* Initialize the socket */
@@ -1791,7 +1790,7 @@ static struct svc_sock *svc_setup_socket
spin_lock_bh(&serv->sv_lock);
if (is_temporary) {
set_bit(XPT_TEMP, &svsk->sk_xprt.xpt_flags);
- list_add(&svsk->sk_list, &serv->sv_tempsocks);
+ list_add(&svsk->sk_xprt.xpt_list, &serv->sv_tempsocks);
serv->sv_tmpcnt++;
if (serv->sv_temptimer.function == NULL) {
/* setup timer to age temp sockets */
@@ -1802,7 +1801,7 @@ static struct svc_sock *svc_setup_socket
}
} else {
clear_bit(XPT_TEMP, &svsk->sk_xprt.xpt_flags);
- list_add(&svsk->sk_list, &serv->sv_permsocks);
+ list_add(&svsk->sk_xprt.xpt_list, &serv->sv_permsocks);
}
spin_unlock_bh(&serv->sv_lock);

@@ -1937,66 +1936,69 @@ svc_sock_free(struct svc_xprt *xprt)
}

/*
- * Remove a dead socket
+ * Remove a dead transport
*/
static void
-svc_delete_socket(struct svc_sock *svsk)
+svc_delete_xprt(struct svc_xprt *xprt)
{
struct svc_serv *serv;
- struct sock *sk;

- dprintk("svc: svc_delete_socket(%p)\n", svsk);
+ dprintk("svc: svc_delete_xprt(%p)\n", xprt);

- serv = svsk->sk_xprt.xpt_server;
- sk = svsk->sk_sk;
+ serv = xprt->xpt_server;

- svsk->sk_xprt.xpt_ops.xpo_detach(&svsk->sk_xprt);
+ xprt->xpt_ops.xpo_detach(xprt);

spin_lock_bh(&serv->sv_lock);

- if (!test_and_set_bit(XPT_DETACHED, &svsk->sk_xprt.xpt_flags))
- list_del_init(&svsk->sk_list);
+ if (!test_and_set_bit(XPT_DETACHED, &xprt->xpt_flags))
+ list_del_init(&xprt->xpt_list);
/*
- * We used to delete the svc_sock from whichever list
- * it's sk_ready node was on, but we don't actually
+ * We used to delete the transport from whichever list
+ * its xpt_ready node was on, but we don't actually
* need to. This is because the only time we're called
* while still attached to a queue, the queue itself
* is about to be destroyed (in svc_destroy).
*/
- if (!test_and_set_bit(XPT_DEAD, &svsk->sk_xprt.xpt_flags)) {
- BUG_ON(atomic_read(&svsk->sk_xprt.xpt_ref.refcount)<2);
- svc_xprt_put(&svsk->sk_xprt);
- if (test_bit(XPT_TEMP, &svsk->sk_xprt.xpt_flags))
+ if (!test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
+ BUG_ON(atomic_read(&xprt->xpt_ref.refcount)<2);
+ svc_xprt_put(xprt);
+ if (test_bit(XPT_TEMP, &xprt->xpt_flags))
serv->sv_tmpcnt--;
}

spin_unlock_bh(&serv->sv_lock);
}

-static void svc_close_socket(struct svc_sock *svsk)
+static void svc_close_xprt(struct svc_xprt *xprt)
{
- set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags);
- if (test_and_set_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags))
+ set_bit(XPT_CLOSE, &xprt->xpt_flags);
+ if (test_and_set_bit(XPT_BUSY, &xprt->xpt_flags))
/* someone else will have to effect the close */
return;

- svc_xprt_get(&svsk->sk_xprt);
- svc_delete_socket(svsk);
- clear_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags);
- svc_xprt_put(&svsk->sk_xprt);
+ svc_xprt_get(xprt);
+ svc_delete_xprt(xprt);
+ clear_bit(XPT_BUSY, &xprt->xpt_flags);
+ svc_xprt_put(xprt);
}

-void svc_force_close_socket(struct svc_sock *svsk)
+void svc_close_all(struct list_head *xprt_list)
{
- set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags);
- if (test_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags)) {
- /* Waiting to be processed, but no threads left,
- * So just remove it from the waiting list
- */
- list_del_init(&svsk->sk_ready);
- clear_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags);
+ struct svc_xprt *xprt;
+ struct svc_xprt *tmp;
+
+ list_for_each_entry_safe(xprt, tmp, xprt_list, xpt_list) {
+ set_bit(XPT_CLOSE, &xprt->xpt_flags);
+ if (test_bit(XPT_BUSY, &xprt->xpt_flags)) {
+ /* Waiting to be processed, but no threads left,
+ * So just remove it from the waiting list
+ */
+ list_del_init(&xprt->xpt_ready);
+ clear_bit(XPT_BUSY, &xprt->xpt_flags);
+ }
+ svc_close_xprt(xprt);
}
- svc_close_socket(svsk);
}

/*


2007-09-27 05:02:13

by Tom Tucker

Subject: [RFC,PATCH 18/33] svc: Move sk_reserved to svc_xprt


This functionally trivial patch moves the sk_reserved field to the
transport independent svc_xprt structure.
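
For readers new to this accounting, here is a minimal userspace sketch of what the reserved counter does. The names xprt_sketch, xprt_reserve, xprt_shrink and xprt_has_wspace are hypothetical and C11 atomics stand in for the kernel's atomic_t; only the arithmetic is taken from the hunks below.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the transport independent structure. */
struct xprt_sketch {
	atomic_int reserved;	/* plays the role of xpt_reserved */
};

/* When a request is handed to a thread, reserve room for a maximum-sized reply. */
void xprt_reserve(struct xprt_sketch *xprt, int max_mesg)
{
	atomic_fetch_add(&xprt->reserved, max_mesg);
}

/*
 * Once the real reply size is known, give back the difference.
 * Returns the new per-request reservation, as svc_reserve stores it
 * in rq_reserved.
 */
int xprt_shrink(struct xprt_sketch *xprt, int rq_reserved, int space)
{
	if (space < rq_reserved) {
		atomic_fetch_sub(&xprt->reserved, rq_reserved - space);
		return space;
	}
	return rq_reserved;
}

/* Same test as the has_wspace callbacks: twice the required space must fit. */
bool xprt_has_wspace(struct xprt_sketch *xprt, int max_mesg, int wspace)
{
	int required = atomic_load(&xprt->reserved) + max_mesg;

	return required * 2 <= wspace;
}
```

Because the counter lives in the generic structure, the enqueue and reserve paths can update it without knowing which transport they are dealing with.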

Signed-off-by: Tom Tucker <[email protected]>
---

include/linux/sunrpc/svc_xprt.h | 1 +
include/linux/sunrpc/svcsock.h | 2 --
net/sunrpc/svcsock.c | 10 +++++-----
3 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index 84e31bc..8b46561 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -54,6 +54,7 @@ #define XPT_LISTENER 11 /* listening e

struct svc_pool * xpt_pool; /* current pool iff queued */
struct svc_serv * xpt_server; /* service for this transport */
+ atomic_t xpt_reserved; /* space on outq that is reserved */
};

int svc_reg_xprt_class(struct svc_xprt_class *);
diff --git a/include/linux/sunrpc/svcsock.h b/include/linux/sunrpc/svcsock.h
index 060508b..ba41f11 100644
--- a/include/linux/sunrpc/svcsock.h
+++ b/include/linux/sunrpc/svcsock.h
@@ -20,8 +20,6 @@ struct svc_sock {
struct socket * sk_sock; /* berkeley socket layer */
struct sock * sk_sk; /* INET layer */

- atomic_t sk_reserved; /* space on outq that is reserved */
-
spinlock_t sk_lock; /* protects sk_deferred and
* sk_info_authunix */
struct list_head sk_deferred; /* deferred requests that need to
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index be73044..8ea950a 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -303,7 +303,7 @@ svc_sock_enqueue(struct svc_sock *svsk)
rqstp->rq_sock = svsk;
svc_xprt_get(&svsk->sk_xprt);
rqstp->rq_reserved = serv->sv_max_mesg;
- atomic_add(rqstp->rq_reserved, &svsk->sk_reserved);
+ atomic_add(rqstp->rq_reserved, &svsk->sk_xprt.xpt_reserved);
BUG_ON(svsk->sk_xprt.xpt_pool != pool);
wake_up(&rqstp->rq_wait);
} else {
@@ -368,7 +368,7 @@ void svc_reserve(struct svc_rqst *rqstp,

if (space < rqstp->rq_reserved) {
struct svc_sock *svsk = rqstp->rq_sock;
- atomic_sub((rqstp->rq_reserved - space), &svsk->sk_reserved);
+ atomic_sub((rqstp->rq_reserved - space), &svsk->sk_xprt.xpt_reserved);
rqstp->rq_reserved = space;

svc_sock_enqueue(svsk);
@@ -897,7 +897,7 @@ svc_udp_has_wspace(struct svc_xprt *xprt
* sock space.
*/
set_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
- required = atomic_read(&svsk->sk_reserved) + serv->sv_max_mesg;
+ required = atomic_read(&svsk->sk_xprt.xpt_reserved) + serv->sv_max_mesg;
if (required*2 > sock_wspace(svsk->sk_sk))
return 0;
clear_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
@@ -1348,7 +1348,7 @@ svc_tcp_has_wspace(struct svc_xprt *xprt
* sock space.
*/
set_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
- required = atomic_read(&svsk->sk_reserved) + serv->sv_max_mesg;
+ required = atomic_read(&svsk->sk_xprt.xpt_reserved) + serv->sv_max_mesg;
if (required*2 > sk_stream_wspace(svsk->sk_sk))
return 0;
clear_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
@@ -1563,7 +1563,7 @@ svc_recv(struct svc_rqst *rqstp, long ti
rqstp->rq_sock = svsk;
svc_xprt_get(&svsk->sk_xprt);
rqstp->rq_reserved = serv->sv_max_mesg;
- atomic_add(rqstp->rq_reserved, &svsk->sk_reserved);
+ atomic_add(rqstp->rq_reserved, &svsk->sk_xprt.xpt_reserved);
} else {
/* No data pending. Go to sleep */
svc_thread_enqueue(pool, rqstp);


2007-09-27 05:02:16

by Tom Tucker

Subject: [RFC, PATCH 19/33] svc: Make the enqueue service transport neutral and export it.


The svc_sock_enqueue function is now transport independent since all of
the fields it touches have been moved to the transport independent svc_xprt
structure. Change the function to use the svc_xprt structure directly
instead of the transport specific svc_sock structure.

Transport specific data-ready handlers need to call this function, so
export it.
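
As an illustration of the calling pattern this enables, here is a userspace sketch with hypothetical names (xprt_sketch, sock_sketch, sock_from_xprt). It is single-threaded and uses plain bit operations, whereas the kernel code uses atomic bitops under the pool lock. It only shows how a provider's data-ready handler hands its embedded generic structure to the shared enqueue routine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel definitions. */
struct xprt_sketch {
	unsigned long flags;	/* bit 0: data pending, bit 1: busy */
	int enqueued;		/* how many times we were queued */
};

struct sock_sketch {
	int fd;				/* transport-private state */
	struct xprt_sketch xprt;	/* embedded generic part, like sk_xprt */
};

#define SKETCH_DATA (1UL << 0)
#define SKETCH_BUSY (1UL << 1)

/* Generic code: sees only the embedded structure. */
void xprt_enqueue(struct xprt_sketch *xprt)
{
	if (!(xprt->flags & SKETCH_DATA))
		return;		/* nothing to do */
	if (xprt->flags & SKETCH_BUSY)
		return;		/* already queued or being serviced */
	xprt->flags |= SKETCH_BUSY;
	xprt->enqueued++;
}

/* Provider code: recover the private structure from the generic one. */
struct sock_sketch *sock_from_xprt(struct xprt_sketch *xprt)
{
	return (struct sock_sketch *)((char *)xprt -
				      offsetof(struct sock_sketch, xprt));
}

/* Provider data-ready callback: set the flag, then call the generic enqueue. */
void sock_data_ready(struct sock_sketch *sk)
{
	sk->xprt.flags |= SKETCH_DATA;
	xprt_enqueue(&sk->xprt);
}
```

A second data-ready event while the busy bit is set does not queue the transport again, which is the property the XPT_BUSY test in the real function protects.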

Signed-off-by: Tom Tucker <[email protected]>
---

net/sunrpc/svcsock.c | 95 +++++++++++++++++++++++++-------------------------
1 files changed, 48 insertions(+), 47 deletions(-)

diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 8ea950a..db9963e 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -5,7 +5,7 @@
*
* The server scheduling algorithm does not always distribute the load
* evenly when servicing a single client. May need to modify the
- * svc_sock_enqueue procedure...
+ * svc_xprt_enqueue procedure...
*
* TCP support is largely untested and may be a little slow. The problem
* is that we currently do two separate recvfrom's, one for the 4-byte
@@ -61,7 +61,7 @@ #include <linux/sunrpc/stats.h>
* providing that certain rules are followed:
*
* XPT_CONN, XPT_DATA, can be set or cleared at any time.
- * after a set, svc_sock_enqueue must be called.
+ * after a set, svc_xprt_enqueue must be called.
* after a clear, the socket must be read/accepted
* if this succeeds, it must be set again.
* XPT_CLOSE can set at any time. It is never cleared.
@@ -227,22 +227,22 @@ svc_sock_wspace(struct svc_sock *svsk)
* processes, wake 'em up.
*
*/
-static void
-svc_sock_enqueue(struct svc_sock *svsk)
+void
+svc_xprt_enqueue(struct svc_xprt *xprt)
{
- struct svc_serv *serv = svsk->sk_xprt.xpt_server;
+ struct svc_serv *serv = xprt->xpt_server;
struct svc_pool *pool;
struct svc_rqst *rqstp;
int cpu;

- if (!(svsk->sk_xprt.xpt_flags &
+ if (!(xprt->xpt_flags &
( (1<<XPT_CONN)|(1<<XPT_DATA)|(1<<XPT_CLOSE)|(1<<XPT_DEFERRED)) ))
return;
- if (test_bit(XPT_DEAD, &svsk->sk_xprt.xpt_flags))
+ if (test_bit(XPT_DEAD, &xprt->xpt_flags))
return;

cpu = get_cpu();
- pool = svc_pool_for_cpu(svsk->sk_xprt.xpt_server, cpu);
+ pool = svc_pool_for_cpu(xprt->xpt_server, cpu);
put_cpu();

spin_lock_bh(&pool->sp_lock);
@@ -250,11 +250,11 @@ svc_sock_enqueue(struct svc_sock *svsk)
if (!list_empty(&pool->sp_threads) &&
!list_empty(&pool->sp_sockets))
printk(KERN_ERR
- "svc_sock_enqueue: threads and sockets both waiting??\n");
+ "svc_xprt_enqueue: threads and sockets both waiting??\n");

- if (test_bit(XPT_DEAD, &svsk->sk_xprt.xpt_flags)) {
+ if (test_bit(XPT_DEAD, &xprt->xpt_flags)) {
/* Don't enqueue dead sockets */
- dprintk("svc: socket %p is dead, not enqueued\n", svsk->sk_sk);
+ dprintk("svc: transport %p is dead, not enqueued\n", xprt);
goto out_unlock;
}

@@ -263,28 +263,28 @@ svc_sock_enqueue(struct svc_sock *svsk)
* on the idle list. We update XPT_BUSY atomically because
* it also guards against trying to enqueue the svc_sock twice.
*/
- if (test_and_set_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags)) {
+ if (test_and_set_bit(XPT_BUSY, &xprt->xpt_flags)) {
/* Don't enqueue socket while already enqueued */
- dprintk("svc: socket %p busy, not enqueued\n", svsk->sk_sk);
+ dprintk("svc: transport %p busy, not enqueued\n", xprt);
goto out_unlock;
}
- BUG_ON(svsk->sk_xprt.xpt_pool != NULL);
- svsk->sk_xprt.xpt_pool = pool;
+ BUG_ON(xprt->xpt_pool != NULL);
+ xprt->xpt_pool = pool;

/* Handle pending connection */
- if (test_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags))
+ if (test_bit(XPT_CONN, &xprt->xpt_flags))
goto process;

/* Handle close in-progress */
- if (test_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags))
+ if (test_bit(XPT_CLOSE, &xprt->xpt_flags))
goto process;

/* Check if we have space to reply to a request */
- if (!svsk->sk_xprt.xpt_ops.xpo_has_wspace(&svsk->sk_xprt)) {
+ if (!xprt->xpt_ops.xpo_has_wspace(xprt)) {
/* Don't enqueue while not enough space for reply */
- dprintk("svc: no write space, socket %p not enqueued\n", svsk);
- svsk->sk_xprt.xpt_pool = NULL;
- clear_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags);
+ dprintk("svc: no write space, transport %p not enqueued\n", xprt);
+ xprt->xpt_pool = NULL;
+ clear_bit(XPT_BUSY, &xprt->xpt_flags);
goto out_unlock;
}

@@ -293,28 +293,29 @@ svc_sock_enqueue(struct svc_sock *svsk)
rqstp = list_entry(pool->sp_threads.next,
struct svc_rqst,
rq_list);
- dprintk("svc: socket %p served by daemon %p\n",
- svsk->sk_sk, rqstp);
+ dprintk("svc: transport %p served by daemon %p\n",
+ xprt, rqstp);
svc_thread_dequeue(pool, rqstp);
- if (rqstp->rq_sock)
+ if (rqstp->rq_xprt)
printk(KERN_ERR
- "svc_sock_enqueue: server %p, rq_sock=%p!\n",
- rqstp, rqstp->rq_sock);
- rqstp->rq_sock = svsk;
- svc_xprt_get(&svsk->sk_xprt);
+ "svc_xprt_enqueue: server %p, rq_xprt=%p!\n",
+ rqstp, rqstp->rq_xprt);
+ rqstp->rq_xprt = xprt;
+ svc_xprt_get(xprt);
rqstp->rq_reserved = serv->sv_max_mesg;
- atomic_add(rqstp->rq_reserved, &svsk->sk_xprt.xpt_reserved);
- BUG_ON(svsk->sk_xprt.xpt_pool != pool);
+ atomic_add(rqstp->rq_reserved, &xprt->xpt_reserved);
+ BUG_ON(xprt->xpt_pool != pool);
wake_up(&rqstp->rq_wait);
} else {
- dprintk("svc: socket %p put into queue\n", svsk->sk_sk);
- list_add_tail(&svsk->sk_xprt.xpt_ready, &pool->sp_sockets);
- BUG_ON(svsk->sk_xprt.xpt_pool != pool);
+ dprintk("svc: transport %p put into queue\n", xprt);
+ list_add_tail(&xprt->xpt_ready, &pool->sp_sockets);
+ BUG_ON(xprt->xpt_pool != pool);
}

out_unlock:
spin_unlock_bh(&pool->sp_lock);
}
+EXPORT_SYMBOL_GPL(svc_xprt_enqueue);

/*
* Dequeue the first socket. Must be called with the pool->sp_lock held.
@@ -348,7 +349,7 @@ svc_sock_received(struct svc_sock *svsk)
{
svsk->sk_xprt.xpt_pool = NULL;
clear_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags);
- svc_sock_enqueue(svsk);
+ svc_xprt_enqueue(&svsk->sk_xprt);
}


@@ -367,11 +368,11 @@ void svc_reserve(struct svc_rqst *rqstp,
space += rqstp->rq_res.head[0].iov_len;

if (space < rqstp->rq_reserved) {
- struct svc_sock *svsk = rqstp->rq_sock;
- atomic_sub((rqstp->rq_reserved - space), &svsk->sk_xprt.xpt_reserved);
+ struct svc_xprt *xprt = rqstp->rq_xprt;
+ atomic_sub((rqstp->rq_reserved - space), &xprt->xpt_reserved);
rqstp->rq_reserved = space;

- svc_sock_enqueue(svsk);
+ svc_xprt_enqueue(xprt);
}
}

@@ -698,7 +699,7 @@ svc_udp_data_ready(struct sock *sk, int
dprintk("svc: socket %p(inet %p), count=%d, busy=%d\n",
svsk, sk, count, test_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags));
set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);
- svc_sock_enqueue(svsk);
+ svc_xprt_enqueue(&svsk->sk_xprt);
}
if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
wake_up_interruptible(sk->sk_sleep);
@@ -715,7 +716,7 @@ svc_write_space(struct sock *sk)
if (svsk) {
dprintk("svc: socket %p(inet %p), write_space busy=%d\n",
svsk, sk, test_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags));
- svc_sock_enqueue(svsk);
+ svc_xprt_enqueue(&svsk->sk_xprt);
}

if (sk->sk_sleep && waitqueue_active(sk->sk_sleep)) {
@@ -990,7 +991,7 @@ svc_tcp_listen_data_ready(struct sock *s
if (sk->sk_state == TCP_LISTEN) {
if (svsk) {
set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags);
- svc_sock_enqueue(svsk);
+ svc_xprt_enqueue(&svsk->sk_xprt);
} else
printk("svc: socket %p: no user data\n", sk);
}
@@ -1014,7 +1015,7 @@ svc_tcp_state_change(struct sock *sk)
printk("svc: socket %p: no user data\n", sk);
else {
set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags);
- svc_sock_enqueue(svsk);
+ svc_xprt_enqueue(&svsk->sk_xprt);
}
if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
wake_up_interruptible_all(sk->sk_sleep);
@@ -1029,7 +1030,7 @@ svc_tcp_data_ready(struct sock *sk, int
sk, sk->sk_user_data);
if (svsk) {
set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);
- svc_sock_enqueue(svsk);
+ svc_xprt_enqueue(&svsk->sk_xprt);
}
if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
wake_up_interruptible(sk->sk_sleep);
@@ -1082,7 +1083,7 @@ svc_tcp_accept(struct svc_xprt *xprt)
}

set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags);
- svc_sock_enqueue(svsk);
+ svc_xprt_enqueue(&svsk->sk_xprt);

err = kernel_getpeername(newsock, sin, &slen);
if (err < 0) {
@@ -1318,7 +1319,7 @@ svc_tcp_sendto(struct svc_rqst *rqstp)
(sent<0)?"got error":"sent only",
sent, xbufp->len);
set_bit(XPT_CLOSE, &rqstp->rq_sock->sk_xprt.xpt_flags);
- svc_sock_enqueue(rqstp->rq_sock);
+ svc_xprt_enqueue(rqstp->rq_xprt);
sent = -EAGAIN;
}
return sent;
@@ -1495,7 +1496,7 @@ svc_check_conn_limits(struct svc_serv *s
spin_unlock_bh(&serv->sv_lock);

if (svsk) {
- svc_sock_enqueue(svsk);
+ svc_xprt_enqueue(&svsk->sk_xprt);
svc_xprt_put(&svsk->sk_xprt);
}
}
@@ -1728,7 +1729,7 @@ svc_age_temp_sockets(unsigned long closu
svsk, get_seconds() - svsk->sk_lastrecv);

/* a thread will dequeue and close it soon */
- svc_sock_enqueue(svsk);
+ svc_xprt_enqueue(&svsk->sk_xprt);
svc_xprt_put(&svsk->sk_xprt);
}

@@ -2022,7 +2023,7 @@ static void svc_revisit(struct cache_def
list_add(&dr->handle.recent, &svsk->sk_deferred);
spin_unlock(&svsk->sk_lock);
set_bit(XPT_DEFERRED, &svsk->sk_xprt.xpt_flags);
- svc_sock_enqueue(svsk);
+ svc_xprt_enqueue(&svsk->sk_xprt);
svc_xprt_put(&svsk->sk_xprt);
}



2007-09-27 05:02:17

by Tom Tucker

Subject: [RFC,PATCH 20/33] svc: Make svc_send transport neutral


Move the sk_mutex field to the transport independent svc_xprt structure.
Now all the fields that svc_send touches are transport neutral. Change the
svc_send function to use the transport independent svc_xprt directly instead
of the transport dependent svc_sock structure.
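
The resulting shape of the send path can be sketched in userspace as follows. This is an illustration under assumptions: xprt_sketch, xprt_ops_sketch, xprt_send and echo_sendto are hypothetical names, a pthread mutex stands in for the kernel mutex, and a bool stands in for the XPT_DEAD flag bit.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct xprt_sketch;

/* Hypothetical per-transport operations table with a single entry. */
struct xprt_ops_sketch {
	int (*sendto)(struct xprt_sketch *xprt, int len);
};

struct xprt_sketch {
	struct xprt_ops_sketch ops;
	pthread_mutex_t mutex;	/* serializes senders, like xpt_mutex */
	bool dead;		/* stands in for the XPT_DEAD flag */
};

/* Generic send: it never looks at transport-private state. */
int xprt_send(struct xprt_sketch *xprt, int len)
{
	int sent;

	if (xprt == NULL)
		return -EFAULT;

	pthread_mutex_lock(&xprt->mutex);
	if (xprt->dead)
		sent = -ENOTCONN;
	else
		sent = xprt->ops.sendto(xprt, len);
	pthread_mutex_unlock(&xprt->mutex);
	return sent;
}

/* Trivial provider used for the demonstration: pretends everything was sent. */
int echo_sendto(struct xprt_sketch *xprt, int len)
{
	(void)xprt;
	return len;
}
```

The dead check is made with the mutex held, so a sender cannot call into the provider after another sender has observed the transport as dead and released the lock.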

Signed-off-by: Tom Tucker <[email protected]>
---

include/linux/sunrpc/svc_xprt.h | 1 +
include/linux/sunrpc/svcsock.h | 1 -
net/sunrpc/svc_xprt.c | 1 +
net/sunrpc/svcsock.c | 17 ++++++++---------
4 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index 8b46561..bb3c02f 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -55,6 +55,7 @@ #define XPT_LISTENER 11 /* listening e
struct svc_pool * xpt_pool; /* current pool iff queued */
struct svc_serv * xpt_server; /* service for this transport */
atomic_t xpt_reserved; /* space on outq that is reserved */
+ struct mutex xpt_mutex; /* to serialize sending data */
};

int svc_reg_xprt_class(struct svc_xprt_class *);
diff --git a/include/linux/sunrpc/svcsock.h b/include/linux/sunrpc/svcsock.h
index ba41f11..41c2dfa 100644
--- a/include/linux/sunrpc/svcsock.h
+++ b/include/linux/sunrpc/svcsock.h
@@ -24,7 +24,6 @@ struct svc_sock {
* sk_info_authunix */
struct list_head sk_deferred; /* deferred requests that need to
* be revisted */
- struct mutex sk_mutex; /* to serialize sending data */

/* We keep the old state_change and data_ready CB's here */
void (*sk_ostate)(struct sock *);
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index c5eaf8b..d80fc5f 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -111,6 +111,7 @@ void svc_xprt_init(struct svc_xprt_class
xpt->xpt_server = serv;
INIT_LIST_HEAD(&xpt->xpt_list);
INIT_LIST_HEAD(&xpt->xpt_ready);
+ mutex_init(&xpt->xpt_mutex);
}
EXPORT_SYMBOL_GPL(svc_xprt_init);

diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index db9963e..dd70a6d 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -1650,12 +1650,12 @@ svc_drop(struct svc_rqst *rqstp)
int
svc_send(struct svc_rqst *rqstp)
{
- struct svc_sock *svsk;
+ struct svc_xprt *xprt;
int len;
struct xdr_buf *xb;

- if ((svsk = rqstp->rq_sock) == NULL) {
- printk(KERN_WARNING "NULL socket pointer in %s:%d\n",
+ if ((xprt = rqstp->rq_xprt) == NULL) {
+ printk(KERN_WARNING "NULL transport pointer in %s:%d\n",
__FILE__, __LINE__);
return -EFAULT;
}
@@ -1669,13 +1669,13 @@ svc_send(struct svc_rqst *rqstp)
xb->page_len +
xb->tail[0].iov_len;

- /* Grab svsk->sk_mutex to serialize outgoing data. */
- mutex_lock(&svsk->sk_mutex);
- if (test_bit(XPT_DEAD, &svsk->sk_xprt.xpt_flags))
+ /* Grab mutex to serialize outgoing data. */
+ mutex_lock(&xprt->xpt_mutex);
+ if (test_bit(XPT_DEAD, &xprt->xpt_flags))
len = -ENOTCONN;
else
- len = svsk->sk_xprt.xpt_ops.xpo_sendto(rqstp);
- mutex_unlock(&svsk->sk_mutex);
+ len = xprt->xpt_ops.xpo_sendto(rqstp);
+ mutex_unlock(&xprt->xpt_mutex);
svc_sock_release(rqstp);

if (len == -ECONNREFUSED || len == -ENOTCONN || len == -EAGAIN)
@@ -1777,7 +1777,6 @@ static struct svc_sock *svc_setup_socket
svsk->sk_lastrecv = get_seconds();
spin_lock_init(&svsk->sk_lock);
INIT_LIST_HEAD(&svsk->sk_deferred);
- mutex_init(&svsk->sk_mutex);

/* Initialize the socket */
if (sock->type == SOCK_DGRAM) {


2007-09-27 05:02:20

by Tom Tucker

Subject: [RFC, PATCH 21/33] svc: Change svc_sock_received to svc_xprt_received and export it


All fields touched by svc_sock_received are now transport independent.
Change it to use svc_xprt directly. This function is called from
transport dependent code, so export it.
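
The hand-off that "received" performs can be sketched in userspace like this. The names are hypothetical, C11 atomics stand in for the kernel bitops, and the pool is reduced to a dummy pointer. The sketch shows only the busy-bit protocol, not the thread wakeup.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_DATA 0x1u	/* stands in for XPT_DATA */
#define SKETCH_BUSY 0x2u	/* stands in for XPT_BUSY */

struct xprt_sketch {
	atomic_uint flags;
	void *pool;		/* non-NULL only while queued or being serviced */
	int enqueued;		/* counts successful enqueues, for the demo */
};

static int demo_pool;		/* dummy object to point the pool at */

/* Returns true if this call queued the transport. */
bool xprt_enqueue(struct xprt_sketch *x)
{
	if (!(atomic_load(&x->flags) & SKETCH_DATA))
		return false;	/* no work pending */
	/* Atomic test-and-set: only one caller wins the busy bit. */
	if (atomic_fetch_or(&x->flags, SKETCH_BUSY) & SKETCH_BUSY)
		return false;
	assert(x->pool == NULL);	/* mirrors the BUG_ON in the patch */
	x->pool = &demo_pool;
	x->enqueued++;
	return true;
}

/* The provider has finished pulling data: release and requeue. */
void xprt_received(struct xprt_sketch *x)
{
	x->pool = NULL;
	atomic_fetch_and(&x->flags, ~SKETCH_BUSY);
	xprt_enqueue(x);
}
```

Since the data flag is still set when the busy bit is dropped, the transport is queued again at once, which matches the "kick another thread ASAP" comment in the UDP receive path.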

Signed-off-by: Tom Tucker <[email protected]>
---

include/linux/sunrpc/svc_xprt.h | 2 +-
net/sunrpc/svcsock.c | 37 ++++++++++++++++++-------------------
2 files changed, 19 insertions(+), 20 deletions(-)

diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index bb3c02f..5b2aef4 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -62,8 +62,8 @@ int svc_reg_xprt_class(struct svc_xprt_c
int svc_unreg_xprt_class(struct svc_xprt_class *);
void svc_xprt_init(struct svc_xprt_class *, struct svc_xprt *, struct svc_serv *);
int svc_create_xprt(struct svc_serv *, char *, unsigned short, int);
+void svc_xprt_received(struct svc_xprt *);
void svc_xprt_put(struct svc_xprt *xprt);
-
static inline void svc_xprt_get(struct svc_xprt *xprt) {
kref_get(&xprt->xpt_ref);
}
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index dd70a6d..71b7f86 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -344,14 +344,14 @@ svc_sock_dequeue(struct svc_pool *pool)
* Note: XPT_DATA only gets cleared when a read-attempt finds
* no (or insufficient) data.
*/
-static inline void
-svc_sock_received(struct svc_sock *svsk)
+void
+svc_xprt_received(struct svc_xprt *xprt)
{
- svsk->sk_xprt.xpt_pool = NULL;
- clear_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags);
- svc_xprt_enqueue(&svsk->sk_xprt);
+ xprt->xpt_pool = NULL;
+ clear_bit(XPT_BUSY, &xprt->xpt_flags);
+ svc_xprt_enqueue(xprt);
}
-
+EXPORT_SYMBOL_GPL(svc_xprt_received);

/**
* svc_reserve - change the space reserved for the reply to a request.
@@ -779,7 +779,7 @@ svc_udp_recvfrom(struct svc_rqst *rqstp)
(serv->sv_nrthreads+3) * serv->sv_max_mesg);

if ((rqstp->rq_deferred = svc_deferred_dequeue(svsk))) {
- svc_sock_received(svsk);
+ svc_xprt_received(&svsk->sk_xprt);
return svc_deferred_recv(rqstp);
}

@@ -796,7 +796,7 @@ svc_udp_recvfrom(struct svc_rqst *rqstp)
dprintk("svc: recvfrom returned error %d\n", -err);
set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);
}
- svc_sock_received(svsk);
+ svc_xprt_received(&svsk->sk_xprt);
return -EAGAIN;
}
rqstp->rq_addrlen = sizeof(rqstp->rq_addr);
@@ -811,7 +811,7 @@ svc_udp_recvfrom(struct svc_rqst *rqstp)
/*
* Maybe more packets - kick another thread ASAP.
*/
- svc_sock_received(svsk);
+ svc_xprt_received(&svsk->sk_xprt);

len = skb->len - sizeof(struct udphdr);
rqstp->rq_arg.len = len;
@@ -1123,8 +1123,6 @@ svc_tcp_accept(struct svc_xprt *xprt)
}
memcpy(&newsvsk->sk_local, sin, slen);

- svc_sock_received(newsvsk);
-
if (serv->sv_stats)
serv->sv_stats->nettcpconn++;

@@ -1153,7 +1151,7 @@ svc_tcp_recvfrom(struct svc_rqst *rqstp)
test_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags));

if ((rqstp->rq_deferred = svc_deferred_dequeue(svsk))) {
- svc_sock_received(svsk);
+ svc_xprt_received(&svsk->sk_xprt);
return svc_deferred_recv(rqstp);
}

@@ -1193,7 +1191,7 @@ svc_tcp_recvfrom(struct svc_rqst *rqstp)
if (len < want) {
dprintk("svc: short recvfrom while reading record length (%d of %lu)\n",
len, want);
- svc_sock_received(svsk);
+ svc_xprt_received(&svsk->sk_xprt);
return -EAGAIN; /* record header not complete */
}

@@ -1229,7 +1227,7 @@ svc_tcp_recvfrom(struct svc_rqst *rqstp)
if (len < svsk->sk_reclen) {
dprintk("svc: incomplete TCP record (%d of %d)\n",
len, svsk->sk_reclen);
- svc_sock_received(svsk);
+ svc_xprt_received(&svsk->sk_xprt);
return -EAGAIN; /* record not complete */
}
len = svsk->sk_reclen;
@@ -1269,7 +1267,7 @@ svc_tcp_recvfrom(struct svc_rqst *rqstp)
svsk->sk_reclen = 0;
svsk->sk_tcplen = 0;

- svc_sock_received(svsk);
+ svc_xprt_received(&svsk->sk_xprt);
if (serv->sv_stats)
serv->sv_stats->nettcpcnt++;

@@ -1282,7 +1280,7 @@ svc_tcp_recvfrom(struct svc_rqst *rqstp)
error:
if (len == -EAGAIN) {
dprintk("RPC: TCP recvfrom got EAGAIN\n");
- svc_sock_received(svsk);
+ svc_xprt_received(&svsk->sk_xprt);
} else {
printk(KERN_NOTICE "%s: recvfrom returned errno %d\n",
svsk->sk_xprt.xpt_server->sv_name, -len);
@@ -1601,6 +1599,7 @@ svc_recv(struct svc_rqst *rqstp, long ti
struct svc_xprt *newxpt;
newxpt = svsk->sk_xprt.xpt_ops.xpo_accept(&svsk->sk_xprt);
if (newxpt) {
+ svc_xprt_received(newxpt);
/*
* We know this module_get will succeed because the
* listener holds a reference too
@@ -1608,7 +1607,7 @@ svc_recv(struct svc_rqst *rqstp, long ti
__module_get(newxpt->xpt_class->xcl_owner);
svc_check_conn_limits(svsk->sk_xprt.xpt_server);
}
- svc_sock_received(svsk);
+ svc_xprt_received(&svsk->sk_xprt);
} else {
dprintk("svc: server %p, pool %u, socket %p, inuse=%d\n",
rqstp, pool->sp_id, svsk,
@@ -1832,7 +1831,7 @@ int svc_addsock(struct svc_serv *serv,
else {
svsk = svc_setup_socket(serv, so, &err, SVC_SOCK_DEFAULTS);
if (svsk) {
- svc_sock_received(svsk);
+ svc_xprt_received(&svsk->sk_xprt);
err = 0;
}
}
@@ -1889,7 +1888,7 @@ svc_create_socket(struct svc_serv *serv,
if ((svsk = svc_setup_socket(serv, sock, &error, flags)) != NULL) {
if (protocol == IPPROTO_TCP)
set_bit(XPT_LISTENER, &svsk->sk_xprt.xpt_flags);
- svc_sock_received(svsk);
+ svc_xprt_received(&svsk->sk_xprt);
return (struct svc_xprt *)svsk;
}



2007-09-27 05:02:21

by Tom Tucker

Subject: [RFC,PATCH 22/33] svc: Move sk_lastrecv to svc_xprt


This functionally trivial change moves the transport independent sk_lastrecv
field to the svc_xprt structure.

Signed-off-by: Tom Tucker <[email protected]>
---

include/linux/sunrpc/svc_xprt.h | 1 +
include/linux/sunrpc/svcsock.h | 1 -
net/sunrpc/svcsock.c | 6 +++---
3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index 5b2aef4..edb7ad2 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -56,6 +56,7 @@ #define XPT_LISTENER 11 /* listening e
struct svc_serv * xpt_server; /* service for this transport */
atomic_t xpt_reserved; /* space on outq that is reserved */
struct mutex xpt_mutex; /* to serialize sending data */
+ time_t xpt_lastrecv; /* time of last received request */
};

int svc_reg_xprt_class(struct svc_xprt_class *);
diff --git a/include/linux/sunrpc/svcsock.h b/include/linux/sunrpc/svcsock.h
index 41c2dfa..406d003 100644
--- a/include/linux/sunrpc/svcsock.h
+++ b/include/linux/sunrpc/svcsock.h
@@ -33,7 +33,6 @@ struct svc_sock {
/* private TCP part */
int sk_reclen; /* length of record */
int sk_tcplen; /* current read length */
- time_t sk_lastrecv; /* time of last received request */

/* cache of various info for TCP sockets */
void *sk_info_authunix;
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 71b7f86..04155aa 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -1622,7 +1622,7 @@ svc_recv(struct svc_rqst *rqstp, long ti
svc_sock_release(rqstp);
return -EAGAIN;
}
- svsk->sk_lastrecv = get_seconds();
+ svsk->sk_xprt.xpt_lastrecv = get_seconds();
clear_bit(XPT_OLD, &svsk->sk_xprt.xpt_flags);

rqstp->rq_secure = svc_port_is_privileged(svc_addr(rqstp));
@@ -1725,7 +1725,7 @@ svc_age_temp_sockets(unsigned long closu
svsk = list_entry(le, struct svc_sock, sk_xprt.xpt_list);

dprintk("queuing svsk %p for closing, %lu seconds old\n",
- svsk, get_seconds() - svsk->sk_lastrecv);
+ svsk, get_seconds() - svsk->sk_xprt.xpt_lastrecv);

/* a thread will dequeue and close it soon */
svc_xprt_enqueue(&svsk->sk_xprt);
@@ -1773,7 +1773,7 @@ static struct svc_sock *svc_setup_socket
svsk->sk_ostate = inet->sk_state_change;
svsk->sk_odata = inet->sk_data_ready;
svsk->sk_owspace = inet->sk_write_space;
- svsk->sk_lastrecv = get_seconds();
+ svsk->sk_xprt.xpt_lastrecv = get_seconds();
spin_lock_init(&svsk->sk_lock);
INIT_LIST_HEAD(&svsk->sk_deferred);



2007-09-27 05:02:24

by Tom Tucker

Subject: [RFC,PATCH 23/33] svc: Move the authinfo cache to svc_xprt.


Move the authinfo cache to svc_xprt. This allows both the TCP and RDMA
transports to share this logic. A new flag bit, XPT_CACHE_AUTH, determines
whether auth information is cached. Previously, this code checked whether
the socket type was SOCK_STREAM.

I've also changed the spin_lock/unlock logic so that a lock is not taken for
transports that are not caching auth info.
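
A userspace sketch of the resulting get/put logic follows. The names entry_sketch, cached_get and cached_put are hypothetical, a pthread mutex stands in for the spinlock and an atomic counter for the cache entry's reference count. The cache_valid() invalidation branch of the real code is left out to keep the sketch short.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical refcounted cache entry, standing in for struct ip_map. */
struct entry_sketch {
	atomic_int refs;
};

struct xprt_sketch {
	bool cache_auth;		/* stands in for XPT_CACHE_AUTH */
	pthread_mutex_t lock;		/* stands in for xpt_lock */
	struct entry_sketch *auth_cache;
};

/* Return the cached entry with an extra reference, or NULL. */
struct entry_sketch *cached_get(struct xprt_sketch *x)
{
	struct entry_sketch *e = NULL;

	if (x->cache_auth) {	/* no lock taken for non-caching transports */
		pthread_mutex_lock(&x->lock);
		e = x->auth_cache;
		if (e != NULL)
			atomic_fetch_add(&e->refs, 1);
		pthread_mutex_unlock(&x->lock);
	}
	return e;
}

/* Hand back an entry: cache it if the slot is empty, else drop the reference. */
void cached_put(struct xprt_sketch *x, struct entry_sketch *e)
{
	if (x->cache_auth) {
		pthread_mutex_lock(&x->lock);
		if (x->auth_cache == NULL) {
			x->auth_cache = e;	/* newly cached, keep the reference */
			e = NULL;
		}
		pthread_mutex_unlock(&x->lock);
	}
	if (e != NULL)
		atomic_fetch_sub(&e->refs, 1);
}
```

With the flag clear, both helpers skip the lock entirely and the put side simply drops the caller's reference, which is the behaviour described above for transports that do not cache auth info.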

Signed-off-by: Tom Tucker <[email protected]>
---

include/linux/sunrpc/svc_xprt.h | 5 +++-
include/linux/sunrpc/svcsock.h | 5 ----
net/sunrpc/svc_xprt.c | 4 +++
net/sunrpc/svcauth_unix.c | 54 +++++++++++++++++++++------------------
net/sunrpc/svcsock.c | 23 ++++++++---------
5 files changed, 48 insertions(+), 43 deletions(-)

diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index edb7ad2..c763dce 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -51,12 +51,15 @@ #define XPT_DEFERRED 8 /* deferred req
#define XPT_OLD 9 /* used for transport aging mark+sweep */
#define XPT_DETACHED 10 /* detached from tempsocks list */
#define XPT_LISTENER 11 /* listening endpoint */
-
+#define XPT_CACHE_AUTH 12 /* cache auth info */
struct svc_pool * xpt_pool; /* current pool iff queued */
struct svc_serv * xpt_server; /* service for this transport */
atomic_t xpt_reserved; /* space on outq that is reserved */
struct mutex xpt_mutex; /* to serialize sending data */
time_t xpt_lastrecv; /* time of last received request */
+ spinlock_t xpt_lock; /* protects sk_deferred
+ * and xpt_auth_cache */
+ void * xpt_auth_cache; /* auth cache */
};

int svc_reg_xprt_class(struct svc_xprt_class *);
diff --git a/include/linux/sunrpc/svcsock.h b/include/linux/sunrpc/svcsock.h
index 406d003..f2ed6a2 100644
--- a/include/linux/sunrpc/svcsock.h
+++ b/include/linux/sunrpc/svcsock.h
@@ -20,8 +20,6 @@ struct svc_sock {
struct socket * sk_sock; /* berkeley socket layer */
struct sock * sk_sk; /* INET layer */

- spinlock_t sk_lock; /* protects sk_deferred and
- * sk_info_authunix */
struct list_head sk_deferred; /* deferred requests that need to
* be revisted */

@@ -34,9 +32,6 @@ struct svc_sock {
int sk_reclen; /* length of record */
int sk_tcplen; /* current read length */

- /* cache of various info for TCP sockets */
- void *sk_info_authunix;
-
struct sockaddr_storage sk_local; /* local address */
struct sockaddr_storage sk_remote; /* remote peer's address */
int sk_remotelen; /* length of address */
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index d80fc5f..06bf4e8 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -89,6 +89,9 @@ static inline void svc_xprt_free(struct
struct module *owner = xprt->xpt_class->xcl_owner;
BUG_ON(atomic_read(&kref->refcount));
xprt->xpt_ops.xpo_free(xprt);
+ if (test_bit(XPT_CACHE_AUTH, &xprt->xpt_flags)
+ && xprt->xpt_auth_cache != NULL)
+ svcauth_unix_info_release(xprt->xpt_auth_cache);
module_put(owner);
}

@@ -112,6 +115,7 @@ void svc_xprt_init(struct svc_xprt_class
INIT_LIST_HEAD(&xpt->xpt_list);
INIT_LIST_HEAD(&xpt->xpt_ready);
mutex_init(&xpt->xpt_mutex);
+ spin_lock_init(&xpt->xpt_lock);
}
EXPORT_SYMBOL_GPL(svc_xprt_init);

diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
index 4114794..6815157 100644
--- a/net/sunrpc/svcauth_unix.c
+++ b/net/sunrpc/svcauth_unix.c
@@ -384,41 +384,45 @@ void svcauth_unix_purge(void)
static inline struct ip_map *
ip_map_cached_get(struct svc_rqst *rqstp)
{
- struct ip_map *ipm;
- struct svc_sock *svsk = rqstp->rq_sock;
- spin_lock(&svsk->sk_lock);
- ipm = svsk->sk_info_authunix;
- if (ipm != NULL) {
- if (!cache_valid(&ipm->h)) {
- /*
- * The entry has been invalidated since it was
- * remembered, e.g. by a second mount from the
- * same IP address.
- */
- svsk->sk_info_authunix = NULL;
- spin_unlock(&svsk->sk_lock);
- cache_put(&ipm->h, &ip_map_cache);
- return NULL;
+ struct ip_map *ipm = NULL;
+ struct svc_xprt *xprt = rqstp->rq_xprt;
+
+ if (test_bit(XPT_CACHE_AUTH, &xprt->xpt_flags)) {
+ spin_lock(&xprt->xpt_lock);
+ ipm = xprt->xpt_auth_cache;
+ if (ipm != NULL) {
+ if (!cache_valid(&ipm->h)) {
+ /*
+ * The entry has been invalidated since it was
+ * remembered, e.g. by a second mount from the
+ * same IP address.
+ */
+ xprt->xpt_auth_cache = NULL;
+ spin_unlock(&xprt->xpt_lock);
+ cache_put(&ipm->h, &ip_map_cache);
+ return NULL;
+ }
+ cache_get(&ipm->h);
}
- cache_get(&ipm->h);
+ spin_unlock(&xprt->xpt_lock);
}
- spin_unlock(&svsk->sk_lock);
return ipm;
}

static inline void
ip_map_cached_put(struct svc_rqst *rqstp, struct ip_map *ipm)
{
- struct svc_sock *svsk = rqstp->rq_sock;
+ struct svc_xprt *xprt = rqstp->rq_xprt;

- spin_lock(&svsk->sk_lock);
- if (svsk->sk_sock->type == SOCK_STREAM &&
- svsk->sk_info_authunix == NULL) {
- /* newly cached, keep the reference */
- svsk->sk_info_authunix = ipm;
- ipm = NULL;
+ if (test_bit(XPT_CACHE_AUTH, &xprt->xpt_flags)) {
+ spin_lock(&xprt->xpt_lock);
+ if (xprt->xpt_auth_cache == NULL) {
+ /* newly cached, keep the reference */
+ xprt->xpt_auth_cache = ipm;
+ ipm = NULL;
+ }
+ spin_unlock(&xprt->xpt_lock);
}
- spin_unlock(&svsk->sk_lock);
if (ipm)
cache_put(&ipm->h, &ip_map_cache);
}
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 04155aa..1dead5d 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -53,8 +53,8 @@ #include <linux/sunrpc/stats.h>
* svc_serv->sv_lock protects sv_tempsocks, sv_permsocks, sv_tmpcnt.
* when both need to be taken (rare), svc_serv->sv_lock is first.
* BKL protects svc_serv->sv_nrthread.
- * svc_sock->sk_lock protects the svc_sock->sk_deferred list
- * and the ->sk_info_authunix cache.
+ * svc_sock->sk_xprt.xpt_lock protects the svc_sock->sk_deferred list
+ * and the ->sk_xprt.xpt_auth_cache cache.
* svc_sock->sk_xprt.xpt_flags.XPT_BUSY prevents a svc_sock being enqueued multiply.
*
* Some flags can be set to certain values at any time
@@ -107,16 +107,16 @@ static struct lock_class_key svc_slock_k
static inline void svc_reclassify_socket(struct socket *sock)
{
struct sock *sk = sock->sk;
BUG_ON(sk->sk_lock.owner != NULL);
switch (sk->sk_family) {
case AF_INET:
sock_lock_init_class_and_name(sk, "slock-AF_INET-NFSD",
&svc_slock_key[0], "sk_lock-AF_INET-NFSD", &svc_key[0]);
break;

case AF_INET6:
sock_lock_init_class_and_name(sk, "slock-AF_INET6-NFSD",
&svc_slock_key[1], "sk_lock-AF_INET6-NFSD", &svc_key[1]);
break;
break;

default:
@@ -1774,16 +1774,17 @@ static struct svc_sock *svc_setup_socket
svsk->sk_odata = inet->sk_data_ready;
svsk->sk_owspace = inet->sk_write_space;
svsk->sk_xprt.xpt_lastrecv = get_seconds();
- spin_lock_init(&svsk->sk_lock);
INIT_LIST_HEAD(&svsk->sk_deferred);

/* Initialize the socket */
if (sock->type == SOCK_DGRAM) {
svc_xprt_init(&svc_udp_class, &svsk->sk_xprt, serv);
svc_udp_init(svsk);
+ clear_bit(XPT_CACHE_AUTH, &svsk->sk_xprt.xpt_flags);
} else {
svc_xprt_init(&svc_tcp_class, &svsk->sk_xprt, serv);
svc_tcp_init(svsk);
+ set_bit(XPT_CACHE_AUTH, &svsk->sk_xprt.xpt_flags);
}

spin_lock_bh(&serv->sv_lock);
@@ -1925,8 +1926,6 @@ svc_sock_free(struct svc_xprt *xprt)
struct svc_sock *svsk = (struct svc_sock *)xprt;
dprintk("svc: svc_sock_free(%p)\n", svsk);

- if (svsk->sk_info_authunix != NULL)
- svcauth_unix_info_release(svsk->sk_info_authunix);
if (svsk->sk_sock->file)
sockfd_put(svsk->sk_sock);
else
@@ -2017,9 +2016,9 @@ static void svc_revisit(struct cache_def
dprintk("revisit queued\n");
svsk = dr->svsk;
dr->svsk = NULL;
- spin_lock(&svsk->sk_lock);
+ spin_lock(&svsk->sk_xprt.xpt_lock);
list_add(&dr->handle.recent, &svsk->sk_deferred);
- spin_unlock(&svsk->sk_lock);
+ spin_unlock(&svsk->sk_xprt.xpt_lock);
set_bit(XPT_DEFERRED, &svsk->sk_xprt.xpt_flags);
svc_xprt_enqueue(&svsk->sk_xprt);
svc_xprt_put(&svsk->sk_xprt);
@@ -2085,7 +2084,7 @@ static struct svc_deferred_req *svc_defe

if (!test_bit(XPT_DEFERRED, &svsk->sk_xprt.xpt_flags))
return NULL;
- spin_lock(&svsk->sk_lock);
+ spin_lock(&svsk->sk_xprt.xpt_lock);
clear_bit(XPT_DEFERRED, &svsk->sk_xprt.xpt_flags);
if (!list_empty(&svsk->sk_deferred)) {
dr = list_entry(svsk->sk_deferred.next,
@@ -2094,7 +2093,7 @@ static struct svc_deferred_req *svc_defe
list_del_init(&dr->handle.recent);
set_bit(XPT_DEFERRED, &svsk->sk_xprt.xpt_flags);
}
- spin_unlock(&svsk->sk_lock);
+ spin_unlock(&svsk->sk_xprt.xpt_lock);
return dr;
}



2007-09-27 05:02:26

by Tom Tucker

Subject: [RFC, PATCH 24/33] svc: Make deferral processing xprt independent


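This functionally trivial patch moves the transport independent sk_deferred
list to the svc_xprt structure (as xpt_deferred) and updates the
svc_deferred_req structure to hold a pointer to the svc_xprt directly.
It also moves the svc_deferred_dequeue check out of the UDP and TCP
recvfrom functions and into svc_recv.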

Signed-off-by: Tom Tucker <[email protected]>
---
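
For reviewers who want to see the resulting control flow in isolation, here
is a minimal userspace sketch, not kernel code. It assumes a pthread mutex
in place of xpt_lock, a C11 atomic flag in place of the XPT_DEFERRED bit and
a hand-rolled singly linked list in place of list_head. All names
(struct xprt, xprt_defer, xprt_recv, sample_recvfrom) are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct deferred_req {
	struct deferred_req *next;
	int len;			/* length of the saved request */
};

struct xprt {
	pthread_mutex_t lock;		/* stands in for xpt_lock */
	atomic_bool deferred;		/* stands in for the XPT_DEFERRED bit */
	struct deferred_req *head;	/* stands in for xpt_deferred */
	int (*recvfrom)(struct xprt *);	/* stands in for xpo_recvfrom */
};

/* Sample provider callback: pretends a 100-byte request arrived. */
static int sample_recvfrom(struct xprt *x)
{
	(void)x;
	return 100;
}

static void xprt_init(struct xprt *x, int (*recvfrom)(struct xprt *))
{
	pthread_mutex_init(&x->lock, NULL);
	atomic_init(&x->deferred, false);
	x->head = NULL;
	x->recvfrom = recvfrom;
}

/* Queue a request for a later revisit, in the manner of svc_revisit. */
static void xprt_defer(struct xprt *x, struct deferred_req *dr)
{
	pthread_mutex_lock(&x->lock);
	dr->next = x->head;		/* insert at the head of the list */
	x->head = dr;
	pthread_mutex_unlock(&x->lock);
	atomic_store(&x->deferred, true);
}

/* Take one deferred request, or return NULL if none is pending. */
static struct deferred_req *xprt_deferred_dequeue(struct xprt *x)
{
	struct deferred_req *dr = NULL;

	if (!atomic_load(&x->deferred))
		return NULL;
	pthread_mutex_lock(&x->lock);
	atomic_store(&x->deferred, false);
	if (x->head != NULL) {
		dr = x->head;
		x->head = dr->next;
		dr->next = NULL;
		/* re-arm the flag so the next caller looks again */
		atomic_store(&x->deferred, true);
	}
	pthread_mutex_unlock(&x->lock);
	return dr;
}

/* Generic receive: replay deferred work first, else ask the provider. */
static int xprt_recv(struct xprt *x)
{
	struct deferred_req *dr = xprt_deferred_dequeue(x);

	if (dr != NULL)
		return dr->len;
	return x->recvfrom(x);
}
```

The point of the sketch is that the provider callback never needs to know
about deferral: the generic receive path checks the queue once for every
transport type.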

include/linux/sunrpc/svc.h | 2 +
include/linux/sunrpc/svc_xprt.h | 2 +
include/linux/sunrpc/svcsock.h | 3 --
net/sunrpc/svc_xprt.c | 1 +
net/sunrpc/svcsock.c | 60 +++++++++++++++++----------------------
5 files changed, 30 insertions(+), 38 deletions(-)

diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index cfb2652..40adc9d 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -320,7 +320,7 @@ static inline void svc_free_res_pages(st

struct svc_deferred_req {
u32 prot; /* protocol (UDP or TCP) */
- struct svc_sock *svsk;
+ struct svc_xprt *xprt;
struct sockaddr_storage addr; /* where reply must go */
size_t addrlen;
union svc_addr_u daddr; /* where reply must come from */
diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index c763dce..ef5026d 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -60,6 +60,8 @@ #define XPT_CACHE_AUTH 12 /* cache aut
spinlock_t xpt_lock; /* protects sk_deferred
* and xpt_auth_cache */
void * xpt_auth_cache; /* auth cache */
+ struct list_head xpt_deferred; /* deferred requests that need to
+ * be revisted */
};

int svc_reg_xprt_class(struct svc_xprt_class *);
diff --git a/include/linux/sunrpc/svcsock.h b/include/linux/sunrpc/svcsock.h
index f2ed6a2..96a229e 100644
--- a/include/linux/sunrpc/svcsock.h
+++ b/include/linux/sunrpc/svcsock.h
@@ -20,9 +20,6 @@ struct svc_sock {
struct socket * sk_sock; /* berkeley socket layer */
struct sock * sk_sk; /* INET layer */

- struct list_head sk_deferred; /* deferred requests that need to
- * be revisted */
-
/* We keep the old state_change and data_ready CB's here */
void (*sk_ostate)(struct sock *);
void (*sk_odata)(struct sock *, int bytes);
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 06bf4e8..14cd288 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -114,6 +114,7 @@ void svc_xprt_init(struct svc_xprt_class
xpt->xpt_server = serv;
INIT_LIST_HEAD(&xpt->xpt_list);
INIT_LIST_HEAD(&xpt->xpt_ready);
+ INIT_LIST_HEAD(&xpt->xpt_deferred);
mutex_init(&xpt->xpt_mutex);
spin_lock_init(&xpt->xpt_lock);
}
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 1dead5d..9e3071c 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -53,7 +53,7 @@ #include <linux/sunrpc/stats.h>
* svc_serv->sv_lock protects sv_tempsocks, sv_permsocks, sv_tmpcnt.
* when both need to be taken (rare), svc_serv->sv_lock is first.
* BKL protects svc_serv->sv_nrthread.
- * svc_sock->sk_xprt.xpt_lock protects the svc_sock->sk_deferred list
+ * svc_sock->sk_xprt.xpt_lock protects the svc_sock->sk_xprt.xpt_deferred list
* and the ->sk_xprt.xpt_auth_cache cache.
* svc_sock->sk_xprt.xpt_flags.XPT_BUSY prevents a svc_sock being enqueued multiply.
*
@@ -87,7 +87,7 @@ static void svc_close_xprt(struct svc_x
static void svc_sock_detach(struct svc_xprt *);
static void svc_sock_free(struct svc_xprt *);

-static struct svc_deferred_req *svc_deferred_dequeue(struct svc_sock *svsk);
+static struct svc_deferred_req *svc_deferred_dequeue(struct svc_xprt *xprt);
static int svc_deferred_recv(struct svc_rqst *rqstp);
static struct cache_deferred_req *svc_defer(struct cache_req *req);
static struct svc_xprt *
@@ -778,11 +778,6 @@ svc_udp_recvfrom(struct svc_rqst *rqstp)
(serv->sv_nrthreads+3) * serv->sv_max_mesg,
(serv->sv_nrthreads+3) * serv->sv_max_mesg);

- if ((rqstp->rq_deferred = svc_deferred_dequeue(svsk))) {
- svc_xprt_received(&svsk->sk_xprt);
- return svc_deferred_recv(rqstp);
- }
-
clear_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);
skb = NULL;
err = kernel_recvmsg(svsk->sk_sock, &msg, NULL,
@@ -1150,11 +1145,6 @@ svc_tcp_recvfrom(struct svc_rqst *rqstp)
test_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags),
test_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags));

- if ((rqstp->rq_deferred = svc_deferred_dequeue(svsk))) {
- svc_xprt_received(&svsk->sk_xprt);
- return svc_deferred_recv(rqstp);
- }
-
if (test_and_clear_bit(XPT_CHNGBUF, &svsk->sk_xprt.xpt_flags))
/* sndbuf needs to have room for one request
* per thread, otherwise we can stall even when the
@@ -1612,7 +1602,12 @@ svc_recv(struct svc_rqst *rqstp, long ti
dprintk("svc: server %p, pool %u, socket %p, inuse=%d\n",
rqstp, pool->sp_id, svsk,
atomic_read(&svsk->sk_xprt.xpt_ref.refcount));
- len = svsk->sk_xprt.xpt_ops.xpo_recvfrom(rqstp);
+
+ if ((rqstp->rq_deferred = svc_deferred_dequeue(&svsk->sk_xprt))) {
+ svc_xprt_received(&svsk->sk_xprt);
+ len = svc_deferred_recv(rqstp);
+ } else
+ len = svsk->sk_xprt.xpt_ops.xpo_recvfrom(rqstp);
dprintk("svc: got len=%d\n", len);
}

@@ -1774,7 +1769,6 @@ static struct svc_sock *svc_setup_socket
svsk->sk_odata = inet->sk_data_ready;
svsk->sk_owspace = inet->sk_write_space;
svsk->sk_xprt.xpt_lastrecv = get_seconds();
- INIT_LIST_HEAD(&svsk->sk_deferred);

/* Initialize the socket */
if (sock->type == SOCK_DGRAM) {
@@ -2006,22 +2000,21 @@ void svc_close_all(struct list_head *xpr
static void svc_revisit(struct cache_deferred_req *dreq, int too_many)
{
struct svc_deferred_req *dr = container_of(dreq, struct svc_deferred_req, handle);
- struct svc_sock *svsk;
+ struct svc_xprt *xprt = dr->xprt;

if (too_many) {
- svc_xprt_put(&dr->svsk->sk_xprt);
+ svc_xprt_put(xprt);
kfree(dr);
return;
}
dprintk("revisit queued\n");
- svsk = dr->svsk;
- dr->svsk = NULL;
- spin_lock(&svsk->sk_xprt.xpt_lock);
- list_add(&dr->handle.recent, &svsk->sk_deferred);
- spin_unlock(&svsk->sk_xprt.xpt_lock);
- set_bit(XPT_DEFERRED, &svsk->sk_xprt.xpt_flags);
- svc_xprt_enqueue(&svsk->sk_xprt);
- svc_xprt_put(&svsk->sk_xprt);
+ dr->xprt = NULL;
+ spin_lock(&xprt->xpt_lock);
+ list_add(&dr->handle.recent, &xprt->xpt_deferred);
+ spin_unlock(&xprt->xpt_lock);
+ set_bit(XPT_DEFERRED, &xprt->xpt_flags);
+ svc_xprt_enqueue(xprt);
+ svc_xprt_put(xprt);
}

static struct cache_deferred_req *
@@ -2052,7 +2045,7 @@ svc_defer(struct cache_req *req)
memcpy(dr->args, rqstp->rq_arg.head[0].iov_base-skip, dr->argslen<<2);
}
svc_xprt_get(rqstp->rq_xprt);
- dr->svsk = rqstp->rq_sock;
+ dr->xprt = rqstp->rq_xprt;

dr->handle.revisit = svc_revisit;
return &dr->handle;
@@ -2078,22 +2071,21 @@ static int svc_deferred_recv(struct svc_
}


-static struct svc_deferred_req *svc_deferred_dequeue(struct svc_sock *svsk)
+static struct svc_deferred_req *svc_deferred_dequeue(struct svc_xprt *xprt)
{
struct svc_deferred_req *dr = NULL;

- if (!test_bit(XPT_DEFERRED, &svsk->sk_xprt.xpt_flags))
+ if (!test_bit(XPT_DEFERRED, &xprt->xpt_flags))
return NULL;
- spin_lock(&svsk->sk_xprt.xpt_lock);
- clear_bit(XPT_DEFERRED, &svsk->sk_xprt.xpt_flags);
- if (!list_empty(&svsk->sk_deferred)) {
- dr = list_entry(svsk->sk_deferred.next,
+ spin_lock(&xprt->xpt_lock);
+ clear_bit(XPT_DEFERRED, &xprt->xpt_flags);
+ if (!list_empty(&xprt->xpt_deferred)) {
+ dr = list_entry(xprt->xpt_deferred.next,
struct svc_deferred_req,
handle.recent);
list_del_init(&dr->handle.recent);
- set_bit(XPT_DEFERRED, &svsk->sk_xprt.xpt_flags);
+ set_bit(XPT_DEFERRED, &xprt->xpt_flags);
}
- spin_unlock(&svsk->sk_xprt.xpt_lock);
+ spin_unlock(&xprt->xpt_lock);
return dr;
}
-


2007-09-27 05:02:26

by Tom Tucker

Subject: [RFC,PATCH 27/33] svc: Make svc_recv transport neutral


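All of the transport fields and functions used by svc_recv are now
transport independent. Change the svc_recv function to use the svc_xprt
structure directly instead of the transport specific svc_sock structure.
Also rename svc_sock_dequeue to svc_xprt_dequeue and move the
xpt_lastrecv initialization into svc_xprt_init.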

Signed-off-by: Tom Tucker <[email protected]>
---
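
The idea of queueing and dequeueing the generic structure, and recovering
the provider structure only where it is needed, can be sketched in userspace
C as follows. This is an illustration under assumptions, not the kernel
implementation: struct xprt, struct sock_xprt, struct pool and the helper
names are hypothetical, and the container_of macro is a simplified local
version without the kernel's type checking.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified local container_of; the kernel macro adds type checking. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Generic transport: the only type the core receive path sees. */
struct xprt {
	struct xprt *next_ready;	/* stands in for xpt_ready */
	long lastrecv;			/* stands in for xpt_lastrecv */
};

/* Hypothetical socket provider that embeds the generic transport. */
struct sock_xprt {
	int fd;
	struct xprt base;
};

/* Per-pool queue of transports with work pending. */
struct pool {
	struct xprt *ready;
};

static void pool_enqueue(struct pool *p, struct xprt *x)
{
	x->next_ready = p->ready;
	p->ready = x;
}

/* Dequeue the first ready transport; no provider type is involved. */
static struct xprt *pool_dequeue(struct pool *p)
{
	struct xprt *x = p->ready;

	if (x == NULL)
		return NULL;
	p->ready = x->next_ready;
	x->next_ready = NULL;
	return x;
}

/* Only provider code maps the generic pointer back to its own type. */
static int sock_xprt_fd(struct xprt *x)
{
	return container_of(x, struct sock_xprt, base)->fd;
}
```

With this layout a second provider could embed struct xprt in its own
structure and share pool_enqueue and pool_dequeue unchanged.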

net/sunrpc/svc_xprt.c | 1 +
net/sunrpc/svcsock.c | 65 ++++++++++++++++++++++++-------------------------
2 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 14cd288..8d87b6a 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -117,6 +117,7 @@ void svc_xprt_init(struct svc_xprt_class
INIT_LIST_HEAD(&xpt->xpt_deferred);
mutex_init(&xpt->xpt_mutex);
spin_lock_init(&xpt->xpt_lock);
+ xpt->xpt_lastrecv = get_seconds();
}
EXPORT_SYMBOL_GPL(svc_xprt_init);

diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 89f345d..f5da434 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -320,22 +320,22 @@ EXPORT_SYMBOL_GPL(svc_xprt_enqueue);
/*
* Dequeue the first socket. Must be called with the pool->sp_lock held.
*/
-static inline struct svc_sock *
-svc_sock_dequeue(struct svc_pool *pool)
+static inline struct svc_xprt *
+svc_xprt_dequeue(struct svc_pool *pool)
{
- struct svc_sock *svsk;
+ struct svc_xprt *xprt;

if (list_empty(&pool->sp_sockets))
return NULL;

- svsk = list_entry(pool->sp_sockets.next,
- struct svc_sock, sk_xprt.xpt_ready);
- list_del_init(&svsk->sk_xprt.xpt_ready);
+ xprt = list_entry(pool->sp_sockets.next,
+ struct svc_xprt, xpt_ready);
+ list_del_init(&xprt->xpt_ready);

- dprintk("svc: socket %p dequeued, inuse=%d\n",
- svsk->sk_sk, atomic_read(&svsk->sk_xprt.xpt_ref.refcount));
+ dprintk("svc: transport %p dequeued, inuse=%d\n",
+ xprt, atomic_read(&xprt->xpt_ref.refcount));

- return svsk;
+ return xprt;
}

/*
@@ -1500,7 +1500,7 @@ static void inline svc_copy_addr(struct
int
svc_recv(struct svc_rqst *rqstp, long timeout)
{
- struct svc_sock *svsk = NULL;
+ struct svc_xprt *xprt = NULL;
struct svc_serv *serv = rqstp->rq_server;
struct svc_pool *pool = rqstp->rq_pool;
int len, i;
@@ -1511,9 +1511,9 @@ svc_recv(struct svc_rqst *rqstp, long ti
dprintk("svc: server %p waiting for data (to = %ld)\n",
rqstp, timeout);

- if (rqstp->rq_sock)
+ if (rqstp->rq_xprt)
printk(KERN_ERR
- "svc_recv: service %p, socket not NULL!\n",
+ "svc_recv: service %p, transport not NULL!\n",
rqstp);
if (waitqueue_active(&rqstp->rq_wait))
printk(KERN_ERR
@@ -1550,11 +1550,11 @@ svc_recv(struct svc_rqst *rqstp, long ti
return -EINTR;

spin_lock_bh(&pool->sp_lock);
- if ((svsk = svc_sock_dequeue(pool)) != NULL) {
- rqstp->rq_sock = svsk;
- svc_xprt_get(&svsk->sk_xprt);
+ if ((xprt = svc_xprt_dequeue(pool)) != NULL) {
+ rqstp->rq_xprt = xprt;
+ svc_xprt_get(xprt);
rqstp->rq_reserved = serv->sv_max_mesg;
- atomic_add(rqstp->rq_reserved, &svsk->sk_xprt.xpt_reserved);
+ atomic_add(rqstp->rq_reserved, &xprt->xpt_reserved);
} else {
/* No data pending. Go to sleep */
svc_thread_enqueue(pool, rqstp);
@@ -1574,7 +1574,7 @@ svc_recv(struct svc_rqst *rqstp, long ti
spin_lock_bh(&pool->sp_lock);
remove_wait_queue(&rqstp->rq_wait, &wait);

- if (!(svsk = rqstp->rq_sock)) {
+ if (!(xprt = rqstp->rq_xprt)) {
svc_thread_dequeue(pool, rqstp);
spin_unlock_bh(&pool->sp_lock);
dprintk("svc: server %p, no data yet\n", rqstp);
@@ -1584,12 +1584,12 @@ svc_recv(struct svc_rqst *rqstp, long ti
spin_unlock_bh(&pool->sp_lock);

len = 0;
- if (test_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags)) {
+ if (test_bit(XPT_CLOSE, &xprt->xpt_flags)) {
dprintk("svc_recv: found XPT_CLOSE\n");
- svc_delete_xprt(&svsk->sk_xprt);
- } else if (test_bit(XPT_LISTENER, &svsk->sk_xprt.xpt_flags)) {
+ svc_delete_xprt(xprt);
+ } else if (test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
struct svc_xprt *newxpt;
- newxpt = svsk->sk_xprt.xpt_ops.xpo_accept(&svsk->sk_xprt);
+ newxpt = xprt->xpt_ops.xpo_accept(xprt);
if (newxpt) {
svc_xprt_received(newxpt);
/*
@@ -1597,20 +1597,20 @@ svc_recv(struct svc_rqst *rqstp, long ti
* listener holds a reference too
*/
__module_get(newxpt->xpt_class->xcl_owner);
- svc_check_conn_limits(svsk->sk_xprt.xpt_server);
+ svc_check_conn_limits(xprt->xpt_server);
}
- svc_xprt_received(&svsk->sk_xprt);
+ svc_xprt_received(xprt);
} else {
- dprintk("svc: server %p, pool %u, socket %p, inuse=%d\n",
- rqstp, pool->sp_id, svsk,
- atomic_read(&svsk->sk_xprt.xpt_ref.refcount));
+ dprintk("svc: server %p, pool %u, transport %p, inuse=%d\n",
+ rqstp, pool->sp_id, xprt,
+ atomic_read(&xprt->xpt_ref.refcount));

- if ((rqstp->rq_deferred = svc_deferred_dequeue(&svsk->sk_xprt))) {
- svc_xprt_received(&svsk->sk_xprt);
+ if ((rqstp->rq_deferred = svc_deferred_dequeue(xprt))) {
+ svc_xprt_received(xprt);
len = svc_deferred_recv(rqstp);
} else
- len = svsk->sk_xprt.xpt_ops.xpo_recvfrom(rqstp);
- svc_copy_addr(rqstp, &svsk->sk_xprt);
+ len = xprt->xpt_ops.xpo_recvfrom(rqstp);
+ svc_copy_addr(rqstp, xprt);
dprintk("svc: got len=%d\n", len);
}

@@ -1620,8 +1620,8 @@ svc_recv(struct svc_rqst *rqstp, long ti
svc_xprt_release(rqstp);
return -EAGAIN;
}
- svsk->sk_xprt.xpt_lastrecv = get_seconds();
- clear_bit(XPT_OLD, &svsk->sk_xprt.xpt_flags);
+ xprt->xpt_lastrecv = get_seconds();
+ clear_bit(XPT_OLD, &xprt->xpt_flags);

rqstp->rq_secure = svc_port_is_privileged(svc_addr(rqstp));
rqstp->rq_chandle.defer = svc_defer;
@@ -1771,7 +1771,6 @@ static struct svc_sock *svc_setup_socket
svsk->sk_ostate = inet->sk_state_change;
svsk->sk_odata = inet->sk_data_ready;
svsk->sk_owspace = inet->sk_write_space;
- svsk->sk_xprt.xpt_lastrecv = get_seconds();

/* Initialize the socket */
if (sock->type == SOCK_DGRAM) {


2007-09-27 05:02:30

by Tom Tucker

Subject: [RFC, PATCH 25/33] svc: Move the sockaddr information to svc_xprt


Move the IP address fields to the svc_xprt structure. Note that this
assumes that _all_ RPC transports must have IP based 4-tuples. This
seems reasonable given the tight coupling with the portmapper etc...
Thoughts?

Signed-off-by: Tom Tucker <[email protected]>
---
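
As a rough userspace illustration of what the new svc_copy_addr helper does,
the sketch below copies the peer address into a per-request structure and
picks the destination address by address family. The struct and function
names (xprt_addrs, req_addrs, copy_addr) are hypothetical; only the POSIX
socket types are real.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical stand-in for the address fields moved into svc_xprt. */
struct xprt_addrs {
	struct sockaddr_storage local;	/* like xpt_local */
	struct sockaddr_storage remote;	/* like xpt_remote */
	socklen_t remotelen;		/* like xpt_remotelen */
};

/* Hypothetical stand-in for the address fields of a request. */
struct req_addrs {
	struct sockaddr_storage addr;	/* where the reply must go */
	socklen_t addrlen;
	union {
		struct in_addr addr;
		struct in6_addr addr6;
	} daddr;			/* where the reply must come from */
};

static void copy_addr(struct req_addrs *rq, const struct xprt_addrs *xp)
{
	const struct sockaddr *sa = (const struct sockaddr *)&xp->local;

	/* The peer address is taken from the transport, not the message. */
	memcpy(&rq->addr, &xp->remote, xp->remotelen);
	rq->addrlen = xp->remotelen;

	/* The local address becomes the source address of later replies. */
	switch (sa->sa_family) {
	case AF_INET:
		rq->daddr.addr = ((const struct sockaddr_in *)sa)->sin_addr;
		break;
	case AF_INET6:
		rq->daddr.addr6 = ((const struct sockaddr_in6 *)sa)->sin6_addr;
		break;
	}
}
```

A transport whose local address is neither AF_INET nor AF_INET6 falls
through the switch and leaves daddr untouched, which is where the IP-only
assumption in the commit message becomes visible.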

include/linux/sunrpc/svc_xprt.h | 3 ++
include/linux/sunrpc/svcsock.h | 4 ---
net/sunrpc/svcsock.c | 50 +++++++++++++++++++++------------------
3 files changed, 30 insertions(+), 27 deletions(-)

diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index ef5026d..e00ff60 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -62,6 +62,9 @@ #define XPT_CACHE_AUTH 12 /* cache aut
void * xpt_auth_cache; /* auth cache */
struct list_head xpt_deferred; /* deferred requests that need to
* be revisted */
+ struct sockaddr_storage xpt_local; /* local address */
+ struct sockaddr_storage xpt_remote; /* remote peer's address */
+ int xpt_remotelen; /* length of address */
};

int svc_reg_xprt_class(struct svc_xprt_class *);
diff --git a/include/linux/sunrpc/svcsock.h b/include/linux/sunrpc/svcsock.h
index 96a229e..206f092 100644
--- a/include/linux/sunrpc/svcsock.h
+++ b/include/linux/sunrpc/svcsock.h
@@ -28,10 +28,6 @@ struct svc_sock {
/* private TCP part */
int sk_reclen; /* length of record */
int sk_tcplen; /* current read length */
-
- struct sockaddr_storage sk_local; /* local address */
- struct sockaddr_storage sk_remote; /* remote peer's address */
- int sk_remotelen; /* length of address */
};

/*
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 9e3071c..37978f7 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -631,33 +631,13 @@ svc_recvfrom(struct svc_rqst *rqstp, str
struct msghdr msg = {
.msg_flags = MSG_DONTWAIT,
};
- struct sockaddr *sin;
int len;

len = kernel_recvmsg(svsk->sk_sock, &msg, iov, nr, buflen,
msg.msg_flags);

- /* sock_recvmsg doesn't fill in the name/namelen, so we must..
- */
- memcpy(&rqstp->rq_addr, &svsk->sk_remote, svsk->sk_remotelen);
- rqstp->rq_addrlen = svsk->sk_remotelen;
-
- /* Destination address in request is needed for binding the
- * source address in RPC callbacks later.
- */
- sin = (struct sockaddr *)&svsk->sk_local;
- switch (sin->sa_family) {
- case AF_INET:
- rqstp->rq_daddr.addr = ((struct sockaddr_in *)sin)->sin_addr;
- break;
- case AF_INET6:
- rqstp->rq_daddr.addr6 = ((struct sockaddr_in6 *)sin)->sin6_addr;
- break;
- }
-
dprintk("svc: socket %p recvfrom(%p, %Zu) = %d\n",
svsk, iov[0].iov_base, iov[0].iov_len, len);
-
return len;
}

@@ -1109,14 +1089,14 @@ svc_tcp_accept(struct svc_xprt *xprt)
if (!(newsvsk = svc_setup_socket(serv, newsock, &err,
(SVC_SOCK_ANONYMOUS | SVC_SOCK_TEMPORARY))))
goto failed;
- memcpy(&newsvsk->sk_remote, sin, slen);
- newsvsk->sk_remotelen = slen;
+ memcpy(&newsvsk->sk_xprt.xpt_remote, sin, slen);
+ newsvsk->sk_xprt.xpt_remotelen = slen;
err = kernel_getsockname(newsock, sin, &slen);
if (unlikely(err < 0)) {
dprintk("svc_tcp_accept: kernel_getsockname error %d\n", -err);
slen = offsetof(struct sockaddr, sa_data);
}
- memcpy(&newsvsk->sk_local, sin, slen);
+ memcpy(&newsvsk->sk_xprt.xpt_local, sin, slen);

if (serv->sv_stats)
serv->sv_stats->nettcpconn++;
@@ -1490,6 +1470,29 @@ svc_check_conn_limits(struct svc_serv *s
}
}

+static void inline svc_copy_addr(struct svc_rqst *rqstp, struct svc_xprt *xprt)
+{
+ struct sockaddr *sin;
+
+ /* sock_recvmsg doesn't fill in the name/namelen, so we must..
+ */
+ memcpy(&rqstp->rq_addr, &xprt->xpt_remote, xprt->xpt_remotelen);
+ rqstp->rq_addrlen = xprt->xpt_remotelen;
+
+ /* Destination address in request is needed for binding the
+ * source address in RPC callbacks later.
+ */
+ sin = (struct sockaddr *)&xprt->xpt_local;
+ switch (sin->sa_family) {
+ case AF_INET:
+ rqstp->rq_daddr.addr = ((struct sockaddr_in *)sin)->sin_addr;
+ break;
+ case AF_INET6:
+ rqstp->rq_daddr.addr6 = ((struct sockaddr_in6 *)sin)->sin6_addr;
+ break;
+ }
+}
+
/*
* Receive the next request on any socket. This code is carefully
* organised not to touch any cachelines in the shared svc_serv
@@ -1608,6 +1611,7 @@ svc_recv(struct svc_rqst *rqstp, long ti
len = svc_deferred_recv(rqstp);
} else
len = svsk->sk_xprt.xpt_ops.xpo_recvfrom(rqstp);
+ svc_copy_addr(rqstp, &svsk->sk_xprt);
dprintk("svc: got len=%d\n", len);
}



2007-09-27 05:02:30

by Tom Tucker

Subject: [RFC, PATCH 26/33] svc: Make svc_sock_release svc_xprt_release


The svc_sock_release function only touches transport independent fields.
Change the function to manipulate svc_xprt directly instead of the transport
dependent svc_sock structure.

Signed-off-by: Tom Tucker <[email protected]>
---

net/sunrpc/svcsock.c | 15 +++++++--------
1 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 37978f7..89f345d 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -377,9 +377,9 @@ void svc_reserve(struct svc_rqst *rqstp,
}

static void
-svc_sock_release(struct svc_rqst *rqstp)
+svc_xprt_release(struct svc_rqst *rqstp)
{
- struct svc_sock *svsk = rqstp->rq_sock;
+ struct svc_xprt *xprt = rqstp->rq_xprt;

rqstp->rq_xprt->xpt_ops.xpo_release(rqstp);

@@ -387,7 +387,6 @@ svc_sock_release(struct svc_rqst *rqstp)
rqstp->rq_res.page_len = 0;
rqstp->rq_res.page_base = 0;

-
/* Reset response buffer and release
* the reservation.
* But first, check that enough space was reserved
@@ -400,9 +399,9 @@ svc_sock_release(struct svc_rqst *rqstp)

rqstp->rq_res.head[0].iov_len = 0;
svc_reserve(rqstp, 0);
- rqstp->rq_sock = NULL;
+ rqstp->rq_xprt = NULL;

- svc_xprt_put(&svsk->sk_xprt);
+ svc_xprt_put(xprt);
}

/*
@@ -1618,7 +1617,7 @@ svc_recv(struct svc_rqst *rqstp, long ti
/* No data, incomplete (TCP) read, or accept() */
if (len == 0 || len == -EAGAIN) {
rqstp->rq_res.len = 0;
- svc_sock_release(rqstp);
+ svc_xprt_release(rqstp);
return -EAGAIN;
}
svsk->sk_xprt.xpt_lastrecv = get_seconds();
@@ -1639,7 +1638,7 @@ void
svc_drop(struct svc_rqst *rqstp)
{
dprintk("svc: socket %p dropped request\n", rqstp->rq_sock);
- svc_sock_release(rqstp);
+ svc_xprt_release(rqstp);
}

/*
@@ -1674,7 +1673,7 @@ svc_send(struct svc_rqst *rqstp)
else
len = xprt->xpt_ops.xpo_sendto(rqstp);
mutex_unlock(&xprt->xpt_mutex);
- svc_sock_release(rqstp);
+ svc_xprt_release(rqstp);

if (len == -ECONNREFUSED || len == -ENOTCONN || len == -EAGAIN)
return 0;


2007-09-27 05:02:31

by Tom Tucker

Subject: [RFC, PATCH 29/33] svc: Move common create logic to common code


Move the code that adds a transport instance to the sv_tempsocks and
sv_permsocks lists out of the transport specific functions and into core
logic.

The svc_addsock routine still manipulates sv_permsocks directly. This
code may be removed when rpc.nfsd is modified to create transports
by writing to the portlist file.

Signed-off-by: Tom Tucker <[email protected]>
---
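
The bookkeeping that moves into core code can be summarised with the small
userspace sketch below. It is an assumption-laden illustration: locking
(sv_lock) is left out, a boolean stands in for arming sv_temptimer, and
struct serv, struct xprt, serv_add_perm and serv_add_temp are hypothetical
names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical generic transport with only the fields needed here. */
struct xprt {
	struct xprt *next;	/* stands in for xpt_list */
	bool temp;		/* stands in for the XPT_TEMP bit */
};

/* Hypothetical service with a permanent and a temporary list. */
struct serv {
	struct xprt *permsocks;	/* listeners and other permanent transports */
	struct xprt *tempsocks;	/* accepted connections, subject to aging */
	int tmpcnt;		/* number of temporary transports */
	bool timer_armed;	/* stands in for setting up sv_temptimer */
};

/* Core code registers a newly created (permanent) transport. */
static void serv_add_perm(struct serv *s, struct xprt *x)
{
	x->temp = false;
	x->next = s->permsocks;
	s->permsocks = x;
}

/* Core code registers a transport returned by the accept callback. */
static void serv_add_temp(struct serv *s, struct xprt *x)
{
	x->temp = true;
	x->next = s->tempsocks;
	s->tempsocks = x;
	s->tmpcnt++;
	if (!s->timer_armed)
		s->timer_armed = true;	/* arm the aging timer only once */
}
```

Because both helpers take the generic transport, a provider only has to
create the transport; which list it lands on is decided by the caller.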

net/sunrpc/svc_xprt.c | 7 +++++++
net/sunrpc/svcsock.c | 38 +++++++++++++++++++-------------------
2 files changed, 26 insertions(+), 19 deletions(-)

diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 8d87b6a..78c93a4 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -146,6 +146,13 @@ int svc_create_xprt(struct svc_serv *ser
if (IS_ERR(newxprt)) {
module_put(xcl->xcl_owner);
ret = PTR_ERR(newxprt);
+ } else {
+ clear_bit(XPT_TEMP,
+ &newxprt->xpt_flags);
+ spin_lock_bh(&serv->sv_lock);
+ list_add(&newxprt->xpt_list,
+ &serv->sv_permsocks);
+ spin_unlock_bh(&serv->sv_lock);
}
goto out;
}
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index b1affaa..8a708a8 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -92,6 +92,7 @@ static int svc_deferred_recv(struct svc_
static struct cache_deferred_req *svc_defer(struct cache_req *req);
static struct svc_xprt *
svc_create_socket(struct svc_serv *, int, struct sockaddr *, int, int);
+static void svc_age_temp_xprts(unsigned long closure);

/* apparently the "standard" is that clients close
* idle connections after 5 minutes, servers after
@@ -1598,6 +1599,18 @@ svc_recv(struct svc_rqst *rqstp, long ti
*/
__module_get(newxpt->xpt_class->xcl_owner);
svc_check_conn_limits(xprt->xpt_server);
+ spin_lock_bh(&serv->sv_lock);
+ set_bit(XPT_TEMP, &newxpt->xpt_flags);
+ list_add(&newxpt->xpt_list, &serv->sv_tempsocks);
+ serv->sv_tmpcnt++;
+ if (serv->sv_temptimer.function == NULL) {
+ /* setup timer to age temp sockets */
+ setup_timer(&serv->sv_temptimer, svc_age_temp_xprts,
+ (unsigned long)serv);
+ mod_timer(&serv->sv_temptimer,
+ jiffies + svc_conn_age_period * HZ);
+ }
+ spin_unlock_bh(&serv->sv_lock);
}
svc_xprt_received(xprt);
} else {
@@ -1746,7 +1759,6 @@ static struct svc_sock *svc_setup_socket
struct svc_sock *svsk;
struct sock *inet;
int pmap_register = !(flags & SVC_SOCK_ANONYMOUS);
- int is_temporary = flags & SVC_SOCK_TEMPORARY;

dprintk("svc: svc_setup_socket %p\n", sock);
if (!(svsk = kzalloc(sizeof(*svsk), GFP_KERNEL))) {
@@ -1785,24 +1797,6 @@ static struct svc_sock *svc_setup_socket
set_bit(XPT_CACHE_AUTH, &svsk->sk_xprt.xpt_flags);
}

- spin_lock_bh(&serv->sv_lock);
- if (is_temporary) {
- set_bit(XPT_TEMP, &svsk->sk_xprt.xpt_flags);
- list_add(&svsk->sk_xprt.xpt_list, &serv->sv_tempsocks);
- serv->sv_tmpcnt++;
- if (serv->sv_temptimer.function == NULL) {
- /* setup timer to age temp sockets */
- setup_timer(&serv->sv_temptimer, svc_age_temp_xprts,
- (unsigned long)serv);
- mod_timer(&serv->sv_temptimer,
- jiffies + svc_conn_age_period * HZ);
- }
- } else {
- clear_bit(XPT_TEMP, &svsk->sk_xprt.xpt_flags);
- list_add(&svsk->sk_xprt.xpt_list, &serv->sv_permsocks);
- }
- spin_unlock_bh(&serv->sv_lock);
-
dprintk("svc: svc_setup_socket created %p (inet %p)\n",
svsk, svsk->sk_sk);

@@ -1833,6 +1827,12 @@ int svc_addsock(struct svc_serv *serv,
svc_xprt_received(&svsk->sk_xprt);
err = 0;
}
+ if (so->sk->sk_protocol == IPPROTO_TCP)
+ set_bit(XPT_LISTENER, &svsk->sk_xprt.xpt_flags);
+ clear_bit(XPT_TEMP, &svsk->sk_xprt.xpt_flags);
+ spin_lock_bh(&serv->sv_lock);
+ list_add(&svsk->sk_xprt.xpt_list, &serv->sv_permsocks);
+ spin_unlock_bh(&serv->sv_lock);
}
if (err) {
sockfd_put(so);


2007-09-27 05:02:35

by Tom Tucker

Subject: [RFC, PATCH 28/33] svc: Make svc_age_temp_sockets svc_age_temp_transports


This function is transport independent. Change it to use svc_xprt directly
and change its name to reflect this.

Signed-off-by: Tom Tucker <[email protected]>
---

net/sunrpc/svcsock.c | 32 +++++++++++++++++---------------
1 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index f5da434..b1affaa 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -1685,49 +1685,51 @@ svc_send(struct svc_rqst *rqstp)
* a mark-and-sweep algorithm.
*/
static void
-svc_age_temp_sockets(unsigned long closure)
+svc_age_temp_xprts(unsigned long closure)
{
struct svc_serv *serv = (struct svc_serv *)closure;
- struct svc_sock *svsk;
+ struct svc_xprt *xprt;
struct list_head *le, *next;
LIST_HEAD(to_be_aged);

- dprintk("svc_age_temp_sockets\n");
+ dprintk("svc_age_temp_xprts\n");

if (!spin_trylock_bh(&serv->sv_lock)) {
/* busy, try again 1 sec later */
- dprintk("svc_age_temp_sockets: busy\n");
+ dprintk("svc_age_temp_xprts: busy\n");
mod_timer(&serv->sv_temptimer, jiffies + HZ);
return;
}

list_for_each_safe(le, next, &serv->sv_tempsocks) {
- svsk = list_entry(le, struct svc_sock, sk_xprt.xpt_list);
+ xprt = list_entry(le, struct svc_xprt, xpt_list);

- if (!test_and_set_bit(XPT_OLD, &svsk->sk_xprt.xpt_flags))
+ /* First time through, just mark it OLD. Second time
+ * through, close it. */
+ if (!test_and_set_bit(XPT_OLD, &xprt->xpt_flags))
continue;
if (atomic_read(&svsk->sk_xprt.xpt_ref.refcount) > 1
|| test_bit(SK_BUSY, &svsk->sk_flags))
continue;
- svc_xprt_get(&svsk->sk_xprt);
+ svc_xprt_get(xprt);
list_move(le, &to_be_aged);
- set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags);
- set_bit(XPT_DETACHED, &svsk->sk_xprt.xpt_flags);
+ set_bit(XPT_CLOSE, &xprt->xpt_flags);
+ set_bit(XPT_DETACHED, &xprt->xpt_flags);
}
spin_unlock_bh(&serv->sv_lock);

while (!list_empty(&to_be_aged)) {
le = to_be_aged.next;
- /* fiddling the sk_xprt.xpt_list node is safe 'cos we're XPT_DETACHED */
+ /* fiddling the xpt_list node is safe 'cos we're XPT_DETACHED */
list_del_init(le);
- svsk = list_entry(le, struct svc_sock, sk_xprt.xpt_list);
+ xprt = list_entry(le, struct svc_xprt, xpt_list);

dprintk("queuing svsk %p for closing, %lu seconds old\n",
- svsk, get_seconds() - svsk->sk_xprt.xpt_lastrecv);
+ xprt, get_seconds() - xprt->xpt_lastrecv);

/* a thread will dequeue and close it soon */
- svc_xprt_enqueue(&svsk->sk_xprt);
- svc_xprt_put(&svsk->sk_xprt);
+ svc_xprt_enqueue(xprt);
+ svc_xprt_put(xprt);
}

mod_timer(&serv->sv_temptimer, jiffies + svc_conn_age_period * HZ);
@@ -1790,7 +1792,7 @@ static struct svc_sock *svc_setup_socket
serv->sv_tmpcnt++;
if (serv->sv_temptimer.function == NULL) {
/* setup timer to age temp sockets */
- setup_timer(&serv->sv_temptimer, svc_age_temp_sockets,
+ setup_timer(&serv->sv_temptimer, svc_age_temp_xprts,
(unsigned long)serv);
mod_timer(&serv->sv_temptimer,
jiffies + svc_conn_age_period * HZ);


2007-09-27 05:02:41

by Tom Tucker

Subject: [RFC, PATCH 30/33] svc: Removing remaining references to rq_sock in rqstp


This functionally empty patch removes rq_sock and the unnamed union
from the svc_rqst structure.
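
The casts that replace rq_sock rely on the generic transport being embedded as the first member of the socket-specific structure. A minimal user-space sketch of that pattern follows; the `demo_*` types and helpers are hypothetical and only mirror the shape of svc_xprt, svc_sock and svc_rqst, they are not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical generic transport, standing in for struct svc_xprt. */
struct demo_xprt {
	unsigned long flags;
};

/* Hypothetical socket transport, standing in for struct svc_sock. */
struct demo_sock {
	struct demo_xprt xprt;  /* must stay the first member for the cast */
	int family;
};

/* Hypothetical request, holding only the generic pointer. */
struct demo_rqst {
	struct demo_xprt *rq_xprt;
};

/* Recover the socket-specific structure from the generic pointer. */
static struct demo_sock *demo_rqst_to_sock(struct demo_rqst *rqstp)
{
	/* The plain cast is only valid while xprt sits at offset 0. */
	assert(offsetof(struct demo_sock, xprt) == 0);
	return (struct demo_sock *)rqstp->rq_xprt;
}

static int demo_family(struct demo_rqst *rqstp)
{
	return demo_rqst_to_sock(rqstp)->family;
}
```

A container_of() style lookup would remove the dependency on member order, at the cost of naming the member at every call site.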

Signed-off-by: Tom Tucker <[email protected]>
---

include/linux/sunrpc/svc.h | 5 +----
net/sunrpc/svcsock.c | 29 ++++++++++++++++-------------
2 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index 40adc9d..04eb20e 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -204,10 +204,7 @@ union svc_addr_u {
struct svc_rqst {
struct list_head rq_list; /* idle list */
struct list_head rq_all; /* all threads list */
- union {
- struct svc_xprt * rq_xprt; /* transport ptr */
- struct svc_sock * rq_sock; /* socket ptr */
- };
+ struct svc_xprt * rq_xprt; /* transport ptr */
struct sockaddr_storage rq_addr; /* peer address */
size_t rq_addrlen;

diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 8a708a8..5ea26b2 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -196,10 +196,11 @@ svc_release_skb(struct svc_rqst *rqstp)
struct svc_deferred_req *dr = rqstp->rq_deferred;

if (skb) {
+ struct svc_sock *svsk = (struct svc_sock *)rqstp->rq_xprt;
rqstp->rq_xprt_ctxt = NULL;

dprintk("svc: service %p, releasing skb %p\n", rqstp, skb);
- skb_free_datagram(rqstp->rq_sock->sk_sk, skb);
+ skb_free_datagram(svsk->sk_sk, skb);
}
if (dr) {
rqstp->rq_deferred = NULL;
@@ -428,7 +429,7 @@ svc_wake_up(struct svc_serv *serv)
dprintk("svc: daemon %p woken up.\n", rqstp);
/*
svc_thread_dequeue(pool, rqstp);
- rqstp->rq_sock = NULL;
+ rqstp->rq_xprt = NULL;
*/
wake_up(&rqstp->rq_wait);
}
@@ -445,7 +446,8 @@ #define SVC_PKTINFO_SPACE \

static void svc_set_cmsg_data(struct svc_rqst *rqstp, struct cmsghdr *cmh)
{
- switch (rqstp->rq_sock->sk_sk->sk_family) {
+ struct svc_sock *svsk = (struct svc_sock *)rqstp->rq_xprt;
+ switch (svsk->sk_sk->sk_family) {
case AF_INET: {
struct in_pktinfo *pki = CMSG_DATA(cmh);

@@ -478,7 +480,7 @@ static void svc_set_cmsg_data(struct svc
static int
svc_sendto(struct svc_rqst *rqstp, struct xdr_buf *xdr)
{
- struct svc_sock *svsk = rqstp->rq_sock;
+ struct svc_sock *svsk = (struct svc_sock *)rqstp->rq_xprt;
struct socket *sock = svsk->sk_sock;
int slen;
union {
@@ -551,7 +553,7 @@ svc_sendto(struct svc_rqst *rqstp, struc
}
out:
dprintk("svc: socket %p sendto([%p %Zu... ], %d) = %d (addr %s)\n",
- rqstp->rq_sock, xdr->head[0].iov_base, xdr->head[0].iov_len,
+ svsk, xdr->head[0].iov_base, xdr->head[0].iov_len,
xdr->len, len, svc_print_addr(rqstp, buf, sizeof(buf)));

return len;
@@ -627,7 +629,7 @@ svc_recv_available(struct svc_sock *svsk
static int
svc_recvfrom(struct svc_rqst *rqstp, struct kvec *iov, int nr, int buflen)
{
- struct svc_sock *svsk = rqstp->rq_sock;
+ struct svc_sock *svsk = (struct svc_sock *)rqstp->rq_xprt;
struct msghdr msg = {
.msg_flags = MSG_DONTWAIT,
};
@@ -709,7 +711,8 @@ svc_write_space(struct sock *sk)
static inline void svc_udp_get_dest_address(struct svc_rqst *rqstp,
struct cmsghdr *cmh)
{
- switch (rqstp->rq_sock->sk_sk->sk_family) {
+ struct svc_sock *svsk = (struct svc_sock *)rqstp->rq_xprt;
+ switch (svsk->sk_sk->sk_family) {
case AF_INET: {
struct in_pktinfo *pki = CMSG_DATA(cmh);
rqstp->rq_daddr.addr.s_addr = pki->ipi_spec_dst.s_addr;
@@ -729,7 +732,7 @@ static inline void svc_udp_get_dest_addr
static int
svc_udp_recvfrom(struct svc_rqst *rqstp)
{
- struct svc_sock *svsk = rqstp->rq_sock;
+ struct svc_sock *svsk = (struct svc_sock *)rqstp->rq_xprt;
struct svc_serv *serv = svsk->sk_xprt.xpt_server;
struct sk_buff *skb;
union {
@@ -1114,7 +1117,7 @@ failed:
static int
svc_tcp_recvfrom(struct svc_rqst *rqstp)
{
- struct svc_sock *svsk = rqstp->rq_sock;
+ struct svc_sock *svsk = (struct svc_sock *)rqstp->rq_xprt;
struct svc_serv *serv = svsk->sk_xprt.xpt_server;
int len;
struct kvec *vec;
@@ -1277,16 +1280,16 @@ svc_tcp_sendto(struct svc_rqst *rqstp)
reclen = htonl(0x80000000|((xbufp->len ) - 4));
memcpy(xbufp->head[0].iov_base, &reclen, 4);

- if (test_bit(XPT_DEAD, &rqstp->rq_sock->sk_xprt.xpt_flags))
+ if (test_bit(XPT_DEAD, &rqstp->rq_xprt->xpt_flags))
return -ENOTCONN;

sent = svc_sendto(rqstp, &rqstp->rq_res);
if (sent != xbufp->len) {
printk(KERN_NOTICE "rpc-srv/tcp: %s: %s %d when sending %d bytes - shutting down socket\n",
- rqstp->rq_sock->sk_xprt.xpt_server->sv_name,
+ rqstp->rq_xprt->xpt_server->sv_name,
(sent<0)?"got error":"sent only",
sent, xbufp->len);
- set_bit(XPT_CLOSE, &rqstp->rq_sock->sk_xprt.xpt_flags);
+ set_bit(XPT_CLOSE, &rqstp->rq_xprt->xpt_flags);
svc_xprt_enqueue(rqstp->rq_xprt);
sent = -EAGAIN;
}
@@ -1650,7 +1653,7 @@ svc_recv(struct svc_rqst *rqstp, long ti
void
svc_drop(struct svc_rqst *rqstp)
{
- dprintk("svc: socket %p dropped request\n", rqstp->rq_sock);
+ dprintk("svc: xprt %p dropped request\n", rqstp->rq_xprt);
svc_xprt_release(rqstp);
}



2007-09-27 05:02:43

by Tom Tucker

Subject: [RFC,PATCH 32/33] svc: Add /proc/sys/sunrpc/transport files


Add a file that when read lists the set of registered svc
transports.
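
The listing amounts to formatting one "name max_payload" line per registered class into a fixed-size buffer and stopping before an entry would overflow it. A user-space sketch of that idea, assuming a hypothetical `demo_class` array in place of the kernel's class list and omitting the spinlock:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct svc_xprt_class. */
struct demo_class {
	const char *name;
	unsigned int max_payload;
};

/*
 * Format "name max_payload\n" for each class into buf, never writing
 * more than maxlen bytes including the terminating NUL.
 * Returns the number of characters stored, excluding the NUL.
 */
static int demo_print_classes(const struct demo_class *cls, size_t n,
			      char *buf, size_t maxlen)
{
	size_t len = 0, i;

	assert(buf != NULL && maxlen > 0);
	buf[0] = '\0';
	for (i = 0; i < n; i++) {
		int slen = snprintf(buf + len, maxlen - len, "%s %u\n",
				    cls[i].name, cls[i].max_payload);

		if (slen < 0 || (size_t)slen >= maxlen - len) {
			buf[len] = '\0';  /* drop the partial entry */
			break;
		}
		len += (size_t)slen;
	}
	return (int)len;
}
```

Writing each entry directly at its final offset avoids the temporary string and the repeated scan that strcat() performs on every append.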

Signed-off-by: Tom Tucker <[email protected]>
---

include/linux/sunrpc/debug.h | 1 +
include/linux/sunrpc/svc_xprt.h | 2 +-
net/sunrpc/svc_xprt.c | 28 ++++++++++++++++++++++++++++
net/sunrpc/sysctl.c | 37 +++++++++++++++++++++++++++++++++++++
4 files changed, 67 insertions(+), 1 deletions(-)

diff --git a/include/linux/sunrpc/debug.h b/include/linux/sunrpc/debug.h
index 10709cb..89458df 100644
--- a/include/linux/sunrpc/debug.h
+++ b/include/linux/sunrpc/debug.h
@@ -88,6 +88,7 @@ enum {
CTL_SLOTTABLE_TCP,
CTL_MIN_RESVPORT,
CTL_MAX_RESVPORT,
+ CTL_TRANSPORTS,
};

#endif /* _LINUX_SUNRPC_DEBUG_H_ */
diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index a0fbb4f..e4ce4e1 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -80,5 +80,5 @@ static inline void svc_xprt_get(struct s
}
void svc_delete_xprt(struct svc_xprt *xprt);
void svc_close_xprt(struct svc_xprt *xprt);
-
+int svc_print_xprts(char *buf, int maxlen);
#endif /* SUNRPC_SVC_XPRT_H */
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 1a7cde0..99c47c8 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -93,6 +93,34 @@ int svc_unreg_xprt_class(struct svc_xprt
}
EXPORT_SYMBOL_GPL(svc_unreg_xprt_class);

+/*
+ * Format the transport list for printing
+ */
+int svc_print_xprts(char *buf, int maxlen)
+{
+ struct list_head *le;
+ char tmpstr[80];
+ int len = 0;
+ buf[0] = '\0';
+
+ spin_lock(&svc_xprt_class_lock);
+ list_for_each(le, &svc_xprt_class_list) {
+ int slen;
+ struct svc_xprt_class *xcl =
+ list_entry(le, struct svc_xprt_class, xcl_list);
+
+ snprintf(tmpstr, sizeof(tmpstr), "%s %d\n", xcl->xcl_name, xcl->xcl_max_payload);
+ slen = strlen(tmpstr);
+ if (len + slen >= maxlen)
+ break;
+ len += slen;
+ strcat(buf, tmpstr);
+ }
+ spin_unlock(&svc_xprt_class_lock);
+
+ return len;
+}
+
static inline void svc_xprt_free(struct kref *kref)
{
struct svc_xprt *xprt =
diff --git a/net/sunrpc/sysctl.c b/net/sunrpc/sysctl.c
index 738db32..8642f6f 100644
--- a/net/sunrpc/sysctl.c
+++ b/net/sunrpc/sysctl.c
@@ -18,6 +18,7 @@ #include <asm/uaccess.h>
#include <linux/sunrpc/types.h>
#include <linux/sunrpc/sched.h>
#include <linux/sunrpc/stats.h>
+#include <linux/sunrpc/svc_xprt.h>

/*
* Declare the debug flags here
@@ -27,6 +28,8 @@ unsigned int nfs_debug;
unsigned int nfsd_debug;
unsigned int nlm_debug;

+char xprt_buf[128];
+
#ifdef RPC_DEBUG

static struct ctl_table_header *sunrpc_table_header;
@@ -48,6 +51,32 @@ rpc_unregister_sysctl(void)
}
}

+static int proc_do_xprt(ctl_table *table, int write, struct file *file,
+ void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+ char tmpbuf[sizeof(xprt_buf)];
+ int len;
+ if ((*ppos && !write) || !*lenp) {
+ *lenp = 0;
+ return 0;
+ }
+ if (write)
+ return -EINVAL;
+ else {
+
+ len = svc_print_xprts(tmpbuf, sizeof(tmpbuf));
+ if (!access_ok(VERIFY_WRITE, buffer, len))
+ return -EFAULT;
+
+ if (__copy_to_user(buffer, tmpbuf, len))
+ return -EFAULT;
+ }
+
+ *lenp = len;
+ *ppos += len;
+ return 0;
+}
+
static int
proc_dodebug(ctl_table *table, int write, struct file *file,
void __user *buffer, size_t *lenp, loff_t *ppos)
@@ -145,6 +174,14 @@ static ctl_table debug_table[] = {
.mode = 0644,
.proc_handler = &proc_dodebug
},
+ {
+ .ctl_name = CTL_TRANSPORTS,
+ .procname = "transports",
+ .data = xprt_buf,
+ .maxlen = sizeof(xprt_buf),
+ .mode = 0444,
+ .proc_handler = &proc_do_xprt,
+ },
{ .ctl_name = 0 }
};



2007-09-27 05:02:42

by Tom Tucker

Subject: [RFC, PATCH 31/33] svc: Move the xprt independent code to the svc_xprt.c file


This functionally trivial patch moves all of the transport independent
functions from the svcsock.c file to the transport independent svc_xprt.c
file.

Signed-off-by: Tom Tucker <[email protected]>
---

include/linux/sunrpc/svc_xprt.h | 4
net/sunrpc/svc_xprt.c | 748 +++++++++++++++++++++++++++++++++++++++
net/sunrpc/svcsock.c | 750 ---------------------------------------
3 files changed, 752 insertions(+), 750 deletions(-)

diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index e00ff60..a0fbb4f 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -72,9 +72,13 @@ int svc_unreg_xprt_class(struct svc_xprt
void svc_xprt_init(struct svc_xprt_class *, struct svc_xprt *, struct svc_serv *);
int svc_create_xprt(struct svc_serv *, char *, unsigned short, int);
void svc_xprt_received(struct svc_xprt *);
+void svc_xprt_enqueue(struct svc_xprt *xprt);
+int svc_port_is_privileged(struct sockaddr *sin);
void svc_xprt_put(struct svc_xprt *xprt);
static inline void svc_xprt_get(struct svc_xprt *xprt) {
kref_get(&xprt->xpt_ref);
}
+void svc_delete_xprt(struct svc_xprt *xprt);
+void svc_close_xprt(struct svc_xprt *xprt);

#endif /* SUNRPC_SVC_XPRT_H */
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 78c93a4..1a7cde0 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -35,6 +35,17 @@ #include <linux/sunrpc/svc_xprt.h>

#define RPCDBG_FACILITY RPCDBG_SVCXPRT

+static struct svc_deferred_req *svc_deferred_dequeue(struct svc_xprt *xprt);
+static int svc_deferred_recv(struct svc_rqst *rqstp);
+static struct cache_deferred_req *svc_defer(struct cache_req *req);
+static void svc_age_temp_xprts(unsigned long closure);
+/* apparently the "standard" is that clients close
+ * idle connections after 5 minutes, servers after
+ * 6 minutes
+ * http://www.connectathon.org/talks96/nfstcp.pdf
+ */
+static int svc_conn_age_period = 6*60;
+
/* List of registered transport classes */
static spinlock_t svc_xprt_class_lock = SPIN_LOCK_UNLOCKED;
static LIST_HEAD(svc_xprt_class_list);
@@ -165,3 +176,740 @@ int svc_create_xprt(struct svc_serv *ser
}
EXPORT_SYMBOL_GPL(svc_create_xprt);

+/*
+ * Queue up an idle server thread. Must have pool->sp_lock held.
+ * Note: this is really a stack rather than a queue, so that we only
+ * use as many different threads as we need, and the rest don't pollute
+ * the cache.
+ */
+static inline void
+svc_thread_enqueue(struct svc_pool *pool, struct svc_rqst *rqstp)
+{
+ list_add(&rqstp->rq_list, &pool->sp_threads);
+}
+
+/*
+ * Dequeue an nfsd thread. Must have pool->sp_lock held.
+ */
+static inline void
+svc_thread_dequeue(struct svc_pool *pool, struct svc_rqst *rqstp)
+{
+ list_del(&rqstp->rq_list);
+}
+
+/*
+ * Queue up a socket with data pending. If there are idle nfsd
+ * processes, wake 'em up.
+ *
+ */
+void
+svc_xprt_enqueue(struct svc_xprt *xprt)
+{
+ struct svc_serv *serv = xprt->xpt_server;
+ struct svc_pool *pool;
+ struct svc_rqst *rqstp;
+ int cpu;
+
+ if (!(xprt->xpt_flags &
+ ( (1<<XPT_CONN)|(1<<XPT_DATA)|(1<<XPT_CLOSE)|(1<<XPT_DEFERRED)) ))
+ return;
+ if (test_bit(XPT_DEAD, &xprt->xpt_flags))
+ return;
+
+ cpu = get_cpu();
+ pool = svc_pool_for_cpu(xprt->xpt_server, cpu);
+ put_cpu();
+
+ spin_lock_bh(&pool->sp_lock);
+
+ if (!list_empty(&pool->sp_threads) &&
+ !list_empty(&pool->sp_sockets))
+ printk(KERN_ERR
+ "svc_xprt_enqueue: threads and sockets both waiting??\n");
+
+ if (test_bit(XPT_DEAD, &xprt->xpt_flags)) {
+ /* Don't enqueue dead sockets */
+ dprintk("svc: transport %p is dead, not enqueued\n", xprt);
+ goto out_unlock;
+ }
+
+ /* Mark socket as busy. It will remain in this state until the
+ * server has processed all pending data and put the socket back
+ * on the idle list. We update XPT_BUSY atomically because
+ * it also guards against trying to enqueue the svc_sock twice.
+ */
+ if (test_and_set_bit(XPT_BUSY, &xprt->xpt_flags)) {
+ /* Don't enqueue socket while already enqueued */
+ dprintk("svc: transport %p busy, not enqueued\n", xprt);
+ goto out_unlock;
+ }
+ BUG_ON(xprt->xpt_pool != NULL);
+ xprt->xpt_pool = pool;
+
+ /* Handle pending connection */
+ if (test_bit(XPT_CONN, &xprt->xpt_flags))
+ goto process;
+
+ /* Handle close in-progress */
+ if (test_bit(XPT_CLOSE, &xprt->xpt_flags))
+ goto process;
+
+ /* Check if we have space to reply to a request */
+ if (!xprt->xpt_ops.xpo_has_wspace(xprt)) {
+ /* Don't enqueue while not enough space for reply */
+ dprintk("svc: no write space, transport %p not enqueued\n", xprt);
+ xprt->xpt_pool = NULL;
+ clear_bit(XPT_BUSY, &xprt->xpt_flags);
+ goto out_unlock;
+ }
+
+ process:
+ if (!list_empty(&pool->sp_threads)) {
+ rqstp = list_entry(pool->sp_threads.next,
+ struct svc_rqst,
+ rq_list);
+ dprintk("svc: transport %p served by daemon %p\n",
+ xprt, rqstp);
+ svc_thread_dequeue(pool, rqstp);
+ if (rqstp->rq_xprt)
+ printk(KERN_ERR
+ "svc_xprt_enqueue: server %p, rq_xprt=%p!\n",
+ rqstp, rqstp->rq_xprt);
+ rqstp->rq_xprt = xprt;
+ svc_xprt_get(xprt);
+ rqstp->rq_reserved = serv->sv_max_mesg;
+ atomic_add(rqstp->rq_reserved, &xprt->xpt_reserved);
+ BUG_ON(xprt->xpt_pool != pool);
+ wake_up(&rqstp->rq_wait);
+ } else {
+ dprintk("svc: transport %p put into queue\n", xprt);
+ list_add_tail(&xprt->xpt_ready, &pool->sp_sockets);
+ BUG_ON(xprt->xpt_pool != pool);
+ }
+
+out_unlock:
+ spin_unlock_bh(&pool->sp_lock);
+}
+EXPORT_SYMBOL_GPL(svc_xprt_enqueue);
+
+/*
+ * Dequeue the first socket. Must be called with the pool->sp_lock held.
+ */
+static inline struct svc_xprt *
+svc_xprt_dequeue(struct svc_pool *pool)
+{
+ struct svc_xprt *xprt;
+
+ if (list_empty(&pool->sp_sockets))
+ return NULL;
+
+ xprt = list_entry(pool->sp_sockets.next,
+ struct svc_xprt, xpt_ready);
+ list_del_init(&xprt->xpt_ready);
+
+ dprintk("svc: transport %p dequeued, inuse=%d\n",
+ xprt, atomic_read(&xprt->xpt_ref.refcount));
+
+ return xprt;
+}
+
+/*
+ * Having read something from a socket, check whether it
+ * needs to be re-enqueued.
+ * Note: XPT_DATA only gets cleared when a read-attempt finds
+ * no (or insufficient) data.
+ */
+void
+svc_xprt_received(struct svc_xprt *xprt)
+{
+ xprt->xpt_pool = NULL;
+ clear_bit(XPT_BUSY, &xprt->xpt_flags);
+ svc_xprt_enqueue(xprt);
+}
+EXPORT_SYMBOL_GPL(svc_xprt_received);
+
+/**
+ * svc_reserve - change the space reserved for the reply to a request.
+ * @rqstp: The request in question
+ * @space: new max space to reserve
+ *
+ * Each request reserves some space on the output queue of the socket
+ * to make sure the reply fits. This function reduces that reserved
+ * space to be the amount of space used already, plus @space.
+ *
+ */
+void svc_reserve(struct svc_rqst *rqstp, int space)
+{
+ space += rqstp->rq_res.head[0].iov_len;
+
+ if (space < rqstp->rq_reserved) {
+ struct svc_xprt *xprt = rqstp->rq_xprt;
+ atomic_sub((rqstp->rq_reserved - space), &xprt->xpt_reserved);
+ rqstp->rq_reserved = space;
+
+ svc_xprt_enqueue(xprt);
+ }
+}
+
+static void
+svc_xprt_release(struct svc_rqst *rqstp)
+{
+ struct svc_xprt *xprt = rqstp->rq_xprt;
+
+ rqstp->rq_xprt->xpt_ops.xpo_release(rqstp);
+
+ svc_free_res_pages(rqstp);
+ rqstp->rq_res.page_len = 0;
+ rqstp->rq_res.page_base = 0;
+
+ /* Reset response buffer and release
+ * the reservation.
+ * But first, check that enough space was reserved
+ * for the reply, otherwise we have a bug!
+ */
+ if ((rqstp->rq_res.len) > rqstp->rq_reserved)
+ printk(KERN_ERR "RPC request reserved %d but used %d\n",
+ rqstp->rq_reserved,
+ rqstp->rq_res.len);
+
+ rqstp->rq_res.head[0].iov_len = 0;
+ svc_reserve(rqstp, 0);
+ rqstp->rq_xprt = NULL;
+
+ svc_xprt_put(xprt);
+}
+
+/*
+ * External function to wake up a server waiting for data
+ * This really only makes sense for services like lockd
+ * which have exactly one thread anyway.
+ */
+void
+svc_wake_up(struct svc_serv *serv)
+{
+ struct svc_rqst *rqstp;
+ unsigned int i;
+ struct svc_pool *pool;
+
+ for (i = 0; i < serv->sv_nrpools; i++) {
+ pool = &serv->sv_pools[i];
+
+ spin_lock_bh(&pool->sp_lock);
+ if (!list_empty(&pool->sp_threads)) {
+ rqstp = list_entry(pool->sp_threads.next,
+ struct svc_rqst,
+ rq_list);
+ dprintk("svc: daemon %p woken up.\n", rqstp);
+ /*
+ svc_thread_dequeue(pool, rqstp);
+ rqstp->rq_xprt = NULL;
+ */
+ wake_up(&rqstp->rq_wait);
+ }
+ spin_unlock_bh(&pool->sp_lock);
+ }
+}
+
+static void
+svc_check_conn_limits(struct svc_serv *serv)
+{
+ char buf[RPC_MAX_ADDRBUFLEN];
+
+ /* make sure that we don't have too many active connections.
+ * If we have, something must be dropped.
+ *
+ * There's no point in trying to do random drop here for
+ * DoS prevention. The NFS clients does 1 reconnect in 15
+ * seconds. An attacker can easily beat that.
+ *
+ * The only somewhat efficient mechanism would be if drop
+ * old connections from the same IP first. But right now
+ * we don't even record the client IP in svc_sock.
+ */
+ if (serv->sv_tmpcnt > (serv->sv_nrthreads+3)*20) {
+ struct svc_sock *svsk = NULL;
+ spin_lock_bh(&serv->sv_lock);
+ if (!list_empty(&serv->sv_tempsocks)) {
+ if (net_ratelimit()) {
+ /* Try to help the admin */
+ printk(KERN_NOTICE "%s: too many open TCP "
+ "sockets, consider increasing the "
+ "number of nfsd threads\n",
+ serv->sv_name);
+ printk(KERN_NOTICE
+ "%s: last TCP connect from %s\n",
+ serv->sv_name, buf);
+ }
+ /*
+ * Always select the oldest socket. It's not fair,
+ * but so is life
+ */
+ svsk = list_entry(serv->sv_tempsocks.prev,
+ struct svc_sock,
+ sk_xprt.xpt_list);
+ set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags);
+ svc_xprt_get(&svsk->sk_xprt);
+ }
+ spin_unlock_bh(&serv->sv_lock);
+
+ if (svsk) {
+ svc_xprt_enqueue(&svsk->sk_xprt);
+ svc_xprt_put(&svsk->sk_xprt);
+ }
+ }
+}
+
+static void inline svc_copy_addr(struct svc_rqst *rqstp, struct svc_xprt *xprt)
+{
+ struct sockaddr *sin;
+
+ /* sock_recvmsg doesn't fill in the name/namelen, so we must..
+ */
+ memcpy(&rqstp->rq_addr, &xprt->xpt_remote, xprt->xpt_remotelen);
+ rqstp->rq_addrlen = xprt->xpt_remotelen;
+
+ /* Destination address in request is needed for binding the
+ * source address in RPC callbacks later.
+ */
+ sin = (struct sockaddr *)&xprt->xpt_local;
+ switch (sin->sa_family) {
+ case AF_INET:
+ rqstp->rq_daddr.addr = ((struct sockaddr_in *)sin)->sin_addr;
+ break;
+ case AF_INET6:
+ rqstp->rq_daddr.addr6 = ((struct sockaddr_in6 *)sin)->sin6_addr;
+ break;
+ }
+}
+
+int svc_port_is_privileged(struct sockaddr *sin)
+{
+ switch (sin->sa_family) {
+ case AF_INET:
+ return ntohs(((struct sockaddr_in *)sin)->sin_port)
+ < PROT_SOCK;
+ case AF_INET6:
+ return ntohs(((struct sockaddr_in6 *)sin)->sin6_port)
+ < PROT_SOCK;
+ default:
+ return 0;
+ }
+}
+
+/*
+ * Receive the next request on any socket. This code is carefully
+ * organised not to touch any cachelines in the shared svc_serv
+ * structure, only cachelines in the local svc_pool.
+ */
+int
+svc_recv(struct svc_rqst *rqstp, long timeout)
+{
+ struct svc_xprt *xprt = NULL;
+ struct svc_serv *serv = rqstp->rq_server;
+ struct svc_pool *pool = rqstp->rq_pool;
+ int len, i;
+ int pages;
+ struct xdr_buf *arg;
+ DECLARE_WAITQUEUE(wait, current);
+
+ dprintk("svc: server %p waiting for data (to = %ld)\n",
+ rqstp, timeout);
+
+ if (rqstp->rq_xprt)
+ printk(KERN_ERR
+ "svc_recv: service %p, transport not NULL!\n",
+ rqstp);
+ if (waitqueue_active(&rqstp->rq_wait))
+ printk(KERN_ERR
+ "svc_recv: service %p, wait queue active!\n",
+ rqstp);
+
+
+ /* now allocate needed pages. If we get a failure, sleep briefly */
+ pages = (serv->sv_max_mesg + PAGE_SIZE) / PAGE_SIZE;
+ for (i=0; i < pages ; i++)
+ while (rqstp->rq_pages[i] == NULL) {
+ struct page *p = alloc_page(GFP_KERNEL);
+ if (!p)
+ schedule_timeout_uninterruptible(msecs_to_jiffies(500));
+ rqstp->rq_pages[i] = p;
+ }
+ rqstp->rq_pages[i++] = NULL; /* this might be seen in nfs_read_actor */
+ BUG_ON(pages >= RPCSVC_MAXPAGES);
+
+ /* Make arg->head point to first page and arg->pages point to rest */
+ arg = &rqstp->rq_arg;
+ arg->head[0].iov_base = page_address(rqstp->rq_pages[0]);
+ arg->head[0].iov_len = PAGE_SIZE;
+ arg->pages = rqstp->rq_pages + 1;
+ arg->page_base = 0;
+ /* save at least one page for response */
+ arg->page_len = (pages-2)*PAGE_SIZE;
+ arg->len = (pages-1)*PAGE_SIZE;
+ arg->tail[0].iov_len = 0;
+
+ try_to_freeze();
+ cond_resched();
+ if (signalled())
+ return -EINTR;
+
+ spin_lock_bh(&pool->sp_lock);
+ if ((xprt = svc_xprt_dequeue(pool)) != NULL) {
+ rqstp->rq_xprt = xprt;
+ svc_xprt_get(xprt);
+ rqstp->rq_reserved = serv->sv_max_mesg;
+ atomic_add(rqstp->rq_reserved, &xprt->xpt_reserved);
+ } else {
+ /* No data pending. Go to sleep */
+ svc_thread_enqueue(pool, rqstp);
+
+ /*
+ * We have to be able to interrupt this wait
+ * to bring down the daemons ...
+ */
+ set_current_state(TASK_INTERRUPTIBLE);
+ add_wait_queue(&rqstp->rq_wait, &wait);
+ spin_unlock_bh(&pool->sp_lock);
+
+ schedule_timeout(timeout);
+
+ try_to_freeze();
+
+ spin_lock_bh(&pool->sp_lock);
+ remove_wait_queue(&rqstp->rq_wait, &wait);
+
+ if (!(xprt = rqstp->rq_xprt)) {
+ svc_thread_dequeue(pool, rqstp);
+ spin_unlock_bh(&pool->sp_lock);
+ dprintk("svc: server %p, no data yet\n", rqstp);
+ return signalled()? -EINTR : -EAGAIN;
+ }
+ }
+ spin_unlock_bh(&pool->sp_lock);
+
+ len = 0;
+ if (test_bit(XPT_CLOSE, &xprt->xpt_flags)) {
+ dprintk("svc_recv: found XPT_CLOSE\n");
+ svc_delete_xprt(xprt);
+ } else if (test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
+ struct svc_xprt *newxpt;
+ newxpt = xprt->xpt_ops.xpo_accept(xprt);
+ if (newxpt) {
+ svc_xprt_received(newxpt);
+ /*
+ * We know this module_get will succeed because the
+ * listener holds a reference too
+ */
+ __module_get(newxpt->xpt_class->xcl_owner);
+ svc_check_conn_limits(xprt->xpt_server);
+ spin_lock_bh(&serv->sv_lock);
+ set_bit(XPT_TEMP, &newxpt->xpt_flags);
+ list_add(&newxpt->xpt_list, &serv->sv_tempsocks);
+ serv->sv_tmpcnt++;
+ if (serv->sv_temptimer.function == NULL) {
+ /* setup timer to age temp sockets */
+ setup_timer(&serv->sv_temptimer, svc_age_temp_xprts,
+ (unsigned long)serv);
+ mod_timer(&serv->sv_temptimer,
+ jiffies + svc_conn_age_period * HZ);
+ }
+ spin_unlock_bh(&serv->sv_lock);
+ }
+ svc_xprt_received(xprt);
+ } else {
+ dprintk("svc: server %p, pool %u, transport %p, inuse=%d\n",
+ rqstp, pool->sp_id, xprt,
+ atomic_read(&xprt->xpt_ref.refcount));
+
+ if ((rqstp->rq_deferred = svc_deferred_dequeue(xprt))) {
+ svc_xprt_received(xprt);
+ len = svc_deferred_recv(rqstp);
+ } else
+ len = xprt->xpt_ops.xpo_recvfrom(rqstp);
+ svc_copy_addr(rqstp, xprt);
+ dprintk("svc: got len=%d\n", len);
+ }
+
+ /* No data, incomplete (TCP) read, or accept() */
+ if (len == 0 || len == -EAGAIN) {
+ rqstp->rq_res.len = 0;
+ svc_xprt_release(rqstp);
+ return -EAGAIN;
+ }
+ xprt->xpt_lastrecv = get_seconds();
+ clear_bit(XPT_OLD, &xprt->xpt_flags);
+
+ rqstp->rq_secure = svc_port_is_privileged(svc_addr(rqstp));
+ rqstp->rq_chandle.defer = svc_defer;
+
+ if (serv->sv_stats)
+ serv->sv_stats->netcnt++;
+ return len;
+}
+
+/*
+ * Drop request
+ */
+void
+svc_drop(struct svc_rqst *rqstp)
+{
+ dprintk("svc: xprt %p dropped request\n", rqstp->rq_xprt);
+ svc_xprt_release(rqstp);
+}
+
+/*
+ * Return reply to client.
+ */
+int
+svc_send(struct svc_rqst *rqstp)
+{
+ struct svc_xprt *xprt;
+ int len;
+ struct xdr_buf *xb;
+
+ if ((xprt = rqstp->rq_xprt) == NULL) {
+ printk(KERN_WARNING "NULL transport pointer in %s:%d\n",
+ __FILE__, __LINE__);
+ return -EFAULT;
+ }
+
+ /* release the receive skb before sending the reply */
+ rqstp->rq_xprt->xpt_ops.xpo_release(rqstp);
+
+ /* calculate over-all length */
+ xb = & rqstp->rq_res;
+ xb->len = xb->head[0].iov_len +
+ xb->page_len +
+ xb->tail[0].iov_len;
+
+ /* Grab mutex to serialize outgoing data. */
+ mutex_lock(&xprt->xpt_mutex);
+ if (test_bit(XPT_DEAD, &xprt->xpt_flags))
+ len = -ENOTCONN;
+ else
+ len = xprt->xpt_ops.xpo_sendto(rqstp);
+ mutex_unlock(&xprt->xpt_mutex);
+ svc_xprt_release(rqstp);
+
+ if (len == -ECONNREFUSED || len == -ENOTCONN || len == -EAGAIN)
+ return 0;
+ return len;
+}
+
+/*
+ * Timer function to close old temporary sockets, using
+ * a mark-and-sweep algorithm.
+ */
+static void
+svc_age_temp_xprts(unsigned long closure)
+{
+ struct svc_serv *serv = (struct svc_serv *)closure;
+ struct svc_xprt *xprt;
+ struct list_head *le, *next;
+ LIST_HEAD(to_be_aged);
+
+ dprintk("svc_age_temp_xprts\n");
+
+ if (!spin_trylock_bh(&serv->sv_lock)) {
+ /* busy, try again 1 sec later */
+ dprintk("svc_age_temp_xprts: busy\n");
+ mod_timer(&serv->sv_temptimer, jiffies + HZ);
+ return;
+ }
+
+ list_for_each_safe(le, next, &serv->sv_tempsocks) {
+ xprt = list_entry(le, struct svc_xprt, xpt_list);
+
+ /* First time through, just mark it OLD. Second time
+ * through, close it. */
+ if (!test_and_set_bit(XPT_OLD, &xprt->xpt_flags))
+ continue;
+ if (atomic_read(&xprt->xpt_ref.refcount) > 1
+ || test_bit(XPT_BUSY, &xprt->xpt_flags))
+ continue;
+ svc_xprt_get(xprt);
+ list_move(le, &to_be_aged);
+ set_bit(XPT_CLOSE, &xprt->xpt_flags);
+ set_bit(XPT_DETACHED, &xprt->xpt_flags);
+ }
+ spin_unlock_bh(&serv->sv_lock);
+
+ while (!list_empty(&to_be_aged)) {
+ le = to_be_aged.next;
+ /* fiddling the xpt_list node is safe 'cos we're XPT_DETACHED */
+ list_del_init(le);
+ xprt = list_entry(le, struct svc_xprt, xpt_list);
+
+ dprintk("queuing xprt %p for closing, %lu seconds old\n",
+ xprt, get_seconds() - xprt->xpt_lastrecv);
+
+ /* a thread will dequeue and close it soon */
+ svc_xprt_enqueue(xprt);
+ svc_xprt_put(xprt);
+ }
+
+ mod_timer(&serv->sv_temptimer, jiffies + svc_conn_age_period * HZ);
+}
+
+/*
+ * Remove a dead transport
+ */
+void
+svc_delete_xprt(struct svc_xprt *xprt)
+{
+ struct svc_serv *serv;
+
+ dprintk("svc: svc_delete_xprt(%p)\n", xprt);
+
+ serv = xprt->xpt_server;
+
+ xprt->xpt_ops.xpo_detach(xprt);
+
+ spin_lock_bh(&serv->sv_lock);
+
+ if (!test_and_set_bit(XPT_DETACHED, &xprt->xpt_flags))
+ list_del_init(&xprt->xpt_list);
+ /*
+ * We used to delete the transport from whichever list
+ * it's sk_xprt.xpt_ready node was on, but we don't actually
+ * need to. This is because the only time we're called
+ * while still attached to a queue, the queue itself
+ * is about to be destroyed (in svc_destroy).
+ */
+ if (!test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
+ BUG_ON(atomic_read(&xprt->xpt_ref.refcount)<2);
+ svc_xprt_put(xprt);
+ if (test_bit(XPT_TEMP, &xprt->xpt_flags))
+ serv->sv_tmpcnt--;
+ }
+
+ spin_unlock_bh(&serv->sv_lock);
+}
+
+void svc_close_xprt(struct svc_xprt *xprt)
+{
+ set_bit(XPT_CLOSE, &xprt->xpt_flags);
+ if (test_and_set_bit(XPT_BUSY, &xprt->xpt_flags))
+ /* someone else will have to effect the close */
+ return;
+
+ svc_xprt_get(xprt);
+ svc_delete_xprt(xprt);
+ clear_bit(XPT_BUSY, &xprt->xpt_flags);
+ svc_xprt_put(xprt);
+}
+
+void svc_close_all(struct list_head *xprt_list)
+{
+ struct svc_xprt *xprt;
+ struct svc_xprt *tmp;
+
+ list_for_each_entry_safe(xprt, tmp, xprt_list, xpt_list) {
+ set_bit(XPT_CLOSE, &xprt->xpt_flags);
+ if (test_bit(XPT_BUSY, &xprt->xpt_flags)) {
+ /* Waiting to be processed, but no threads left,
+ * So just remove it from the waiting list
+ */
+ list_del_init(&xprt->xpt_ready);
+ clear_bit(XPT_BUSY, &xprt->xpt_flags);
+ }
+ svc_close_xprt(xprt);
+ }
+}
+
+/*
+ * Handle defer and revisit of requests
+ */
+
+static void svc_revisit(struct cache_deferred_req *dreq, int too_many)
+{
+ struct svc_deferred_req *dr = container_of(dreq, struct svc_deferred_req, handle);
+ struct svc_xprt *xprt = dr->xprt;
+
+ if (too_many) {
+ svc_xprt_put(xprt);
+ kfree(dr);
+ return;
+ }
+ dprintk("revisit queued\n");
+ dr->xprt = NULL;
+ spin_lock(&xprt->xpt_lock);
+ list_add(&dr->handle.recent, &xprt->xpt_deferred);
+ spin_unlock(&xprt->xpt_lock);
+ set_bit(XPT_DEFERRED, &xprt->xpt_flags);
+ svc_xprt_enqueue(xprt);
+ svc_xprt_put(xprt);
+}
+
+static struct cache_deferred_req *
+svc_defer(struct cache_req *req)
+{
+ struct svc_rqst *rqstp = container_of(req, struct svc_rqst, rq_chandle);
+ int size = sizeof(struct svc_deferred_req) + (rqstp->rq_arg.len);
+ struct svc_deferred_req *dr;
+
+ if (rqstp->rq_arg.page_len)
+ return NULL; /* if more than a page, give up FIXME */
+ if (rqstp->rq_deferred) {
+ dr = rqstp->rq_deferred;
+ rqstp->rq_deferred = NULL;
+ } else {
+ int skip = rqstp->rq_arg.len - rqstp->rq_arg.head[0].iov_len;
+ /* FIXME maybe discard if size too large */
+ dr = kmalloc(size, GFP_KERNEL);
+ if (dr == NULL)
+ return NULL;
+
+ dr->handle.owner = rqstp->rq_server;
+ dr->prot = rqstp->rq_prot;
+ memcpy(&dr->addr, &rqstp->rq_addr, rqstp->rq_addrlen);
+ dr->addrlen = rqstp->rq_addrlen;
+ dr->daddr = rqstp->rq_daddr;
+ dr->argslen = rqstp->rq_arg.len >> 2;
+ memcpy(dr->args, rqstp->rq_arg.head[0].iov_base-skip, dr->argslen<<2);
+ }
+ svc_xprt_get(rqstp->rq_xprt);
+ dr->xprt = rqstp->rq_xprt;
+
+ dr->handle.revisit = svc_revisit;
+ return &dr->handle;
+}
+
+/*
+ * recv data from a deferred request into an active one
+ */
+static int svc_deferred_recv(struct svc_rqst *rqstp)
+{
+ struct svc_deferred_req *dr = rqstp->rq_deferred;
+
+ rqstp->rq_arg.head[0].iov_base = dr->args;
+ rqstp->rq_arg.head[0].iov_len = dr->argslen<<2;
+ rqstp->rq_arg.page_len = 0;
+ rqstp->rq_arg.len = dr->argslen<<2;
+ rqstp->rq_prot = dr->prot;
+ memcpy(&rqstp->rq_addr, &dr->addr, dr->addrlen);
+ rqstp->rq_addrlen = dr->addrlen;
+ rqstp->rq_daddr = dr->daddr;
+ rqstp->rq_respages = rqstp->rq_pages;
+ return dr->argslen<<2;
+}
+
+
+static struct svc_deferred_req *svc_deferred_dequeue(struct svc_xprt *xprt)
+{
+ struct svc_deferred_req *dr = NULL;
+
+ if (!test_bit(XPT_DEFERRED, &xprt->xpt_flags))
+ return NULL;
+ spin_lock(&xprt->xpt_lock);
+ clear_bit(XPT_DEFERRED, &xprt->xpt_flags);
+ if (!list_empty(&xprt->xpt_deferred)) {
+ dr = list_entry(xprt->xpt_deferred.next,
+ struct svc_deferred_req,
+ handle.recent);
+ list_del_init(&dr->handle.recent);
+ set_bit(XPT_DEFERRED, &xprt->xpt_flags);
+ }
+ spin_unlock(&xprt->xpt_lock);
+ return dr;
+}
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 5ea26b2..d0c61c6 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -79,27 +79,14 @@ #define RPCDBG_FACILITY RPCDBG_SVCXPRT

static struct svc_sock *svc_setup_socket(struct svc_serv *, struct socket *,
int *errp, int flags);
-static void svc_delete_xprt(struct svc_xprt *xprt);
static void svc_udp_data_ready(struct sock *, int);
static int svc_udp_recvfrom(struct svc_rqst *);
static int svc_udp_sendto(struct svc_rqst *);
-static void svc_close_xprt(struct svc_xprt *xprt);
static void svc_sock_detach(struct svc_xprt *);
static void svc_sock_free(struct svc_xprt *);

-static struct svc_deferred_req *svc_deferred_dequeue(struct svc_xprt *xprt);
-static int svc_deferred_recv(struct svc_rqst *rqstp);
-static struct cache_deferred_req *svc_defer(struct cache_req *req);
static struct svc_xprt *
svc_create_socket(struct svc_serv *, int, struct sockaddr *, int, int);
-static void svc_age_temp_xprts(unsigned long closure);
-
-/* apparently the "standard" is that clients close
- * idle connections after 5 minutes, servers after
- * 6 minutes
- * http://www.connectathon.org/talks96/nfstcp.pdf
- */
-static int svc_conn_age_period = 6*60;

#ifdef CONFIG_DEBUG_LOCK_ALLOC
static struct lock_class_key svc_key[2];
@@ -166,27 +153,6 @@ char *svc_print_addr(struct svc_rqst *rq
EXPORT_SYMBOL_GPL(svc_print_addr);

/*
- * Queue up an idle server thread. Must have pool->sp_lock held.
- * Note: this is really a stack rather than a queue, so that we only
- * use as many different threads as we need, and the rest don't pollute
- * the cache.
- */
-static inline void
-svc_thread_enqueue(struct svc_pool *pool, struct svc_rqst *rqstp)
-{
- list_add(&rqstp->rq_list, &pool->sp_threads);
-}
-
-/*
- * Dequeue an nfsd thread. Must have pool->sp_lock held.
- */
-static inline void
-svc_thread_dequeue(struct svc_pool *pool, struct svc_rqst *rqstp)
-{
- list_del(&rqstp->rq_list);
-}
-
-/*
* Release an skbuff after use
*/
static void
@@ -224,219 +190,6 @@ svc_sock_wspace(struct svc_sock *svsk)
return wspace;
}

-/*
- * Queue up a socket with data pending. If there are idle nfsd
- * processes, wake 'em up.
- *
- */
-void
-svc_xprt_enqueue(struct svc_xprt *xprt)
-{
- struct svc_serv *serv = xprt->xpt_server;
- struct svc_pool *pool;
- struct svc_rqst *rqstp;
- int cpu;
-
- if (!(xprt->xpt_flags &
- ( (1<<XPT_CONN)|(1<<XPT_DATA)|(1<<XPT_CLOSE)|(1<<XPT_DEFERRED)) ))
- return;
- if (test_bit(XPT_DEAD, &xprt->xpt_flags))
- return;
-
- cpu = get_cpu();
- pool = svc_pool_for_cpu(xprt->xpt_server, cpu);
- put_cpu();
-
- spin_lock_bh(&pool->sp_lock);
-
- if (!list_empty(&pool->sp_threads) &&
- !list_empty(&pool->sp_sockets))
- printk(KERN_ERR
- "svc_xprt_enqueue: threads and sockets both waiting??\n");
-
- if (test_bit(XPT_DEAD, &xprt->xpt_flags)) {
- /* Don't enqueue dead sockets */
- dprintk("svc: transport %p is dead, not enqueued\n", xprt);
- goto out_unlock;
- }
-
- /* Mark socket as busy. It will remain in this state until the
- * server has processed all pending data and put the socket back
- * on the idle list. We update XPT_BUSY atomically because
- * it also guards against trying to enqueue the svc_sock twice.
- */
- if (test_and_set_bit(XPT_BUSY, &xprt->xpt_flags)) {
- /* Don't enqueue socket while already enqueued */
- dprintk("svc: transport %p busy, not enqueued\n", xprt);
- goto out_unlock;
- }
- BUG_ON(xprt->xpt_pool != NULL);
- xprt->xpt_pool = pool;
-
- /* Handle pending connection */
- if (test_bit(XPT_CONN, &xprt->xpt_flags))
- goto process;
-
- /* Handle close in-progress */
- if (test_bit(XPT_CLOSE, &xprt->xpt_flags))
- goto process;
-
- /* Check if we have space to reply to a request */
- if (!xprt->xpt_ops.xpo_has_wspace(xprt)) {
- /* Don't enqueue while not enough space for reply */
- dprintk("svc: no write space, transport %p not enqueued\n", xprt);
- xprt->xpt_pool = NULL;
- clear_bit(XPT_BUSY, &xprt->xpt_flags);
- goto out_unlock;
- }
-
- process:
- if (!list_empty(&pool->sp_threads)) {
- rqstp = list_entry(pool->sp_threads.next,
- struct svc_rqst,
- rq_list);
- dprintk("svc: transport %p served by daemon %p\n",
- xprt, rqstp);
- svc_thread_dequeue(pool, rqstp);
- if (rqstp->rq_xprt)
- printk(KERN_ERR
- "svc_xprt_enqueue: server %p, rq_xprt=%p!\n",
- rqstp, rqstp->rq_xprt);
- rqstp->rq_xprt = xprt;
- svc_xprt_get(xprt);
- rqstp->rq_reserved = serv->sv_max_mesg;
- atomic_add(rqstp->rq_reserved, &xprt->xpt_reserved);
- BUG_ON(xprt->xpt_pool != pool);
- wake_up(&rqstp->rq_wait);
- } else {
- dprintk("svc: transport %p put into queue\n", xprt);
- list_add_tail(&xprt->xpt_ready, &pool->sp_sockets);
- BUG_ON(xprt->xpt_pool != pool);
- }
-
-out_unlock:
- spin_unlock_bh(&pool->sp_lock);
-}
-EXPORT_SYMBOL_GPL(svc_xprt_enqueue);
-
-/*
- * Dequeue the first socket. Must be called with the pool->sp_lock held.
- */
-static inline struct svc_xprt *
-svc_xprt_dequeue(struct svc_pool *pool)
-{
- struct svc_xprt *xprt;
-
- if (list_empty(&pool->sp_sockets))
- return NULL;
-
- xprt = list_entry(pool->sp_sockets.next,
- struct svc_xprt, xpt_ready);
- list_del_init(&xprt->xpt_ready);
-
- dprintk("svc: transport %p dequeued, inuse=%d\n",
- xprt, atomic_read(&xprt->xpt_ref.refcount));
-
- return xprt;
-}
-
-/*
- * Having read something from a socket, check whether it
- * needs to be re-enqueued.
- * Note: XPT_DATA only gets cleared when a read-attempt finds
- * no (or insufficient) data.
- */
-void
-svc_xprt_received(struct svc_xprt *xprt)
-{
- xprt->xpt_pool = NULL;
- clear_bit(XPT_BUSY, &xprt->xpt_flags);
- svc_xprt_enqueue(xprt);
-}
-EXPORT_SYMBOL_GPL(svc_xprt_received);
-
-/**
- * svc_reserve - change the space reserved for the reply to a request.
- * @rqstp: The request in question
- * @space: new max space to reserve
- *
- * Each request reserves some space on the output queue of the socket
- * to make sure the reply fits. This function reduces that reserved
- * space to be the amount of space used already, plus @space.
- *
- */
-void svc_reserve(struct svc_rqst *rqstp, int space)
-{
- space += rqstp->rq_res.head[0].iov_len;
-
- if (space < rqstp->rq_reserved) {
- struct svc_xprt *xprt = rqstp->rq_xprt;
- atomic_sub((rqstp->rq_reserved - space), &xprt->xpt_reserved);
- rqstp->rq_reserved = space;
-
- svc_xprt_enqueue(xprt);
- }
-}
-
-static void
-svc_xprt_release(struct svc_rqst *rqstp)
-{
- struct svc_xprt *xprt = rqstp->rq_xprt;
-
- rqstp->rq_xprt->xpt_ops.xpo_release(rqstp);
-
- svc_free_res_pages(rqstp);
- rqstp->rq_res.page_len = 0;
- rqstp->rq_res.page_base = 0;
-
- /* Reset response buffer and release
- * the reservation.
- * But first, check that enough space was reserved
- * for the reply, otherwise we have a bug!
- */
- if ((rqstp->rq_res.len) > rqstp->rq_reserved)
- printk(KERN_ERR "RPC request reserved %d but used %d\n",
- rqstp->rq_reserved,
- rqstp->rq_res.len);
-
- rqstp->rq_res.head[0].iov_len = 0;
- svc_reserve(rqstp, 0);
- rqstp->rq_xprt = NULL;
-
- svc_xprt_put(xprt);
-}
-
-/*
- * External function to wake up a server waiting for data
- * This really only makes sense for services like lockd
- * which have exactly one thread anyway.
- */
-void
-svc_wake_up(struct svc_serv *serv)
-{
- struct svc_rqst *rqstp;
- unsigned int i;
- struct svc_pool *pool;
-
- for (i = 0; i < serv->sv_nrpools; i++) {
- pool = &serv->sv_pools[i];
-
- spin_lock_bh(&pool->sp_lock);
- if (!list_empty(&pool->sp_threads)) {
- rqstp = list_entry(pool->sp_threads.next,
- struct svc_rqst,
- rq_list);
- dprintk("svc: daemon %p woken up.\n", rqstp);
- /*
- svc_thread_dequeue(pool, rqstp);
- rqstp->rq_xprt = NULL;
- */
- wake_up(&rqstp->rq_wait);
- }
- spin_unlock_bh(&pool->sp_lock);
- }
-}
-
union svc_pktinfo_u {
struct in_pktinfo pkti;
struct in6_pktinfo pkti6;
@@ -1014,20 +767,6 @@ svc_tcp_data_ready(struct sock *sk, int
wake_up_interruptible(sk->sk_sleep);
}

-static inline int svc_port_is_privileged(struct sockaddr *sin)
-{
- switch (sin->sa_family) {
- case AF_INET:
- return ntohs(((struct sockaddr_in *)sin)->sin_port)
- < PROT_SOCK;
- case AF_INET6:
- return ntohs(((struct sockaddr_in6 *)sin)->sin6_port)
- < PROT_SOCK;
- default:
- return 0;
- }
-}
-
/*
* Accept a TCP connection
*/
@@ -1424,333 +1163,6 @@ svc_sock_update_bufs(struct svc_serv *se
spin_unlock_bh(&serv->sv_lock);
}

-static void
-svc_check_conn_limits(struct svc_serv *serv)
-{
- char buf[RPC_MAX_ADDRBUFLEN];
-
- /* make sure that we don't have too many active connections.
- * If we have, something must be dropped.
- *
- * There's no point in trying to do random drop here for
- * DoS prevention. The NFS clients does 1 reconnect in 15
- * seconds. An attacker can easily beat that.
- *
- * The only somewhat efficient mechanism would be if drop
- * old connections from the same IP first. But right now
- * we don't even record the client IP in svc_sock.
- */
- if (serv->sv_tmpcnt > (serv->sv_nrthreads+3)*20) {
- struct svc_sock *svsk = NULL;
- spin_lock_bh(&serv->sv_lock);
- if (!list_empty(&serv->sv_tempsocks)) {
- if (net_ratelimit()) {
- /* Try to help the admin */
- printk(KERN_NOTICE "%s: too many open TCP "
- "sockets, consider increasing the "
- "number of nfsd threads\n",
- serv->sv_name);
- printk(KERN_NOTICE
- "%s: last TCP connect from %s\n",
- serv->sv_name, buf);
- }
- /*
- * Always select the oldest socket. It's not fair,
- * but so is life
- */
- svsk = list_entry(serv->sv_tempsocks.prev,
- struct svc_sock,
- sk_xprt.xpt_list);
- set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags);
- svc_xprt_get(&svsk->sk_xprt);
- }
- spin_unlock_bh(&serv->sv_lock);
-
- if (svsk) {
- svc_xprt_enqueue(&svsk->sk_xprt);
- svc_xprt_put(&svsk->sk_xprt);
- }
- }
-}
-
-static void inline svc_copy_addr(struct svc_rqst *rqstp, struct svc_xprt *xprt)
-{
- struct sockaddr *sin;
-
- /* sock_recvmsg doesn't fill in the name/namelen, so we must..
- */
- memcpy(&rqstp->rq_addr, &xprt->xpt_remote, xprt->xpt_remotelen);
- rqstp->rq_addrlen = xprt->xpt_remotelen;
-
- /* Destination address in request is needed for binding the
- * source address in RPC callbacks later.
- */
- sin = (struct sockaddr *)&xprt->xpt_local;
- switch (sin->sa_family) {
- case AF_INET:
- rqstp->rq_daddr.addr = ((struct sockaddr_in *)sin)->sin_addr;
- break;
- case AF_INET6:
- rqstp->rq_daddr.addr6 = ((struct sockaddr_in6 *)sin)->sin6_addr;
- break;
- }
-}
-
-/*
- * Receive the next request on any socket. This code is carefully
- * organised not to touch any cachelines in the shared svc_serv
- * structure, only cachelines in the local svc_pool.
- */
-int
-svc_recv(struct svc_rqst *rqstp, long timeout)
-{
- struct svc_xprt *xprt = NULL;
- struct svc_serv *serv = rqstp->rq_server;
- struct svc_pool *pool = rqstp->rq_pool;
- int len, i;
- int pages;
- struct xdr_buf *arg;
- DECLARE_WAITQUEUE(wait, current);
-
- dprintk("svc: server %p waiting for data (to = %ld)\n",
- rqstp, timeout);
-
- if (rqstp->rq_xprt)
- printk(KERN_ERR
- "svc_recv: service %p, transport not NULL!\n",
- rqstp);
- if (waitqueue_active(&rqstp->rq_wait))
- printk(KERN_ERR
- "svc_recv: service %p, wait queue active!\n",
- rqstp);
-
-
- /* now allocate needed pages. If we get a failure, sleep briefly */
- pages = (serv->sv_max_mesg + PAGE_SIZE) / PAGE_SIZE;
- for (i=0; i < pages ; i++)
- while (rqstp->rq_pages[i] == NULL) {
- struct page *p = alloc_page(GFP_KERNEL);
- if (!p)
- schedule_timeout_uninterruptible(msecs_to_jiffies(500));
- rqstp->rq_pages[i] = p;
- }
- rqstp->rq_pages[i++] = NULL; /* this might be seen in nfs_read_actor */
- BUG_ON(pages >= RPCSVC_MAXPAGES);
-
- /* Make arg->head point to first page and arg->pages point to rest */
- arg = &rqstp->rq_arg;
- arg->head[0].iov_base = page_address(rqstp->rq_pages[0]);
- arg->head[0].iov_len = PAGE_SIZE;
- arg->pages = rqstp->rq_pages + 1;
- arg->page_base = 0;
- /* save at least one page for response */
- arg->page_len = (pages-2)*PAGE_SIZE;
- arg->len = (pages-1)*PAGE_SIZE;
- arg->tail[0].iov_len = 0;
-
- try_to_freeze();
- cond_resched();
- if (signalled())
- return -EINTR;
-
- spin_lock_bh(&pool->sp_lock);
- if ((xprt = svc_xprt_dequeue(pool)) != NULL) {
- rqstp->rq_xprt = xprt;
- svc_xprt_get(xprt);
- rqstp->rq_reserved = serv->sv_max_mesg;
- atomic_add(rqstp->rq_reserved, &xprt->xpt_reserved);
- } else {
- /* No data pending. Go to sleep */
- svc_thread_enqueue(pool, rqstp);
-
- /*
- * We have to be able to interrupt this wait
- * to bring down the daemons ...
- */
- set_current_state(TASK_INTERRUPTIBLE);
- add_wait_queue(&rqstp->rq_wait, &wait);
- spin_unlock_bh(&pool->sp_lock);
-
- schedule_timeout(timeout);
-
- try_to_freeze();
-
- spin_lock_bh(&pool->sp_lock);
- remove_wait_queue(&rqstp->rq_wait, &wait);
-
- if (!(xprt = rqstp->rq_xprt)) {
- svc_thread_dequeue(pool, rqstp);
- spin_unlock_bh(&pool->sp_lock);
- dprintk("svc: server %p, no data yet\n", rqstp);
- return signalled()? -EINTR : -EAGAIN;
- }
- }
- spin_unlock_bh(&pool->sp_lock);
-
- len = 0;
- if (test_bit(XPT_CLOSE, &xprt->xpt_flags)) {
- dprintk("svc_recv: found XPT_CLOSE\n");
- svc_delete_xprt(xprt);
- } else if (test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
- struct svc_xprt *newxpt;
- newxpt = xprt->xpt_ops.xpo_accept(xprt);
- if (newxpt) {
- svc_xprt_received(newxpt);
- /*
- * We know this module_get will succeed because the
- * listener holds a reference too
- */
- __module_get(newxpt->xpt_class->xcl_owner);
- svc_check_conn_limits(xprt->xpt_server);
- spin_lock_bh(&serv->sv_lock);
- set_bit(XPT_TEMP, &newxpt->xpt_flags);
- list_add(&newxpt->xpt_list, &serv->sv_tempsocks);
- serv->sv_tmpcnt++;
- if (serv->sv_temptimer.function == NULL) {
- /* setup timer to age temp sockets */
- setup_timer(&serv->sv_temptimer, svc_age_temp_xprts,
- (unsigned long)serv);
- mod_timer(&serv->sv_temptimer,
- jiffies + svc_conn_age_period * HZ);
- }
- spin_unlock_bh(&serv->sv_lock);
- }
- svc_xprt_received(xprt);
- } else {
- dprintk("svc: server %p, pool %u, transport %p, inuse=%d\n",
- rqstp, pool->sp_id, xprt,
- atomic_read(&xprt->xpt_ref.refcount));
-
- if ((rqstp->rq_deferred = svc_deferred_dequeue(xprt))) {
- svc_xprt_received(xprt);
- len = svc_deferred_recv(rqstp);
- } else
- len = xprt->xpt_ops.xpo_recvfrom(rqstp);
- svc_copy_addr(rqstp, xprt);
- dprintk("svc: got len=%d\n", len);
- }
-
- /* No data, incomplete (TCP) read, or accept() */
- if (len == 0 || len == -EAGAIN) {
- rqstp->rq_res.len = 0;
- svc_xprt_release(rqstp);
- return -EAGAIN;
- }
- xprt->xpt_lastrecv = get_seconds();
- clear_bit(XPT_OLD, &xprt->xpt_flags);
-
- rqstp->rq_secure = svc_port_is_privileged(svc_addr(rqstp));
- rqstp->rq_chandle.defer = svc_defer;
-
- if (serv->sv_stats)
- serv->sv_stats->netcnt++;
- return len;
-}
-
-/*
- * Drop request
- */
-void
-svc_drop(struct svc_rqst *rqstp)
-{
- dprintk("svc: xprt %p dropped request\n", rqstp->rq_xprt);
- svc_xprt_release(rqstp);
-}
-
-/*
- * Return reply to client.
- */
-int
-svc_send(struct svc_rqst *rqstp)
-{
- struct svc_xprt *xprt;
- int len;
- struct xdr_buf *xb;
-
- if ((xprt = rqstp->rq_xprt) == NULL) {
- printk(KERN_WARNING "NULL transport pointer in %s:%d\n",
- __FILE__, __LINE__);
- return -EFAULT;
- }
-
- /* release the receive skb before sending the reply */
- rqstp->rq_xprt->xpt_ops.xpo_release(rqstp);
-
- /* calculate over-all length */
- xb = & rqstp->rq_res;
- xb->len = xb->head[0].iov_len +
- xb->page_len +
- xb->tail[0].iov_len;
-
- /* Grab mutex to serialize outgoing data. */
- mutex_lock(&xprt->xpt_mutex);
- if (test_bit(XPT_DEAD, &xprt->xpt_flags))
- len = -ENOTCONN;
- else
- len = xprt->xpt_ops.xpo_sendto(rqstp);
- mutex_unlock(&xprt->xpt_mutex);
- svc_xprt_release(rqstp);
-
- if (len == -ECONNREFUSED || len == -ENOTCONN || len == -EAGAIN)
- return 0;
- return len;
-}
-
-/*
- * Timer function to close old temporary sockets, using
- * a mark-and-sweep algorithm.
- */
-static void
-svc_age_temp_xprts(unsigned long closure)
-{
- struct svc_serv *serv = (struct svc_serv *)closure;
- struct svc_xprt *xprt;
- struct list_head *le, *next;
- LIST_HEAD(to_be_aged);
-
- dprintk("svc_age_temp_xprts\n");
-
- if (!spin_trylock_bh(&serv->sv_lock)) {
- /* busy, try again 1 sec later */
- dprintk("svc_age_temp_xprts: busy\n");
- mod_timer(&serv->sv_temptimer, jiffies + HZ);
- return;
- }
-
- list_for_each_safe(le, next, &serv->sv_tempsocks) {
- xprt = list_entry(le, struct svc_xprt, xpt_list);
-
- /* First time through, just mark it OLD. Second time
- * through, close it. */
- if (!test_and_set_bit(XPT_OLD, &xprt->xpt_flags))
- continue;
- if (atomic_read(&svsk->sk_xprt.xpt_ref.refcount) > 1
- || test_bit(SK_BUSY, &svsk->sk_flags))
- continue;
- svc_xprt_get(xprt);
- list_move(le, &to_be_aged);
- set_bit(XPT_CLOSE, &xprt->xpt_flags);
- set_bit(XPT_DETACHED, &xprt->xpt_flags);
- }
- spin_unlock_bh(&serv->sv_lock);
-
- while (!list_empty(&to_be_aged)) {
- le = to_be_aged.next;
- /* fiddling the xpt_list node is safe 'cos we're XPT_DETACHED */
- list_del_init(le);
- xprt = list_entry(le, struct svc_xprt, xpt_list);
-
- dprintk("queuing svsk %p for closing, %lu seconds old\n",
- xprt, get_seconds() - xprt->xpt_lastrecv);
-
- /* a thread will dequeue and close it soon */
- svc_xprt_enqueue(xprt);
- svc_xprt_put(xprt);
- }
-
- mod_timer(&serv->sv_temptimer, jiffies + svc_conn_age_period * HZ);
-}
-
/*
* Initialize socket for RPC use and create svc_sock struct
* XXX: May want to setsockopt SO_SNDBUF and SO_RCVBUF.
@@ -1934,165 +1346,3 @@ svc_sock_free(struct svc_xprt *xprt)
kfree(svsk);
}

-/*
- * Remove a dead transport
- */
-static void
-svc_delete_xprt(struct svc_xprt *xprt)
-{
- struct svc_serv *serv;
-
- dprintk("svc: svc_delete_xprt(%p)\n", xprt);
-
- serv = xprt->xpt_server;
-
- xprt->xpt_ops.xpo_detach(xprt);
-
- spin_lock_bh(&serv->sv_lock);
-
- if (!test_and_set_bit(XPT_DETACHED, &xprt->xpt_flags))
- list_del_init(&xprt->xpt_list);
- /*
- * We used to delete the transport from whichever list
- * it's sk_xprt.xpt_ready node was on, but we don't actually
- * need to. This is because the only time we're called
- * while still attached to a queue, the queue itself
- * is about to be destroyed (in svc_destroy).
- */
- if (!test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
- BUG_ON(atomic_read(&xprt->xpt_ref.refcount)<2);
- svc_xprt_put(xprt);
- if (test_bit(XPT_TEMP, &xprt->xpt_flags))
- serv->sv_tmpcnt--;
- }
-
- spin_unlock_bh(&serv->sv_lock);
-}
-
-static void svc_close_xprt(struct svc_xprt *xprt)
-{
- set_bit(XPT_CLOSE, &xprt->xpt_flags);
- if (test_and_set_bit(XPT_BUSY, &xprt->xpt_flags))
- /* someone else will have to effect the close */
- return;
-
- svc_xprt_get(xprt);
- svc_delete_xprt(xprt);
- clear_bit(XPT_BUSY, &xprt->xpt_flags);
- svc_xprt_put(xprt);
-}
-
-void svc_close_all(struct list_head *xprt_list)
-{
- struct svc_xprt *xprt;
- struct svc_xprt *tmp;
-
- list_for_each_entry_safe(xprt, tmp, xprt_list, xpt_list) {
- set_bit(XPT_CLOSE, &xprt->xpt_flags);
- if (test_bit(XPT_BUSY, &xprt->xpt_flags)) {
- /* Waiting to be processed, but no threads left,
- * So just remove it from the waiting list
- */
- list_del_init(&xprt->xpt_ready);
- clear_bit(XPT_BUSY, &xprt->xpt_flags);
- }
- svc_close_xprt(xprt);
- }
-}
-
-/*
- * Handle defer and revisit of requests
- */
-
-static void svc_revisit(struct cache_deferred_req *dreq, int too_many)
-{
- struct svc_deferred_req *dr = container_of(dreq, struct svc_deferred_req, handle);
- struct svc_xprt *xprt = dr->xprt;
-
- if (too_many) {
- svc_xprt_put(xprt);
- kfree(dr);
- return;
- }
- dprintk("revisit queued\n");
- dr->xprt = NULL;
- spin_lock(&xprt->xpt_lock);
- list_add(&dr->handle.recent, &xprt->xpt_deferred);
- spin_unlock(&xprt->xpt_lock);
- set_bit(XPT_DEFERRED, &xprt->xpt_flags);
- svc_xprt_enqueue(xprt);
- svc_xprt_put(xprt);
-}
-
-static struct cache_deferred_req *
-svc_defer(struct cache_req *req)
-{
- struct svc_rqst *rqstp = container_of(req, struct svc_rqst, rq_chandle);
- int size = sizeof(struct svc_deferred_req) + (rqstp->rq_arg.len);
- struct svc_deferred_req *dr;
-
- if (rqstp->rq_arg.page_len)
- return NULL; /* if more than a page, give up FIXME */
- if (rqstp->rq_deferred) {
- dr = rqstp->rq_deferred;
- rqstp->rq_deferred = NULL;
- } else {
- int skip = rqstp->rq_arg.len - rqstp->rq_arg.head[0].iov_len;
- /* FIXME maybe discard if size too large */
- dr = kmalloc(size, GFP_KERNEL);
- if (dr == NULL)
- return NULL;
-
- dr->handle.owner = rqstp->rq_server;
- dr->prot = rqstp->rq_prot;
- memcpy(&dr->addr, &rqstp->rq_addr, rqstp->rq_addrlen);
- dr->addrlen = rqstp->rq_addrlen;
- dr->daddr = rqstp->rq_daddr;
- dr->argslen = rqstp->rq_arg.len >> 2;
- memcpy(dr->args, rqstp->rq_arg.head[0].iov_base-skip, dr->argslen<<2);
- }
- svc_xprt_get(rqstp->rq_xprt);
- dr->xprt = rqstp->rq_xprt;
-
- dr->handle.revisit = svc_revisit;
- return &dr->handle;
-}
-
-/*
- * recv data from a deferred request into an active one
- */
-static int svc_deferred_recv(struct svc_rqst *rqstp)
-{
- struct svc_deferred_req *dr = rqstp->rq_deferred;
-
- rqstp->rq_arg.head[0].iov_base = dr->args;
- rqstp->rq_arg.head[0].iov_len = dr->argslen<<2;
- rqstp->rq_arg.page_len = 0;
- rqstp->rq_arg.len = dr->argslen<<2;
- rqstp->rq_prot = dr->prot;
- memcpy(&rqstp->rq_addr, &dr->addr, dr->addrlen);
- rqstp->rq_addrlen = dr->addrlen;
- rqstp->rq_daddr = dr->daddr;
- rqstp->rq_respages = rqstp->rq_pages;
- return dr->argslen<<2;
-}
-
-
-static struct svc_deferred_req *svc_deferred_dequeue(struct svc_xprt *xprt)
-{
- struct svc_deferred_req *dr = NULL;
-
- if (!test_bit(XPT_DEFERRED, &xprt->xpt_flags))
- return NULL;
- spin_lock(&xprt->xpt_lock);
- clear_bit(XPT_DEFERRED, &xprt->xpt_flags);
- if (!list_empty(&xprt->xpt_deferred)) {
- dr = list_entry(xprt->xpt_deferred.next,
- struct svc_deferred_req,
- handle.recent);
- list_del_init(&dr->handle.recent);
- set_bit(XPT_DEFERRED, &xprt->xpt_flags);
- }
- spin_unlock(&xprt->xpt_lock);
- return dr;
-}


2007-09-27 05:02:45

by Tom Tucker

Subject: [RFC, PATCH 33/33] knfsd: Support adding transports by writing portlist file


Update the write handler for the portlist file to allow creating new
listening endpoints on a transport. The general form of the string is:

<transport_name><space><port number>

For example:

tcp 2049

This is intended to support the creation of a listening endpoint for
RDMA transports without adding #ifdef code to the nfssvc.c file.
The general idea is that the rpc.nfsd program would read the transports
file and then write the portlist file to create listening endpoints
for all or selected transports. The current mechanism of writing an
fd would become obsolete.

Signed-off-by: Tom Tucker <[email protected]>
---

fs/nfsd/nfsctl.c | 16 ++++++++++++++++
1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index baac89d..923b817 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -554,6 +554,22 @@ static ssize_t write_ports(struct file *
kfree(toclose);
return len;
}
+ /*
+ * Add a transport listener by writing its transport name
+ */
+ if (isalnum(buf[0])) {
+ int err;
+ char transport[16];
+ int port;
+ if (sscanf(buf, "%15s %4d", transport, &port) == 2) {
+ err = nfsd_create_serv();
+ if (!err)
+ err = svc_create_xprt(nfsd_serv,
+ transport, port,
+ SVC_SOCK_ANONYMOUS);
+ return err < 0 ? err : 0;
+ }
+ }
return -EINVAL;
}
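
To illustrate the "<transport_name><space><port number>" format described in the commit message, here is a minimal user-space sketch of parsing such a line. The helper name parse_portlist_entry, the TRANSPORT_NAME_MAX constant, and the 1..65535 range check are assumptions made for illustration and are not part of the patch. The sketch reads the port with "%5d" so that five-digit ports fit, whereas the patch above uses "%4d".

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical constant: matches the 16-byte buffer used in the patch. */
#define TRANSPORT_NAME_MAX 16

/*
 * Hypothetical helper, not part of the patch: parse "<transport> <port>".
 * Returns 0 on success, -1 if the line does not match the expected form.
 */
static int parse_portlist_entry(const char *buf,
				char transport[TRANSPORT_NAME_MAX],
				int *port)
{
	int p;

	assert(buf != NULL && transport != NULL && port != NULL);

	/* %15s leaves room for the terminating NUL in a 16-byte buffer. */
	if (sscanf(buf, "%15s %5d", transport, &p) != 2)
		return -1;

	/* Assumed sanity check: accept only ports in the range 1..65535. */
	if (p < 1 || p > 65535)
		return -1;

	*port = p;
	return 0;
}
```

With this helper, "tcp 2049" yields the transport name "tcp" and port 2049, while a line with no port is rejected.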



2007-09-27 17:55:42

by J. Bruce Fields

Subject: Re: [RFC,PATCH 00/33] SVC Transport Switch

On Wed, Sep 26, 2007 at 11:57:51PM -0500, Tom Tucker wrote:
> The following series implements a pluggable transport switch for
> RPC servers.

Seeing as this seems not to break anything obvious (well, it compiles
anyway), and people seem to agree we'll merge some version of this
eventually, I've added it to for-mm in hopes it'll get included in an
-mm release soon.

I'll take an hour to read through it some time just for fun, but I'm
depending on Neil and others for serious review in this case. (One
trivial complaint: git's complaining about lines that add trailing
whitespace. Might want to run scripts/checkpatch.pl, feeling free to
ignore any false positives.)

--b.

> The biggest changes in this latest incarnation
> are as follows:
>
> - The overall design of the switch has been modified to be more similar
> to the client side, e.g.
> - There is a transport class structure svc_xprt_class, and
> - A transport independent structure is manipulated by xprt
> independent code (svc_xprt)
> - Further consolidation of transport independent logic out of
> transport providers and into transport independent code.
> - Transport independent code has been broken out into a separate file
> - Transport independent functions previously adorned with _sock_ have
> had their names changed, e.g. svc_sock_enqueue
> - atomic refcounts have been changed to krefs
>
> The patchset is large (33 patches). There are some things that I would like to
> do that I didn't do because the patchset is already big. For example, normalize
> the creation of nfsd listening endpoints using writes to the portlist file.
>
> I've attempted to organize the patchset such that logical changes are
> clearly reviewable without too much clutter from functionally empty name
> changes. This was somewhat awkward since intermediate patches may look
> ugly/broken/incomplete to some reviewers. This was to avoid losing the
> context of a change while keeping each patch a reasonable size. For example,
> making svc_recv transport independent and moving it to the svc_xprt file
> cannot be done in the same patch without losing the diffs to the svc_recv
> function.
>
> This patchset has had limited testing with TCP/UDP. In this case, the tests
> included connectathon and building the kernel on an NFS mount running on the
> transport switch.
>
> This patchset is against the 2.6.23-rc8 kernel tree.
>
> --
> Signed-off-by: Tom Tucker <[email protected]>


2007-09-28 02:58:52

by NeilBrown

Subject: Re: [RFC, PATCH 06/33] svc: Add transport specific xpo_release function

On Thursday September 27, [email protected] wrote:
>
> The svc_sock_release function releases pages allocated to a thread. For
> UDP, this also returns the receive skb to the stack. For RDMA it will
> post a receive WR and bump the client credit count.
>
..
> diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
> index 37f7448..cfb2652 100644
> --- a/include/linux/sunrpc/svc.h
> +++ b/include/linux/sunrpc/svc.h
> @@ -217,7 +217,7 @@ struct svc_rqst {
> struct auth_ops * rq_authop; /* authentication flavour */
> u32 rq_flavor; /* pseudoflavor */
> struct svc_cred rq_cred; /* auth info */
> - struct sk_buff * rq_skbuff; /* fast recv inet buffer */
> + void * rq_xprt_ctxt; /* transport specific context ptr */
> struct svc_deferred_req*rq_deferred; /* deferred request we are replaying */
>
> struct xdr_buf rq_arg;
..
> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> index cc8c7ce..e7d203a 100644
> --- a/net/sunrpc/svcsock.c
> +++ b/net/sunrpc/svcsock.c
> @@ -184,14 +184,14 @@ svc_thread_dequeue(struct svc_pool *pool
> /*
> * Release an skbuff after use
> */
> -static inline void
> +static void
> svc_release_skb(struct svc_rqst *rqstp)
> {
> - struct sk_buff *skb = rqstp->rq_skbuff;
> + struct sk_buff *skb = (struct sk_buff *)rqstp->rq_xprt_ctxt;

Minor style point: We don't cast void* in the kernel.

> struct svc_deferred_req *dr = rqstp->rq_deferred;
>
> if (skb) {
> - rqstp->rq_skbuff = NULL;
> + rqstp->rq_xprt_ctxt = NULL;
>
> dprintk("svc: service %p, releasing skb %p\n", rqstp, skb);
> skb_free_datagram(rqstp->rq_sock->sk_sk, skb);
> @@ -394,7 +394,7 @@ svc_sock_release(struct svc_rqst *rqstp)
> {
> struct svc_sock *svsk = rqstp->rq_sock;
>
> - svc_release_skb(rqstp);
> + rqstp->rq_xprt->xpt_ops.xpo_release(rqstp);

These are somewhat ugly, aren't they?
What would you think of giving rqstp a pointer directly to xpt_ops to
avoid the double indirection?

NeilBrown
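
The idea of giving rqstp a direct pointer to the ops table could be sketched as follows, using simplified user-space stand-ins for the kernel structures. The rq_xprt_ops field and the helpers svc_rqst_bind, svc_rqst_release and mock_release are hypothetical names chosen for illustration; the real struct svc_rqst and struct svc_xprt carry many more fields, and this is not necessarily how the suggestion would be implemented in the series.

```c
#include <assert.h>
#include <stddef.h>

struct svc_rqst;

/* Simplified stand-in for the per-transport operations table. */
struct svc_xprt_ops {
	void (*xpo_release)(struct svc_rqst *rqstp);
};

/* Simplified stand-in for the transport; it embeds its ops as in the patch. */
struct svc_xprt {
	struct svc_xprt_ops xpt_ops;
};

/* Simplified stand-in for the request. rq_xprt_ops is a hypothetical field. */
struct svc_rqst {
	struct svc_xprt *rq_xprt;
	const struct svc_xprt_ops *rq_xprt_ops;	/* cached &rq_xprt->xpt_ops */
	void *rq_xprt_ctxt;
};

/* Hypothetical helper: attach a transport and cache its ops pointer once. */
static void svc_rqst_bind(struct svc_rqst *rqstp, struct svc_xprt *xprt)
{
	assert(rqstp != NULL && xprt != NULL);
	rqstp->rq_xprt = xprt;
	rqstp->rq_xprt_ops = &xprt->xpt_ops;
}

/* Example release callback: drops the transport-specific context. */
static void mock_release(struct svc_rqst *rqstp)
{
	rqstp->rq_xprt_ctxt = NULL;
}

/*
 * The call site now reads rqstp->rq_xprt_ops->xpo_release(rqstp)
 * rather than rqstp->rq_xprt->xpt_ops.xpo_release(rqstp).
 */
static void svc_rqst_release(struct svc_rqst *rqstp)
{
	assert(rqstp != NULL && rqstp->rq_xprt_ops != NULL);
	rqstp->rq_xprt_ops->xpo_release(rqstp);
}
```

The trade-off is one extra pointer per request that has to be kept in sync whenever rq_xprt changes.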


2007-09-28 03:03:50

by NeilBrown

Subject: Re: [RFC, PATCH 09/33] svc: Add a transport function that checks for write space

On Thursday September 27, [email protected] wrote:
> @@ -898,6 +900,25 @@ svc_udp_prep_reply_hdr(struct svc_rqst *
> {
> }
>
> +static int
> +svc_udp_has_wspace(struct svc_xprt *xprt)
> +{
> + struct svc_sock *svsk = (struct svc_sock*)xprt;
> + struct svc_serv *serv = svsk->sk_server;
> + int required;
> +
> + /*
> + * Set the SOCK_NOSPACE flag before checking the available
> + * sock space.
> + */
> + set_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
> + required = atomic_read(&svsk->sk_reserved) + serv->sv_max_mesg;
> + if (required*2 > sock_wspace(svsk->sk_sk))
> + return 0;
> + clear_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
> + return 1;
> +}
> +
> static struct svc_xprt_ops svc_udp_ops = {
> .xpo_recvfrom = svc_udp_recvfrom,
> .xpo_sendto = svc_udp_sendto,
> @@ -1368,6 +1390,25 @@ svc_tcp_prep_reply_hdr(struct svc_rqst *
> svc_putnl(resv, 0);
> }
>
> +static int
> +svc_tcp_has_wspace(struct svc_xprt *xprt)
> +{
> + struct svc_sock *svsk = (struct svc_sock*)xprt;
> + struct svc_serv *serv = svsk->sk_server;
> + int required;
> +
> + /*
> + * Set the SOCK_NOSPACE flag before checking the available
> + * sock space.
> + */
> + set_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
> + required = atomic_read(&svsk->sk_reserved) + serv->sv_max_mesg;
> + if (required*2 > sk_stream_wspace(svsk->sk_sk))
> + return 0;
> + clear_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
> + return 1;
> +}
> +
> static struct svc_xprt_ops svc_tcp_ops = {
> .xpo_recvfrom = svc_tcp_recvfrom,
> .xpo_sendto = svc_tcp_sendto,

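These two functions are identical apart from the call that reads the free
space (sock_wspace vs sk_stream_wspace). Could we just have one called
"svc_sock_has_wspace" or similar, with that call factored out?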

Makes maintenance a little easier.
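
A rough sketch of the suggestion, written as a userspace mock rather than kernel code. The quoted functions differ only in how they read the free send space, so that call is passed in as a callback. Every name here (mock_sock, wspace_fn, mock_wspace, the helper's signature) is made up for illustration; a real version would keep struct svc_sock, set_bit/clear_bit on SOCK_NOSPACE and atomic_read.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the parts of svc_sock the check uses. */
struct mock_sock {
	int reserved;    /* space already promised to replies in flight */
	int max_mesg;    /* largest message the service may send */
	int free_space;  /* what sock_wspace()/sk_stream_wspace() would report */
	bool nospace;    /* stands in for the SOCK_NOSPACE bit */
};

typedef int (*wspace_fn)(const struct mock_sock *);

/* Mock of the per-transport "how much room is left" call. */
static int mock_wspace(const struct mock_sock *sk)
{
	return sk->free_space;
}

/* One body for both transports; only the wspace callback differs. */
static int svc_sock_has_wspace(struct mock_sock *sk, wspace_fn wspace)
{
	int required;

	/* Flag first, test second, in the same order as the quoted patch. */
	sk->nospace = true;
	required = sk->reserved + sk->max_mesg;
	if (required * 2 > wspace(sk))
		return 0;
	sk->nospace = false;
	return 1;
}
```

With this shape the UDP and TCP entries in the ops tables would each be a one-line wrapper that passes its own space function.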

NeilBrown


2007-09-28 03:21:20

by NeilBrown

[permalink] [raw]
Subject: Re: [RFC,PATCH 11/33] svc: Add xpo_accept transport function

On Thursday September 27, [email protected] wrote:
> @@ -1046,9 +1054,10 @@ static inline int svc_port_is_privileged
> /*
> * Accept a TCP connection
> */
> -static void
> -svc_tcp_accept(struct svc_sock *svsk)
> +static struct svc_xprt *
> +svc_tcp_accept(struct svc_xprt *xprt)
> {
> + struct svc_sock *svsk = (struct svc_sock *)xprt;

This cast should use container_of

struct svc_sock *svsk = container_of(xprt, struct svc_sock, sk_xprt);

That makes it clearer what is happening.
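
For readers unfamiliar with the idiom, here is a small userspace sketch of why container_of is safer than a plain cast. The macro below is a simplified equivalent of the kernel one (without its type check), and the structure layouts are mocks; the embedded svc_xprt is deliberately not the first member, which is the case where a bare cast would silently produce the wrong pointer. Note that the second argument is the structure type itself, not a pointer type.

```c
#include <stddef.h>

/* Userspace equivalent of the kernel macro, minus its type check. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct svc_xprt {
	int xpt_flags;
};

/* Mock layout: the embedded svc_xprt is deliberately not first. */
struct svc_sock {
	int sk_reclen;
	struct svc_xprt sk_xprt;
};

/* Recover the enclosing svc_sock from a pointer to its sk_xprt member. */
static struct svc_sock *to_svc_sock(struct svc_xprt *xprt)
{
	return container_of(xprt, struct svc_sock, sk_xprt);
}
```

A cast such as (struct svc_sock *)xprt only works while sk_xprt stays at offset zero; the helper keeps working if the member moves.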



NeilBrown


2007-09-28 03:21:44

by NeilBrown

[permalink] [raw]
Subject: Re: [RFC, PATCH 12/33] svc: Add a generic transport svc_create_xprt function

On Thursday September 27, [email protected] wrote:
> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> index 6151db5..fab0ce3 100644
> --- a/net/sunrpc/svc_xprt.c
> +++ b/net/sunrpc/svc_xprt.c
> @@ -93,3 +93,41 @@ void svc_xprt_init(struct svc_xprt_class
> xpt->xpt_max_payload = xcl->xcl_max_payload;
> }
> EXPORT_SYMBOL_GPL(svc_xprt_init);
> +
> +int svc_create_xprt(struct svc_serv *serv, char *xprt_name, unsigned short port,
> + int flags)
> +{
> + int ret = -ENOENT;
> + struct list_head *le;
> + struct sockaddr_in sin = {
> + .sin_family = AF_INET,
> + .sin_addr.s_addr = INADDR_ANY,
> + .sin_port = htons(port),
> + };
> + dprintk("svc: creating transport %s[%d]\n", xprt_name, port);
> + spin_lock(&svc_xprt_class_lock);
> + list_for_each(le, &svc_xprt_class_list) {
> + struct svc_xprt_class *xcl =
> + list_entry(le, struct svc_xprt_class, xcl_list);

list_for_each_entry is preferred.

> + if (strcmp(xprt_name, xcl->xcl_name)==0) {
> + spin_unlock(&svc_xprt_class_lock);
> + if (try_module_get(xcl->xcl_owner)) {
> + struct svc_xprt *newxprt;
> + ret = 0;
> + newxprt = xcl->xcl_ops->xpo_create
> + (serv, (struct sockaddr*)&sin, flags);
> + if (IS_ERR(newxprt)) {
> + module_put(xcl->xcl_owner);
> + ret = PTR_ERR(newxprt);
> + }
> + goto out;
> + }
> + }
> + }
> + spin_unlock(&svc_xprt_class_lock);

If try_module_get fails, you fall back into the loop without the lock held
and then spin_unlock a second time after it. The "goto out;" needs to be
moved down one line.

And I'm confused as to why xpo_create returns a pointer which you
never use. xpo_accept does the same thing: a pointer is returned,
but only the success status is used. Why not just return
0-or-negative-error ??
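
To make both points concrete, here is a userspace sketch of the lookup with the lock released exactly once on every path and a plain 0-or-negative-errno return. It is not the kernel code: the lock is a counter so the balance can be checked, and mock_class, module_alive and the two create functions are hypothetical stand-ins for svc_xprt_class, try_module_get and xpo_create.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Mock lock: depth must return to 0; anything else is an imbalance. */
static int lock_depth;
static void mock_lock(void)   { lock_depth++; }
static void mock_unlock(void) { lock_depth--; }

struct mock_class {
	const char *name;
	int module_alive;                    /* what try_module_get() would return */
	int (*create)(unsigned short port);  /* 0 or negative errno */
};

static int mock_create_ok(unsigned short port)   { (void)port; return 0; }
static int mock_create_busy(unsigned short port) { (void)port; return -EADDRINUSE; }

static int svc_create_xprt_sketch(struct mock_class *classes, size_t n,
				  const char *name, unsigned short port)
{
	int ret = -ENOENT;
	size_t i;

	mock_lock();
	for (i = 0; i < n; i++) {
		struct mock_class *xcl = &classes[i];

		if (strcmp(name, xcl->name) != 0)
			continue;
		/* Name matched: drop the lock and leave the loop either way. */
		mock_unlock();
		if (xcl->module_alive)
			ret = xcl->create(port);
		return ret;
	}
	mock_unlock();
	return ret;
}
```

The early return after the match plays the role of the relocated "goto out;": once the lock has been dropped, control never reaches the unlock after the loop.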

NeilBrown


2007-09-28 04:25:13

by NeilBrown

[permalink] [raw]
Subject: Re: [RFC,PATCH 22/33] svc: Move sk_lastrecv to svc_xprt

On Thursday September 27, [email protected] wrote:
>
> This functionally trivial change moves the tranpsort independent sk_lastrecv
> field to the svc_xprt structure.

It would seem that sk_lastrecv is entirely unused (Well, a dprintk
prints it, but that isn't very interesting).
I think it used to be used to time out idle connections, but Greg's
mark/sweep does a better job without needing this field. Shall we
just remove it?

NeilBrown


>
> Signed-off-by: Tom Tucker <[email protected]>
> ---
>
> include/linux/sunrpc/svc_xprt.h | 1 +
> include/linux/sunrpc/svcsock.h | 1 -
> net/sunrpc/svcsock.c | 6 +++---
> 3 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
> index 5b2aef4..edb7ad2 100644
> --- a/include/linux/sunrpc/svc_xprt.h
> +++ b/include/linux/sunrpc/svc_xprt.h
> @@ -56,6 +56,7 @@ #define XPT_LISTENER 11 /* listening e
> struct svc_serv * xpt_server; /* service for this transport */
> atomic_t xpt_reserved; /* space on outq that is reserved */
> struct mutex xpt_mutex; /* to serialize sending data */
> + time_t xpt_lastrecv; /* time of last received request */
> };
>
> int svc_reg_xprt_class(struct svc_xprt_class *);
> diff --git a/include/linux/sunrpc/svcsock.h b/include/linux/sunrpc/svcsock.h
> index 41c2dfa..406d003 100644
> --- a/include/linux/sunrpc/svcsock.h
> +++ b/include/linux/sunrpc/svcsock.h
> @@ -33,7 +33,6 @@ struct svc_sock {
> /* private TCP part */
> int sk_reclen; /* length of record */
> int sk_tcplen; /* current read length */
> - time_t sk_lastrecv; /* time of last received request */
>
> /* cache of various info for TCP sockets */
> void *sk_info_authunix;
> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> index 71b7f86..04155aa 100644
> --- a/net/sunrpc/svcsock.c
> +++ b/net/sunrpc/svcsock.c
> @@ -1622,7 +1622,7 @@ svc_recv(struct svc_rqst *rqstp, long ti
> svc_sock_release(rqstp);
> return -EAGAIN;
> }
> - svsk->sk_lastrecv = get_seconds();
> + svsk->sk_xprt.xpt_lastrecv = get_seconds();
> clear_bit(XPT_OLD, &svsk->sk_xprt.xpt_flags);
>
> rqstp->rq_secure = svc_port_is_privileged(svc_addr(rqstp));
> @@ -1725,7 +1725,7 @@ svc_age_temp_sockets(unsigned long closu
> svsk = list_entry(le, struct svc_sock, sk_xprt.xpt_list);
>
> dprintk("queuing svsk %p for closing, %lu seconds old\n",
> - svsk, get_seconds() - svsk->sk_lastrecv);
> + svsk, get_seconds() - svsk->sk_xprt.xpt_lastrecv);
>
> /* a thread will dequeue and close it soon */
> svc_xprt_enqueue(&svsk->sk_xprt);
> @@ -1773,7 +1773,7 @@ static struct svc_sock *svc_setup_socket
> svsk->sk_ostate = inet->sk_state_change;
> svsk->sk_odata = inet->sk_data_ready;
> svsk->sk_owspace = inet->sk_write_space;
> - svsk->sk_lastrecv = get_seconds();
> + svsk->sk_xprt.xpt_lastrecv = get_seconds();
> spin_lock_init(&svsk->sk_lock);
> INIT_LIST_HEAD(&svsk->sk_deferred);
>


2007-09-28 04:36:25

by NeilBrown

[permalink] [raw]
Subject: Re: [RFC, PATCH 25/33] svc: Move the sockaddr information to svc_xprt

On Thursday September 27, [email protected] wrote:
>
> Move the IP address fields to the svc_xprt structure. Note that this
> assumes that _all_ RPC transports must have IP based 4-tuples. This
> seems reasonable given the tight coupling with the portmapper etc...
> Thoughts?

I don't think NFSv4 requires portmapper (or rpcbind) ... does it?

"Everything uses IP addresses" sounds a lot like "Everything is a
socket". I would have supported the latter strongly until RDMA came
along. Now I'm even less sure about the former.

How much cost would there be in leaving the address in the
per-transport data?

NeilBrown


2007-09-28 04:49:00

by NeilBrown

[permalink] [raw]
Subject: Re: [RFC, PATCH 33/33] knfsd: Support adding transports by writing portlist file

On Thursday September 27, [email protected] wrote:
>
> Update the write handler for the portlist file to allow creating new
> listening endpoints on a transport. The general form of the string is:
>
> <transport_name><space><port number>
>
> For example:
>
> tcp 2049
>
> This is intended to support the creation of a listening endpoint for
> RDMA transports without adding #ifdef code to the nfssvc.c file.
> The general idea is that the rpc.nfsd program would read the transports
> file and then write the portlist file to create listening endpoints
> for all or selected transports. The current mechanism of writing an
> fd would become obsolete.

Nuh.
I'll only accept
rdma 2049
(or whatever) because there seems to be no other way to do it.
Writing an 'fd' is the *preferred* way.

There is more to binding an endpoint than protocol and port number.
There is also local address and I'm not convinced that someone won't
come up with some other way they want to pre-condition a socket.

If there was any way to associate an RDMA endpoint with a
filedescriptor, I would much prefer that 'rpc.nfsd' does that and passes
down the filedescriptor. If RDMA is so non-Unix-like (rant rant..)
that there is no such file descriptor, then I guess we can live with
getting the kernel to open the connection.


>
> Signed-off-by: Tom Tucker <[email protected]>
> ---
>
> fs/nfsd/nfsctl.c | 16 ++++++++++++++++
> 1 files changed, 16 insertions(+), 0 deletions(-)
>
> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> index baac89d..923b817 100644
> --- a/fs/nfsd/nfsctl.c
> +++ b/fs/nfsd/nfsctl.c
> @@ -554,6 +554,22 @@ static ssize_t write_ports(struct file *
> kfree(toclose);
> return len;
> }
> + /*
> + * Add a transport listener by writing it's transport name
> + */
> + if (isalnum(buf[0])) {

Should really be "isalpha" as we already know it isn't isdigit.
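
A standalone sketch of the check plus the parse, in userspace C. parse_transport_line is a hypothetical helper (the patch does this inline in write_ports()). It uses isalpha as suggested, since digit-led input was already handled by the fd case. It also uses %5d where the patch has %4d: as far as I can tell, a field width of 4 stops sscanf after four digits, so a five-digit port such as 20049 would be read as 2004. The range check is an extra assumption, not something the patch does.

```c
#include <ctype.h>
#include <stdio.h>

/*
 * Parse "<transport_name> <port>" as described in the patch comment.
 * Returns 0 on success, -1 if the line does not have that shape.
 */
static int parse_transport_line(const char *buf, char name[16], int *port)
{
	/* Digit-led input belongs to the fd case, so alpha is enough here. */
	if (!isalpha((unsigned char)buf[0]))
		return -1;
	/* %5d rather than %4d: port numbers can have five digits. */
	if (sscanf(buf, "%15s %5d", name, port) != 2)
		return -1;
	/* Assumed sanity check on the port range. */
	if (*port < 1 || *port > 65535)
		return -1;
	return 0;
}
```

Calling it with "tcp 2049" fills in "tcp" and 2049; a bare number such as "23" is rejected and left for the existing fd path.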

NeilBrown


> + int err;
> + char transport[16];
> + int port;
> + if (sscanf(buf, "%15s %4d", transport, &port) == 2) {
> + err = nfsd_create_serv();
> + if (!err)
> + err = svc_create_xprt(nfsd_serv,
> + transport, port,
> + SVC_SOCK_ANONYMOUS);
> + return err < 0 ? err : 0;
> + }
> + }
> return -EINVAL;
> }
>


2007-09-28 04:51:35

by NeilBrown

[permalink] [raw]
Subject: Re: [RFC,PATCH 00/33] SVC Transport Switch

On Wednesday September 26, [email protected] wrote:
>
> I've attempted to organize the patchset such that logical changes are
> clearly reviewable without too much clutter from functionally empty name
> changes.

And you did a very thorough job, thanks!

Just a few minor issues as noted in previous emails. Most of them can
be addressed by incremental patches rather than respinning the whole
series. I'm just not sure about where the IP-address info should
live. Maybe other people have opinions???

NeilBrown


2007-10-03 09:55:24

by Greg Banks

[permalink] [raw]
Subject: Re: [RFC,PATCH 00/33] SVC Transport Switch

G'day,

Compendium reply to all 33 patches from 26 Sep 2007. Sorry
about the delay.

On Wed, Sep 26, 2007 at 11:57:51PM -0500, Tom Tucker wrote:
> Subject: [RFC,PATCH 00/33] SVC Transport Switch

> - The overall design of the switch has been modified to be more similar
> to the client side, e.g.
> - There is a transport class structure svc_xprt_class, and
> - A transport independent structure is manipulated by xprt
> independent code (svc_xprt)
> - Further consolidation of transport independent logic out of
> transport providers and into transport independent code.
> - Transport independent code has been broken out into a separate file
> - Transport independent functions prevously adorned with _sock_ have
> had their names changed, e.g. svc_sock_enqueue

Great! These patches are coming along nicely.

Is there any documentation being added to describe the new xprt interface?



> Subject: [RFC,PATCH 01/33] svc: Add an svc transport class

> +struct svc_xprt_class {
> [...]
> + struct svc_xprt_ops *xcl_ops;
> [...]
> +struct svc_xprt {
> [...]
> + struct svc_xprt_ops xpt_ops;

It seems redundant to have two lots of the ops, and especially so to
have the one in svc_xprt be a struct instance rather than a pointer.
The current client code doesn't seem to do anything like this either.

Perhaps you could move the svc_xprt_ops.xpo_create (the only function
pointer in svc_xprt_ops which needs to be called before a svc_xprt
exists) to svc_xprt_class, and setup the svc_xprt.xpt_ops pointer in
those functions? Then you'd have something like:

/* svc_xprt.h */

struct svc_xprt_class {
...
struct svc_xprt *(*xcl_create)(struct svc_serv *, struct sockaddr *, int);
};

struct svc_xprt {
...
const struct svc_xprt_ops *xpt_ops;
};

/* svc_sock.c */

static struct svc_xprt *
svc_udp_create(struct svc_serv *serv, struct sockaddr *sa, int flags)
{
return svc_create_socket(serv, IPPROTO_UDP, sa,
sizeof(struct sockaddr_in), flags,
&svc_udp_ops);
}

This is not only neater but more consistent with the client code.


> +int svc_reg_xprt_class(struct svc_xprt_class *);
> +int svc_unreg_xprt_class(struct svc_xprt_class *);

There are perfectly good English words "register" and "unregister"
that could be used instead of "reg" and "unreg". RPC doesn't
need to stand for Reduced Phoneme Count ;-)

> +int svc_reg_xprt_class(struct svc_xprt_class *xcl)
> +{

This allows two svc_xprt_class of the same name to be registered.
Is that deliberate?

> +int svc_unreg_xprt_class(struct svc_xprt_class *xcl)

Why does this scan the list? It could just do a list_del_init().
There's no point returning -ENOENT or any other error either, nothing
useful can be done to handle it.
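
To illustrate both remarks, here is a userspace sketch with a hand-rolled circular list in place of the kernel's list_head. Registration refuses a duplicate name (assuming duplicates are not intended; -EEXIST is my choice of error), and unregistration just unlinks the entry and points it at itself, which is what list_del_init() does, with no scan and no return value. All mock_ names are hypothetical, locking is omitted, and a class must have been registered before it is unregistered.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct mock_list {
	struct mock_list *next, *prev;
};

struct mock_xprt_class {
	const char *xcl_name;
	struct mock_list xcl_list;
};

/* Empty circular list, like LIST_HEAD() in the kernel. */
static struct mock_list class_list = { &class_list, &class_list };

static struct mock_xprt_class *to_class(struct mock_list *le)
{
	return (struct mock_xprt_class *)
		((char *)le - offsetof(struct mock_xprt_class, xcl_list));
}

/* Refuse a second class with the same name instead of adding it. */
static int mock_register_xprt_class(struct mock_xprt_class *xcl)
{
	struct mock_list *le;

	for (le = class_list.next; le != &class_list; le = le->next)
		if (strcmp(to_class(le)->xcl_name, xcl->xcl_name) == 0)
			return -EEXIST;

	/* Append at the tail. */
	xcl->xcl_list.next = &class_list;
	xcl->xcl_list.prev = class_list.prev;
	class_list.prev->next = &xcl->xcl_list;
	class_list.prev = &xcl->xcl_list;
	return 0;
}

/* No scan, no error code: unlink and re-point at self. */
static void mock_unregister_xprt_class(struct mock_xprt_class *xcl)
{
	xcl->xcl_list.prev->next = xcl->xcl_list.next;
	xcl->xcl_list.next->prev = xcl->xcl_list.prev;
	xcl->xcl_list.next = &xcl->xcl_list;
	xcl->xcl_list.prev = &xcl->xcl_list;
}
```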


> Subject: [RFC,PATCH 02/33] svc: Make svc_sock the tcp/udp transport

> +void svc_init_xprt_sock(void)
> +{
> + svc_reg_xprt_class(&svc_tcp_class);
> + svc_reg_xprt_class(&svc_udp_class);
> +}

You might check the error code from svc_reg_xprt_class() and
propagate it, to avoid doing things like loading the module twice.

> Subject: [RFC,PATCH 03/33] svc: Change the svc_sock in the rqstp structure to a transport

ok

> Subject: [RFC,PATCH 04/33] svc: Add a max payload value to the transport

> @@ -17,11 +17,13 @@ struct svc_xprt_class {
> struct module *xcl_owner;
> struct svc_xprt_ops *xcl_ops;
> struct list_head xcl_list;
> + u32 xcl_max_payload;
> };
>
> struct svc_xprt {
> struct svc_xprt_class *xpt_class;
> struct svc_xprt_ops xpt_ops;
> + u32 xpt_max_payload;
> };


Why have a max_payload variable in two places? Either would be fine by me.


> Subject: [RFC,PATCH 05/33] svc: Move sk_sendto and sk_recvfrom to svc_xprt_class

ok


> Subject: [RFC,PATCH 06/33] svc: Add transport specific xpo_release function

ok


> Subject: [RFC,PATCH 07/33] svc: Add per-transport delete functions

ok (assuming the later patch "Move the authinfo cache to svc_xprt.")


> Subject: [RFC,PATCH 08/33] svc: Add xpo_prep_reply_hdr

ok


> Subject: [RFC,PATCH 09/33] svc: Add a transport function that checks for write space

> The code that checked for white space was coupled with code that
^^^^^

> + dprintk("svc: no write space, socket %p not enqueued\n", svsk);
^^

> + /*
> + * Set the SOCK_NOSPACE flag before checking the available
> + * sock space.
> + */

Well, yes, we can see that. The comment should contain Neil's explanation
of *why* we do that.

Also: this patch makes the inline svc_sock_wspace() redundant but doesn't remove it.


> Subject: [RFC,PATCH 10/33] svc: Move close processing to a single place

ok


> Subject: [RFC,PATCH 11/33] svc: Add xpo_accept transport function

ok

I presume it's intentional that NFS/RDMA transport be subjected
to the connection limit? In earlier versions they weren't.


> Subject: [RFC,PATCH 12/33] svc: Add a generic transport svc_create_xprt function


> + if (strcmp(xprt_name, xcl->xcl_name)==0) {
> + spin_unlock(&svc_xprt_class_lock);
> + if (try_module_get(xcl->xcl_owner)) {

Is this racy vs module unload?


Otherwise ok


> Subject: [RFC,PATCH 13/33] svc: Change services to use new svc_create_xprt service

ok


> Subject: [RFC,PATCH 14/33] svc: Change sk_inuse to a kref


> transport indepenent svc_xprt structure. Change the reference count
^^^^^^^^^^

> Subject: [RFC,PATCH 15/33] svc: Move sk_flags to the svc_xprt structure


> +#define XPT_DETACHED 10 /* detached from tempsocks list */
> +#define XPT_LISTENER 11 /* listening endpoint */
^

Unnecessary whitespace drama.


> Subject: [RFC,PATCH 16/33] svc: Move sk_server and sk_pool to svc_xprt

ok

> Subject: [RFC,PATCH 17/33] svc: Make close transport independent

> -svc_delete_socket(struct svc_sock *svsk)
> +svc_delete_xprt(struct svc_xprt *xprt)

Would svc_xprt_delete() be a better name? The object is a "svc_xprt"
after all, and this would be more consistent with your svc_xprt_get()
svc_xprt_put() etc.

> -static void svc_close_socket(struct svc_sock *svsk)
> +static void svc_close_xprt(struct svc_xprt *xprt)

Likewise this could be svc_xprt_close().


> Subject: [RFC,PATCH 18/33] svc: Move sk_reserved to svc_xprt

ok

> Subject: [RFC,PATCH 19/33] svc: Make the enqueue service transport neutral and export it.

ok


> Subject: [RFC,PATCH 20/33] svc: Make svc_send transport neutral

> + struct mutex xpt_mutex; /* to serialize sending data */

Perhaps it would be better to name it something like xpt_send_mutex ?


> Subject: [RFC,PATCH 21/33] svc: Change svc_sock_received to svc_xprt_received and export it

> @@ -1123,8 +1123,6 @@ svc_tcp_accept(struct svc_xprt *xprt)
> }
> memcpy(&newsvsk->sk_local, sin, slen);
>
> - svc_sock_received(newsvsk);
> -
> if (serv->sv_stats)
> serv->sv_stats->nettcpconn++;
>

Minor nit, but moving the call to svc_xprt_received() up into the
caller is a change that ought to be mentioned in the patch comment.


> Subject: [RFC,PATCH 22/33] svc: Move sk_lastrecv to svc_xprt

> This functionally trivial change moves the tranpsort independent sk_lastrecv
^^^^^^^^^

> @@ -1773,7 +1773,7 @@ static struct svc_sock *svc_setup_socket
> svsk->sk_ostate = inet->sk_state_change;
> svsk->sk_odata = inet->sk_data_ready;
> svsk->sk_owspace = inet->sk_write_space;
> - svsk->sk_lastrecv = get_seconds();
> + svsk->sk_xprt.xpt_lastrecv = get_seconds();
> spin_lock_init(&svsk->sk_lock);
> INIT_LIST_HEAD(&svsk->sk_deferred);

This initialisation code ought to be in svc_xprt_init().
I see it moves there in a later patch, though.


> Subject: [RFC,PATCH 23/33] svc: Move the authinfo cache to svc_xprt.


> @@ -89,6 +89,9 @@ static inline void svc_xprt_free(struct
> struct module *owner = xprt->xpt_class->xcl_owner;
> BUG_ON(atomic_read(&kref->refcount));
> xprt->xpt_ops.xpo_free(xprt);
> + if (test_bit(XPT_CACHE_AUTH, &xprt->xpt_flags)
> + && xprt->xpt_auth_cache != NULL)
> + svcauth_unix_info_release(xprt->xpt_auth_cache);
> module_put(owner);
> }
>

Woops, that's a use-after-free.
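
Spelled out as a userspace sketch, assuming xpo_free() releases the memory that holds xprt: the fix is to read the flags and release the auth cache before calling the transport's free routine, not after. The mock_ types and functions are hypothetical stand-ins for svc_xprt, svcauth_unix_info_release() and xpo_free(); the counter exists only so the behaviour can be observed.

```c
#include <stdlib.h>

struct mock_xprt {
	int cache_auth;   /* stands in for the XPT_CACHE_AUTH bit */
	int *auth_cache;  /* stands in for xpt_auth_cache */
};

static int auth_releases;  /* counts calls to the release routine */

static void mock_auth_release(int *cache)
{
	auth_releases++;
	free(cache);
}

/* Plays the role of xpo_free(): after this, xprt must not be touched. */
static void mock_xpo_free(struct mock_xprt *xprt)
{
	free(xprt);
}

static void mock_xprt_free(struct mock_xprt *xprt)
{
	/* Read everything needed from xprt first ... */
	if (xprt->cache_auth && xprt->auth_cache != NULL)
		mock_auth_release(xprt->auth_cache);
	/* ... and only then hand it to the transport's free routine. */
	mock_xpo_free(xprt);
}
```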


> Subject: [RFC,PATCH 24/33] svc: Make deferral processing xprt independent


You're also moving the call to svc_deferred_dequeue() and
svc_deferred_recv() from the transport-dependent svc_foo_recvfrom()
methods up into the xprt core. This is good, but it means the patch is
not functionally trivial as the patch comment claims. For example for
UDP it may delay the implementation of buffer changes when XPT_CHNGBUF
is set until after a deferred call is processed.

> @@ -1612,7 +1602,12 @@ svc_recv(struct svc_rqst *rqstp, long ti
> dprintk("svc: server %p, pool %u, socket %p, inuse=%d\n",
> rqstp, pool->sp_id, svsk,
> atomic_read(&svsk->sk_xprt.xpt_ref.refcount));
> - len = svsk->sk_xprt.xpt_ops.xpo_recvfrom(rqstp);
> +
> + if ((rqstp->rq_deferred = svc_deferred_dequeue(&svsk->sk_xprt))) {
> + svc_xprt_received(&svsk->sk_xprt);
> + len = svc_deferred_recv(rqstp);
> + } else
> + len = svsk->sk_xprt.xpt_ops.xpo_recvfrom(rqstp);
> dprintk("svc: got len=%d\n", len);
> }


I'm surprised that the svc_deferred_dequeue() call isn't being done
outside this else{}, so that the code looks like (from the SGI tree):


if (test_bit(SK_CLOSE, &svsk->sk_flags)) {
dprintk("svc_recv: found SK_CLOSE\n");
svc_delete_socket(svsk);
} else if (test_bit(SK_LISTENER, &svsk->sk_flags)) {
svsk->sk_ops->sko_accept(svsk);
svc_sock_received(svsk);
} else if ((rqstp->rq_deferred = svc_deferred_dequeue(svsk))) {
dprintk("svc: rqstp=%p got deferred request on svsk=%p\n", rqstp, svsk);
svc_sock_received(svsk);
len = svc_deferred_recv(rqstp);
} else {
dprintk("svc: server %p, pool %u, socket %p, inuse=%d\n",
rqstp, pool->sp_id, svsk, atomic_read(&svsk->sk_inuse));
len = svsk->sk_ops->sko_recvfrom(rqstp);
dprintk("svc: got len=%d\n", len);
}

That seems a little cleaner to me.



> Subject: [RFC,PATCH 25/33] svc: Move the sockaddr information to svc_xprt

> Move the IP address fields to the svc_xprt structure. Note that this
> assumes that _all_ RPC transports must have IP based 4-tuples. This
> seems reasonable given the tight coupling with the portmapper etc...
> Thoughts?

Such an assumption only seems reasonable if you assume the old v2
portmap protocol and IPv4.

The portmap (aka rpcbind) v3 and v4 protocols use the TLI "uaddr"
concept to store addresses. Uaddrs are a string encoding of the IP
address + port tuple. For IPv4 this is the standard dotted-quad
plus two more quads representing the two bytes of the port,
e.g. 123.34.45.56.8.1 is port 2049 on host 123.34.45.56.
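
As a worked example of the encoding for IPv4: the port is split into its high and low bytes, so 2049 = 8 * 256 + 1 becomes ".8.1". The helper below is a userspace sketch; ipv4_to_uaddr is a made-up name, not an existing kernel or libc function.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format an IPv4 address (four octets) and a port as a universal
 * address string "h1.h2.h3.h4.p1.p2", where p1 and p2 are the high
 * and low bytes of the port. Returns the snprintf() result.
 */
static int ipv4_to_uaddr(char *buf, size_t len,
			 const uint8_t addr[4], uint16_t port)
{
	return snprintf(buf, len, "%d.%d.%d.%d.%d.%d",
			addr[0], addr[1], addr[2], addr[3],
			port >> 8, port & 0xff);
}
```

Passing the octets 123, 34, 45, 56 and port 2049 yields the string from the example above, 123.34.45.56.8.1.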

With IPv6 there's no guarantee that you can extract an IPv4 address
from an IPv6 address, nor that any particular host has a routable or
IPv4 address at all.

But I don't think this patch requires any such assumption. It does
however screw up rq_daddr in the UDP case; the code you've pulled
out as svc_copy_addr() only works correctly for TCP. See the
svc_udp_get_dest_address() function for how UDP works.



> Subject: [RFC,PATCH 26/33] svc: Make svc_sock_release svc_xprt_release

> @@ -377,9 +377,9 @@ void svc_reserve(struct svc_rqst *rqstp,
> }
>
> static void
> -svc_sock_release(struct svc_rqst *rqstp)
> +svc_xprt_release(struct svc_rqst *rqstp)
> {
> - struct svc_sock *svsk = rqstp->rq_sock;
> + struct svc_xprt *xprt = rqstp->rq_xprt;
>
> rqstp->rq_xprt->xpt_ops.xpo_release(rqstp);
>
>


This isn't really a svc_xprt function, it's a svc_rqst function,
so svc_xprt_release() is a confusing name. Perhaps a name like
svc_rqst_done_sending() might be more appropriate.


> Subject: [RFC,PATCH 27/33] svc: Make svc_recv transport neutral

ok.


> Subject: [RFC,PATCH 28/33] svc: Make svc_age_temp_sockets svc_age_temp_transports

> @@ -1685,49 +1685,51 @@ svc_send(struct svc_rqst *rqstp)
> * a mark-and-sweep algorithm.
> */
> static void
> -svc_age_temp_sockets(unsigned long closure)
> +svc_age_temp_xprts(unsigned long closure)
> {

Minor nit, but the new name doesn't match the one in the Subject: line.



> Subject: [RFC,PATCH 29/33] svc: Move common create logic to common code

> + /* setup timer to age temp sockets */

This comment should be updated to refer to transports.


> @@ -146,6 +146,13 @@ int svc_create_xprt(struct svc_serv *ser
> if (IS_ERR(newxprt)) {
> module_put(xcl->xcl_owner);
> ret = PTR_ERR(newxprt);
> + } else {
> + clear_bit(XPT_TEMP,
> + &newxprt->xpt_flags);
> + spin_lock_bh(&serv->sv_lock);
> + list_add(&newxprt->xpt_list,
> + &serv->sv_permsocks);
> + spin_unlock_bh(&serv->sv_lock);
> }
> goto out;
> }

and

> @@ -1833,6 +1827,12 @@ int svc_addsock(struct svc_serv *serv,
> svc_xprt_received(&svsk->sk_xprt);
> err = 0;
> }
> + if (so->sk->sk_protocol == IPPROTO_TCP)
> + set_bit(XPT_LISTENER, &svsk->sk_xprt.xpt_flags);
> + clear_bit(XPT_TEMP, &svsk->sk_xprt.xpt_flags);
> + spin_lock_bh(&serv->sv_lock);
> + list_add(&svsk->sk_xprt.xpt_list, &serv->sv_permsocks);
> + spin_unlock_bh(&serv->sv_lock);
> }
> if (err) {
> sockfd_put(so);

Why is this happening in two places now?


> Subject: [RFC,PATCH 30/33] svc: Removing remaining references to rq_sock in rqstp

ok


> Subject: [RFC,PATCH 31/33] svc: Move the xprt independent code to the svc_xprt.c file

functions not lost:
ok

code in functions not lost:

> -svc_age_temp_xprts(unsigned long closure)
> [...]
> - if (atomic_read(&svsk->sk_xprt.xpt_ref.refcount) > 1
> - || test_bit(SK_BUSY, &svsk->sk_flags))
> - continue;
> [...]
> +svc_age_temp_xprts(unsigned long closure)
> [...]
> + if (atomic_read(&xprt->xpt_ref.refcount)
> + || test_bit(XPT_BUSY, &xprt->xpt_flags))
> + continue;

Looks like you missed some renaming in "Move sk_flags to the svc_xprt structure".
Does the code compile after that patch?


newly non-static functions declared in header:
ok

newly non-static functions exported to modules:
svc_port_is_privileged is not exported
svc_close_xprt is not exported
svc_delete_xprt is not exported


> +int svc_port_is_privileged(struct sockaddr *sin)
> +{
> + switch (sin->sa_family) {
> + case AF_INET:
> + return ntohs(((struct sockaddr_in *)sin)->sin_port)
> + < PROT_SOCK;
> + case AF_INET6:
> + return ntohs(((struct sockaddr_in6 *)sin)->sin6_port)
> + < PROT_SOCK;
> + default:
> + return 0;
> + }
> +}
> [...]
> + rqstp->rq_secure = svc_port_is_privileged(svc_addr(rqstp));

Hmm, here we have svc_port_is_privileged, a very socket-oriented
function, being moved to svc_xprt.c, the supposedly transport-
independent code. Perhaps svc_xprt_ops.xpo_is_privileged() is needed?
Or have the xpo_recvfrom code set up rq_secure (and save and
restore it in svc_defer/svc_revisit) ?

svc_sock_update_bufs() could be renamed and moved to svc_xprt.c



> Subject: [RFC,PATCH 32/33] svc: Add /proc/sys/sunrpc/transport files


> +char xprt_buf[128];

> @@ -145,6 +174,14 @@ static ctl_table debug_table[] = {
> .mode = 0644,
> .proc_handler = &proc_dodebug
> },
> + {
> + .ctl_name = CTL_TRANSPORTS,
> + .procname = "transports",
> + .data = xprt_buf,
> + .maxlen = sizeof(xprt_buf),
> + .mode = 0444,
> + .proc_handler = &proc_do_xprt,
> + },
> { .ctl_name = 0 }
> };


The xprt_buf[] variable isn't necessary. The /proc code uses only
the .ctl_name, .procname, and .mode fields of the ctl_table struct,
so you can safely leave .data and .maxlen as zeroes if you provide
your own .proc_handler function.


> Subject: [RFC,PATCH 33/33] knfsd: Support adding transports by writing portlist file


> The general idea is that the rpc.nfsd program would read the transports
> file and then write the portlist file to create listening endpoints
> for all or selected transports. The current mechanism of writing an
> fd would become obsolete.
> [...]
> @@ -554,6 +554,22 @@ static ssize_t write_ports(struct file *
> [...]
> + err = svc_create_xprt(nfsd_serv,
> + transport, port,
> + SVC_SOCK_ANONYMOUS);

So who does the portmap registration?



Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
Apparently, I'm Bedevere. Which MPHG character are you?
I don't speak for SGI.


2007-10-03 10:07:03

by Greg Banks

[permalink] [raw]
Subject: Re: [RFC,PATCH 22/33] svc: Move sk_lastrecv to svc_xprt

On Fri, Sep 28, 2007 at 11:16:17AM -0500, Tom Tucker wrote:
> On Fri, 2007-09-28 at 14:25 +1000, Neil Brown wrote:
> > On Thursday September 27, [email protected] wrote:
> > >
> > > This functionally trivial change moves the tranpsort independent sk_lastrecv
> > > field to the svc_xprt structure.
> >
> > It would seem that sk_lastrecv is entirely unused (Well, a dprintk
> > prints it, but that isn't very interesting).
>
> agreed.
>
> > I think it used to be used to time out idle connections, but Greg's
> > mark/sweep does a better job without needing this field. Shall we
> > just remove it?
>
> I think so. Greg?

Sure, why not. I have in the past found sk_lastrecv useful
forensically, but I could live without it.

Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
Apparently, I'm Bedevere. Which MPHG character are you?
I don't speak for SGI.


2007-10-03 10:22:40

by Greg Banks

[permalink] [raw]
Subject: Re: [RFC, PATCH 33/33] knfsd: Support adding transports by writing portlist file

On Fri, Sep 28, 2007 at 02:48:56PM +1000, Neil Brown wrote:
> On Thursday September 27, [email protected] wrote:
> >
> > Update the write handler for the portlist file to allow creating new
> > listening endpoints on a transport. The general form of the string is:
> >
> > <transport_name><space><port number>
> >
> > For example:
> >
> > tcp 2049
> >
> > This is intended to support the creation of a listening endpoint for
> > RDMA transports without adding #ifdef code to the nfssvc.c file.
> > The general idea is that the rpc.nfsd program would read the transports
> > file and then write the portlist file to create listening endpoints
> > for all or selected transports. The current mechanism of writing an
> > fd would become obsolete.
>
> Nuh.
> I'll only accept
> rdma 2049
> (or whatever) because there seems to be no other way to do it.
> Writing an 'fd' is the *preferred* way.
>
> There is more to binding an endpoint than protocol and port number.
> There is also local address and I'm not convinced that someone won't
> come up with some other way they want to pre-condition a socket.
>
> If there was any way to associate an RDMA endpoint with a
> filedescriptor,

The whole point of RDMA is not to have a file descriptor or any of that
slow stuff like read(), write(), or reliable connections in software.
(Of course you also lose helpful things like strace, tethereal, and
iptables. Swings...roundabouts.)

In the case of the local address, you could pass that into the
portlist file too, like:

echo 'tcp 192.168.0.1:2049' > /proc/fs/nfsd/portlist

Or perhaps

echo '23/rdma' > /proc/fs/nfsd/portlist

where 23 is the same file descriptor passed for TCP?


> I would much prefer that 'rpc.nfsd' does that and passes
> down the filedescriptor. If RDMA is so non-Unix-like (rant rant..)
> that there is no such file descriptor, then I guess we can live with
> getting the kernel to open the connection.

It's as unUnixlike as you can imagine :-/

Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
Apparently, I'm Bedevere. Which MPHG character are you?
I don't speak for SGI.


2007-10-01 16:16:25

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [RFC,PATCH 00/33] SVC Transport Switch

On Fri, Sep 28, 2007 at 12:39:07PM -0500, Tom Tucker wrote:
> On Fri, 2007-09-28 at 14:51 +1000, Neil Brown wrote:
> > Just a few minor issues as noted in previous emails. Most of them can
> > be addressed by incremental patches rather than respinning the whole
> > series. I'm just not sure about where the IP-address info should
> > live. Maybe other people have opinions???
> >
>
> I've already redone the patchset for whitespace cleanup and to handle a
> few checkpatch.pl style issues. How about if I roll in your suggestions,
> repost the whole thing

Fine by me.

--b.

> and then go to incremental after that?
