2006-06-29 18:12:07

by Wendy Cheng

[permalink] [raw]
Subject: [RFC PATCH 1/3] NLM lock failover - lock release

_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


Attachments:
gfs_nlm_unlock.patch (8.51 kB)
(No filename) (299.00 B)
(No filename) (140.00 B)
Download all attachments

2006-06-29 19:11:26

by Chuck Lever

[permalink] [raw]
Subject: Re: [RFC PATCH 1/3] NLM lock failover - lock release

On 6/29/06, Wendy Cheng <[email protected]> wrote:
> The patch piggy-backs the logic into "rq_daddr" field of struct svc_rqst
> where NFS server ip address is stored. Upon writing IPv4 address in
> standard dot notation into /proc/fs/nfsd/nlm_unlock, the logic will
> examine NLM's global nlm_files list and subsequently unlock the
> associated file if server ip address matches.
>
> Due to the size of rq_daddr (u32), we would not be able to support IPV6
> for this round of changes. Another to-do item is to enable client:server
> ip pairs to allow NFS V4 failover.

FYI: I have a patch in my IPv6 patchset that increases the size of
this field specifically in order to hold an IPv6 address. See:

http://oss.oracle.com/~cel/linux-2.6/2.6.17/patches/52-svc-rq_daddr.diff

for the individual patch, and surrounding patches for more context.

--
"We who cut mere stones must always be envisioning cathedrals"
-- Quarry worker's creed

Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2006-06-29 21:49:17

by Wendy Cheng

[permalink] [raw]
Subject: Re: [RFC PATCH 1/3] NLM lock failover - lock release

Chuck Lever wrote:

>On 6/29/06, Wendy Cheng <[email protected]> wrote:
>
>
>>The patch piggy-backs the logic into "rq_daddr" field of struct svc_rqst
>>where NFS server ip address is stored. Upon writing IPv4 address in
>>standard dot notation into /proc/fs/nfsd/nlm_unlock, the logic will
>>examine NLM's global nlm_files list and subsequently unlock the
>>associated file if server ip address matches.
>>
>>Due to the size of rq_daddr (u32), we would not be able to support IPV6
>>for this round of changes. Another to-do item is to enable client:server
>>ip pairs to allow NFS V4 failover.
>>
>>
>
>FYI: I have a patch in my IPv6 patchset that increases the size of
>this field specifically in order to hold an IPv6 address. See:
>
> http://oss.oracle.com/~cel/linux-2.6/2.6.17/patches/52-svc-rq_daddr.diff
>
>for the individual patch, and surrounding patches for more context.
>
>
>
ok, thanks ...

-- Wendy

--
S. Wendy Cheng
[email protected]


Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2006-06-29 23:06:58

by Trond Myklebust

[permalink] [raw]
Subject: Re: [RFC PATCH 1/3] NLM lock failover - lock release

On Thu, 2006-06-29 at 14:11 -0400, Wendy Cheng wrote:
> The patch piggy-backs the logic into "rq_daddr" field of struct svc_rqst
> where NFS server ip address is stored. Upon writing IPv4 address in
> standard dot notation into /proc/fs/nfsd/nlm_unlock, the logic will
> examine NLM's global nlm_files list and subsequently unlock the
> associated file if server ip address matches.
>
> Due to the size of rq_daddr (u32), we would not be able to support IPV6
> for this round of changes. Another to-do item is to enable client:server
> ip pairs to allow NFS V4 failover.
>
>
>
>
> plain text document attachment (gfs_nlm_unlock.patch)
> fs/lockd/svcsubs.c | 58 ++++++++++++++++++++++++++++++++++++++++----
> fs/nfsd/nfsctl.c | 54 ++++++++++++++++++++++++++++++++++++++++
> include/linux/lockd/bind.h | 5 +++
> include/linux/lockd/lockd.h | 3 ++
> net/sunrpc/svcsock.c | 5 +++
> 5 files changed, 120 insertions(+), 5 deletions(-)
>
> --- linux-2.6.17/include/linux/lockd/lockd.h 2006-06-17 21:49:35.000000000 -0400
> +++ linux-2.6.17-1/include/linux/lockd/lockd.h 2006-06-27 10:58:47.000000000 -0400
> @@ -105,6 +105,7 @@ struct nlm_file {
> unsigned int f_count; /* reference count */
> struct semaphore f_sema; /* avoid concurrent access */
> int f_hash; /* hash of f_handle */
> + __u32 f_iaddr; /* server ip for failover */
> };
>
> /*
> @@ -133,6 +134,7 @@ struct nlm_block {
> #define NLM_ACT_CHECK 0 /* check for locks */
> #define NLM_ACT_MARK 1 /* mark & sweep */
> #define NLM_ACT_UNLOCK 2 /* release all locks */
> +#define NLM_ACT_FO_UNLOCK 3 /* failover release locks */
>
> /*
> * Global variables
> @@ -196,6 +198,7 @@ void nlm_release_file(struct nlm_file
> void nlmsvc_mark_resources(void);
> void nlmsvc_free_host_resources(struct nlm_host *);
> void nlmsvc_invalidate_all(void);
> +int nlmsvc_fo_unlock(struct in_addr *);
>
> static __inline__ struct inode *
> nlmsvc_file_inode(struct nlm_file *file)
> --- linux-2.6.17/net/sunrpc/svcsock.c 2006-06-17 21:49:35.000000000 -0400
> +++ linux-2.6.17-1/net/sunrpc/svcsock.c 2006-06-27 10:58:47.000000000 -0400
> @@ -454,6 +454,7 @@ svc_recvfrom(struct svc_rqst *rqstp, str
> struct msghdr msg;
> struct socket *sock;
> int len, alen;
> + struct sockaddr_in daddr;
>
> rqstp->rq_addrlen = sizeof(rqstp->rq_addr);
> sock = rqstp->rq_sock->sk_sock;
> @@ -474,6 +475,10 @@ svc_recvfrom(struct svc_rqst *rqstp, str
> alen = sizeof(rqstp->rq_addr);
> sock->ops->getname(sock, (struct sockaddr *)&rqstp->rq_addr, &alen, 1);
>
> + /* add server ip for nlm lock failover */
> + sock->ops->getname(sock, (struct sockaddr *)&daddr, &alen, 0);
> + rqstp->rq_daddr = daddr.sin_addr.s_addr;
> +

Hmm.... Why would you want to do this on every receive when you could
just store the ip address in the struct svc_sock once and for all?

That said, how do you envisage this working in the cases where the
socket is bound to INADDR_ANY?

Cheers,
Trond



Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2006-06-30 03:46:22

by Wendy Cheng

[permalink] [raw]
Subject: Re: [RFC PATCH 1/3] NLM lock failover - lock release

On Thu, 2006-06-29 at 19:06 -0400, Trond Myklebust wrote:

> >
> > + /* add server ip for nlm lock failover */
> > + sock->ops->getname(sock, (struct sockaddr *)&daddr, &alen, 0);
> > + rqstp->rq_daddr = daddr.sin_addr.s_addr;
> > +
>
> Hmm.... Why would you want to do this on every receive when you could
> just store the ip address in the struct svc_sock once and for all?

ok, will do that - save latency. Thanks.

>
> That said, how do you envisage this working in the cases where the
> socket is bound to INADDR_ANY?

This is "our" (server's) address, not peer address - for this request to
arrive "here", it can't be INADDR_ANY. Can it ? Remember "rq_daddr" will
only be used during failover in a clustered NFS servers environment.


-- Wendy




Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2006-06-30 04:15:26

by NeilBrown

[permalink] [raw]
Subject: Re: [RFC PATCH 1/3] NLM lock failover - lock release

On Thursday June 29, [email protected] wrote:
> On Thu, 2006-06-29 at 19:06 -0400, Trond Myklebust wrote:
>
> > >
> > > + /* add server ip for nlm lock failover */
> > > + sock->ops->getname(sock, (struct sockaddr *)&daddr, &alen, 0);
> > > + rqstp->rq_daddr = daddr.sin_addr.s_addr;
> > > +
> >
> > Hmm.... Why would you want to do this on every receive when you could
> > just store the ip address in the struct svc_sock once and for all?
>
> ok, will do that - save latency. Thanks.
>
> >
> > That said, how do you envisage this working in the cases where the
> > socket is bound to INADDR_ANY?
>
> This is "our" (server's) address, not peer address - for this request to
> arrive "here", it can't be INADDR_ANY. Can it ? Remember "rq_daddr" will
> only be used during failover in a clustered NFS servers environment.

The socket can only be bound to INADDR_ANY for UDP, and in that case
we already set rq_daddr correctly.

For a TCP socket, it will be connected, so the local an remote
endpoints will be well defined.

So the code as it stands should work fine.

But yes, it would be best to record the local and remote addresses in
svc_tcp_accept rather than called ->getname twice in svc_recvfrom.

NeilBrown

Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2006-06-30 04:38:54

by Trond Myklebust

[permalink] [raw]
Subject: Re: [RFC PATCH 1/3] NLM lock failover - lock release

On Fri, 2006-06-30 at 14:15 +1000, Neil Brown wrote:

> The socket can only be bound to INADDR_ANY for UDP, and in that case
> we already set rq_daddr correctly.

As I understand it, Wendy is considering the case of a multi-homed
server. She wants to record the IP address on which we received the
datagram so that she knows which locks to invalidate in the case of a
migration of that particular IP address onto another server.

My point is that she won't get that information if the socket is bound
to INADDR_ANY, as would be the case for a UDP socket.

Cheers,
Trond


Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2006-06-30 04:39:48

by Trond Myklebust

[permalink] [raw]
Subject: Re: [RFC PATCH 1/3] NLM lock failover - lock release

On Thu, 2006-06-29 at 23:57 -0400, Wendy Cheng wrote:
> This is "our" (server's) address, not peer address - for this request to
> arrive "here", it can't be INADDR_ANY. Can it ?

Consider the case of a UDP socket.

Cheers,
Trond


Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2006-06-30 05:00:07

by NeilBrown

[permalink] [raw]
Subject: Re: [RFC PATCH 1/3] NLM lock failover - lock release

On Friday June 30, [email protected] wrote:
> On Fri, 2006-06-30 at 14:15 +1000, Neil Brown wrote:
>
> > The socket can only be bound to INADDR_ANY for UDP, and in that case
> > we already set rq_daddr correctly.
>
> As I understand it, Wendy is considering the case of a multi-homed
> server. She wants to record the IP address on which we received the
> datagram so that she knows which locks to invalidate in the case of a
> migration of that particular IP address onto another server.
>
> My point is that she won't get that information if the socket is bound
> to INADDR_ANY, as would be the case for a UDP socket.

Yes. But that code fragment covered the TCP case only.
The extra called to ->getname was placed in svc_recvfrom which is only
called from svc_tcp_recvfrom, not from svc_udp_recvfrom (Yes, I agree
there is room for confusion there).
svc_udp_recvfrom already has
rqstp->rq_addr.sin_addr.s_addr = skb->nh.iph->saddr;
rqstp->rq_daddr = skb->nh.iph->daddr;

so it sets rq_daddr correctly as each packet is received.
rq_daddr was never set for requests received via TCP because it was
never used for tcp (it was used only to set the source address for UDP
replies).

NeilBrown

Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs