2007-04-27 12:40:47

by NeilBrown

[permalink] [raw]
Subject: Re: [Cluster-devel] [PATCH 0/4 Revised] NLM - lock failover

On Friday April 27, [email protected] wrote:
> On Fri, Apr 27, 2007 at 04:00:13PM +1000, Neil Brown wrote:
> >
> > So if you need that, then I think it really must be implemented by
> > something a lot like
> > echo -n /path/name > /proc/fs/nfs/nlm_unlock_filesystem
> >
> > This is something that we could possible teach "fuser -k" about - so
> > it can effectively 'kill' that part of lockd that is accessing a given
> > filesystem. It is useful to failover, but definitely useful beyond
> > failover.
>
> Just a note that I posted a patch ~ a year ago that did precisely that. The
> interface was a little bit different. I had userspace echoing in a dev_t
> number, but it wouldn't be too hard to change it to use a pathname instead.
>
> Subject was:
>
> [PATCH] lockd: add procfs control to cue lockd to release all locks on a device
>
> ...if anyone is interested in having me resurrect it.
>
> -- Jeff

http://lkml.org/lkml/2006/4/10/240

Looks like no-one ever replied.
I probably didn't see it: things on linux-kernel that don't have
'nfs' or 'raid' (or a few related strings) in the subject have at best
an even chance of me seeing them. I've just added 'lockd' to the list
of important strings :-)

I would rather a path name, and would rather it came through the
'nfsd' filesystem, but those are fairly trivial changes.

nlm_traverse_files has changed a bit since then, but it should be
easier to unlock based on filesystem with the current code
(especially if we made the first arg a void*..).

NeilBrown

-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2007-04-27 13:42:51

by Jeff Layton

[permalink] [raw]
Subject: Re: [Cluster-devel] [PATCH 0/4 Revised] NLM - lock failover

On Fri, Apr 27, 2007 at 10:40:16PM +1000, Neil Brown wrote:
> On Friday April 27, [email protected] wrote:
> > On Fri, Apr 27, 2007 at 04:00:13PM +1000, Neil Brown wrote:
> > >
> > > So if you need that, then I think it really must be implemented by
> > > something a lot like
> > > echo -n /path/name > /proc/fs/nfs/nlm_unlock_filesystem
> > >
> > > This is something that we could possible teach "fuser -k" about - so
> > > it can effectively 'kill' that part of lockd that is accessing a given
> > > filesystem. It is useful to failover, but definitely useful beyond
> > > failover.
> >
> > Just a note that I posted a patch ~ a year ago that did precisely that. The
> > interface was a little bit different. I had userspace echoing in a dev_t
> > number, but it wouldn't be too hard to change it to use a pathname instead.
> >
> > Subject was:
> >
> > [PATCH] lockd: add procfs control to cue lockd to release all locks on a device
> >
> > ...if anyone is interested in having me resurrect it.
> >
> > -- Jeff
>
> http://lkml.org/lkml/2006/4/10/240
>
> Looks like no-one ever replied.
> I probably didn't see it: things on linux-kernel that don't have
> 'nfs' or 'raid' (or a few related strings) in the subject have at best
> an even chance of me seeing them. I've just added 'lockd' to the list
> of important strings :-)
>
> I would rather a path name, and would rather it came through the
> 'nfsd' filesystem, but those are fairly trivial changes.
>
> nlm_traverse_files has changed a bit since then, but it should be
> easier to unlock based on filesystem with the current code
> (especially if we made the first arg a void*..).
>
> NeilBrown
>

Ok, I'll toss cleaning that patch up and reposting it on to my to-do list...

Wendy, it seems like you had some objection to this patch at the time, but
the nature of it escapes me. Do you recall what your concern with it was?

Thanks,
Jeff


-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2007-04-27 14:17:12

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [Cluster-devel] [PATCH 0/4 Revised] NLM - lock failover

On Fri, Apr 27, 2007 at 09:42:48AM -0400, Jeff Layton wrote:
> Ok, I'll toss cleaning that patch up and reposting it on to my to-do list...
>
> Wendy, it seems like you had some objection to this patch at the time, but
> the nature of it escapes me. Do you recall what your concern with it was?

I like the idea of the patch. Intead of writing a dev_t into procfs
it should probably be changed to write a path to a new file in the
nfsctl filesystem. In fact couldn't this be treated as a reexport
with a NFSEXP_ flag meaning drop all locks to avoid creating new
interfaces?


-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2007-04-27 15:12:28

by Jeff Layton

[permalink] [raw]
Subject: Re: [Cluster-devel] [PATCH 0/4 Revised] NLM - lock failover

On Fri, Apr 27, 2007 at 09:42:48AM -0400, Jeff Layton wrote:
> Ok, I'll toss cleaning that patch up and reposting it on to my to-do list...
>
> Wendy, it seems like you had some objection to this patch at the time, but
> the nature of it escapes me. Do you recall what your concern with it was?
>
> Thanks,
> Jeff
>

Actually, on second thought, I'm going to leave this in Wendy's capable
hands. The changes to the existing code and the interface changes needed are
probably enough to make it so that this patch will be rewritten from scratch
anyway...

Cheers,
Jeff


-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2007-04-27 15:43:09

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [Cluster-devel] [PATCH 0/4 Revised] NLM - lock failover

On Fri, Apr 27, 2007 at 03:17:10PM +0100, Christoph Hellwig wrote:
> In fact couldn't this be treated as a reexport with a NFSEXP_ flag
> meaning drop all locks to avoid creating new interfaces?

Off hand, I can't see any reason why that wouldn't work. The code to
handle it would probably go in fs/nfsd/export.c:svc_export_parse().

--b.

-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2007-04-27 15:58:04

by Wendy Cheng

[permalink] [raw]
Subject: Re: [Cluster-devel] [PATCH 0/4 Revised] NLM - lock failover

J. Bruce Fields wrote:
> On Fri, Apr 27, 2007 at 03:17:10PM +0100, Christoph Hellwig wrote:
>
>> In fact couldn't this be treated as a reexport with a NFSEXP_ flag
>> meaning drop all locks to avoid creating new interfaces?
>>
>
> Off hand, I can't see any reason why that wouldn't work. The code to
> handle it would probably go in fs/nfsd/export.c:svc_export_parse().
>
>
Sign :( ... folks, we go back to the loop again. That *was* my first
proposal ...

-- Wendy

-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2007-04-27 16:31:35

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [Cluster-devel] [PATCH 0/4 Revised] NLM - lock failover

On Fri, Apr 27, 2007 at 11:36:16AM -0400, Wendy Cheng wrote:
> J. Bruce Fields wrote:
> >On Fri, Apr 27, 2007 at 03:17:10PM +0100, Christoph Hellwig wrote:
> >
> >>In fact couldn't this be treated as a reexport with a NFSEXP_ flag
> >>meaning drop all locks to avoid creating new interfaces?
> >>
> >
> >Off hand, I can't see any reason why that wouldn't work. The code to
> >handle it would probably go in fs/nfsd/export.c:svc_export_parse().
> >
> >
> Sign :( ... folks, we go back to the loop again. That *was* my first
> proposal ...

So you're talking about this and followups?:

http://marc.info/?l=linux-nfs&m=115009204513790&w=2

I just took a look and couldn't find any complaints about that
approach. Were they elsewhere?

I understand the frustration. There's a balance betweeen on the one
hand, being willing to throw out some hard work and start over if
someone comes up with a real objection, and, on the other hand, sticking
to a design when you're convinced it's right.

I *really* appreciate good review, but I also try to avoid doing
something I don't like just because it seems to be the only way to make
somebody else happy.... If they've got a real point then I should be
able to understand it. If not, then I risk doing all the work to make
them happy just to throw it away because I can't defend the approach in
the end, or because I find out I misunderstood their original point.

(Then again, sometimes I do just have to trust somebody. And sometimes
I guess learning who can be trusted about what is part of the process.)

In this case I think the complaint about requiring fsid's on everything
is legimate, and the original approach of using the export was sensible.
But I haven't been paying as much attention as I should have, and I
probably missed something.

--b.

-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2007-04-27 20:34:45

by Frank van Maarseveen

[permalink] [raw]
Subject: Re: [Cluster-devel] [PATCH 0/4 Revised] NLM - lock failover

On Fri, Apr 27, 2007 at 11:36:16AM -0400, Wendy Cheng wrote:
> J. Bruce Fields wrote:
> > On Fri, Apr 27, 2007 at 03:17:10PM +0100, Christoph Hellwig wrote:
> >
> >> In fact couldn't this be treated as a reexport with a NFSEXP_ flag
> >> meaning drop all locks to avoid creating new interfaces?
> >>
> >
> > Off hand, I can't see any reason why that wouldn't work. The code to
> > handle it would probably go in fs/nfsd/export.c:svc_export_parse().
> >
> >
> Sign :( ... folks, we go back to the loop again. That *was* my first
> proposal ...

I'm quite interested in _any_ patch which would allow me to drop
the locks obtained by NFS clients on a specific export, either by (1)
"exportfs -uh" or by (2) "echo /some/path > /proc/fs/nfsd/nlm_drop_lock"
as Neil mentioned.
I want to migrate virtual(*) NFS servers _including_ the locks without
having to tear down the whole machine. In my case "migration" is a sort
of scheduled failover: no HA or clusters involved.

At first, the "exportfs -uh" proposal (maybe fsid driven) seems "the
right thing" because after unexporting there's no valid case to preserve
the locks AFAICS. Unexport implies EACCES for subsequent NFS accesses
anyway and unexporting /cdrom for example is _required_ in order to be
able to umount and eject the thing. As it stands today, unexporting is
not even enough to be able to unmount it and that's not good. (the need
to having to unexport a /cdrom before being able to eject it is actually
a problem on its own -- a separate issue).

So why not drop the locks always upon unexport? Stupid question of course
because exporting anything will not send out any sm notifications so
that would break symmetry.


I'd prefer (2) "echo /some/path > /proc/fs/nfsd/nlm_drop_lock" because:

- Tying the -h (drop locks) option to -u (unexport) is too restrictive IMO.
For one thing, there's a bug in the linux NFS client locking code (I
reported this earlier) which results in locks not being removed from
the server. It was not too difficult to reproduce and programs on the
client will wait forever due to this. To handle these kind of situations
I need (2) on the server.

- (2) may be useful for other NFS server setups: it is inherently more
flexible.

- (2) does not depend on nfs-utils. It's simpler.


(*) virtual in this case means a UUID or IP based fsid= option and an
additional IP address on eth0 per export entry, such, that it becomes
possible to move an export entry + disk partition + local mount to
different hardware without needing to remount it on all <large number>
NFS clients.

--
Frank

-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2007-04-28 03:45:07

by Wendy Cheng

[permalink] [raw]
Subject: Re: [Cluster-devel] [PATCH 0/4 Revised] NLM - lock failover

Frank van Maarseveen wrote:

>I'm quite interested in _any_ patch which would allow me to drop
>the locks obtained by NFS clients on a specific export, either by (1)
>"exportfs -uh" or by (2) "echo /some/path > /proc/fs/nfsd/nlm_drop_lock"
>as Neil mentioned.
>
>
Thanks for commenting on this. Opinions from users who will eventually
use these interfaces are always valued.

>[snip]
>
>
>I'd prefer (2) "echo /some/path > /proc/fs/nfsd/nlm_drop_lock" because:
>
>
To convert the first patch of this submitted series from "fsid" to
"/some/path" is a no-brainer, since we had gone thru several rounds of
similar changes. However, my questions (it is more of a Neil's question)
are, if I convert the first patch to do this,

1) then why do we still need the RPC drop-lock call in nfs-util ?
2) what should we do for the 2nd patch ? i.e., how do we communicate
with the take-over server it is time for its action, by RPC call or by
"echo /some/path > /proc/fs/nfsd/nlm_set_grace_or_whatever" ?

In general, I feel if we do this "/some/path" approach, we may as well
simply convert the 2nd patch from "fsid" to "/some/path". Then we would
finish this long journey.

-- Wendy

>- Tying the -h (drop locks) option to -u (unexport) is too restrictive IMO.
> For one thing, there's a bug in the linux NFS client locking code (I
> reported this earlier) which results in locks not being removed from
> the server. It was not too difficult to reproduce and programs on the
> client will wait forever due to this. To handle these kind of situations
> I need (2) on the server.
>
>- (2) may be useful for other NFS server setups: it is inherently more
> flexible.
>
>- (2) does not depend on nfs-utils. It's simpler.
>
>
>(*) virtual in this case means a UUID or IP based fsid= option and an
>additional IP address on eth0 per export entry, such, that it becomes
>possible to move an export entry + disk partition + local mount to
>different hardware without needing to remount it on all <large number>
>NFS clients.
>
>
>


-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs