2007-04-29 23:11:18

by NeilBrown

Subject: Re: [Cluster-devel] [PATCH 0/4 Revised] NLM - lock failover

On Sunday April 29, [email protected] wrote:
> On Sat, Apr 28, 2007 at 08:22:55AM +1000, Neil Brown wrote:
> > A flag to unexport cannot work because we don't call unexport - we
> > just flush a kernel cache.
> >
> > A flag to export is just .... weird. All the other export flags are
> > state flags. This would be an action flag. They are quite different
> > things. Setting a state flag again is a no-op. Setting an action
> > flag again has a very real effect.
>
> In this case the second set shouldn't have any effect--whatever flag is
> set should prevent further locks from being accepted, shouldn't it? (If
> it matters.)

Yes, I guess a "No locks are allowed against this export" makes more
sense than "Remove all locks on this export now".
Though currently the locks are against the filesystem - the export can
disappear from the cache while the locks remain - so it's a long way
from perfect. Possibly we could insist that the export remains in the
kernel while files are locked .... but we update export flags by
replacing the export, so that would be a little awkward.

Also, I think I was half-thinking about the "reset the grace period"
operation, and that looks a lot like an action.... unless you make it
grace_period_ends=seconds-since-epoch.

That might work.
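
To illustrate the state-versus-action distinction, here is a minimal sketch. All names and the 90-second length are hypothetical and not taken from lockd or nfsd; the only point is that re-applying an absolute end time is a no-op, while re-issuing a "reset" restarts the timer.

```python
GRACE_SECONDS = 90  # hypothetical grace-period length, not from this thread


def reset_grace_action(now: int) -> int:
    """Action-style request: each call restarts the timer from `now`."""
    return now + GRACE_SECONDS


def set_grace_state(requested_end: int) -> int:
    """State-style flag: simply stores the absolute end time it is given."""
    return requested_end


def in_grace(grace_period_ends: int, now: int) -> bool:
    """True while the grace period has not yet ended."""
    return now < grace_period_ends


# Repeating the action 30 seconds later moves the end time.
first = reset_grace_action(now=1000)
second = reset_grace_action(now=1030)

# Repeating the state flag with the same value changes nothing.
end_a = set_grace_state(first)
end_b = set_grace_state(first)
```

Here `first` and `second` differ, whereas `end_a` and `end_b` are identical, which is why the seconds-since-epoch form behaves like the other export state flags.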

>
> > Also, each filesystem is potentially exported multiple times for
> > different sets of clients. If such a flag (whether on 'export' or
> > 'unexport') just said "remove locks from this set of clients" it
> > wouldn't meet the needs, and if it said "remove all locks" it would be
> > a very irregular interface.
>
> The same could be said of the "fsid=" option on exports. It doesn't
> make sense to provide different filehandle- or path- name spaces
> depending on the IP address of a client. If my laptop changes IP
> address, then I can (grudgingly) accept the fact that the server may
> have to deny me access that I had before--maybe it just can't trust the
> network I moved to for whatever reason--but I'd really rather it didn't
> suddenly start giving me paths, or different filehandles, or different
> semantics (like sync vs. async).
>
> So the export interface is already being used for stuff that's really
> intended to be per-filesystem rather than per-(filesystem, client) pair.

ro/rw is often different based on client address, but yes: a lot of
the flags don't really make sense being different for different
clients on the same filesystem.

My feeling was that the "nolocks" flag is essentially pointless unless
it is the same for all exports on the one filesystem, and that gives
it a very different feel.

To make use of such a flag you could not rely on the normal mechanism
for loading flag information: on-demand loading by mountd.
You would need to look through /proc/fs/nfsd/exports, find all the
current exports for the filesystem, tell the kernel to change each
export to have the "nolocks" flag. And then when you have done all of
that, you want to immediately remove all those export entries so you
can unmount the filesystem.
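
The first step of that procedure, finding every current export of one filesystem, might look roughly like the sketch below. It assumes the listing has one "path client(flags)" entry per line with "#" comment lines, and it matches on the exported path only, which is a simplification. The sample text and the function name are made up for illustration.

```python
def exports_for_path(exports_text: str, fs_path: str) -> list[tuple[str, list[str]]]:
    """Return (client, flags) for every export entry whose path is fs_path."""
    entries = []
    for line in exports_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split()
        if len(fields) < 2:
            continue  # not a "path client(flags)" entry
        path, client_spec = fields[0], fields[1]
        client, sep, rest = client_spec.partition("(")
        if not sep:
            continue  # no flag list present
        flags = rest.rstrip(")").split(",")
        if path == fs_path:
            entries.append((client, flags))
    return entries


# Made-up sample in the assumed format.
SAMPLE = """\
# Path Client(Flags)
/srv/data\t192.168.1.0/24(rw,sync)
/srv/data\t*.example.com(ro,sync)
/srv/other\t*(ro)
"""

matches = exports_for_path(SAMPLE, "/srv/data")
```

Each returned entry would then have to be rewritten with "nolocks" added and handed back to the kernel one at a time, which is the part that makes the approach feel unclean.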

So while it could be made to work, it doesn't feel clean at all.

A grace_period_ends=seconds-since-epoch flag would not have most of
those problems. e.g. it could be demand loaded.
But there is the risk that it might be set for some exports on a given
filesystem and not for others. And the consequence of that is that
some clients might not be able to reclaim their locks (because the
lock has already been given to a client which didn't know about the
new grace period).
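
A small sketch of that risk, under the assumption that each export entry carries its own optional end time (the dictionary layout and the function name are hypothetical): any export without the flag would hand out new locks while others are still waiting for reclaims.

```python
from typing import Optional


def exports_outside_grace(grace_ends_by_client: dict[str, Optional[int]],
                          now: int) -> list[str]:
    """List clients whose export is NOT in grace while some other export is.

    A value of None means the flag was never set on that export entry.
    """
    in_grace = {
        client
        for client, end in grace_ends_by_client.items()
        if end is not None and now < end
    }
    if not in_grace:
        return []  # nobody is in grace, so there is nothing to conflict with
    return sorted(c for c in grace_ends_by_client if c not in in_grace)


# Two exports of the same filesystem; only one has the flag set.
exports = {"clientA": 2000, "clientB": None}
conflicting_early = exports_outside_grace(exports, now=1500)
conflicting_late = exports_outside_grace(exports, now=2500)
```

At time 1500 the result names clientB, the export that could grant a lock clientA still expects to reclaim; after 2000 the list is empty.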

Now maybe it would be good to have a bunch of nfsd options that are
explicitly per-filesystem rather than per-export.
Maybe that is the sort of interface we should be designing.
echo "+nolocks /path/to/filesystem" > /proc/fs/nfsd/filesystem_settings
echo "grace_end=12345678 /path/to/filesystem" > /proc/....
echo "-write_gather /path" > .....


We would need to be clear on how long those settings remain in the
kernel, how it can be told to completely forget a particular
filesystem, etc.

But we probably don't need to go overboard straight away.
I like the interface:
echo -n "flag flag .. /path/name" > /proc/fs/nfsd/filesystem_settings

where if flags is "?flag", then the value is returned by a subsequent
read on the same file-descriptor.
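
For concreteness, a parser for one such line could be sketched as follows. This is only an illustration of the proposed syntax, not existing nfsd code: it assumes the path is the last whitespace-separated token, that "+flag" or a bare "flag" sets, "-flag" clears, "name=value" assigns, and "?flag" queries. The class and function names are invented.

```python
from dataclasses import dataclass, field


@dataclass
class FsSettingsRequest:
    path: str
    set_flags: list[str] = field(default_factory=list)
    clear_flags: list[str] = field(default_factory=list)
    values: dict[str, str] = field(default_factory=dict)
    queries: list[str] = field(default_factory=list)


def parse_settings_line(line: str) -> FsSettingsRequest:
    """Parse 'flag flag .. /path/name' into a structured request."""
    tokens = line.split()
    if len(tokens) < 2 or not tokens[-1].startswith("/"):
        raise ValueError("expected: flag [flag ...] /absolute/path")
    req = FsSettingsRequest(path=tokens[-1])
    for tok in tokens[:-1]:
        if tok.startswith("?"):
            req.queries.append(tok[1:])       # value read back later
        elif tok.startswith("+"):
            req.set_flags.append(tok[1:])
        elif tok.startswith("-"):
            req.clear_flags.append(tok[1:])
        elif "=" in tok:
            name, _, value = tok.partition("=")
            req.values[name] = value
        else:
            req.set_flags.append(tok)         # bare flag treated as "set"
    return req


req = parse_settings_line("+nolocks grace_end=12345678 ?write_gather /path/to/filesystem")
```

Paths containing spaces are not handled here; a real interface would need an escaping rule for them.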

At this point we only need "nolocks" and "grace_end".
The grace_end information persists until that point in time.
The "nolocks" information .... doesn't persist(?).

NeilBrown

-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2007-04-30 05:09:11

by Wendy Cheng

Subject: Re: [Cluster-devel] [PATCH 0/4 Revised] NLM - lock failover

Neil Brown wrote:

>But we probably don't need to go over-board straight away.
>I like the interface:
> echo -n "flag flag .. /path/name" > /proc/fs/nfsd/filesystem_settings
>
>where if flags is "?flag", then the value is returned by a subsequent
>read on the same file-descriptor.
>
Will do a quick prototype to see whether this would work as well as it
appears. I haven't given up on the RPC call (into lockd) either, since it
seems to be a bright idea.

-- Wendy

2007-05-04 18:42:23

by J. Bruce Fields

Subject: Re: [Cluster-devel] [PATCH 0/4 Revised] NLM - lock failover

On Mon, Apr 30, 2007 at 09:10:38AM +1000, Neil Brown wrote:
> where if flags is "?flag", then the value is returned by a subsequent
> read on the same file-descriptor.

The ?flag thing seems a little awkward. It'd be nice if we could get
all the flags for a single filesystem just by cat'ing an appropriate
file.

--b.

2007-05-04 21:25:04

by Wendy Cheng

Subject: Re: [Cluster-devel] [PATCH 0/4 Revised] NLM - lock failover

J. Bruce Fields wrote:

>On Mon, Apr 30, 2007 at 09:10:38AM +1000, Neil Brown wrote:
>
>>where if flags is "?flag", then the value is returned by a subsequent
>>read on the same file-descriptor.
>
>The ?flag thing seems a little awkward. It'd be nice if we could get
>all the flags for a single filesystem just by cat'ing an appropriate
>file.
>
>--b.
>
OK, makes sense ... Wendy
