2006-12-06 05:34:10

by J. Bruce Fields

Subject: asynchronous locks for cluster exports

We'd like an asynchronous posix locking interface so that we can provide NFS
clients with cluster-coherent locking without blocking lockd while the
filesystem goes off and talks to other nodes.

So, here's one attempt. It also includes a draft implementation of the
filesystem side for GFS2, which may well be wrong--it's not even tested
yet--but hopefully gives some idea of what would be necessary.

A few points, among others, that I'm unsure of:

- We added a new ->lock() export operation, figuring this was a feature
that only lockd and nfsd care about for now, and that we'd rather not
muck about with common locking code. But the export operation is
pretty much identical to the file ->lock() operation; would it make
more sense to use that?

- The filesystem returns the lock results to lockd using the
->fl_notify() callback. We add a few arguments to fl_notify() to
pass the results, and add a return value so the filesystem can
recognize the case where the callback comes after lockd has given up
waiting and returned an error to the user. Presumably the filesystem
needs to have a way to cancel the lock in this case. (Our GFS code
ignores this problem for now.) Maybe it would be better to just poke
lockd when the result is ready and let it discover what happened by
retrying the original ->lock() call? Or maybe we should use a
separate callback?

- We're ignoring the blocking lock case for now under the assumption
it's always OK for lockd to return an immediate "denied" in that
case, then use the granted callback, even in cases where it doesn't
know for sure that there's a conflicting lock.

Thoughts? Better ideas?

--b.


2006-12-14 22:51:37

by J. Bruce Fields

Subject: Re: asynchronous locks for cluster exports

On Thu, Dec 07, 2006 at 04:51:08PM +0000, Christoph Hellwig wrote:
> On Wed, Dec 06, 2006 at 12:34:10AM -0500, J. Bruce Fields wrote:
> > We'd like an asynchronous posix locking interface so that we can provide NFS
> > - We added a new ->lock() export operation, figuring this was a feature
> > that only lockd and nfsd care about for now, and that we'd rather not
> > muck about with common locking code. But the export operation is
> > pretty much identical to the file ->lock() operation; would it make
> > more sense to use that?
>
> This definitely needs to be merged back into the ->lock file operation

So the interesting question is whether we can merge the semantics in a
reasonable way. The export operation implemented by the current version
of these patches returns more or less immediately with success, -EAGAIN,
or -EINPROGRESS; in the latter case the filesystem later calls fl_notify
to communicate the results. The existing file lock operation normally
blocks until the lock is actually granted.

The one file lock operation could do both, and just switch between the
two cases depending on whether fl_notify is defined. Would the
semantics be clear enough?
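The merged semantics proposed here can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: lock_req_model, model_lock, model_complete, and record_result are hypothetical names chosen for illustration, not kernel interfaces, and the synchronous "block until granted" path is reduced to an immediate return.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace model of one lock request; not a kernel struct. */
struct lock_req_model {
    int conflict;       /* 1 if a conflicting lock is already known locally */
    int needs_cluster;  /* 1 if the filesystem must consult other nodes */
    /* Stands in for fl_notify: NULL means the caller wants the old
     * synchronous behaviour. */
    void (*notify)(struct lock_req_model *req, int result);
    int result;         /* filled in by the callback */
    int done;           /* set once the callback has run */
};

/* One ->lock()-like entry point that switches on whether notify is set. */
int model_lock(struct lock_req_model *req)
{
    if (req->conflict)
        return -EAGAIN;
    if (req->needs_cluster && req->notify)
        return -EINPROGRESS;  /* result arrives later through notify */
    /* Granted at once, or (synchronous caller) after waiting for the
     * cluster; the wait itself is not modelled here. */
    return 0;
}

/* What the filesystem would run once the other nodes have answered. */
void model_complete(struct lock_req_model *req, int result)
{
    if (req->notify)
        req->notify(req, result);
}

/* Example callback: just record the outcome in the request. */
void record_result(struct lock_req_model *req, int result)
{
    req->result = result;
    req->done = 1;
}
```

A caller that sets notify must be prepared for all three return values; a caller that leaves it NULL only ever sees 0 or an error, which is the ambiguity the question above is about.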

I find the existing use of ->lock() a little odd as it is; stuff like this,
from fcntl_getlk():

	if (filp->f_op && filp->f_op->lock) {
		error = filp->f_op->lock(filp, F_GETLK, &file_lock);
		if (file_lock.fl_ops && file_lock.fl_ops->fl_release_private)
			file_lock.fl_ops->fl_release_private(&file_lock);
		if (error < 0)
			goto out;
		else
			fl = (file_lock.fl_type == F_UNLCK ? NULL : &file_lock);
	} else {
		fl = (posix_test_lock(filp, &file_lock, &cfl) ? &cfl : NULL);
	}

--b.

-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys - and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2006-12-15 19:51:51

by J. Bruce Fields

Subject: Re: asynchronous locks for cluster exports

On Wed, Dec 06, 2006 at 12:34:10AM -0500, J. Bruce Fields wrote:
> We'd like an asynchronous posix locking interface so that we can provide NFS
> clients with cluster-coherent locking without blocking lockd while the
> filesystem goes off and talks to other nodes.
>
> So, here's one attempt. It also includes a draft implementation of the
> filesystem side for GFS2, which may well be wrong--it's not even tested
> yet--but hopefully gives some idea of what would be necessary.

By the way, I'm keeping updated versions of this patch series in the
server-cluster-locking-api branch of the repository at
git://linux-nfs.org/~bfields/linux.git.

So if you already have a git tree sitting around (cloned with something
like "git clone
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git"),
then you should be able to get this with

git fetch git://linux-nfs.org/~bfields/linux.git +server-cluster-locking-api:server-cluster-locking-api
git checkout server-cluster-locking-api

Alternatively you can browse the patches without git from

http://linux-nfs.org/cgi-bin/gitweb.cgi?p=bfields-2.6.git;a=shortlog;h=server-cluster-locking-api

--b.

2006-12-07 15:23:59

by J. Bruce Fields

Subject: Re: [PATCH 10/10] gfs2: nfs lock support for gfs2

On Wed, Dec 06, 2006 at 10:47:46PM -0800, Marc Eshel wrote:
> Here is a rewrite of gdlm_plock_callback(). We still need to add the
> lock cancel.
> Marc.
>
> int gdlm_plock_callback(struct plock_op *op)
> {
> 	struct file *file;
> 	struct file_lock *fl;
> 	int (*notify)(void *, void *, int) = NULL;
> 	int rv;
>
> 	spin_lock(&ops_lock);
> 	if (!list_empty(&op->list)) {
> 		printk(KERN_INFO "plock op on list\n");
> 		list_del(&op->list);
> 	}
> 	spin_unlock(&ops_lock);
>
> 	rv = op->info.rv;
>
> 	/* check if the following 2 are still valid or make a copy */
> 	file = op->info.file;
> 	fl = op->info.fl;
> 	notify = op->info.callback;
>
> 	if (!rv) { /* got fs lock */
> 		rv = posix_lock_file(file, fl);
> 		if (rv) { /* did not get posix lock */

If we never request the local lock until after we've gotten the lock
from GFS, then this should never happen. So I think this could just be
a BUG_ON(rv)--except that would mean a failure in the lock manager could
oops the kernel, so maybe it'd be better just to printk.

--b.

2006-12-07 15:36:25

by David Teigland

Subject: Re: [PATCH 10/10] gfs2: nfs lock support for gfs2

On Wed, Dec 06, 2006 at 05:00:29PM -0500, J. Bruce Fields wrote:
> On Wed, Dec 06, 2006 at 03:42:31PM -0600, David Teigland wrote:
> > Oh yeah, that's painful, I knew it sounded too easy.
>
> Yeah. Well, we could try to teach GFS2 to reliably cancel posix locks.
> I think that may turn out to be necessary some day anyway.

Some posix locks would be trivial to cancel and others would be hard. If
gfs_controld has not yet read the op from the kernel's send_list, then we
just remove the op and it never "goes out". After gfs_controld has taken
it and sent it, then it's had its effect and, as you reminded me, is
unreversible without introducing new complexity (like the provisional
locks which sound unpleasant).

In practice, I don't know how likely we are to find ops that haven't been
sent yet--the easy ones to cancel.
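The easy/hard split described above can be sketched as a small userspace model. Everything here is an assumption made for illustration: plock_op_model, send_list_model, and the model_* functions are hypothetical and only mimic the idea of a kernel send list that a daemon drains; they are not the GFS2 or gfs_controld code.

```c
#include <errno.h>
#include <stddef.h>

enum op_state_model { OP_QUEUED, OP_SENT, OP_CANCELLED };

/* Hypothetical stand-in for a pending posix lock op. */
struct plock_op_model {
    enum op_state_model state;
    struct plock_op_model *next;
};

/* Hypothetical stand-in for the kernel's send list. */
struct send_list_model {
    struct plock_op_model *head;
};

/* Append an op to the tail of the send list. */
void model_enqueue(struct send_list_model *l, struct plock_op_model *op)
{
    struct plock_op_model **pp = &l->head;

    while (*pp)
        pp = &(*pp)->next;
    op->state = OP_QUEUED;
    op->next = NULL;
    *pp = op;
}

/* The daemon reads the oldest op; from here on it has "gone out". */
struct plock_op_model *model_daemon_read(struct send_list_model *l)
{
    struct plock_op_model *op = l->head;

    if (op) {
        l->head = op->next;
        op->next = NULL;
        op->state = OP_SENT;
    }
    return op;
}

/* Cancel succeeds only while the op is still on the send list. */
int model_cancel(struct send_list_model *l, struct plock_op_model *op)
{
    struct plock_op_model **pp;

    for (pp = &l->head; *pp; pp = &(*pp)->next) {
        if (*pp == op) {
            *pp = op->next;       /* unlink: it never goes out */
            op->next = NULL;
            op->state = OP_CANCELLED;
            return 0;
        }
    }
    return -EBUSY;                /* already taken by the daemon */
}
```

In this model the -EBUSY case is exactly the hard one: the op has already had its effect, and undoing it would need something like the provisional locks mentioned above.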


> Or we could look at why we're timing out and figure out whether there's
> something else we should be doing instead in that case. In what
> situations is the GFS2 lock call likely to take overly long?

Again, in practice, I really don't know how long a sent lock could be
delayed. When everything is running normally the only delay is between
sending the message (through the openais comms api) and receiving it back
again (which is when we grant it). So, for us it's completely dependent on
how long the delivery of messages could be delayed by openais due to
openais dealing with configuration changes in the cluster.

Dave



2006-12-07 15:43:48

by Marc Eshel

Subject: Re: [NFS] [PATCH 10/10] gfs2: nfs lock support for gfs2

[email protected] wrote on 12/07/2006 07:23:59 AM:

> On Wed, Dec 06, 2006 at 10:47:46PM -0800, Marc Eshel wrote:
> > Here is a rewrite of gdlm_plock_callback(). We still need to add the
> > lock cancel.
> > Marc.
> >
> > int gdlm_plock_callback(struct plock_op *op)
> > {
> > 	struct file *file;
> > 	struct file_lock *fl;
> > 	int (*notify)(void *, void *, int) = NULL;
> > 	int rv;
> >
> > 	spin_lock(&ops_lock);
> > 	if (!list_empty(&op->list)) {
> > 		printk(KERN_INFO "plock op on list\n");
> > 		list_del(&op->list);
> > 	}
> > 	spin_unlock(&ops_lock);
> >
> > 	rv = op->info.rv;
> >
> > 	/* check if the following 2 are still valid or make a copy */
> > 	file = op->info.file;
> > 	fl = op->info.fl;
> > 	notify = op->info.callback;
> >
> > 	if (!rv) { /* got fs lock */
> > 		rv = posix_lock_file(file, fl);
> > 		if (rv) { /* did not get posix lock */
>
> If we never request the local lock until after we've gotten the lock
> from GFS, then this should never happen. So I think this could just be
> a BUG_ON(rv)--except that would mean a failure in the lock manager could
> oops the kernel, so maybe it'd be better just to printk.

It can happen if you can not allocate memory.
Marc.

>
> --b.
>
>


2006-12-07 16:21:19

by J. Bruce Fields

Subject: Re: [PATCH 10/10] gfs2: nfs lock support for gfs2

On Thu, Dec 07, 2006 at 07:43:48AM -0800, Marc Eshel wrote:
> [email protected] wrote on 12/07/2006 07:23:59 AM:
>
> > On Wed, Dec 06, 2006 at 10:47:46PM -0800, Marc Eshel wrote:
> > > 	if (!rv) { /* got fs lock */
> > > 		rv = posix_lock_file(file, fl);
> > > 		if (rv) { /* did not get posix lock */
> >
> > If we never request the local lock until after we've gotten the lock
> > from GFS, then this should never happen. So I think this could just be
> > a BUG_ON(rv)--except that would mean a failure in the lock manager could
> > oops the kernel, so maybe it'd be better just to printk.
>
> It can happen if you can not allocate memory.

OK, you're right.

Hm, the NFS client just seems to print out a warning at this point,
though.

--b.


2006-12-07 16:51:19

by Christoph Hellwig

Subject: Re: asynchronous locks for cluster exports

On Wed, Dec 06, 2006 at 12:34:10AM -0500, J. Bruce Fields wrote:
> We'd like an asynchronous posix locking interface so that we can provide NFS
> - We added a new ->lock() export operation, figuring this was a feature
> that only lockd and nfsd care about for now, and that we'd rather not
> muck about with common locking code. But the export operation is
> pretty much identical to the file ->lock() operation; would it make
> more sense to use that?

This definitely needs to be merged back into the ->lock file operation.


2006-12-07 18:52:24

by Trond Myklebust

Subject: Re: [NFS] [PATCH 10/10] gfs2: nfs lock support for gfs2

On Thu, 2006-12-07 at 11:21 -0500, J. Bruce Fields wrote:
> On Thu, Dec 07, 2006 at 07:43:48AM -0800, Marc Eshel wrote:
> > [email protected] wrote on 12/07/2006 07:23:59 AM:
> >
> > > On Wed, Dec 06, 2006 at 10:47:46PM -0800, Marc Eshel wrote:
> > > > 	if (!rv) { /* got fs lock */
> > > > 		rv = posix_lock_file(file, fl);
> > > > 		if (rv) { /* did not get posix lock */
> > >
> > > If we never request the local lock until after we've gotten the lock
> > > from GFS, then this should never happen. So I think this could just be
> > > a BUG_ON(rv)--except that would mean a failure in the lock manager could
> > > oops the kernel, so maybe it'd be better just to printk.
> >
> > It can happen if you can not allocate memory.
>
> OK, you're right.
>
> Hm, the NFS client just seems to print out a warning at this point,
> though.

Feel free to suggest alternatives. If you cannot even allocate the
memory necessary to add a struct file_lock, then how can you expect to
find enough resources to be able to marshall up an RPC call?

Trond


2006-12-08 17:35:54

by J. Bruce Fields

Subject: Re: [NFS] [PATCH 10/10] gfs2: nfs lock support for gfs2

On Thu, Dec 07, 2006 at 09:30:43AM -0600, David Teigland wrote:
> Some posix locks would be trivial to cancel and others would be hard. If
> gfs_controld has not yet read the op from the kernel's send_list, then we
> just remove the op and it never "goes out". After gfs_controld has taken
> it and sent it, then it's had its effect and, as you reminded me, is
> unreversible without introducing new complexity (like the provisional
> locks which sound unpleasant).

Something like that may be necessary in the end anyway. The background:

NFSv4 doesn't have lock grant callbacks. It only has a single lock
call. You can set a "blocking" bit in that call, but all this does is
tell the server "I intend to continue polling for this lock, so if it
becomes available, could you hold it a little while to give me a chance
at it before anyone else?" That's just a hint, not a requirement. But
if we don't take the hint, locking between clients with different
polling strategies, or between remote clients and local users, may be
dramatically unfair.

The client also isn't required to poll for the lock again--it may stop
without notice at any time. So we'd like to have a way to hold a lock
on behalf of a client, without losing the ability to cancel the lock if
the client goes away. Hence the provisional locks.

For now, maybe we could settle for a solution that at least handles
exclusive whole-file locking (for which an unlock is an adequate
cancel), and leaves open the possibility of a more complete solution
later.
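To make the "unlock is an adequate cancel" case concrete, here is a minimal userspace sketch using the ordinary POSIX fcntl() interface. It assumes the owner held no other locks on the file beforehand, so that dropping the whole-file lock restores the previous state exactly; take_whole_file_lock and cancel_whole_file_lock are hypothetical helper names, not part of the patch series.

```c
#include <fcntl.h>
#include <unistd.h>

/* Describe a lock covering the whole file: l_start = 0 with l_len = 0
 * means "from offset 0 to the end of the file, however large it grows". */
struct flock whole_file(short type)
{
    struct flock fl = {0};

    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;
    return fl;
}

/* Non-blocking attempt at an exclusive whole-file lock.
 * Returns 0 on success, -1 with errno set on failure. */
int take_whole_file_lock(int fd)
{
    struct flock fl = whole_file(F_WRLCK);

    return fcntl(fd, F_SETLK, &fl);
}

/* "Cancel" by unlocking the same range. With no earlier locks held by
 * this owner there is nothing to merge or split, so this undoes the
 * request completely. */
int cancel_whole_file_lock(int fd)
{
    struct flock fl = whole_file(F_UNLCK);

    return fcntl(fd, F_SETLK, &fl);
}
```

For byte-range or shared locks the same trick is not safe in general, since an unlock could also drop or split locks the owner held before the request.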

--b.

2006-12-09 05:45:54

by Wendy Cheng

Subject: Re: [PATCH 1/10] lockd: add new export operation for nfsv4/lockd locking

J. Bruce Fields wrote:

>From: Marc Eshel <[email protected]>
>
>There is currently a filesystem ->lock() method, but it is defined only by
>a few filesystems that are not exported via nfsd. So none of the lock
>routines that are used by lockd or nfsv4 bother to call those methods.
>
>Filesystems such as cluster filesystems would like to do their own locking
>and also would like to be exportable via NFS.
>
>So we add a new lock() export operation, and new routines vfs_lock_file,
>vfs_test_lock, and vfs_cancel_lock, which call the new export operation,
>falling back on the appropriate local operation if the export operation is
>unavailable.
>
>These new functions are intended to be used by lockd and nfsd; lockd and
>nfsd changes to take advantage of them are made by later patches.
>
>
>
Starting to read these patches... I see these new APIs are placed in the
lockd directory. Was there any earlier discussion about making these
generic vfs layer functions? I'm just curious.

-- Wendy


2006-12-10 18:31:06

by J. Bruce Fields

Subject: Re: [PATCH 1/10] lockd: add new export operation for nfsv4/lockd locking

On Sat, Dec 09, 2006 at 12:53:22AM -0500, Wendy Cheng wrote:
> Starting to read these patches... I see these new APIs are placed in the
> lockd directory. Was there any earlier discussion about making these
> generic vfs layer functions? I'm just curious.

Yeah--see the introductory email. The new stuff is used only by lockd
and nfsd, and there was an understandable desire not to touch the common
locking code, but I agree that it'd be more elegant to share a common
lock method. So that's probably what we'll do next time around....

--b.


2006-12-06 05:34:16

by J. Bruce Fields

Subject: [PATCH 6/10] lockd: pass cookie in nlmsvc_testlock

From: Marc Eshel <[email protected]>

Change NLM internal interface to pass more information for test lock; we
need this to make sure the cookie information is pushed down to the place
where we do request deferral, which is handled for testlock by the
following patch.

Signed-off-by: Marc Eshel <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/lockd/svc4proc.c | 2 +-
fs/lockd/svclock.c | 5 +++--
fs/lockd/svcproc.c | 2 +-
include/linux/lockd/lockd.h | 4 ++--
4 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
index 0ce5c81..eb5994e 100644
--- a/fs/lockd/svc4proc.c
+++ b/fs/lockd/svc4proc.c
@@ -99,7 +99,7 @@ nlm4svc_proc_test(struct svc_rqst *rqstp, struct nlm_args *argp,
return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;

/* Now check for conflicting locks */
- resp->status = nlmsvc_testlock(file, &argp->lock, &resp->lock);
+ resp->status = nlmsvc_testlock(rqstp, file, &argp->lock, &resp->lock, &resp->cookie);

dprintk("lockd: TEST4 status %d\n", ntohl(resp->status));
nlm_release_host(host);
diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index 90aa4a5..e342046 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -549,8 +549,9 @@ out:
* Test for presence of a conflicting lock.
*/
__be32
-nlmsvc_testlock(struct nlm_file *file, struct nlm_lock *lock,
- struct nlm_lock *conflock)
+nlmsvc_testlock(struct svc_rqst *rqstp, struct nlm_file *file,
+ struct nlm_lock *lock, struct nlm_lock *conflock,
+ struct nlm_cookie *cookie)
{
dprintk("lockd: nlmsvc_testlock(%s/%ld, ty=%d, %Ld-%Ld)\n",
file->f_file->f_dentry->d_inode->i_sb->s_id,
diff --git a/fs/lockd/svcproc.c b/fs/lockd/svcproc.c
index 32e99a6..d4257a5 100644
--- a/fs/lockd/svcproc.c
+++ b/fs/lockd/svcproc.c
@@ -127,7 +127,7 @@ nlmsvc_proc_test(struct svc_rqst *rqstp, struct nlm_args *argp,
return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;

/* Now check for conflicting locks */
- resp->status = cast_status(nlmsvc_testlock(file, &argp->lock, &resp->lock));
+ resp->status = cast_status(nlmsvc_testlock(rqstp, file, &argp->lock, &resp->lock, &resp->cookie));

dprintk("lockd: TEST status %d vers %d\n",
ntohl(resp->status), rqstp->rq_vers);
diff --git a/include/linux/lockd/lockd.h b/include/linux/lockd/lockd.h
index ac865f8..863bc23 100644
--- a/include/linux/lockd/lockd.h
+++ b/include/linux/lockd/lockd.h
@@ -197,8 +197,8 @@ typedef int (*nlm_host_match_fn_t)(struct nlm_host *cur, struct nlm_host *ref)
__be32 nlmsvc_lock(struct svc_rqst *, struct nlm_file *,
struct nlm_lock *, int, struct nlm_cookie *);
__be32 nlmsvc_unlock(struct nlm_file *, struct nlm_lock *);
-__be32 nlmsvc_testlock(struct nlm_file *, struct nlm_lock *,
- struct nlm_lock *);
+__be32 nlmsvc_testlock(struct svc_rqst *, struct nlm_file *,
+ struct nlm_lock *, struct nlm_lock *, struct nlm_cookie *);
__be32 nlmsvc_cancel_blocked(struct nlm_file *, struct nlm_lock *);
unsigned long nlmsvc_retry_blocked(void);
void nlmsvc_traverse_blocks(struct nlm_host *, struct nlm_file *,
--
1.4.4.1


2006-12-06 05:34:18

by J. Bruce Fields

Subject: [PATCH 8/10] lockd: always preallocate block in nlmsvc_lock()

From: J. Bruce Fields <[email protected]>

Normally we could skip ever having to allocate a block in the case where
the client asks for a non-blocking lock, or asks for a blocking lock that
succeeds immediately.

However we're going to want to always look up a block first in order to
check whether we're revisiting a deferred lock call, and to be prepared to
handle the case where the filesystem returns -EINPROGRESS--in that case we
want to make sure the lock we've given the filesystem is the one embedded
in the block that we'll use to track the deferred request.

Signed-off-by: Marc Eshel <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/lockd/svclock.c | 34 +++++++++++-----------------------
1 files changed, 11 insertions(+), 23 deletions(-)

diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index 21185a7..134a40e 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -471,7 +471,7 @@ __be32
nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
struct nlm_lock *lock, int wait, struct nlm_cookie *cookie)
{
- struct nlm_block *block, *newblock = NULL;
+ struct nlm_block *block = NULL;
int error;
__be32 ret;

@@ -484,17 +484,20 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
wait);


- lock->fl.fl_flags &= ~FL_SLEEP;
-again:
/* Lock file against concurrent access */
mutex_lock(&file->f_mutex);
- /* Get existing block (in case client is busy-waiting) */
+ /* Get existing block (in case client is busy-waiting)
+ * or create new block
+ */
block = nlmsvc_lookup_block(file, lock);
if (block == NULL) {
- if (newblock != NULL)
- lock = &newblock->b_call->a_args.lock;
- } else
+ block = nlmsvc_create_block(rqstp, file, lock, cookie);
+ ret = nlm_lck_denied_nolocks;
+ if (block == NULL)
+ goto out;
lock = &block->b_call->a_args.lock;
+ } else
+ lock->fl.fl_flags &= ~FL_SLEEP;

error = posix_lock_file(file->f_file, &lock->fl);
lock->fl.fl_flags &= ~FL_SLEEP;
@@ -520,26 +523,11 @@ again:
goto out;

ret = nlm_lck_blocked;
- if (block != NULL)
- goto out;
-
- /* If we don't have a block, create and initialize it. Then
- * retry because we may have slept in kmalloc. */
- /* We have to release f_mutex as nlmsvc_create_block may try to
- * to claim it while doing host garbage collection */
- if (newblock == NULL) {
- mutex_unlock(&file->f_mutex);
- dprintk("lockd: blocking on this lock (allocating).\n");
- if (!(newblock = nlmsvc_create_block(rqstp, file, lock, cookie)))
- return nlm_lck_denied_nolocks;
- goto again;
- }

/* Append to list of blocked */
- nlmsvc_insert_block(newblock, NLM_NEVER);
+ nlmsvc_insert_block(block, NLM_NEVER);
out:
mutex_unlock(&file->f_mutex);
- nlmsvc_release_block(newblock);
nlmsvc_release_block(block);
dprintk("lockd: nlmsvc_lock returned %u\n", ret);
return ret;
--
1.4.4.1


2006-12-06 05:34:19

by J. Bruce Fields

Subject: [PATCH 9/10] lockd: add code to handle deferred lock requests

From: Marc Eshel <[email protected]>

Rewrite nlmsvc_lock() to use the asynchronous interface.

As with testlock, we answer nlm requests in nlmsvc_lock by first looking up
the block and then using the results we find in the block if B_QUEUED is
set, and calling vfs_lock_file() otherwise.

If this is a new lock request and we get an -EINPROGRESS return on a
non-blocking request, then we defer the request.

Also modify nlmsvc_unlock() to call the filesystem method if appropriate.

Signed-off-by: Marc Eshel <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/lockd/svclock.c | 40 ++++++++++++++++++++++++++++++++++------
fs/lockd/svcsubs.c | 2 +-
2 files changed, 35 insertions(+), 7 deletions(-)

diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index 134a40e..90bac90 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -499,17 +499,44 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
} else
lock->fl.fl_flags &= ~FL_SLEEP;

- error = posix_lock_file(file->f_file, &lock->fl);
- lock->fl.fl_flags &= ~FL_SLEEP;

- dprintk("lockd: posix_lock_file returned %d\n", error);
+ if (block->b_flags & B_QUEUED) {
+ dprintk("lockd: nlmsvc_lock deferred block %p flags %d\n",
+ block, block->b_flags);
+ if (block->b_granted) {
+ nlmsvc_unlink_block(block);
+ ret = nlm_granted;
+ goto out;
+ }
+ if (block->b_flags & B_TOO_LATE) {
+ nlmsvc_unlink_block(block);
+ ret = nlm_lck_denied;
+ goto out;
+ }
+ ret = nlm_drop_reply;
+ goto out;
+ }

+ if (!wait)
+ lock->fl.fl_flags &= ~FL_SLEEP;
+ error = vfs_lock_file(file->f_file, &lock->fl);
+ lock->fl.fl_flags &= ~FL_SLEEP;
+
+ dprintk("lockd: vfs_lock_file returned %d\n", error);
switch(error) {
case 0:
ret = nlm_granted;
goto out;
case -EAGAIN:
+ ret = nlm_lck_denied;
break;
+ case -EINPROGRESS:
+ if (wait)
+ break;
+ /* Filesystem lock operation is in progress
+ Add it to the queue waiting for callback */
+ ret = nlmsvc_defer_lock_rqst(rqstp, block);
+ goto out;
case -EDEADLK:
ret = nlm_deadlock;
goto out;
@@ -623,7 +650,7 @@ nlmsvc_unlock(struct nlm_file *file, struct nlm_lock *lock)
nlmsvc_cancel_blocked(file, lock);

lock->fl.fl_type = F_UNLCK;
- error = posix_lock_file(file->f_file, &lock->fl);
+ error = vfs_lock_file(file->f_file, &lock->fl);

return (error < 0)? nlm_lck_denied_nolocks : nlm_granted;
}
@@ -769,14 +796,15 @@ nlmsvc_grant_blocked(struct nlm_block *block)

/* Try the lock operation again */
lock->fl.fl_flags |= FL_SLEEP;
- error = posix_lock_file(file->f_file, &lock->fl);
+ error = vfs_lock_file(file->f_file, &lock->fl);
lock->fl.fl_flags &= ~FL_SLEEP;

switch (error) {
case 0:
break;
case -EAGAIN:
- dprintk("lockd: lock still blocked\n");
+ case -EINPROGRESS:
+ dprintk("lockd: lock still blocked error %d\n", error);
nlmsvc_insert_block(block, NLM_NEVER);
nlmsvc_release_block(block);
return;
diff --git a/fs/lockd/svcsubs.c b/fs/lockd/svcsubs.c
index e83024e..bb2f6b1 100644
--- a/fs/lockd/svcsubs.c
+++ b/fs/lockd/svcsubs.c
@@ -182,7 +182,7 @@ again:
lock.fl_type = F_UNLCK;
lock.fl_start = 0;
lock.fl_end = OFFSET_MAX;
- if (posix_lock_file(file->f_file, &lock) < 0) {
+ if (vfs_lock_file(file->f_file, &lock) < 0) {
printk("lockd: unlock failure in %s:%d\n",
__FILE__, __LINE__);
return 1;
--
1.4.4.1
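
The decision logic this patch gives nlmsvc_lock() can be summarised in a userspace model. This is a sketch, not the lockd code: the enum and function names are hypothetical, the B_* values are copied from the lockd.h hunk in patch 3, and it assumes the unchanged code after the switch still turns a non-waiting request into "denied" and a waiting one into "blocked".

```c
#include <errno.h>

#define B_QUEUED        1   /* lock queued */
#define B_GOT_CALLBACK  2   /* got lock or conflicting lock */
#define B_TOO_LATE      4   /* too late for non-blocking lock */

/* Hypothetical stand-ins for the NLM status codes used in the patch. */
enum nlm_model {
    M_GRANTED, M_DENIED, M_BLOCKED, M_DEADLOCK, M_NOLOCKS, M_DROP_REPLY
};

/* Only the two block fields the deferred path looks at. */
struct block_model {
    unsigned int flags;
    int granted;
};

/* Revisiting a request whose block already has B_QUEUED set. */
enum nlm_model model_revisit_deferred(const struct block_model *b)
{
    if (b->granted)
        return M_GRANTED;
    if (b->flags & B_TOO_LATE)
        return M_DENIED;
    return M_DROP_REPLY;    /* still waiting for the filesystem */
}

/* Mapping the lock call's return value for a fresh request.
 * defer_ok models whether the deferral itself succeeded. */
enum nlm_model model_map_result(int error, int wait, int defer_ok)
{
    switch (error) {
    case 0:
        return M_GRANTED;
    case -EAGAIN:
        return wait ? M_BLOCKED : M_DENIED;
    case -EINPROGRESS:
        if (wait)
            return M_BLOCKED;
        return defer_ok ? M_DROP_REPLY : M_NOLOCKS;
    case -EDEADLK:
        return M_DEADLOCK;
    default:
        return M_NOLOCKS;
    }
}
```

The model makes one property easy to see: a non-blocking request only ever produces a dropped reply when the filesystem answered -EINPROGRESS and the deferral was accepted.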


2006-12-06 05:34:12

by J. Bruce Fields

Subject: [PATCH 2/10] nfsd4: Convert NFSv4 to new lock interface

From: J. Bruce Fields <[email protected]>

Convert NFSv4 to the new lock interface. We don't define any callback for
now, so we're not taking advantage of the asynchronous feature--that's less
critical for the multi-threaded nfsd than it is for the single-threaded
lockd. But this does allow a cluster filesystem to export cluster-coherent
locking to NFS.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 15 ++++++++-------
1 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index fc0634d..140298a 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -50,6 +50,7 @@
#include <linux/nfsd/xdr4.h>
#include <linux/namei.h>
#include <linux/mutex.h>
+#include <linux/lockd/bind.h>

#define NFSDDBG_FACILITY NFSDDBG_PROC

@@ -2759,7 +2760,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nfsd4_lock
* locks_copy_lock: */
conflock.fl_ops = NULL;
conflock.fl_lmops = NULL;
- err = posix_lock_file_conf(filp, &file_lock, &conflock);
+ err = vfs_lock_file_conf(filp, &file_lock, &conflock);
switch (-err) {
case 0: /* success! */
update_stateid(&lock_stp->st_stateid);
@@ -2776,7 +2777,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nfsd4_lock
status = nfserr_deadlock;
break;
default:
- dprintk("NFSD: nfsd4_lock: posix_lock_file_conf() failed! status %d\n",err);
+ dprintk("NFSD: nfsd4_lock: vfs_lock_file_conf() failed! status %d\n",err);
status = nfserr_resource;
break;
}
@@ -2856,16 +2857,16 @@ nfsd4_lockt(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nfsd4_lock

nfs4_transform_lock_offset(&file_lock);

- /* posix_test_lock uses the struct file _only_ to resolve the inode.
+ /* vfs_test_lock uses the struct file _only_ to resolve the inode.
* since LOCKT doesn't require an OPEN, and therefore a struct
- * file may not exist, pass posix_test_lock a struct file with
+ * file may not exist, pass vfs_test_lock a struct file with
* only the dentry:inode set.
*/
memset(&file, 0, sizeof (struct file));
file.f_dentry = current_fh->fh_dentry;

status = nfs_ok;
- if (posix_test_lock(&file, &file_lock, &conflock)) {
+ if (vfs_test_lock(&file, &file_lock, &conflock)) {
status = nfserr_denied;
nfs4_set_lock_denied(&conflock, &lockt->lt_denied);
}
@@ -2919,9 +2920,9 @@ nfsd4_locku(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nfsd4_lock
/*
* Try to unlock the file in the VFS.
*/
- err = posix_lock_file(filp, &file_lock);
+ err = vfs_lock_file(filp, &file_lock);
if (err) {
- dprintk("NFSD: nfs4_locku: posix_lock_file failed!\n");
+ dprintk("NFSD: nfs4_locku: vfs_lock_file failed!\n");
goto out_nfserr;
}
/*
--
1.4.4.1


2006-12-06 05:34:13

by J. Bruce Fields

Subject: [PATCH 3/10] lockd: request deferral routine

From: Marc Eshel <[email protected]>

We need to keep some state for a pending asynchronous lock request, so this
patch adds that state to struct nlm_block.

This also adds a function which defers the request, by calling
rqstp->rq_chandle.defer and storing the resulting deferred request in a
nlm_block structure which we insert into lockd's global block list. That
new function isn't called yet, so it's dead code until a later patch.

Signed-off-by: Marc Eshel <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/lockd/svclock.c | 25 +++++++++++++++++++++++++
include/linux/lockd/lockd.h | 10 ++++++++++
2 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index f523ca2..2ce4dc6 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -437,6 +437,31 @@ static void nlmsvc_freegrantargs(struct nlm_rqst *call)
}

/*
+ * Deferred lock request handling for non-blocking lock
+ */
+static u32
+nlmsvc_defer_lock_rqst(struct svc_rqst *rqstp, struct nlm_block *block)
+{
+ u32 status = nlm_lck_denied_nolocks;
+
+ block->b_flags |= B_QUEUED;
+
+ nlmsvc_insert_block(block, NLM_TIMEOUT);
+
+ block->b_cache_req = &rqstp->rq_chandle;
+ if (rqstp->rq_chandle.defer) {
+ block->b_deferred_req =
+ rqstp->rq_chandle.defer(block->b_cache_req);
+ if (block->b_deferred_req != NULL)
+ status = nlm_drop_reply;
+ }
+ dprintk("lockd: nlmsvc_defer_lock_rqst block %p flags %d status %d\n",
+ block, block->b_flags, status);
+
+ return status;
+}
+
+/*
* Attempt to establish a lock, and if it can't be granted, block it
* if required.
*/
diff --git a/include/linux/lockd/lockd.h b/include/linux/lockd/lockd.h
index 862d973..ac865f8 100644
--- a/include/linux/lockd/lockd.h
+++ b/include/linux/lockd/lockd.h
@@ -119,6 +119,9 @@ struct nlm_file {
* couldn't be granted because of a conflicting lock).
*/
#define NLM_NEVER (~(unsigned long) 0)
+/* timeout on non-blocking call: */
+#define NLM_TIMEOUT (7 * HZ)
+
struct nlm_block {
struct kref b_count; /* Reference count */
struct list_head b_list; /* linked list of all blocks */
@@ -130,6 +133,13 @@ struct nlm_block {
unsigned int b_id; /* block id */
unsigned char b_granted; /* VFS granted lock */
struct nlm_file * b_file; /* file in question */
+ struct cache_req * b_cache_req; /* deferred request handling */
+ struct file_lock * b_fl; /* set for GETLK */
+ struct cache_deferred_req * b_deferred_req;
+ unsigned int b_flags; /* block flags */
+#define B_QUEUED 1 /* lock queued */
+#define B_GOT_CALLBACK 2 /* got lock or conflicting lock */
+#define B_TOO_LATE 4 /* too late for non-blocking lock */
};

/*
--
1.4.4.1
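
For readers following along, here is a rough userspace sketch of the deferral step described in this patch. It is not the kernel code: the B_QUEUED value is copied from the patch, but every sketch_* name and the SK_* status values are hypothetical stand-ins for the svc_rqst/nlm_block machinery, and inserting the block into lockd's list is left out.

```c
#include <assert.h>
#include <stddef.h>

#define B_QUEUED 1	/* value copied from the patch */

/* Hypothetical stand-ins for nlm_lck_denied_nolocks / nlm_drop_reply. */
enum sketch_status { SK_DENIED_NOLOCKS, SK_DROP_REPLY };

struct sketch_deferred { int token; };

struct sketch_req {
	/* Optional hook, playing the role of rq_chandle.defer; may be NULL. */
	struct sketch_deferred *(*defer)(struct sketch_req *req);
};

struct sketch_block {
	unsigned int flags;
	struct sketch_req *cache_req;
	struct sketch_deferred *deferred_req;
};

/*
 * Mark the block as queued and try to defer the request. Only when the
 * transport actually hands back a deferred request do we tell the caller
 * to drop the reply; otherwise we report that no lock could be set up.
 */
enum sketch_status sketch_defer(struct sketch_req *req,
				struct sketch_block *block)
{
	enum sketch_status status = SK_DENIED_NOLOCKS;

	block->flags |= B_QUEUED;
	block->cache_req = req;
	if (req->defer) {
		block->deferred_req = req->defer(req);
		if (block->deferred_req != NULL)
			status = SK_DROP_REPLY;
	}
	assert(block->flags & B_QUEUED);
	return status;
}

/* Two toy defer hooks: one that succeeds and one that fails. */
static struct sketch_deferred sketch_slot;

struct sketch_deferred *sketch_defer_ok(struct sketch_req *req)
{
	(void)req;
	return &sketch_slot;
}

struct sketch_deferred *sketch_defer_fail(struct sketch_req *req)
{
	(void)req;
	return NULL;
}
```

Note that the block ends up marked B_QUEUED even when deferral fails, which matches the order of operations in the patch as posted.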


2006-12-06 05:34:17

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 7/10] lockd: handle test_lock deferrals

From: Marc Eshel <[email protected]>

Rewrite nlmsvc_testlock() to use the new asynchronous interface: instead of
immediately doing a posix_test_lock(), we first look for a matching block.
If the subsequent test_lock returns anything other than -EINPROGRESS, we
then remove the block we've found and return the results.

If it returns -EINPROGRESS, then we defer the lock request.

In the case where the block we find in the first step has B_QUEUED set,
we bypass the vfs_test_lock entirely, instead using the block to decide how
to respond:
- with nlm_lck_denied if B_TOO_LATE is set;
- with nlm_granted if B_GOT_CALLBACK is set and no conflicting lock was
  recorded in b_fl (nlm_lck_denied, reporting that lock, otherwise);
- by dropping the request if neither B_TOO_LATE nor B_GOT_CALLBACK is set.

Signed-off-by: Marc Eshel <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/lockd/svclock.c | 59 ++++++++++++++++++++++++++++++++++++++++++---------
1 files changed, 48 insertions(+), 11 deletions(-)

diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index e342046..21185a7 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -553,6 +553,9 @@ nlmsvc_testlock(struct svc_rqst *rqstp, struct nlm_file *file,
struct nlm_lock *lock, struct nlm_lock *conflock,
struct nlm_cookie *cookie)
{
+ struct nlm_block *block = NULL;
+ int error;
+
dprintk("lockd: nlmsvc_testlock(%s/%ld, ty=%d, %Ld-%Ld)\n",
file->f_file->f_dentry->d_inode->i_sb->s_id,
file->f_file->f_dentry->d_inode->i_ino,
@@ -560,19 +563,53 @@ nlmsvc_testlock(struct svc_rqst *rqstp, struct nlm_file *file,
(long long)lock->fl.fl_start,
(long long)lock->fl.fl_end);

- if (posix_test_lock(file->f_file, &lock->fl, &conflock->fl)) {
- dprintk("lockd: conflicting lock(ty=%d, %Ld-%Ld)\n",
- conflock->fl.fl_type,
- (long long)conflock->fl.fl_start,
- (long long)conflock->fl.fl_end);
- conflock->caller = "somehost"; /* FIXME */
- conflock->len = strlen(conflock->caller);
- conflock->oh.len = 0; /* don't return OH info */
- conflock->svid = conflock->fl.fl_pid;
- return nlm_lck_denied;
+ /* Get existing block (in case client is busy-waiting) */
+ block = nlmsvc_lookup_block(file, lock);
+
+ if (block == NULL) {
+ block = nlmsvc_create_block(rqstp, file, lock, cookie);
+ if (block == NULL)
+ return nlm_granted;
}
+ if (block->b_flags & B_QUEUED) {
+ dprintk("lockd: nlmsvc_testlock deferred block %p flags %d fl %p\n",
+ block, block->b_flags, block->b_fl);
+ if (block->b_flags & B_TOO_LATE) {
+ nlmsvc_unlink_block(block);
+ return nlm_lck_denied;
+ }
+ if (block->b_flags & B_GOT_CALLBACK) {
+ if (block->b_fl != NULL) {
+ conflock->fl = *block->b_fl;
+ goto conf_lock;
+ }
+ else {
+ nlmsvc_unlink_block(block);
+ return nlm_granted;
+ }
+ }
+ return nlm_drop_reply;
+ }

- return nlm_granted;
+ error = vfs_test_lock(file->f_file, &lock->fl, &conflock->fl);
+ if (!error) {
+ nlmsvc_unlink_block(block);
+ return nlm_granted;
+ }
+ if (error == -EINPROGRESS)
+ return nlmsvc_defer_lock_rqst(rqstp, block);
+
+conf_lock:
+ dprintk("lockd: conflicting lock(ty=%d, %Ld-%Ld)\n",
+ conflock->fl.fl_type, (long long)conflock->fl.fl_start,
+ (long long)conflock->fl.fl_end);
+ conflock->caller = "somehost"; /* FIXME */
+ conflock->len = strlen(conflock->caller);
+ conflock->oh.len = 0; /* don't return OH info */
+ conflock->svid = conflock->fl.fl_pid;
+ if (block)
+ nlmsvc_unlink_block(block);
+ return nlm_lck_denied;
}

/*
--
1.4.4.1
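
The decision that nlmsvc_testlock() makes from the block flags in this patch can be summarised as a small pure function. This is only a sketch: the flag values are copied from patch 3/10, while sketch_testlock_reply() and the SK_* names are made up for illustration, and have_conflict stands in for "b_fl is non-NULL".

```c
#include <assert.h>

/* Flag values copied from patch 3/10. */
#define B_QUEUED       1
#define B_GOT_CALLBACK 2
#define B_TOO_LATE     4

/* Hypothetical outcome names, not the real NLM status codes. */
enum sketch_reply {
	SK_CALL_FS,	/* not queued: go ask the filesystem */
	SK_GRANTED,	/* no conflicting lock */
	SK_DENIED,	/* conflict found, or we ran out of time */
	SK_DROP		/* still waiting for the callback: drop the reply */
};

enum sketch_reply sketch_testlock_reply(unsigned int flags, int have_conflict)
{
	/* Only the three flags defined by the patch are expected here. */
	assert((flags & ~(B_QUEUED | B_GOT_CALLBACK | B_TOO_LATE)) == 0);

	if (!(flags & B_QUEUED))
		return SK_CALL_FS;
	if (flags & B_TOO_LATE)
		return SK_DENIED;
	if (flags & B_GOT_CALLBACK)
		return have_conflict ? SK_DENIED : SK_GRANTED;
	return SK_DROP;
}
```

B_TOO_LATE is checked first, so a callback that arrives after the timeout does not turn a denial into a grant.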


2006-12-06 05:34:25

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 1/10] lockd: add new export operation for nfsv4/lockd locking

From: Marc Eshel <[email protected]>

There is currently a filesystem ->lock() method, but it is defined only by
a few filesystems that are not exported via nfsd. So none of the lock
routines that are used by lockd or nfsv4 bother to call those methods.

Filesystems such as cluster filesystems would like to do their own locking
and also would like to be exportable via NFS.

So we add a new lock() export operation, and new routines vfs_lock_file,
vfs_test_lock, and vfs_cancel_lock, which call the new export operation,
falling back on the appropriate local operation if the export operation is
unavailable.

These new functions are intended to be used by lockd and nfsd; lockd and
nfsd changes to take advantage of them are made by later patches.

Acquiring a lock may require communication with remote hosts, and to avoid
blocking lockd or nfsd threads during such communication, we allow the
results to be returned asynchronously.

When a ->lock() call needs to block, the file system will return
-EINPROGRESS, and then later return the results with a call to the routine
in the fl_notify field of the lock_manager_operations struct.

Note that this is different from the ->lock() call discovering that there
is a conflict which would cause the caller to block; this is still handled
in the same way as before. In fact, we don't currently handle "blocking"
locks at all; those are less urgent, because the filesystem can always just
return an immediate -EAGAIN without denying the lock.

So this asynchronous interface is only used in the case of a non-blocking
lock, where we must know whether to allow or deny the lock now.

(Note: with this patch, we haven't yet modified lockd to handle such a
callback, which we must do before a filesystem can safely use it in this
way.)

Signed-off-by: Marc Eshel <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/lockd/svclock.c | 108 +++++++++++++++++++++++++++++++++++++++++++-
include/linux/fs.h | 2 +
include/linux/lockd/bind.h | 4 ++
3 files changed, 113 insertions(+), 1 deletions(-)

diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index 7e219b9..f523ca2 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -20,6 +20,7 @@
* Copyright (C) 1996, Olaf Kirch <[email protected]>
*/

+#include <linux/module.h>
#include <linux/types.h>
#include <linux/errno.h>
#include <linux/kernel.h>
@@ -51,6 +52,111 @@ static const struct rpc_call_ops nlmsvc_grant_ops;
*/
static LIST_HEAD(nlm_blocked);

+ /**
+ * vfs_lock_file - file byte range lock
+ * @filp: The file to apply the lock to
+ * @fl: The lock to be applied
+ *
+ * To avoid blocking kernel daemons, such as lockd, that need to acquire POSIX
+ * locks, the ->lock() interface may return asynchronously, before the lock has
+ * been granted or denied by the underlying filesystem, if (and only if)
+ * fl_notify is set. Callers expecting ->lock() to return asynchronously
+ * will only use F_SETLK, not F_SETLKW; they will set FL_SLEEP if (and only if)
+ * the request is for a blocking lock. When ->lock() does return asynchronously,
+ * it must return -EINPROGRESS, and call ->fl_notify() when the lock
+ * request completes.
+ * If the request is for a non-blocking lock, the file system should return
+ * -EINPROGRESS, then try to get the lock and call the callback routine with
+ * the result. If the request timed out, the callback routine will return a
+ * nonzero return code and the file system should release the lock. The file
+ * system is also responsible for keeping a corresponding posix lock when it
+ * grants a lock so the VFS can find out which locks are locally held and do
+ * the correct lock cleanup when required.
+ * The underlying filesystem must not drop the kernel lock or call
+ * ->fl_notify() before returning to the caller with a -EINPROGRESS
+ * return code.
+ */
+int vfs_lock_file(struct file *filp, struct file_lock *fl)
+{
+ struct super_block *sb;
+
+ sb = filp->f_dentry->d_inode->i_sb;
+ if (sb->s_export_op && sb->s_export_op->lock)
+ return sb->s_export_op->lock(filp, F_SETLK, fl);
+ else
+ return posix_lock_file(filp, fl);
+}
+EXPORT_SYMBOL(vfs_lock_file);
+
+/**
+ * vfs_lock_file_conf - file byte range lock
+ * @filp: The file to apply the lock to
+ * @fl: The lock to be applied
+ * @conf: Place to return a copy of the conflicting lock, if found.
+ *
+ * read comments for vfs_lock_file()
+ */
+int vfs_lock_file_conf(struct file *filp, struct file_lock *fl, struct file_lock *conf)
+{
+ struct super_block *sb;
+
+ sb = filp->f_dentry->d_inode->i_sb;
+ if (sb->s_export_op && sb->s_export_op->lock) {
+ locks_copy_lock(conf, fl);
+ return sb->s_export_op->lock(filp, F_SETLK, fl);
+ } else
+ return posix_lock_file_conf(filp, fl, conf);
+}
+EXPORT_SYMBOL(vfs_lock_file_conf);
+
+/**
+ * vfs_test_lock - test file byte range lock
+ * @filp: The file to test lock for
+ * @fl: The lock to test
+ * @conf: Place to return a copy of the conflicting lock, if found.
+ */
+int vfs_test_lock(struct file *filp, struct file_lock *fl, struct file_lock *conf)
+{
+ int error;
+ struct super_block *sb;
+
+ conf->fl_type = F_UNLCK;
+ sb = filp->f_dentry->d_inode->i_sb;
+ if (sb->s_export_op && sb->s_export_op->lock) {
+ locks_copy_lock(conf, fl);
+ error = sb->s_export_op->lock(filp, F_GETLK, conf);
+ if (!error) {
+ if (conf->fl_type != F_UNLCK)
+ error = 1;
+ }
+ return error;
+ } else
+ return posix_test_lock(filp, fl, conf);
+}
+EXPORT_SYMBOL(vfs_test_lock);
+
+/**
+ * vfs_cancel_lock - file byte range unblock lock
+ * @filp: The file to apply the unblock to
+ * @fl: The lock to be unblocked
+ *
+ * FL_CANCEL is used to cancel blocked requests
+ */
+int vfs_cancel_lock(struct file *filp, struct file_lock *fl)
+{
+ int status;
+ struct super_block *sb;
+
+ fl->fl_flags |= FL_CANCEL;
+ sb = filp->f_dentry->d_inode->i_sb;
+ if (sb->s_export_op && sb->s_export_op->lock)
+ status = sb->s_export_op->lock(filp, F_SETLK, fl);
+ else
+ status = posix_unblock_lock(filp, fl);
+ fl->fl_flags &= ~FL_CANCEL;
+ return status;
+}
+
/*
* Insert a blocked lock into the global list
*/
@@ -241,7 +347,7 @@ static int nlmsvc_unlink_block(struct nlm_block *block)
dprintk("lockd: unlinking block %p...\n", block);

/* Remove block from list */
- status = posix_unblock_lock(block->b_file->f_file, &block->b_call->a_args.lock.fl);
+ status = vfs_cancel_lock(block->b_file->f_file, &block->b_call->a_args.lock.fl);
nlmsvc_remove_block(block);
return status;
}
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2fe6e3f..b1d287b 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -770,6 +770,7 @@ extern spinlock_t files_lock;

#define FL_POSIX 1
#define FL_FLOCK 2
+#define FL_CANCEL 4 /* set to request cancelling a lock */
#define FL_ACCESS 8 /* not trying to lock, just looking */
#define FL_EXISTS 16 /* when unlocking, test for existence */
#define FL_LEASE 32 /* lease held on this file */
@@ -1372,6 +1373,7 @@ struct export_operations {
int (*acceptable)(void *context, struct dentry *de),
void *context);

+ int (*lock) (struct file *, int, struct file_lock *);

};

diff --git a/include/linux/lockd/bind.h b/include/linux/lockd/bind.h
index aa50d89..780bec4 100644
--- a/include/linux/lockd/bind.h
+++ b/include/linux/lockd/bind.h
@@ -38,4 +38,8 @@ extern int nlmclnt_proc(struct inode *, int, struct file_lock *);
extern int lockd_up(int proto);
extern void lockd_down(void);

+extern int vfs_lock_file(struct file *, struct file_lock *);
+extern int vfs_lock_file_conf(struct file *, struct file_lock *, struct file_lock *);
+extern int vfs_test_lock(struct file *, struct file_lock *, struct file_lock *);
+
#endif /* LINUX_LOCKD_BIND_H */
--
1.4.4.1
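
To make the intended protocol a little more concrete, here is a userspace toy model of it, under stated assumptions. None of these names exist in the kernel: sketch_vfs_lock() mimics the "use the export operation if present, else fall back to the local lock" dispatch, sketch_cluster_lock() is a pretend filesystem that returns -EINPROGRESS, and sketch_notify() plays the role of fl_notify, returning nonzero when the caller has already given up.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct sketch_lock;
typedef int (*sketch_notify_fn)(struct sketch_lock *lk, int result);

struct sketch_lock {
	sketch_notify_fn notify;	/* stands in for fl_notify */
	int timed_out;			/* set by the caller when it gives up */
	int result;			/* filled in by the callback */
	int held;			/* 1 while the fs considers the lock held */
};

struct sketch_export_ops {
	int (*lock)(struct sketch_lock *lk);	/* may be NULL */
};

/* The toy filesystem tracks at most one outstanding request. */
static struct sketch_lock *sketch_pending;

/* Local fallback: grant immediately. */
int sketch_local_lock(struct sketch_lock *lk)
{
	lk->held = 1;
	return 0;
}

/* Pretend cluster filesystem: go asynchronous only if a callback is set. */
int sketch_cluster_lock(struct sketch_lock *lk)
{
	if (lk->notify == NULL)
		return sketch_local_lock(lk);
	assert(sketch_pending == NULL);
	sketch_pending = lk;
	return -EINPROGRESS;	/* the callback is never run before returning */
}

/* Called "later", once the other nodes have answered. */
void sketch_cluster_complete(int result)
{
	struct sketch_lock *lk = sketch_pending;

	if (lk == NULL)
		return;
	sketch_pending = NULL;
	if (result == 0)
		lk->held = 1;
	/* Nonzero from the callback means the caller gave up: release. */
	if (lk->notify(lk, result) != 0)
		lk->held = 0;
}

/* Dispatch: export operation if available, local locking otherwise. */
int sketch_vfs_lock(const struct sketch_export_ops *ops, struct sketch_lock *lk)
{
	if (ops && ops->lock)
		return ops->lock(lk);
	return sketch_local_lock(lk);
}

/* Caller-side callback, in the role of lockd's fl_notify handler. */
int sketch_notify(struct sketch_lock *lk, int result)
{
	if (lk->timed_out)
		return 1;
	lk->result = result;
	return 0;
}
```

The case worth noticing is the late callback: the filesystem has already granted the lock when the callback reports that nobody is waiting any more, so it has to undo the grant itself.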


_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2006-12-06 05:34:25

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 6/10] lockd: pass cookie in nlmsvc_testlock

From: Marc Eshel <[email protected]>

Change NLM internal interface to pass more information for test lock; we
need this to make sure the cookie information is pushed down to the place
where we do request deferral, which is handled for testlock by the
following patch.

Signed-off-by: Marc Eshel <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/lockd/svc4proc.c | 2 +-
fs/lockd/svclock.c | 5 +++--
fs/lockd/svcproc.c | 2 +-
include/linux/lockd/lockd.h | 4 ++--
4 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
index 0ce5c81..eb5994e 100644
--- a/fs/lockd/svc4proc.c
+++ b/fs/lockd/svc4proc.c
@@ -99,7 +99,7 @@ nlm4svc_proc_test(struct svc_rqst *rqstp, struct nlm_args *argp,
return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;

/* Now check for conflicting locks */
- resp->status = nlmsvc_testlock(file, &argp->lock, &resp->lock);
+ resp->status = nlmsvc_testlock(rqstp, file, &argp->lock, &resp->lock, &resp->cookie);

dprintk("lockd: TEST4 status %d\n", ntohl(resp->status));
nlm_release_host(host);
diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index 90aa4a5..e342046 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -549,8 +549,9 @@ out:
* Test for presence of a conflicting lock.
*/
__be32
-nlmsvc_testlock(struct nlm_file *file, struct nlm_lock *lock,
- struct nlm_lock *conflock)
+nlmsvc_testlock(struct svc_rqst *rqstp, struct nlm_file *file,
+ struct nlm_lock *lock, struct nlm_lock *conflock,
+ struct nlm_cookie *cookie)
{
dprintk("lockd: nlmsvc_testlock(%s/%ld, ty=%d, %Ld-%Ld)\n",
file->f_file->f_dentry->d_inode->i_sb->s_id,
diff --git a/fs/lockd/svcproc.c b/fs/lockd/svcproc.c
index 32e99a6..d4257a5 100644
--- a/fs/lockd/svcproc.c
+++ b/fs/lockd/svcproc.c
@@ -127,7 +127,7 @@ nlmsvc_proc_test(struct svc_rqst *rqstp, struct nlm_args *argp,
return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success;

/* Now check for conflicting locks */
- resp->status = cast_status(nlmsvc_testlock(file, &argp->lock, &resp->lock));
+ resp->status = cast_status(nlmsvc_testlock(rqstp, file, &argp->lock, &resp->lock, &resp->cookie));

dprintk("lockd: TEST status %d vers %d\n",
ntohl(resp->status), rqstp->rq_vers);
diff --git a/include/linux/lockd/lockd.h b/include/linux/lockd/lockd.h
index ac865f8..863bc23 100644
--- a/include/linux/lockd/lockd.h
+++ b/include/linux/lockd/lockd.h
@@ -197,8 +197,8 @@ typedef int (*nlm_host_match_fn_t)(struct nlm_host *cur, struct nlm_host *ref)
__be32 nlmsvc_lock(struct svc_rqst *, struct nlm_file *,
struct nlm_lock *, int, struct nlm_cookie *);
__be32 nlmsvc_unlock(struct nlm_file *, struct nlm_lock *);
-__be32 nlmsvc_testlock(struct nlm_file *, struct nlm_lock *,
- struct nlm_lock *);
+__be32 nlmsvc_testlock(struct svc_rqst *, struct nlm_file *,
+ struct nlm_lock *, struct nlm_lock *, struct nlm_cookie *);
__be32 nlmsvc_cancel_blocked(struct nlm_file *, struct nlm_lock *);
unsigned long nlmsvc_retry_blocked(void);
void nlmsvc_traverse_blocks(struct nlm_host *, struct nlm_file *,
--
1.4.4.1



2006-12-06 05:34:25

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 9/10] lockd: add code to handle deferred lock requests

From: Marc Eshel <[email protected]>

Rewrite nlmsvc_lock() to use the asynchronous interface.

As with testlock, we answer nlm requests in nlmsvc_lock by first looking up
the block and then using the results we find in the block if B_QUEUED is
set, and calling vfs_lock_file() otherwise.

If this is a new lock request and we get an -EINPROGRESS return on a
non-blocking request, then we defer the request.

Also modify nlmsvc_unlock() to call the filesystem method if appropriate.

Signed-off-by: Marc Eshel <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/lockd/svclock.c | 40 ++++++++++++++++++++++++++++++++++------
fs/lockd/svcsubs.c | 2 +-
2 files changed, 35 insertions(+), 7 deletions(-)

diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index 134a40e..90bac90 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -499,17 +499,44 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
} else
lock->fl.fl_flags &= ~FL_SLEEP;

- error = posix_lock_file(file->f_file, &lock->fl);
- lock->fl.fl_flags &= ~FL_SLEEP;

- dprintk("lockd: posix_lock_file returned %d\n", error);
+ if (block->b_flags & B_QUEUED) {
+ dprintk("lockd: nlmsvc_lock deferred block %p flags %d\n",
+ block, block->b_flags);
+ if (block->b_granted) {
+ nlmsvc_unlink_block(block);
+ ret = nlm_granted;
+ goto out;
+ }
+ if (block->b_flags & B_TOO_LATE) {
+ nlmsvc_unlink_block(block);
+ ret = nlm_lck_denied;
+ goto out;
+ }
+ ret = nlm_drop_reply;
+ goto out;
+ }

+ if (!wait)
+ lock->fl.fl_flags &= ~FL_SLEEP;
+ error = vfs_lock_file(file->f_file, &lock->fl);
+ lock->fl.fl_flags &= ~FL_SLEEP;
+
+ dprintk("lockd: vfs_lock_file returned %d\n", error);
switch(error) {
case 0:
ret = nlm_granted;
goto out;
case -EAGAIN:
+ ret = nlm_lck_denied;
break;
+ case -EINPROGRESS:
+ if (wait)
+ break;
+ /* Filesystem lock operation is in progress
+ Add it to the queue waiting for callback */
+ ret = nlmsvc_defer_lock_rqst(rqstp, block);
+ goto out;
case -EDEADLK:
ret = nlm_deadlock;
goto out;
@@ -623,7 +650,7 @@ nlmsvc_unlock(struct nlm_file *file, struct nlm_lock *lock)
nlmsvc_cancel_blocked(file, lock);

lock->fl.fl_type = F_UNLCK;
- error = posix_lock_file(file->f_file, &lock->fl);
+ error = vfs_lock_file(file->f_file, &lock->fl);

return (error < 0)? nlm_lck_denied_nolocks : nlm_granted;
}
@@ -769,14 +796,15 @@ nlmsvc_grant_blocked(struct nlm_block *block)

/* Try the lock operation again */
lock->fl.fl_flags |= FL_SLEEP;
- error = posix_lock_file(file->f_file, &lock->fl);
+ error = vfs_lock_file(file->f_file, &lock->fl);
lock->fl.fl_flags &= ~FL_SLEEP;

switch (error) {
case 0:
break;
case -EAGAIN:
- dprintk("lockd: lock still blocked\n");
+ case -EINPROGRESS:
+ dprintk("lockd: lock still blocked error %d\n", error);
nlmsvc_insert_block(block, NLM_NEVER);
nlmsvc_release_block(block);
return;
diff --git a/fs/lockd/svcsubs.c b/fs/lockd/svcsubs.c
index e83024e..bb2f6b1 100644
--- a/fs/lockd/svcsubs.c
+++ b/fs/lockd/svcsubs.c
@@ -182,7 +182,7 @@ again:
lock.fl_type = F_UNLCK;
lock.fl_start = 0;
lock.fl_end = OFFSET_MAX;
- if (posix_lock_file(file->f_file, &lock) < 0) {
+ if (vfs_lock_file(file->f_file, &lock) < 0) {
printk("lockd: unlock failure in %s:%d\n",
__FILE__, __LINE__);
return 1;
--
1.4.4.1
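
As a reading aid, the switch on the vfs_lock_file() return value in nlmsvc_lock() can be sketched as a mapping from (error, wait) to what happens next. The names here are hypothetical, and two things are assumptions rather than facts from the hunk: what the code after the switch does (it is not shown above), and how error codes other than the four listed are treated.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical outcome names for illustration only. */
enum sketch_action {
	SK_REPLY_GRANTED,	/* reply nlm_granted */
	SK_REPLY_DEADLOCK,	/* reply nlm_deadlock */
	SK_DEFER,		/* hand the request to the deferral routine */
	SK_DENIED_THEN_CONTINUE,/* ret = nlm_lck_denied, then leave the switch */
	SK_CONTINUE,		/* leave the switch without touching ret */
	SK_OTHER		/* cases not visible in the hunk (assumption) */
};

enum sketch_action sketch_lock_action(int error, int wait)
{
	/* Kernel-style convention: zero or a negative errno value. */
	assert(error <= 0);

	switch (error) {
	case 0:
		return SK_REPLY_GRANTED;
	case -EAGAIN:
		return SK_DENIED_THEN_CONTINUE;
	case -EINPROGRESS:
		/* Only non-blocking requests are deferred. */
		return wait ? SK_CONTINUE : SK_DEFER;
	case -EDEADLK:
		return SK_REPLY_DEADLOCK;
	default:
		return SK_OTHER;
	}
}
```

The point of the sketch is the asymmetry on -EINPROGRESS: a blocking request is left to the existing blocked-lock path, while a non-blocking one is deferred.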



2006-12-06 05:34:25

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 10/10] gfs2: nfs lock support for gfs2

From: J. Bruce Fields <[email protected]>

From: Marc Eshel <[email protected]>

Add NFS lock support to GFS2. (Untested.)

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/gfs2/lm.c | 10 ++++
fs/gfs2/lm.h | 2 +
fs/gfs2/locking/dlm/lock_dlm.h | 2 +
fs/gfs2/locking/dlm/mount.c | 1 +
fs/gfs2/locking/dlm/plock.c | 95 +++++++++++++++++++++++++++++++++++++++-
fs/gfs2/ops_export.c | 52 ++++++++++++++++++++++
include/linux/lm_interface.h | 3 +
include/linux/lock_dlm_plock.h | 3 +
8 files changed, 166 insertions(+), 2 deletions(-)

diff --git a/fs/gfs2/lm.c b/fs/gfs2/lm.c
index effe4a3..cf7fd52 100644
--- a/fs/gfs2/lm.c
+++ b/fs/gfs2/lm.c
@@ -197,6 +197,16 @@ int gfs2_lm_plock(struct gfs2_sbd *sdp, struct lm_lockname *name,
return error;
}

+int gfs2_lm_plock_async(struct gfs2_sbd *sdp, struct lm_lockname *name,
+ struct file *file, int cmd, struct file_lock *fl)
+{
+ int error = -EIO;
+ if (likely(!test_bit(SDF_SHUTDOWN, &sdp->sd_flags)))
+ error = sdp->sd_lockstruct.ls_ops->lm_plock_async(
+ sdp->sd_lockstruct.ls_lockspace, name, file, cmd, fl);
+ return error;
+}
+
int gfs2_lm_punlock(struct gfs2_sbd *sdp, struct lm_lockname *name,
struct file *file, struct file_lock *fl)
{
diff --git a/fs/gfs2/lm.h b/fs/gfs2/lm.h
index 21cdc30..1ddd1fd 100644
--- a/fs/gfs2/lm.h
+++ b/fs/gfs2/lm.h
@@ -34,6 +34,8 @@ int gfs2_lm_plock_get(struct gfs2_sbd *sdp, struct lm_lockname *name,
struct file *file, struct file_lock *fl);
int gfs2_lm_plock(struct gfs2_sbd *sdp, struct lm_lockname *name,
struct file *file, int cmd, struct file_lock *fl);
+int gfs2_lm_plock_async(struct gfs2_sbd *sdp, struct lm_lockname *name,
+ struct file *file, int cmd, struct file_lock *fl);
int gfs2_lm_punlock(struct gfs2_sbd *sdp, struct lm_lockname *name,
struct file *file, struct file_lock *fl);
void gfs2_lm_recovery_done(struct gfs2_sbd *sdp, unsigned int jid,
diff --git a/fs/gfs2/locking/dlm/lock_dlm.h b/fs/gfs2/locking/dlm/lock_dlm.h
index 33af707..82af860 100644
--- a/fs/gfs2/locking/dlm/lock_dlm.h
+++ b/fs/gfs2/locking/dlm/lock_dlm.h
@@ -179,6 +179,8 @@ int gdlm_plock_init(void);
void gdlm_plock_exit(void);
int gdlm_plock(void *, struct lm_lockname *, struct file *, int,
struct file_lock *);
+int gdlm_plock_async(void *, struct lm_lockname *, struct file *, int,
+ struct file_lock *);
int gdlm_plock_get(void *, struct lm_lockname *, struct file *,
struct file_lock *);
int gdlm_punlock(void *, struct lm_lockname *, struct file *,
diff --git a/fs/gfs2/locking/dlm/mount.c b/fs/gfs2/locking/dlm/mount.c
index cdd1694..4339e3f 100644
--- a/fs/gfs2/locking/dlm/mount.c
+++ b/fs/gfs2/locking/dlm/mount.c
@@ -244,6 +244,7 @@ const struct lm_lockops gdlm_ops = {
.lm_lock = gdlm_lock,
.lm_unlock = gdlm_unlock,
.lm_plock = gdlm_plock,
+ .lm_plock_async = gdlm_plock_async,
.lm_punlock = gdlm_punlock,
.lm_plock_get = gdlm_plock_get,
.lm_cancel = gdlm_cancel,
diff --git a/fs/gfs2/locking/dlm/plock.c b/fs/gfs2/locking/dlm/plock.c
index 7365aec..c21e667 100644
--- a/fs/gfs2/locking/dlm/plock.c
+++ b/fs/gfs2/locking/dlm/plock.c
@@ -102,6 +102,93 @@ int gdlm_plock(void *lockspace, struct lm_lockname *name,
return rv;
}

+int gdlm_plock_async(void *lockspace, struct lm_lockname *name,
+ struct file *file, int cmd, struct file_lock *fl)
+{
+ struct gdlm_ls *ls = lockspace;
+ struct plock_op *op;
+ int rv;
+
+ op = kzalloc(sizeof(*op), GFP_KERNEL);
+ if (!op)
+ return -ENOMEM;
+
+ op->info.optype = GDLM_PLOCK_OP_LOCK;
+ op->info.pid = fl->fl_pid;
+ op->info.ex = (fl->fl_type == F_WRLCK);
+ op->info.wait = IS_SETLKW(cmd);
+ op->info.fsid = ls->id;
+ op->info.number = name->ln_number;
+ op->info.start = fl->fl_start;
+ op->info.end = fl->fl_end;
+ op->info.owner = (__u64)(long) fl->fl_owner;
+ if (fl->fl_lmops) {
+ op->info.callback = fl->fl_lmops->fl_notify;
+ /* might need to make a copy */
+ op->info.fl = fl;
+ op->info.file = file;
+ } else
+ op->info.callback = NULL;
+
+ send_op(op);
+
+ if (op->info.callback == NULL)
+ wait_event(recv_wq, (op->done != 0));
+ else
+ return -EINPROGRESS;
+
+ spin_lock(&ops_lock);
+ if (!list_empty(&op->list)) {
+ printk(KERN_INFO "plock op on list\n");
+ list_del(&op->list);
+ }
+ spin_unlock(&ops_lock);
+
+ rv = op->info.rv;
+
+ if (!rv) {
+ if (posix_lock_file_wait(file, fl) < 0)
+ log_error("gdlm_plock: vfs lock error %x,%llx",
+ name->ln_type,
+ (unsigned long long)name->ln_number);
+ } else {
+ /* XXX: We need to cancel the lock here: */
+ printk("gfs2 lock granted after lock request failed; dangling lock!\n");
+ }
+
+ kfree(op);
+ return rv;
+}
+
+int gdlm_plock_callback(struct plock_op *op)
+{
+ struct file *file;
+ struct file_lock *fl;
+ int rv;
+
+ spin_lock(&ops_lock);
+ if (!list_empty(&op->list)) {
+ printk(KERN_INFO "plock op on list\n");
+ list_del(&op->list);
+ }
+ spin_unlock(&ops_lock);
+
+ rv = op->info.rv;
+
+ if (!rv) {
+ /* check if the following are still valid or make a copy */
+ file = op->info.file;
+ fl = op->info.fl;
+
+ if (posix_lock_file_wait(file, fl) < 0)
+ log_error("gdlm_plock: vfs lock error file %p fl %p",
+ file, fl);
+ }
+
+ kfree(op);
+ return rv;
+}
+
int gdlm_punlock(void *lockspace, struct lm_lockname *name,
struct file *file, struct file_lock *fl)
{
@@ -242,8 +329,12 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count,
}
spin_unlock(&ops_lock);

- if (found)
- wake_up(&recv_wq);
+ if (found) {
+ if (op->info.callback)
+ gdlm_plock_callback(op);
+ else
+ wake_up(&recv_wq);
+ }
else
printk(KERN_INFO "gdlm dev_write no op %x %llx\n", info.fsid,
(unsigned long long)info.number);
diff --git a/fs/gfs2/ops_export.c b/fs/gfs2/ops_export.c
index 86127d9..80ca84f 100644
--- a/fs/gfs2/ops_export.c
+++ b/fs/gfs2/ops_export.c
@@ -22,6 +22,7 @@
#include "glock.h"
#include "glops.h"
#include "inode.h"
+#include "lm.h"
#include "ops_export.h"
#include "rgrp.h"
#include "util.h"
@@ -287,6 +288,56 @@ fail:
gfs2_glock_dq_uninit(&i_gh);
return ERR_PTR(error);
}
+/**
+ * gfs2_exp_lock - acquire/release a posix lock on a file
+ * @file: the file pointer
+ * @cmd: either modify or retrieve lock state, possibly wait
+ * @fl: type and range of lock
+ *
+ * Returns: errno
+ */
+
+static int gfs2_exp_lock(struct file *file, int cmd, struct file_lock *fl)
+{
+ struct gfs2_inode *ip = GFS2_I(file->f_mapping->host);
+ struct gfs2_sbd *sdp = GFS2_SB(file->f_mapping->host);
+ struct lm_lockname name =
+ { .ln_number = ip->i_num.no_addr,
+ .ln_type = LM_TYPE_PLOCK };
+
+ if (!(fl->fl_flags & FL_POSIX))
+ return -ENOLCK;
+ if ((ip->i_di.di_mode & (S_ISGID | S_IXGRP)) == S_ISGID)
+ return -ENOLCK;
+
+ if (sdp->sd_args.ar_localflocks) {
+ if (IS_GETLK(cmd)) {
+ struct file_lock tmp;
+ int ret;
+ ret = posix_test_lock(file, fl, &tmp);
+ fl->fl_type = F_UNLCK;
+ if (ret)
+ memcpy(fl, &tmp, sizeof(struct file_lock));
+ return 0;
+ } else {
+ return posix_lock_file_wait(file, fl);
+ }
+ }
+
+ if (IS_GETLK(cmd))
+ return gfs2_lm_plock_get(sdp, &name, file, fl);
+ else if (fl->fl_type == F_UNLCK)
+ return gfs2_lm_punlock(sdp, &name, file, fl);
+ else {
+ /* If fl_notify is set make an async lock request
+ and reply with -EINPROGRESS. When lock is granted
+ the gfs2_lm_plock_async should callback to fl_notify */
+ if (fl->fl_lmops->fl_notify)
+ return gfs2_lm_plock_async(sdp, &name, file, cmd, fl);
+ else
+ return gfs2_lm_plock(sdp, &name, file, cmd, fl);
+ }
+}

struct export_operations gfs2_export_ops = {
.decode_fh = gfs2_decode_fh,
@@ -294,5 +345,6 @@ struct export_operations gfs2_export_ops = {
.get_name = gfs2_get_name,
.get_parent = gfs2_get_parent,
.get_dentry = gfs2_get_dentry,
+ .lock = gfs2_exp_lock,
};

diff --git a/include/linux/lm_interface.h b/include/linux/lm_interface.h
index 1418fdc..28d5445 100644
--- a/include/linux/lm_interface.h
+++ b/include/linux/lm_interface.h
@@ -213,6 +213,9 @@ struct lm_lockops {
int (*lm_plock) (void *lockspace, struct lm_lockname *name,
struct file *file, int cmd, struct file_lock *fl);

+ int (*lm_plock_async) (void *lockspace, struct lm_lockname *name,
+ struct file *file, int cmd, struct file_lock *fl);
+
int (*lm_punlock) (void *lockspace, struct lm_lockname *name,
struct file *file, struct file_lock *fl);

diff --git a/include/linux/lock_dlm_plock.h b/include/linux/lock_dlm_plock.h
index fc34151..809c5b7 100644
--- a/include/linux/lock_dlm_plock.h
+++ b/include/linux/lock_dlm_plock.h
@@ -35,6 +35,9 @@ struct gdlm_plock_info {
__u64 start;
__u64 end;
__u64 owner;
+ void *callback;
+ void *fl;
+ void *file;
};

#endif
--
1.4.4.1
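
The path selection at the end of gfs2_exp_lock() can be sketched as follows. This is a hypothetical userspace model, not GFS2 code: the sketch_* types and SK_* names are invented, and the NULL check on the lock-manager operations is an assumption of the sketch (the patch as posted dereferences fl->fl_lmops directly in gfs2_exp_lock(), while gdlm_plock_async() does test it first).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for the four lock-module entry points. */
enum sketch_path {
	SK_PLOCK_GET,	/* F_GETLK: query for a conflicting lock */
	SK_PUNLOCK,	/* F_UNLCK: release */
	SK_PLOCK_ASYNC,	/* caller supplied a notify callback */
	SK_PLOCK_SYNC	/* no callback: ordinary synchronous request */
};

struct sketch_lmops {
	int (*notify)(void);	/* stands in for fl_notify */
};

struct sketch_fl {
	int is_unlock;			/* stands in for fl_type == F_UNLCK */
	const struct sketch_lmops *lmops;	/* may be NULL */
};

enum sketch_path sketch_exp_lock_path(int is_getlk, const struct sketch_fl *fl)
{
	assert(fl != NULL);

	if (is_getlk)
		return SK_PLOCK_GET;
	if (fl->is_unlock)
		return SK_PUNLOCK;
	/* Go asynchronous only when a callback is really available. */
	if (fl->lmops && fl->lmops->notify)
		return SK_PLOCK_ASYNC;
	return SK_PLOCK_SYNC;
}

/* Trivial callback used to exercise the async path. */
int sketch_noop_notify(void)
{
	return 0;
}
```

With this ordering, a caller that sets no lock-manager operations at all simply gets the synchronous path rather than a NULL dereference.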




2006-12-06 05:34:27

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 3/10] lockd: request deferral routine

From: Marc Eshel <[email protected]>

We need to keep some state for a pending asynchronous lock request, so this
patch adds that state to struct nlm_block.

This also adds a function which defers the request, by calling
rqstp->rq_chandle.defer and storing the resulting deferred request in a
nlm_block structure which we insert into lockd's global block list. That
new function isn't called yet, so it's dead code until a later patch.
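
As a rough userspace model of the deferral step, assuming simplified
stand-ins for the sunrpc types (struct deferred_req, struct cache_req and
struct block here are hypothetical, as are the enum defer_status names), the
control flow looks something like this:

```c
#include <stddef.h>

#define B_QUEUED 1	/* same value as in this series */

/* Hypothetical, stripped-down stand-ins for the sunrpc structures. */
struct deferred_req { int unused; };

struct cache_req {
	/* May be NULL when the transport cannot defer requests. */
	struct deferred_req *(*defer)(struct cache_req *req);
};

struct block {
	unsigned int flags;
	struct cache_req *cache_req;
	struct deferred_req *deferred_req;
};

enum defer_status { DEFER_NO_RESOURCES, DEFER_DROP_REPLY };

/* Mark the block as queued and try to park the request for a later revisit. */
static enum defer_status defer_request(struct cache_req *req, struct block *b)
{
	enum defer_status status = DEFER_NO_RESOURCES;

	b->flags |= B_QUEUED;
	b->cache_req = req;
	if (req->defer) {
		b->deferred_req = req->defer(req);
		if (b->deferred_req != NULL)
			status = DEFER_DROP_REPLY;	/* answer later */
	}
	return status;
}

/* Toy defer hooks for exercising the sketch. */
static struct deferred_req saved_req;

static struct deferred_req *defer_ok(struct cache_req *req)
{
	(void)req;
	return &saved_req;
}

static struct deferred_req *defer_fail(struct cache_req *req)
{
	(void)req;
	return NULL;
}
```

The pessimistic default status mirrors the patch: unless a deferred request
was actually obtained, the caller reports a resource failure rather than
silently dropping the reply.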

Signed-off-by: Marc Eshel <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/lockd/svclock.c | 25 +++++++++++++++++++++++++
include/linux/lockd/lockd.h | 10 ++++++++++
2 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index f523ca2..2ce4dc6 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -437,6 +437,31 @@ static void nlmsvc_freegrantargs(struct nlm_rqst *call)
}

/*
+ * Deferred lock request handling for non-blocking lock
+ */
+static u32
+nlmsvc_defer_lock_rqst(struct svc_rqst *rqstp, struct nlm_block *block)
+{
+ u32 status = nlm_lck_denied_nolocks;
+
+ block->b_flags |= B_QUEUED;
+
+ nlmsvc_insert_block(block, NLM_TIMEOUT);
+
+ block->b_cache_req = &rqstp->rq_chandle;
+ if (rqstp->rq_chandle.defer) {
+ block->b_deferred_req =
+ rqstp->rq_chandle.defer(block->b_cache_req);
+ if (block->b_deferred_req != NULL)
+ status = nlm_drop_reply;
+ }
+ dprintk("lockd: nlmsvc_defer_lock_rqst block %p flags %d status %d\n",
+ block, block->b_flags, status);
+
+ return status;
+}
+
+/*
* Attempt to establish a lock, and if it can't be granted, block it
* if required.
*/
diff --git a/include/linux/lockd/lockd.h b/include/linux/lockd/lockd.h
index 862d973..ac865f8 100644
--- a/include/linux/lockd/lockd.h
+++ b/include/linux/lockd/lockd.h
@@ -119,6 +119,9 @@ struct nlm_file {
* couldn't be granted because of a conflicting lock).
*/
#define NLM_NEVER (~(unsigned long) 0)
+/* timeout on non-blocking call: */
+#define NLM_TIMEOUT (7 * HZ)
+
struct nlm_block {
struct kref b_count; /* Reference count */
struct list_head b_list; /* linked list of all blocks */
@@ -130,6 +133,13 @@ struct nlm_block {
unsigned int b_id; /* block id */
unsigned char b_granted; /* VFS granted lock */
struct nlm_file * b_file; /* file in question */
+ struct cache_req * b_cache_req; /* deferred request handling */
+ struct file_lock * b_fl; /* set for GETLK */
+ struct cache_deferred_req * b_deferred_req;
+ unsigned int b_flags; /* block flags */
+#define B_QUEUED 1 /* lock queued */
+#define B_GOT_CALLBACK 2 /* got lock or conflicting lock */
+#define B_TOO_LATE 4 /* too late for non-blocking lock */
};

/*
--
1.4.4.1


2006-12-06 05:34:27

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 2/10] nfsd4: Convert NFSv4 to new lock interface

From: J. Bruce Fields <[email protected]>

Convert NFSv4 to the new lock interface. We don't define any callback for
now, so we're not taking advantage of the asynchronous feature--that's less
critical for the multi-threaded nfsd than it is for the single-threaded
lockd. But this does allow a cluster filesystem to export cluster-coherent
locking to NFS.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 15 ++++++++-------
1 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index fc0634d..140298a 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -50,6 +50,7 @@
#include <linux/nfsd/xdr4.h>
#include <linux/namei.h>
#include <linux/mutex.h>
+#include <linux/lockd/bind.h>

#define NFSDDBG_FACILITY NFSDDBG_PROC

@@ -2759,7 +2760,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nfsd4_lock
* locks_copy_lock: */
conflock.fl_ops = NULL;
conflock.fl_lmops = NULL;
- err = posix_lock_file_conf(filp, &file_lock, &conflock);
+ err = vfs_lock_file_conf(filp, &file_lock, &conflock);
switch (-err) {
case 0: /* success! */
update_stateid(&lock_stp->st_stateid);
@@ -2776,7 +2777,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nfsd4_lock
status = nfserr_deadlock;
break;
default:
- dprintk("NFSD: nfsd4_lock: posix_lock_file_conf() failed! status %d\n",err);
+ dprintk("NFSD: nfsd4_lock: vfs_lock_file_conf() failed! status %d\n",err);
status = nfserr_resource;
break;
}
@@ -2856,16 +2857,16 @@ nfsd4_lockt(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nfsd4_lock

nfs4_transform_lock_offset(&file_lock);

- /* posix_test_lock uses the struct file _only_ to resolve the inode.
+ /* vfs_test_lock uses the struct file _only_ to resolve the inode.
* since LOCKT doesn't require an OPEN, and therefore a struct
- * file may not exist, pass posix_test_lock a struct file with
+ * file may not exist, pass vfs_test_lock a struct file with
* only the dentry:inode set.
*/
memset(&file, 0, sizeof (struct file));
file.f_dentry = current_fh->fh_dentry;

status = nfs_ok;
- if (posix_test_lock(&file, &file_lock, &conflock)) {
+ if (vfs_test_lock(&file, &file_lock, &conflock)) {
status = nfserr_denied;
nfs4_set_lock_denied(&conflock, &lockt->lt_denied);
}
@@ -2919,9 +2920,9 @@ nfsd4_locku(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nfsd4_lock
/*
* Try to unlock the file in the VFS.
*/
- err = posix_lock_file(filp, &file_lock);
+ err = vfs_lock_file(filp, &file_lock);
if (err) {
- dprintk("NFSD: nfs4_locku: posix_lock_file failed!\n");
+ dprintk("NFSD: nfs4_locku: vfs_lock_file failed!\n");
goto out_nfserr;
}
/*
--
1.4.4.1


2006-12-06 05:34:26

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 8/10] lockd: always preallocate block in nlmsvc_lock()

From: J. Bruce Fields <[email protected]>

Normally we could skip ever having to allocate a block in the case where
the client asks for a non-blocking lock, or asks for a blocking lock that
succeeds immediately.

However we're going to want to always look up a block first in order to
check whether we're revisiting a deferred lock call, and to be prepared to
handle the case where the filesystem returns -EINPROGRESS--in that case we
want to make sure the lock we've given the filesystem is the one embedded
in the block that we'll use to track the deferred request.
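
A toy model of "look up first, otherwise allocate up front" might look like
the following. The struct lock and struct block types and the
lookup_or_create() name are hypothetical simplifications; the real code
matches locks with nlm_compare_locks() and allocates with
nlmsvc_create_block().

```c
#include <stdlib.h>

/* Hypothetical, simplified lock and block types. */
struct lock { long start, end; };

struct block {
	struct lock lock;	/* the lock embedded in the block */
	struct block *next;
};

/*
 * Return the existing block for this lock, or a freshly allocated one
 * whose embedded lock is a copy of the request. Returns NULL only if
 * allocation fails. A new block is not linked into the list here; the
 * caller decides later whether it needs to be queued.
 */
static struct block *lookup_or_create(struct block *head,
				      const struct lock *want)
{
	struct block *b;

	for (b = head; b != NULL; b = b->next)
		if (b->lock.start == want->start && b->lock.end == want->end)
			return b;

	b = calloc(1, sizeof(*b));
	if (b != NULL)
		b->lock = *want;
	return b;
}
```

Either way, the caller ends up holding a block and can pass &b->lock to the
filesystem, so a later asynchronous result can be matched back to it.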

Signed-off-by: Marc Eshel <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/lockd/svclock.c | 34 +++++++++++-----------------------
1 files changed, 11 insertions(+), 23 deletions(-)

diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index 21185a7..134a40e 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -471,7 +471,7 @@ __be32
nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
struct nlm_lock *lock, int wait, struct nlm_cookie *cookie)
{
- struct nlm_block *block, *newblock = NULL;
+ struct nlm_block *block = NULL;
int error;
__be32 ret;

@@ -484,17 +484,20 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
wait);


- lock->fl.fl_flags &= ~FL_SLEEP;
-again:
/* Lock file against concurrent access */
mutex_lock(&file->f_mutex);
- /* Get existing block (in case client is busy-waiting) */
+ /* Get existing block (in case client is busy-waiting)
+ * or create new block
+ */
block = nlmsvc_lookup_block(file, lock);
if (block == NULL) {
- if (newblock != NULL)
- lock = &newblock->b_call->a_args.lock;
- } else
+ block = nlmsvc_create_block(rqstp, file, lock, cookie);
+ ret = nlm_lck_denied_nolocks;
+ if (block == NULL)
+ goto out;
lock = &block->b_call->a_args.lock;
+ } else
+ lock->fl.fl_flags &= ~FL_SLEEP;

error = posix_lock_file(file->f_file, &lock->fl);
lock->fl.fl_flags &= ~FL_SLEEP;
@@ -520,26 +523,11 @@ again:
goto out;

ret = nlm_lck_blocked;
- if (block != NULL)
- goto out;
-
- /* If we don't have a block, create and initialize it. Then
- * retry because we may have slept in kmalloc. */
- /* We have to release f_mutex as nlmsvc_create_block may try to
- * to claim it while doing host garbage collection */
- if (newblock == NULL) {
- mutex_unlock(&file->f_mutex);
- dprintk("lockd: blocking on this lock (allocating).\n");
- if (!(newblock = nlmsvc_create_block(rqstp, file, lock, cookie)))
- return nlm_lck_denied_nolocks;
- goto again;
- }

/* Append to list of blocked */
- nlmsvc_insert_block(newblock, NLM_NEVER);
+ nlmsvc_insert_block(block, NLM_NEVER);
out:
mutex_unlock(&file->f_mutex);
- nlmsvc_release_block(newblock);
nlmsvc_release_block(block);
dprintk("lockd: nlmsvc_lock returned %u\n", ret);
return ret;
--
1.4.4.1


2006-12-06 05:34:29

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 4/10] locks: add fl_notify arguments

From: J. Bruce Fields <[email protected]>

We're using fl_notify to asynchronously return the result of a lock
request. So we want fl_notify to be able to return a status and, if
appropriate, a conflicting lock.

The only current caller of fl_notify is in the blocked case, in which case
we don't use these extra arguments.

We also allow fl_notify to return an error. (Also ignored for now.)
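
To make the new calling convention concrete, here is a minimal userspace
sketch of a lock manager implementing the three-argument callback. The
struct file_lock fields, struct waiter_state and example_notify() are
hypothetical; only the callback's shape (lock, conflicting lock, result,
int return) comes from this patch, and the -ENOLCK return for a late
callback follows what the later lockd patch in this series does.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, cut-down file_lock. */
struct file_lock { int fl_type; long fl_start, fl_end; };

struct lock_manager_operations {
	/* New shape: the lock, an optional conflicting lock, and a status. */
	int (*fl_notify)(struct file_lock *fl, struct file_lock *conf,
			 int result);
};

/* Hypothetical lock-manager bookkeeping for one outstanding request. */
struct waiter_state {
	int gave_up;		/* the manager already answered the client */
	int last_result;
	int saw_conflict;
};

static struct waiter_state state;

static int example_notify(struct file_lock *fl, struct file_lock *conf,
			  int result)
{
	(void)fl;
	if (state.gave_up)
		return -ENOLCK;	/* tell the filesystem the result is unwanted */
	state.last_result = result;
	state.saw_conflict = (conf != NULL);
	return 0;
}
```

A nonzero return gives the filesystem a chance to release a lock that was
granted after the manager stopped waiting.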

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/lockd/svclock.c | 7 ++++---
fs/locks.c | 2 +-
include/linux/fs.h | 2 +-
3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index 2ce4dc6..32f4cc4 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -637,12 +637,13 @@ nlmsvc_cancel_blocked(struct nlm_file *file, struct nlm_lock *lock)
* This function doesn't grant the blocked lock instantly, but rather moves
* the block to the head of nlm_blocked where it can be picked up by lockd.
*/
-static void
-nlmsvc_notify_blocked(struct file_lock *fl)
+static int
+nlmsvc_notify_blocked(struct file_lock *fl, struct file_lock *conf, int result)
{
struct nlm_block *block;

- dprintk("lockd: VFS unblock notification for block %p\n", fl);
+ dprintk("lockd: nlmsvc_notify_blocked lock %p conf %p result %d\n",
+ fl, conf, result);
list_for_each_entry(block, &nlm_blocked, b_list) {
if (nlm_compare_locks(&block->b_call->a_args.lock.fl, fl)) {
nlmsvc_insert_block(block, 0);
diff --git a/fs/locks.c b/fs/locks.c
index 451a61a..959347e 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -544,7 +544,7 @@ static void locks_wake_up_blocks(struct file_lock *blocker)
struct file_lock, fl_block);
__locks_delete_block(waiter);
if (waiter->fl_lmops && waiter->fl_lmops->fl_notify)
- waiter->fl_lmops->fl_notify(waiter);
+ waiter->fl_lmops->fl_notify(waiter, NULL, -EAGAIN);
else
wake_up(&waiter->fl_wait);
}
diff --git a/include/linux/fs.h b/include/linux/fs.h
index b1d287b..9b57afc 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -795,7 +795,7 @@ struct file_lock_operations {

struct lock_manager_operations {
int (*fl_compare_owner)(struct file_lock *, struct file_lock *);
- void (*fl_notify)(struct file_lock *); /* unblock callback */
+ int (*fl_notify)(struct file_lock *, struct file_lock *, int);
void (*fl_copy_lock)(struct file_lock *, struct file_lock *);
void (*fl_release_private)(struct file_lock *);
void (*fl_break)(struct file_lock *);
--
1.4.4.1


2006-12-06 05:34:30

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 5/10] lockd: handle fl_notify callbacks

From: Marc Eshel <[email protected]>

Add code to handle the filesystem callback that arrives when a deferred
lock request finally completes.
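
The race between the filesystem's callback and lockd's own timeout can be
modelled with the three block flags alone. In this sketch struct block,
on_callback() and on_timeout() are hypothetical names; the flag values and
the -ENOLCK return for a late callback are taken from this series.

```c
#include <errno.h>

#define B_QUEUED        1
#define B_GOT_CALLBACK  2
#define B_TOO_LATE      4

/* Hypothetical, minimal view of a deferred block. */
struct block {
	unsigned int flags;
	int granted;
};

/* The filesystem reports the outcome of a deferred request. */
static int on_callback(struct block *b, int result)
{
	if (b->flags & B_QUEUED) {
		if (b->flags & B_TOO_LATE)
			return -ENOLCK;	/* lockd already gave up */
		b->flags |= B_GOT_CALLBACK;
		if (result == 0)
			b->granted = 1;
		else
			b->flags |= B_TOO_LATE;
	}
	return 0;
}

/* lockd's timer fires for a deferred request. */
static void on_timeout(struct block *b)
{
	if (!(b->flags & B_GOT_CALLBACK))
		b->flags |= B_TOO_LATE;
}
```

Whichever event comes first wins: a callback before the timeout records the
result, while a timeout before the callback makes the later callback fail
with -ENOLCK.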

Signed-off-by: Marc Eshel <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/lockd/svclock.c | 78 ++++++++++++++++++++++++++++++++++++++++++++++++----
1 files changed, 72 insertions(+), 6 deletions(-)

diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index 32f4cc4..90aa4a5 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -367,6 +367,8 @@ static void nlmsvc_free_block(struct kref *kref)
nlmsvc_freegrantargs(block->b_call);
nlm_release_call(block->b_call);
nlm_release_file(block->b_file);
+ if (block->b_fl)
+ kfree(block->b_fl);
kfree(block);
}

@@ -631,6 +633,32 @@ nlmsvc_cancel_blocked(struct nlm_file *file, struct nlm_lock *lock)
}

/*
+ * This is a callback from the filesystem for VFS file lock requests.
+ * It will be used if fl_notify is defined and the filesystem can not
+ * respond to the request immediately.
+ * For GETLK request it will copy the reply to the nlm_block.
+ * For SETLK or SETLKW request it will get the local posix lock.
+ * In all cases it will move the block to the head of nlm_blocked q where
+ * nlmsvc_retry_blocked() can send back a reply for SETLKW or revisit the
+ * deferred rpc for GETLK and SETLK.
+ */
+static void
+nlmsvc_update_deferred_block(struct nlm_block *block, struct file_lock *conf,
+ int result)
+{
+ block->b_flags |= B_GOT_CALLBACK;
+ if (result == 0)
+ block->b_granted = 1;
+ else
+ block->b_flags |= B_TOO_LATE;
+ if (conf) {
+ block->b_fl = kmalloc(sizeof(struct file_lock), GFP_KERNEL);
+ if (block->b_fl)
+ locks_copy_lock(block->b_fl, conf);
+ }
+}
+
+/*
* Unblock a blocked lock request. This is a callback invoked from the
* VFS layer when a lock on which we blocked is removed.
*
@@ -641,18 +669,33 @@ static int
nlmsvc_notify_blocked(struct file_lock *fl, struct file_lock *conf, int result)
{
struct nlm_block *block;
+ int rc = -ENOENT;

dprintk("lockd: nlmsvc_notify_blocked lock %p conf %p result %d\n",
fl, conf, result);
+ lock_kernel();
list_for_each_entry(block, &nlm_blocked, b_list) {
if (nlm_compare_locks(&block->b_call->a_args.lock.fl, fl)) {
+ dprintk("lockd: nlmsvc_notify_blocked block %p flags %d\n",
+ block, block->b_flags);
+ if (block->b_flags & B_QUEUED) {
+ if (block->b_flags & B_TOO_LATE) {
+ rc = -ENOLCK;
+ break;
+ }
+ nlmsvc_update_deferred_block(block, conf, result);
+ }
nlmsvc_insert_block(block, 0);
svc_wake_up(block->b_daemon);
- return;
+ rc = 0;
+ break;
}
}
+ unlock_kernel();

- printk(KERN_WARNING "lockd: notification for unknown block!\n");
+ if (rc == -ENOENT)
+ printk(KERN_WARNING "lockd: notification for unknown block!\n");
+ return rc;
}

static int nlmsvc_same_owner(struct file_lock *fl1, struct file_lock *fl2)
@@ -685,6 +728,8 @@ nlmsvc_grant_blocked(struct nlm_block *block)

dprintk("lockd: grant blocked lock %p\n", block);

+ kref_get(&block->b_count);
+
/* Unlink block request from list */
nlmsvc_unlink_block(block);

@@ -707,11 +752,13 @@ nlmsvc_grant_blocked(struct nlm_block *block)
case -EAGAIN:
dprintk("lockd: lock still blocked\n");
nlmsvc_insert_block(block, NLM_NEVER);
+ nlmsvc_release_block(block);
return;
default:
printk(KERN_WARNING "lockd: unexpected error %d in %s!\n",
-error, __FUNCTION__);
nlmsvc_insert_block(block, 10 * HZ);
+ nlmsvc_release_block(block);
return;
}

@@ -724,7 +771,6 @@ callback:
nlmsvc_insert_block(block, 30 * HZ);

/* Call the client */
- kref_get(&block->b_count);
if (nlm_async_call(block->b_call, NLMPROC_GRANTED_MSG,
&nlmsvc_grant_ops) < 0)
nlmsvc_release_block(block);
@@ -799,6 +845,23 @@ nlmsvc_grant_reply(struct nlm_cookie *cookie, u32 status)
nlmsvc_release_block(block);
}

+/* Helper function to handle retry of a deferred block.
+ * If it is a blocking lock, call grant_blocked.
+ * For a non-blocking lock or test lock, revisit the request.
+ */
+static void
+retry_deferred_block(struct nlm_block *block)
+{
+ if (!(block->b_flags & B_GOT_CALLBACK))
+ block->b_flags |= B_TOO_LATE;
+ nlmsvc_insert_block(block, NLM_TIMEOUT);
+ dprintk("revisit block %p flags %d\n", block, block->b_flags);
+ if (block->b_deferred_req) {
+ block->b_deferred_req->revisit(block->b_deferred_req, 0);
+ block->b_deferred_req = NULL;
+ }
+}
+
/*
* Retry all blocked locks that have been notified. This is where lockd
* picks up locks that can be granted, or grant notifications that must
@@ -822,9 +885,12 @@ nlmsvc_retry_blocked(void)

dprintk("nlmsvc_retry_blocked(%p, when=%ld)\n",
block, block->b_when);
- kref_get(&block->b_count);
- nlmsvc_grant_blocked(block);
- nlmsvc_release_block(block);
+ if (block->b_flags & B_QUEUED) {
+ dprintk("nlmsvc_retry_blocked delete block (%p, granted=%d, flags=%d)\n",
+ block, block->b_granted, block->b_flags);
+ retry_deferred_block(block);
+ } else
+ nlmsvc_grant_blocked(block);
}

return timeout;
--
1.4.4.1


2006-12-06 05:34:11

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 1/10] lockd: add new export operation for nfsv4/lockd locking

From: Marc Eshel <[email protected]>

There is currently a filesystem ->lock() method, but it is defined only by
a few filesystems that are not exported via nfsd. So none of the lock
routines that are used by lockd or nfsv4 bother to call those methods.

Filesystems such as cluster filesystems would like to do their own locking
and also would like to be exportable via NFS.

So we add a new lock() export operation, and new routines vfs_lock_file,
vfs_test_lock, and vfs_cancel_lock, which call the new export operation,
falling back on the appropriate local operation if the export operation is
unavailable.
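
The "use the export operation if present, else fall back" pattern can be
shown in isolation. Everything below is a hypothetical userspace mock-up:
the cut-down structs, CMD_SETLK, local_posix_lock() and cluster_lock() only
mimic the shape of the kernel code in the diff.

```c
#include <stddef.h>

#define CMD_SETLK 1	/* hypothetical command code */

/* granted_by records who handled the lock: 1 = filesystem, 2 = fallback. */
struct file_lock { int granted_by; };

struct export_operations {
	int (*lock)(int cmd, struct file_lock *fl);
};

struct super_block {
	const struct export_operations *s_export_op;
};

/* Stand-in for the generic local locking path. */
static int local_posix_lock(struct file_lock *fl)
{
	fl->granted_by = 2;
	return 0;
}

/* Prefer the filesystem's own lock method; otherwise lock locally. */
static int lock_file(const struct super_block *sb, struct file_lock *fl)
{
	if (sb->s_export_op && sb->s_export_op->lock)
		return sb->s_export_op->lock(CMD_SETLK, fl);
	return local_posix_lock(fl);
}

/* Stand-in for a cluster filesystem's lock method. */
static int cluster_lock(int cmd, struct file_lock *fl)
{
	(void)cmd;
	fl->granted_by = 1;
	return 0;
}
```

Both pointers are checked because a filesystem may provide export
operations without providing a lock method.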

These new functions are intended to be used by lockd and nfsd; lockd and
nfsd changes to take advantage of them are made by later patches.

Acquiring a lock may require communication with remote hosts, and to avoid
blocking lockd or nfsd threads during such communication, we allow the
results to be returned asynchronously.

When a ->lock() call needs to block, the file system will return
-EINPROGRESS, and then later return the results with a call to the routine
in the fl_notify field of the lock_manager_operations struct.

Note that this is different from the ->lock() call discovering that there
is a conflict which would cause the caller to block; this is still handled
in the same way as before. In fact, we don't currently handle "blocking"
locks at all; those are less urgent, because the filesystem can always just
return an immediate -EAGAIN without denying the lock.

So this asynchronous interface is only used in the case of a non-blocking
lock, where we must know whether to allow or deny the lock now.
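
A toy model of the asynchronous handshake, assuming a filesystem that can
hold one pending request, is sketched below. The names toy_fs_lock(),
toy_fs_complete() and record_result(), and the idea of storing the callback
directly in the lock, are illustrative assumptions; in the kernel the
callback is reached through the lock_manager_operations struct.

```c
#include <errno.h>
#include <stddef.h>

struct file_lock;
typedef int (*notify_fn)(struct file_lock *fl, struct file_lock *conf,
			 int result);

/* Hypothetical lock carrying its own completion callback. */
struct file_lock {
	notify_fn fl_notify;
	int done;
	int result;
};

static struct file_lock *pending;	/* the one request in flight */

/* Called by the lock manager: answer now, or promise an answer later. */
static int toy_fs_lock(struct file_lock *fl)
{
	if (fl->fl_notify == NULL)
		return 0;		/* synchronous caller: grant at once */
	pending = fl;
	return -EINPROGRESS;		/* result will come via fl_notify */
}

/* Called when the "remote nodes" have answered. */
static int toy_fs_complete(int result)
{
	struct file_lock *fl = pending;

	if (fl == NULL)
		return -ENOENT;
	pending = NULL;
	return fl->fl_notify(fl, NULL, result);
}

/* A lock manager callback that simply records the outcome. */
static int record_result(struct file_lock *fl, struct file_lock *conf,
			 int result)
{
	(void)conf;
	fl->done = 1;
	fl->result = result;
	return 0;
}
```

A caller that sets no callback never sees -EINPROGRESS, which matches the
rule that the asynchronous return is allowed only when fl_notify is set.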

(Note: with this patch, we haven't yet modified lockd to handle such a
callback, which we must do before a filesystem can safely use it in this
way.)

Signed-off-by: Marc Eshel <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/lockd/svclock.c | 108 +++++++++++++++++++++++++++++++++++++++++++-
include/linux/fs.h | 2 +
include/linux/lockd/bind.h | 4 ++
3 files changed, 113 insertions(+), 1 deletions(-)

diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index 7e219b9..f523ca2 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -20,6 +20,7 @@
* Copyright (C) 1996, Olaf Kirch <[email protected]>
*/

+#include <linux/module.h>
#include <linux/types.h>
#include <linux/errno.h>
#include <linux/kernel.h>
@@ -51,6 +52,111 @@ static const struct rpc_call_ops nlmsvc_grant_ops;
*/
static LIST_HEAD(nlm_blocked);

+ /**
+ * vfs_lock_file - file byte range lock
+ * @filp: The file to apply the lock to
+ * @fl: The lock to be applied
+ *
+ * To avoid blocking kernel daemons, such as lockd, that need to acquire POSIX
+ * locks, the ->lock() interface may return asynchronously, before the lock has
+ * been granted or denied by the underlying filesystem, if (and only if)
+ * fl_notify is set. Callers expecting ->lock() to return asynchronously
+ * will only use F_SETLK, not F_SETLKW; they will set FL_SLEEP if (and only if)
+ * the request is for a blocking lock. When ->lock() does return asynchronously,
+ * it must return -EINPROGRESS, and call ->fl_notify() when the lock
+ * request completes.
+ * If the request is for non-blocking lock the file system should return
+ * -EINPROGRESS then try to get the lock and call the callback routine with
+ * the result. If the request timed out the callback routine will return a
+ * nonzero return code and the file system should release the lock. The file
+ * system is also responsible to keep a corresponding posix lock when it
+ * grants a lock so the VFS can find out which locks are locally held and do
+ * the correct lock cleanup when required.
+ * The underlying filesystem must not drop the kernel lock or call
+ * ->fl_notify() before returning to the caller with a -EINPROGRESS
+ * return code.
+ */
+int vfs_lock_file(struct file *filp, struct file_lock *fl)
+{
+ struct super_block *sb;
+
+ sb = filp->f_dentry->d_inode->i_sb;
+ if (sb->s_export_op && sb->s_export_op->lock)
+ return sb->s_export_op->lock(filp, F_SETLK, fl);
+ else
+ return posix_lock_file(filp, fl);
+}
+EXPORT_SYMBOL(vfs_lock_file);
+
+/**
+ * vfs_lock_file_conf - file byte range lock
+ * @filp: The file to apply the lock to
+ * @fl: The lock to be applied
+ * @conf: Place to return a copy of the conflicting lock, if found.
+ *
+ * read comments for vfs_lock_file()
+ */
+int vfs_lock_file_conf(struct file *filp, struct file_lock *fl, struct file_lock *conf)
+{
+ struct super_block *sb;
+
+ sb = filp->f_dentry->d_inode->i_sb;
+ if (sb->s_export_op && sb->s_export_op->lock) {
+ locks_copy_lock(conf, fl);
+ return sb->s_export_op->lock(filp, F_SETLK, fl);
+ } else
+ return posix_lock_file_conf(filp, fl, conf);
+}
+EXPORT_SYMBOL(vfs_lock_file_conf);
+
+/**
+ * vfs_test_lock - test file byte range lock
+ * @filp: The file to test lock for
+ * @fl: The lock to test
+ * @conf: Place to return a copy of the conflicting lock, if found.
+ */
+int vfs_test_lock(struct file *filp, struct file_lock *fl, struct file_lock *conf)
+{
+ int error;
+ struct super_block *sb;
+
+ conf->fl_type = F_UNLCK;
+ sb = filp->f_dentry->d_inode->i_sb;
+ if (sb->s_export_op && sb->s_export_op->lock) {
+ locks_copy_lock(conf, fl);
+ error = sb->s_export_op->lock(filp, F_GETLK, conf);
+ if (!error) {
+ if (conf->fl_type != F_UNLCK)
+ error = 1;
+ }
+ return error;
+ } else
+ return posix_test_lock(filp, fl, conf);
+}
+EXPORT_SYMBOL(vfs_test_lock);
+
+/**
+ * vfs_cancel_lock - file byte range unblock lock
+ * @filp: The file to apply the unblock to
+ * @fl: The lock to be unblocked
+ *
+ * FL_CANCEL is used to cancel blocked requests
+ */
+int vfs_cancel_lock(struct file *filp, struct file_lock *fl)
+{
+ int status;
+ struct super_block *sb;
+
+ fl->fl_flags |= FL_CANCEL;
+ sb = filp->f_dentry->d_inode->i_sb;
+ if (sb->s_export_op && sb->s_export_op->lock)
+ status = sb->s_export_op->lock(filp, F_SETLK, fl);
+ else
+ status = posix_unblock_lock(filp, fl);
+ fl->fl_flags &= ~FL_CANCEL;
+ return status;
+}
+
/*
* Insert a blocked lock into the global list
*/
@@ -241,7 +347,7 @@ static int nlmsvc_unlink_block(struct nlm_block *block)
dprintk("lockd: unlinking block %p...\n", block);

/* Remove block from list */
- status = posix_unblock_lock(block->b_file->f_file, &block->b_call->a_args.lock.fl);
+ status = vfs_cancel_lock(block->b_file->f_file, &block->b_call->a_args.lock.fl);
nlmsvc_remove_block(block);
return status;
}
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2fe6e3f..b1d287b 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -770,6 +770,7 @@ extern spinlock_t files_lock;

#define FL_POSIX 1
#define FL_FLOCK 2
+#define FL_CANCEL 4 /* set to request cancelling a lock */
#define FL_ACCESS 8 /* not trying to lock, just looking */
#define FL_EXISTS 16 /* when unlocking, test for existence */
#define FL_LEASE 32 /* lease held on this file */
@@ -1372,6 +1373,7 @@ struct export_operations {
int (*acceptable)(void *context, struct dentry *de),
void *context);

+ int (*lock) (struct file *, int, struct file_lock *);

};

diff --git a/include/linux/lockd/bind.h b/include/linux/lockd/bind.h
index aa50d89..780bec4 100644
--- a/include/linux/lockd/bind.h
+++ b/include/linux/lockd/bind.h
@@ -38,4 +38,8 @@ extern int nlmclnt_proc(struct inode *, int, struct file_lock *);
extern int lockd_up(int proto);
extern void lockd_down(void);

+extern int vfs_lock_file(struct file *, struct file_lock *);
+extern int vfs_lock_file_conf(struct file *, struct file_lock *, struct file_lock *);
+extern int vfs_test_lock(struct file *, struct file_lock *, struct file_lock *);
+
#endif /* LINUX_LOCKD_BIND_H */
--
1.4.4.1
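
For readers following the interface rather than the diff, the contract described in the vfs_lock_file() comment can be mocked up in userspace. This is only a sketch under stated assumptions: every `sketch_` type and function is a hypothetical stand-in rather than kernel API, and the "filesystem" finishes the request from an explicit `sketch_fs_complete()` call instead of hearing back from another node.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct file_lock; not the kernel structure. */
struct sketch_lock {
	int (*notify)(struct sketch_lock *fl, int result); /* plays fl_notify */
	int waiting;   /* caller side: still waiting for the result? */
	int result;    /* caller side: result delivered through notify */
	int fs_holds;  /* filesystem side: is the cluster lock held? */
};

/* Caller-side callback: a nonzero return means "too late, I gave up". */
static int sketch_notify(struct sketch_lock *fl, int result)
{
	if (!fl->waiting)
		return -ENOENT;
	fl->result = result;
	fl->waiting = 0;
	return 0;
}

/* Plays ->lock(): asynchronous if (and only if) a callback is set. */
static int sketch_fs_lock(struct sketch_lock *fl)
{
	if (fl->notify == NULL) {
		fl->fs_holds = 1;      /* synchronous grant */
		return 0;
	}
	return -EINPROGRESS;           /* result arrives later via notify */
}

/* Plays the filesystem finishing the request some time later. */
static void sketch_fs_complete(struct sketch_lock *fl, int result)
{
	assert(fl->notify != NULL);
	fl->fs_holds = (result == 0);
	/* The caller timed out, so a lock that was granted must be released. */
	if (fl->notify(fl, result) != 0 && fl->fs_holds)
		fl->fs_holds = 0;
}
```

The point of the nonzero return is visible in the second branch of `sketch_fs_complete()`: without it, a grant that arrives after the caller has given up would leave a lock that nobody knows about.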


2006-12-06 05:34:20

by J. Bruce Fields

Subject: [PATCH 10/10] gfs2: nfs lock support for gfs2

From: J. Bruce Fields <[email protected]>

From: Marc Eshel <[email protected]>

Add NFS lock support to GFS2. (Untested.)

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/gfs2/lm.c | 10 ++++
fs/gfs2/lm.h | 2 +
fs/gfs2/locking/dlm/lock_dlm.h | 2 +
fs/gfs2/locking/dlm/mount.c | 1 +
fs/gfs2/locking/dlm/plock.c | 95 +++++++++++++++++++++++++++++++++++++++-
fs/gfs2/ops_export.c | 52 ++++++++++++++++++++++
include/linux/lm_interface.h | 3 +
include/linux/lock_dlm_plock.h | 3 +
8 files changed, 166 insertions(+), 2 deletions(-)

diff --git a/fs/gfs2/lm.c b/fs/gfs2/lm.c
index effe4a3..cf7fd52 100644
--- a/fs/gfs2/lm.c
+++ b/fs/gfs2/lm.c
@@ -197,6 +197,16 @@ int gfs2_lm_plock(struct gfs2_sbd *sdp, struct lm_lockname *name,
return error;
}

+int gfs2_lm_plock_async(struct gfs2_sbd *sdp, struct lm_lockname *name,
+ struct file *file, int cmd, struct file_lock *fl)
+{
+ int error = -EIO;
+ if (likely(!test_bit(SDF_SHUTDOWN, &sdp->sd_flags)))
+ error = sdp->sd_lockstruct.ls_ops->lm_plock_async(
+ sdp->sd_lockstruct.ls_lockspace, name, file, cmd, fl);
+ return error;
+}
+
int gfs2_lm_punlock(struct gfs2_sbd *sdp, struct lm_lockname *name,
struct file *file, struct file_lock *fl)
{
diff --git a/fs/gfs2/lm.h b/fs/gfs2/lm.h
index 21cdc30..1ddd1fd 100644
--- a/fs/gfs2/lm.h
+++ b/fs/gfs2/lm.h
@@ -34,6 +34,8 @@ int gfs2_lm_plock_get(struct gfs2_sbd *sdp, struct lm_lockname *name,
struct file *file, struct file_lock *fl);
int gfs2_lm_plock(struct gfs2_sbd *sdp, struct lm_lockname *name,
struct file *file, int cmd, struct file_lock *fl);
+int gfs2_lm_plock_async(struct gfs2_sbd *sdp, struct lm_lockname *name,
+ struct file *file, int cmd, struct file_lock *fl);
int gfs2_lm_punlock(struct gfs2_sbd *sdp, struct lm_lockname *name,
struct file *file, struct file_lock *fl);
void gfs2_lm_recovery_done(struct gfs2_sbd *sdp, unsigned int jid,
diff --git a/fs/gfs2/locking/dlm/lock_dlm.h b/fs/gfs2/locking/dlm/lock_dlm.h
index 33af707..82af860 100644
--- a/fs/gfs2/locking/dlm/lock_dlm.h
+++ b/fs/gfs2/locking/dlm/lock_dlm.h
@@ -179,6 +179,8 @@ int gdlm_plock_init(void);
void gdlm_plock_exit(void);
int gdlm_plock(void *, struct lm_lockname *, struct file *, int,
struct file_lock *);
+int gdlm_plock_async(void *, struct lm_lockname *, struct file *, int,
+ struct file_lock *);
int gdlm_plock_get(void *, struct lm_lockname *, struct file *,
struct file_lock *);
int gdlm_punlock(void *, struct lm_lockname *, struct file *,
diff --git a/fs/gfs2/locking/dlm/mount.c b/fs/gfs2/locking/dlm/mount.c
index cdd1694..4339e3f 100644
--- a/fs/gfs2/locking/dlm/mount.c
+++ b/fs/gfs2/locking/dlm/mount.c
@@ -244,6 +244,7 @@ const struct lm_lockops gdlm_ops = {
.lm_lock = gdlm_lock,
.lm_unlock = gdlm_unlock,
.lm_plock = gdlm_plock,
+ .lm_plock_async = gdlm_plock_async,
.lm_punlock = gdlm_punlock,
.lm_plock_get = gdlm_plock_get,
.lm_cancel = gdlm_cancel,
diff --git a/fs/gfs2/locking/dlm/plock.c b/fs/gfs2/locking/dlm/plock.c
index 7365aec..c21e667 100644
--- a/fs/gfs2/locking/dlm/plock.c
+++ b/fs/gfs2/locking/dlm/plock.c
@@ -102,6 +102,93 @@ int gdlm_plock(void *lockspace, struct lm_lockname *name,
return rv;
}

+int gdlm_plock_async(void *lockspace, struct lm_lockname *name,
+ struct file *file, int cmd, struct file_lock *fl)
+{
+ struct gdlm_ls *ls = lockspace;
+ struct plock_op *op;
+ int rv;
+
+ op = kzalloc(sizeof(*op), GFP_KERNEL);
+ if (!op)
+ return -ENOMEM;
+
+ op->info.optype = GDLM_PLOCK_OP_LOCK;
+ op->info.pid = fl->fl_pid;
+ op->info.ex = (fl->fl_type == F_WRLCK);
+ op->info.wait = IS_SETLKW(cmd);
+ op->info.fsid = ls->id;
+ op->info.number = name->ln_number;
+ op->info.start = fl->fl_start;
+ op->info.end = fl->fl_end;
+ op->info.owner = (__u64)(long) fl->fl_owner;
+ if (fl->fl_lmops) {
+ op->info.callback = fl->fl_lmops->fl_notify;
+ /* might need to make a copy */
+ op->info.fl = fl;
+ op->info.file = file;
+ } else
+ op->info.callback = NULL;
+
+ send_op(op);
+
+ if (op->info.callback == NULL)
+ wait_event(recv_wq, (op->done != 0));
+ else
+ return -EINPROGRESS;
+
+ spin_lock(&ops_lock);
+ if (!list_empty(&op->list)) {
+ printk(KERN_INFO "plock op on list\n");
+ list_del(&op->list);
+ }
+ spin_unlock(&ops_lock);
+
+ rv = op->info.rv;
+
+ if (!rv) {
+ if (posix_lock_file_wait(file, fl) < 0)
+ log_error("gdlm_plock: vfs lock error %x,%llx",
+ name->ln_type,
+ (unsigned long long)name->ln_number);
+ } else {
+ /* XXX: We need to cancel the lock here: */
+ printk("gfs2 lock granted after lock request failed; dangling lock!\n");
+ }
+
+ kfree(op);
+ return rv;
+}
+
+int gdlm_plock_callback(struct plock_op *op)
+{
+ struct file *file;
+ struct file_lock *fl;
+ int rv;
+
+ spin_lock(&ops_lock);
+ if (!list_empty(&op->list)) {
+ printk(KERN_INFO "plock op on list\n");
+ list_del(&op->list);
+ }
+ spin_unlock(&ops_lock);
+
+ rv = op->info.rv;
+
+ if (!rv) {
+ /* check if the following are still valid or make a copy */
+ file = op->info.file;
+ fl = op->info.fl;
+
+ if (posix_lock_file_wait(file, fl) < 0)
+ log_error("gdlm_plock: vfs lock error file %p fl %p",
+ file, fl);
+ }
+
+ kfree(op);
+ return rv;
+}
+
int gdlm_punlock(void *lockspace, struct lm_lockname *name,
struct file *file, struct file_lock *fl)
{
@@ -242,8 +329,12 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count,
}
spin_unlock(&ops_lock);

- if (found)
- wake_up(&recv_wq);
+ if (found) {
+ if (op->info.callback)
+ gdlm_plock_callback(op);
+ else
+ wake_up(&recv_wq);
+ }
else
printk(KERN_INFO "gdlm dev_write no op %x %llx\n", info.fsid,
(unsigned long long)info.number);
diff --git a/fs/gfs2/ops_export.c b/fs/gfs2/ops_export.c
index 86127d9..80ca84f 100644
--- a/fs/gfs2/ops_export.c
+++ b/fs/gfs2/ops_export.c
@@ -22,6 +22,7 @@
#include "glock.h"
#include "glops.h"
#include "inode.h"
+#include "lm.h"
#include "ops_export.h"
#include "rgrp.h"
#include "util.h"
@@ -287,6 +288,56 @@ fail:
gfs2_glock_dq_uninit(&i_gh);
return ERR_PTR(error);
}
+/**
+ * gfs2_exp_lock - acquire/release a posix lock on a file
+ * @file: the file pointer
+ * @cmd: either modify or retrieve lock state, possibly wait
+ * @fl: type and range of lock
+ *
+ * Returns: errno
+ */
+
+static int gfs2_exp_lock(struct file *file, int cmd, struct file_lock *fl)
+{
+ struct gfs2_inode *ip = GFS2_I(file->f_mapping->host);
+ struct gfs2_sbd *sdp = GFS2_SB(file->f_mapping->host);
+ struct lm_lockname name =
+ { .ln_number = ip->i_num.no_addr,
+ .ln_type = LM_TYPE_PLOCK };
+
+ if (!(fl->fl_flags & FL_POSIX))
+ return -ENOLCK;
+ if ((ip->i_di.di_mode & (S_ISGID | S_IXGRP)) == S_ISGID)
+ return -ENOLCK;
+
+ if (sdp->sd_args.ar_localflocks) {
+ if (IS_GETLK(cmd)) {
+ struct file_lock tmp;
+ int ret;
+ ret = posix_test_lock(file, fl, &tmp);
+ fl->fl_type = F_UNLCK;
+ if (ret)
+ memcpy(fl, &tmp, sizeof(struct file_lock));
+ return 0;
+ } else {
+ return posix_lock_file_wait(file, fl);
+ }
+ }
+
+ if (IS_GETLK(cmd))
+ return gfs2_lm_plock_get(sdp, &name, file, fl);
+ else if (fl->fl_type == F_UNLCK)
+ return gfs2_lm_punlock(sdp, &name, file, fl);
+ else {
+ /* If fl_notify is set, make an async lock request
+ and reply with -EINPROGRESS. When the lock is granted,
+ gfs2_lm_plock_async should call back to fl_notify. */
+ if (fl->fl_lmops->fl_notify)
+ return gfs2_lm_plock_async(sdp, &name, file, cmd, fl);
+ else
+ return gfs2_lm_plock(sdp, &name, file, cmd, fl);
+ }
+}

struct export_operations gfs2_export_ops = {
.decode_fh = gfs2_decode_fh,
@@ -294,5 +345,6 @@ struct export_operations gfs2_export_ops = {
.get_name = gfs2_get_name,
.get_parent = gfs2_get_parent,
.get_dentry = gfs2_get_dentry,
+ .lock = gfs2_exp_lock,
};

diff --git a/include/linux/lm_interface.h b/include/linux/lm_interface.h
index 1418fdc..28d5445 100644
--- a/include/linux/lm_interface.h
+++ b/include/linux/lm_interface.h
@@ -213,6 +213,9 @@ struct lm_lockops {
int (*lm_plock) (void *lockspace, struct lm_lockname *name,
struct file *file, int cmd, struct file_lock *fl);

+ int (*lm_plock_async) (void *lockspace, struct lm_lockname *name,
+ struct file *file, int cmd, struct file_lock *fl);
+
int (*lm_punlock) (void *lockspace, struct lm_lockname *name,
struct file *file, struct file_lock *fl);

diff --git a/include/linux/lock_dlm_plock.h b/include/linux/lock_dlm_plock.h
index fc34151..809c5b7 100644
--- a/include/linux/lock_dlm_plock.h
+++ b/include/linux/lock_dlm_plock.h
@@ -35,6 +35,9 @@ struct gdlm_plock_info {
__u64 start;
__u64 end;
__u64 owner;
+ void *callback;
+ void *fl;
+ void *file;
};

#endif
--
1.4.4.1
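
The two "make a copy" comments in plock.c point at a lifetime question: the op keeps a pointer to the caller's file_lock, which may no longer be valid by the time the reply arrives. One possible way out, sketched here in userspace with hypothetical `sketch_` names and a deliberately simplified lock structure, is to store a private copy in the op when the request is queued.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct file_lock. */
struct sketch_lock {
	long long start;
	long long end;
	int exclusive;
};

/* Hypothetical stand-in for struct plock_op: owns a private copy. */
struct sketch_op {
	struct sketch_lock fl;   /* by value, not a pointer to caller memory */
	int rv;
};

/* Queue a request; the caller's lock may go away once this returns. */
static struct sketch_op *sketch_op_new(const struct sketch_lock *caller_fl)
{
	struct sketch_op *op;

	assert(caller_fl != NULL);
	op = calloc(1, sizeof(*op));
	if (op == NULL)
		return NULL;
	op->fl = *caller_fl;     /* plain structure copy */
	return op;
}
```

A real file_lock carries pointers and list heads, so a plain structure copy like this would not be enough in the kernel; the lockd patch in this series uses locks_copy_lock() for its own copies, and something along those lines is presumably what the comments have in mind.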


2006-12-06 06:00:59

by Wendy Cheng

Subject: Re: [NFS] [PATCH 10/10] gfs2: nfs lock support for gfs2

J. Bruce Fields wrote:

>From: J. Bruce Fields <[email protected]>
>
>From: Marc Eshel <[email protected]>
>
>Add NFS lock support to GFS2. (Untested.)
>
>
Untested ? Trying to keep us busy ? :) ..

-- Wendy


2006-12-06 12:02:48

by Steven Whitehouse

Subject: Re: [PATCH 10/10] gfs2: nfs lock support for gfs2

Hi,

This looks good to me, and I'm copying in Dave & Wendy who have both
done previous work in this area for further comment. Provided we can get
this tested, I'd be happy to accept the patch in its current form.

Steve.

On Wed, 2006-12-06 at 00:34 -0500, J. Bruce Fields wrote:
> From: J. Bruce Fields <[email protected]>
>
> From: Marc Eshel <[email protected]>
>
> Add NFS lock support to GFS2. (Untested.)
>
> Signed-off-by: J. Bruce Fields <[email protected]>
> ---
> fs/gfs2/lm.c | 10 ++++
> fs/gfs2/lm.h | 2 +
> fs/gfs2/locking/dlm/lock_dlm.h | 2 +
> fs/gfs2/locking/dlm/mount.c | 1 +
> fs/gfs2/locking/dlm/plock.c | 95 +++++++++++++++++++++++++++++++++++++++-
> fs/gfs2/ops_export.c | 52 ++++++++++++++++++++++
> include/linux/lm_interface.h | 3 +
> include/linux/lock_dlm_plock.h | 3 +
> 8 files changed, 166 insertions(+), 2 deletions(-)
>
> diff --git a/fs/gfs2/lm.c b/fs/gfs2/lm.c
> index effe4a3..cf7fd52 100644
> --- a/fs/gfs2/lm.c
> +++ b/fs/gfs2/lm.c
> @@ -197,6 +197,16 @@ int gfs2_lm_plock(struct gfs2_sbd *sdp, struct lm_lockname *name,
> return error;
> }
>
> +int gfs2_lm_plock_async(struct gfs2_sbd *sdp, struct lm_lockname *name,
> + struct file *file, int cmd, struct file_lock *fl)
> +{
> + int error = -EIO;
> + if (likely(!test_bit(SDF_SHUTDOWN, &sdp->sd_flags)))
> + error = sdp->sd_lockstruct.ls_ops->lm_plock_async(
> + sdp->sd_lockstruct.ls_lockspace, name, file, cmd, fl);
> + return error;
> +}
> +
> int gfs2_lm_punlock(struct gfs2_sbd *sdp, struct lm_lockname *name,
> struct file *file, struct file_lock *fl)
> {
> diff --git a/fs/gfs2/lm.h b/fs/gfs2/lm.h
> index 21cdc30..1ddd1fd 100644
> --- a/fs/gfs2/lm.h
> +++ b/fs/gfs2/lm.h
> @@ -34,6 +34,8 @@ int gfs2_lm_plock_get(struct gfs2_sbd *sdp, struct lm_lockname *name,
> struct file *file, struct file_lock *fl);
> int gfs2_lm_plock(struct gfs2_sbd *sdp, struct lm_lockname *name,
> struct file *file, int cmd, struct file_lock *fl);
> +int gfs2_lm_plock_async(struct gfs2_sbd *sdp, struct lm_lockname *name,
> + struct file *file, int cmd, struct file_lock *fl);
> int gfs2_lm_punlock(struct gfs2_sbd *sdp, struct lm_lockname *name,
> struct file *file, struct file_lock *fl);
> void gfs2_lm_recovery_done(struct gfs2_sbd *sdp, unsigned int jid,
> diff --git a/fs/gfs2/locking/dlm/lock_dlm.h b/fs/gfs2/locking/dlm/lock_dlm.h
> index 33af707..82af860 100644
> --- a/fs/gfs2/locking/dlm/lock_dlm.h
> +++ b/fs/gfs2/locking/dlm/lock_dlm.h
> @@ -179,6 +179,8 @@ int gdlm_plock_init(void);
> void gdlm_plock_exit(void);
> int gdlm_plock(void *, struct lm_lockname *, struct file *, int,
> struct file_lock *);
> +int gdlm_plock_async(void *, struct lm_lockname *, struct file *, int,
> + struct file_lock *);
> int gdlm_plock_get(void *, struct lm_lockname *, struct file *,
> struct file_lock *);
> int gdlm_punlock(void *, struct lm_lockname *, struct file *,
> diff --git a/fs/gfs2/locking/dlm/mount.c b/fs/gfs2/locking/dlm/mount.c
> index cdd1694..4339e3f 100644
> --- a/fs/gfs2/locking/dlm/mount.c
> +++ b/fs/gfs2/locking/dlm/mount.c
> @@ -244,6 +244,7 @@ const struct lm_lockops gdlm_ops = {
> .lm_lock = gdlm_lock,
> .lm_unlock = gdlm_unlock,
> .lm_plock = gdlm_plock,
> + .lm_plock_async = gdlm_plock_async,
> .lm_punlock = gdlm_punlock,
> .lm_plock_get = gdlm_plock_get,
> .lm_cancel = gdlm_cancel,
> diff --git a/fs/gfs2/locking/dlm/plock.c b/fs/gfs2/locking/dlm/plock.c
> index 7365aec..c21e667 100644
> --- a/fs/gfs2/locking/dlm/plock.c
> +++ b/fs/gfs2/locking/dlm/plock.c
> @@ -102,6 +102,93 @@ int gdlm_plock(void *lockspace, struct lm_lockname *name,
> return rv;
> }
>
> +int gdlm_plock_async(void *lockspace, struct lm_lockname *name,
> + struct file *file, int cmd, struct file_lock *fl)
> +{
> + struct gdlm_ls *ls = lockspace;
> + struct plock_op *op;
> + int rv;
> +
> + op = kzalloc(sizeof(*op), GFP_KERNEL);
> + if (!op)
> + return -ENOMEM;
> +
> + op->info.optype = GDLM_PLOCK_OP_LOCK;
> + op->info.pid = fl->fl_pid;
> + op->info.ex = (fl->fl_type == F_WRLCK);
> + op->info.wait = IS_SETLKW(cmd);
> + op->info.fsid = ls->id;
> + op->info.number = name->ln_number;
> + op->info.start = fl->fl_start;
> + op->info.end = fl->fl_end;
> + op->info.owner = (__u64)(long) fl->fl_owner;
> + if (fl->fl_lmops) {
> + op->info.callback = fl->fl_lmops->fl_notify;
> + /* might need to make a copy */
> + op->info.fl = fl;
> + op->info.file = file;
> + } else
> + op->info.callback = NULL;
> +
> + send_op(op);
> +
> + if (op->info.callback == NULL)
> + wait_event(recv_wq, (op->done != 0));
> + else
> + return -EINPROGRESS;
> +
> + spin_lock(&ops_lock);
> + if (!list_empty(&op->list)) {
> + printk(KERN_INFO "plock op on list\n");
> + list_del(&op->list);
> + }
> + spin_unlock(&ops_lock);
> +
> + rv = op->info.rv;
> +
> + if (!rv) {
> + if (posix_lock_file_wait(file, fl) < 0)
> + log_error("gdlm_plock: vfs lock error %x,%llx",
> + name->ln_type,
> + (unsigned long long)name->ln_number);
> + } else {
> + /* XXX: We need to cancel the lock here: */
> + printk("gfs2 lock granted after lock request failed; dangling lock!\n");
> + }
> +
> + kfree(op);
> + return rv;
> +}
> +
> +int gdlm_plock_callback(struct plock_op *op)
> +{
> + struct file *file;
> + struct file_lock *fl;
> + int rv;
> +
> + spin_lock(&ops_lock);
> + if (!list_empty(&op->list)) {
> + printk(KERN_INFO "plock op on list\n");
> + list_del(&op->list);
> + }
> + spin_unlock(&ops_lock);
> +
> + rv = op->info.rv;
> +
> + if (!rv) {
> + /* check if the following are still valid or make a copy */
> + file = op->info.file;
> + fl = op->info.fl;
> +
> + if (posix_lock_file_wait(file, fl) < 0)
> + log_error("gdlm_plock: vfs lock error file %p fl %p",
> + file, fl);
> + }
> +
> + kfree(op);
> + return rv;
> +}
> +
> int gdlm_punlock(void *lockspace, struct lm_lockname *name,
> struct file *file, struct file_lock *fl)
> {
> @@ -242,8 +329,12 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count,
> }
> spin_unlock(&ops_lock);
>
> - if (found)
> - wake_up(&recv_wq);
> + if (found) {
> + if (op->info.callback)
> + gdlm_plock_callback(op);
> + else
> + wake_up(&recv_wq);
> + }
> else
> printk(KERN_INFO "gdlm dev_write no op %x %llx\n", info.fsid,
> (unsigned long long)info.number);
> diff --git a/fs/gfs2/ops_export.c b/fs/gfs2/ops_export.c
> index 86127d9..80ca84f 100644
> --- a/fs/gfs2/ops_export.c
> +++ b/fs/gfs2/ops_export.c
> @@ -22,6 +22,7 @@
> #include "glock.h"
> #include "glops.h"
> #include "inode.h"
> +#include "lm.h"
> #include "ops_export.h"
> #include "rgrp.h"
> #include "util.h"
> @@ -287,6 +288,56 @@ fail:
> gfs2_glock_dq_uninit(&i_gh);
> return ERR_PTR(error);
> }
> +/**
> + * gfs2_exp_lock - acquire/release a posix lock on a file
> + * @file: the file pointer
> + * @cmd: either modify or retrieve lock state, possibly wait
> + * @fl: type and range of lock
> + *
> + * Returns: errno
> + */
> +
> +static int gfs2_exp_lock(struct file *file, int cmd, struct file_lock *fl)
> +{
> + struct gfs2_inode *ip = GFS2_I(file->f_mapping->host);
> + struct gfs2_sbd *sdp = GFS2_SB(file->f_mapping->host);
> + struct lm_lockname name =
> + { .ln_number = ip->i_num.no_addr,
> + .ln_type = LM_TYPE_PLOCK };
> +
> + if (!(fl->fl_flags & FL_POSIX))
> + return -ENOLCK;
> + if ((ip->i_di.di_mode & (S_ISGID | S_IXGRP)) == S_ISGID)
> + return -ENOLCK;
> +
> + if (sdp->sd_args.ar_localflocks) {
> + if (IS_GETLK(cmd)) {
> + struct file_lock tmp;
> + int ret;
> + ret = posix_test_lock(file, fl, &tmp);
> + fl->fl_type = F_UNLCK;
> + if (ret)
> + memcpy(fl, &tmp, sizeof(struct file_lock));
> + return 0;
> + } else {
> + return posix_lock_file_wait(file, fl);
> + }
> + }
> +
> + if (IS_GETLK(cmd))
> + return gfs2_lm_plock_get(sdp, &name, file, fl);
> + else if (fl->fl_type == F_UNLCK)
> + return gfs2_lm_punlock(sdp, &name, file, fl);
> + else {
> + /* If fl_notify is set, make an async lock request
> + and reply with -EINPROGRESS. When the lock is granted,
> + gfs2_lm_plock_async should call back to fl_notify. */
> + if (fl->fl_lmops->fl_notify)
> + return gfs2_lm_plock_async(sdp, &name, file, cmd, fl);
> + else
> + return gfs2_lm_plock(sdp, &name, file, cmd, fl);
> + }
> +}
>
> struct export_operations gfs2_export_ops = {
> .decode_fh = gfs2_decode_fh,
> @@ -294,5 +345,6 @@ struct export_operations gfs2_export_ops = {
> .get_name = gfs2_get_name,
> .get_parent = gfs2_get_parent,
> .get_dentry = gfs2_get_dentry,
> + .lock = gfs2_exp_lock,
> };
>
> diff --git a/include/linux/lm_interface.h b/include/linux/lm_interface.h
> index 1418fdc..28d5445 100644
> --- a/include/linux/lm_interface.h
> +++ b/include/linux/lm_interface.h
> @@ -213,6 +213,9 @@ struct lm_lockops {
> int (*lm_plock) (void *lockspace, struct lm_lockname *name,
> struct file *file, int cmd, struct file_lock *fl);
>
> + int (*lm_plock_async) (void *lockspace, struct lm_lockname *name,
> + struct file *file, int cmd, struct file_lock *fl);
> +
> int (*lm_punlock) (void *lockspace, struct lm_lockname *name,
> struct file *file, struct file_lock *fl);
>
> diff --git a/include/linux/lock_dlm_plock.h b/include/linux/lock_dlm_plock.h
> index fc34151..809c5b7 100644
> --- a/include/linux/lock_dlm_plock.h
> +++ b/include/linux/lock_dlm_plock.h
> @@ -35,6 +35,9 @@ struct gdlm_plock_info {
> __u64 start;
> __u64 end;
> __u64 owner;
> + void *callback;
> + void *fl;
> + void *file;
> };
>
> #endif


2006-12-06 13:26:54

by Wendy Cheng

Subject: Re: [NFS] [PATCH 10/10] gfs2: nfs lock support for gfs2

Wendy Cheng wrote:

>J. Bruce Fields wrote:
>
>
>
>>From: J. Bruce Fields <[email protected]>
>>
>>From: Marc Eshel <[email protected]>
>>
>>Add NFS lock support to GFS2. (Untested.)
>>
>>
>>
>>
>Untested ? Trying to keep us busy ?
>
>
>
Sorry, forgot this is on external mailing lists ....

Nice piece of work ! This solves a long standing issue for our NFS
servers. Thank you for doing it - will test them out from our end too.

-- Wendy

2006-12-06 15:49:51

by David Teigland

Subject: Re: [PATCH 10/10] gfs2: nfs lock support for gfs2

On Wed, Dec 06, 2006 at 12:34:20AM -0500, J. Bruce Fields wrote:
> +int gdlm_plock_callback(struct plock_op *op)
> +{
> + struct file *file;
> + struct file_lock *fl;
> + int rv;
> +
> + spin_lock(&ops_lock);
> + if (!list_empty(&op->list)) {
> + printk(KERN_INFO "plock op on list\n");
> + list_del(&op->list);
> + }
> + spin_unlock(&ops_lock);
> +
> + rv = op->info.rv;
> +
> + if (!rv) {
> + /* check if the following are still valid or make a copy */
> + file = op->info.file;
> + fl = op->info.fl;
> +
> + if (posix_lock_file_wait(file, fl) < 0)
> + log_error("gdlm_plock: vfs lock error file %p fl %p",
> + file, fl);
> + }
> +
> + kfree(op);
> + return rv;
> +}

..

> + if (found) {
> + if (op->info.callback)
> + gdlm_plock_callback(op);
> + else
> + wake_up(&recv_wq);
> + }

The gfs side looks fine to me. Did you forget to call fl_notify from
gdlm_plock_callback() or am I missing something?

Dave


2006-12-06 19:57:22

by J. Bruce Fields

Subject: Re: [PATCH 10/10] gfs2: nfs lock support for gfs2

On Wed, Dec 06, 2006 at 09:49:51AM -0600, David Teigland wrote:
> The gfs side looks fine to me. Did you forget to call fl_notify from
> gdlm_plock_callback() or am I missing something?

Yes, looks like we missed something, thanks. This code's an rfc (not
even tested), so don't apply it yet! What we should have there is
something like:

rv = op->info.rv;

if (fl_notify(fl, NULL, rv)) {
/* XXX: We need to cancel the lock here: */
printk("gfs2 lock granted after lock request failed; dangling lock!\n");
}

if (!rv) {
/* check if the following are still valid or make a copy */
file = op->info.file;
fl = op->info.fl;

if (posix_lock_file_wait(file, fl) < 0)
log_error("gdlm_plock: vfs lock error file %p fl %p",
file, fl);
}

Note there's a race condition--that calls fl_notify before actually
getting the lock locally. I don't *think* that's a problem, as long as
it's the filesystem and not the local lock list that's authoritative
when it comes to who gets a posix lock.

The more annoying problem is the need to cancel the GFS lock when
fl_notify fails; is that something that it's possible for GFS to do?

It can fail because lockd has a timeout--it waits a few seconds for the
callback, then gives up and returns a failure to the user. If that
happens after your userspace posix lock manager acquires the lock (but
before fl_notify is called) then you've got to cancel it somehow.

--b.
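
The fragment above uses fl before it has been loaded from the op and calls fl_notify as if it were a free function. A userspace sketch of the intended ordering, assuming the callback is reached through the pointer saved in the op, might look like the following. All `sketch_` names are hypothetical, and `sketch_cancel()` is only a placeholder for whatever cancel mechanism GFS ends up providing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the lock manager's view of a request. */
struct sketch_lock {
	int caller_gave_up;   /* set once the lock manager has timed out */
	int delivered;        /* result as seen by the lock manager */
	int local_record;     /* 1 once a local posix lock is recorded */
};

typedef int (*sketch_notify_fn)(struct sketch_lock *fl, int result);

/* Hypothetical stand-in for struct plock_op. */
struct sketch_op {
	sketch_notify_fn callback;
	struct sketch_lock *fl;
	int rv;               /* result from the cluster lock manager */
	int cluster_held;     /* 1 while the cluster-wide lock is held */
};

/* Plays fl_notify: nonzero means the result arrived too late. */
static int sketch_notify(struct sketch_lock *fl, int result)
{
	if (fl->caller_gave_up)
		return 1;
	fl->delivered = result;
	return 0;
}

/* Placeholder for cancelling a lock nobody is waiting for. */
static void sketch_cancel(struct sketch_op *op)
{
	op->cluster_held = 0;
}

static int sketch_plock_callback(struct sketch_op *op)
{
	struct sketch_lock *fl = op->fl;   /* load before first use */
	int rv = op->rv;

	assert(op->callback != NULL);
	op->cluster_held = (rv == 0);

	/* Notify first; the filesystem, not the local list, is authoritative. */
	if (op->callback(fl, rv) != 0) {
		if (op->cluster_held)
			sketch_cancel(op);
		return rv;
	}
	if (rv == 0)
		fl->local_record = 1;      /* local bookkeeping after notify */
	return rv;
}
```

The sketch keeps the order proposed above (notify, then the local record), so the same race exists here; it only makes the late-callback branch explicit.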

2006-12-06 20:08:43

by J. Bruce Fields

Subject: Re: [NFS] [PATCH 10/10] gfs2: nfs lock support for gfs2

On Wed, Dec 06, 2006 at 02:57:22PM -0500, J. Bruce Fields wrote:
> On Wed, Dec 06, 2006 at 09:49:51AM -0600, David Teigland wrote:
> > The gfs side looks fine to me. Did you forget to call fl_notify from
> > gdlm_plock_callback() or am I missing something?
>
> Yes, looks like we missed something, thanks. This code's an rfc (not
> even tested), so don't apply it yet! What we should have there is
> something like:
>
> rv = op->info.rv;
>
> if (fl_notify(fl, NULL, rv)) {
> /* XXX: We need to cancel the lock here: */
> printk("gfs2 lock granted after lock request failed; dangling lock!\n");
> }

(And note in the patch I sent out I stuck this in the wrong place--in
the synchronous instead of the asynchronous code. So the patch should
have been something closer to the following.)

--b.

diff --git a/fs/gfs2/lm.c b/fs/gfs2/lm.c
index effe4a3..cf7fd52 100644
--- a/fs/gfs2/lm.c
+++ b/fs/gfs2/lm.c
@@ -197,6 +197,16 @@ int gfs2_lm_plock(struct gfs2_sbd *sdp, struct lm_lockname *name,
return error;
}

+int gfs2_lm_plock_async(struct gfs2_sbd *sdp, struct lm_lockname *name,
+ struct file *file, int cmd, struct file_lock *fl)
+{
+ int error = -EIO;
+ if (likely(!test_bit(SDF_SHUTDOWN, &sdp->sd_flags)))
+ error = sdp->sd_lockstruct.ls_ops->lm_plock_async(
+ sdp->sd_lockstruct.ls_lockspace, name, file, cmd, fl);
+ return error;
+}
+
int gfs2_lm_punlock(struct gfs2_sbd *sdp, struct lm_lockname *name,
struct file *file, struct file_lock *fl)
{
diff --git a/fs/gfs2/lm.h b/fs/gfs2/lm.h
index 21cdc30..1ddd1fd 100644
--- a/fs/gfs2/lm.h
+++ b/fs/gfs2/lm.h
@@ -34,6 +34,8 @@ int gfs2_lm_plock_get(struct gfs2_sbd *sdp, struct lm_lockname *name,
struct file *file, struct file_lock *fl);
int gfs2_lm_plock(struct gfs2_sbd *sdp, struct lm_lockname *name,
struct file *file, int cmd, struct file_lock *fl);
+int gfs2_lm_plock_async(struct gfs2_sbd *sdp, struct lm_lockname *name,
+ struct file *file, int cmd, struct file_lock *fl);
int gfs2_lm_punlock(struct gfs2_sbd *sdp, struct lm_lockname *name,
struct file *file, struct file_lock *fl);
void gfs2_lm_recovery_done(struct gfs2_sbd *sdp, unsigned int jid,
diff --git a/fs/gfs2/locking/dlm/lock_dlm.h b/fs/gfs2/locking/dlm/lock_dlm.h
index 33af707..82af860 100644
--- a/fs/gfs2/locking/dlm/lock_dlm.h
+++ b/fs/gfs2/locking/dlm/lock_dlm.h
@@ -179,6 +179,8 @@ int gdlm_plock_init(void);
void gdlm_plock_exit(void);
int gdlm_plock(void *, struct lm_lockname *, struct file *, int,
struct file_lock *);
+int gdlm_plock_async(void *, struct lm_lockname *, struct file *, int,
+ struct file_lock *);
int gdlm_plock_get(void *, struct lm_lockname *, struct file *,
struct file_lock *);
int gdlm_punlock(void *, struct lm_lockname *, struct file *,
diff --git a/fs/gfs2/locking/dlm/mount.c b/fs/gfs2/locking/dlm/mount.c
index cdd1694..4339e3f 100644
--- a/fs/gfs2/locking/dlm/mount.c
+++ b/fs/gfs2/locking/dlm/mount.c
@@ -244,6 +244,7 @@ const struct lm_lockops gdlm_ops = {
.lm_lock = gdlm_lock,
.lm_unlock = gdlm_unlock,
.lm_plock = gdlm_plock,
+ .lm_plock_async = gdlm_plock_async,
.lm_punlock = gdlm_punlock,
.lm_plock_get = gdlm_plock_get,
.lm_cancel = gdlm_cancel,
diff --git a/fs/gfs2/locking/dlm/plock.c b/fs/gfs2/locking/dlm/plock.c
index 7365aec..f91a18a 100644
--- a/fs/gfs2/locking/dlm/plock.c
+++ b/fs/gfs2/locking/dlm/plock.c
@@ -102,6 +102,94 @@ int gdlm_plock(void *lockspace, struct lm_lockname *name,
return rv;
}

+int gdlm_plock_async(void *lockspace, struct lm_lockname *name,
+ struct file *file, int cmd, struct file_lock *fl)
+{
+ struct gdlm_ls *ls = lockspace;
+ struct plock_op *op;
+ int rv;
+
+ op = kzalloc(sizeof(*op), GFP_KERNEL);
+ if (!op)
+ return -ENOMEM;
+
+ op->info.optype = GDLM_PLOCK_OP_LOCK;
+ op->info.pid = fl->fl_pid;
+ op->info.ex = (fl->fl_type == F_WRLCK);
+ op->info.wait = IS_SETLKW(cmd);
+ op->info.fsid = ls->id;
+ op->info.number = name->ln_number;
+ op->info.start = fl->fl_start;
+ op->info.end = fl->fl_end;
+ op->info.owner = (__u64)(long) fl->fl_owner;
+ if (fl->fl_lmops) {
+ op->info.callback = fl->fl_lmops->fl_notify;
+ /* might need to make a copy */
+ op->info.fl = fl;
+ op->info.file = file;
+ } else
+ op->info.callback = NULL;
+
+ send_op(op);
+
+ if (op->info.callback == NULL)
+ wait_event(recv_wq, (op->done != 0));
+ else
+ return -EINPROGRESS;
+
+ spin_lock(&ops_lock);
+ if (!list_empty(&op->list)) {
+ printk(KERN_INFO "plock op on list\n");
+ list_del(&op->list);
+ }
+ spin_unlock(&ops_lock);
+
+ rv = op->info.rv;
+
+ if (!rv) {
+ if (posix_lock_file_wait(file, fl) < 0)
+ log_error("gdlm_plock: vfs lock error %x,%llx",
+ name->ln_type,
+ (unsigned long long)name->ln_number);
+ }
+
+ kfree(op);
+ return rv;
+}
+
+static void gdlm_plock_callback(struct plock_op *op)
+{
+ struct file *file;
+ struct file_lock *fl;
+ int rv;
+
+ spin_lock(&ops_lock);
+ if (!list_empty(&op->list)) {
+ printk(KERN_INFO "plock op on list\n");
+ list_del(&op->list);
+ }
+ spin_unlock(&ops_lock);
+
+ rv = op->info.rv;
+
+ if (fl_notify(fl, NULL, rv)) {
+ /* XXX: We need to cancel the lock here: */
+ printk("gfs2 lock granted after lock request failed; dangling lock!\n");
+ }
+
+ if (!rv) {
+ /* check if the following are still valid or make a copy */
+ file = op->info.file;
+ fl = op->info.fl;
+
+ if (posix_lock_file_wait(file, fl) < 0)
+ log_error("gdlm_plock: vfs lock error file %p fl %p",
+ file, fl);
+ }
+
+ kfree(op);
+}
+
int gdlm_punlock(void *lockspace, struct lm_lockname *name,
struct file *file, struct file_lock *fl)
{
@@ -242,8 +330,12 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count,
}
spin_unlock(&ops_lock);

- if (found)
- wake_up(&recv_wq);
+ if (found) {
+ if (op->info.callback)
+ gdlm_plock_callback(op);
+ else
+ wake_up(&recv_wq);
+ }
else
printk(KERN_INFO "gdlm dev_write no op %x %llx\n", info.fsid,
(unsigned long long)info.number);
diff --git a/fs/gfs2/ops_export.c b/fs/gfs2/ops_export.c
index 86127d9..80ca84f 100644
--- a/fs/gfs2/ops_export.c
+++ b/fs/gfs2/ops_export.c
@@ -22,6 +22,7 @@
#include "glock.h"
#include "glops.h"
#include "inode.h"
+#include "lm.h"
#include "ops_export.h"
#include "rgrp.h"
#include "util.h"
@@ -287,6 +288,56 @@ fail:
gfs2_glock_dq_uninit(&i_gh);
return ERR_PTR(error);
}
+/**
+ * gfs2_exp_lock - acquire/release a posix lock on a file
+ * @file: the file pointer
+ * @cmd: either modify or retrieve lock state, possibly wait
+ * @fl: type and range of lock
+ *
+ * Returns: errno
+ */
+
+static int gfs2_exp_lock(struct file *file, int cmd, struct file_lock *fl)
+{
+ struct gfs2_inode *ip = GFS2_I(file->f_mapping->host);
+ struct gfs2_sbd *sdp = GFS2_SB(file->f_mapping->host);
+ struct lm_lockname name =
+ { .ln_number = ip->i_num.no_addr,
+ .ln_type = LM_TYPE_PLOCK };
+
+ if (!(fl->fl_flags & FL_POSIX))
+ return -ENOLCK;
+ if ((ip->i_di.di_mode & (S_ISGID | S_IXGRP)) == S_ISGID)
+ return -ENOLCK;
+
+ if (sdp->sd_args.ar_localflocks) {
+ if (IS_GETLK(cmd)) {
+ struct file_lock tmp;
+ int ret;
+ ret = posix_test_lock(file, fl, &tmp);
+ fl->fl_type = F_UNLCK;
+ if (ret)
+ memcpy(fl, &tmp, sizeof(struct file_lock));
+ return 0;
+ } else {
+ return posix_lock_file_wait(file, fl);
+ }
+ }
+
+ if (IS_GETLK(cmd))
+ return gfs2_lm_plock_get(sdp, &name, file, fl);
+ else if (fl->fl_type == F_UNLCK)
+ return gfs2_lm_punlock(sdp, &name, file, fl);
+ else {
+ /* If fl_notify is set make an async lock request
+ and reply with -EINPROGRESS. When lock is granted
+ the gfs2_lm_plock_async should callback to fl_notify */
+ if (fl->fl_lmops->fl_notify)
+ return gfs2_lm_plock_async(sdp, &name, file, cmd, fl);
+ else
+ return gfs2_lm_plock(sdp, &name, file, cmd, fl);
+ }
+}

struct export_operations gfs2_export_ops = {
.decode_fh = gfs2_decode_fh,
@@ -294,5 +345,6 @@ struct export_operations gfs2_export_ops = {
.get_name = gfs2_get_name,
.get_parent = gfs2_get_parent,
.get_dentry = gfs2_get_dentry,
+ .lock = gfs2_exp_lock,
};

diff --git a/include/linux/lm_interface.h b/include/linux/lm_interface.h
index 1418fdc..28d5445 100644
--- a/include/linux/lm_interface.h
+++ b/include/linux/lm_interface.h
@@ -213,6 +213,9 @@ struct lm_lockops {
int (*lm_plock) (void *lockspace, struct lm_lockname *name,
struct file *file, int cmd, struct file_lock *fl);

+ int (*lm_plock_async) (void *lockspace, struct lm_lockname *name,
+ struct file *file, int cmd, struct file_lock *fl);
+
int (*lm_punlock) (void *lockspace, struct lm_lockname *name,
struct file *file, struct file_lock *fl);

diff --git a/include/linux/lock_dlm_plock.h b/include/linux/lock_dlm_plock.h
index fc34151..809c5b7 100644
--- a/include/linux/lock_dlm_plock.h
+++ b/include/linux/lock_dlm_plock.h
@@ -35,6 +35,9 @@ struct gdlm_plock_info {
__u64 start;
__u64 end;
__u64 owner;
+ void *callback;
+ void *fl;
+ void *file;
};

#endif

2006-12-06 20:58:22

by David Teigland

Subject: Re: [PATCH 10/10] gfs2: nfs lock support for gfs2

On Wed, Dec 06, 2006 at 02:57:22PM -0500, J. Bruce Fields wrote:
> On Wed, Dec 06, 2006 at 09:49:51AM -0600, David Teigland wrote:
> > The gfs side looks fine to me. Did you forget to call fl_notify from
> > gdlm_plock_callback() or am I missing something?
>
> Yes, looks like we missed something, thanks. This code's an rfc (not
> even tested), so don't apply it yet! What we should have there is
> something like:
>
> rv = op->info.rv;
>
> if (fl_notify(fl, NULL, rv)) {
> /* XXX: We need to cancel the lock here: */
> printk("gfs2 lock granted after lock request failed; dangling lock!\n");
> }
>
> if (!rv) {
> /* check if the following are still valid or make a copy */
> file = op->info.file;
> fl = op->info.fl;
>
> if (posix_lock_file_wait(file, fl) < 0)
> log_error("gdlm_plock: vfs lock error file %p fl %p",
> file, fl);
> }
>
> Note there's a race condition--that calls fl_notify before actually
> getting the lock locally. I don't *think* that's a problem, as long as
> it's the filesystem and not the local lock list that's authoritative
> when it comes to who gets a posix lock.

agree

> The more annoying problem is the need to cancel the GFS lock when
> fl_notify fails; is that something that it's possible for GFS to do?
>
> It can fail because lockd has a timeout--it waits a few seconds for the
> callback, then gives up and returns a failure to the user. If that
> happens after your userspace posix lock manager acquires the lock (but
> before fl_notify is called) then you've got to cancel it somehow.

I'd think we could just send an unlock for it at that point. Set up an op
with GDLM_PLOCK_OP_UNLOCK and the same fields as the lock you're removing
and call send_op(). We probably need to flag this internal-unlock op so
that when the result arrives, device_write() delists and frees it itself.
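
A rough userspace sketch of that idea, for illustration only: the `internal` flag, make_internal_unlock() and handle_reply() are made-up names, and the singly linked list is a stand-in for whatever the reply path (dev_write() in the patch) searches under ops_lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only; none of these names come from the GFS2 patch. */
enum op_type { OP_LOCK, OP_UNLOCK };

struct op {
	enum op_type type;
	unsigned long long start, end, owner;
	bool internal;		/* hypothetical flag: no waiter, reply path frees */
	bool done;		/* set for a normal op once its reply arrives */
	struct op *next;	/* ops still waiting for a reply */
};

static struct op *pending;	/* stands in for the list the reply path walks */

static void send_op(struct op *op)
{
	op->next = pending;
	pending = op;
}

/* Build an unlock carrying the same fields as the lock being undone. */
static struct op *make_internal_unlock(const struct op *granted)
{
	struct op *un;

	assert(granted != NULL);
	un = calloc(1, sizeof(*un));
	if (!un)
		return NULL;
	un->type = OP_UNLOCK;
	un->start = granted->start;
	un->end = granted->end;
	un->owner = granted->owner;
	un->internal = true;
	return un;
}

/*
 * Reply path: take the op off the pending list. An internal op has no
 * waiter, so it is freed here; a normal op is only marked done so its
 * waiter can pick up the result. Returns true if the op was freed.
 */
static bool handle_reply(struct op *op)
{
	struct op **pp;

	for (pp = &pending; *pp; pp = &(*pp)->next) {
		if (*pp == op) {
			*pp = op->next;
			break;
		}
	}
	if (op->internal) {
		free(op);
		return true;
	}
	op->done = true;
	return false;
}
```

Since nobody waits on an internal op, the sender must not touch it again after send_op(); ownership passes to the reply path.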

(I wouldn't call this "canceling", I think of cancel as trying to force a
blocked request to return/fail prematurely.)

Dave


2006-12-06 21:23:47

by J. Bruce Fields

Subject: Re: [PATCH 10/10] gfs2: nfs lock support for gfs2

On Wed, Dec 06, 2006 at 02:58:22PM -0600, David Teigland wrote:
> On Wed, Dec 06, 2006 at 02:57:22PM -0500, J. Bruce Fields wrote:
> > The more annoying problem is the need to cancel the GFS lock when
> > fl_notify fails; is that something that it's possible for GFS to do?
> >
> > It can fail because lockd has a timeout--it waits a few seconds for the
> > callback, then gives up and returns a failure to the user. If that
> > happens after your userspace posix lock manager acquires the lock (but
> > before fl_notify is called) then you've got to cancel it somehow.
>
> I'd think we could just send an unlock for it at that point. Set up an op
> with GDLM_PLOCK_OP_UNLOCK and the same fields as the lock you're removing
> and call send_op(). We probably need to flag this internal-unlock op so
> that when the result arrives, device_write() delists and frees it itself.
>
> (I wouldn't call this "canceling", I think of cancel as trying to force a
> blocked request to return/fail prematurely.)

I call it a cancel because it should leave us in the same state we were
in if we hadn't done the lock. An unlock doesn't do that, because the
original lock may have coalesced and/or downgraded existing locks.
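
To make that concrete, here is a toy userspace model (not kernel code; the per-byte array and all names are assumptions made for the example). It tracks what one owner holds on each byte of a small file. As with posix locks, a new lock by the same owner replaces whatever that owner held on the range, so unlocking the same range afterwards does not bring the old state back.

```c
#include <assert.h>
#include <string.h>

#define FILE_BYTES 32

enum held { HELD_NONE = 0, HELD_READ, HELD_WRITE };

/* One owner's view of a file: what it holds on each byte. */
struct owner_locks {
	enum held byte[FILE_BYTES];
};

static void set_range(struct owner_locks *o, int start, int end, enum held h)
{
	int i;

	assert(0 <= start && start <= end && end < FILE_BYTES);
	for (i = start; i <= end; i++)
		o->byte[i] = h;
}

/* A granted lock overwrites (merges with or downgrades) what was held. */
static void plock(struct owner_locks *o, int start, int end, enum held h)
{
	set_range(o, start, end, h);
}

/* An unlock clears the range; it has no memory of the earlier state. */
static void punlock(struct owner_locks *o, int start, int end)
{
	set_range(o, start, end, HELD_NONE);
}

static int same_state(const struct owner_locks *a, const struct owner_locks *b)
{
	return memcmp(a->byte, b->byte, sizeof(a->byte)) == 0;
}
```

For example, start with a write lock on bytes 0-9, take a read lock on 5-19, then unlock 5-19: the owner is left with a write lock on 0-4 only, and same_state() against the starting point fails. Bytes 5-9 were first downgraded and then dropped.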

We've got a similar problem with NFSv4 blocking locks. I've got
unsubmitted patches

http://linux-nfs.org/cgi-bin/gitweb.cgi?p=bfields-2.6.git;a=shortlog;h=fair-queueing

which introduce a new "provisional" lock type that is identical to a
posix lock in every way except that a provisional lock doesn't coalesce
or downgrade existing locks--it just sits there on the lock list
blocking conflicting requests until somebody comes along and upgrades it
to a real lock or cancels it.
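
A minimal userspace sketch of that behaviour, under loud assumptions: every name here is invented, all locks are treated as exclusive, and an owner has at most one provisional range. It is not taken from the fair-queueing patches and only models the description above.

```c
#include <assert.h>
#include <stdbool.h>

#define FILE_BYTES 32

struct owner_state {
	bool held[FILE_BYTES];		/* real locks, merged per byte */
	bool has_prov;			/* is a provisional range pending? */
	int prov_start, prov_end;	/* kept apart from held[] until committed */
};

/* Would a request by some other owner for [start, end] have to wait? */
static bool blocks_others(const struct owner_state *o, int start, int end)
{
	int i;

	assert(0 <= start && start <= end && end < FILE_BYTES);
	if (o->has_prov && start <= o->prov_end && o->prov_start <= end)
		return true;
	for (i = start; i <= end; i++)
		if (o->held[i])
			return true;
	return false;
}

/* Park the range without touching the real locks. */
static void add_provisional(struct owner_state *o, int start, int end)
{
	assert(!o->has_prov);
	assert(0 <= start && start <= end && end < FILE_BYTES);
	o->has_prov = true;
	o->prov_start = start;
	o->prov_end = end;
}

/* Upgrade: only now is the range merged into the real locks. */
static void commit_provisional(struct owner_state *o)
{
	int i;

	assert(o->has_prov);
	for (i = o->prov_start; i <= o->prov_end; i++)
		o->held[i] = true;
	o->has_prov = false;
}

/* Cancel: held[] was never modified, so the old state is intact. */
static void cancel_provisional(struct owner_state *o)
{
	assert(o->has_prov);
	o->has_prov = false;
}
```

The point of keeping the provisional range out of held[] is that cancel_provisional() has nothing to undo, while conflicting requests are still blocked for as long as the range is parked.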

Maybe there's a better solution in this case. I can't think of anything
other than just giving up on the whole idea of timing out. (Maybe that
wouldn't be so bad?)

--b.

2006-12-06 21:42:31

by David Teigland

Subject: Re: [PATCH 10/10] gfs2: nfs lock support for gfs2

On Wed, Dec 06, 2006 at 04:23:47PM -0500, J. Bruce Fields wrote:
> On Wed, Dec 06, 2006 at 02:58:22PM -0600, David Teigland wrote:
> > On Wed, Dec 06, 2006 at 02:57:22PM -0500, J. Bruce Fields wrote:
> > > The more annoying problem is the need to cancel the GFS lock when
> > > fl_notify fails; is that something that it's possible for GFS to do?
> > >
> > > It can fail because lockd has a timeout--it waits a few seconds for the
> > > callback, then gives up and returns a failure to the user. If that
> > > happens after your userspace posix lock manager acquires the lock (but
> > > before fl_notify is called) then you've got to cancel it somehow.
> >
> > I'd think we could just send an unlock for it at that point. Set up an op
> > with GDLM_PLOCK_OP_UNLOCK and the same fields as the lock you're removing
> > and call send_op(). We probably need to flag this internal-unlock op so
> > that when the result arrives, device_write() delists and frees it itself.
> >
> > (I wouldn't call this "canceling", I think of cancel as trying to force a
> > blocked request to return/fail prematurely.)
>
> I call it a cancel because it should leave us in the same state we were
> in if we hadn't done the lock. An unlock doesn't do that, because the
> original lock may have coalesced and/or downgraded existing locks.

Oh yeah, that's painful, I knew it sounded too easy.

Dave


2006-12-06 22:00:29

by J. Bruce Fields

Subject: Re: [NFS] [PATCH 10/10] gfs2: nfs lock support for gfs2

On Wed, Dec 06, 2006 at 03:42:31PM -0600, David Teigland wrote:
> Oh yeah, that's painful, I knew it sounded too easy.

Yeah. Well, we could try to teach GFS2 to reliably cancel posix locks.
I think that may turn out to be necessary some day anyway.

Or we could look at why we're timing out and figure out whether there's
something else we should be doing instead in that case. In what
situations is the GFS2 lock call likely to take overly long?

--b.

2006-12-07 06:47:46

by Marc Eshel

Subject: Re: [PATCH 10/10] gfs2: nfs lock support for gfs2

Here is a rewrite of gdlm_plock_callback(). We still need to add the
lock cancel.
Marc.

int gdlm_plock_callback(struct plock_op *op)
{
	struct file *file;
	struct file_lock *fl;
	int (*notify)(void *, void *, int) = NULL;
	int rv;

	spin_lock(&ops_lock);
	if (!list_empty(&op->list)) {
		printk(KERN_INFO "plock op on list\n");
		list_del(&op->list);
	}
	spin_unlock(&ops_lock);

	rv = op->info.rv;

	/* check if the following 2 are still valid or make a copy */
	file = op->info.file;
	fl = op->info.fl;
	notify = op->info.callback;

	if (!rv) { /* got fs lock */
		rv = posix_lock_file(file, fl);
		if (rv) { /* did not get posix lock */
			notify(fl, NULL, rv);
			log_error("gdlm_plock: vfs lock error file %p fl %p",
				  file, fl);
			/* XXX: We need to cancel the fs lock here: */
			printk("gfs2 lock posix lock request failed\n");
		}
		else { /* got posix lock */
			if (notify(fl, NULL, 0)) {
				/* XXX: We need to cancel the fs lock here: */
				printk("gfs2 lock granted after lock request failed; dangling lock!\n");
			}
		}
	}
	else { /* did not get fs lock */
		notify(fl, NULL, rv);
	}

	kfree(op);
	return rv;
}
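
The "dangling lock" branch is the case where lockd has already timed out by the time the callback runs. A small userspace model of that handshake, for illustration only (the waiter states and function names are assumptions; waiter_notify() merely plays the role of fl_notify and does not match its real signature):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum waiter_state { WAITING, GAVE_UP, ANSWERED };

/* Stand-in for lockd's record of one outstanding async lock request. */
struct waiter {
	enum waiter_state state;
	int result;
};

/* lockd side: the timer fired before any callback arrived. */
static void waiter_timeout(struct waiter *w)
{
	assert(w != NULL);
	if (w->state == WAITING)
		w->state = GAVE_UP;
}

/*
 * Plays the role of fl_notify: returns 0 if the result was accepted,
 * nonzero if the waiter had already given up (or was already answered).
 */
static int waiter_notify(struct waiter *w, int result)
{
	assert(w != NULL);
	if (w->state != WAITING)
		return -1;
	w->result = result;
	w->state = ANSWERED;
	return 0;
}

/*
 * Filesystem side: deliver the cluster lock result. Returns true when a
 * successful grant was refused, i.e. the lock is dangling and still has
 * to be undone somehow.
 */
static bool deliver_result(struct waiter *w, int fs_rv)
{
	int refused = waiter_notify(w, fs_rv);

	return refused && fs_rv == 0;
}
```

Only the combination "grant succeeded, notify refused" needs cleanup; a failed grant that arrives late leaves nothing behind.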


David Teigland wrote:

> On Wed, Dec 06, 2006 at 02:57:22PM -0500, J. Bruce Fields wrote:
> > On Wed, Dec 06, 2006 at 09:49:51AM -0600, David Teigland wrote:
> > > The gfs side looks fine to me. Did you forget to call fl_notify from
> > > gdlm_plock_callback() or am I missing something?
> >
> > Yes, looks like we missed something, thanks. This code's an rfc (not
> > even tested), so don't apply it yet! What we should have there is
> > something like:
> >
> > rv = op->info.rv;
> >
> > if (fl_notify(fl, NULL, rv)) {
> > /* XXX: We need to cancel the lock here: */
> > printk("gfs2 lock granted after lock request failed; dangling lock!\n");
> > }
> >
> > if (!rv) {
> > /* check if the following are still valid or make a copy */
> > file = op->info.file;
> > fl = op->info.fl;
> >
> > if (posix_lock_file_wait(file, fl) < 0)
> > log_error("gdlm_plock: vfs lock error file %p fl %p",
> > file, fl);
> > }
> >
> > Note there's a race condition--that calls fl_notify before actually
> > getting the lock locally. I don't *think* that's a problem, as long as
> > it's the filesystem and not the local lock list that's authoritative
> > when it comes to who gets a posix lock.
>
> agree
>
> > The more annoying problem is the need to cancel the GFS lock when
> > fl_notify fails; is that something that it's possible for GFS to do?
> >
> > It can fail because lockd has a timeout--it waits a few seconds for the
> > callback, then gives up and returns a failure to the user. If that
> > happens after your userspace posix lock manager acquires the lock (but
> > before fl_notify is called) then you've got to cancel it somehow.
>
> I'd think we could just send an unlock for it at that point. Set up an op
> with GDLM_PLOCK_OP_UNLOCK and the same fields as the lock you're removing
> and call send_op(). We probably need to flag this internal-unlock op so
> that when the result arrives, device_write() delists and frees it itself.
>
> (I wouldn't call this "canceling", I think of cancel as trying to force a
> blocked request to return/fail prematurely.)
>
> Dave


2007-02-26 19:52:20

by J. Bruce Fields

Subject: Re: asynchronous locks for cluster exports

On Sat, Feb 03, 2007 at 12:30:55AM -0500, J. Bruce Fields wrote:
>
> This is another attempt at a posix locking interface that allows us to
> provide NFS clients with cluster-coherent locking without blocking lockd
> while the filesystem goes off and talks to other nodes.

Marc and I have an updated version of this at:

git://linux-nfs.org/~bfields/linux.git

(See the server-cluster-locking-api branch.)

Changes include:

- bugfixes for GFS2; thanks to some setup help from Wendy Chang
and others, Marc has been able to run locking tests against an
nfs-exported GFS2 filesystem
- preallocation of storage for conflicting lock in lockd's
testlock implementation, to avoid a race identified by Trond
- Cleanup suggested by Christoph and others, including:
- creation of posix-to-flock helper functions
- rewrite of posix_test_lock interface to agree with
->lock( ,F_GETLK, )
- removal of some unnecessary parentheses, untangling of
some slightly tortured logic

We're hoping to get a detailed review from Trond sometime in the coming
month, after which we'll probably mailbomb linux-fsdevel again, but
any comments are welcome in the meantime.

--b.
