2010-03-19 12:06:28

by Jeff Layton

[permalink] [raw]
Subject: [PATCH] nfsd: don't break lease while servicing a COMMIT call (try #2)

This is the second attempt to fix the problem whereby a COMMIT call
causes a lease break and triggers a possible deadlock.

The problem is that nfsd attempts to break a lease on a COMMIT call.
This triggers a delegation recall if the lease is held for a delegation.
If the client is the one holding the delegation and it's the same one on
which it's issuing the COMMIT, then it can't return that delegation
until the COMMIT is complete. But, nfsd won't complete the COMMIT until
the delegation is returned. The client and server are essentially
deadlocked until the state is marked bad (due to the client not
responding on the callback channel).

The first patch attempted to deal with this by eliminating the open of
the file altogether and simply had nfsd_commit pass a NULL file pointer
to the vfs_fsync_range. That would conflict with some work in progress
by Christoph Hellwig to clean up the fsync interface, so this patch
takes a different approach.

This declares a new NFSD_MAY_NOT_BREAK_LEASE access flag that indicates
to nfsd_open that it should not break any leases when opening the file,
and has nfsd_commit set that flag on the nfsd_open call.

For now, this patch leaves nfsd_commit opening the file with write
access since I'm not clear on what sort of access would be more
appropriate.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/vfs.c | 8 +++++---
fs/nfsd/vfs.h | 1 +
2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index a11b0e8..c2dcb4c 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -723,7 +723,7 @@ nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, int type,
struct inode *inode;
int flags = O_RDONLY|O_LARGEFILE;
__be32 err;
- int host_err;
+ int host_err = 0;

validate_process_creds();

@@ -760,7 +760,8 @@ nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, int type,
* Check to see if there are any leases on this file.
* This may block while leases are broken.
*/
- host_err = break_lease(inode, O_NONBLOCK | ((access & NFSD_MAY_WRITE) ? O_WRONLY : 0));
+ if (!(access & NFSD_MAY_NOT_BREAK_LEASE))
+ host_err = break_lease(inode, O_NONBLOCK | ((access & NFSD_MAY_WRITE) ? O_WRONLY : 0));
if (host_err == -EWOULDBLOCK)
host_err = -ETIMEDOUT;
if (host_err) /* NOMEM or WOULDBLOCK */
@@ -1168,7 +1169,8 @@ nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp,
goto out;
}

- err = nfsd_open(rqstp, fhp, S_IFREG, NFSD_MAY_WRITE, &file);
+ err = nfsd_open(rqstp, fhp, S_IFREG,
+ NFSD_MAY_WRITE|NFSD_MAY_NOT_BREAK_LEASE, &file);
if (err)
goto out;
if (EX_ISSYNC(fhp->fh_export)) {
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index 4b1de0a..217a62c 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -20,6 +20,7 @@
#define NFSD_MAY_OWNER_OVERRIDE 64
#define NFSD_MAY_LOCAL_ACCESS 128 /* IRIX doing local access check on device special file*/
#define NFSD_MAY_BYPASS_GSS_ON_ROOT 256
+#define NFSD_MAY_NOT_BREAK_LEASE 512

#define NFSD_MAY_CREATE (NFSD_MAY_EXEC|NFSD_MAY_WRITE)
#define NFSD_MAY_REMOVE (NFSD_MAY_EXEC|NFSD_MAY_WRITE|NFSD_MAY_TRUNC)
--
1.6.6.1

_______________________________________________
NFSv4 mailing list
[email protected]
http://linux-nfs.org/cgi-bin/mailman/listinfo/nfsv4


2010-03-25 18:16:37

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH] nfsd: don't break lease while servicing a COMMIT call (try #2)

On Thu, 2010-03-25 at 13:47 -0400, J. Bruce Fields wrote:
> On Mon, Mar 22, 2010 at 04:33:40PM -0400, Jeff Layton wrote:
> > It looks like nfs_inode_return_delegation always calls nfs_msync_inode
> > on any valid delegation before returning it, regardless of the
> > delegation type.
> >
> > RFC 3530 says this:
> >
> > If the client is granted a read delegation, it is assured that no
> > other client has the ability to write to the file for the duration of
> > the delegation. If the client is granted a write delegation, the
> > client is assured that no other client has read or write access to
> > the file.
> >
> > That doesn't seem to imply that we must flush writes before returning
> > either type of delegation. OTOH, maybe it makes sense to treat those as
> > cache consistency points since a delegreturn sort of implies that
> > another client wants to use the file.
> >
> > I'm not quite sure how to interpret the spec here...
>
> If there's that call could cause the client to wait for an actual write
> to succeed before returning the delegation, then something's wrong.

We're certainly expected to write back data before returning a write
delegation (see Section 9.4.4 of RFC 3530).

For the case of a read delegation, then the spec is silent because it
contains no discussion of the case where a server grants both an open
for write and a read delegation. If you want a normative statement on
what clients should do for that case, then I suggest a discussion on the
IETF list with a view to getting it into RFC3530-bis.

Trond


2010-03-25 22:05:56

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH] nfsd: don't break lease while servicing a COMMIT call (try #2)

On Thu, Mar 25, 2010 at 02:16:28PM -0400, Trond Myklebust wrote:
> On Thu, 2010-03-25 at 13:47 -0400, J. Bruce Fields wrote:
> > On Mon, Mar 22, 2010 at 04:33:40PM -0400, Jeff Layton wrote:
> > > It looks like nfs_inode_return_delegation always calls nfs_msync_inode
> > > on any valid delegation before returning it, regardless of the
> > > delegation type.
> > >
> > > RFC 3530 says this:
> > >
> > > If the client is granted a read delegation, it is assured that no
> > > other client has the ability to write to the file for the duration of
> > > the delegation. If the client is granted a write delegation, the
> > > client is assured that no other client has read or write access to
> > > the file.
> > >
> > > That doesn't seem to imply that we must flush writes before returning
> > > either type of delegation. OTOH, maybe it makes sense to treat those as
> > > cache consistency points since a delegreturn sort of implies that
> > > another client wants to use the file.
> > >
> > > I'm not quite sure how to interpret the spec here...
> >
> > If there's that call could cause the client to wait for an actual write
> > to succeed before returning the delegation, then something's wrong.
>
> We're certainly expected to write back data before returning a write
> delegation (see Section 9.4.4 of RFC 3530).
>
> For the case of a read delegation, then the spec is silent because it
> contains no discussion of the case where a server grants both an open
> for write and a read delegation. If you want a normative statement on
> what clients should do for that case, then I suggest a discussion on the
> IETF list with a view to getting it into RFC3530-bis.

Yeah, that would be a good idea to get nailed down at some point.

(But the current server implementation doesn't allow write opens in this
situation. So I wonder why we're seeing any commit from the client at
all?)

--b.

2010-03-25 17:45:10

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH] nfsd: don't break lease while servicing a COMMIT call (try #2)

On Mon, Mar 22, 2010 at 04:33:40PM -0400, Jeff Layton wrote:
> It looks like nfs_inode_return_delegation always calls nfs_msync_inode
> on any valid delegation before returning it, regardless of the
> delegation type.
>
> RFC 3530 says this:
>
> If the client is granted a read delegation, it is assured that no
> other client has the ability to write to the file for the duration of
> the delegation. If the client is granted a write delegation, the
> client is assured that no other client has read or write access to
> the file.
>
> That doesn't seem to imply that we must flush writes before returning
> either type of delegation. OTOH, maybe it makes sense to treat those as
> cache consistency points since a delegreturn sort of implies that
> another client wants to use the file.
>
> I'm not quite sure how to interpret the spec here...

If there's that call could cause the client to wait for an actual write
to succeed before returning the delegation, then something's wrong.

Waiting for a commit also seems strange, but probably not incorrect: it
seems fair for the client to expect the server not to treat a commit as
conflicting with a delegation.

--b.

2010-03-26 14:45:51

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] nfsd: don't break lease while servicing a COMMIT call (try #2)

On Thu, 25 Mar 2010 18:07:48 -0400
"J. Bruce Fields" <[email protected]> wrote:

> On Thu, Mar 25, 2010 at 02:16:28PM -0400, Trond Myklebust wrote:
> > On Thu, 2010-03-25 at 13:47 -0400, J. Bruce Fields wrote:
> > > On Mon, Mar 22, 2010 at 04:33:40PM -0400, Jeff Layton wrote:
> > > > It looks like nfs_inode_return_delegation always calls nfs_msync_inode
> > > > on any valid delegation before returning it, regardless of the
> > > > delegation type.
> > > >
> > > > RFC 3530 says this:
> > > >
> > > > If the client is granted a read delegation, it is assured that no
> > > > other client has the ability to write to the file for the duration of
> > > > the delegation. If the client is granted a write delegation, the
> > > > client is assured that no other client has read or write access to
> > > > the file.
> > > >
> > > > That doesn't seem to imply that we must flush writes before returning
> > > > either type of delegation. OTOH, maybe it makes sense to treat those as
> > > > cache consistency points since a delegreturn sort of implies that
> > > > another client wants to use the file.
> > > >
> > > > I'm not quite sure how to interpret the spec here...
> > >
> > > If there's that call could cause the client to wait for an actual write
> > > to succeed before returning the delegation, then something's wrong.
> >
> > We're certainly expected to write back data before returning a write
> > delegation (see Section 9.4.4 of RFC 3530).
> >
> > For the case of a read delegation, then the spec is silent because it
> > contains no discussion of the case where a server grants both an open
> > for write and a read delegation. If you want a normative statement on
> > what clients should do for that case, then I suggest a discussion on the
> > IETF list with a view to getting it into RFC3530-bis.
>
> Yeah, that would be a good idea to get nailed down at some point.
>
> (But the current server implementation doesn't allow write opens in this
> situation. So I wonder why we're seeing any commit from the client at
> all?)
>
> --b.

This problem was reported against a RHEL5 client and server. It's
possible that there's another bug that's allowing that somehow, but
I'll need to look over the capture file again to see if I can find it.

--
Jeff Layton <[email protected]>

2010-03-22 19:47:10

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH] nfsd: don't break lease while servicing a COMMIT call (try #2)

On Fri, Mar 19, 2010 at 08:06:28AM -0400, Jeff Layton wrote:
> This is the second attempt to fix the problem whereby a COMMIT call
> causes a lease break and triggers a possible deadlock.
>
> The problem is that nfsd attempts to break a lease on a COMMIT call.
> This triggers a delegation recall if the lease is held for a delegation.
> If the client is the one holding the delegation and it's the same one on
> which it's issuing the COMMIT, then it can't return that delegation
> until the COMMIT is complete. But, nfsd won't complete the COMMIT until
> the delegation is returned. The client and server are essentially
> deadlocked until the state is marked bad (due to the client not
> responding on the callback channel).
>
> The first patch attempted to deal with this by eliminating the open of
> the file altogether and simply had nfsd_commit pass a NULL file pointer
> to the vfs_fsync_range. That would conflict with some work in progress
> by Christoph Hellwig to clean up the fsync interface, so this patch
> takes a different approach.
>
> This declares a new NFSD_MAY_NOT_BREAK_LEASE access flag that indicates
> to nfsd_open that it should not break any leases when opening the file,
> and has nfsd_commit set that flag on the nfsd_open call.

Thanks; applying (after I run a few basic tests).

But: why would a client be waiting on a commit to return a read
delegation? (We don't give out write delegations.)

--b.

>
> For now, this patch leaves nfsd_commit opening the file with write
> access since I'm not clear on what sort of access would be more
> appropriate.
>
> Signed-off-by: Jeff Layton <[email protected]>
> ---
> fs/nfsd/vfs.c | 8 +++++---
> fs/nfsd/vfs.h | 1 +
> 2 files changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index a11b0e8..c2dcb4c 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -723,7 +723,7 @@ nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, int type,
> struct inode *inode;
> int flags = O_RDONLY|O_LARGEFILE;
> __be32 err;
> - int host_err;
> + int host_err = 0;
>
> validate_process_creds();
>
> @@ -760,7 +760,8 @@ nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, int type,
> * Check to see if there are any leases on this file.
> * This may block while leases are broken.
> */
> - host_err = break_lease(inode, O_NONBLOCK | ((access & NFSD_MAY_WRITE) ? O_WRONLY : 0));
> + if (!(access & NFSD_MAY_NOT_BREAK_LEASE))
> + host_err = break_lease(inode, O_NONBLOCK | ((access & NFSD_MAY_WRITE) ? O_WRONLY : 0));
> if (host_err == -EWOULDBLOCK)
> host_err = -ETIMEDOUT;
> if (host_err) /* NOMEM or WOULDBLOCK */
> @@ -1168,7 +1169,8 @@ nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp,
> goto out;
> }
>
> - err = nfsd_open(rqstp, fhp, S_IFREG, NFSD_MAY_WRITE, &file);
> + err = nfsd_open(rqstp, fhp, S_IFREG,
> + NFSD_MAY_WRITE|NFSD_MAY_NOT_BREAK_LEASE, &file);
> if (err)
> goto out;
> if (EX_ISSYNC(fhp->fh_export)) {
> diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
> index 4b1de0a..217a62c 100644
> --- a/fs/nfsd/vfs.h
> +++ b/fs/nfsd/vfs.h
> @@ -20,6 +20,7 @@
> #define NFSD_MAY_OWNER_OVERRIDE 64
> #define NFSD_MAY_LOCAL_ACCESS 128 /* IRIX doing local access check on device special file*/
> #define NFSD_MAY_BYPASS_GSS_ON_ROOT 256
> +#define NFSD_MAY_NOT_BREAK_LEASE 512
>
> #define NFSD_MAY_CREATE (NFSD_MAY_EXEC|NFSD_MAY_WRITE)
> #define NFSD_MAY_REMOVE (NFSD_MAY_EXEC|NFSD_MAY_WRITE|NFSD_MAY_TRUNC)
> --
> 1.6.6.1
>

2010-03-22 20:33:40

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] nfsd: don't break lease while servicing a COMMIT call (try #2)

On Mon, 22 Mar 2010 15:47:10 -0400
"J. Bruce Fields" <[email protected]> wrote:

> On Fri, Mar 19, 2010 at 08:06:28AM -0400, Jeff Layton wrote:
> > This is the second attempt to fix the problem whereby a COMMIT call
> > causes a lease break and triggers a possible deadlock.
> >
> > The problem is that nfsd attempts to break a lease on a COMMIT call.
> > This triggers a delegation recall if the lease is held for a delegation.
> > If the client is the one holding the delegation and it's the same one on
> > which it's issuing the COMMIT, then it can't return that delegation
> > until the COMMIT is complete. But, nfsd won't complete the COMMIT until
> > the delegation is returned. The client and server are essentially
> > deadlocked until the state is marked bad (due to the client not
> > responding on the callback channel).
> >
> > The first patch attempted to deal with this by eliminating the open of
> > the file altogether and simply had nfsd_commit pass a NULL file pointer
> > to the vfs_fsync_range. That would conflict with some work in progress
> > by Christoph Hellwig to clean up the fsync interface, so this patch
> > takes a different approach.
> >
> > This declares a new NFSD_MAY_NOT_BREAK_LEASE access flag that indicates
> > to nfsd_open that it should not break any leases when opening the file,
> > and has nfsd_commit set that flag on the nfsd_open call.
>
> Thanks; applying (after I run a few basic tests).
>
> But: why would a client be waiting on a commit to return a read
> delegation? (We don't give out write delegations.)
>
> --b.
>

It looks like nfs_inode_return_delegation always calls nfs_msync_inode
on any valid delegation before returning it, regardless of the
delegation type.

RFC 3530 says this:

If the client is granted a read delegation, it is assured that no
other client has the ability to write to the file for the duration of
the delegation. If the client is granted a write delegation, the
client is assured that no other client has read or write access to
the file.

That doesn't seem to imply that we must flush writes before returning
either type of delegation. OTOH, maybe it makes sense to treat those as
cache consistency points since a delegreturn sort of implies that
another client wants to use the file.

I'm not quite sure how to interpret the spec here...

--
Jeff Layton <[email protected]>