2010-05-14 09:32:49

by Mi Jinlong

[permalink] [raw]
Subject: [PATCH] VFS: Unlink should revoke all outstanding leases on file

After client get one file's READ delegation through NFSv4,
server delete this file but don't reclaim the delegation.

This patch add break_lease at may_delete, which can reclaim delegations.

---
fs/namei.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 16df727..17bafc1 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1338,7 +1338,7 @@ static int may_delete(struct inode *dir,struct dentry *victim,int isdir)
return -ENOENT;
if (victim->d_flags & DCACHE_NFSFS_RENAMED)
return -EBUSY;
- return 0;
+ return break_lease(victim->d_inode, FMODE_WRITE);
}

/* Check whether we can create an object with dentry child in directory
--
1.7.0





2010-05-19 14:06:55

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH] VFS: Unlink should revoke all outstanding leases on file

On Fri, May 14, 2010 at 11:16:43AM -0700, Jeremy Allison wrote:
> On Fri, May 14, 2010 at 06:46:53PM +0100, Jamie Lokier wrote:
>
> > I think you can delete open files on Windows nowadays, if they are
> > opened with a particular flag.
>
> You can only mark them as "to be deleted" once the last opener
> closes. They still exist in the namespace.

So, I'm a little lost: in your opinion, would leases be more useful to
Samba if they were broken on delete, or if they weren't, or does it not
matter because you'll never do that?

(And, what about the same question for rename, link, chmod, chown, ...?)

I see three options:
1. modify the existing file lease behavior to match what NFSv4
(and Samba?) needs; or
2. leave the existing leases alone and create some new lock type
(or otherwise flag some leases somehow) that does what we
want; and, if we do that, either:
2a. leave the new leases in-kernel-only, or
2b. expose the new leases to userspace somehow so Samba
(or whever) can use them.

I don't think any of 1, 2a, or 2b is likely to be harder than any other,
so it's just a question of what we want.

--b.

2010-05-14 17:59:39

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH] VFS: Unlink should revoke all outstanding leases on file

On Fri, 2010-05-14 at 13:38 -0400, Jeff Layton wrote:
> On Fri, 14 May 2010 13:17:51 -0400
> Trond Myklebust <[email protected]> wrote:
>
> > On Fri, 2010-05-14 at 05:58 -0400, Jeff Layton wrote:
> > > On Fri, 14 May 2010 17:35:27 +0800
> > > Mi Jinlong <[email protected]> wrote:
> > >
> > > > After client get one file's READ delegation through NFSv4,
> > > > server delete this file but don't reclaim the delegation.
> > > >
> > > > This patch add break_lease at may_delete, which can reclaim delegations.
> > > >
> > > > ---
> > > > fs/namei.c | 2 +-
> > > > 1 files changed, 1 insertions(+), 1 deletions(-)
> > > >
> > > > diff --git a/fs/namei.c b/fs/namei.c
> > > > index 16df727..17bafc1 100644
> > > > --- a/fs/namei.c
> > > > +++ b/fs/namei.c
> > > > @@ -1338,7 +1338,7 @@ static int may_delete(struct inode *dir,struct dentry *victim,int isdir)
> > > > return -ENOENT;
> > > > if (victim->d_flags & DCACHE_NFSFS_RENAMED)
> > > > return -EBUSY;
> > > > - return 0;
> > > > + return break_lease(victim->d_inode, FMODE_WRITE);
> > > > }
> > > >
> > > > /* Check whether we can create an object with dentry child in directory
> > >
> > > This doesn't look right to me.
> > >
> > > The fcntl(2) manpage basically says that leases should be broken if the
> > > file is opened for read or write, or is truncated. unlinks don't seem
> > > to fall into either category...
> > >
> >
> > Breaking the lease in this case is certainly a requirement for NFSv4
> > delegations. I've no idea what the CIFS oplock requirements are...
> >
>
> Heh, probably "undefined". Windows generally doesn't allow you to
> delete open files at all. I don't think samba will really care too
> much either way. I suppose it could hurt performance in situations
> where you had a file that was hardlinked and deleted a hardlink that
> was "unrelated" to the dentry being held open...but that's pretty
> clearly a corner case at best.
>
> At the risk of being lazy and not checking for myself...what in the
> NFSv4 spec mandates this?
>

Section 9.4.4: Recall of Open Delegation

The following events necessitate recall of an open delegation:

o Potentially conflicting OPEN request (or READ/WRITE done with
"special" stateid)

o SETATTR issued by another client

o REMOVE request for the file

o RENAME request for the file as either source or target of the
RENAME

Whether a RENAME of a directory in the path leading to the file
results in recall of an open delegation depends on the semantics of
the server filesystem. If that filesystem denies such RENAMEs when a
file is open, the recall must be performed to determine whether the
file in question is, in fact, open.

Note that the server should also recall the delegation if someone
attempts to violate the guarantees that are listed in section 9.4: Open
Delegation

When a client has a read open delegation, it may not make any changes
to the contents or attributes of the file but it is assured that no
other client may do so. When a client has a write open delegation,
it may modify the file data since no other client will be accessing
the file's data. The client holding a write delegation may only
affect file attributes which are intimately connected with the file
data: size, time_modify, change.

IOW: even if you hold a write delegation you are not allowed to change
the file mode bits, owner, group or acls...

Cheers
Trond


2010-05-19 09:46:27

by Mi Jinlong

[permalink] [raw]
Subject: Re: [PATCH] VFS: Unlink should revoke all outstanding leases on file



J. Bruce Fields :
> On Fri, May 14, 2010 at 02:31:12PM -0400, Trond Myklebust wrote:
>> On Fri, 2010-05-14 at 13:59 -0400, Trond Myklebust wrote:
>>> Note that the server should also recall the delegation if someone
>>> attempts to violate the guarantees that are listed in section 9.4: Open
>>> Delegation
>>>
>>> When a client has a read open delegation, it may not make any changes
>>> to the contents or attributes of the file but it is assured that no
>>> other client may do so. When a client has a write open delegation,
>>> it may modify the file data since no other client will be accessing
>>> the file's data. The client holding a write delegation may only
>>> affect file attributes which are intimately connected with the file
>>> data: size, time_modify, change.
>>>
>>> IOW: even if you hold a write delegation you are not allowed to change
>>> the file mode bits, owner, group or acls...
>> ...or the nlink value. So technically, we should also recall the
>> delegation when someone creates or deletes a hard link. I think I need
>> to remind Tom that he should add that to the RFC3530bis draft...
>
> Yep. And fixing all these cases is required before our the server's
> NFSv4 server is ready for much of anything.
>
> I'm not sure ading break_lease() to may_delete() is right, but maybe
> it's better than nothing.

Agree with you.

>
> One problem is that there's a race: nothing I can see stops anyone from
> getting another lease after may_delete() but before the delete happens.

Yes.
The problem will exist, but there isn't some better methods to avoid it.
Is there a lease lock exist in kernel?
If that's true, the problem will be fixed simply.

thanks,
Mi Jinlong


2010-05-14 18:24:16

by Jeremy Allison

[permalink] [raw]
Subject: Re: [PATCH] VFS: Unlink should revoke all outstanding leases on file

On Fri, May 14, 2010 at 06:46:53PM +0100, Jamie Lokier wrote:

> I think you can delete open files on Windows nowadays, if they are
> opened with a particular flag.

You can only mark them as "to be deleted" once the last opener
closes. They still exist in the namespace.

Jeremy.

2010-05-20 09:46:10

by Mi Jinlong

[permalink] [raw]
Subject: Re: [PATCH] VFS: Unlink should revoke all outstanding leases on file



J. Bruce Fields :
> On Wed, May 19, 2010 at 05:46:23PM +0800, Mi Jinlong wrote:
>>
>> J. Bruce Fields :
>>> On Fri, May 14, 2010 at 02:31:12PM -0400, Trond Myklebust wrote:
>>>> On Fri, 2010-05-14 at 13:59 -0400, Trond Myklebust wrote:
>>>>> Note that the server should also recall the delegation if someone
>>>>> attempts to violate the guarantees that are listed in section 9.4: Open
>>>>> Delegation
>>>>>
>>>>> When a client has a read open delegation, it may not make any changes
>>>>> to the contents or attributes of the file but it is assured that no
>>>>> other client may do so. When a client has a write open delegation,
>>>>> it may modify the file data since no other client will be accessing
>>>>> the file's data. The client holding a write delegation may only
>>>>> affect file attributes which are intimately connected with the file
>>>>> data: size, time_modify, change.
>>>>>
>>>>> IOW: even if you hold a write delegation you are not allowed to change
>>>>> the file mode bits, owner, group or acls...
>>>> ...or the nlink value. So technically, we should also recall the
>>>> delegation when someone creates or deletes a hard link. I think I need
>>>> to remind Tom that he should add that to the RFC3530bis draft...
>>> Yep. And fixing all these cases is required before our the server's
>>> NFSv4 server is ready for much of anything.
>>>
>>> I'm not sure ading break_lease() to may_delete() is right, but maybe
>>> it's better than nothing.
>> Agree with you.
>>
>>> One problem is that there's a race: nothing I can see stops anyone from
>>> getting another lease after may_delete() but before the delete happens.
>> Yes.
>> The problem will exist, but there isn't some better methods to avoid it.
>> Is there a lease lock exist in kernel?
>> If that's true, the problem will be fixed simply.
>
> I don't know of any existing lock that does exactly what we want.
>
> Somebody at citi worked on a better lease implementation for a while,
> but I don't think we ever really got it right; the last version I can
> find is here:
>
> git://linux-nfs.org/~bfields/linux-topics.git leases

When reading the code of the git, i found a patch which try to fix
the lease's problem, but only for unlink.

commit id: d5a08e556116c66fb60a448f805a40bf54314634
msg: "leases: break file leases on unlink."

In this patch, it seems only add break_lease() and some other functions
but seems don't avoid the problem of race. Or there is some different
at break_lease() with the community's kernel.

Can you give me some message about the new lease? Thanks.

thanks,
Mi Jinlong


2010-05-14 17:18:04

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH] VFS: Unlink should revoke all outstanding leases on file

On Fri, 2010-05-14 at 05:58 -0400, Jeff Layton wrote:
> On Fri, 14 May 2010 17:35:27 +0800
> Mi Jinlong <[email protected]> wrote:
>
> > After client get one file's READ delegation through NFSv4,
> > server delete this file but don't reclaim the delegation.
> >
> > This patch add break_lease at may_delete, which can reclaim delegations.
> >
> > ---
> > fs/namei.c | 2 +-
> > 1 files changed, 1 insertions(+), 1 deletions(-)
> >
> > diff --git a/fs/namei.c b/fs/namei.c
> > index 16df727..17bafc1 100644
> > --- a/fs/namei.c
> > +++ b/fs/namei.c
> > @@ -1338,7 +1338,7 @@ static int may_delete(struct inode *dir,struct dentry *victim,int isdir)
> > return -ENOENT;
> > if (victim->d_flags & DCACHE_NFSFS_RENAMED)
> > return -EBUSY;
> > - return 0;
> > + return break_lease(victim->d_inode, FMODE_WRITE);
> > }
> >
> > /* Check whether we can create an object with dentry child in directory
>
> This doesn't look right to me.
>
> The fcntl(2) manpage basically says that leases should be broken if the
> file is opened for read or write, or is truncated. unlinks don't seem
> to fall into either category...
>

Breaking the lease in this case is certainly a requirement for NFSv4
delegations. I've no idea what the CIFS oplock requirements are...

Cheers
Trond


2010-05-14 17:38:36

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] VFS: Unlink should revoke all outstanding leases on file

On Fri, 14 May 2010 13:17:51 -0400
Trond Myklebust <[email protected]> wrote:

> On Fri, 2010-05-14 at 05:58 -0400, Jeff Layton wrote:
> > On Fri, 14 May 2010 17:35:27 +0800
> > Mi Jinlong <[email protected]> wrote:
> >
> > > After client get one file's READ delegation through NFSv4,
> > > server delete this file but don't reclaim the delegation.
> > >
> > > This patch add break_lease at may_delete, which can reclaim delegations.
> > >
> > > ---
> > > fs/namei.c | 2 +-
> > > 1 files changed, 1 insertions(+), 1 deletions(-)
> > >
> > > diff --git a/fs/namei.c b/fs/namei.c
> > > index 16df727..17bafc1 100644
> > > --- a/fs/namei.c
> > > +++ b/fs/namei.c
> > > @@ -1338,7 +1338,7 @@ static int may_delete(struct inode *dir,struct dentry *victim,int isdir)
> > > return -ENOENT;
> > > if (victim->d_flags & DCACHE_NFSFS_RENAMED)
> > > return -EBUSY;
> > > - return 0;
> > > + return break_lease(victim->d_inode, FMODE_WRITE);
> > > }
> > >
> > > /* Check whether we can create an object with dentry child in directory
> >
> > This doesn't look right to me.
> >
> > The fcntl(2) manpage basically says that leases should be broken if the
> > file is opened for read or write, or is truncated. unlinks don't seem
> > to fall into either category...
> >
>
> Breaking the lease in this case is certainly a requirement for NFSv4
> delegations. I've no idea what the CIFS oplock requirements are...
>

Heh, probably "undefined". Windows generally doesn't allow you to
delete open files at all. I don't think samba will really care too
much either way. I suppose it could hurt performance in situations
where you had a file that was hardlinked and deleted a hardlink that
was "unrelated" to the dentry being held open...but that's pretty
clearly a corner case at best.

At the risk of being lazy and not checking for myself...what in the
NFSv4 spec mandates this?

--
Jeff Layton <[email protected]>

2010-05-19 16:21:31

by Jamie Lokier

[permalink] [raw]
Subject: Re: [PATCH] VFS: Unlink should revoke all outstanding leases on file

J. Bruce Fields wrote:
> On Fri, May 14, 2010 at 11:16:43AM -0700, Jeremy Allison wrote:
> > On Fri, May 14, 2010 at 06:46:53PM +0100, Jamie Lokier wrote:
> >
> > > I think you can delete open files on Windows nowadays, if they are
> > > opened with a particular flag.
> >
> > You can only mark them as "to be deleted" once the last opener
> > closes. They still exist in the namespace.
>
> So, I'm a little lost: in your opinion, would leases be more useful to
> Samba if they were broken on delete, or if they weren't, or does it not
> matter because you'll never do that?

Samba might not delete open files (I'm not sure), but the Linux user
on the server can still unlink the files, or rename over them.

What should happen then, I'm not sure. Maybe Samba should be able to
delay the delete (like reads/writes can be delayed), or maybe it
should be able to refuse the delete altogether (similar to the way
the fanotify framework can block operations).

> I see three options:
> 1. modify the existing file lease behavior to match what NFSv4
> (and Samba?) needs; or
> 2. leave the existing leases alone and create some new lock type
> (or otherwise flag some leases somehow) that does what we
> want; and, if we do that, either:
> 2a. leave the new leases in-kernel-only, or
> 2b. expose the new leases to userspace somehow so Samba
> (or whever) can use them.
>
> I don't think any of 1, 2a, or 2b is likely to be harder than any other,
> so it's just a question of what we want.

I think changing the userspace contract for long-standing F_SETLEASE
is rude at least. Samba and NFS aren't the only lease users, and
anyway people will still run old Samba on new kernels; changing its
behaviour is a bit risky.

Imho, new lease semantics should use new userspace API.

-- Jamie

2010-05-21 21:07:45

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH] VFS: Unlink should revoke all outstanding leases on file

On Thu, May 20, 2010 at 05:46:06PM +0800, Mi Jinlong wrote:
> J. Bruce Fields :
> > I don't know of any existing lock that does exactly what we want.
> >
> > Somebody at citi worked on a better lease implementation for a while,
> > but I don't think we ever really got it right; the last version I can
> > find is here:
> >
> > git://linux-nfs.org/~bfields/linux-topics.git leases
>
> When reading the code of the git, i found a patch which try to fix
> the lease's problem, but only for unlink.

It's 8 patches together:

> commit id: d5a08e556116c66fb60a448f805a40bf54314634
> msg: "leases: break file leases on unlink."
>
> In this patch, it seems only add break_lease() and some other functions
> but seems don't avoid the problem of race.

Look again: break_lease() is there, but there's also a break_lease_end()
after the unlink.

> Or there is some different
> at break_lease() with the community's kernel.
>
> Can you give me some message about the new lease? Thanks.

So the 8 patches at that branch are:
leases: introduce per-inode lease enabling/disabling
VFS: clean up extra dereferences in do_unlinkat()
leases: break file leases on unlink.
leases: break file leases on rename.
leases: break leases on chown.
VFS: refactor sys_fchmod and sys_fchmodat
leases: break leases on chmod.
leases: break leases on link.

Like I say, I don't think they're correct or I'd just copy them all to
the list. But maybe the comment on the first patch (appended) is
useful.

--b.

leases: introduce per-inode lease enabling/disabling

The current file lease implementation is inadequate (for the purposes of
nfs, and, we believe, for the purposes of Samba), in at least two ways:

- Leases are broken only conflicting opens; but both nfsv4
delegations and (we're told) Windows op locks actually require
that leases be broken on any operation that changes file
metadata--including unlink, link, rename, chmod, and chown.

- The internal kernel api used for lease-breaking is inherently
racy, consisting as it does of a single break_lease() call.
(Consider this scenario: a file is not currently open and is
about to be unlinked. During unlink processing, a lookup is
done, and break_lease() is called. After the break_lease(),
but before the unlink completes, another user opens the file
and gets a read lease. The unlink then completes, but the
other user thinks their read lease is still valid. This
situation would be avoided if lease-granting for the inode
were disabled for the duration of the unlink.)

We're primarily interested in the case of read leases for now. (Write
leases, which also must be broken on *access* to a file, are more
difficult to get completely right, and aren't used by the current nfs
server.)

Fixing the second problem requires replacing break_lease() by a pair of
calls, here called break_lease() and break_lease_end(), between which
new leases are temporarily prohibited.

We want to implement that temporary prohibition in a simple way that has
low impact in common (uncontended) cases.

This patch adds a field, i_leasecount, which provides mutual exclusion
between inode-modifying operations and read leases in the same way the
i_writecount provides mutual exclusion between write opens and execs:
when i_leasecount is positive, it counts the number of leases on the
given inode, and when it's negative it counts the number of operations
which want leases temporarily disabled. This allows selective
enabling/disabling of leases on a per-inode basis.

To that end, the functions leases_get_access() and leases_put_access()
are used when a lease is granted and returned, respectively. The
functions leases_deny_access() and leases_allow_access() are used to
prevent races between breaking-with-FMODE_WRITE and write-lease-granting
for the entire duration of a file operation. Currently, leases are
broken only when a file is opened or truncated; these functions will
allow leases to be broken on things like unlink and rename as well.
NFSv4 implements delegations using leases, and needs its delegations to
be revoked on unlinks, renames, chowns, etc.

Note that this patch changes break_lease() and __break_lease(), such
that when they are called with FMODE_WRITE and return successfully, they
will leave leases disabled on the inode in question, and the caller must
eventually call break_lease_end() to re-enable leasing. As alluded to
in the scenario above, this behavior isn't necessary when breaking
without FMODE_WRITE: existing and new read-leases wouldn't need to be
revoked or blocked; and a write-lease-granting setlease won't race the
break_lease() because the latter is presumed to have been preceded by
something like a dget() on the dentry in question (where d_count or
i_count > 1 blocks write-lease-granting).

This patch also closes a small existing open/lease race: a lease-related
race exists between the time that outstanding leases are broken (by
may_open()) and the time that, e.g., O_RDWR or O_WRONLY are reflected in
the inode's i_writecount variable (which will prevent subsequent
lease-granting setlease calls). Conceivably, a read lease could be
granted in the interim.

To deal with this, may_open() is modified so that, on success and when
called with FMODE_WRITE, it will return with lease-granting disabled for
the inode in question. do_filp_open() is modified so that leasing is
re-enabled once everything is finished. Analogous changes are made on
truncation.

2010-05-14 17:47:01

by Jamie Lokier

[permalink] [raw]
Subject: Re: [PATCH] VFS: Unlink should revoke all outstanding leases on file

Jeff Layton wrote:
> On Fri, 14 May 2010 13:17:51 -0400
> Trond Myklebust <[email protected]> wrote:
>
> > On Fri, 2010-05-14 at 05:58 -0400, Jeff Layton wrote:
> > > On Fri, 14 May 2010 17:35:27 +0800
> > > Mi Jinlong <[email protected]> wrote:
> > >
> > > > After client get one file's READ delegation through NFSv4,
> > > > server delete this file but don't reclaim the delegation.
> > > >
> > > > This patch add break_lease at may_delete, which can reclaim delegations.
> > > >
> > > > ---
> > > > fs/namei.c | 2 +-
> > > > 1 files changed, 1 insertions(+), 1 deletions(-)
> > > >
> > > > diff --git a/fs/namei.c b/fs/namei.c
> > > > index 16df727..17bafc1 100644
> > > > --- a/fs/namei.c
> > > > +++ b/fs/namei.c
> > > > @@ -1338,7 +1338,7 @@ static int may_delete(struct inode *dir,struct dentry *victim,int isdir)
> > > > return -ENOENT;
> > > > if (victim->d_flags & DCACHE_NFSFS_RENAMED)
> > > > return -EBUSY;
> > > > - return 0;
> > > > + return break_lease(victim->d_inode, FMODE_WRITE);
> > > > }
> > > >
> > > > /* Check whether we can create an object with dentry child in directory
> > >
> > > This doesn't look right to me.
> > >
> > > The fcntl(2) manpage basically says that leases should be broken if the
> > > file is opened for read or write, or is truncated. unlinks don't seem
> > > to fall into either category...
> > >
> >
> > Breaking the lease in this case is certainly a requirement for NFSv4
> > delegations. I've no idea what the CIFS oplock requirements are...
> >
>
> Heh, probably "undefined". Windows generally doesn't allow you to
> delete open files at all.

I think you can delete open files on Windows nowadays, if they are
opened with a particular flag.

> I don't think samba will really care too much either way. I suppose
> it could hurt performance in situations where you had a file that
> was hardlinked and deleted a hardlink that was "unrelated" to the
> dentry being held open...but that's pretty clearly a corner case at
> best.

Leases are handy for some userspace caching tricks too. (inotify is
too late for some coherent things: the file is modified first, then
you find out.)

I wouldn't like deleting a hard-link to have that effect if it can be
avoided. Or renaming (see below).

> At the risk of being lazy and not checking for myself...what in the
> NFSv4 spec mandates this?

On the same note, if deleting any link of a hard-link file requires
this, surely renaming a file requires it too, because that's roughly
equivalent to making a new link and deleting the old one.

-- Jamie

2010-05-19 15:57:07

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH] VFS: Unlink should revoke all outstanding leases on file

On Wed, May 19, 2010 at 05:46:23PM +0800, Mi Jinlong wrote:
>
>
> J. Bruce Fields :
> > On Fri, May 14, 2010 at 02:31:12PM -0400, Trond Myklebust wrote:
> >> On Fri, 2010-05-14 at 13:59 -0400, Trond Myklebust wrote:
> >>> Note that the server should also recall the delegation if someone
> >>> attempts to violate the guarantees that are listed in section 9.4: Open
> >>> Delegation
> >>>
> >>> When a client has a read open delegation, it may not make any changes
> >>> to the contents or attributes of the file but it is assured that no
> >>> other client may do so. When a client has a write open delegation,
> >>> it may modify the file data since no other client will be accessing
> >>> the file's data. The client holding a write delegation may only
> >>> affect file attributes which are intimately connected with the file
> >>> data: size, time_modify, change.
> >>>
> >>> IOW: even if you hold a write delegation you are not allowed to change
> >>> the file mode bits, owner, group or acls...
> >> ...or the nlink value. So technically, we should also recall the
> >> delegation when someone creates or deletes a hard link. I think I need
> >> to remind Tom that he should add that to the RFC3530bis draft...
> >
> > Yep. And fixing all these cases is required before our the server's
> > NFSv4 server is ready for much of anything.
> >
> > I'm not sure ading break_lease() to may_delete() is right, but maybe
> > it's better than nothing.
>
> Agree with you.
>
> >
> > One problem is that there's a race: nothing I can see stops anyone from
> > getting another lease after may_delete() but before the delete happens.
>
> Yes.
> The problem will exist, but there isn't some better methods to avoid it.
> Is there a lease lock exist in kernel?
> If that's true, the problem will be fixed simply.

I don't know of any existing lock that does exactly what we want.

Somebody at citi worked on a better lease implementation for a while,
but I don't think we ever really got it right; the last version I can
find is here:

git://linux-nfs.org/~bfields/linux-topics.git leases

--b.

2010-05-19 16:14:29

by Jamie Lokier

[permalink] [raw]
Subject: Re: [PATCH] VFS: Unlink should revoke all outstanding leases on file

J. Bruce Fields wrote:
> On Fri, May 14, 2010 at 02:31:12PM -0400, Trond Myklebust wrote:
> > On Fri, 2010-05-14 at 13:59 -0400, Trond Myklebust wrote:
> > > Note that the server should also recall the delegation if someone
> > > attempts to violate the guarantees that are listed in section 9.4: Open
> > > Delegation
> > >
> > > When a client has a read open delegation, it may not make any changes
> > > to the contents or attributes of the file but it is assured that no
> > > other client may do so. When a client has a write open delegation,
> > > it may modify the file data since no other client will be accessing
> > > the file's data. The client holding a write delegation may only
> > > affect file attributes which are intimately connected with the file
> > > data: size, time_modify, change.
> > >
> > > IOW: even if you hold a write delegation you are not allowed to change
> > > the file mode bits, owner, group or acls...
> >
> > ...or the nlink value. So technically, we should also recall the
> > delegation when someone creates or deletes a hard link. I think I need
> > to remind Tom that he should add that to the RFC3530bis draft...
>
> Yep. And fixing all these cases is required before our the server's
> NFSv4 server is ready for much of anything.
>
> I'm not sure ading break_lease() to may_delete() is right, but maybe
> it's better than nothing.
>
> One problem is that there's a race: nothing I can see stops anyone from
> getting another lease after may_delete() but before the delete happens.

Presumably the intent is that the NFSv4 REMOVE request _acquires_ the
lease, and releases it after the delete is done.

Same pattern with renames, attribute changes, etc.

Imho it would all be much tidier if leases had the same set of flag
bits as inotify/dnotify, to say what changes they block. (Maybe the
flags would be slightly different - a detail).

All operations (read, write, open, link, rename, etc.) would follow a
pattern like this pseudo-code:

do_write(file)
{
err = lease_acquire(file, IN_MODIFY);
if (err < 0)
return err;

/* Do the modifying. */

lease_release_and_inotify_event(file, IN_MODIFY);
}

I think that would provide the semantics needed by NFS, Samba, also
fanotify for free, and more or less any kind of userspace caching,
coherent or not. It's clean and orthogonal. (Good value for money isn't it?)

The nlink value is missing from inotify (or "linked from" if you look
at it differently), but that's a problem needing to be fixed anyway.

-- Jamie

2010-05-14 18:31:21

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH] VFS: Unlink should revoke all outstanding leases on file

On Fri, 2010-05-14 at 13:59 -0400, Trond Myklebust wrote:
> Note that the server should also recall the delegation if someone
> attempts to violate the guarantees that are listed in section 9.4: Open
> Delegation
>
> When a client has a read open delegation, it may not make any changes
> to the contents or attributes of the file but it is assured that no
> other client may do so. When a client has a write open delegation,
> it may modify the file data since no other client will be accessing
> the file's data. The client holding a write delegation may only
> affect file attributes which are intimately connected with the file
> data: size, time_modify, change.
>
> IOW: even if you hold a write delegation you are not allowed to change
> the file mode bits, owner, group or acls...

...or the nlink value. So technically, we should also recall the
delegation when someone creates or deletes a hard link. I think I need
to remind Tom that he should add that to the RFC3530bis draft...

Cheers
Trond


2010-05-20 02:21:27

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH] VFS: Unlink should revoke all outstanding leases on file

On Wed, May 19, 2010 at 05:14:19PM +0100, Jamie Lokier wrote:
> J. Bruce Fields wrote:
> > I'm not sure ading break_lease() to may_delete() is right, but maybe
> > it's better than nothing.
> >
> > One problem is that there's a race: nothing I can see stops anyone from
> > getting another lease after may_delete() but before the delete happens.
>
> Presumably the intent is that the NFSv4 REMOVE request _acquires_ the
> lease, and releases it after the delete is done.

It acquires something that prevents conflicting leases from being
granted, yes. (Dunno if I'd call the thing that prevents conflicting
leases a "lease".)

> Same pattern with renames, attribute changes, etc.
>
> Imho it would all be much tidier if leases had the same set of flag
> bits as inotify/dnotify, to say what changes they block. (Maybe the
> flags would be slightly different - a detail).
>
> All operations (read, write, open, link, rename, etc.) would follow a
> pattern like this pseudo-code:
>
> do_write(file)
> {
> err = lease_acquire(file, IN_MODIFY);

So I might call that "lease_deny" (or continue to call it
"lease_break").

> if (err < 0)
> return err;
>
> /* Do the modifying. */
>
> lease_release_and_inotify_event(file, IN_MODIFY);

Also, note the holder of the conflicting lease needs to be notified at
the start, not here. (And the notification is synchronous--the
lease-holder gets to block the operation until it returns the lease.)

> }
>
> I think that would provide the semantics needed by NFS, Samba, also
> fanotify for free, and more or less any kind of userspace caching,
> coherent or not. It's clean and orthogonal. (Good value for money isn't it?)
>
> The nlink value is missing from inotify (or "linked from" if you look
> at it differently), but that's a problem needing to be fixed anyway.

The interface sounds neat, sure.

I worry if it requires us to implement all of those mask bits at once.
Some might turn out to be more difficult to implement than others, and
we really only care about some of them for now. I suppose there could
be a "supported_lease_mask_bits" value advertised to userspace.

--b.

2010-05-14 19:23:35

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH] VFS: Unlink should revoke all outstanding leases on file

On Fri, May 14, 2010 at 02:31:12PM -0400, Trond Myklebust wrote:
> On Fri, 2010-05-14 at 13:59 -0400, Trond Myklebust wrote:
> > Note that the server should also recall the delegation if someone
> > attempts to violate the guarantees that are listed in section 9.4: Open
> > Delegation
> >
> > When a client has a read open delegation, it may not make any changes
> > to the contents or attributes of the file but it is assured that no
> > other client may do so. When a client has a write open delegation,
> > it may modify the file data since no other client will be accessing
> > the file's data. The client holding a write delegation may only
> > affect file attributes which are intimately connected with the file
> > data: size, time_modify, change.
> >
> > IOW: even if you hold a write delegation you are not allowed to change
> > the file mode bits, owner, group or acls...
>
> ...or the nlink value. So technically, we should also recall the
> delegation when someone creates or deletes a hard link. I think I need
> to remind Tom that he should add that to the RFC3530bis draft...

Yep. And fixing all these cases is required before our the server's
NFSv4 server is ready for much of anything.

I'm not sure ading break_lease() to may_delete() is right, but maybe
it's better than nothing.

One problem is that there's a race: nothing I can see stops anyone from
getting another lease after may_delete() but before the delete happens.

--b.

2010-05-25 10:14:53

by Mi Jinlong

[permalink] [raw]
Subject: Re: [PATCH] VFS: Unlink should revoke all outstanding leases on file



J. Bruce Fields :
> On Thu, May 20, 2010 at 05:46:06PM +0800, Mi Jinlong wrote:
>> J. Bruce Fields :
>>> I don't know of any existing lock that does exactly what we want.
>>>
>>> Somebody at citi worked on a better lease implementation for a while,
>>> but I don't think we ever really got it right; the last version I can
>>> find is here:
>>>
>>> git://linux-nfs.org/~bfields/linux-topics.git leases
>> When reading the code of the git, i found a patch which try to fix
>> the lease's problem, but only for unlink.
>
> It's 8 patches together:
>
>> commit id: d5a08e556116c66fb60a448f805a40bf54314634
>> msg: "leases: break file leases on unlink."
>>
>> In this patch, it seems only add break_lease() and some other functions
>> but seems don't avoid the problem of race.
>
> Look again: break_lease() is there, but there's also a break_lease_end()
> after the unlink.

Thanks. That's important.

>
>> Or there is some different
>> at break_lease() with the community's kernel.
>>
>> Can you give me some message about the new lease? Thanks.
>
> So the 8 patches at that branch are:
> leases: introduce per-inode lease enabling/disabling
> VFS: clean up extra dereferences in do_unlinkat()
> leases: break file leases on unlink.
> leases: break file leases on rename.
> leases: break leases on chown.
> VFS: refactor sys_fchmod and sys_fchmodat
> leases: break leases on chmod.
> leases: break leases on link.
>
> Like I say, I don't think they're correct or I'd just copy them all to
> the list. But maybe the comment on the first patch (appended) is
> useful.

As reading the first patch ("leases: introduce ... "), it's really useful.
But, as Jamie Lokier said "new lease semantics should use new userspace API.",
is there some new lease under development ?

And, IMO, this problem we discussing is serious that should be fix ASAP.
Can we fix this problem refer to this solutions?

thanks,
Mi Jinlong

>
> --b.
>
> leases: introduce per-inode lease enabling/disabling
>
> The current file lease implementation is inadequate (for the purposes of
> nfs, and, we believe, for the purposes of Samba), in at least two ways:
>
> - Leases are broken only conflicting opens; but both nfsv4
> delegations and (we're told) Windows op locks actually require
> that leases be broken on any operation that changes file
> metadata--including unlink, link, rename, chmod, and chown.
>
> - The internal kernel api used for lease-breaking is inherently
> racy, consisting as it does of a single break_lease() call.
> (Consider this scenario: a file is not currently open and is
> about to be unlinked. During unlink processing, a lookup is
> done, and break_lease() is called. After the break_lease(),
> but before the unlink completes, another user opens the file
> and gets a read lease. The unlink then completes, but the
> other user thinks their read lease is still valid. This
> situation would be avoided if lease-granting for the inode
> were disabled for the duration of the unlink.)
>
> We're primarily interested in the case of read leases for now. (Write
> leases, which also must be broken on *access* to a file, are more
> difficult to get completely right, and aren't used by the current nfs
> server.)
>
> Fixing the second problem requires replacing break_lease() by a pair of
> calls, here called break_lease() and break_lease_end(), between which
> new leases are temporarily prohibited.
>
> We want to implement that temporary prohibition in a simple way that has
> low impact in common (uncontended) cases.
>
> This patch adds a field, i_leasecount, which provides mutual exclusion
> between inode-modifying operations and read leases in the same way the
> i_writecount provides mutual exclusion between write opens and execs:
> when i_leasecount is positive, it counts the number of leases on the
> given inode, and when it's negative it counts the number of operations
> which want leases temporarily disabled. This allows selective
> enabling/disabling of leases on a per-inode basis.
>
> To that end, the functions leases_get_access() and leases_put_access()
> are used when a lease is granted and returned, respectively. The
> functions leases_deny_access() and leases_allow_access() are used to
> prevent races between breaking-with-FMODE_WRITE and write-lease-granting
> for the entire duration of a file operation. Currently, leases are
> broken only when a file is opened or truncated; these functions will
> allow leases to be broken on things like unlink and rename as well.
> NFSv4 implements delegations using leases, and needs its delegations to
> be revoked on unlinks, renames, chowns, etc.
>
> Note that this patch changes break_lease() and __break_lease(), such
> that when they are called with FMODE_WRITE and return successfully, they
> will leave leases disabled on the inode in question, and the caller must
> eventually call break_lease_end() to re-enable leasing. As alluded to
> in the scenario above, this behavior isn't necessary when breaking
> without FMODE_WRITE: existing and new read-leases wouldn't need to be
> revoked or blocked; and a write-lease-granting setlease won't race the
> break_lease() because the latter is presumed to have been preceded by
> something like a dget() on the dentry in question (where d_count or
> i_count > 1 blocks write-lease-granting).
>
> This patch also closes a small existing open/lease race: a lease-related
> race exists between the time that outstanding leases are broken (by
> may_open()) and the time that, e.g., O_RDWR or O_WRONLY are reflected in
> the inode's i_writecount variable (which will prevent subsequent
> lease-granting setlease calls). Conceivably, a read lease could be
> granted in the interim.
>
> To deal with this, may_open() is modified so that, on success and when
> called with FMODE_WRITE, it will return with lease-granting disabled for
> the inode in question. do_filp_open() is modified so that leasing is
> re-enabled once everything is finished. Analogous changes are made on
> truncation.


2010-05-14 09:58:59

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH] VFS: Unlink should revoke all outstanding leases on file

On Fri, 14 May 2010 17:35:27 +0800
Mi Jinlong <[email protected]> wrote:

> After client get one file's READ delegation through NFSv4,
> server delete this file but don't reclaim the delegation.
>
> This patch add break_lease at may_delete, which can reclaim delegations.
>
> ---
> fs/namei.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/fs/namei.c b/fs/namei.c
> index 16df727..17bafc1 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -1338,7 +1338,7 @@ static int may_delete(struct inode *dir,struct dentry *victim,int isdir)
> return -ENOENT;
> if (victim->d_flags & DCACHE_NFSFS_RENAMED)
> return -EBUSY;
> - return 0;
> + return break_lease(victim->d_inode, FMODE_WRITE);
> }
>
> /* Check whether we can create an object with dentry child in directory

This doesn't look right to me.

The fcntl(2) manpage basically says that leases should be broken if the
file is opened for read or write, or is truncated. unlinks don't seem
to fall into either category...

--
Jeff Layton <[email protected]>