2005-07-11 10:46:14

by Olaf Kirch

[permalink] [raw]
Subject: [PATCH NFSv2] Drop inode after rename

With recent kernels, we were seeing frequent corruption of files on NFSv2
immediately after creation, often containing pieces of of frequently
written files such as .viminfo and the like.

The problem turns out to be caused by rename. The applications
encountering this problem were all rewriting files by first creating a
temporary copy, and finally renaming it, replacing the original file.

When doing a rename on a NFSv2 mount on top of an existing file, the
inode of the overwritten file would remain, with all caches intact. If
a new file created subsequently ended up with the same inode number, it
would pick up the stale cache.

The fix is to decrement i_nlink of the overwritten inode, like we
do for unlink, rmdir etc already.

Signed-off-by: [email protected]


Index: linux-2.6.11/fs/nfs/dir.c
===================================================================
--- linux-2.6.11.orig/fs/nfs/dir.c
+++ linux-2.6.11/fs/nfs/dir.c
@@ -1429,8 +1429,15 @@ go_ahead:
shrink_dcache_parent(old_dentry);
}

- if (new_inode)
+ /* d_delete will currently not do a dentry_iput
+ * because new_dentry->d_count is at least 2 when
+ * we get here. As a matter of caution
+ * let's grab a reference to the inode anyway.
+ */
+ if (new_inode && igrab(new_inode)) {
+ nfs_begin_data_update(new_inode);
d_delete(new_dentry);
+ }

nfs_begin_data_update(old_dir);
nfs_begin_data_update(new_dir);
@@ -1440,6 +1447,17 @@ go_ahead:
nfs_end_data_update(old_inode);
nfs_end_data_update(new_dir);
nfs_end_data_update(old_dir);
+
+ if (new_inode) {
+ /* If this was the last reference to the inode make
+ * sure the VFS zaps it and all associated caches.
+ */
+ if (error == 0)
+ new_inode->i_nlink--;
+ nfs_end_data_update(new_inode);
+ iput(new_inode);
+ }
+
out:
if (rehash)
d_rehash(rehash);

--
Olaf Kirch | --- o --- Nous sommes du soleil we love when we play
[email protected] | / | \ sol.dhoop.naytheet.ah kin.ir.samse.qurax


-------------------------------------------------------
This SF.Net email is sponsored by the 'Do More With Dual!' webinar happening
July 14 at 8am PDT/11am EDT. We invite you to explore the latest in dual
core and dual graphics technology at this free one hour event hosted by HP,
AMD, and NVIDIA. To register visit http://www.hp.com/go/dualwebinar
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2005-07-11 10:50:48

by Olaf Kirch

[permalink] [raw]
Subject: Re: [PATCH NFSv2] Drop inode after rename

Sorry, I sent an older version of this patch. Here's the one for
2.6.12.

Olaf
--
Olaf Kirch | --- o --- Nous sommes du soleil we love when we play
[email protected] | / | \ sol.dhoop.naytheet.ah kin.ir.samse.qurax


Attachments:
(No filename) (219.00 B)
nfs-rename-zap-inode (1.08 kB)
Download all attachments

2005-07-11 11:06:02

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH NFSv2] Drop inode after rename

m=C3=A5 den 11.07.2005 Klokka 12:46 (+0200) skreiv Olaf Kirch:
> With recent kernels, we were seeing frequent corruption of files on NFSv2
> immediately after creation, often containing pieces of of frequently
> written files such as .viminfo and the like.
>=20
> The problem turns out to be caused by rename. The applications
> encountering this problem were all rewriting files by first creating a
> temporary copy, and finally renaming it, replacing the original file.
>=20
> When doing a rename on a NFSv2 mount on top of an existing file, the
> inode of the overwritten file would remain, with all caches intact. If
> a new file created subsequently ended up with the same inode number, it
> would pick up the stale cache.

This should never happen. The inodes are not indexed on the inode
number, they are indexed on the _filehandle_.

Cheers,
Trond



-------------------------------------------------------
This SF.Net email is sponsored by the 'Do More With Dual!' webinar happening
July 14 at 8am PDT/11am EDT. We invite you to explore the latest in dual
core and dual graphics technology at this free one hour event hosted by HP,
AMD, and NVIDIA. To register visit http://www.hp.com/go/dualwebinar
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2005-07-11 11:08:20

by Olaf Kirch

[permalink] [raw]
Subject: Re: [PATCH NFSv2] Drop inode after rename

On Mon, Jul 11, 2005 at 07:05:42AM -0400, Trond Myklebust wrote:
> > When doing a rename on a NFSv2 mount on top of an existing file, the
> > inode of the overwritten file would remain, with all caches intact. If
> > a new file created subsequently ended up with the same inode number, it
> > would pick up the stale cache.
>
> This should never happen. The inodes are not indexed on the inode
> number, they are indexed on the _filehandle_.

But it does happen. The new file on the NFS server has the exact
same file handle.

Olaf
--
Olaf Kirch | --- o --- Nous sommes du soleil we love when we play
[email protected] | / | \ sol.dhoop.naytheet.ah kin.ir.samse.qurax


-------------------------------------------------------
This SF.Net email is sponsored by the 'Do More With Dual!' webinar happening
July 14 at 8am PDT/11am EDT. We invite you to explore the latest in dual
core and dual graphics technology at this free one hour event hosted by HP,
AMD, and NVIDIA. To register visit http://www.hp.com/go/dualwebinar
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2005-07-11 11:14:40

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH NFSv2] Drop inode after rename

m=C3=A5 den 11.07.2005 Klokka 13:07 (+0200) skreiv Olaf Kirch:
> On Mon, Jul 11, 2005 at 07:05:42AM -0400, Trond Myklebust wrote:
> > > When doing a rename on a NFSv2 mount on top of an existing file, the
> > > inode of the overwritten file would remain, with all caches intact. I=
f
> > > a new file created subsequently ended up with the same inode number, =
it
> > > would pick up the stale cache.
> >=20
> > This should never happen. The inodes are not indexed on the inode
> > number, they are indexed on the _filehandle_.
>=20
> But it does happen. The new file on the NFS server has the exact
> same file handle.

If people care enough about their data to worry about this sort of
problem, then they should care enough to use a server that doesn't
violate the NFS protocol.

Cheers,
Trond



-------------------------------------------------------
This SF.Net email is sponsored by the 'Do More With Dual!' webinar happening
July 14 at 8am PDT/11am EDT. We invite you to explore the latest in dual
core and dual graphics technology at this free one hour event hosted by HP,
AMD, and NVIDIA. To register visit http://www.hp.com/go/dualwebinar
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2005-07-11 11:24:35

by Olaf Kirch

[permalink] [raw]
Subject: Re: [PATCH NFSv2] Drop inode after rename

On Mon, Jul 11, 2005 at 07:14:20AM -0400, Trond Myklebust wrote:
> If people care enough about their data to worry about this sort of
> problem, then they should care enough to use a server that doesn't
> violate the NFS protocol.

unfsd definitely does it this way, unfortunately. I agree that
it's not a very good idea to continue to use unfsd, but...

I would still vote for merging this patch, as it just does
the same nlink accounting that other functions already do,
eg nfs_rmdir, nfs_safe_remove, etc. The latter even has a comment
preceding it that says

* We invalidate the attribute cache and free the inode prior to the operation
* to avoid possible races if the server reuses the inode.

So I think my patch isn't that absurd, is it? :)

Cheers,
Olaf
--
Olaf Kirch | --- o --- Nous sommes du soleil we love when we play
[email protected] | / | \ sol.dhoop.naytheet.ah kin.ir.samse.qurax


-------------------------------------------------------
This SF.Net email is sponsored by the 'Do More With Dual!' webinar happening
July 14 at 8am PDT/11am EDT. We invite you to explore the latest in dual
core and dual graphics technology at this free one hour event hosted by HP,
AMD, and NVIDIA. To register visit http://www.hp.com/go/dualwebinar
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2005-07-11 11:56:33

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH NFSv2] Drop inode after rename

m=C3=A5 den 11.07.2005 Klokka 13:24 (+0200) skreiv Olaf Kirch:

> unfsd definitely does it this way, unfortunately. I agree that
> it's not a very good idea to continue to use unfsd, but...

I still don't have a good idea about what kind of cache consistency
guarantees a client should attempt to offer against a unfsd server.

> I would still vote for merging this patch, as it just does
> the same nlink accounting that other functions already do,
> eg nfs_rmdir, nfs_safe_remove, etc. The latter even has a comment
> preceding it that says
>=20
> * We invalidate the attribute cache and free the inode prior to the opera=
tion
> * to avoid possible races if the server reuses the inode.
>=20
> So I think my patch isn't that absurd, is it? :)

...but it should be unnecessary.

d_delete() is supposed to cause the inode to be dropped: if it cannot
immediately iput the inode, it will drop the dentry. So unless there
exists a hard link or an aliased dentry, the inode should get dropped
and freed pretty rapidly.

The only case I can see that might cause a problem is if someone holds
onto the dropped dentry for long enough to have a lookup() pick up the
inode again. I cannot see how your patch would change matters there.

I also have some trouble seeing why basic checks of mtime and size upon
lookup/open would fail to catch most of the races. AFAICS you should
almost always get a pagecache invalidation due to that.

Cheers,
Trond



-------------------------------------------------------
This SF.Net email is sponsored by the 'Do More With Dual!' webinar happening
July 14 at 8am PDT/11am EDT. We invite you to explore the latest in dual
core and dual graphics technology at this free one hour event hosted by HP,
AMD, and NVIDIA. To register visit http://www.hp.com/go/dualwebinar
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2005-07-11 12:35:02

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH NFSv2] Drop inode after rename

m=C3=A5 den 11.07.2005 Klokka 14:28 (+0200) skreiv Michael Schroeder:

> AFAIR the problem is that i_nlink doesn't get decremented and
> therefore the inode doesn't get deleted.

OK. So how about just a one-liner _before_ the call to d_delete()? We
don't really care whether or not the rename was successful.

Cheers,
Trond



-------------------------------------------------------
This SF.Net email is sponsored by the 'Do More With Dual!' webinar happening
July 14 at 8am PDT/11am EDT. We invite you to explore the latest in dual
core and dual graphics technology at this free one hour event hosted by HP,
AMD, and NVIDIA. To register visit http://www.hp.com/go/dualwebinar
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2005-07-11 12:28:27

by Michael Schroeder

[permalink] [raw]
Subject: Re: [PATCH NFSv2] Drop inode after rename

On Mon, Jul 11, 2005 at 07:56:05AM -0400, Trond Myklebust wrote:
> ...but it should be unnecessary.
>
> d_delete() is supposed to cause the inode to be dropped: if it cannot
> immediately iput the inode, it will drop the dentry. So unless there
> exists a hard link or an aliased dentry, the inode should get dropped
> and freed pretty rapidly.

AFAIR the problem is that i_nlink doesn't get decremented and
therefore the inode doesn't get deleted.

Cheers,
Michael.

--
Michael Schroeder [email protected]
main(_){while(_=~getchar())putchar(~_-1/(~(_|32)/13*2-11)*13);}


-------------------------------------------------------
This SF.Net email is sponsored by the 'Do More With Dual!' webinar happening
July 14 at 8am PDT/11am EDT. We invite you to explore the latest in dual
core and dual graphics technology at this free one hour event hosted by HP,
AMD, and NVIDIA. To register visit http://www.hp.com/go/dualwebinar
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs