2005-04-05 04:02:05

by NeilBrown

[permalink] [raw]
Subject: Re: Re: corruption over NFS with 2.6 client, locking, truncating and appending...

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Monday April 4, [email protected] wrote:
> ty den 05.04.2005 Klokka 11:39 (+1000) skreiv Neil Brown:
>
> > Your explanation would explain that value of 'off' being wrong as I
> > found with the lockf locking in 2.4.
> > However as the file was open for O_APPEND, there would be an implicit
> > seek to the end on every write. Is this implicit seek expected to do
> > the right thing or would it need fixing too?
>
> The writes should indeed end up being correctly positioned w.r.t. the
> end of the file (but may write the wrong offset - which is what you
> claimed to see). The problem will be the case of the ftruncate()s which
> will end up using stale offsets obtained from the call to lseek(). You
> may therefore end up extending the file instead of truncating it.

I don't think ftruncates on stale offsets can cause this problem.
As the client which calls ftruncate is the only client which ever
causes the file to shrink (the other client only causes it to grow),
stale offsets that it sees may be too short, but will never be too
long.
So it might truncate more than it expects to (and this test doesn't
check for that possibility) but it won't truncate beyond the end of
the file and so create a hole.

>
> Could you please test out the following patch? It should apply to 2.4.x
> too if you substitute generic_file_llseek() for the references to
> remote_llseek().

I tried the patch, and it didn't help noticeably.

(I had to change
> + int retval = nfs_revalidate_inode(file->f_dentry->d_inode);
to
> + int retval = nfs_revalidate_inode(NFS_SERVER(file->f_dentry->d_inode), file->f_dentry->d_inode);

but that might be because I'm actually using -mm for testing).


I poked around in the related code and found that nfs_update_inode,
which is called by nfs_revalidate_inode, won't accept a decrease in
file size if it thinks the file is 'unstable'. So I put in a printk
to tell me when this happened:

if (S_ISREG(inode->i_mode) && data_unstable) {
if (new_isize > cur_isize) {
inode->i_size = new_isize;
invalid |= NFS_INO_INVALID_ATTR|NFS_INO_INVALID_DATA;
}
else printk("may have missed trunc %ld->%ld\n",
(long)cur_isize, (long)new_isize); /* neilb */
} else {
inode->i_size = new_isize;
invalid |= NFS_INO_INVALID_ATTR|NFS_INO_INVALID_DATA;
}

and got some output:

Apr 5 13:40:41 cage kernel: may have missed trunc 2050->2030
Apr 5 13:40:41 cage kernel: may have missed trunc 2060->2020
Apr 5 13:40:41 cage kernel: may have missed trunc 2090->2070
Apr 5 13:40:41 cage kernel: may have missed trunc 2130->2110

and more.
This first message would have resulted in a 20 byte hole at 2030.
The next a 40 byte hole at 2020 (subsuming the previous hole).
The next a 20 byte hole at 2070

The are indeed the first 2 'holes' of nuls that I find when looking
in the resulting file.
(This is using fcntl locking).

I feel were getting closer to atleast one bug...

NeilBrown
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (GNU/Linux)
Comment: Processed by Mailcrypt 3.5.8 <http://mailcrypt.sourceforge.net/>

iD8DBQFCUg21G5fc6gV+Wb0RAps8AKDAV2Wzj+hajLzCBkN1jmx50H3duQCdFrr8
OuGnL8KM1fi/MHB22U3jCX0=
=5jLw
-----END PGP SIGNATURE-----


-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now.
http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2005-04-05 04:54:08

by Trond Myklebust

[permalink] [raw]
Subject: Re: Re: corruption over NFS with 2.6 client, locking, truncating and appending...

ty den 05.04.2005 Klokka 14:01 (+1000) skreiv Neil Brown:
> I don't think ftruncates on stale offsets can cause this problem.
> As the client which calls ftruncate is the only client which ever
> causes the file to shrink (the other client only causes it to grow),
> stale offsets that it sees may be too short, but will never be too
> long.
> So it might truncate more than it expects to (and this test doesn't
> check for that possibility) but it won't truncate beyond the end of
> the file and so create a hole.

OK.


> (I had to change
> > + int retval = nfs_revalidate_inode(file->f_dentry->d_inode);
> to
> > + int retval = nfs_revalidate_inode(NFS_SERVER(file->f_dentry->d_inode), file->f_dentry->d_inode);
>
> but that might be because I'm actually using -mm for testing).

Sorry. I didn't have time for a compile test...

> I poked around in the related code and found that nfs_update_inode,
> which is called by nfs_revalidate_inode, won't accept a decrease in
> file size if it thinks the file is 'unstable'. So I put in a printk
> to tell me when this happened:
>
> if (S_ISREG(inode->i_mode) && data_unstable) {
> if (new_isize > cur_isize) {
> inode->i_size = new_isize;
> invalid |= NFS_INO_INVALID_ATTR|NFS_INO_INVALID_DATA;
> }
> else printk("may have missed trunc %ld->%ld\n",
> (long)cur_isize, (long)new_isize); /* neilb */
> } else {
> inode->i_size = new_isize;
> invalid |= NFS_INO_INVALID_ATTR|NFS_INO_INVALID_DATA;
> }
>
> and got some output:
>
> Apr 5 13:40:41 cage kernel: may have missed trunc 2050->2030
> Apr 5 13:40:41 cage kernel: may have missed trunc 2060->2020
> Apr 5 13:40:41 cage kernel: may have missed trunc 2090->2070
> Apr 5 13:40:41 cage kernel: may have missed trunc 2130->2110

That is indeed a problem. The current code basically assumes
close-to-open semantics, and that of course breaks down in the case of
locking.
OK. I'm going to have to give that some thought...

BTW: The 2.4 code should be very different in this respect: it has no
equivalent to the "unstable" test. Are you seeing the same problem
there?

Cheers,
Trond
--
Trond Myklebust <[email protected]>



-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now.
http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs