2005-03-12 11:56:14

by Junfeng Yang

Subject: [CHECKER] inconsistent NFS stat cache (NFS on ext3, 2.6.11)


Hi,

We checked NFS on top of ext3 using FiSC (our file system model checker)
and found a case where NFS stat cache can contain inconsistent entries.

Basically, to trigger this inconsistency, just do the following steps:
1. create a file A1 and write a few bytes to it, so A1 is 4 words
2. create a hard link A2 pointing to A1
3. stat A2; its size is 4 words
4. truncate A1 to a larger size and write a few bytes at the end, so it is
   now 1031 words
5. stat A2 again; its size is still 4 words, but it should be 1031 words
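
For illustration only, the five steps can be sketched in Python as below. This is not the crash.c test case. Assumptions: the helper name and the directory names d1 and d2 are hypothetical, sizes are counted in bytes (the word size is not stated here), and the two links are placed in different directories, as mentioned later in the thread. Run against a local directory the two sizes agree; the inconsistency would only be expected if `root` were an NFS mount exported with subtree checking.

```python
import os
import tempfile


def stat_sizes_via_hard_link(root, small=4, large=1031):
    """Run the five steps under `root`; return the sizes seen through A2."""
    dir1 = os.path.join(root, "d1")  # hypothetical directory names
    dir2 = os.path.join(root, "d2")
    os.makedirs(dir1)
    os.makedirs(dir2)
    a1 = os.path.join(dir1, "A1")
    a2 = os.path.join(dir2, "A2")

    # 1. create A1 and write a few bytes to it
    with open(a1, "wb") as f:
        f.write(b"x" * small)
    # 2. create a hard link A2 pointing to A1
    os.link(a1, a2)
    # 3. stat A2
    before = os.stat(a2).st_size
    # 4. truncate A1 to a larger size, then write a few bytes at the end
    tail = b"end"
    os.truncate(a1, large - len(tail))
    with open(a1, "r+b") as f:
        f.seek(0, os.SEEK_END)
        f.write(tail)
    # 5. stat A2 again; a consistent view reports the new size
    after = os.stat(a2).st_size
    return before, after


if __name__ == "__main__":
    # A local temporary directory, not an NFS mount
    with tempfile.TemporaryDirectory() as root:
        print(stat_sizes_via_hard_link(root))
```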

We have a test case to re-create this warning. You can download it at
http://fisc.stanford.edu/bug16/crash.c. It includes some sudo commands
to mount nfs partitions, which you might want to change according to your
local settings.

cat /etc/exports shows:
/mnt/sbd0-export localhost(rw,sync)
/mnt/sbd1-export localhost(rw,sync)

Let me know if you have any problems reproducing the warning. We'd
appreciate any confirmations/clarifications.

-Junfeng



-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now.
http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click


2005-03-13 05:04:27

by Trond Myklebust

Subject: Re: [CHECKER] inconsistent NFS stat cache (NFS on ext3, 2.6.11)

On Sat, 12.03.2005 at 03:56 (-0800), Junfeng Yang wrote:
> Hi,
>
> We checked NFS on top of ext3 using FiSC (our file system model checker)
> and found a case where NFS stat cache can contain inconsistent entries.
>
> Basically, to trigger this inconsistency, just do the following steps:
> 1. create a file A1 and write a few bytes to it, so A1 is 4 words
> 2. create a hard link A2 pointing to A1
> 3. stat A2; its size is 4 words
> 4. truncate A1 to a larger size and write a few bytes at the end, so it is
>    now 1031 words
> 5. stat A2 again; its size is still 4 words, but it should be 1031 words
>
> We have a test case to re-create this warning. You can download it at
> http://fisc.stanford.edu/bug16/crash.c. It includes some sudo commands
> to mount nfs partitions, which you might want to change according to your
> local settings.
>
> cat /etc/exports shows:
> /mnt/sbd0-export localhost(rw,sync)
> /mnt/sbd1-export localhost(rw,sync)
>
> Let me know if you have any problems reproducing the warning. We'd
> appreciate any confirmations/clarifications.
>

This is a known problem. Turn off the (default - grrr) subtree checking
export option on the server, and it will all work properly. The subtree
checking option violates the NFS standards for filehandle generation in
so many ways, that it isn't even funny.

Cheers,
Trond

--
Trond Myklebust <[email protected]>



_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2005-03-13 06:16:34

by Junfeng Yang

Subject: Re: [CHECKER] inconsistent NFS stat cache (NFS on ext3, 2.6.11)

> This is a known problem. Turn off the (default - grrr) subtree checking
> export option on the server, and it will all work properly. The subtree
> checking option violates the NFS standards for filehandle generation in
> so many ways, that it isn't even funny.

Thanks Trond. no_subtree_check fixes the problem.
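
For reference, the export entries from the original report would presumably look like this with the option added. The paths, host, and other options are taken from the /etc/exports listing earlier in the thread; the exact lines used for the retest are not shown, so this is an assumption.

```
/mnt/sbd0-export localhost(rw,sync,no_subtree_check)
/mnt/sbd1-export localhost(rw,sync,no_subtree_check)
```

The server typically needs to re-read the export table (for example with `exportfs -ra`) before the change takes effect.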

-Junfeng




2005-03-13 15:18:50

by Trond Myklebust

Subject: Re: [CHECKER] inconsistent NFS stat cache (NFS on ext3, 2.6.11)

On Sat, 12.03.2005 at 23:04 (-0800), Junfeng Yang wrote:
> > This is a known problem. Turn off the (default - grrr) subtree checking
> > export option on the server, and it will all work properly. The subtree
> > checking option violates the NFS standards for filehandle generation in
> > so many ways, that it isn't even funny.
>
> Hi Trond,
>
> Turning off this option in /etc/exports does fix the inconsistency. However,
> I looked through the exports man page, and it seems that subtree checking will
> cause problems only in the following case: "accessing files that are
> renamed while a client has them open". My test case is *not* doing this.
> It does rename files, but never while a file is open. Can you please
> clarify?

The subtree checking option causes knfsd to store the inode number of
the parent directory in the filehandle. Your case involves two hard
links that reside in different directories, and so the filehandles are
different.
In theory, the client could use the fileid in order to decide that these
2 filehandles point to the same file, but doing so will still fail to
deal with the problem of renamed open files. It also adds a lot of
complexity, since fileids have a lot of nasty properties (not least
being the fact that they may be reused after the original file was
deleted).
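
The effect described above can be illustrated with a toy model. This is a sketch under loud assumptions: `FileHandle`, `make_handle`, `AttrCache`, and the inode numbers are all hypothetical, the handle layout is not knfsd's real one, and a real client cache also revalidates attributes over time. It only shows why a cache keyed by filehandle treats the two hard links as unrelated once the parent directory inode is part of the handle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileHandle:
    fileid: int          # inode number of the file itself
    parent_ino: int = 0  # parent directory inode; only set with subtree checking


def make_handle(fileid, parent_ino, subtree_check):
    """Toy stand-in for server-side filehandle generation (not knfsd's layout)."""
    return FileHandle(fileid, parent_ino if subtree_check else 0)


class AttrCache:
    """Toy client attribute cache keyed by filehandle."""

    def __init__(self):
        self._sizes = {}

    def update(self, handle, size):
        self._sizes[handle] = size

    def cached_size(self, handle):
        return self._sizes.get(handle)


def size_seen_via_a2(subtree_check):
    # Same file (fileid 42) reached through links in two different directories
    a1 = make_handle(fileid=42, parent_ino=100, subtree_check=subtree_check)
    a2 = make_handle(fileid=42, parent_ino=200, subtree_check=subtree_check)
    cache = AttrCache()
    cache.update(a2, 4)     # stat on A2 caches the old size
    cache.update(a1, 1031)  # the truncate/write through A1 refreshes A1's entry
    return cache.cached_size(a2)


if __name__ == "__main__":
    print(size_seen_via_a2(subtree_check=True))   # handles differ, A2's entry is stale
    print(size_seen_via_a2(subtree_check=False))  # one handle, one cache entry
```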

Cheers,
Trond
--
Trond Myklebust <[email protected]>




2005-03-13 20:04:12

by Daniel Jacobowitz

Subject: Re: [CHECKER] inconsistent NFS stat cache (NFS on ext3, 2.6.11)

On Sun, Mar 13, 2005 at 12:04:27AM -0500, Trond Myklebust wrote:
> On Sat, 12.03.2005 at 03:56 (-0800), Junfeng Yang wrote:
> > Hi,
> >
> > We checked NFS on top of ext3 using FiSC (our file system model checker)
> > and found a case where NFS stat cache can contain inconsistent entries.
> >
> > Basically, to trigger this inconsistency, just do the following steps:
> > 1. create a file A1 and write a few bytes to it, so A1 is 4 words
> > 2. create a hard link A2 pointing to A1
> > 3. stat A2; its size is 4 words
> > 4. truncate A1 to a larger size and write a few bytes at the end, so it is
> >    now 1031 words
> > 5. stat A2 again; its size is still 4 words, but it should be 1031 words
> >
> > We have a test case to re-create this warning. You can download it at
> > http://fisc.stanford.edu/bug16/crash.c. It includes some sudo commands
> > to mount nfs partitions, which you might want to change according to your
> > local settings.
> >
> > cat /etc/exports shows:
> > /mnt/sbd0-export localhost(rw,sync)
> > /mnt/sbd1-export localhost(rw,sync)
> >
> > Let me know if you have any problems reproducing the warning. We'd
> > appreciate any confirmations/clarifications.
> >
>
> This is a known problem. Turn off the (default - grrr) subtree checking
> export option on the server, and it will all work properly. The subtree
> checking option violates the NFS standards for filehandle generation in
> so many ways, that it isn't even funny.

I can't find any documentation about this, but it seems like the same
problem that has been causing me headaches lately; when I replace glibc
from the server side of an nfsroot, the client has a couple of
variously wrong reads before it sees the new files. If it breaks NFS
so badly, why is it the default for the Linux NFS server?

--
Daniel Jacobowitz
CodeSourcery, LLC



2005-03-13 20:42:29

by Trond Myklebust

Subject: Re: [CHECKER] inconsistent NFS stat cache (NFS on ext3, 2.6.11)

On Sun, 13.03.2005 at 15:04 (-0500), Daniel Jacobowitz wrote:

> I can't find any documentation about this, but it seems like the same
> problem that has been causing me headaches lately; when I replace glibc
> from the server side of an nfsroot, the client has a couple of
> variously wrong reads before it sees the new files. If it breaks NFS
> so badly, why is it the default for the Linux NFS server?

No, that's a very different issue: you are violating the NFS cache
consistency rules if you are changing a file that is being held open by
other machines.
The correct way to do the above is to use GNU install with the '-b'
option: that will rename the version of glibc that is in use, and then
install the new glibc in a different inode.
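
The rename-then-install idea can be sketched as follows. This is an illustration of the pattern, not of what GNU install does internally; `install_with_backup`, the `.tmp` name, and the file name `libdemo.so` are hypothetical, and the `~` suffix is assumed as the backup suffix. The point is that a reader who already has the old file open keeps seeing the old inode, while the new content lands in a different inode.

```python
import os
import tempfile


def install_with_backup(new_bytes, dest, suffix="~"):
    """Rename any existing `dest` to a backup name, then install new content."""
    if os.path.exists(dest):
        # The old inode stays intact under the backup name
        os.rename(dest, dest + suffix)
    tmp = dest + ".tmp"  # hypothetical temporary name
    with open(tmp, "wb") as f:
        f.write(new_bytes)
    # The new content appears at `dest` in a different inode
    os.rename(tmp, dest)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        lib = os.path.join(d, "libdemo.so")  # stand-in for the library file
        with open(lib, "wb") as f:
            f.write(b"old")
        old_ino = os.stat(lib).st_ino
        with open(lib, "rb") as reader:  # simulates a process holding the file open
            install_with_backup(b"new", lib)
            print(reader.read())  # the open handle still refers to the old inode
        print(os.stat(lib).st_ino != old_ino)
```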

Cheers,
Trond
--
Trond Myklebust <[email protected]>




2005-03-13 20:46:34

by Trond Myklebust

Subject: Re: [CHECKER] inconsistent NFS stat cache (NFS on ext3, 2.6.11)

On Sun, 13.03.2005 at 15:42 (-0500), Trond Myklebust wrote:

> No, that's a very different issue: you are violating the NFS cache
> consistency rules if you are changing a file that is being held open by
> other machines.
> The correct way to do the above is to use GNU install with the '-b'
> option: that will rename the version of glibc that is in use, and then
> install the new glibc in a different inode.

BTW: there is a more complete description of the NFS cache consistency
model in the NFS FAQ:

http://nfs.sourceforge.net/index.cel.php#faq_a8

Cheers,
Trond

--
Trond Myklebust <[email protected]>




2005-03-13 07:04:49

by Junfeng Yang

Subject: Re: [CHECKER] inconsistent NFS stat cache (NFS on ext3, 2.6.11)

> This is a known problem. Turn off the (default - grrr) subtree checking
> export option on the server, and it will all work properly. The subtree
> checking option violates the NFS standards for filehandle generation in
> so many ways, that it isn't even funny.

Hi Trond,

Turning off this option in /etc/exports does fix the inconsistency. However,
I looked through the exports man page, and it seems that subtree checking will
cause problems only in the following case: "accessing files that are
renamed while a client has them open". My test case is *not* doing this.
It does rename files, but never while a file is open. Can you please
clarify?

Thanks,
-Junfeng


