While developing a new product that uses NFS here at EMC, we've run into
issues with cache coherency. Our server uses ext3, and therefore can
only provide timestamps with a granularity of 1 second. We have seen
cases where a directory is modified on the server, then a non-existent
filename within that directory is accessed from the client (returning
an error), and then the directory entry (either subdirectory or file)
gets added on the server, all within the same second. This leads to
a cached negative dentry on the client, which results in erroneously
returning "file not found" to all future lstat()s and open()s, as the
timestamp on the parent directory has not been modified.
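To make the sequence above concrete, here is a minimal sketch of the race in Python. It is a toy model, not the NFS client code: the Server and Client classes, their method names, and the idea of reading server.mtime directly (standing in for a GETATTR) are all assumptions made for illustration.

```python
class Server:
    """Toy server whose directory mtime has 1-second granularity (as with ext3)."""

    def __init__(self):
        self.entries = set()
        self.mtime = 0

    def add(self, name, now):
        self.entries.add(name)
        self.mtime = int(now)  # the sub-second part of the timestamp is lost

    def lookup(self, name):
        return name in self.entries


class Client:
    """Toy client that trusts a negative entry while the directory mtime is unchanged."""

    def __init__(self, server):
        self.server = server
        self.negative = {}  # name -> directory mtime seen when the lookup failed

    def exists(self, name):
        cached = self.negative.get(name)
        if cached is not None and cached == self.server.mtime:
            return False  # negative entry is considered still valid
        found = self.server.lookup(name)
        if found:
            self.negative.pop(name, None)
        else:
            self.negative[name] = self.server.mtime
        return found


def demo():
    server = Server()
    client = Client(server)
    server.add("a", now=100.1)   # directory is modified on the server
    first = client.exists("b")   # lookup fails and is cached against mtime 100
    server.add("b", now=100.7)   # entry added in the same second: mtime stays 100
    stale = client.exists("b")   # client still answers "not found"
    server.add("c", now=101.2)   # a later change finally moves mtime to 101
    fresh = client.exists("b")   # the negative entry is dropped and "b" is found
    return first, stale, fresh
```

In this model the second lookup of "b" is wrong for as long as nothing else touches the directory, which matches the behaviour we observed.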
Ideally, all NFS servers would provide finer-grained timestamps, so
that all changes to a directory also modify the timestamp. However, it
is a reality that not every server will. Therefore, we are attempting to
tackle this problem with the following three patches.
First, if the nlinks count changes, invalidate the cached data for the
directory. This is of limited use, as it only catches the cases where
the number of subdirectories has changed. However, because there is no
real downside to checking the nlinks count, this check can be
universally enabled.
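As a rough illustration of why the nlinks check only helps with subdirectories, consider the following Python sketch. It assumes the usual convention (true of ext3) that a directory's link count is 2 plus the number of its subdirectories; DirAttrs, dir_attrs and cache_is_stale are hypothetical names, not functions from the patch.

```python
from collections import namedtuple

DirAttrs = namedtuple("DirAttrs", ["mtime", "nlink"])


def dir_attrs(mtime, subdirs):
    # Assumption: link count is 2 plus the number of subdirectories,
    # and the timestamp is truncated to whole seconds.
    return DirAttrs(mtime=int(mtime), nlink=2 + subdirs)


def cache_is_stale(cached, current):
    # Invalidate when either the timestamp or the link count differs.
    return cached.mtime != current.mtime or cached.nlink != current.nlink


before = dir_attrs(100.1, subdirs=0)
after_mkdir = dir_attrs(100.7, subdirs=1)    # subdirectory added in the same second
after_create = dir_attrs(100.7, subdirs=0)   # regular file added in the same second
```

Here cache_is_stale(before, after_mkdir) is true because the link count moved from 2 to 3, while cache_is_stale(before, after_create) is false: adding a regular file changes neither field.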
Secondly, a modification to the kernel has been made to disable the
caching of negative dentries when the NFS_MOUNT_NONEGDE flag is set.
Note that this does nothing to address outdated information being
returned by readdir(). However, it does allow a file to be open()ed or
lstat()ed if the filename is known. Because this has a potential
performance impact, it is left as an option for the user to select.
Lastly, changes have been made to userspace to accept the "nonegde"
mount option and set the NFS_MOUNT_NONEGDE flag.
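The effect of the flag, and its cost, can be sketched in Python as follows. This is only a model of the idea: the flag value, the LookupCache class and its server_calls counter are made up for the example, and revalidation against the directory timestamp is deliberately left out.

```python
NFS_MOUNT_NONEGDE = 0x1  # placeholder value for this sketch, not the real flag bit


class LookupCache:
    """Toy lookup cache; 'backend' is a callable mapping a name to True/False."""

    def __init__(self, backend, flags=0):
        self.backend = backend
        self.flags = flags
        self.negative = set()
        self.server_calls = 0

    def lookup(self, name):
        if name in self.negative:
            return False          # answered from the negative cache
        self.server_calls += 1    # otherwise ask the server
        found = self.backend(name)
        if not found and not (self.flags & NFS_MOUNT_NONEGDE):
            self.negative.add(name)
        return found


def demo():
    files = set()
    default = LookupCache(files.__contains__)
    nonegde = LookupCache(files.__contains__, NFS_MOUNT_NONEGDE)
    default.lookup("b")
    nonegde.lookup("b")
    files.add("b")  # the file appears on the server
    return (default.lookup("b"), default.server_calls,
            nonegde.lookup("b"), nonegde.server_calls)
```

With the flag set, the second lookup sees the new file but costs an extra round trip to the server; without it, the stale answer is returned for free. That is the trade-off the option leaves to the user.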
Your review and comments are greatly appreciated. I believe this is
a fairly basic patch, and it has worked well in testing so far on SLES
10 (with slight modifications to match the software versions in SLES
10). Most of this patch is fairly trivial; my only open question is
whether "nonegde" is a good name (I myself sometimes read it as
"non-edge", but have yet to come up with anything better).
--
Bob Bell
On Tue, Jan 15, 2008 at 11:27:01AM -0500, Bob Bell wrote:
>Secondly, a modification to the kernel has been made to disable the
>caching of negative dentries when the NFS_MOUNT_NONEGDE flag is set.
>Note that this does nothing to address outdated information being
>returned by readdir(). However, it does allow a file to be open()ed or
>lstat()ed if the filename is known. Because this has a potential
>performance impact, it is left as an option for the user to select.
Someone pointed me to this page on NFSv4.1 directory delegations:
http://wiki.linux-nfs.org/wiki/index.php/NFSv4.1_Directory_Delegations
I found the following quotes intriguing:
"Even though the directories along the path are cached, [without
directory delegations] negative dentry caching is not allowed because it
potentially violates close-to-open consistency semantics."
"Close-to-open consistency currently requires that even in a case where
previous LOOKUPs or OPENs for a given file have recently and repeatedly
failed, subsequent LOOKUPs and OPENs must nevertheless be sent to the
server (i.e., negative caching provides no benefit in those cases)."
I admittedly know little about what is being discussed here. However,
is it possible that the Linux NFS client should *NEVER* cache negative
entries? (well, maybe with the exception of "nocto")
I'm interested if anyone has thoughts on the matter, though I don't know
enough to argue either side, so here's a really easy chance to win an
argument. :)
--
Bob Bell
On Tue, 2008-01-15 at 20:55 -0500, Bob Bell wrote:
> On Tue, Jan 15, 2008 at 11:27:01AM -0500, Bob Bell wrote:
> >Secondly, a modification to the kernel has been made to disable the
> >caching of negative dentries when the NFS_MOUNT_NONEGDE flag is set.
> >Note that this does nothing to address outdated information being
> >returned by readdir(). However, it does allow a file to be open()ed or
> >lstat()ed if the filename is known. Because this has a potential
> >performance impact, it is left as an option for the user to select.
>
> Someone pointed me to this page on NFSv4.1 directory delegations:
> http://wiki.linux-nfs.org/wiki/index.php/NFSv4.1_Directory_Delegations
>
> I found the following quotes intriguing:
>
> "Even though the directories along the path are cached, [without
> directory delegations] negative dentry caching is not allowed because it
> potentially violates close-to-open consistency semantics."
Negative dentry caching has nothing whatsoever to do with close-to-open
semantics.
> "Close-to-open consistency currently requires that even in a case where
> previous LOOKUPs or OPENs for a given file have recently and repeatedly
> failed, subsequent LOOKUPs and OPENs must nevertheless be sent to the
> server (i.e., negative caching provides no benefit in those cases)."
Nope. All close-to-open cache consistency requires is that the client
revalidate the file/directory cached data upon an open()/opendir() call
by checking the inode mtime and/or change attribute (NFSv4 only).
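A rough Python sketch of that revalidation step, for illustration only. The dictionaries stand in for the cached inode and for the attributes a GETATTR would return; revalidate_on_open and the key names are assumptions, not kernel interfaces.

```python
def revalidate_on_open(cache, server_attrs):
    """Return True if the cached data may still be used after open()/opendir().

    Both arguments are dicts with an "mtime" key and an optional "change"
    key (modelling the NFSv4 change attribute).
    """
    same = cache["mtime"] == server_attrs["mtime"]
    if "change" in cache and "change" in server_attrs:
        same = same and cache["change"] == server_attrs["change"]
    if not same:
        # Attributes differ: drop the cached data and remember the new values.
        cache["data"] = None
        cache["mtime"] = server_attrs["mtime"]
        if "change" in server_attrs:
            cache["change"] = server_attrs["change"]
    return same
```

Nothing in this check depends on whether an earlier lookup of some name failed; it only compares attributes of the object being opened.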
I think we need to revisit that wiki entry...
Trond
On Tue, 15 Jan 2008, Trond Myklebust wrote:
>
> On Tue, 2008-01-15 at 20:55 -0500, Bob Bell wrote:
> > On Tue, Jan 15, 2008 at 11:27:01AM -0500, Bob Bell wrote:
> > >Secondly, a modification to the kernel has been made to disable the
> > >caching of negative dentries when the NFS_MOUNT_NONEGDE flag is set.
> > >Note that this does nothing to address outdated information being
> > >returned by readdir(). However, it does allow a file to be open()ed or
> > >lstat()ed if the filename is known. Because this has a potential
> > >performance impact, it is left as an option for the user to select.
> >
> > Someone pointed me to this page on NFSv4.1 directory delegations:
> > http://wiki.linux-nfs.org/wiki/index.php/NFSv4.1_Directory_Delegations
> >
> > I found the following quotes intriguing:
> >
> > "Even though the directories along the path are cached, [without
> > directory delegations] negative dentry caching is not allowed because it
> > potentially violates close-to-open consistency semantics."
>
> Negative dentry caching has nothing whatsoever to do with close-to-open
> semantics.
>
> > "Close-to-open consistency currently requires that even in a case where
> > previous LOOKUPs or OPENs for a given file have recently and repeatedly
> > failed, subsequent LOOKUPs and OPENs must nevertheless be sent to the
> > server (i.e., negative caching provides no benefit in those cases)."
>
> Nope. All close-to-open cache consistency requires is that the client
> revalidate the file/directory cached data upon an open()/opendir() call
> by checking the inode mtime and/or change attribute (NFSv4 only).
>
> I think we need to revisit that wiki entry...
so then, would it instead be correct to say that the negative
dentry caching on the client afforded by the delegation is beneficial
insofar as it obviates the client's need to revalidate the file/directory
in question? if so, i understand the flawed CTO wording and will fix the
wiki.
thanks,
d
.
>
> Trond
On Wed, 2008-01-16 at 12:16 -0500, david m. richter wrote:
> so then, would it instead be correct to say that the negative
> dentry caching on the client afforded by the delegation is beneficial
> insofar as it obviates the client's need to revalidate the file/directory
> in question? if so, i understand the flawed CTO wording and will fix the
> wiki.
Delegations give you a guarantee that the directory contents (i.e. the
readdir() information) have not changed, and so the client no longer
needs to poll the directory for change information.
IOW: specifically they allow the client to optimise away the GETATTR
call in opendir(), and they allow it to optimise away most of
nfs_lookup_revalidate().
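As a sketch of the opendir() half of that, in Python and under heavy simplification: DirCache, has_delegation and getattr_calls are hypothetical names, and the server is modelled as a plain dict rather than anything resembling the real client.

```python
class DirCache:
    """Toy directory cache that counts how often it has to ask for attributes."""

    def __init__(self):
        self.has_delegation = False
        self.getattr_calls = 0
        self.cached_mtime = None
        self.entries = None

    def opendir(self, server):
        # With a delegation the cached contents are guaranteed current,
        # so the attribute check can be skipped entirely.
        if self.has_delegation and self.entries is not None:
            return self.entries
        self.getattr_calls += 1           # models the GETATTR round trip
        mtime = server["mtime"]
        if self.entries is None or mtime != self.cached_mtime:
            self.entries = list(server["entries"])
            self.cached_mtime = mtime
        return self.entries
```

Without a delegation every opendir() pays for one attribute check; once has_delegation is set, repeated calls are served from the cache with no further checks.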
Cheers
Trond
On Wed, 16 Jan 2008, Trond Myklebust wrote:
>
> On Wed, 2008-01-16 at 12:16 -0500, david m. richter wrote:
> > so then, would it instead be correct to say that the negative
> > dentry caching on the client afforded by the delegation is beneficial
> > insofar as it obviates the client's need to revalidate the file/directory
> > in question? if so, i understand the flawed CTO wording and will fix the
> > wiki.
>
> Delegations give you a guarantee that the directory contents (i.e. the
> readdir() information) have not changed, and so the client no longer
> needs to poll the directory for change information.
>
> IOW: specifically they allow the client to optimise away the GETATTR
> call in opendir(), and they allow it to optimise away most of
> nfs_lookup_revalidate().
>
> Cheers
> Trond
good, i'm on the same page as you here. thanks, trond,
d
.