2005-10-11 09:09:30

by Leif Nixon

[permalink] [raw]
Subject: Cache invalidation bug in NFS v3 - trivially reproducible

Hi,

We have come across a bug where a NFS v3 client fails to invalidate
its data cache for a file even though it realizes that the file
attributes have changed. We have been able to recreate the bug on a
range of kernel versions and different underlying file systems.

Here's a minimal way to reproduce the error (there seems to be some
timing issues involved, but this has worked at least 90% of the time):

NFS client n1 NFS client n2

$ echo 1 > f
$ cat f
1
$ touch .
$ echo 2 > f
$ touch f
$ cat f
1

Now client n2 is stuck in a state where it uses its old cached data
forever (or at least for several hours):

NFS client n1 NFS client n2

$ cat f
2
$ cat f
1

However, "stat f" gives the same output on both clients. "touch f" on
either machine corrects the situation; n2 invalidates its data cache.

Interestingly, the second write to the file ("echo 2 > f") must
not change the size of the file. If you do "echo foo > f" instead,
the erroneous behaviour isn't triggered.

We have seen this on a range of kernels between 2.6.9 and 2.6.13.2 on
Debian, CentOS, RHEL, Fedora and vanilla kernel.org, on both clients
and server. We have *not* been able to reproduce the bug with Linux
clients and a Solaris server, neither with Solaris clients and a Linux
server. Underlying file systems have been ext3 and xfs (and Solaris
ufs). We have tried varying mount options, but to no avail; the bug
persists, even with "noac".


Hypothesis:

When n2 does "touch f" and wants to do SETATTR, it first has to do a
LOOKUP (because n1 has updated the attributes on cwd with "touch .").
It seems that when n2 receives the updated attributes for f as a part
of the LOOKUP reply, it updates its attribute cache without
invalidating its data cache, leading to the anomalous situation.

If the "touch ." is omitted, n2 receives the updated file attributes
via an explicit GETATTR on f, and then everything works properly.

--
Leif Nixon - Systems expert
------------------------------------------------------------
National Supercomputer Centre - Linkoping University
------------------------------------------------------------


2005-10-11 09:22:34

by Jakob Oestergaard

[permalink] [raw]
Subject: Re: Cache invalidation bug in NFS v3 - trivially reproducible

On Tue, Oct 11, 2005 at 11:09:27AM +0200, Leif Nixon wrote:
...
>
> Now client n2 is stuck in a state where it uses its old cached data
> forever (or at least for several hours):
>
> NFS client n1 NFS client n2
>
> $ cat f
> 2
> $ cat f
> 1

I can confirm this on NFSv3 UDP export from patched 2.6.11.11 server
(dual opteron 64-bit kernel) to two different SMP (32-bit x86) clients
with 2.6.12.4 and 2.6.11.11 kernels.

There are definitely timing issues - in order to reproduce the problem I
had to use "touch . ; echo 2 > r" and "touch r; cat r" on the clients -
I couldn't type the commands quickly enough one-by-one.

But right now I have:

[phoenix:joe] $ cat r
2

[raven:joe] $ cat r
1

Beautiful :)

--

/ jakob

2005-10-11 14:48:12

by Trond Myklebust

[permalink] [raw]
Subject: Re: Cache invalidation bug in NFS v3 - trivially reproducible

ty den 11.10.2005 Klokka 11:09 (+0200) skreiv Leif Nixon:
> Hi,
>
> We have come across a bug where a NFS v3 client fails to invalidate
> its data cache for a file even though it realizes that the file
> attributes have changed. We have been able to recreate the bug on a
> range of kernel versions and different underlying file systems.
>
> Here's a minimal way to reproduce the error (there seems to be some
> timing issues involved, but this has worked at least 90% of the time):
>
> NFS client n1 NFS client n2
>
> $ echo 1 > f
> $ cat f
> 1
> $ touch .
> $ echo 2 > f
> $ touch f
> $ cat f
> 1
>
> Now client n2 is stuck in a state where it uses its old cached data
> forever (or at least for several hours):

Yep. I can see a problem whereby the cache is "losing" consistency
information when you do this sort of thing. I'm working on a fix.

Cheers,
Trond

2005-10-12 04:31:27

by Willy Tarreau

[permalink] [raw]
Subject: Re: Cache invalidation bug in NFS v3 - trivially reproducible

Hi,

On Tue, Oct 11, 2005 at 10:47:57AM -0400, Trond Myklebust wrote:
> ty den 11.10.2005 Klokka 11:09 (+0200) skreiv Leif Nixon:
> > Hi,
> >
> > We have come across a bug where a NFS v3 client fails to invalidate
> > its data cache for a file even though it realizes that the file
> > attributes have changed. We have been able to recreate the bug on a
> > range of kernel versions and different underlying file systems.
> >
> > Here's a minimal way to reproduce the error (there seems to be some
> > timing issues involved, but this has worked at least 90% of the time):
> >
> > NFS client n1 NFS client n2
> >
> > $ echo 1 > f
> > $ cat f
> > 1
> > $ touch .
> > $ echo 2 > f
> > $ touch f
> > $ cat f
> > 1
> >
> > Now client n2 is stuck in a state where it uses its old cached data
> > forever (or at least for several hours):
>
> Yep. I can see a problem whereby the cache is "losing" consistency
> information when you do this sort of thing. I'm working on a fix.

Trond, if this can help you, I *cannot* reproduce this with 2.4 clients
and server.

Cheers,
Willy

2005-10-13 22:34:30

by Trond Myklebust

[permalink] [raw]
Subject: Re: Cache invalidation bug in NFS v3 - trivially reproducible

NFS: Fix cache consistency races

If the data cache has been marked as potentially invalid by nfs_refresh_inode,
we should invalidate it rather than assume that changes are due to our own
activity.

Also ensure that we always start with a valid cache before declaring it
to be protected by a delegation.

Signed-off-by: Trond Myklebust <[email protected]>
---
delegation.c | 4 ++++
file.c | 3 ++-
inode.c | 7 ++-----
3 files changed, 8 insertions(+), 6 deletions(-)

Index: linux-2.6.14-rc4/fs/nfs/delegation.c
===================================================================
--- linux-2.6.14-rc4.orig/fs/nfs/delegation.c
+++ linux-2.6.14-rc4/fs/nfs/delegation.c
@@ -85,6 +85,10 @@ int nfs_inode_set_delegation(struct inod
struct nfs_delegation *delegation;
int status = 0;

+ /* Ensure we first revalidate the attributes and page cache! */
+ if ((nfsi->cache_validity & (NFS_INO_REVAL_PAGECACHE|NFS_INO_INVALID_ATTR)))
+ __nfs_revalidate_inode(NFS_SERVER(inode), inode);
+
delegation = nfs_alloc_delegation();
if (delegation == NULL)
return -ENOMEM;
Index: linux-2.6.14-rc4/fs/nfs/file.c
===================================================================
--- linux-2.6.14-rc4.orig/fs/nfs/file.c
+++ linux-2.6.14-rc4/fs/nfs/file.c
@@ -137,7 +137,8 @@ static int nfs_revalidate_file(struct in
struct nfs_inode *nfsi = NFS_I(inode);
int retval = 0;

- if ((nfsi->cache_validity & NFS_INO_REVAL_PAGECACHE) || nfs_attribute_timeout(inode))
+ if ((nfsi->cache_validity & (NFS_INO_REVAL_PAGECACHE|NFS_INO_INVALID_ATTR))
+ || nfs_attribute_timeout(inode))
retval = __nfs_revalidate_inode(NFS_SERVER(inode), inode);
nfs_revalidate_mapping(inode, filp->f_mapping);
return 0;
Index: linux-2.6.14-rc4/fs/nfs/inode.c
===================================================================
--- linux-2.6.14-rc4.orig/fs/nfs/inode.c
+++ linux-2.6.14-rc4/fs/nfs/inode.c
@@ -1226,10 +1226,6 @@ int nfs_refresh_inode(struct inode *inod
loff_t cur_size, new_isize;
int data_unstable;

- /* Do we hold a delegation? */
- if (nfs_have_delegation(inode, FMODE_READ))
- return 0;
-
spin_lock(&inode->i_lock);

/* Are we in the process of updating data on the server? */
@@ -1350,7 +1346,8 @@ static int nfs_update_inode(struct inode
nfsi->read_cache_jiffies = fattr->timestamp;

/* Are we racing with known updates of the metadata on the server? */
- data_unstable = ! nfs_verify_change_attribute(inode, verifier);
+ data_unstable = ! (nfs_verify_change_attribute(inode, verifier) ||
+ (nfsi->cache_validity & NFS_INO_REVAL_PAGECACHE));

/* Check if our cached file size is stale */
new_isize = nfs_size_to_loff_t(fattr->size);


Attachments:
linux-2.6.14-01-fix_attrcache.dif (2.63 kB)

2005-10-14 13:34:33

by Leif Nixon

[permalink] [raw]
Subject: Re: Cache invalidation bug in NFS v3 - trivially reproducible

Trond Myklebust <[email protected]> writes:

> Does the attached patch help?

Yes, it does indeed seem to solve the problem.

Thanks!

--
Leif Nixon - Systems expert
------------------------------------------------------------
National Supercomputer Centre - Linkoping University
------------------------------------------------------------