2008-04-30 07:00:07

by Clay McClure

Subject: Data corruption using 2.6.18 NFSv3 client -- sparse files?

Hello,

When multiple 2.6.18 NFSv3 clients write to the same file, after one of the
clients has recently read from the file, we see data corruption in the form of
null bytes inserted into the file.

Simple test case:

hosta% echo "line 1" > /nfs/bar.txt

then, in rapid succession:

hostb% cat bar.txt && sleep 2 && echo "line 2 from hostb" >> /nfs/bar.txt
hostc% cat bar.txt && sleep 2 && echo "line 2 from hostc" >> /nfs/bar.txt

Expected result:

/nfs/bar.txt contains:

line 1
line 2 from hostb
line 2 from hostc

Actual result:

/nfs/bar.txt contains:

line 1
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0line 2 from hostc
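
To check a file for this pattern without eyeballing a hex dump, a small
Python helper along these lines may be enough. It is only a sketch: the
function name find_nul_runs and the sample bytes are made up for
illustration, and the sample assumes the hole is exactly as long as the
line hostb appended.

```python
def find_nul_runs(data: bytes):
    """Return a list of (offset, length) pairs, one per run of NUL bytes."""
    runs = []
    start = None
    for i, b in enumerate(data):
        if b == 0:
            if start is None:
                start = i          # a new run of NULs begins here
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:          # run extends to the end of the data
        runs.append((start, len(data) - start))
    return runs


# Illustrative sample: hostb's line replaced by NULs of the same length.
hole = len(b"line 2 from hostb\n")
sample = b"line 1\n" + b"\0" * hole + b"line 2 from hostc\n"
print(find_nul_runs(sample))
```

On a real file, the bytes would come from open(path, "rb").read().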

This seems to be due to an inconsistency between the page cache and the
attribute cache on hostc. Running the 'cat' command on hostc causes bar.txt
to be loaded into the page cache. Meanwhile, its attributes are cached in the
attribute cache.

Seconds later, the 'echo' command on hostc causes the attribute cache to be
updated (a GETATTR operation is issued) with the new file size (reflecting
the line just appended by hostb), but the page cache is not updated (no READ
operation is issued).

The subsequent WRITE operation from hostc specifies an offset of 0 (beginning
of file) and a length equal to "line 1" + "line 2 from hostb" + "line 2 from
hostc". Since the page cache on hostc does not contain the "line 2 from hostb"
content, that segment of the WRITE buffer is filled with nulls.
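
The sequence described above can be mimicked with a toy model in Python.
This is not the kernel's code, just a sketch of the behaviour as observed
from the wire; build_write_buffer and its parameters are hypothetical
names chosen for illustration.

```python
def build_write_buffer(cached_pages: bytes, server_size: int,
                       appended: bytes) -> bytes:
    """Model the WRITE payload: offset 0, covering the whole file.

    cached_pages -- file data the client actually read (page cache)
    server_size  -- file size reported by GETATTR (attribute cache)
    appended     -- the data the application is appending
    """
    # Bytes the attribute cache says exist but the page cache never read.
    gap = max(0, server_size - len(cached_pages))
    return cached_pages + b"\0" * gap + appended


line1 = b"line 1\n"
from_b = b"line 2 from hostb\n"
from_c = b"line 2 from hostc\n"

# hostc cached only line 1; GETATTR then reports the size after hostb's append.
payload = build_write_buffer(line1, len(line1) + len(from_b), from_c)
print(repr(payload))
```

In this model the NUL run is exactly as long as the data hostc never
read, which is why hostb's line is overwritten rather than shifted.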

Note that no file locking is being used in this test case or our production
use case.

Questions:

- Is this a bug or correct operation?
- Would file locking produce the expected behaviour?
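
For concreteness, the kind of locking meant in the second question would
look roughly like the Python sketch below: an append performed while
holding an exclusive advisory lock. Whether this avoids the corruption on
2.6.18 is exactly what is being asked, so treat it as an illustration of
the question, not an answer. The function name locked_append is
hypothetical.

```python
import fcntl
import os


def locked_append(path: str, data: bytes) -> None:
    """Append data to path while holding an exclusive advisory lock."""
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)   # blocks until the lock is granted
        os.write(fd, data)
        os.fsync(fd)                     # flush the data before unlocking
    finally:
        fcntl.lockf(fd, fcntl.LOCK_UN)
        os.close(fd)
```

Each writer would call locked_append("/nfs/bar.txt", b"line 2 from hostb\n")
or similar in place of the shell redirection.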

Thanks,

Clay McClure





2008-04-30 07:33:50

by Trond Myklebust

Subject: Re: Data corruption using 2.6.18 NFSv3 client -- sparse files?


On Wed, 2008-04-30 at 06:41 +0000, Clay McClure wrote:
> Hello,
>
> When multiple 2.6.18 NFSv3 clients write to the same file, after one of the
> clients has recently read from the file, we see data corruption in the form of
> null bytes inserted into the file.
>
> Simple test case:
>
> hosta% echo "line 1" > /nfs/bar.txt
>
> then, in rapid succession:
>
> hostb% cat bar.txt && sleep 2 && echo "line 2 from hostb" >> /nfs/bar.txt
> hostc% cat bar.txt && sleep 2 && echo "line 2 from hostc" >> /nfs/bar.txt
>
> Expected result:
>
> /nfs/bar.txt contains:
>
> line 1
> line 2 from hostb
> line 2 from hostc
>
> Actual result:
>
> /nfs/bar.txt contains:
>
> line 1
> \0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0line 2 from hostc
>
> This seems to be due to an inconsistency between the page cache and the
> attribute cache on hostc. Running the 'cat' command on hostc causes bar.txt
> to be loaded into the page cache. Meanwhile, its attributes are cached in the
> attribute cache.
>
> Seconds later, the 'echo' command on hostc causes the attribute cache to be
> updated (a GETATTR operation is issued) with the new file size (reflecting
> the line just appended by hostb), but the page cache is not updated (no READ
> operation is issued).
>
> The subsequent WRITE operation from hostc specifies an offset of 0 (beginning
> of file) and a length equal to "line 1" + "line 2 from hostb" + "line 2 from
> hostc". Since the page cache on hostc does not contain the "line 2 from hostb"
> content, that segment of the WRITE buffer is filled with nulls.
>
> Note that no file locking is being used in this test case or our production
> use case.
>
> Questions:
>
> - Is this a bug or correct operation?
> - Would file locking produce the expected behaviour?
>
> Thanks,
>
> Clay McClure


A number of read and write races have been fixed since September 2006.
Have you tested with 2.6.25?

Trond