Hello,
When multiple 2.6.18 NFSv3 clients write to the same file, after one of the
clients has recently read from the file, we see data corruption in the form of
null bytes inserted into the file.
Simple test case:
hosta% echo "line 1" > /nfs/volume/bar.txt
then, in rapid succession:
hostb% cat bar.txt && sleep 2 && echo "line 2 from hostb" >> /nfs/bar.txt
hostc% cat bar.txt && sleep 2 && echo "line 2 from hostc" >> /nfs/bar.txt
Expected result:
/nfs/bar.txt contains:
line 1
line 2 from hostb
line 2 from hostc
Actual result:
/nfs/bar.txt contains:
line 1
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0line 2 from hostc
This seems to be due to an inconsistency between the page cache and the
attribute cache on hostc. Running the 'cat' command on hostc causes bar.txt
to be loaded into the page cache, and its attributes (including the file
size) to be stored in the attribute cache.
Seconds later, the 'echo' command on hostc causes the attribute cache to be
updated (a GETATTR operation is issued) with the new file size (reflecting
the line just appended by hostb), but the page cache is not updated (no READ
operation is issued).
The subsequent WRITE operation from hostc specifies an offset of 0 (beginning
of file) and a length equal to the combined length of "line 1", "line 2 from
hostb" and "line 2 from hostc". Since the page cache on hostc does not contain
the "line 2 from hostb" content, that segment of the WRITE buffer is filled
with nulls.
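The sequence described above can be modelled locally, without NFS. This is
only a rough sketch of the suspected mechanism, not a reproduction of the
client's behaviour: the scratch file stands in for hostc's cached copy, and
the 18-byte figure is simply the length of "line 2 from hostb" plus a newline.

```shell
# Local model only: no NFS involved, file name and sizes are illustrative.
f=$(mktemp)

# What hostc holds in its page cache after the 'cat': just "line 1\n".
printf 'line 1\n' > "$f"

# The attribute cache now reports a size 18 bytes larger (hostb's append),
# but the data was never read, so the model fills that region with zeros.
head -c 18 /dev/zero >> "$f"

# hostc's own append lands at the new end of file.
printf 'line 2 from hostc\n' >> "$f"

# Dump the result byte by byte; the hostb region shows up as \0.
od -c "$f"
```

The resulting file has the same shape as the corrupted bar.txt: the first
line intact, a run of nulls where hostb's line should be, then hostc's line.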
Note that no file locking is being used in this test case or our production
use case.
Questions:
- Is this a bug or correct operation?
- Would file locking produce the expected behaviour?
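For the second question, a locked variant of the append could be sketched as
follows. It assumes flock(1) from util-linux is installed and that flock
locks are honoured over NFS on the kernel in use; append_locked and FILE are
hypothetical names used only for illustration. It is untested against the
setup above, and whether taking the lock makes the client revalidate its
cached pages is exactly what the question asks.

```shell
# FILE defaults to a scratch file so the sketch runs anywhere;
# set FILE=/nfs/bar.txt to try it against the NFS mount.
FILE=${FILE:-$(mktemp)}

# Hypothetical helper: append one line while holding an exclusive lock.
append_locked() {
    (
        # Block until an exclusive lock on file descriptor 9 is granted.
        flock -x 9
        printf '%s\n' "$1" >> "$FILE"
    ) 9>>"$FILE"   # fd 9 is opened in append mode; closing it drops the lock
}

append_locked "line 2 from hostc"
```

Each host would call append_locked in place of the plain 'echo ... >>'.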
Thanks,
Clay McClure
On Wed, 2008-04-30 at 06:41 +0000, Clay McClure wrote:
> [...]
A number of read and write races have been fixed since September 2006.
Have you tested with 2.6.25?
Trond