2021-02-03 11:46:29

by Kinglong Mee

Subject: Data corruption after truncate on an NFSv3 client mount point when the NFS server restarts

Hello,

I hit a data corruption problem on an NFSv3 client mount point mounted
without the sync option, on CentOS 7.6 (kernel 3.10.0-957.el7.x86_64).

When running the LTP ftest01 test case while the NFS server is restarted
or rebooted, ftest01 sometimes reports data corruption or a size mismatch.

Debugging shows:
1. The WRITE reply contains a write verifier, and the COMMIT reply
contains a verifier too. knfsd builds both verifiers from its per-net
state (looked up via nfsd_net_id); other NFS servers (e.g. nfs-ganesha)
encode the daemon start time. After the NFS server restarts or reboots,
the verifier changes.
2. When mounted without sync, the NFS client uses buffered I/O and sends
WRITE requests marked as unstable.
After an unstable WRITE succeeds, the request is added to the commit
list rather than being released immediately.
3. A following sync (fsync) or setattr (truncate) triggers a COMMIT.
If the COMMIT succeeds and the returned verifier matches that of an
unstable WRITE on the commit list, that WRITE is released; otherwise
the unstable WRITE is resent to the NFS server.
This happens in nfs_commit_release_pages().
4. After an NFS server restart, the COMMIT may be processed by the new
server instance, which returns success with a different verifier.
In this case the unstable WRITEs are resent, but the COMMIT completes
without any error and does not wait for the resent WRITEs.
5. For sync (fsync) this is okay, but for setattr (truncate) data
corruption appears, because the resent WRITEs may reach the server
concurrently with the SETATTR for the truncate.
6. nfs_setattr() only calls nfs_sync_inode() and does not check the
result.

	/* Write all dirty data */
	if (S_ISREG(inode->i_mode))
		nfs_sync_inode(inode);
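
To make steps 3 and 4 concrete, here is a minimal user-space sketch of the
verifier check. It is not the kernel code: struct pending_write and
commit_release() are hypothetical names invented for illustration, and only
the 8-byte verifier size is taken from the NFSv3 protocol. It shows how a
COMMIT that succeeds with a changed verifier leaves WRITEs that still have
to be resent, even though the COMMIT itself reported no error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VERF_SIZE 8	/* NFSv3 write verifier: 8 opaque bytes */

/* Hypothetical model of one unstable WRITE waiting on the commit list. */
struct pending_write {
	uint8_t verf[VERF_SIZE];	/* verifier from the WRITE reply */
	bool committed;			/* data is known to be on stable storage */
	bool needs_resend;		/* verifier changed, WRITE must be sent again */
};

/*
 * Apply a successful COMMIT reply to n pending writes.
 * Returns how many writes must be resent because their verifier
 * differs from the one in the COMMIT reply.
 */
static size_t commit_release(struct pending_write *w, size_t n,
			     const uint8_t commit_verf[VERF_SIZE])
{
	size_t resend = 0;

	assert(w != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (memcmp(w[i].verf, commit_verf, VERF_SIZE) == 0) {
			/* Same server instance: the data is safe. */
			w[i].committed = true;
			w[i].needs_resend = false;
		} else {
			/* Server restarted in between: data may be lost. */
			w[i].committed = false;
			w[i].needs_resend = true;
			resend++;
		}
	}
	return resend;
}
```

A non-zero return value corresponds to the situation in step 4: a caller
that ignores it and goes on to send the SETATTR races with the resent
WRITEs.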

Also, nfs_initiate_commit() does not return an error for FLUSH_SYNC
when the COMMIT hits an error.

Maybe we should return an error to the caller (nfs_setattr()) that is
doing a FLUSH_SYNC commit when the COMMIT returns a verifier different
from that of the unstable WRITEs. With the error, the caller may either
return it or call nfs_sync_inode() again.
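
As a rough illustration of the "sync again" option, the caller could loop
until the sync reports that nothing was rescheduled. This is a user-space
sketch under assumptions, not a patch: sync_before_truncate(), sync_fn,
struct fake_inode and fake_sync() are hypothetical, and the use of -EAGAIN
to signal "verifier changed, WRITEs were resent" is my assumption, not
something the current code returns.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical callback: flush dirty data and COMMIT synchronously.
 * Assumed to return 0 when all unstable WRITEs are covered by the COMMIT
 * verifier, -EAGAIN when the verifier changed and WRITEs were resent,
 * or another negative errno on a hard failure.
 */
typedef int (*sync_fn)(void *inode);

/* Retry the sync up to max_tries times; SETATTR is safe only on 0. */
static int sync_before_truncate(void *inode, sync_fn sync, int max_tries)
{
	int err = -EAGAIN;

	assert(sync != NULL);
	for (int i = 0; i < max_tries && err == -EAGAIN; i++)
		err = sync(inode);
	return err;
}

/* Stand-in inode for trying the loop without a real NFS mount. */
struct fake_inode {
	int verifier_changes;	/* how many syncs will see a new verifier */
	int sync_calls;		/* how many times fake_sync() was called */
};

static int fake_sync(void *p)
{
	struct fake_inode *f = p;

	f->sync_calls++;
	if (f->verifier_changes > 0) {
		f->verifier_changes--;
		return -EAGAIN;
	}
	return 0;
}
```

With one simulated server restart, sync_before_truncate(&inode, fake_sync, 3)
calls fake_sync() twice and returns 0; if the verifier keeps changing, it
gives up after max_tries calls and returns -EAGAIN so the caller can fail
the truncate instead of racing with the resent WRITEs.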

Although I hit this problem on an older kernel, the logic in the
latest NFS client source has not changed.

Any suggestions are welcome.

thanks,
Kinglong Mee