This series fixes nfs writeback error checking bugs.
If there is an error during the writeback process, it should be returned
when user space calls close() or fsync().
filemap_sample_wb_err() will return 0 if nobody has seen the error yet,
then filemap_check_wb_err() will return the unchanged writeback error.
ChenXiaoSong (2):
nfs: nfs{,4}_file_flush should consume writeback error
nfs: nfs_file_write() check writeback errors correctly
fs/nfs/file.c | 8 +++-----
fs/nfs/nfs4file.c | 4 +---
2 files changed, 4 insertions(+), 8 deletions(-)
--
2.31.1
filemap_sample_wb_err() will return 0 if nobody has seen the error yet,
then filemap_check_wb_err() will return the unchanged writeback error.
Reproducer:
nfs server | nfs client
--------------------------------|-----------------------------------------------
# No space left on server |
fallocate -l 100G /server/file1 |
| mount -t nfs $nfs_server_ip:/ /mnt
| # Expected error: No space left on device
| dd if=/dev/zero of=/mnt/file2 count=1 ibs=100K
# Release space on server |
rm /server/file1 |
| # Unexpected error: No space left on device
| dd if=/dev/zero of=/mnt/file2 count=1 ibs=100K
Fix this by using file_check_and_advance_wb_err(). If there is an error during
the writeback process, it should be returned when user space calls close() or fsync(),
flush() is called when user space calls close().
Note that fsync() will not be called after close().
Fixes: 67dd23f9e6fb ("nfs: ensure correct writeback errors are returned on close()")
Signed-off-by: ChenXiaoSong <[email protected]>
---
fs/nfs/file.c | 4 +---
fs/nfs/nfs4file.c | 4 +---
2 files changed, 2 insertions(+), 6 deletions(-)
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index 76d76acbc594..83d63bce9596 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -141,7 +141,6 @@ static int
nfs_file_flush(struct file *file, fl_owner_t id)
{
struct inode *inode = file_inode(file);
- errseq_t since;
dprintk("NFS: flush(%pD2)\n", file);
@@ -150,9 +149,8 @@ nfs_file_flush(struct file *file, fl_owner_t id)
return 0;
/* Flush writes to the server and return any errors */
- since = filemap_sample_wb_err(file->f_mapping);
nfs_wb_all(inode);
- return filemap_check_wb_err(file->f_mapping, since);
+ return file_check_and_advance_wb_err(file);
}
ssize_t
diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
index e79ae4cbc395..63a57e5b6db7 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -111,7 +111,6 @@ static int
nfs4_file_flush(struct file *file, fl_owner_t id)
{
struct inode *inode = file_inode(file);
- errseq_t since;
dprintk("NFS: flush(%pD2)\n", file);
@@ -127,9 +126,8 @@ nfs4_file_flush(struct file *file, fl_owner_t id)
return filemap_fdatawrite(file->f_mapping);
/* Flush writes to the server and return any errors */
- since = filemap_sample_wb_err(file->f_mapping);
nfs_wb_all(inode);
- return filemap_check_wb_err(file->f_mapping, since);
+ return file_check_and_advance_wb_err(file);
}
#ifdef CONFIG_NFS_V4_2
--
2.31.1
On Sat, 2022-03-05 at 20:46 +0800, ChenXiaoSong wrote:
> filemap_sample_wb_err() will return 0 if nobody has seen the error
> yet,
> then filemap_check_wb_err() will return the unchanged writeback
> error.
>
> Reproducer:
> nfs server | nfs client
> --------------------------------|-----------------------------------
> ------------
> # No space left on server |
> fallocate -l 100G /server/file1 |
> | mount -t nfs $nfs_server_ip:/ /mnt
> | # Expected error: No space left on
> device
> | dd if=/dev/zero of=/mnt/file2
> count=1 ibs=100K
> # Release space on server |
> rm /server/file1 |
> | # Unexpected error: No space left
> on device
> | dd if=/dev/zero of=/mnt/file2
> count=1 ibs=100K
>
'rm' doesn't open any files or do any I/O, so it shouldn't be returning
any errors from the page cache.
IOW: The problem here is not that we're failing to clear an error from
the page cache. It is that something in 'rm' is checking the page cache
and returning any errors that it finds there.
Is 'rm' perhaps doing a stat() on the file it is deleting? If so, does
this patch fix the bug?
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/comm
it/?id=d19e0183a883
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]
It would be more clear if I update the reproducer like this:
nfs server | nfs client
--------------------------------- |---------------------------------
# No space left on server |
fallocate -l 100G /server/nospace |
| mount -t nfs $nfs_server_ip:/ /mnt
|
| # Expected error
| dd if=/dev/zero of=/mnt/file
|
| # Release space on mountpoint
| rm /mnt/nospace
|
| # Unexpected error
| dd if=/dev/zero of=/mnt/file
The Unexpected error (No space left on device) when doing second `dd`,
is from unconsumed writeback error after close() the file when doing
first `dd`. There is enough space when doing second `dd`, we should not
report the nospace error.
We should report and consume the writeback error when userspace call
close()->flush(), the writeback error should not be left for next open().
Currently, fsync() will consume the writeback error while calling
file_check_and_advance_wb_err(), close()->flush() should also consume
the writeback error.
在 2022/3/6 0:53, Trond Myklebust 写道:
> 'rm' doesn't open any files or do any I/O, so it shouldn't be returning
> any errors from the page cache.
>
> IOW: The problem here is not that we're failing to clear an error from
> the page cache. It is that something in 'rm' is checking the page cache
> and returning any errors that it finds there.
>
> Is 'rm' perhaps doing a stat() on the file it is deleting? If so, does
> this patch fix the bug?
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/comm
> it/?id=d19e0183a883
>
在 2022/4/12 22:27, Trond Myklebust 写道:
>
> It will clear ENOSPC, EDQUOT and EFBIG. It should not clear other
> errors that are not supposed to be reported by write().
>
> As I keep repeating, that is _documented behaviour_!
>
Hi Trond:
You may mean that write(2) manpage described:
> Since Linux 4.13, errors from write-back come with a promise that
> they may be reported by subsequent. write(2) requests, and will be
> reported by a subsequent fsync(2) (whether or not they were also
> reported by write(2)).
The manpage mentioned that "reported by a subsequent fsync(2)", your
patch[1] clear the wb err on _async_ write(), and wb err will _not_ be
reported by subsequent fsync(2), is it documented behaviour?
All other filesystems will _not_ clear any wb err on _async_ write().
[1]
https://patchwork.kernel.org/project/linux-nfs/patch/[email protected]/