This patch fixes the read_bytes count for NFS buffered reads.
Simple reproducer follows.
Before this patch:
# mount 127.0.0.1:/ /mnt/nfs
# bash
# function dump_stats { cat /proc/$$/io; }
# trap dump_stats EXIT
# cat /mnt/nfs/file1.bin > /dev/null
# exit
exit
rchar: 3587436
wchar: 1054077
syscr: 544
syscw: 33
read_bytes: 0
write_bytes: 0
cancelled_write_bytes: 0
After this patch:
# mount 127.0.0.1:/ /mnt/nfs
# bash
# function dump_stats { cat /proc/$$/io; }
# trap dump_stats EXIT
# cat /mnt/nfs/file1.bin > /dev/null
# exit
exit
rchar: 3587278
wchar: 1054161
syscr: 544
syscw: 33
read_bytes: 1048576
write_bytes: 0
cancelled_write_bytes: 0
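For scripted comparisons of the two runs above, here is a minimal sketch of pulling read_bytes out of two /proc/PID/io snapshots and printing the difference. The helper names io_field and read_bytes_delta are hypothetical and are not part of the patch. They only assume the "key: value" layout shown in the dumps above.

```shell
# Hypothetical helper: print the value of one field from /proc/PID/io-style
# text supplied on stdin. The match is exact, so "write_bytes" does not
# also match "cancelled_write_bytes".
io_field() {
    awk -v key="$1:" '$1 == key { print $2 }'
}

# Hypothetical helper: print the read_bytes difference between two
# snapshots, each passed as a single string argument.
read_bytes_delta() {
    before=$(printf '%s\n' "$1" | io_field read_bytes)
    after=$(printf '%s\n' "$2" | io_field read_bytes)
    echo $((after - before))
}
```

Usage would look something like taking s1=$(cat /proc/$$/io) before the read and s2=$(cat /proc/$$/io) after it, then running read_bytes_delta "$s1" "$s2".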
Dave Wysochanski (1):
NFS: Fix /proc/PID/io read_bytes for buffered reads
fs/nfs/read.c | 3 +++
1 file changed, 3 insertions(+)
--
2.31.1
Prior to commit 8786fde8421c ("Convert NFS from readpages to
readahead"), nfs_readpages() used the old mm interface read_cache_pages()
which called task_io_account_read() for each NFS page read. After
this commit, nfs_readpages() is converted to nfs_readahead(), which
now uses the new mm interface readahead_page(). The new interface
requires callers to call task_io_account_read() themselves.
In addition to nfs_readahead(), task_io_account_read() should also
be called from nfs_read_folio().
Fixes: 8786fde8421c ("Convert NFS from readpages to readahead")
Link: https://lore.kernel.org/linux-nfs/CAPt2mGNEYUk5u8V4abe=5MM5msZqmvzCVrtCP4Qw1n=gCHCnww@mail.gmail.com/
Signed-off-by: Dave Wysochanski <[email protected]>
---
fs/nfs/read.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/fs/nfs/read.c b/fs/nfs/read.c
index c380cff4108e..e90988591df4 100644
--- a/fs/nfs/read.c
+++ b/fs/nfs/read.c
@@ -15,6 +15,7 @@
 #include <linux/stat.h>
 #include <linux/mm.h>
 #include <linux/slab.h>
+#include <linux/task_io_accounting_ops.h>
 #include <linux/pagemap.h>
 #include <linux/sunrpc/clnt.h>
 #include <linux/nfs_fs.h>
@@ -337,6 +338,7 @@ int nfs_read_folio(struct file *file, struct folio *folio)
 
 	trace_nfs_aop_readpage(inode, folio);
 	nfs_inc_stats(inode, NFSIOS_VFSREADPAGE);
+	task_io_account_read(folio_size(folio));
 
 	/*
 	 * Try to flush any pending writes to the file..
@@ -393,6 +395,7 @@ void nfs_readahead(struct readahead_control *ractl)
 
 	trace_nfs_aop_readahead(inode, readahead_pos(ractl), nr_pages);
 	nfs_inc_stats(inode, NFSIOS_VFSREADPAGES);
+	task_io_account_read(readahead_length(ractl));
 
 	ret = -ESTALE;
 	if (NFS_STALE(inode))
--
2.31.1
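To illustrate the accounting change described in the commit message, here is a rough userspace model of it. It is plain shell arithmetic, not kernel code. It assumes a 4096-byte page size and a readahead window made of whole pages. The function names are hypothetical. The old path accounted one page at a time, while the patched nfs_readahead() accounts the whole request length once. Under those assumptions both give the same total.

```shell
# Assumption for illustration only: 4096-byte pages.
PAGE_SIZE=4096

# Model of the old behaviour: one accounting call per page read.
account_per_page() {
    total=0
    i=0
    while [ "$i" -lt "$1" ]; do
        total=$((total + PAGE_SIZE))
        i=$((i + 1))
    done
    echo "$total"
}

# Model of the patched behaviour: one accounting call for the whole
# readahead request, whose length is nr_pages * PAGE_SIZE in this model.
account_per_request() {
    echo $(($1 * PAGE_SIZE))
}
```

For a 256-page window, account_per_page 256 and account_per_request 256 both print 1048576.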
Thanks Dave.
I can confirm that this does indeed fix the reporting of read_bytes in
/proc/PID/io for us. I applied it (cleanly) to our v5.18-based kernel,
which we use in production on our render farm.
We use read_bytes in our post-render statistics to help track per-render IO.
Tested-by: Daire Byrne <[email protected]>
On Thu, 9 Mar 2023 at 19:00, Dave Wysochanski <[email protected]> wrote:
> [...]