2022-08-31 01:05:30

by David Wysochanski

Subject: [PATCH v3 0/3] Convert NFS with fscache to the netfs API

This patchset converts the NFS with fscache non-direct READ IO paths to
use the netfs API with a non-invasive approach. The existing NFS pgio
layer does not need extensive changes, and this is the best way I have
found so far to address Trond's concerns about modifying the IO path [1]
while also enabling netfs only when fscache is configured and enabled [2].
I have not yet attempted performance comparisons to address Chuck
Lever's concern [3], though this is probably less pressing now,
since netfs is not used unless fscache is configured and enabled.

The main patch to be reviewed is patch #3 which converts nfs_read_folio
and nfs_readahead.

Changes since v2 (all PATCH 3)
- Fix Aug 28 kernel test robot <[email protected]> compile warning due
to an unused 'inode' variable by removing a dprintk (can bring back
as a trace event if needed)
- Rename nfs_issue_read to nfs_netfs_issue_read
- Rename nfs_begin_cache_operation to nfs_netfs_begin_cache_operation
- Add nfs_netfs_read_completion and move logic from read.c into fscache.c
- Add nfs_netfs_read_initiate and nfs_netfs_read_done to handle netfs
IO accounting updates (resolves prior hang with BAD_STATEID errors),
moving logic out of pagelist.c

The patches are fairly stable, as evidenced by xfstests generic runs
against various servers: Hammerspace with NFSv4.2+fscache,
NetApp (ONTAP 9) with NFSv4.1+fscache, and RHEL8 with NFSv3+fscache.
The known issues are as follows:

1. Unit test setting rsize < readahead does not properly read from
fscache but re-reads data from the NFS server
* This will be fixed once another linux-cachefs patch [4],
"Stop read optimisation when folio removed from pagecache", is merged

2. "Cache volume key already in use" after xfstest runs
* xfstests (hammerspace with vers=4.2,fsc) shows the following on the
console after some tests:
"NFS: Cache volume key already in use (nfs,4.1,2,c50,cfe0100a,3,,,8000,100000,100000,bb8,ea60,7530,ea60,1)"
* This may be fixed with another patch [4] that is in progress

3. Hang with BAD_STATEID errors (RESOLVED by the netfs IO accounting
updates in v3)

4. Data corruption seen with unit test where rsize < readahead
* Seen with vanilla 6.0-rc2 (did not occur on 5.19); likely unrelated
to this patchset
mount -o vers=4.2,fsc,rsize=8192 127.0.0.1:/export /mnt
dd if=/dev/urandom of=/tmp/integrity-rsize-file1.bin bs=16k count=1
./nfs-readahead.sh set /mnt 16384
dd if=/tmp/integrity-rsize-file1.bin of=/mnt/integrity-rsize-file1.bin bs=16k count=1
echo 3 > /proc/sys/vm/drop_caches
md5sum /mnt/integrity-rsize-file1.bin /tmp/integrity-rsize-file1.bin
md5sums don't match, MD5_NFS = 00eaf1a5bc1b3dfd54711db551619afa != MD5_LOCAL = e8d835c83ba1f1264869dc40673fa20c

The patchset is based on 6.0-rc3 and has been pushed to github at:
https://github.com/DaveWysochanskiRH/kernel/commits/nfs-fscache-netfs

[1] https://lore.kernel.org/linux-nfs/[email protected]/
[2] https://lore.kernel.org/linux-nfs/[email protected]/
[3] https://marc.info/?l=linux-nfs&m=160597917525083&w=4
[4] https://www.mail-archive.com/[email protected]/msg03043.html
[5] https://marc.info/?l=linux-nfs&m=165962662200679&w=4

Dave Wysochanski (3):
NFS: Rename readpage_async_filler to nfs_pageio_add_page
NFS: Add support for netfs in struct nfs_inode and Kconfig
NFS: Convert nfs_read_folio and nfs_readahead to netfs APIs

fs/nfs/Kconfig | 1 +
fs/nfs/delegation.c | 2 +-
fs/nfs/dir.c | 2 +-
fs/nfs/fscache.c | 191 ++++++++++++++++++---------------------
fs/nfs/fscache.h | 77 ++++++++--------
fs/nfs/inode.c | 8 +-
fs/nfs/internal.h | 10 +-
fs/nfs/pagelist.c | 14 +++
fs/nfs/pnfs.c | 12 +--
fs/nfs/read.c | 117 ++++++++----------------
fs/nfs/write.c | 2 +-
include/linux/nfs_fs.h | 19 +---
include/linux/nfs_page.h | 1 +
include/linux/nfs_xdr.h | 1 +
14 files changed, 210 insertions(+), 247 deletions(-)

--
2.31.1


2022-08-31 01:05:30

by David Wysochanski

Subject: [PATCH v3 3/3] NFS: Convert buffered read paths to use netfs when fscache is enabled

Convert the NFS buffered read code paths to corresponding netfs APIs,
but only when fscache is configured and enabled.

The netfs API defines struct netfs_request_ops, which must be filled
in by the network filesystem. For NFS, we only need to define 5 of
the functions, the main one being issue_read().
The issue_read() function is called by the netfs layer when a read
cannot be fulfilled locally and must be sent to the server (either
the cache is not active, or it is active but the data is not available).
Once the read from the server is complete, netfs requires a call to
netfs_subreq_terminated(), which conveys either how many bytes were read
successfully or an error. Note that issue_read() is called with a
struct netfs_io_subrequest, which defines the IO requested via a start
and a length (both in bytes), and assumes the underlying netfs will
return either an error for the whole region or the number of bytes
successfully read.

The NFS IO path is page based and the main APIs are the pgio APIs defined
in pagelist.c. With the pgio APIs, there is no way for the caller to
know how many RPCs will be sent or how the pages will be broken up
into underlying RPCs, each of which has its own return code.
Thus, NFS needs some way to accommodate the netfs API requirement of
a single response to the whole request, while also minimizing
disruptive changes to the NFS pgio layer. The approach taken with this
patch is to allocate a small structure for each call to
nfs_netfs_issue_read() to keep some accounting information for the
outstanding RPCs, as well as the final error value or the number of
bytes successfully read. The accounting data is updated inside the
pgio layer when an nfs_pgio_header contains a valid pointer to this
data. In the NFS read completion function, nfs_read_completion(), the
final error value and bytes read are updated, and the accounting data
is used to determine whether this completion represents the last RPC.
If it does, netfs_subreq_terminated() is called with the final error
value or the number of bytes transferred.

Link: https://lore.kernel.org/linux-nfs/[email protected]/

Signed-off-by: Dave Wysochanski <[email protected]>
---
fs/nfs/fscache.c | 232 ++++++++++++++++++++++++---------------
fs/nfs/fscache.h | 71 ++++++------
fs/nfs/inode.c | 3 +
fs/nfs/internal.h | 9 ++
fs/nfs/pagelist.c | 14 +++
fs/nfs/read.c | 68 ++++++++----
include/linux/nfs_page.h | 3 +
include/linux/nfs_xdr.h | 3 +
8 files changed, 258 insertions(+), 145 deletions(-)

diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
index a6fc1c8b6644..696baed44aeb 100644
--- a/fs/nfs/fscache.c
+++ b/fs/nfs/fscache.c
@@ -15,6 +15,9 @@
#include <linux/seq_file.h>
#include <linux/slab.h>
#include <linux/iversion.h>
+#include <linux/xarray.h>
+#include <linux/fscache.h>
+#include <linux/netfs.h>

#include "internal.h"
#include "iostat.h"
@@ -235,112 +238,161 @@ void nfs_fscache_release_file(struct inode *inode, struct file *filp)
fscache_unuse_cookie(cookie, &auxdata, &i_size);
}

-/*
- * Fallback page reading interface.
- */
-static int fscache_fallback_read_page(struct inode *inode, struct page *page)
+
+atomic_t nfs_netfs_debug_id;
+static int nfs_netfs_init_request(struct netfs_io_request *rreq, struct file *file)
{
- struct netfs_cache_resources cres;
- struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
- struct iov_iter iter;
- struct bio_vec bvec[1];
- int ret;
-
- memset(&cres, 0, sizeof(cres));
- bvec[0].bv_page = page;
- bvec[0].bv_offset = 0;
- bvec[0].bv_len = PAGE_SIZE;
- iov_iter_bvec(&iter, READ, bvec, ARRAY_SIZE(bvec), PAGE_SIZE);
-
- ret = fscache_begin_read_operation(&cres, cookie);
- if (ret < 0)
- return ret;
-
- ret = fscache_read(&cres, page_offset(page), &iter, NETFS_READ_HOLE_FAIL,
- NULL, NULL);
- fscache_end_operation(&cres);
- return ret;
+ struct nfs_open_context *ctx;
+
+ if (file == NULL) {
+ ctx = nfs_find_open_context(rreq->inode, NULL, FMODE_READ);
+ if (!ctx)
+ return -ENOMEM;
+ } else
+ ctx = get_nfs_open_context(nfs_file_open_context(file));
+
+ rreq->netfs_priv = ctx;
+
+ if (netfs_i_cookie(&NFS_I(rreq->inode)->netfs))
+ rreq->debug_id = atomic_inc_return(&nfs_netfs_debug_id);
+
+ return 0;
}

-/*
- * Fallback page writing interface.
- */
-static int fscache_fallback_write_page(struct inode *inode, struct page *page,
- bool no_space_allocated_yet)
+static void nfs_netfs_free_request(struct netfs_io_request *rreq)
{
- struct netfs_cache_resources cres;
- struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
- struct iov_iter iter;
- struct bio_vec bvec[1];
- loff_t start = page_offset(page);
- size_t len = PAGE_SIZE;
- int ret;
-
- memset(&cres, 0, sizeof(cres));
- bvec[0].bv_page = page;
- bvec[0].bv_offset = 0;
- bvec[0].bv_len = PAGE_SIZE;
- iov_iter_bvec(&iter, WRITE, bvec, ARRAY_SIZE(bvec), PAGE_SIZE);
-
- ret = fscache_begin_write_operation(&cres, cookie);
- if (ret < 0)
- return ret;
-
- ret = cres.ops->prepare_write(&cres, &start, &len, i_size_read(inode),
- no_space_allocated_yet);
- if (ret == 0)
- ret = fscache_write(&cres, page_offset(page), &iter, NULL, NULL);
- fscache_end_operation(&cres);
- return ret;
+ put_nfs_open_context(rreq->netfs_priv);
}

-/*
- * Retrieve a page from fscache
- */
-int __nfs_fscache_read_page(struct inode *inode, struct page *page)
+static inline int nfs_netfs_begin_cache_operation(struct netfs_io_request *rreq)
{
- int ret;
+ return fscache_begin_read_operation(&rreq->cache_resources,
+ netfs_i_cookie(&NFS_I(rreq->inode)->netfs));
+}

- trace_nfs_fscache_read_page(inode, page);
- if (PageChecked(page)) {
- ClearPageChecked(page);
- ret = 1;
- goto out;
- }
+static struct nfs_netfs_io_data *nfs_netfs_alloc(struct netfs_io_subrequest *sreq)
+{
+ struct nfs_netfs_io_data *netfs;
+
+ netfs = kzalloc(sizeof(*netfs), GFP_KERNEL_ACCOUNT);
+ if (!netfs)
+ return NULL;
+ netfs->sreq = sreq;
+ refcount_set(&netfs->refcount, 1);
+ spin_lock_init(&netfs->lock);
+ return netfs;
+}

- ret = fscache_fallback_read_page(inode, page);
- if (ret < 0) {
- nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_READ_FAIL);
- SetPageChecked(page);
- goto out;
+static bool nfs_netfs_clamp_length(struct netfs_io_subrequest *sreq)
+{
+ size_t rsize = NFS_SB(sreq->rreq->inode->i_sb)->rsize;
+
+ sreq->len = min(sreq->len, rsize);
+ return true;
+}
+
+static void nfs_netfs_issue_read(struct netfs_io_subrequest *sreq)
+{
+ struct nfs_pageio_descriptor pgio;
+ struct inode *inode = sreq->rreq->inode;
+ struct nfs_open_context *ctx = sreq->rreq->netfs_priv;
+ struct page *page;
+ int err;
+ pgoff_t start = (sreq->start + sreq->transferred) >> PAGE_SHIFT;
+ pgoff_t last = ((sreq->start + sreq->len -
+ sreq->transferred - 1) >> PAGE_SHIFT);
+ XA_STATE(xas, &sreq->rreq->mapping->i_pages, start);
+
+ nfs_pageio_init_read(&pgio, inode, false,
+ &nfs_async_read_completion_ops);
+
+ pgio.pg_netfs = nfs_netfs_alloc(sreq); /* used in completion */
+ if (!pgio.pg_netfs)
+ return netfs_subreq_terminated(sreq, -ENOMEM, false);
+
+ xas_lock(&xas);
+ xas_for_each(&xas, page, last) {
+ /* nfs_pageio_add_page() may schedule() due to pNFS layout and other RPCs */
+ xas_pause(&xas);
+ xas_unlock(&xas);
+ err = nfs_pageio_add_page(&pgio, ctx, page);
+ if (err < 0)
+ return netfs_subreq_terminated(sreq, err, false);
+ xas_lock(&xas);
}
+ xas_unlock(&xas);
+ nfs_pageio_complete_read(&pgio);
+ nfs_netfs_put(pgio.pg_netfs);
+}

- /* Read completed synchronously */
- nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_READ_OK);
- SetPageUptodate(page);
- ret = 0;
-out:
- trace_nfs_fscache_read_page_exit(inode, page, ret);
- return ret;
+void nfs_netfs_read_initiate(struct nfs_pgio_header *hdr)
+{
+ struct nfs_netfs_io_data *netfs = hdr->netfs;
+ struct netfs_io_subrequest *sreq;
+
+ if (!netfs)
+ return;
+
+ sreq = netfs->sreq;
+ spin_lock(&netfs->lock);
+ atomic_inc(&netfs->rpcs);
+ netfs->rpc_byte_count += hdr->args.count;
+ spin_unlock(&netfs->lock);
}

-/*
- * Store a newly fetched page in fscache. We can be certain there's no page
- * stored in the cache as yet otherwise we would've read it from there.
- */
-void __nfs_fscache_write_page(struct inode *inode, struct page *page)
+void nfs_netfs_read_done(struct nfs_pgio_header *hdr)
{
- int ret;
+ struct nfs_netfs_io_data *netfs = hdr->netfs;
+ struct netfs_io_subrequest *sreq;
+
+ if (!netfs)
+ return;

- trace_nfs_fscache_write_page(inode, page);
+ sreq = netfs->sreq;
+ spin_lock(&netfs->lock);
+ if (hdr->res.op_status) {
+ /*
+ * Retryable errors such as BAD_STATEID will be re-issued,
+ * so reduce the bytes and the RPC counts.
+ */
+ netfs->rpc_byte_count -= hdr->args.count;
+ atomic_dec(&netfs->rpcs);
+ }
+ spin_unlock(&netfs->lock);
+}

- ret = fscache_fallback_write_page(inode, page, true);
+void nfs_netfs_read_completion(struct nfs_pgio_header *hdr)
+{
+ struct nfs_netfs_io_data *netfs = hdr->netfs;
+ struct netfs_io_subrequest *sreq;

- if (ret != 0) {
- nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_WRITTEN_FAIL);
- nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_UNCACHED);
- } else {
- nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_WRITTEN_OK);
+ if (!netfs)
+ return;
+
+ sreq = netfs->sreq;
+ if (test_bit(NFS_IOHDR_EOF, &hdr->flags))
+ __set_bit(NETFS_SREQ_CLEAR_TAIL, &sreq->flags);
+
+ spin_lock(&netfs->lock);
+ if (hdr->error)
+ netfs->error = hdr->error;
+ else
+ netfs->transferred += hdr->res.count;
+ spin_unlock(&netfs->lock);
+
+ /* Only the last RPC completion should call netfs_subreq_terminated() */
+ if (atomic_dec_and_test(&netfs->rpcs) &&
+ (netfs->rpc_byte_count >= sreq->len)) {
+ netfs_subreq_terminated(sreq, netfs->error ?: netfs->transferred, false);
+ nfs_netfs_put(netfs);
+ hdr->netfs = NULL;
}
- trace_nfs_fscache_write_page_exit(inode, page, ret);
}
+
+const struct netfs_request_ops nfs_netfs_ops = {
+ .init_request = nfs_netfs_init_request,
+ .free_request = nfs_netfs_free_request,
+ .begin_cache_operation = nfs_netfs_begin_cache_operation,
+ .issue_read = nfs_netfs_issue_read,
+ .clamp_length = nfs_netfs_clamp_length
+};
diff --git a/fs/nfs/fscache.h b/fs/nfs/fscache.h
index 38614ed8f951..c59a82a7d4a8 100644
--- a/fs/nfs/fscache.h
+++ b/fs/nfs/fscache.h
@@ -34,6 +34,44 @@ struct nfs_fscache_inode_auxdata {
u64 change_attr;
};

+struct nfs_netfs_io_data {
+ refcount_t refcount;
+ struct netfs_io_subrequest *sreq;
+
+ /*
+ * NFS may split a netfs_io_subrequest into multiple RPCs, each
+ * with their own read completion. In netfs, we can only call
+ * netfs_subreq_terminated() once for each subrequest. So we
+ * must keep track of the rpcs and rpc_byte_count for what has
+ * been submitted, and only call netfs via netfs_subreq_terminated()
+ * when the final RPC has completed.
+ */
+ atomic_t rpcs;
+ unsigned long rpc_byte_count;
+
+ /*
+ * Final disposition of the netfs_io_subrequest, sent in
+ * netfs_subreq_terminated()
+ */
+ spinlock_t lock;
+ ssize_t transferred;
+ int error;
+};
+
+static inline void nfs_netfs_get(struct nfs_netfs_io_data *netfs)
+{
+ refcount_inc(&netfs->refcount);
+}
+
+static inline void nfs_netfs_put(struct nfs_netfs_io_data *netfs)
+{
+ if (refcount_dec_and_test(&netfs->refcount))
+ kfree(netfs);
+}
+extern void nfs_netfs_read_initiate(struct nfs_pgio_header *hdr);
+extern void nfs_netfs_read_done(struct nfs_pgio_header *hdr);
+extern void nfs_netfs_read_completion(struct nfs_pgio_header *hdr);
+
/*
* fscache.c
*/
@@ -45,43 +83,17 @@ extern void nfs_fscache_clear_inode(struct inode *);
extern void nfs_fscache_open_file(struct inode *, struct file *);
extern void nfs_fscache_release_file(struct inode *, struct file *);

-extern int __nfs_fscache_read_page(struct inode *, struct page *);
-extern void __nfs_fscache_write_page(struct inode *, struct page *);
-
static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
{
if (folio_test_fscache(folio)) {
if (current_is_kswapd() || !(gfp & __GFP_FS))
return false;
folio_wait_fscache(folio);
- fscache_note_page_release(netfs_i_cookie(&NFS_I(folio->mapping->host)->netfs));
- nfs_inc_fscache_stats(folio->mapping->host,
- NFSIOS_FSCACHE_PAGES_UNCACHED);
}
+ fscache_note_page_release(netfs_i_cookie(&NFS_I(folio->mapping->host)->netfs));
return true;
}

-/*
- * Retrieve a page from an inode data storage object.
- */
-static inline int nfs_fscache_read_page(struct inode *inode, struct page *page)
-{
- if (netfs_inode(inode)->cache)
- return __nfs_fscache_read_page(inode, page);
- return -ENOBUFS;
-}
-
-/*
- * Store a page newly fetched from the server in an inode data storage object
- * in the cache.
- */
-static inline void nfs_fscache_write_page(struct inode *inode,
- struct page *page)
-{
- if (netfs_inode(inode)->cache)
- __nfs_fscache_write_page(inode, page);
-}
-
static inline void nfs_fscache_update_auxdata(struct nfs_fscache_inode_auxdata *auxdata,
struct inode *inode)
{
@@ -130,11 +142,6 @@ static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
{
return true; /* may release folio */
}
-static inline int nfs_fscache_read_page(struct inode *inode, struct page *page)
-{
- return -ENOBUFS;
-}
-static inline void nfs_fscache_write_page(struct inode *inode, struct page *page) {}
static inline void nfs_fscache_invalidate(struct inode *inode, int flags) {}

static inline const char *nfs_server_fscache_state(struct nfs_server *server)
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 7d1b9e39863c..de5eaba5b3f1 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -2248,6 +2248,9 @@ struct inode *nfs_alloc_inode(struct super_block *sb)
#endif /* CONFIG_NFS_V4 */
#ifdef CONFIG_NFS_V4_2
nfsi->xattr_cache = NULL;
+#endif
+#ifdef CONFIG_NFS_FSCACHE
+ netfs_inode_init(&nfsi->netfs, &nfs_netfs_ops);
#endif
return NFSI_TO_INODE(nfsi);
}
diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index 78d945f5f97f..6b1870382cea 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -453,6 +453,10 @@ extern void nfs_sb_deactive(struct super_block *sb);
extern int nfs_client_for_each_server(struct nfs_client *clp,
int (*fn)(struct nfs_server *, void *),
void *data);
+#ifdef CONFIG_NFS_FSCACHE
+extern const struct netfs_request_ops nfs_netfs_ops;
+#endif
+
/* io.c */
extern void nfs_start_io_read(struct inode *inode);
extern void nfs_end_io_read(struct inode *inode);
@@ -482,9 +486,14 @@ extern int nfs4_get_rootfh(struct nfs_server *server, struct nfs_fh *mntfh, bool

struct nfs_pgio_completion_ops;
/* read.c */
+extern const struct nfs_pgio_completion_ops nfs_async_read_completion_ops;
extern void nfs_pageio_init_read(struct nfs_pageio_descriptor *pgio,
struct inode *inode, bool force_mds,
const struct nfs_pgio_completion_ops *compl_ops);
+extern int nfs_pageio_add_page(struct nfs_pageio_descriptor *pgio,
+ struct nfs_open_context *ctx,
+ struct page *page);
+extern void nfs_pageio_complete_read(struct nfs_pageio_descriptor *pgio);
extern void nfs_read_prepare(struct rpc_task *task, void *calldata);
extern void nfs_pageio_reset_read_mds(struct nfs_pageio_descriptor *pgio);

diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
index 317cedfa52bf..600989332a6f 100644
--- a/fs/nfs/pagelist.c
+++ b/fs/nfs/pagelist.c
@@ -25,6 +25,7 @@
#include "internal.h"
#include "pnfs.h"
#include "nfstrace.h"
+#include "fscache.h"

#define NFSDBG_FACILITY NFSDBG_PAGECACHE

@@ -68,6 +69,12 @@ void nfs_pgheader_init(struct nfs_pageio_descriptor *desc,
hdr->good_bytes = mirror->pg_count;
hdr->io_completion = desc->pg_io_completion;
hdr->dreq = desc->pg_dreq;
+#ifdef CONFIG_NFS_FSCACHE
+ if (desc->pg_netfs) {
+ hdr->netfs = desc->pg_netfs;
+ nfs_netfs_get(desc->pg_netfs);
+ }
+#endif
hdr->release = release;
hdr->completion_ops = desc->pg_completion_ops;
if (hdr->completion_ops->init_hdr)
@@ -846,6 +853,9 @@ void nfs_pageio_init(struct nfs_pageio_descriptor *desc,
desc->pg_lseg = NULL;
desc->pg_io_completion = NULL;
desc->pg_dreq = NULL;
+#ifdef CONFIG_NFS_FSCACHE
+ desc->pg_netfs = NULL;
+#endif
desc->pg_bsize = bsize;

desc->pg_mirror_count = 1;
@@ -940,6 +950,7 @@ int nfs_generic_pgio(struct nfs_pageio_descriptor *desc,
/* Set up the argument struct */
nfs_pgio_rpcsetup(hdr, mirror->pg_count, desc->pg_ioflags, &cinfo);
desc->pg_rpc_callops = &nfs_pgio_common_ops;
+
return 0;
}
EXPORT_SYMBOL_GPL(nfs_generic_pgio);
@@ -1360,6 +1371,9 @@ int nfs_pageio_resend(struct nfs_pageio_descriptor *desc,

desc->pg_io_completion = hdr->io_completion;
desc->pg_dreq = hdr->dreq;
+#ifdef CONFIG_NFS_FSCACHE
+ desc->pg_netfs = hdr->netfs;
+#endif
list_splice_init(&hdr->pages, &pages);
while (!list_empty(&pages)) {
struct nfs_page *req = nfs_list_entry(pages.next);
diff --git a/fs/nfs/read.c b/fs/nfs/read.c
index 525e82ea9a9e..3bc48472f207 100644
--- a/fs/nfs/read.c
+++ b/fs/nfs/read.c
@@ -30,7 +30,7 @@

#define NFSDBG_FACILITY NFSDBG_PAGECACHE

-static const struct nfs_pgio_completion_ops nfs_async_read_completion_ops;
+const struct nfs_pgio_completion_ops nfs_async_read_completion_ops;
static const struct nfs_rw_ops nfs_rw_read_ops;

static struct kmem_cache *nfs_rdata_cachep;
@@ -74,7 +74,7 @@ void nfs_pageio_init_read(struct nfs_pageio_descriptor *pgio,
}
EXPORT_SYMBOL_GPL(nfs_pageio_init_read);

-static void nfs_pageio_complete_read(struct nfs_pageio_descriptor *pgio)
+void nfs_pageio_complete_read(struct nfs_pageio_descriptor *pgio)
{
struct nfs_pgio_mirror *pgm;
unsigned long npages;
@@ -110,20 +110,25 @@ EXPORT_SYMBOL_GPL(nfs_pageio_reset_read_mds);

static void nfs_readpage_release(struct nfs_page *req, int error)
{
- struct inode *inode = d_inode(nfs_req_openctx(req)->dentry);
struct page *page = req->wb_page;

- dprintk("NFS: read done (%s/%llu %d@%lld)\n", inode->i_sb->s_id,
- (unsigned long long)NFS_FILEID(inode), req->wb_bytes,
- (long long)req_offset(req));
-
if (nfs_error_is_fatal_on_server(error) && error != -ETIMEDOUT)
SetPageError(page);
if (nfs_page_group_sync_on_bit(req, PG_UNLOCKPAGE)) {
- if (PageUptodate(page))
- nfs_fscache_write_page(inode, page);
+#ifdef CONFIG_NFS_FSCACHE
+ struct inode *inode = d_inode(nfs_req_openctx(req)->dentry);
+
+ /*
+ * If fscache is enabled, netfs will unlock pages.
+ * Otherwise, we have to unlock the page here
+ */
+ if (!netfs_inode(inode)->cache)
+ unlock_page(page);
+#else
unlock_page(page);
+#endif
}
+
nfs_release_request(req);
}

@@ -177,6 +182,10 @@ static void nfs_read_completion(struct nfs_pgio_header *hdr)
nfs_list_remove_request(req);
nfs_readpage_release(req, error);
}
+#ifdef CONFIG_NFS_FSCACHE
+ nfs_netfs_read_completion(hdr);
+#endif
+
out:
hdr->release(hdr);
}
@@ -187,6 +196,9 @@ static void nfs_initiate_read(struct nfs_pgio_header *hdr,
struct rpc_task_setup *task_setup_data, int how)
{
rpc_ops->read_setup(hdr, msg);
+#ifdef CONFIG_NFS_FSCACHE
+ nfs_netfs_read_initiate(hdr);
+#endif
trace_nfs_initiate_read(hdr);
}

@@ -202,7 +214,7 @@ nfs_async_read_error(struct list_head *head, int error)
}
}

-static const struct nfs_pgio_completion_ops nfs_async_read_completion_ops = {
+const struct nfs_pgio_completion_ops nfs_async_read_completion_ops = {
.error_cleanup = nfs_async_read_error,
.completion = nfs_read_completion,
};
@@ -219,6 +231,9 @@ static int nfs_readpage_done(struct rpc_task *task,
if (status != 0)
return status;

+#ifdef CONFIG_NFS_FSCACHE
+ nfs_netfs_read_done(hdr);
+#endif
nfs_add_stats(inode, NFSIOS_SERVERREADBYTES, hdr->res.count);
trace_nfs_readpage_done(task, hdr);

@@ -294,12 +309,6 @@ nfs_pageio_add_page(struct nfs_pageio_descriptor *pgio,

aligned_len = min_t(unsigned int, ALIGN(len, rsize), PAGE_SIZE);

- if (!IS_SYNC(page->mapping->host)) {
- error = nfs_fscache_read_page(page->mapping->host, page);
- if (error == 0)
- goto out_unlock;
- }
-
new = nfs_create_request(ctx, page, 0, aligned_len);
if (IS_ERR(new))
goto out_error;
@@ -315,8 +324,6 @@ nfs_pageio_add_page(struct nfs_pageio_descriptor *pgio,
return 0;
out_error:
error = PTR_ERR(new);
-out_unlock:
- unlock_page(page);
out:
return error;
}
@@ -355,6 +362,12 @@ int nfs_read_folio(struct file *file, struct folio *folio)
if (NFS_STALE(inode))
goto out_unlock;

+#ifdef CONFIG_NFS_FSCACHE
+ if (netfs_inode(inode)->cache) {
+ ret = netfs_read_folio(file, folio);
+ goto out;
+ }
+#endif
if (file == NULL) {
ret = -EBADF;
ctx = nfs_find_open_context(inode, NULL, FMODE_READ);
@@ -368,8 +381,10 @@ int nfs_read_folio(struct file *file, struct folio *folio)
&nfs_async_read_completion_ops);

ret = nfs_pageio_add_page(&pgio, ctx, page);
- if (ret)
- goto out;
+ if (ret) {
+ put_nfs_open_context(ctx);
+ goto out_unlock;
+ }

nfs_pageio_complete_read(&pgio);
ret = pgio.pg_error < 0 ? pgio.pg_error : 0;
@@ -378,12 +393,12 @@ int nfs_read_folio(struct file *file, struct folio *folio)
if (!PageUptodate(page) && !ret)
ret = xchg(&ctx->error, 0);
}
-out:
put_nfs_open_context(ctx);
- trace_nfs_aop_readpage_done(inode, page, ret);
- return ret;
+ goto out;
+
out_unlock:
unlock_page(page);
+out:
trace_nfs_aop_readpage_done(inode, page, ret);
return ret;
}
@@ -405,6 +420,13 @@ void nfs_readahead(struct readahead_control *ractl)
if (NFS_STALE(inode))
goto out;

+#ifdef CONFIG_NFS_FSCACHE
+ if (netfs_inode(inode)->cache) {
+ netfs_readahead(ractl);
+ ret = 0;
+ goto out;
+ }
+#endif
if (file == NULL) {
ret = -EBADF;
ctx = nfs_find_open_context(inode, NULL, FMODE_READ);
diff --git a/include/linux/nfs_page.h b/include/linux/nfs_page.h
index ba7e2e4b0926..8eeb16d9bacd 100644
--- a/include/linux/nfs_page.h
+++ b/include/linux/nfs_page.h
@@ -101,6 +101,9 @@ struct nfs_pageio_descriptor {
struct pnfs_layout_segment *pg_lseg;
struct nfs_io_completion *pg_io_completion;
struct nfs_direct_req *pg_dreq;
+#ifdef CONFIG_NFS_FSCACHE
+ void *pg_netfs;
+#endif
unsigned int pg_bsize; /* default bsize for mirrors */

u32 pg_mirror_count;
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index e86cf6642d21..e196ef595908 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -1619,6 +1619,9 @@ struct nfs_pgio_header {
const struct nfs_rw_ops *rw_ops;
struct nfs_io_completion *io_completion;
struct nfs_direct_req *dreq;
+#ifdef CONFIG_NFS_FSCACHE
+ void *netfs;
+#endif

int pnfs_error;
int error; /* merge with pnfs_error */
--
2.31.1

2022-08-31 01:06:29

by David Wysochanski

Subject: [PATCH v3 2/3] NFS: Configure support for netfs when NFS fscache is configured

As a first step toward supporting the netfs library when NFS_FSCACHE is
configured, select NETFS_SUPPORT in Kconfig and add the required
netfs_inode to struct nfs_inode.

Using netfs requires that the VFS inode structure be stored
inside struct netfs_inode, along with the fscache_cookie.
Thus, create a new helper, NFSI_TO_INODE(), which is defined
differently depending on whether NFS_FSCACHE is configured.
In addition, use the netfs_inode() and netfs_i_cookie() helpers,
and remove our own helper, nfs_i_fscache().

Later patches will convert NFS fscache to fully use netfs.

Link: https://lore.kernel.org/linux-nfs/[email protected]/

Signed-off-by: Dave Wysochanski <[email protected]>
---
fs/nfs/Kconfig | 1 +
fs/nfs/delegation.c | 2 +-
fs/nfs/dir.c | 2 +-
fs/nfs/fscache.c | 20 +++++++++-----------
fs/nfs/fscache.h | 15 ++++++---------
fs/nfs/inode.c | 6 +++---
fs/nfs/internal.h | 2 +-
fs/nfs/pnfs.c | 12 ++++++------
fs/nfs/write.c | 2 +-
include/linux/nfs_fs.h | 36 ++++++++++++++++++++++++------------
10 files changed, 53 insertions(+), 45 deletions(-)

diff --git a/fs/nfs/Kconfig b/fs/nfs/Kconfig
index 14a72224b657..8fbb6caf3481 100644
--- a/fs/nfs/Kconfig
+++ b/fs/nfs/Kconfig
@@ -171,6 +171,7 @@ config ROOT_NFS
config NFS_FSCACHE
bool "Provide NFS client caching support"
depends on NFS_FS=m && FSCACHE || NFS_FS=y && FSCACHE=y
+ select NETFS_SUPPORT
help
Say Y here if you want NFS data to be cached locally on disc through
the general filesystem cache manager
diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index 5c97cad741a7..41a2c5176783 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -306,7 +306,7 @@ nfs_start_delegation_return_locked(struct nfs_inode *nfsi)
}
spin_unlock(&delegation->lock);
if (ret)
- nfs_clear_verifier_delegated(&nfsi->vfs_inode);
+ nfs_clear_verifier_delegated(NFSI_TO_INODE(nfsi));
out:
return ret;
}
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 5d6c2ddc7ea6..72de1b8791e9 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -2799,7 +2799,7 @@ nfs_do_access_cache_scan(unsigned int nr_to_scan)

if (nr_to_scan-- == 0)
break;
- inode = &nfsi->vfs_inode;
+ inode = NFSI_TO_INODE(nfsi);
spin_lock(&inode->i_lock);
if (list_empty(&nfsi->access_cache_entry_lru))
goto remove_lru_entry;
diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
index e861d7bae305..a6fc1c8b6644 100644
--- a/fs/nfs/fscache.c
+++ b/fs/nfs/fscache.c
@@ -163,13 +163,14 @@ void nfs_fscache_init_inode(struct inode *inode)
struct nfs_server *nfss = NFS_SERVER(inode);
struct nfs_inode *nfsi = NFS_I(inode);

- nfsi->fscache = NULL;
+ netfs_inode(inode)->cache = NULL;
if (!(nfss->fscache && S_ISREG(inode->i_mode)))
return;

nfs_fscache_update_auxdata(&auxdata, inode);

- nfsi->fscache = fscache_acquire_cookie(NFS_SB(inode->i_sb)->fscache,
+ netfs_inode(inode)->cache = fscache_acquire_cookie(
+ nfss->fscache,
0,
nfsi->fh.data, /* index_key */
nfsi->fh.size,
@@ -183,11 +184,8 @@ void nfs_fscache_init_inode(struct inode *inode)
*/
void nfs_fscache_clear_inode(struct inode *inode)
{
- struct nfs_inode *nfsi = NFS_I(inode);
- struct fscache_cookie *cookie = nfs_i_fscache(inode);
-
- fscache_relinquish_cookie(cookie, false);
- nfsi->fscache = NULL;
+ fscache_relinquish_cookie(netfs_i_cookie(&NFS_I(inode)->netfs), false);
+ netfs_inode(inode)->cache = NULL;
}

/*
@@ -212,7 +210,7 @@ void nfs_fscache_clear_inode(struct inode *inode)
void nfs_fscache_open_file(struct inode *inode, struct file *filp)
{
struct nfs_fscache_inode_auxdata auxdata;
- struct fscache_cookie *cookie = nfs_i_fscache(inode);
+ struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
bool open_for_write = inode_is_open_for_write(inode);

if (!fscache_cookie_valid(cookie))
@@ -230,7 +228,7 @@ EXPORT_SYMBOL_GPL(nfs_fscache_open_file);
void nfs_fscache_release_file(struct inode *inode, struct file *filp)
{
struct nfs_fscache_inode_auxdata auxdata;
- struct fscache_cookie *cookie = nfs_i_fscache(inode);
+ struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
loff_t i_size = i_size_read(inode);

nfs_fscache_update_auxdata(&auxdata, inode);
@@ -243,7 +241,7 @@ void nfs_fscache_release_file(struct inode *inode, struct file *filp)
static int fscache_fallback_read_page(struct inode *inode, struct page *page)
{
struct netfs_cache_resources cres;
- struct fscache_cookie *cookie = nfs_i_fscache(inode);
+ struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
struct iov_iter iter;
struct bio_vec bvec[1];
int ret;
@@ -271,7 +269,7 @@ static int fscache_fallback_write_page(struct inode *inode, struct page *page,
bool no_space_allocated_yet)
{
struct netfs_cache_resources cres;
- struct fscache_cookie *cookie = nfs_i_fscache(inode);
+ struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
struct iov_iter iter;
struct bio_vec bvec[1];
loff_t start = page_offset(page);
diff --git a/fs/nfs/fscache.h b/fs/nfs/fscache.h
index 2a37af880978..38614ed8f951 100644
--- a/fs/nfs/fscache.h
+++ b/fs/nfs/fscache.h
@@ -54,7 +54,7 @@ static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
if (current_is_kswapd() || !(gfp & __GFP_FS))
return false;
folio_wait_fscache(folio);
- fscache_note_page_release(nfs_i_fscache(folio->mapping->host));
+ fscache_note_page_release(netfs_i_cookie(&NFS_I(folio->mapping->host)->netfs));
nfs_inc_fscache_stats(folio->mapping->host,
NFSIOS_FSCACHE_PAGES_UNCACHED);
}
@@ -66,7 +66,7 @@ static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
*/
static inline int nfs_fscache_read_page(struct inode *inode, struct page *page)
{
- if (nfs_i_fscache(inode))
+ if (netfs_inode(inode)->cache)
return __nfs_fscache_read_page(inode, page);
return -ENOBUFS;
}
@@ -78,7 +78,7 @@ static inline int nfs_fscache_read_page(struct inode *inode, struct page *page)
static inline void nfs_fscache_write_page(struct inode *inode,
struct page *page)
{
- if (nfs_i_fscache(inode))
+ if (netfs_inode(inode)->cache)
__nfs_fscache_write_page(inode, page);
}

@@ -101,13 +101,10 @@ static inline void nfs_fscache_update_auxdata(struct nfs_fscache_inode_auxdata *
static inline void nfs_fscache_invalidate(struct inode *inode, int flags)
{
struct nfs_fscache_inode_auxdata auxdata;
- struct nfs_inode *nfsi = NFS_I(inode);
+ struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);

- if (nfsi->fscache) {
- nfs_fscache_update_auxdata(&auxdata, inode);
- fscache_invalidate(nfsi->fscache, &auxdata,
- i_size_read(inode), flags);
- }
+ nfs_fscache_update_auxdata(&auxdata, inode);
+ fscache_invalidate(cookie, &auxdata, i_size_read(inode), flags);
}

/*
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index bea7c005119c..7d1b9e39863c 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -1411,7 +1411,7 @@ int nfs_revalidate_mapping(struct inode *inode, struct address_space *mapping)

static bool nfs_file_has_writers(struct nfs_inode *nfsi)
{
- struct inode *inode = &nfsi->vfs_inode;
+ struct inode *inode = NFSI_TO_INODE(nfsi);

if (!S_ISREG(inode->i_mode))
return false;
@@ -2249,7 +2249,7 @@ struct inode *nfs_alloc_inode(struct super_block *sb)
#ifdef CONFIG_NFS_V4_2
nfsi->xattr_cache = NULL;
#endif
- return &nfsi->vfs_inode;
+ return NFSI_TO_INODE(nfsi);
}
EXPORT_SYMBOL_GPL(nfs_alloc_inode);

@@ -2273,7 +2273,7 @@ static void init_once(void *foo)
{
struct nfs_inode *nfsi = (struct nfs_inode *) foo;

- inode_init_once(&nfsi->vfs_inode);
+ inode_init_once(NFSI_TO_INODE(nfsi));
INIT_LIST_HEAD(&nfsi->open_files);
INIT_LIST_HEAD(&nfsi->access_cache_entry_lru);
INIT_LIST_HEAD(&nfsi->access_cache_inode_lru);
diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index 27c720d71b4e..78d945f5f97f 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -355,7 +355,7 @@ nfs4_label_copy(struct nfs4_label *dst, struct nfs4_label *src)

static inline void nfs_zap_label_cache_locked(struct nfs_inode *nfsi)
{
- if (nfs_server_capable(&nfsi->vfs_inode, NFS_CAP_SECURITY_LABEL))
+ if (nfs_server_capable(NFSI_TO_INODE(nfsi), NFS_CAP_SECURITY_LABEL))
nfsi->cache_validity |= NFS_INO_INVALID_LABEL;
}
#else
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 2613b7e36eb9..e44b9934349c 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -763,19 +763,19 @@ static struct pnfs_layout_hdr *__pnfs_destroy_layout(struct nfs_inode *nfsi)
struct pnfs_layout_hdr *lo;
LIST_HEAD(tmp_list);

- spin_lock(&nfsi->vfs_inode.i_lock);
+ spin_lock(&NFSI_TO_INODE(nfsi)->i_lock);
lo = nfsi->layout;
if (lo) {
pnfs_get_layout_hdr(lo);
pnfs_mark_layout_stateid_invalid(lo, &tmp_list);
pnfs_layout_clear_fail_bit(lo, NFS_LAYOUT_RO_FAILED);
pnfs_layout_clear_fail_bit(lo, NFS_LAYOUT_RW_FAILED);
- spin_unlock(&nfsi->vfs_inode.i_lock);
+ spin_unlock(&NFSI_TO_INODE(nfsi)->i_lock);
pnfs_free_lseg_list(&tmp_list);
- nfs_commit_inode(&nfsi->vfs_inode, 0);
+ nfs_commit_inode(NFSI_TO_INODE(nfsi), 0);
pnfs_put_layout_hdr(lo);
} else
- spin_unlock(&nfsi->vfs_inode.i_lock);
+ spin_unlock(&NFSI_TO_INODE(nfsi)->i_lock);
return lo;
}

@@ -790,9 +790,9 @@ static bool pnfs_layout_removed(struct nfs_inode *nfsi,
{
bool ret;

- spin_lock(&nfsi->vfs_inode.i_lock);
+ spin_lock(&NFSI_TO_INODE(nfsi)->i_lock);
ret = nfsi->layout != lo;
- spin_unlock(&nfsi->vfs_inode.i_lock);
+ spin_unlock(&NFSI_TO_INODE(nfsi)->i_lock);
return ret;
}

diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 1843fa235d9b..451da6336e12 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -828,7 +828,7 @@ nfs_page_search_commits_for_head_request_locked(struct nfs_inode *nfsi,
{
struct nfs_page *freq, *t;
struct nfs_commit_info cinfo;
- struct inode *inode = &nfsi->vfs_inode;
+ struct inode *inode = NFSI_TO_INODE(nfsi);

nfs_init_cinfo_from_inode(&cinfo, inode);

diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 7931fa472561..5a451caa38a4 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -31,6 +31,10 @@
#include <linux/sunrpc/auth.h>
#include <linux/sunrpc/clnt.h>

+#ifdef CONFIG_NFS_FSCACHE
+#include <linux/netfs.h>
+#endif
+
#include <linux/nfs.h>
#include <linux/nfs2.h>
#include <linux/nfs3.h>
@@ -121,6 +125,10 @@ struct nfs4_xattr_cache;
* nfs fs inode data in memory
*/
struct nfs_inode {
+#ifdef CONFIG_NFS_FSCACHE
+ struct netfs_inode netfs; /* Netfslib context and inode */
+#endif
+
/*
* The 64bit 'inode number'
*/
@@ -203,10 +211,10 @@ struct nfs_inode {
/* how many bytes have been written/read and how many bytes queued up */
__u64 write_io;
__u64 read_io;
-#ifdef CONFIG_NFS_FSCACHE
- struct fscache_cookie *fscache;
+#ifndef CONFIG_NFS_FSCACHE
+ /* fscache uses netfs, and netfs_inode contains the inode */
+ struct inode vfs_inode;
#endif
- struct inode vfs_inode;

#ifdef CONFIG_NFS_V4_2
struct nfs4_xattr_cache *xattr_cache;
@@ -281,9 +289,22 @@ struct nfs4_copy_state {
#define NFS_INO_LAYOUTSTATS (11) /* layoutstats inflight */
#define NFS_INO_ODIRECT (12) /* I/O setting is O_DIRECT */

+static inline struct inode *NFSI_TO_INODE(struct nfs_inode *nfsi)
+{
+#ifdef CONFIG_NFS_FSCACHE
+ return &nfsi->netfs.inode;
+#else
+ return &nfsi->vfs_inode;
+#endif
+}
+
static inline struct nfs_inode *NFS_I(const struct inode *inode)
{
+#ifdef CONFIG_NFS_FSCACHE
+ return container_of(inode, struct nfs_inode, netfs.inode);
+#else
return container_of(inode, struct nfs_inode, vfs_inode);
+#endif
}

static inline struct nfs_server *NFS_SB(const struct super_block *s)
@@ -328,15 +349,6 @@ static inline int NFS_STALE(const struct inode *inode)
return test_bit(NFS_INO_STALE, &NFS_I(inode)->flags);
}

-static inline struct fscache_cookie *nfs_i_fscache(struct inode *inode)
-{
-#ifdef CONFIG_NFS_FSCACHE
- return NFS_I(inode)->fscache;
-#else
- return NULL;
-#endif
-}
-
static inline __u64 NFS_FILEID(const struct inode *inode)
{
return NFS_I(inode)->fileid;
--
2.31.1

2022-08-31 02:46:09

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v3 3/3] NFS: Convert buffered read paths to use netfs when fscache is enabled

Hi Dave,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on v6.0-rc3]
[also build test WARNING on linus/master next-20220830]
[cannot apply to trondmy-nfs/linux-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Dave-Wysochanski/Convert-NFS-with-fscache-to-the-netfs-API/20220831-085217
base: b90cb1053190353cc30f0fef0ef1f378ccc063c5
config: m68k-allyesconfig (https://download.01.org/0day-ci/archive/20220831/[email protected]/config)
compiler: m68k-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/5ac59d0573d1008c9d8856a91e4b4fd2ee61bb63
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Dave-Wysochanski/Convert-NFS-with-fscache-to-the-netfs-API/20220831-085217
git checkout 5ac59d0573d1008c9d8856a91e4b4fd2ee61bb63
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=m68k SHELL=/bin/bash fs/

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <[email protected]>

All warnings (new ones prefixed by >>):

fs/nfs/fscache.c: In function 'nfs_netfs_read_initiate':
>> fs/nfs/fscache.c:331:42: warning: variable 'sreq' set but not used [-Wunused-but-set-variable]
331 | struct netfs_io_subrequest *sreq;
| ^~~~
fs/nfs/fscache.c: In function 'nfs_netfs_read_done':
fs/nfs/fscache.c:346:42: warning: variable 'sreq' set but not used [-Wunused-but-set-variable]
346 | struct netfs_io_subrequest *sreq;
| ^~~~


vim +/sreq +331 fs/nfs/fscache.c

327
328 void nfs_netfs_read_initiate(struct nfs_pgio_header *hdr)
329 {
330 struct nfs_netfs_io_data *netfs = hdr->netfs;
> 331 struct netfs_io_subrequest *sreq;
332
333 if (!netfs)
334 return;
335
336 sreq = netfs->sreq;
337 spin_lock(&netfs->lock);
338 atomic_inc(&netfs->rpcs);
339 netfs->rpc_byte_count += hdr->args.count;
340 spin_unlock(&netfs->lock);
341 }
342

--
0-DAY CI Kernel Test Service
https://01.org/lkp

2022-08-31 16:57:34

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v3 2/3] NFS: Configure support for netfs when NFS fscache is configured

On Tue, 2022-08-30 at 20:50 -0400, Dave Wysochanski wrote:
> As a first step toward supporting the netfs library when NFS_FSCACHE is
> configured, add NETFS_SUPPORT to Kconfig and add the required netfs_inode
> into struct nfs_inode.
>
> Using netfs requires we move the VFS inode structure to be stored
> inside struct netfs_inode, along with the fscache_cookie.
> Thus, create a new helper, NFSI_TO_INODE(), which is defined
> differently depending on whether NFS_FSCACHE is configured.
> In addition, use the netfs_inode() and netfs_i_cookie() helpers,
> and remove our own helper, nfs_i_fscache().
>
> Later patches will convert NFS fscache to fully use netfs.
>
> Link: https://lore.kernel.org/linux-nfs/[email protected]/
>
> Signed-off-by: Dave Wysochanski <[email protected]>
> ---
> fs/nfs/Kconfig | 1 +
> fs/nfs/delegation.c | 2 +-
> fs/nfs/dir.c | 2 +-
> fs/nfs/fscache.c | 20 +++++++++-----------
> fs/nfs/fscache.h | 15 ++++++---------
> fs/nfs/inode.c | 6 +++---
> fs/nfs/internal.h | 2 +-
> fs/nfs/pnfs.c | 12 ++++++------
> fs/nfs/write.c | 2 +-
> include/linux/nfs_fs.h | 36 ++++++++++++++++++++++++------------
> 10 files changed, 53 insertions(+), 45 deletions(-)
>
> diff --git a/fs/nfs/Kconfig b/fs/nfs/Kconfig
> index 14a72224b657..8fbb6caf3481 100644
> --- a/fs/nfs/Kconfig
> +++ b/fs/nfs/Kconfig
> @@ -171,6 +171,7 @@ config ROOT_NFS
> config NFS_FSCACHE
> bool "Provide NFS client caching support"
> depends on NFS_FS=m && FSCACHE || NFS_FS=y && FSCACHE=y
> + select NETFS_SUPPORT
> help
> Say Y here if you want NFS data to be cached locally on disc through
> the general filesystem cache manager
> diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
> index 5c97cad741a7..41a2c5176783 100644
> --- a/fs/nfs/delegation.c
> +++ b/fs/nfs/delegation.c
> @@ -306,7 +306,7 @@ nfs_start_delegation_return_locked(struct nfs_inode *nfsi)
> }
> spin_unlock(&delegation->lock);
> if (ret)
> - nfs_clear_verifier_delegated(&nfsi->vfs_inode);
> + nfs_clear_verifier_delegated(NFSI_TO_INODE(nfsi));
> out:
> return ret;
> }
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index 5d6c2ddc7ea6..72de1b8791e9 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -2799,7 +2799,7 @@ nfs_do_access_cache_scan(unsigned int nr_to_scan)
>
> if (nr_to_scan-- == 0)
> break;
> - inode = &nfsi->vfs_inode;
> + inode = NFSI_TO_INODE(nfsi);
> spin_lock(&inode->i_lock);
> if (list_empty(&nfsi->access_cache_entry_lru))
> goto remove_lru_entry;
> diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
> index e861d7bae305..a6fc1c8b6644 100644
> --- a/fs/nfs/fscache.c
> +++ b/fs/nfs/fscache.c
> @@ -163,13 +163,14 @@ void nfs_fscache_init_inode(struct inode *inode)
> struct nfs_server *nfss = NFS_SERVER(inode);
> struct nfs_inode *nfsi = NFS_I(inode);
>
> - nfsi->fscache = NULL;
> + netfs_inode(inode)->cache = NULL;
> if (!(nfss->fscache && S_ISREG(inode->i_mode)))
> return;
>
> nfs_fscache_update_auxdata(&auxdata, inode);
>
> - nfsi->fscache = fscache_acquire_cookie(NFS_SB(inode->i_sb)->fscache,
> + netfs_inode(inode)->cache = fscache_acquire_cookie(
> + nfss->fscache,
> 0,
> nfsi->fh.data, /* index_key */
> nfsi->fh.size,
> @@ -183,11 +184,8 @@ void nfs_fscache_init_inode(struct inode *inode)
> */
> void nfs_fscache_clear_inode(struct inode *inode)
> {
> - struct nfs_inode *nfsi = NFS_I(inode);
> - struct fscache_cookie *cookie = nfs_i_fscache(inode);
> -
> - fscache_relinquish_cookie(cookie, false);
> - nfsi->fscache = NULL;
> + fscache_relinquish_cookie(netfs_i_cookie(&NFS_I(inode)->netfs), false);
> + netfs_inode(inode)->cache = NULL;
> }
>
> /*
> @@ -212,7 +210,7 @@ void nfs_fscache_clear_inode(struct inode *inode)
> void nfs_fscache_open_file(struct inode *inode, struct file *filp)
> {
> struct nfs_fscache_inode_auxdata auxdata;
> - struct fscache_cookie *cookie = nfs_i_fscache(inode);
> + struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
> bool open_for_write = inode_is_open_for_write(inode);
>
> if (!fscache_cookie_valid(cookie))
> @@ -230,7 +228,7 @@ EXPORT_SYMBOL_GPL(nfs_fscache_open_file);
> void nfs_fscache_release_file(struct inode *inode, struct file *filp)
> {
> struct nfs_fscache_inode_auxdata auxdata;
> - struct fscache_cookie *cookie = nfs_i_fscache(inode);
> + struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
> loff_t i_size = i_size_read(inode);
>
> nfs_fscache_update_auxdata(&auxdata, inode);
> @@ -243,7 +241,7 @@ void nfs_fscache_release_file(struct inode *inode, struct file *filp)
> static int fscache_fallback_read_page(struct inode *inode, struct page *page)
> {
> struct netfs_cache_resources cres;
> - struct fscache_cookie *cookie = nfs_i_fscache(inode);
> + struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
> struct iov_iter iter;
> struct bio_vec bvec[1];
> int ret;
> @@ -271,7 +269,7 @@ static int fscache_fallback_write_page(struct inode *inode, struct page *page,
> bool no_space_allocated_yet)
> {
> struct netfs_cache_resources cres;
> - struct fscache_cookie *cookie = nfs_i_fscache(inode);
> + struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
> struct iov_iter iter;
> struct bio_vec bvec[1];
> loff_t start = page_offset(page);
> diff --git a/fs/nfs/fscache.h b/fs/nfs/fscache.h
> index 2a37af880978..38614ed8f951 100644
> --- a/fs/nfs/fscache.h
> +++ b/fs/nfs/fscache.h
> @@ -54,7 +54,7 @@ static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
> if (current_is_kswapd() || !(gfp & __GFP_FS))
> return false;
> folio_wait_fscache(folio);
> - fscache_note_page_release(nfs_i_fscache(folio->mapping->host));
> + fscache_note_page_release(netfs_i_cookie(&NFS_I(folio->mapping->host)->netfs));
> nfs_inc_fscache_stats(folio->mapping->host,
> NFSIOS_FSCACHE_PAGES_UNCACHED);
> }
> @@ -66,7 +66,7 @@ static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
> */
> static inline int nfs_fscache_read_page(struct inode *inode, struct page *page)
> {
> - if (nfs_i_fscache(inode))
> + if (netfs_inode(inode)->cache)
> return __nfs_fscache_read_page(inode, page);
> return -ENOBUFS;
> }
> @@ -78,7 +78,7 @@ static inline int nfs_fscache_read_page(struct inode *inode, struct page *page)
> static inline void nfs_fscache_write_page(struct inode *inode,
> struct page *page)
> {
> - if (nfs_i_fscache(inode))
> + if (netfs_inode(inode)->cache)
> __nfs_fscache_write_page(inode, page);
> }
>
> @@ -101,13 +101,10 @@ static inline void nfs_fscache_update_auxdata(struct nfs_fscache_inode_auxdata *
> static inline void nfs_fscache_invalidate(struct inode *inode, int flags)
> {
> struct nfs_fscache_inode_auxdata auxdata;
> - struct nfs_inode *nfsi = NFS_I(inode);
> + struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
>
> - if (nfsi->fscache) {
> - nfs_fscache_update_auxdata(&auxdata, inode);
> - fscache_invalidate(nfsi->fscache, &auxdata,
> - i_size_read(inode), flags);
> - }
> + nfs_fscache_update_auxdata(&auxdata, inode);
> + fscache_invalidate(cookie, &auxdata, i_size_read(inode), flags);
> }
>
> /*
> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> index bea7c005119c..7d1b9e39863c 100644
> --- a/fs/nfs/inode.c
> +++ b/fs/nfs/inode.c
> @@ -1411,7 +1411,7 @@ int nfs_revalidate_mapping(struct inode *inode, struct address_space *mapping)
>
> static bool nfs_file_has_writers(struct nfs_inode *nfsi)
> {
> - struct inode *inode = &nfsi->vfs_inode;
> + struct inode *inode = NFSI_TO_INODE(nfsi);
>
> if (!S_ISREG(inode->i_mode))
> return false;
> @@ -2249,7 +2249,7 @@ struct inode *nfs_alloc_inode(struct super_block *sb)
> #ifdef CONFIG_NFS_V4_2
> nfsi->xattr_cache = NULL;
> #endif
> - return &nfsi->vfs_inode;
> + return NFSI_TO_INODE(nfsi);
> }
> EXPORT_SYMBOL_GPL(nfs_alloc_inode);
>
> @@ -2273,7 +2273,7 @@ static void init_once(void *foo)
> {
> struct nfs_inode *nfsi = (struct nfs_inode *) foo;
>
> - inode_init_once(&nfsi->vfs_inode);
> + inode_init_once(NFSI_TO_INODE(nfsi));
> INIT_LIST_HEAD(&nfsi->open_files);
> INIT_LIST_HEAD(&nfsi->access_cache_entry_lru);
> INIT_LIST_HEAD(&nfsi->access_cache_inode_lru);
> diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
> index 27c720d71b4e..78d945f5f97f 100644
> --- a/fs/nfs/internal.h
> +++ b/fs/nfs/internal.h
> @@ -355,7 +355,7 @@ nfs4_label_copy(struct nfs4_label *dst, struct nfs4_label *src)
>
> static inline void nfs_zap_label_cache_locked(struct nfs_inode *nfsi)
> {
> - if (nfs_server_capable(&nfsi->vfs_inode, NFS_CAP_SECURITY_LABEL))
> + if (nfs_server_capable(NFSI_TO_INODE(nfsi), NFS_CAP_SECURITY_LABEL))
> nfsi->cache_validity |= NFS_INO_INVALID_LABEL;
> }
> #else
> diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
> index 2613b7e36eb9..e44b9934349c 100644
> --- a/fs/nfs/pnfs.c
> +++ b/fs/nfs/pnfs.c
> @@ -763,19 +763,19 @@ static struct pnfs_layout_hdr *__pnfs_destroy_layout(struct nfs_inode *nfsi)
> struct pnfs_layout_hdr *lo;
> LIST_HEAD(tmp_list);
>
> - spin_lock(&nfsi->vfs_inode.i_lock);
> + spin_lock(&NFSI_TO_INODE(nfsi)->i_lock);
> lo = nfsi->layout;
> if (lo) {
> pnfs_get_layout_hdr(lo);
> pnfs_mark_layout_stateid_invalid(lo, &tmp_list);
> pnfs_layout_clear_fail_bit(lo, NFS_LAYOUT_RO_FAILED);
> pnfs_layout_clear_fail_bit(lo, NFS_LAYOUT_RW_FAILED);
> - spin_unlock(&nfsi->vfs_inode.i_lock);
> + spin_unlock(&NFSI_TO_INODE(nfsi)->i_lock);
> pnfs_free_lseg_list(&tmp_list);
> - nfs_commit_inode(&nfsi->vfs_inode, 0);
> + nfs_commit_inode(NFSI_TO_INODE(nfsi), 0);
> pnfs_put_layout_hdr(lo);
> } else
> - spin_unlock(&nfsi->vfs_inode.i_lock);
> + spin_unlock(&NFSI_TO_INODE(nfsi)->i_lock);
> return lo;
> }
>
> @@ -790,9 +790,9 @@ static bool pnfs_layout_removed(struct nfs_inode *nfsi,
> {
> bool ret;
>
> - spin_lock(&nfsi->vfs_inode.i_lock);
> + spin_lock(&NFSI_TO_INODE(nfsi)->i_lock);
> ret = nfsi->layout != lo;
> - spin_unlock(&nfsi->vfs_inode.i_lock);
> + spin_unlock(&NFSI_TO_INODE(nfsi)->i_lock);
> return ret;
> }
>
> diff --git a/fs/nfs/write.c b/fs/nfs/write.c
> index 1843fa235d9b..451da6336e12 100644
> --- a/fs/nfs/write.c
> +++ b/fs/nfs/write.c
> @@ -828,7 +828,7 @@ nfs_page_search_commits_for_head_request_locked(struct nfs_inode *nfsi,
> {
> struct nfs_page *freq, *t;
> struct nfs_commit_info cinfo;
> - struct inode *inode = &nfsi->vfs_inode;
> + struct inode *inode = NFSI_TO_INODE(nfsi);
>
> nfs_init_cinfo_from_inode(&cinfo, inode);
>
> diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
> index 7931fa472561..5a451caa38a4 100644
> --- a/include/linux/nfs_fs.h
> +++ b/include/linux/nfs_fs.h
> @@ -31,6 +31,10 @@
> #include <linux/sunrpc/auth.h>
> #include <linux/sunrpc/clnt.h>
>
> +#ifdef CONFIG_NFS_FSCACHE
> +#include <linux/netfs.h>
> +#endif
> +
> #include <linux/nfs.h>
> #include <linux/nfs2.h>
> #include <linux/nfs3.h>
> @@ -121,6 +125,10 @@ struct nfs4_xattr_cache;
> * nfs fs inode data in memory
> */
> struct nfs_inode {
> +#ifdef CONFIG_NFS_FSCACHE
> + struct netfs_inode netfs; /* Netfslib context and inode */
> +#endif
> +
> /*
> * The 64bit 'inode number'
> */
> @@ -203,10 +211,10 @@ struct nfs_inode {
> /* how many bytes have been written/read and how many bytes queued up */
> __u64 write_io;
> __u64 read_io;
> -#ifdef CONFIG_NFS_FSCACHE
> - struct fscache_cookie *fscache;
> +#ifndef CONFIG_NFS_FSCACHE
> + /* fscache uses netfs, and netfs_inode contains the inode */
> + struct inode vfs_inode;
> #endif
> - struct inode vfs_inode;
>
> #ifdef CONFIG_NFS_V4_2
> struct nfs4_xattr_cache *xattr_cache;
> @@ -281,9 +289,22 @@ struct nfs4_copy_state {
> #define NFS_INO_LAYOUTSTATS (11) /* layoutstats inflight */
> #define NFS_INO_ODIRECT (12) /* I/O setting is O_DIRECT */
>
> +static inline struct inode *NFSI_TO_INODE(struct nfs_inode *nfsi)

Not crazy about the name here. Maybe VFS_I() ? ntfs and xfs have private
helpers named VFS_I that do something similar, so it seems more
idiomatic.

> +{
> +#ifdef CONFIG_NFS_FSCACHE
> + return &nfsi->netfs.inode;
> +#else
> + return &nfsi->vfs_inode;
> +#endif
> +}
> +

These are hard to read (and reason about) defined this way. I think I'd
rather see less #ifdef-ery here. Instead of having the #ifdef's inside
the functions, do:

#ifdef CONFIG_NFS_FSCACHE

/* define all static inlines here for fscache case */

#else

/* and here for the !fscache case */

#endif
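[A compilable userspace sketch of that arrangement, for concreteness — one #ifdef block defining both helper pairs per branch, instead of branching inside each helper. The struct and helper names below are stand-ins for illustration, not the kernel's:]

```c
#include <assert.h>
#include <stddef.h>

/* Userspace sketch of the suggested layout: the config switch selects
 * which member embeds the VFS inode, and both accessor helpers are
 * defined together inside a single #if/#else block. */
#define SKETCH_FSCACHE 1		/* flip to 0 for the !fscache branch */

struct vfs_inode { int i_ino; };
struct netfs_ctx { struct vfs_inode inode; };

struct my_nfs_inode {
#if SKETCH_FSCACHE
	struct netfs_ctx netfs;		/* netfs context embeds the inode */
#else
	struct vfs_inode vfs_inode;
#endif
	unsigned long flags;
};

#define my_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#if SKETCH_FSCACHE
static inline struct vfs_inode *VFS_I(struct my_nfs_inode *nfsi)
{
	return &nfsi->netfs.inode;
}
static inline struct my_nfs_inode *MY_NFS_I(struct vfs_inode *inode)
{
	return my_container_of(inode, struct my_nfs_inode, netfs.inode);
}
#else
static inline struct vfs_inode *VFS_I(struct my_nfs_inode *nfsi)
{
	return &nfsi->vfs_inode;
}
static inline struct my_nfs_inode *MY_NFS_I(struct vfs_inode *inode)
{
	return my_container_of(inode, struct my_nfs_inode, vfs_inode);
}
#endif
```

Either branch compiles on its own, and the round trip MY_NFS_I(VFS_I(x)) == x holds in both, which is the invariant the real helpers must preserve.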

> static inline struct nfs_inode *NFS_I(const struct inode *inode)
> {
> +#ifdef CONFIG_NFS_FSCACHE
> + return container_of(inode, struct nfs_inode, netfs.inode);
> +#else
> return container_of(inode, struct nfs_inode, vfs_inode);
> +#endif
> }
>
> static inline struct nfs_server *NFS_SB(const struct super_block *s)
> @@ -328,15 +349,6 @@ static inline int NFS_STALE(const struct inode *inode)
> return test_bit(NFS_INO_STALE, &NFS_I(inode)->flags);
> }
>
> -static inline struct fscache_cookie *nfs_i_fscache(struct inode *inode)
> -{
> -#ifdef CONFIG_NFS_FSCACHE
> - return NFS_I(inode)->fscache;
> -#else
> - return NULL;
> -#endif
> -}
> -
> static inline __u64 NFS_FILEID(const struct inode *inode)
> {
> return NFS_I(inode)->fileid;

--
Jeff Layton <[email protected]>

2022-08-31 18:32:49

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v3 3/3] NFS: Convert buffered read paths to use netfs when fscache is enabled

On Tue, 2022-08-30 at 20:50 -0400, Dave Wysochanski wrote:
> Convert the NFS buffered read code paths to corresponding netfs APIs,
> but only when fscache is configured and enabled.
>
> The netfs API defines struct netfs_request_ops which must be filled
> in by the network filesystem. For NFS, we only need to define 5 of
> the functions, the main one being the issue_read() function.
> The issue_read() function is called by the netfs layer when a read
> cannot be fulfilled locally, and must be sent to the server (either
> the cache is not active, or it is active but the data is not available).
> Once the read from the server is complete, netfs requires a call to
> netfs_subreq_terminated() which conveys either how many bytes were read
> successfully, or an error. Note that issue_read() is called with a
> structure, netfs_io_subrequest, which defines the IO requested, and
> contains a start and a length (both in bytes), and assumes the underlying
> netfs will return a either an error on the whole region, or the number
> of bytes successfully read.
>
> The NFS IO path is page based and the main APIs are the pgio APIs defined
> in pagelist.c. For the pgio APIs, there is no way for the caller to
> know how many RPCs will be sent and how the pages will be broken up
> into underlying RPCs, each of which will have its own return code.
> Thus, NFS needs some way to accommodate the netfs API requirement on
> the single response to the whole request, while also minimizing
> disruptive changes to the NFS pgio layer. The approach taken with this
> patch is to allocate a small structure for each call to nfs_netfs_issue_read()
> to keep some accounting information for the outstanding RPCs, as well as
> the final error value or the number of bytes successfully read. The
> accounting data is updated inside the pgio layer, when an nfs_pgio_header
> contains a valid pointer to this data. In the NFS read completion
> function, nfs_read_completion(), the final error value and bytes read
> are updated, and the accounting data is used to determine whether this
> completion represents the last RPC. If this is the last RPC, call
> netfs_subreq_terminated() with the final error value or the number
> of bytes transferred.
>
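[The accounting scheme described above — count outstanding RPCs, accumulate bytes or an error, and only the completion that drops the count to zero reports the final result — can be sketched in plain userspace C. All names below are invented for the sketch; they are not the kernel's APIs:]

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Illustrative analog of the per-subrequest accounting structure:
 * several "RPC" completions feed one shared result, and exactly one of
 * them (the last) terminates the subrequest. */
struct io_acct {
	atomic_int rpcs;	/* outstanding RPC completions */
	size_t transferred;	/* bytes read successfully so far */
	int error;		/* last error seen, if any */
	int terminated;		/* set once, by the final completion */
};

static void acct_submit(struct io_acct *a)
{
	atomic_fetch_add(&a->rpcs, 1);	/* one RPC issued */
}

/* One call per RPC completion; returns 1 iff this was the last RPC,
 * i.e. the analog of the single netfs_subreq_terminated() call. */
static int acct_complete(struct io_acct *a, size_t bytes, int err)
{
	if (err)
		a->error = err;		/* real code guards these updates with a spinlock */
	else
		a->transferred += bytes;
	if (atomic_fetch_sub(&a->rpcs, 1) == 1) {	/* dec-and-test */
		a->terminated = 1;
		return 1;
	}
	return 0;
}
```

With two submitted RPCs, the first completion returns 0 and the second returns 1 with the summed byte count — mirroring how only the last nfs_read_completion() is meant to call netfs_subreq_terminated().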
> Link: https://lore.kernel.org/linux-nfs/[email protected]/
>
> Signed-off-by: Dave Wysochanski <[email protected]>
> ---
> fs/nfs/fscache.c | 232 ++++++++++++++++++++++++---------------
> fs/nfs/fscache.h | 71 ++++++------
> fs/nfs/inode.c | 3 +
> fs/nfs/internal.h | 9 ++
> fs/nfs/pagelist.c | 14 +++
> fs/nfs/read.c | 68 ++++++++----
> include/linux/nfs_page.h | 3 +
> include/linux/nfs_xdr.h | 3 +
> 8 files changed, 258 insertions(+), 145 deletions(-)
>
> diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
> index a6fc1c8b6644..696baed44aeb 100644
> --- a/fs/nfs/fscache.c
> +++ b/fs/nfs/fscache.c
> @@ -15,6 +15,9 @@
> #include <linux/seq_file.h>
> #include <linux/slab.h>
> #include <linux/iversion.h>
> +#include <linux/xarray.h>
> +#include <linux/fscache.h>
> +#include <linux/netfs.h>
>
> #include "internal.h"
> #include "iostat.h"
> @@ -235,112 +238,161 @@ void nfs_fscache_release_file(struct inode *inode, struct file *filp)
> fscache_unuse_cookie(cookie, &auxdata, &i_size);
> }
>
> -/*
> - * Fallback page reading interface.
> - */
> -static int fscache_fallback_read_page(struct inode *inode, struct page *page)
> +
> +atomic_t nfs_netfs_debug_id;
> +static int nfs_netfs_init_request(struct netfs_io_request *rreq, struct file *file)
> {
> - struct netfs_cache_resources cres;
> - struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
> - struct iov_iter iter;
> - struct bio_vec bvec[1];
> - int ret;
> -
> - memset(&cres, 0, sizeof(cres));
> - bvec[0].bv_page = page;
> - bvec[0].bv_offset = 0;
> - bvec[0].bv_len = PAGE_SIZE;
> - iov_iter_bvec(&iter, READ, bvec, ARRAY_SIZE(bvec), PAGE_SIZE);
> -
> - ret = fscache_begin_read_operation(&cres, cookie);
> - if (ret < 0)
> - return ret;
> -
> - ret = fscache_read(&cres, page_offset(page), &iter, NETFS_READ_HOLE_FAIL,
> - NULL, NULL);
> - fscache_end_operation(&cres);
> - return ret;
> + struct nfs_open_context *ctx;
> +
> + if (file == NULL) {
> + ctx = nfs_find_open_context(rreq->inode, NULL, FMODE_READ);
> + if (!ctx)
> + return -ENOMEM;

That error return seems like an odd choice. A NULL return here just
means that we don't have a suitable open file, not that we're out of
memory.

I think a NULL file pointer from netfs can only happen in readahead, and
the comments over readahead_control say:

* @file: The file, used primarily by network filesystems for authentication.
* May be NULL if invoked internally by the filesystem.

AFAICT though, only f2fs and ext4 invoke it internally.

Maybe instead of doing this, it ought to just throw a WARN if we get a
NULL file pointer and return -EINVAL or something?

Willy, am I correct on when ractl->file can be NULL?
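[For concreteness, the defensive shape suggested here — treat a NULL file pointer as a caller bug, warn, and fail with -EINVAL rather than -ENOMEM — could look like this userspace sketch; the function and struct names are invented:]

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

struct sketch_file { int unused; };	/* opaque stand-in for struct file */

/* Sketch of an init_request that rejects a NULL file outright instead
 * of mapping "no open context found" to an out-of-memory error. */
static int sketch_init_request(struct sketch_file *file)
{
	if (!file) {			/* WARN_ON_ONCE() analog */
		fprintf(stderr, "init_request called without a file\n");
		return -EINVAL;
	}
	return 0;			/* normal open-context path */
}
```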

> + } else
> + ctx = get_nfs_open_context(nfs_file_open_context(file));
> +
> + rreq->netfs_priv = ctx;
> +
> + if (netfs_i_cookie(&NFS_I(rreq->inode)->netfs))
> + rreq->debug_id = atomic_inc_return(&nfs_netfs_debug_id);
> +
> + return 0;
> }
>
> -/*
> - * Fallback page writing interface.
> - */
> -static int fscache_fallback_write_page(struct inode *inode, struct page *page,
> - bool no_space_allocated_yet)
> +static void nfs_netfs_free_request(struct netfs_io_request *rreq)
> {
> - struct netfs_cache_resources cres;
> - struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
> - struct iov_iter iter;
> - struct bio_vec bvec[1];
> - loff_t start = page_offset(page);
> - size_t len = PAGE_SIZE;
> - int ret;
> -
> - memset(&cres, 0, sizeof(cres));
> - bvec[0].bv_page = page;
> - bvec[0].bv_offset = 0;
> - bvec[0].bv_len = PAGE_SIZE;
> - iov_iter_bvec(&iter, WRITE, bvec, ARRAY_SIZE(bvec), PAGE_SIZE);
> -
> - ret = fscache_begin_write_operation(&cres, cookie);
> - if (ret < 0)
> - return ret;
> -
> - ret = cres.ops->prepare_write(&cres, &start, &len, i_size_read(inode),
> - no_space_allocated_yet);
> - if (ret == 0)
> - ret = fscache_write(&cres, page_offset(page), &iter, NULL, NULL);
> - fscache_end_operation(&cres);
> - return ret;
> + put_nfs_open_context(rreq->netfs_priv);
> }
>
> -/*
> - * Retrieve a page from fscache
> - */
> -int __nfs_fscache_read_page(struct inode *inode, struct page *page)
> +static inline int nfs_netfs_begin_cache_operation(struct netfs_io_request *rreq)
> {
> - int ret;
> + return fscache_begin_read_operation(&rreq->cache_resources,
> + netfs_i_cookie(&NFS_I(rreq->inode)->netfs));
> +}
>
> - trace_nfs_fscache_read_page(inode, page);
> - if (PageChecked(page)) {
> - ClearPageChecked(page);
> - ret = 1;
> - goto out;
> - }
> +static struct nfs_netfs_io_data *nfs_netfs_alloc(struct netfs_io_subrequest *sreq)
> +{
> + struct nfs_netfs_io_data *netfs;
> +
> + netfs = kzalloc(sizeof(*netfs), GFP_KERNEL_ACCOUNT);
> + if (!netfs)
> + return NULL;
> + netfs->sreq = sreq;
> + refcount_set(&netfs->refcount, 1);
> + spin_lock_init(&netfs->lock);
> + return netfs;
> +}
>
> - ret = fscache_fallback_read_page(inode, page);
> - if (ret < 0) {
> - nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_READ_FAIL);
> - SetPageChecked(page);
> - goto out;
> +static bool nfs_netfs_clamp_length(struct netfs_io_subrequest *sreq)
> +{
> + size_t rsize = NFS_SB(sreq->rreq->inode->i_sb)->rsize;
> +
> + sreq->len = min(sreq->len, rsize);
> + return true;
> +}
> +
> +static void nfs_netfs_issue_read(struct netfs_io_subrequest *sreq)
> +{
> + struct nfs_pageio_descriptor pgio;
> + struct inode *inode = sreq->rreq->inode;
> + struct nfs_open_context *ctx = sreq->rreq->netfs_priv;
> + struct page *page;
> + int err;
> + pgoff_t start = (sreq->start + sreq->transferred) >> PAGE_SHIFT;
> + pgoff_t last = ((sreq->start + sreq->len -
> + sreq->transferred - 1) >> PAGE_SHIFT);
> + XA_STATE(xas, &sreq->rreq->mapping->i_pages, start);
> +
> + nfs_pageio_init_read(&pgio, inode, false,
> + &nfs_async_read_completion_ops);
> +
> + pgio.pg_netfs = nfs_netfs_alloc(sreq); /* used in completion */
> + if (!pgio.pg_netfs)
> + return netfs_subreq_terminated(sreq, -ENOMEM, false);
> +
> + xas_lock(&xas);
> + xas_for_each(&xas, page, last) {
> + /* nfs_pageio_add_page() may schedule() due to pNFS layout and other RPCs */
> + xas_pause(&xas);
> + xas_unlock(&xas);
> + err = nfs_pageio_add_page(&pgio, ctx, page);
> + if (err < 0)
> + return netfs_subreq_terminated(sreq, err, false);
> + xas_lock(&xas);
> }
> + xas_unlock(&xas);
> + nfs_pageio_complete_read(&pgio);
> + nfs_netfs_put(pgio.pg_netfs);
> +}
> - /* Read completed synchronously */
> - nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_READ_OK);
> - SetPageUptodate(page);
> - ret = 0;
> -out:
> - trace_nfs_fscache_read_page_exit(inode, page, ret);
> - return ret;
> +void nfs_netfs_read_initiate(struct nfs_pgio_header *hdr)
> +{
> + struct nfs_netfs_io_data *netfs = hdr->netfs;
> + struct netfs_io_subrequest *sreq;
> +
> + if (!netfs)
> + return;
> +
> + sreq = netfs->sreq;
> + spin_lock(&netfs->lock);
> + atomic_inc(&netfs->rpcs);
> + netfs->rpc_byte_count += hdr->args.count;
> + spin_unlock(&netfs->lock);
> }
>
> -/*
> - * Store a newly fetched page in fscache. We can be certain there's no page
> - * stored in the cache as yet otherwise we would've read it from there.
> - */
> -void __nfs_fscache_write_page(struct inode *inode, struct page *page)
> +void nfs_netfs_read_done(struct nfs_pgio_header *hdr)
> {
> - int ret;
> + struct nfs_netfs_io_data *netfs = hdr->netfs;
> + struct netfs_io_subrequest *sreq;
> +
> + if (!netfs)
> + return;
>
> - trace_nfs_fscache_write_page(inode, page);
> + sreq = netfs->sreq;
> + spin_lock(&netfs->lock);
> + if (hdr->res.op_status) {
> + /*
> + * Retryable errors such as BAD_STATEID will be re-issued,
> + * so reduce the bytes and the RPC counts.
> + */
> + netfs->rpc_byte_count -= hdr->args.count;
> + atomic_dec(&netfs->rpcs);
> + }
> + spin_unlock(&netfs->lock);
> +}
>
> - ret = fscache_fallback_write_page(inode, page, true);
> +void nfs_netfs_read_completion(struct nfs_pgio_header *hdr)
> +{
> + struct nfs_netfs_io_data *netfs = hdr->netfs;
> + struct netfs_io_subrequest *sreq;
>
> - if (ret != 0) {
> - nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_WRITTEN_FAIL);
> - nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_UNCACHED);
> - } else {
> - nfs_inc_fscache_stats(inode, NFSIOS_FSCACHE_PAGES_WRITTEN_OK);
> + if (!netfs)
> + return;
> +
> + sreq = netfs->sreq;
> + if (test_bit(NFS_IOHDR_EOF, &hdr->flags))
> + __set_bit(NETFS_SREQ_CLEAR_TAIL, &sreq->flags);
> +
> + spin_lock(&netfs->lock);
> + if (hdr->error)
> + netfs->error = hdr->error;
> + else
> + netfs->transferred += hdr->res.count;
> + spin_unlock(&netfs->lock);
> +
> + /* Only the last RPC completion should call netfs_subreq_terminated() */
> + if (atomic_dec_and_test(&netfs->rpcs) &&
> + (netfs->rpc_byte_count >= sreq->len)) {
> + netfs_subreq_terminated(sreq, netfs->error ?: netfs->transferred, false);
> + nfs_netfs_put(netfs);
> + hdr->netfs = NULL;
> }
> - trace_nfs_fscache_write_page_exit(inode, page, ret);
> }
> +
> +const struct netfs_request_ops nfs_netfs_ops = {
> + .init_request = nfs_netfs_init_request,
> + .free_request = nfs_netfs_free_request,
> + .begin_cache_operation = nfs_netfs_begin_cache_operation,
> + .issue_read = nfs_netfs_issue_read,
> + .clamp_length = nfs_netfs_clamp_length
> +};
> diff --git a/fs/nfs/fscache.h b/fs/nfs/fscache.h
> index 38614ed8f951..c59a82a7d4a8 100644
> --- a/fs/nfs/fscache.h
> +++ b/fs/nfs/fscache.h
> @@ -34,6 +34,44 @@ struct nfs_fscache_inode_auxdata {
> u64 change_attr;
> };
>
> +struct nfs_netfs_io_data {
> + refcount_t refcount;
> + struct netfs_io_subrequest *sreq;
> +
> + /*
> > + * NFS may split a netfs_io_subrequest into multiple RPCs, each
> > + * with its own read completion. In netfs, we can only call
> + * netfs_subreq_terminated() once for each subrequest. So we
> + * must keep track of the rpcs and rpc_byte_count for what has
> + * been submitted, and only call netfs via netfs_subreq_terminated()
> + * when the final RPC has completed.
> + */
> + atomic_t rpcs;
> + unsigned long rpc_byte_count;
> +
> + /*
> > + * Final disposition of the netfs_io_subrequest, sent in
> > + * netfs_subreq_terminated()
> + */
> + spinlock_t lock;
> + ssize_t transferred;
> + int error;
> +};
> +
> +static inline void nfs_netfs_get(struct nfs_netfs_io_data *netfs)
> +{
> + refcount_inc(&netfs->refcount);
> +}
> +
> +static inline void nfs_netfs_put(struct nfs_netfs_io_data *netfs)
> +{
> + if (refcount_dec_and_test(&netfs->refcount))
> + kfree(netfs);
> +}
> +extern void nfs_netfs_read_initiate(struct nfs_pgio_header *hdr);
> +extern void nfs_netfs_read_done(struct nfs_pgio_header *hdr);
> +extern void nfs_netfs_read_completion(struct nfs_pgio_header *hdr);
> +
> /*
> * fscache.c
> */
> @@ -45,43 +83,17 @@ extern void nfs_fscache_clear_inode(struct inode *);
> extern void nfs_fscache_open_file(struct inode *, struct file *);
> extern void nfs_fscache_release_file(struct inode *, struct file *);
>
> -extern int __nfs_fscache_read_page(struct inode *, struct page *);
> -extern void __nfs_fscache_write_page(struct inode *, struct page *);
> -
> static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
> {
> if (folio_test_fscache(folio)) {
> if (current_is_kswapd() || !(gfp & __GFP_FS))
> return false;
> folio_wait_fscache(folio);
> - fscache_note_page_release(netfs_i_cookie(&NFS_I(folio->mapping->host)->netfs));
> - nfs_inc_fscache_stats(folio->mapping->host,
> - NFSIOS_FSCACHE_PAGES_UNCACHED);
> }
> + fscache_note_page_release(netfs_i_cookie(&NFS_I(folio->mapping->host)->netfs));
> return true;
> }
>
> -/*
> - * Retrieve a page from an inode data storage object.
> - */
> -static inline int nfs_fscache_read_page(struct inode *inode, struct page *page)
> -{
> - if (netfs_inode(inode)->cache)
> - return __nfs_fscache_read_page(inode, page);
> - return -ENOBUFS;
> -}
> -
> -/*
> - * Store a page newly fetched from the server in an inode data storage object
> - * in the cache.
> - */
> -static inline void nfs_fscache_write_page(struct inode *inode,
> - struct page *page)
> -{
> - if (netfs_inode(inode)->cache)
> - __nfs_fscache_write_page(inode, page);
> -}
> -
> static inline void nfs_fscache_update_auxdata(struct nfs_fscache_inode_auxdata *auxdata,
> struct inode *inode)
> {
> @@ -130,11 +142,6 @@ static inline bool nfs_fscache_release_folio(struct folio *folio, gfp_t gfp)
> {
> return true; /* may release folio */
> }
> -static inline int nfs_fscache_read_page(struct inode *inode, struct page *page)
> -{
> - return -ENOBUFS;
> -}
> -static inline void nfs_fscache_write_page(struct inode *inode, struct page *page) {}
> static inline void nfs_fscache_invalidate(struct inode *inode, int flags) {}
>
> static inline const char *nfs_server_fscache_state(struct nfs_server *server)
> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> index 7d1b9e39863c..de5eaba5b3f1 100644
> --- a/fs/nfs/inode.c
> +++ b/fs/nfs/inode.c
> @@ -2248,6 +2248,9 @@ struct inode *nfs_alloc_inode(struct super_block *sb)
> #endif /* CONFIG_NFS_V4 */
> #ifdef CONFIG_NFS_V4_2
> nfsi->xattr_cache = NULL;
> +#endif
> +#ifdef CONFIG_NFS_FSCACHE
> + netfs_inode_init(&nfsi->netfs, &nfs_netfs_ops);
> #endif
> return NFSI_TO_INODE(nfsi);
> }
> diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
> index 78d945f5f97f..6b1870382cea 100644
> --- a/fs/nfs/internal.h
> +++ b/fs/nfs/internal.h
> @@ -453,6 +453,10 @@ extern void nfs_sb_deactive(struct super_block *sb);
> extern int nfs_client_for_each_server(struct nfs_client *clp,
> int (*fn)(struct nfs_server *, void *),
> void *data);
> +#ifdef CONFIG_NFS_FSCACHE
> +extern const struct netfs_request_ops nfs_netfs_ops;
> +#endif
> +
> /* io.c */
> extern void nfs_start_io_read(struct inode *inode);
> extern void nfs_end_io_read(struct inode *inode);
> @@ -482,9 +486,14 @@ extern int nfs4_get_rootfh(struct nfs_server *server, struct nfs_fh *mntfh, bool
>
> struct nfs_pgio_completion_ops;
> /* read.c */
> +extern const struct nfs_pgio_completion_ops nfs_async_read_completion_ops;
> extern void nfs_pageio_init_read(struct nfs_pageio_descriptor *pgio,
> struct inode *inode, bool force_mds,
> const struct nfs_pgio_completion_ops *compl_ops);
> +extern int nfs_pageio_add_page(struct nfs_pageio_descriptor *pgio,
> + struct nfs_open_context *ctx,
> + struct page *page);
> +extern void nfs_pageio_complete_read(struct nfs_pageio_descriptor *pgio);
> extern void nfs_read_prepare(struct rpc_task *task, void *calldata);
> extern void nfs_pageio_reset_read_mds(struct nfs_pageio_descriptor *pgio);
>
> diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
> index 317cedfa52bf..600989332a6f 100644
> --- a/fs/nfs/pagelist.c
> +++ b/fs/nfs/pagelist.c
> @@ -25,6 +25,7 @@
> #include "internal.h"
> #include "pnfs.h"
> #include "nfstrace.h"
> +#include "fscache.h"
>
> #define NFSDBG_FACILITY NFSDBG_PAGECACHE
>
> @@ -68,6 +69,12 @@ void nfs_pgheader_init(struct nfs_pageio_descriptor *desc,
> hdr->good_bytes = mirror->pg_count;
> hdr->io_completion = desc->pg_io_completion;
> hdr->dreq = desc->pg_dreq;
> +#ifdef CONFIG_NFS_FSCACHE
> + if (desc->pg_netfs) {
> + hdr->netfs = desc->pg_netfs;
> + nfs_netfs_get(desc->pg_netfs);
> + }
> +#endif
> hdr->release = release;
> hdr->completion_ops = desc->pg_completion_ops;
> if (hdr->completion_ops->init_hdr)
> @@ -846,6 +853,9 @@ void nfs_pageio_init(struct nfs_pageio_descriptor *desc,
> desc->pg_lseg = NULL;
> desc->pg_io_completion = NULL;
> desc->pg_dreq = NULL;
> +#ifdef CONFIG_NFS_FSCACHE
> + desc->pg_netfs = NULL;
> +#endif
> desc->pg_bsize = bsize;
>
> desc->pg_mirror_count = 1;
> @@ -940,6 +950,7 @@ int nfs_generic_pgio(struct nfs_pageio_descriptor *desc,
> /* Set up the argument struct */
> nfs_pgio_rpcsetup(hdr, mirror->pg_count, desc->pg_ioflags, &cinfo);
> desc->pg_rpc_callops = &nfs_pgio_common_ops;
> +
> return 0;
> }
> EXPORT_SYMBOL_GPL(nfs_generic_pgio);
> @@ -1360,6 +1371,9 @@ int nfs_pageio_resend(struct nfs_pageio_descriptor *desc,
>
> desc->pg_io_completion = hdr->io_completion;
> desc->pg_dreq = hdr->dreq;
> +#ifdef CONFIG_NFS_FSCACHE
> + desc->pg_netfs = hdr->netfs;
> +#endif
> list_splice_init(&hdr->pages, &pages);
> while (!list_empty(&pages)) {
> struct nfs_page *req = nfs_list_entry(pages.next);
> diff --git a/fs/nfs/read.c b/fs/nfs/read.c
> index 525e82ea9a9e..3bc48472f207 100644
> --- a/fs/nfs/read.c
> +++ b/fs/nfs/read.c
> @@ -30,7 +30,7 @@
>
> #define NFSDBG_FACILITY NFSDBG_PAGECACHE
>
> -static const struct nfs_pgio_completion_ops nfs_async_read_completion_ops;
> +const struct nfs_pgio_completion_ops nfs_async_read_completion_ops;
> static const struct nfs_rw_ops nfs_rw_read_ops;
>
> static struct kmem_cache *nfs_rdata_cachep;
> @@ -74,7 +74,7 @@ void nfs_pageio_init_read(struct nfs_pageio_descriptor *pgio,
> }
> EXPORT_SYMBOL_GPL(nfs_pageio_init_read);
>
> -static void nfs_pageio_complete_read(struct nfs_pageio_descriptor *pgio)
> +void nfs_pageio_complete_read(struct nfs_pageio_descriptor *pgio)
> {
> struct nfs_pgio_mirror *pgm;
> unsigned long npages;
> @@ -110,20 +110,25 @@ EXPORT_SYMBOL_GPL(nfs_pageio_reset_read_mds);
>
> static void nfs_readpage_release(struct nfs_page *req, int error)
> {
> - struct inode *inode = d_inode(nfs_req_openctx(req)->dentry);
> struct page *page = req->wb_page;
>
> - dprintk("NFS: read done (%s/%llu %d@%lld)\n", inode->i_sb->s_id,
> - (unsigned long long)NFS_FILEID(inode), req->wb_bytes,
> - (long long)req_offset(req));
> -
> if (nfs_error_is_fatal_on_server(error) && error != -ETIMEDOUT)
> SetPageError(page);
> if (nfs_page_group_sync_on_bit(req, PG_UNLOCKPAGE)) {
> - if (PageUptodate(page))
> - nfs_fscache_write_page(inode, page);
> +#ifdef CONFIG_NFS_FSCACHE
> + struct inode *inode = d_inode(nfs_req_openctx(req)->dentry);
> +
> + /*
> + * If fscache is enabled, netfs will unlock pages.
> + * Otherwise, we have to unlock the page here
> + */
> + if (!netfs_inode(inode)->cache)
> + unlock_page(page);
> +#else
> unlock_page(page);
> +#endif
> }
> +
> nfs_release_request(req);
> }
>
> @@ -177,6 +182,10 @@ static void nfs_read_completion(struct nfs_pgio_header *hdr)
> nfs_list_remove_request(req);
> nfs_readpage_release(req, error);
> }
> +#ifdef CONFIG_NFS_FSCACHE
> + nfs_netfs_read_completion(hdr);
> +#endif
> +
> out:
> hdr->release(hdr);
> }
> @@ -187,6 +196,9 @@ static void nfs_initiate_read(struct nfs_pgio_header *hdr,
> struct rpc_task_setup *task_setup_data, int how)
> {
> rpc_ops->read_setup(hdr, msg);
> +#ifdef CONFIG_NFS_FSCACHE
> + nfs_netfs_read_initiate(hdr);
> +#endif
> trace_nfs_initiate_read(hdr);
> }
>
> @@ -202,7 +214,7 @@ nfs_async_read_error(struct list_head *head, int error)
> }
> }
>
> -static const struct nfs_pgio_completion_ops nfs_async_read_completion_ops = {
> +const struct nfs_pgio_completion_ops nfs_async_read_completion_ops = {
> .error_cleanup = nfs_async_read_error,
> .completion = nfs_read_completion,
> };
> @@ -219,6 +231,9 @@ static int nfs_readpage_done(struct rpc_task *task,
> if (status != 0)
> return status;
>
> +#ifdef CONFIG_NFS_FSCACHE
> + nfs_netfs_read_done(hdr);
> +#endif
> nfs_add_stats(inode, NFSIOS_SERVERREADBYTES, hdr->res.count);
> trace_nfs_readpage_done(task, hdr);
>
> @@ -294,12 +309,6 @@ nfs_pageio_add_page(struct nfs_pageio_descriptor *pgio,
>
> aligned_len = min_t(unsigned int, ALIGN(len, rsize), PAGE_SIZE);
>
> - if (!IS_SYNC(page->mapping->host)) {
> - error = nfs_fscache_read_page(page->mapping->host, page);
> - if (error == 0)
> - goto out_unlock;
> - }
> -
> new = nfs_create_request(ctx, page, 0, aligned_len);
> if (IS_ERR(new))
> goto out_error;
> @@ -315,8 +324,6 @@ nfs_pageio_add_page(struct nfs_pageio_descriptor *pgio,
> return 0;
> out_error:
> error = PTR_ERR(new);
> -out_unlock:
> - unlock_page(page);
> out:
> return error;
> }
> @@ -355,6 +362,12 @@ int nfs_read_folio(struct file *file, struct folio *folio)
> if (NFS_STALE(inode))
> goto out_unlock;
>
> +#ifdef CONFIG_NFS_FSCACHE
> + if (netfs_inode(inode)->cache) {
> + ret = netfs_read_folio(file, folio);
> + goto out;
> + }
> +#endif
> if (file == NULL) {
> ret = -EBADF;
> ctx = nfs_find_open_context(inode, NULL, FMODE_READ);
> @@ -368,8 +381,10 @@ int nfs_read_folio(struct file *file, struct folio *folio)
> &nfs_async_read_completion_ops);
>
> ret = nfs_pageio_add_page(&pgio, ctx, page);
> - if (ret)
> - goto out;
> + if (ret) {
> + put_nfs_open_context(ctx);
> + goto out_unlock;
> + }
>
> nfs_pageio_complete_read(&pgio);
> ret = pgio.pg_error < 0 ? pgio.pg_error : 0;
> @@ -378,12 +393,12 @@ int nfs_read_folio(struct file *file, struct folio *folio)
> if (!PageUptodate(page) && !ret)
> ret = xchg(&ctx->error, 0);
> }
> -out:
> put_nfs_open_context(ctx);
> - trace_nfs_aop_readpage_done(inode, page, ret);
> - return ret;
> + goto out;
> +
> out_unlock:
> unlock_page(page);
> +out:
> trace_nfs_aop_readpage_done(inode, page, ret);
> return ret;
> }
> @@ -405,6 +420,13 @@ void nfs_readahead(struct readahead_control *ractl)
> if (NFS_STALE(inode))
> goto out;
>
> +#ifdef CONFIG_NFS_FSCACHE
> + if (netfs_inode(inode)->cache) {
> + netfs_readahead(ractl);
> + ret = 0;
> + goto out;
> + }
> +#endif
> if (file == NULL) {
> ret = -EBADF;
> ctx = nfs_find_open_context(inode, NULL, FMODE_READ);
> diff --git a/include/linux/nfs_page.h b/include/linux/nfs_page.h
> index ba7e2e4b0926..8eeb16d9bacd 100644
> --- a/include/linux/nfs_page.h
> +++ b/include/linux/nfs_page.h
> @@ -101,6 +101,9 @@ struct nfs_pageio_descriptor {
> struct pnfs_layout_segment *pg_lseg;
> struct nfs_io_completion *pg_io_completion;
> struct nfs_direct_req *pg_dreq;
> +#ifdef CONFIG_NFS_FSCACHE
> + void *pg_netfs;
> +#endif
> unsigned int pg_bsize; /* default bsize for mirrors */
>
> u32 pg_mirror_count;
> diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
> index e86cf6642d21..e196ef595908 100644
> --- a/include/linux/nfs_xdr.h
> +++ b/include/linux/nfs_xdr.h
> @@ -1619,6 +1619,9 @@ struct nfs_pgio_header {
> const struct nfs_rw_ops *rw_ops;
> struct nfs_io_completion *io_completion;
> struct nfs_direct_req *dreq;
> +#ifdef CONFIG_NFS_FSCACHE
> + void *netfs;
> +#endif
>
> int pnfs_error;
> int error; /* merge with pnfs_error */

--
Jeff Layton <[email protected]>

2022-08-31 19:11:19

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH v3 3/3] NFS: Convert buffered read paths to use netfs when fscache is enabled

On Wed, Aug 31, 2022 at 02:21:23PM -0400, Jeff Layton wrote:
> > +static int nfs_netfs_init_request(struct netfs_io_request *rreq, struct file *file)
> > {
> > - struct netfs_cache_resources cres;
> > - struct fscache_cookie *cookie = netfs_i_cookie(&NFS_I(inode)->netfs);
> > - struct iov_iter iter;
> > - struct bio_vec bvec[1];
> > - int ret;
> > -
> > - memset(&cres, 0, sizeof(cres));
> > - bvec[0].bv_page = page;
> > - bvec[0].bv_offset = 0;
> > - bvec[0].bv_len = PAGE_SIZE;
> > - iov_iter_bvec(&iter, READ, bvec, ARRAY_SIZE(bvec), PAGE_SIZE);
> > -
> > - ret = fscache_begin_read_operation(&cres, cookie);
> > - if (ret < 0)
> > - return ret;
> > -
> > - ret = fscache_read(&cres, page_offset(page), &iter, NETFS_READ_HOLE_FAIL,
> > - NULL, NULL);
> > - fscache_end_operation(&cres);
> > - return ret;
> > + struct nfs_open_context *ctx;
> > +
> > + if (file == NULL) {
> > + ctx = nfs_find_open_context(rreq->inode, NULL, FMODE_READ);
> > + if (!ctx)
> > + return -ENOMEM;
>
> That error return seems like an odd choice. A NULL return here just
> means that we don't have a suitable open file, not that we're out of
> memory.
>
> I think a NULL file pointer from netfs can only happen in readahead, and
> the comments over readahead_control say:
>
> * @file: The file, used primarily by network filesystems for authentication.
> * May be NULL if invoked internally by the filesystem.
>
> AFAICT though, only f2fs and ext4 invoke it internally.
>
> Maybe instead of doing this, it ought to just throw a WARN if we get a
> NULL file pointer and return -EINVAL or something?
>
> Willy, am I correct on when ractl->file can be NULL?

Yes. Just to quickly verify it:

$ git grep -w DEFINE_READAHEAD
fs/ext4/verity.c: DEFINE_READAHEAD(ractl, NULL, NULL, inode->i_mapping, index);
fs/f2fs/file.c: DEFINE_READAHEAD(ractl, NULL, NULL, inode->i_mapping, page_idx);
fs/f2fs/verity.c: DEFINE_READAHEAD(ractl, NULL, NULL, inode->i_mapping, index);
fs/netfs/buffered_read.c: DEFINE_READAHEAD(ractl, file, NULL, mapping, index);
fs/verity/enable.c: DEFINE_READAHEAD(ractl, file, ra, file->f_mapping, index);
include/linux/pagemap.h:#define DEFINE_READAHEAD(ractl, f, r, m, i) \
include/linux/pagemap.h: DEFINE_READAHEAD(ractl, file, ra, mapping, index);
include/linux/pagemap.h: DEFINE_READAHEAD(ractl, file, ra, mapping, index);
mm/filemap.c: DEFINE_READAHEAD(ractl, file, &file->f_ra, mapping, folio->index);
mm/filemap.c: DEFINE_READAHEAD(ractl, file, ra, mapping, vmf->pgoff);
mm/filemap.c: DEFINE_READAHEAD(ractl, file, ra, file->f_mapping, vmf->pgoff);
mm/internal.h: DEFINE_READAHEAD(ractl, file, &file->f_ra, mapping, index);

Those two uses in pagemap.h are wrappers, so we need to check their
callers too:

$ git grep 'page_cache_\(a\)*sync_readahead'
mm/filemap.c: page_cache_sync_readahead(mapping, ra, filp, index,
mm/khugepaged.c: page_cache_sync_readahead(mapping, &file->f_ra,
(ignoring the ones inside filesystems)

So yes, they all pass in a real struct file. I wouldn't even check
whether the file pointer is NULL; just assume that it's not and the
crash will be obvious to debug.