2023-11-17 21:17:02

by David Howells

Subject: [PATCH v2 00/51] netfs, afs, cifs: Delegate high-level I/O to netfslib

Hi Jeff, Steve,

I have been working on my netfslib helpers to the point that I can run
xfstests on AFS to completion (both with write-back buffering and, with a
small patch, write-through buffering in the pagecache). I can also run a
certain amount of xfstests on CIFS, though that requires some more
debugging. However, this seems like a good time to post a preview of the
patches.

The patches remove a little over 800 lines from AFS and over 2000 from
CIFS, albeit with around 3000 lines added to netfs. Hopefully, I will be
able to remove a bunch of lines from 9P and Ceph too.

The main aims of these patches are to get high-level I/O and knowledge of
the pagecache out of the filesystem drivers as much as possible and to get
rid, as far as possible, of the drivers' knowledge that pages/folios exist.

Further, I would like to see ->write_begin, ->write_end and ->launder_folio
go away.

Features added by these patches on top of what is already in netfslib:

(1) NFS-style (and Ceph-style) locking around DIO vs buffered I/O calls to
prevent these from happening at the same time. mmap'd I/O can, of
necessity, happen at any time, ignoring these locks.

(2) Support for unbuffered I/O. The data is kept in the bounce buffer and
the pagecache is not used. This can be turned on with an inode flag.

(3) Support for direct I/O. This is basically unbuffered I/O with some
extra restrictions and no RMW.

(4) Support for using a bounce buffer in an operation. The bounce buffer
may be bigger than the target data/buffer, allowing for crypto
rounding.

(5) Support for content encryption. This isn't supported yet by AFS/CIFS
but is aimed initially at Ceph.

(6) ->write_begin() and ->write_end() are ignored in favour of merging all
of that into one function, netfs_perform_write(), thereby avoiding the
function pointer traversals.

(7) Support for write-through caching in the pagecache.
netfs_perform_write() adds the pages it modifies to an I/O operation
as it goes and marks them directly as writeback rather than dirty. When
writing back from write-through, it limits the range written back.
This should allow CIFS to deal with byte-range mandatory locks
correctly.

(8) O_*SYNC and RWF_*SYNC writes use write-through rather than writing to
the pagecache and then flushing afterwards. An AIO O_*SYNC write will
signal completion once all of its sub-writes have completed.

(9) Support for write-streaming where modified data is held in !uptodate
folios, with a private struct attached indicating the range that is
valid (see the sketch after this list).

(10) Support for write grouping, multiplexing a pointer to a group in the
folio private data with the write-streaming data. The writepages
algorithm only writes back folios that are in the nominated group. This
is intended for use by Ceph to write its snaps in order.

(11) Skipping reads for which we know the server could only supply zeros or
EOF (for instance if we've done a local write that leaves a hole in
the file and extends the local inode size).
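
As a rough sketch of what items (9) and (10) imply for folio->private, the
per-folio info might look something like the following. The field names
echo later patches in this series; the exact types and layout shown here
are assumptions for illustration only:

    /* Illustrative only: per-folio private info multiplexing a write
     * group pointer with the write-streaming dirty range.
     */
    struct netfs_folio {
            struct netfs_group *netfs_group; /* Dirty-page group (or NULL) */
            unsigned int dirty_offset;       /* Start of modified region in folio */
            unsigned int dirty_len;          /* Length of modified region */
    };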


General notes:

(1) netfslib now makes use of folio->private, which means the filesystem
can't use it.

(2) Use of fscache is not yet tested. I'm not sure whether to allow a
cache to be used with a write-through write.

(3) The filesystem provides wrappers to call the write helpers, allowing
it to do pre-validation, oplock/capability fetching and the passing in
of write group info.

(4) I want to try flushing the data when tearing down an inode, before
invalidating it, to try to render launder_folio unnecessary.

(5) Write-through caching will generate and dispatch write subrequests as
it gathers enough data to hit wsize and has whole pages that at least
span that size. This needs to be a bit more flexible, allowing a
filesystem such as CIFS to have a variable wsize.

(6) The filesystem driver is just given read and write calls with an
iov_iter describing the data/buffer to use. Ideally, they don't see
pages or folios at all. A function, extract_iter_to_sg(), is already
available to decant part of an iterator into a scatterlist for crypto
purposes.
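
To make (6) concrete, an issue_read implementation ends up looking roughly
like the sketch below: it only sees subreq->io_iter and never touches
pages or folios directly. The myfs_* names are invented for illustration;
netfs_subreq_terminated() is the existing completion call:

    static void myfs_issue_read(struct netfs_io_subrequest *subreq)
    {
            struct netfs_io_request *rreq = subreq->rreq;
            ssize_t ret;

            /* Fetch subreq->len bytes at subreq->start from the server,
             * landing the data wherever subreq->io_iter points (pagecache,
             * bounce buffer or a DIO buffer - the netfs doesn't care).
             */
            ret = myfs_fetch_data(rreq->inode, subreq->start, &subreq->io_iter);

            netfs_subreq_terminated(subreq, ret, false);
    }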


CIFS notes:

(1) CIFS is made to use unbuffered I/O for unbuffered caching modes and
write-through caching for cache=strict.

(2) cifs_init_request() occasionally throws an error that it can't get a
writable file when trying to do writeback.

(3) Apparent file corruption frequently appears in the target file when
cifs_copy_file_range() is used, even though it doesn't use any netfslib
helpers and even if it doesn't overlap with any pages in the
pagecache.

(4) I should be able to turn multipage folio support on in CIFS now.

(5) The then-unused CIFS code is removed in three patches, not one, to
prevent the git patch generator from producing confusing patches in
which it thinks code is being moved around rather than just being
removed.


Changes
=======
ver #2)
- Folded the addition of NETFS_RREQ_NONBLOCK/BLOCKED into first patch that
uses them.
- Folded addition of rsize member into first user.
- Don't set rsize in ceph (yet) and set it in kafs to 256KiB. cifs sets
it dynamically.
- Moved direct_bv next to direct_bv_count in struct netfs_io_request and
labelled it with a __counted_by().
- Passed flags into netfs_xa_store_and_mark() rather than two bools.
- Removed netfs_set_up_buffer() as it wasn't used.

David

Link: https://lore.kernel.org/r/[email protected]/ # v1

David Howells (51):
netfs: Add a procfile to list in-progress requests
netfs: Track the fpos above which the server has no data
netfs: Allow the netfs to make the io (sub)request alloc larger
netfs: Add a ->free_subrequest() op
afs: Don't use folio->private to record partial modification
netfs: Provide invalidate_folio and release_folio calls
netfs: Implement unbuffered/DIO vs buffered I/O locking
netfs: Add iov_iters to (sub)requests to describe various buffers
netfs: Add support for DIO buffering
netfs: Provide tools to create a buffer in an xarray
netfs: Add bounce buffering support
netfs: Add func to calculate pagecount/size-limited span of an
iterator
netfs: Limit subrequest by size or number of segments
netfs: Export netfs_put_subrequest() and some tracepoints
netfs: Extend the netfs_io_*request structs to handle writes
netfs: Add a hook to tell the netfs to update its i_size
netfs: Make netfs_put_request() handle a NULL pointer
fscache: Add a function to begin a cache op from a netfslib request
netfs: Make the refcounting of netfs_begin_read() easier to use
netfs: Prep to use folio->private for write grouping and streaming
write
netfs: Dispatch write requests to process a writeback slice
netfs: Provide func to copy data to pagecache for buffered write
netfs: Make netfs_read_folio() handle streaming-write pages
netfs: Allocate multipage folios in the writepath
netfs: Implement support for unbuffered/DIO read
netfs: Implement unbuffered/DIO write support
netfs: Implement buffered write API
netfs: Allow buffered shared-writeable mmap through
netfs_page_mkwrite()
netfs: Provide netfs_file_read_iter()
netfs: Provide a writepages implementation
netfs: Provide minimum blocksize parameter
netfs: Make netfs_skip_folio_read() take account of blocksize
netfs: Perform content encryption
netfs: Decrypt encrypted content
netfs: Support decryption on unbuffered/DIO read
netfs: Support encryption on unbuffered/DIO write
netfs: Provide a launder_folio implementation
netfs: Implement a write-through caching option
netfs: Rearrange netfs_io_subrequest to put request pointer first
afs: Use the netfs write helpers
cifs: Replace cifs_readdata with a wrapper around netfs_io_subrequest
cifs: Share server EOF pos with netfslib
cifs: Replace cifs_writedata with a wrapper around netfs_io_subrequest
cifs: Use more fields from netfs_io_subrequest
cifs: Make wait_mtu_credits take size_t args
cifs: Implement netfslib hooks
cifs: Move cifs_loose_read_iter() and cifs_file_write_iter() to file.c
cifs: Cut over to using netfslib
cifs: Remove some code that's no longer used, part 1
cifs: Remove some code that's no longer used, part 2
cifs: Remove some code that's no longer used, part 3

fs/9p/vfs_addr.c | 51 +-
fs/afs/file.c | 206 +--
fs/afs/inode.c | 15 +-
fs/afs/internal.h | 66 +-
fs/afs/write.c | 814 +---------
fs/ceph/addr.c | 26 +-
fs/ceph/cache.h | 12 -
fs/fscache/io.c | 42 +
fs/netfs/Makefile | 9 +-
fs/netfs/buffered_read.c | 245 ++-
fs/netfs/buffered_write.c | 1222 ++++++++++++++
fs/netfs/crypto.c | 148 ++
fs/netfs/direct_read.c | 263 +++
fs/netfs/direct_write.c | 359 +++++
fs/netfs/internal.h | 118 ++
fs/netfs/io.c | 325 +++-
fs/netfs/iterator.c | 97 ++
fs/netfs/locking.c | 215 +++
fs/netfs/main.c | 101 ++
fs/netfs/misc.c | 178 +++
fs/netfs/objects.c | 64 +-
fs/netfs/output.c | 485 ++++++
fs/netfs/stats.c | 22 +-
fs/smb/client/Kconfig | 1 +
fs/smb/client/cifsfs.c | 65 +-
fs/smb/client/cifsfs.h | 10 +-
fs/smb/client/cifsglob.h | 59 +-
fs/smb/client/cifsproto.h | 10 +-
fs/smb/client/cifssmb.c | 111 +-
fs/smb/client/file.c | 2904 ++++++----------------------------
fs/smb/client/fscache.c | 109 --
fs/smb/client/fscache.h | 54 -
fs/smb/client/inode.c | 25 +-
fs/smb/client/smb2ops.c | 20 +-
fs/smb/client/smb2pdu.c | 168 +-
fs/smb/client/smb2proto.h | 5 +-
fs/smb/client/trace.h | 144 +-
fs/smb/client/transport.c | 17 +-
include/linux/fscache.h | 6 +
include/linux/netfs.h | 174 +-
include/trace/events/afs.h | 31 -
include/trace/events/netfs.h | 158 +-
mm/filemap.c | 1 +
43 files changed, 5079 insertions(+), 4076 deletions(-)
create mode 100644 fs/netfs/buffered_write.c
create mode 100644 fs/netfs/crypto.c
create mode 100644 fs/netfs/direct_read.c
create mode 100644 fs/netfs/direct_write.c
create mode 100644 fs/netfs/locking.c
create mode 100644 fs/netfs/misc.c
create mode 100644 fs/netfs/output.c


2023-11-17 21:17:35

by David Howells

Subject: [PATCH v2 01/51] netfs: Add a procfile to list in-progress requests

Add a procfile, /proc/fs/netfs/requests, to list in-progress netfslib I/O
requests.

Signed-off-by: David Howells <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/netfs/internal.h | 22 +++++++++++
fs/netfs/main.c | 91 +++++++++++++++++++++++++++++++++++++++++++
fs/netfs/objects.c | 4 +-
include/linux/netfs.h | 6 ++-
4 files changed, 121 insertions(+), 2 deletions(-)

diff --git a/fs/netfs/internal.h b/fs/netfs/internal.h
index 43fac1b14e40..1f067aa96c50 100644
--- a/fs/netfs/internal.h
+++ b/fs/netfs/internal.h
@@ -29,6 +29,28 @@ int netfs_begin_read(struct netfs_io_request *rreq, bool sync);
* main.c
*/
extern unsigned int netfs_debug;
+extern struct list_head netfs_io_requests;
+extern spinlock_t netfs_proc_lock;
+
+#ifdef CONFIG_PROC_FS
+static inline void netfs_proc_add_rreq(struct netfs_io_request *rreq)
+{
+ spin_lock(&netfs_proc_lock);
+ list_add_tail_rcu(&rreq->proc_link, &netfs_io_requests);
+ spin_unlock(&netfs_proc_lock);
+}
+static inline void netfs_proc_del_rreq(struct netfs_io_request *rreq)
+{
+ if (!list_empty(&rreq->proc_link)) {
+ spin_lock(&netfs_proc_lock);
+ list_del_rcu(&rreq->proc_link);
+ spin_unlock(&netfs_proc_lock);
+ }
+}
+#else
+static inline void netfs_proc_add_rreq(struct netfs_io_request *rreq) {}
+static inline void netfs_proc_del_rreq(struct netfs_io_request *rreq) {}
+#endif

/*
* objects.c
diff --git a/fs/netfs/main.c b/fs/netfs/main.c
index 068568702957..21f814eee6af 100644
--- a/fs/netfs/main.c
+++ b/fs/netfs/main.c
@@ -7,6 +7,8 @@

#include <linux/module.h>
#include <linux/export.h>
+#include <linux/proc_fs.h>
+#include <linux/seq_file.h>
#include "internal.h"
#define CREATE_TRACE_POINTS
#include <trace/events/netfs.h>
@@ -18,3 +20,92 @@ MODULE_LICENSE("GPL");
unsigned netfs_debug;
module_param_named(debug, netfs_debug, uint, S_IWUSR | S_IRUGO);
MODULE_PARM_DESC(netfs_debug, "Netfs support debugging mask");
+
+#ifdef CONFIG_PROC_FS
+LIST_HEAD(netfs_io_requests);
+DEFINE_SPINLOCK(netfs_proc_lock);
+
+static const char *netfs_origins[] = {
+ [NETFS_READAHEAD] = "RA",
+ [NETFS_READPAGE] = "RP",
+ [NETFS_READ_FOR_WRITE] = "RW",
+};
+
+/*
+ * Generate a list of I/O requests in /proc/fs/netfs/requests
+ */
+static int netfs_requests_seq_show(struct seq_file *m, void *v)
+{
+ struct netfs_io_request *rreq;
+
+ if (v == &netfs_io_requests) {
+ seq_puts(m,
+ "REQUEST OR REF FL ERR OPS COVERAGE\n"
+ "======== == === == ==== === =========\n"
+ );
+ return 0;
+ }
+
+ rreq = list_entry(v, struct netfs_io_request, proc_link);
+ seq_printf(m,
+ "%08x %s %3d %2lx %4d %3d @%04llx %zx/%zx",
+ rreq->debug_id,
+ netfs_origins[rreq->origin],
+ refcount_read(&rreq->ref),
+ rreq->flags,
+ rreq->error,
+ atomic_read(&rreq->nr_outstanding),
+ rreq->start, rreq->submitted, rreq->len);
+ seq_putc(m, '\n');
+ return 0;
+}
+
+static void *netfs_requests_seq_start(struct seq_file *m, loff_t *_pos)
+ __acquires(rcu)
+{
+ rcu_read_lock();
+ return seq_list_start_head(&netfs_io_requests, *_pos);
+}
+
+static void *netfs_requests_seq_next(struct seq_file *m, void *v, loff_t *_pos)
+{
+ return seq_list_next(v, &netfs_io_requests, _pos);
+}
+
+static void netfs_requests_seq_stop(struct seq_file *m, void *v)
+ __releases(rcu)
+{
+ rcu_read_unlock();
+}
+
+static const struct seq_operations netfs_requests_seq_ops = {
+ .start = netfs_requests_seq_start,
+ .next = netfs_requests_seq_next,
+ .stop = netfs_requests_seq_stop,
+ .show = netfs_requests_seq_show,
+};
+#endif /* CONFIG_PROC_FS */
+
+static int __init netfs_init(void)
+{
+ if (!proc_mkdir("fs/netfs", NULL))
+ goto error;
+
+ if (!proc_create_seq("fs/netfs/requests", S_IFREG | 0444, NULL,
+ &netfs_requests_seq_ops))
+ goto error_proc;
+
+ return 0;
+
+error_proc:
+ remove_proc_entry("fs/netfs", NULL);
+error:
+ return -ENOMEM;
+}
+fs_initcall(netfs_init);
+
+static void __exit netfs_exit(void)
+{
+ remove_proc_entry("fs/netfs", NULL);
+}
+module_exit(netfs_exit);
diff --git a/fs/netfs/objects.c b/fs/netfs/objects.c
index e17cdf53f6a7..85f428fc52e6 100644
--- a/fs/netfs/objects.c
+++ b/fs/netfs/objects.c
@@ -45,6 +45,7 @@ struct netfs_io_request *netfs_alloc_request(struct address_space *mapping,
}
}

+ netfs_proc_add_rreq(rreq);
netfs_stat(&netfs_n_rh_rreq);
return rreq;
}
@@ -76,12 +77,13 @@ static void netfs_free_request(struct work_struct *work)
container_of(work, struct netfs_io_request, work);

trace_netfs_rreq(rreq, netfs_rreq_trace_free);
+ netfs_proc_del_rreq(rreq);
netfs_clear_subrequests(rreq, false);
if (rreq->netfs_ops->free_request)
rreq->netfs_ops->free_request(rreq);
if (rreq->cache_resources.ops)
rreq->cache_resources.ops->end_operation(&rreq->cache_resources);
- kfree(rreq);
+ kfree_rcu(rreq, rcu);
netfs_stat_d(&netfs_n_rh_rreq);
}

diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index b11a84f6c32b..b447cb67f599 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -175,10 +175,14 @@ enum netfs_io_origin {
* operations to a variety of data stores and then stitch the result together.
*/
struct netfs_io_request {
- struct work_struct work;
+ union {
+ struct work_struct work;
+ struct rcu_head rcu;
+ };
struct inode *inode; /* The file being accessed */
struct address_space *mapping; /* The mapping being accessed */
struct netfs_cache_resources cache_resources;
+ struct list_head proc_link; /* Link in netfs_iorequests */
struct list_head subrequests; /* Contributory I/O operations */
void *netfs_priv; /* Private data for the netfs */
unsigned int debug_id;

2023-11-17 21:17:39

by David Howells

Subject: [PATCH v2 06/51] netfs: Provide invalidate_folio and release_folio calls

Provide default invalidate_folio and release_folio calls. These will need
to interact with invalidation correctly at some point. They will be needed
if netfslib is to make use of folio->private for its own purposes.

Signed-off-by: David Howells <[email protected]>
Reviewed-by: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/9p/vfs_addr.c | 33 ++-------------------------
fs/afs/file.c | 53 ++++---------------------------------------
fs/ceph/addr.c | 24 ++------------------
fs/netfs/Makefile | 1 +
fs/netfs/misc.c | 51 +++++++++++++++++++++++++++++++++++++++++
include/linux/netfs.h | 6 +++--
6 files changed, 64 insertions(+), 104 deletions(-)
create mode 100644 fs/netfs/misc.c

diff --git a/fs/9p/vfs_addr.c b/fs/9p/vfs_addr.c
index 8a635999a7d6..18a666c43e4a 100644
--- a/fs/9p/vfs_addr.c
+++ b/fs/9p/vfs_addr.c
@@ -104,35 +104,6 @@ const struct netfs_request_ops v9fs_req_ops = {
.issue_read = v9fs_issue_read,
};

-/**
- * v9fs_release_folio - release the private state associated with a folio
- * @folio: The folio to be released
- * @gfp: The caller's allocation restrictions
- *
- * Returns true if the page can be released, false otherwise.
- */
-
-static bool v9fs_release_folio(struct folio *folio, gfp_t gfp)
-{
- if (folio_test_private(folio))
- return false;
-#ifdef CONFIG_9P_FSCACHE
- if (folio_test_fscache(folio)) {
- if (current_is_kswapd() || !(gfp & __GFP_FS))
- return false;
- folio_wait_fscache(folio);
- }
- fscache_note_page_release(v9fs_inode_cookie(V9FS_I(folio_inode(folio))));
-#endif
- return true;
-}
-
-static void v9fs_invalidate_folio(struct folio *folio, size_t offset,
- size_t length)
-{
- folio_wait_fscache(folio);
-}
-
#ifdef CONFIG_9P_FSCACHE
static void v9fs_write_to_cache_done(void *priv, ssize_t transferred_or_error,
bool was_async)
@@ -355,8 +326,8 @@ const struct address_space_operations v9fs_addr_operations = {
.writepage = v9fs_vfs_writepage,
.write_begin = v9fs_write_begin,
.write_end = v9fs_write_end,
- .release_folio = v9fs_release_folio,
- .invalidate_folio = v9fs_invalidate_folio,
+ .release_folio = netfs_release_folio,
+ .invalidate_folio = netfs_invalidate_folio,
.launder_folio = v9fs_launder_folio,
.direct_IO = v9fs_direct_IO,
};
diff --git a/fs/afs/file.c b/fs/afs/file.c
index 0c49b3b6f214..3fea5cd8ef13 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -20,9 +20,6 @@

static int afs_file_mmap(struct file *file, struct vm_area_struct *vma);
static int afs_symlink_read_folio(struct file *file, struct folio *folio);
-static void afs_invalidate_folio(struct folio *folio, size_t offset,
- size_t length);
-static bool afs_release_folio(struct folio *folio, gfp_t gfp_flags);

static ssize_t afs_file_read_iter(struct kiocb *iocb, struct iov_iter *iter);
static ssize_t afs_file_splice_read(struct file *in, loff_t *ppos,
@@ -57,8 +54,8 @@ const struct address_space_operations afs_file_aops = {
.readahead = netfs_readahead,
.dirty_folio = afs_dirty_folio,
.launder_folio = afs_launder_folio,
- .release_folio = afs_release_folio,
- .invalidate_folio = afs_invalidate_folio,
+ .release_folio = netfs_release_folio,
+ .invalidate_folio = netfs_invalidate_folio,
.write_begin = afs_write_begin,
.write_end = afs_write_end,
.writepages = afs_writepages,
@@ -67,8 +64,8 @@ const struct address_space_operations afs_file_aops = {

const struct address_space_operations afs_symlink_aops = {
.read_folio = afs_symlink_read_folio,
- .release_folio = afs_release_folio,
- .invalidate_folio = afs_invalidate_folio,
+ .release_folio = netfs_release_folio,
+ .invalidate_folio = netfs_invalidate_folio,
.migrate_folio = filemap_migrate_folio,
};

@@ -405,48 +402,6 @@ int afs_write_inode(struct inode *inode, struct writeback_control *wbc)
return 0;
}

-/*
- * invalidate part or all of a page
- * - release a page and clean up its private data if offset is 0 (indicating
- * the entire page)
- */
-static void afs_invalidate_folio(struct folio *folio, size_t offset,
- size_t length)
-{
- _enter("{%lu},%zu,%zu", folio->index, offset, length);
-
- folio_wait_fscache(folio);
- _leave("");
-}
-
-/*
- * release a page and clean up its private state if it's not busy
- * - return true if the page can now be released, false if not
- */
-static bool afs_release_folio(struct folio *folio, gfp_t gfp)
-{
- struct afs_vnode *vnode = AFS_FS_I(folio_inode(folio));
-
- _enter("{{%llx:%llu}[%lu],%lx},%x",
- vnode->fid.vid, vnode->fid.vnode, folio_index(folio), folio->flags,
- gfp);
-
- /* deny if folio is being written to the cache and the caller hasn't
- * elected to wait */
-#ifdef CONFIG_AFS_FSCACHE
- if (folio_test_fscache(folio)) {
- if (current_is_kswapd() || !(gfp & __GFP_FS))
- return false;
- folio_wait_fscache(folio);
- }
- fscache_note_page_release(afs_vnode_cache(vnode));
-#endif
-
- /* Indicate that the folio can be released */
- _leave(" = T");
- return true;
-}
-
static void afs_add_open_mmap(struct afs_vnode *vnode)
{
if (atomic_inc_return(&vnode->cb_nr_mmap) == 1) {
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 85be3bf18cdf..03feb4dc6352 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -159,27 +159,7 @@ static void ceph_invalidate_folio(struct folio *folio, size_t offset,
ceph_put_snap_context(snapc);
}

- folio_wait_fscache(folio);
-}
-
-static bool ceph_release_folio(struct folio *folio, gfp_t gfp)
-{
- struct inode *inode = folio->mapping->host;
- struct ceph_client *cl = ceph_inode_to_client(inode);
-
- doutc(cl, "%llx.%llx idx %lu (%sdirty)\n", ceph_vinop(inode),
- folio->index, folio_test_dirty(folio) ? "" : "not ");
-
- if (folio_test_private(folio))
- return false;
-
- if (folio_test_fscache(folio)) {
- if (current_is_kswapd() || !(gfp & __GFP_FS))
- return false;
- folio_wait_fscache(folio);
- }
- ceph_fscache_note_page_release(inode);
- return true;
+ netfs_invalidate_folio(folio, offset, length);
}

static void ceph_netfs_expand_readahead(struct netfs_io_request *rreq)
@@ -1586,7 +1566,7 @@ const struct address_space_operations ceph_aops = {
.write_end = ceph_write_end,
.dirty_folio = ceph_dirty_folio,
.invalidate_folio = ceph_invalidate_folio,
- .release_folio = ceph_release_folio,
+ .release_folio = netfs_release_folio,
.direct_IO = noop_direct_IO,
};

diff --git a/fs/netfs/Makefile b/fs/netfs/Makefile
index 386d6fb92793..cd22554d9048 100644
--- a/fs/netfs/Makefile
+++ b/fs/netfs/Makefile
@@ -5,6 +5,7 @@ netfs-y := \
io.o \
iterator.o \
main.o \
+ misc.o \
objects.o

netfs-$(CONFIG_NETFS_STATS) += stats.o
diff --git a/fs/netfs/misc.c b/fs/netfs/misc.c
new file mode 100644
index 000000000000..c3baf2b247d9
--- /dev/null
+++ b/fs/netfs/misc.c
@@ -0,0 +1,51 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Miscellaneous routines.
+ *
+ * Copyright (C) 2022 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells ([email protected])
+ */
+
+#include <linux/swap.h>
+#include "internal.h"
+
+/**
+ * netfs_invalidate_folio - Invalidate or partially invalidate a folio
+ * @folio: Folio proposed for release
+ * @offset: Offset of the invalidated region
+ * @length: Length of the invalidated region
+ *
+ * Invalidate part or all of a folio for a network filesystem. The folio will
+ * be removed afterwards if the invalidated region covers the entire folio.
+ */
+void netfs_invalidate_folio(struct folio *folio, size_t offset, size_t length)
+{
+ _enter("{%lx},%zx,%zx", folio_index(folio), offset, length);
+
+ folio_wait_fscache(folio);
+}
+EXPORT_SYMBOL(netfs_invalidate_folio);
+
+/**
+ * netfs_release_folio - Try to release a folio
+ * @folio: Folio proposed for release
+ * @gfp: Flags qualifying the release
+ *
+ * Request release of a folio and clean up its private state if it's not busy.
+ * Returns true if the folio can now be released, false if not
+ */
+bool netfs_release_folio(struct folio *folio, gfp_t gfp)
+{
+ struct netfs_inode *ctx = netfs_inode(folio_inode(folio));
+
+ if (folio_test_private(folio))
+ return false;
+ if (folio_test_fscache(folio)) {
+ if (current_is_kswapd() || !(gfp & __GFP_FS))
+ return false;
+ folio_wait_fscache(folio);
+ }
+
+ fscache_note_page_release(netfs_i_cookie(ctx));
+ return true;
+}
+EXPORT_SYMBOL(netfs_release_folio);
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 0633cd9644e1..6e662832c3ae 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -297,8 +297,10 @@ struct readahead_control;
void netfs_readahead(struct readahead_control *);
int netfs_read_folio(struct file *, struct folio *);
int netfs_write_begin(struct netfs_inode *, struct file *,
- struct address_space *, loff_t pos, unsigned int len,
- struct folio **, void **fsdata);
+ struct address_space *, loff_t pos, unsigned int len,
+ struct folio **, void **fsdata);
+void netfs_invalidate_folio(struct folio *folio, size_t offset, size_t length);
+bool netfs_release_folio(struct folio *folio, gfp_t gfp);

void netfs_subreq_terminated(struct netfs_io_subrequest *, ssize_t, bool);
void netfs_get_subrequest(struct netfs_io_subrequest *subreq,

2023-11-17 21:17:40

by David Howells

Subject: [PATCH v2 10/51] netfs: Provide tools to create a buffer in an xarray

Provide tools to create a buffer in an xarray, with a function to add new
folios with a mark. This will be used to create a bounce buffer and can
also more easily be used to create a list of folios whose span would
require more than a page's worth of bio_vec structs.
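
As a rough usage sketch (not part of this patch, and internal to netfslib),
a caller might build and tear down such a buffer like this; the folio
indices and gfp choice are purely illustrative:

    static int example_make_buffer(struct address_space *mapping,
                                   struct xarray *buffer)
    {
            int ret;

            xa_init(buffer);

            /* Populate folio indices 0..3 inclusive, marking each folio
             * NETFS_BUF_PUT_MARK so that netfs_clear_buffer() puts it.
             */
            ret = netfs_add_folios_to_buffer(buffer, mapping, 0, 3, GFP_KERNEL);
            if (ret < 0)
                    netfs_clear_buffer(buffer);

            /* On success, point an ITER_XARRAY iterator at the buffer, do
             * the I/O, then call netfs_clear_buffer() when finished.
             */
            return ret;
    }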

Signed-off-by: David Howells <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/netfs/internal.h | 13 +++++++
fs/netfs/misc.c | 81 +++++++++++++++++++++++++++++++++++++++++++
include/linux/netfs.h | 4 +++
3 files changed, 98 insertions(+)

diff --git a/fs/netfs/internal.h b/fs/netfs/internal.h
index 1f067aa96c50..21a47f118009 100644
--- a/fs/netfs/internal.h
+++ b/fs/netfs/internal.h
@@ -52,6 +52,19 @@ static inline void netfs_proc_add_rreq(struct netfs_io_request *rreq) {}
static inline void netfs_proc_del_rreq(struct netfs_io_request *rreq) {}
#endif

+/*
+ * misc.c
+ */
+#define NETFS_FLAG_PUT_MARK BIT(0)
+#define NETFS_FLAG_PAGECACHE_MARK BIT(1)
+int netfs_xa_store_and_mark(struct xarray *xa, unsigned long index,
+ struct folio *folio, unsigned int flags,
+ gfp_t gfp_mask);
+int netfs_add_folios_to_buffer(struct xarray *buffer,
+ struct address_space *mapping,
+ pgoff_t index, pgoff_t to, gfp_t gfp_mask);
+void netfs_clear_buffer(struct xarray *buffer);
+
/*
* objects.c
*/
diff --git a/fs/netfs/misc.c b/fs/netfs/misc.c
index c3baf2b247d9..106f2fbdccd8 100644
--- a/fs/netfs/misc.c
+++ b/fs/netfs/misc.c
@@ -8,6 +8,87 @@
#include <linux/swap.h>
#include "internal.h"

+/*
+ * Attach a folio to the buffer and maybe set marks on it to say that we need
+ * to put the folio later and twiddle the pagecache flags.
+ */
+int netfs_xa_store_and_mark(struct xarray *xa, unsigned long index,
+ struct folio *folio, unsigned int flags,
+ gfp_t gfp_mask)
+{
+ XA_STATE_ORDER(xas, xa, index, folio_order(folio));
+
+retry:
+ xas_lock(&xas);
+ for (;;) {
+ xas_store(&xas, folio);
+ if (!xas_error(&xas))
+ break;
+ xas_unlock(&xas);
+ if (!xas_nomem(&xas, gfp_mask))
+ return xas_error(&xas);
+ goto retry;
+ }
+
+ if (flags & NETFS_FLAG_PUT_MARK)
+ xas_set_mark(&xas, NETFS_BUF_PUT_MARK);
+ if (flags & NETFS_FLAG_PAGECACHE_MARK)
+ xas_set_mark(&xas, NETFS_BUF_PAGECACHE_MARK);
+ xas_unlock(&xas);
+ return xas_error(&xas);
+}
+
+/*
+ * Create the specified range of folios in the buffer attached to the read
+ * request. The folios are marked with NETFS_BUF_PUT_MARK so that we know that
+ * these need freeing later.
+ */
+int netfs_add_folios_to_buffer(struct xarray *buffer,
+ struct address_space *mapping,
+ pgoff_t index, pgoff_t to, gfp_t gfp_mask)
+{
+ struct folio *folio;
+ int ret;
+
+ if (to + 1 == index) /* Page range is inclusive */
+ return 0;
+
+ do {
+ /* TODO: Figure out what order folio can be allocated here */
+ folio = filemap_alloc_folio(readahead_gfp_mask(mapping), 0);
+ if (!folio)
+ return -ENOMEM;
+ folio->index = index;
+ ret = netfs_xa_store_and_mark(buffer, index, folio,
+ NETFS_FLAG_PUT_MARK, gfp_mask);
+ if (ret < 0) {
+ folio_put(folio);
+ return ret;
+ }
+
+ index += folio_nr_pages(folio);
+ } while (index <= to && index != 0);
+
+ return 0;
+}
+
+/*
+ * Clear an xarray buffer, putting a ref on the folios that have
+ * NETFS_BUF_PUT_MARK set.
+ */
+void netfs_clear_buffer(struct xarray *buffer)
+{
+ struct folio *folio;
+ XA_STATE(xas, buffer, 0);
+
+ rcu_read_lock();
+ xas_for_each_marked(&xas, folio, ULONG_MAX, NETFS_BUF_PUT_MARK) {
+ folio_put(folio);
+ }
+ rcu_read_unlock();
+ xa_destroy(buffer);
+}
+
/**
* netfs_invalidate_folio - Invalidate or partially invalidate a folio
* @folio: Folio proposed for release
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 6d820a860052..47270f5d9e89 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -109,6 +109,10 @@ static inline int wait_on_page_fscache_killable(struct page *page)
return folio_wait_private_2_killable(page_folio(page));
}

+/* Marks used on xarray-based buffers */
+#define NETFS_BUF_PUT_MARK XA_MARK_0 /* - Page needs putting */
+#define NETFS_BUF_PAGECACHE_MARK XA_MARK_1 /* - Page needs wb/dirty flag wrangling */
+
enum netfs_io_source {
NETFS_FILL_WITH_ZEROES,
NETFS_DOWNLOAD_FROM_SERVER,

2023-11-17 21:17:44

by David Howells

Subject: [PATCH v2 11/51] netfs: Add bounce buffering support

Add a second xarray struct to netfs_io_request for the purpose of holding
a bounce buffer for when we have to deal with encrypted/compressed data or
have to upload/download data in blocks larger than those we were asked for.
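
A filesystem that needs this (e.g. for content crypto) would opt in
per-request, roughly as in the sketch below; the myfs_* name is invented
and exactly where the flag ends up being set is still open, so treat this
as illustrative only:

    static int myfs_init_request(struct netfs_io_request *rreq, struct file *file)
    {
            /* Ask netfslib to do I/O through rreq->bounce rather than
             * directly to/from rreq->iter (see netfs_begin_read()).
             */
            __set_bit(NETFS_RREQ_USE_BOUNCE_BUFFER, &rreq->flags);
            return 0;
    }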

Signed-off-by: David Howells <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/netfs/io.c | 6 +++++-
fs/netfs/objects.c | 3 +++
include/linux/netfs.h | 2 ++
3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/fs/netfs/io.c b/fs/netfs/io.c
index e9d408e211b8..d8e9cd6ce338 100644
--- a/fs/netfs/io.c
+++ b/fs/netfs/io.c
@@ -643,7 +643,11 @@ int netfs_begin_read(struct netfs_io_request *rreq, bool sync)
return -EIO;
}

- rreq->io_iter = rreq->iter;
+ if (test_bit(NETFS_RREQ_USE_BOUNCE_BUFFER, &rreq->flags))
+ iov_iter_xarray(&rreq->io_iter, ITER_DEST, &rreq->bounce,
+ rreq->start, rreq->len);
+ else
+ rreq->io_iter = rreq->iter;

INIT_WORK(&rreq->work, netfs_rreq_work);

diff --git a/fs/netfs/objects.c b/fs/netfs/objects.c
index 4df5e5eeada6..9f3f33c93317 100644
--- a/fs/netfs/objects.c
+++ b/fs/netfs/objects.c
@@ -35,12 +35,14 @@ struct netfs_io_request *netfs_alloc_request(struct address_space *mapping,
rreq->inode = inode;
rreq->i_size = i_size_read(inode);
rreq->debug_id = atomic_inc_return(&debug_ids);
+ xa_init(&rreq->bounce);
INIT_LIST_HEAD(&rreq->subrequests);
refcount_set(&rreq->ref, 1);
__set_bit(NETFS_RREQ_IN_PROGRESS, &rreq->flags);
if (rreq->netfs_ops->init_request) {
ret = rreq->netfs_ops->init_request(rreq, file);
if (ret < 0) {
+ xa_destroy(&rreq->bounce);
kfree(rreq);
return ERR_PTR(ret);
}
@@ -94,6 +96,7 @@ static void netfs_free_request(struct work_struct *work)
}
kvfree(rreq->direct_bv);
}
+ netfs_clear_buffer(&rreq->bounce);
kfree_rcu(rreq, rcu);
netfs_stat_d(&netfs_n_rh_rreq);
}
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 47270f5d9e89..0bc90c4035a2 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -196,6 +196,7 @@ struct netfs_io_request {
struct iov_iter iter; /* Unencrypted-side iterator */
struct iov_iter io_iter; /* I/O (Encrypted-side) iterator */
void *netfs_priv; /* Private data for the netfs */
+ struct xarray bounce; /* Bounce buffer (eg. for crypto/compression) */
struct bio_vec *direct_bv /* DIO buffer list (when handling iovec-iter) */
__counted_by(direct_bv_count);
unsigned int direct_bv_count; /* Number of elements in direct_bv[] */
@@ -218,6 +219,7 @@ struct netfs_io_request {
#define NETFS_RREQ_DONT_UNLOCK_FOLIOS 3 /* Don't unlock the folios on completion */
#define NETFS_RREQ_FAILED 4 /* The request failed */
#define NETFS_RREQ_IN_PROGRESS 5 /* Unlocked when the request completes */
+#define NETFS_RREQ_USE_BOUNCE_BUFFER 6 /* Use bounce buffer */
const struct netfs_request_ops *netfs_ops;
};


2023-11-17 21:17:46

by David Howells

Subject: [PATCH v2 13/51] netfs: Limit subrequest by size or number of segments

Limit a subrequest to a maximum size and/or a maximum number of contiguous
physical regions. This permits, for instance, a subrequest's iterator to be
limited to the number of DMA'able segments that a large RDMA request can
handle.
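
For example, a filesystem doing RDMA might cap the segment count from its
->clamp_length() hook, something like the sketch below (the myfs_* name
and the constant are made up for illustration):

    static bool myfs_clamp_length(struct netfs_io_subrequest *subreq)
    {
            /* Say the transport can take at most 16 DMA segments per op. */
            subreq->max_nr_segs = 16;
            return true;
    }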

Signed-off-by: David Howells <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/netfs/io.c | 18 ++++++++++++++++++
include/linux/netfs.h | 1 +
include/trace/events/netfs.h | 1 +
3 files changed, 20 insertions(+)

diff --git a/fs/netfs/io.c b/fs/netfs/io.c
index d8e9cd6ce338..c80b8eed1209 100644
--- a/fs/netfs/io.c
+++ b/fs/netfs/io.c
@@ -525,6 +525,7 @@ netfs_rreq_prepare_read(struct netfs_io_request *rreq,
struct iov_iter *io_iter)
{
enum netfs_io_source source;
+ size_t lsize;

_enter("%llx-%llx,%llx", subreq->start, subreq->start + subreq->len, rreq->i_size);

@@ -547,13 +548,30 @@ netfs_rreq_prepare_read(struct netfs_io_request *rreq,
source = NETFS_INVALID_READ;
goto out;
}
+
+ if (subreq->max_nr_segs) {
+ lsize = netfs_limit_iter(io_iter, 0, subreq->len,
+ subreq->max_nr_segs);
+ if (subreq->len > lsize) {
+ subreq->len = lsize;
+ trace_netfs_sreq(subreq, netfs_sreq_trace_limited);
+ }
+ }
}

+ if (subreq->len > rreq->len)
+ pr_warn("R=%08x[%u] SREQ>RREQ %zx > %zx\n",
+ rreq->debug_id, subreq->debug_index,
+ subreq->len, rreq->len);
+
if (WARN_ON(subreq->len == 0)) {
source = NETFS_INVALID_READ;
goto out;
}

+ subreq->source = source;
+ trace_netfs_sreq(subreq, netfs_sreq_trace_prepare);
+
subreq->io_iter = *io_iter;
iov_iter_truncate(&subreq->io_iter, subreq->len);
iov_iter_advance(io_iter, subreq->len);
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index cd673596b411..20ddd46fa0bc 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -163,6 +163,7 @@ struct netfs_io_subrequest {
refcount_t ref;
short error; /* 0 or error that occurred */
unsigned short debug_index; /* Index in list (for debugging output) */
+ unsigned int max_nr_segs; /* 0 or max number of segments in an iterator */
enum netfs_io_source source; /* Where to read from/write to */
unsigned long flags;
#define NETFS_SREQ_COPY_TO_CACHE 0 /* Set if should copy the data to the cache */
diff --git a/include/trace/events/netfs.h b/include/trace/events/netfs.h
index beec534cbaab..fce6d0bc78e5 100644
--- a/include/trace/events/netfs.h
+++ b/include/trace/events/netfs.h
@@ -44,6 +44,7 @@
#define netfs_sreq_traces \
EM(netfs_sreq_trace_download_instead, "RDOWN") \
EM(netfs_sreq_trace_free, "FREE ") \
+ EM(netfs_sreq_trace_limited, "LIMIT") \
EM(netfs_sreq_trace_prepare, "PREP ") \
EM(netfs_sreq_trace_resubmit_short, "SHORT") \
EM(netfs_sreq_trace_submit, "SUBMT") \

2023-11-17 21:17:47

by David Howells

Subject: [PATCH v2 12/51] netfs: Add func to calculate pagecount/size-limited span of an iterator

Add a function to work out how much of an ITER_BVEC or ITER_XARRAY iterator
we can use in a pagecount-limited and size-limited span. This will be
used, for example, to limit the number of segments in a subrequest to the
maximum number of elements that an RDMA transfer can handle.

Signed-off-by: David Howells <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/netfs/iterator.c | 97 +++++++++++++++++++++++++++++++++++++++++++
include/linux/netfs.h | 2 +
2 files changed, 99 insertions(+)

diff --git a/fs/netfs/iterator.c b/fs/netfs/iterator.c
index 2ff07ba655a0..b781bbbf1d8d 100644
--- a/fs/netfs/iterator.c
+++ b/fs/netfs/iterator.c
@@ -101,3 +101,100 @@ ssize_t netfs_extract_user_iter(struct iov_iter *orig, size_t orig_len,
return npages;
}
EXPORT_SYMBOL_GPL(netfs_extract_user_iter);
+
+/*
+ * Select the span of a bvec iterator we're going to use. Limit it by both maximum
+ * size and maximum number of segments. Returns the size of the span in bytes.
+ */
+static size_t netfs_limit_bvec(const struct iov_iter *iter, size_t start_offset,
+ size_t max_size, size_t max_segs)
+{
+ const struct bio_vec *bvecs = iter->bvec;
+ unsigned int nbv = iter->nr_segs, ix = 0, nsegs = 0;
+ size_t len, span = 0, n = iter->count;
+ size_t skip = iter->iov_offset + start_offset;
+
+ if (WARN_ON(!iov_iter_is_bvec(iter)) ||
+ WARN_ON(start_offset > n) ||
+ n == 0)
+ return 0;
+
+ while (n && ix < nbv && skip) {
+ len = bvecs[ix].bv_len;
+ if (skip < len)
+ break;
+ skip -= len;
+ n -= len;
+ ix++;
+ }
+
+ while (n && ix < nbv) {
+ len = min3(n, bvecs[ix].bv_len - skip, max_size);
+ span += len;
+ nsegs++;
+ ix++;
+ if (span >= max_size || nsegs >= max_segs)
+ break;
+ skip = 0;
+ n -= len;
+ }
+
+ return min(span, max_size);
+}
+
+/*
+ * Select the span of an xarray iterator we're going to use. Limit it by both
+ * maximum size and maximum number of segments. It is assumed that segments
+ * can be larger than a page in size, provided they're physically contiguous.
+ * Returns the size of the span in bytes.
+ */
+static size_t netfs_limit_xarray(const struct iov_iter *iter, size_t start_offset,
+ size_t max_size, size_t max_segs)
+{
+ struct folio *folio;
+ unsigned int nsegs = 0;
+ loff_t pos = iter->xarray_start + iter->iov_offset;
+ pgoff_t index = pos / PAGE_SIZE;
+ size_t span = 0, n = iter->count;
+
+ XA_STATE(xas, iter->xarray, index);
+
+ if (WARN_ON(!iov_iter_is_xarray(iter)) ||
+ WARN_ON(start_offset > n) ||
+ n == 0)
+ return 0;
+ max_size = min(max_size, n - start_offset);
+
+ rcu_read_lock();
+ xas_for_each(&xas, folio, ULONG_MAX) {
+ size_t offset, flen, len;
+ if (xas_retry(&xas, folio))
+ continue;
+ if (WARN_ON(xa_is_value(folio)))
+ break;
+ if (WARN_ON(folio_test_hugetlb(folio)))
+ break;
+
+ flen = folio_size(folio);
+ offset = offset_in_folio(folio, pos);
+ len = min(max_size, flen - offset);
+ span += len;
+ nsegs++;
+ if (span >= max_size || nsegs >= max_segs)
+ break;
+ }
+
+ rcu_read_unlock();
+ return min(span, max_size);
+}
+
+size_t netfs_limit_iter(const struct iov_iter *iter, size_t start_offset,
+ size_t max_size, size_t max_segs)
+{
+ if (iov_iter_is_bvec(iter))
+ return netfs_limit_bvec(iter, start_offset, max_size, max_segs);
+ if (iov_iter_is_xarray(iter))
+ return netfs_limit_xarray(iter, start_offset, max_size, max_segs);
+ BUG();
+}
+EXPORT_SYMBOL(netfs_limit_iter);
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 0bc90c4035a2..cd673596b411 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -326,6 +326,8 @@ void netfs_stats_show(struct seq_file *);
ssize_t netfs_extract_user_iter(struct iov_iter *orig, size_t orig_len,
struct iov_iter *new,
iov_iter_extraction_t extraction_flags);
+size_t netfs_limit_iter(const struct iov_iter *iter, size_t start_offset,
+ size_t max_size, size_t max_segs);

int netfs_start_io_read(struct inode *inode);
void netfs_end_io_read(struct inode *inode);

2023-11-17 21:17:58

by David Howells

Subject: [PATCH v2 15/51] netfs: Extend the netfs_io_*request structs to handle writes

Modify the netfs_io_request struct to act as a point around which writes
can be coordinated. It represents and pins a range of pages that need
writing and a list of regions of dirty data in that range of pages.

If RMW is required, the original data can be downloaded into the bounce
buffer, decrypted if necessary, the modifications made, then the modified
data can be reencrypted/recompressed and sent back to the server.

Signed-off-by: David Howells <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/netfs/internal.h | 6 ++++++
fs/netfs/main.c | 3 ++-
fs/netfs/objects.c | 6 ++++++
fs/netfs/stats.c | 18 ++++++++++++++----
include/linux/netfs.h | 15 ++++++++++++++-
include/trace/events/netfs.h | 8 ++++++--
6 files changed, 48 insertions(+), 8 deletions(-)

diff --git a/fs/netfs/internal.h b/fs/netfs/internal.h
index 21a47f118009..3a920377b01f 100644
--- a/fs/netfs/internal.h
+++ b/fs/netfs/internal.h
@@ -106,6 +106,12 @@ extern atomic_t netfs_n_rh_write_begin;
extern atomic_t netfs_n_rh_write_done;
extern atomic_t netfs_n_rh_write_failed;
extern atomic_t netfs_n_rh_write_zskip;
+extern atomic_t netfs_n_wh_upload;
+extern atomic_t netfs_n_wh_upload_done;
+extern atomic_t netfs_n_wh_upload_failed;
+extern atomic_t netfs_n_wh_write;
+extern atomic_t netfs_n_wh_write_done;
+extern atomic_t netfs_n_wh_write_failed;


static inline void netfs_stat(atomic_t *stat)
diff --git a/fs/netfs/main.c b/fs/netfs/main.c
index 0f0c6e70aa44..e990738c2213 100644
--- a/fs/netfs/main.c
+++ b/fs/netfs/main.c
@@ -28,10 +28,11 @@ MODULE_PARM_DESC(netfs_debug, "Netfs support debugging mask");
LIST_HEAD(netfs_io_requests);
DEFINE_SPINLOCK(netfs_proc_lock);

-static const char *netfs_origins[] = {
+static const char *netfs_origins[nr__netfs_io_origin] = {
[NETFS_READAHEAD] = "RA",
[NETFS_READPAGE] = "RP",
[NETFS_READ_FOR_WRITE] = "RW",
+ [NETFS_WRITEBACK] = "WB",
};

/*
diff --git a/fs/netfs/objects.c b/fs/netfs/objects.c
index a7947e82374a..7ef804e8915c 100644
--- a/fs/netfs/objects.c
+++ b/fs/netfs/objects.c
@@ -20,6 +20,7 @@ struct netfs_io_request *netfs_alloc_request(struct address_space *mapping,
struct inode *inode = file ? file_inode(file) : mapping->host;
struct netfs_inode *ctx = netfs_inode(inode);
struct netfs_io_request *rreq;
+ bool cached = netfs_is_cache_enabled(ctx);
int ret;

rreq = kzalloc(ctx->ops->io_request_size ?: sizeof(struct netfs_io_request),
@@ -38,7 +39,10 @@ struct netfs_io_request *netfs_alloc_request(struct address_space *mapping,
xa_init(&rreq->bounce);
INIT_LIST_HEAD(&rreq->subrequests);
refcount_set(&rreq->ref, 1);
+
__set_bit(NETFS_RREQ_IN_PROGRESS, &rreq->flags);
+ if (cached)
+ __set_bit(NETFS_RREQ_WRITE_TO_CACHE, &rreq->flags);
if (rreq->netfs_ops->init_request) {
ret = rreq->netfs_ops->init_request(rreq, file);
if (ret < 0) {
@@ -48,6 +52,7 @@ struct netfs_io_request *netfs_alloc_request(struct address_space *mapping,
}
}

+ trace_netfs_rreq_ref(rreq->debug_id, 1, netfs_rreq_trace_new);
netfs_proc_add_rreq(rreq);
netfs_stat(&netfs_n_rh_rreq);
return rreq;
@@ -132,6 +137,7 @@ struct netfs_io_subrequest *netfs_alloc_subrequest(struct netfs_io_request *rreq
sizeof(struct netfs_io_subrequest),
GFP_KERNEL);
if (subreq) {
+ INIT_WORK(&subreq->work, NULL);
INIT_LIST_HEAD(&subreq->rreq_link);
refcount_set(&subreq->ref, 2);
subreq->rreq = rreq;
diff --git a/fs/netfs/stats.c b/fs/netfs/stats.c
index 5510a7a14a40..ce2a1a983280 100644
--- a/fs/netfs/stats.c
+++ b/fs/netfs/stats.c
@@ -27,6 +27,12 @@ atomic_t netfs_n_rh_write_begin;
atomic_t netfs_n_rh_write_done;
atomic_t netfs_n_rh_write_failed;
atomic_t netfs_n_rh_write_zskip;
+atomic_t netfs_n_wh_upload;
+atomic_t netfs_n_wh_upload_done;
+atomic_t netfs_n_wh_upload_failed;
+atomic_t netfs_n_wh_write;
+atomic_t netfs_n_wh_write_done;
+atomic_t netfs_n_wh_write_failed;

void netfs_stats_show(struct seq_file *m)
{
@@ -50,9 +56,13 @@ void netfs_stats_show(struct seq_file *m)
atomic_read(&netfs_n_rh_read),
atomic_read(&netfs_n_rh_read_done),
atomic_read(&netfs_n_rh_read_failed));
- seq_printf(m, "RdHelp : WR=%u ws=%u wf=%u\n",
- atomic_read(&netfs_n_rh_write),
- atomic_read(&netfs_n_rh_write_done),
- atomic_read(&netfs_n_rh_write_failed));
+ seq_printf(m, "WrHelp : UL=%u us=%u uf=%u\n",
+ atomic_read(&netfs_n_wh_upload),
+ atomic_read(&netfs_n_wh_upload_done),
+ atomic_read(&netfs_n_wh_upload_failed));
+ seq_printf(m, "WrHelp : WR=%u ws=%u wf=%u\n",
+ atomic_read(&netfs_n_wh_write),
+ atomic_read(&netfs_n_wh_write_done),
+ atomic_read(&netfs_n_wh_write_failed));
}
EXPORT_SYMBOL(netfs_stats_show);
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 20ddd46fa0bc..62b768260eda 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -118,6 +118,9 @@ enum netfs_io_source {
NETFS_DOWNLOAD_FROM_SERVER,
NETFS_READ_FROM_CACHE,
NETFS_INVALID_READ,
+ NETFS_UPLOAD_TO_SERVER,
+ NETFS_WRITE_TO_CACHE,
+ NETFS_INVALID_WRITE,
} __mode(byte);

typedef void (*netfs_io_terminated_t)(void *priv, ssize_t transferred_or_error,
@@ -151,9 +154,14 @@ struct netfs_cache_resources {
};

/*
- * Descriptor for a single component subrequest.
+ * Descriptor for a single component subrequest. Each operation represents an
+ * individual read/write from/to a server, a cache, a journal, etc..
+ *
+ * The buffer iterator is persistent for the life of the subrequest struct and
+ * the pages it points to can be relied on to exist for the duration.
*/
struct netfs_io_subrequest {
+ struct work_struct work;
struct netfs_io_request *rreq; /* Supervising I/O request */
struct list_head rreq_link; /* Link in rreq->subrequests */
struct iov_iter io_iter; /* Iterator for this subrequest */
@@ -178,6 +186,8 @@ enum netfs_io_origin {
NETFS_READAHEAD, /* This read was triggered by readahead */
NETFS_READPAGE, /* This read is a synchronous read */
NETFS_READ_FOR_WRITE, /* This read is to prepare a write */
+ NETFS_WRITEBACK, /* This write was triggered by writepages */
+ nr__netfs_io_origin
} __mode(byte);

/*
@@ -202,6 +212,7 @@ struct netfs_io_request {
__counted_by(direct_bv_count);
unsigned int direct_bv_count; /* Number of elements in direct_bv[] */
unsigned int debug_id;
+ unsigned int subreq_counter; /* Next subreq->debug_index */
atomic_t nr_outstanding; /* Number of ops in progress */
atomic_t nr_copy_ops; /* Number of copy-to-cache ops in progress */
size_t submitted; /* Amount submitted for I/O so far */
@@ -221,6 +232,8 @@ struct netfs_io_request {
#define NETFS_RREQ_FAILED 4 /* The request failed */
#define NETFS_RREQ_IN_PROGRESS 5 /* Unlocked when the request completes */
#define NETFS_RREQ_USE_BOUNCE_BUFFER 6 /* Use bounce buffer */
+#define NETFS_RREQ_WRITE_TO_CACHE 7 /* Need to write to the cache */
+#define NETFS_RREQ_UPLOAD_TO_SERVER 8 /* Need to write to the server */
const struct netfs_request_ops *netfs_ops;
};

diff --git a/include/trace/events/netfs.h b/include/trace/events/netfs.h
index fce6d0bc78e5..4ea4e34d279f 100644
--- a/include/trace/events/netfs.h
+++ b/include/trace/events/netfs.h
@@ -24,7 +24,8 @@
#define netfs_rreq_origins \
EM(NETFS_READAHEAD, "RA") \
EM(NETFS_READPAGE, "RP") \
- E_(NETFS_READ_FOR_WRITE, "RW")
+ EM(NETFS_READ_FOR_WRITE, "RW") \
+ E_(NETFS_WRITEBACK, "WB")

#define netfs_rreq_traces \
EM(netfs_rreq_trace_assess, "ASSESS ") \
@@ -39,7 +40,10 @@
EM(NETFS_FILL_WITH_ZEROES, "ZERO") \
EM(NETFS_DOWNLOAD_FROM_SERVER, "DOWN") \
EM(NETFS_READ_FROM_CACHE, "READ") \
- E_(NETFS_INVALID_READ, "INVL") \
+ EM(NETFS_INVALID_READ, "INVL") \
+ EM(NETFS_UPLOAD_TO_SERVER, "UPLD") \
+ EM(NETFS_WRITE_TO_CACHE, "WRIT") \
+ E_(NETFS_INVALID_WRITE, "INVL")

#define netfs_sreq_traces \
EM(netfs_sreq_trace_download_instead, "RDOWN") \

2023-11-17 21:18:01

by David Howells

Subject: [PATCH v2 16/51] netfs: Add a hook to tell the netfs to update its i_size

Add a hook that netfslib's write helpers can call to tell the network
filesystem that it should update its i_size.
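
A trivial implementation in a network filesystem might look like the
sketch below (the myfs_* names are invented; a real netfs would likely
also update its own size-tracking state):

    static void myfs_update_i_size(struct inode *inode, loff_t i_size)
    {
            i_size_write(inode, i_size);
    }

    static const struct netfs_request_ops myfs_req_ops = {
            /* ... read/write ops ... */
            .update_i_size = myfs_update_i_size,
    };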

Signed-off-by: David Howells <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
include/linux/netfs.h | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 62b768260eda..21650db7da54 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -248,6 +248,7 @@ struct netfs_request_ops {
void (*free_subrequest)(struct netfs_io_subrequest *rreq);
int (*begin_cache_operation)(struct netfs_io_request *rreq);

+ /* Read request handling */
void (*expand_readahead)(struct netfs_io_request *rreq);
bool (*clamp_length)(struct netfs_io_subrequest *subreq);
void (*issue_read)(struct netfs_io_subrequest *subreq);
@@ -255,6 +256,9 @@ struct netfs_request_ops {
int (*check_write_begin)(struct file *file, loff_t pos, unsigned len,
struct folio **foliop, void **_fsdata);
void (*done)(struct netfs_io_request *rreq);
+
+ /* Modification handling */
+ void (*update_i_size)(struct inode *inode, loff_t i_size);
};

/*

2023-11-17 21:18:01

by David Howells

Subject: [PATCH v2 17/51] netfs: Make netfs_put_request() handle a NULL pointer

Make netfs_put_request() just return if given a NULL request pointer.

Signed-off-by: David Howells <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/netfs/objects.c | 23 +++++++++++++----------
1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/fs/netfs/objects.c b/fs/netfs/objects.c
index 7ef804e8915c..3ce6313cc5f9 100644
--- a/fs/netfs/objects.c
+++ b/fs/netfs/objects.c
@@ -109,19 +109,22 @@ static void netfs_free_request(struct work_struct *work)
void netfs_put_request(struct netfs_io_request *rreq, bool was_async,
enum netfs_rreq_ref_trace what)
{
- unsigned int debug_id = rreq->debug_id;
+ unsigned int debug_id;
bool dead;
int r;

- dead = __refcount_dec_and_test(&rreq->ref, &r);
- trace_netfs_rreq_ref(debug_id, r - 1, what);
- if (dead) {
- if (was_async) {
- rreq->work.func = netfs_free_request;
- if (!queue_work(system_unbound_wq, &rreq->work))
- BUG();
- } else {
- netfs_free_request(&rreq->work);
+ if (rreq) {
+ debug_id = rreq->debug_id;
+ dead = __refcount_dec_and_test(&rreq->ref, &r);
+ trace_netfs_rreq_ref(debug_id, r - 1, what);
+ if (dead) {
+ if (was_async) {
+ rreq->work.func = netfs_free_request;
+ if (!queue_work(system_unbound_wq, &rreq->work))
+ BUG();
+ } else {
+ netfs_free_request(&rreq->work);
+ }
}
}
}

2023-11-17 21:18:54

by David Howells

Subject: [PATCH v2 22/51] netfs: Provide func to copy data to pagecache for buffered write

Provide a netfs write helper, netfs_perform_write(), to buffer data to be
written in the pagecache and mark the modified folios dirty.

It will perform "streaming writes" for folios that aren't currently
resident, if possible, storing data in partially modified folios that are
marked dirty, but not uptodate. It will also tag pages as belonging to
fs-specific write groups if so directed by the filesystem.

This is derived from generic_perform_write(), but doesn't use
->write_begin() and ->write_end(), having that logic rolled in instead.
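
A filesystem would call this from its own locked ->write_iter() wrapper,
roughly as sketched below; the myfs_* pieces and the plain inode locking
shown are illustrative only, and passing a NULL group means no write
grouping is used:

    static ssize_t myfs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
    {
            struct inode *inode = file_inode(iocb->ki_filp);
            ssize_t ret;

            inode_lock(inode);
            ret = generic_write_checks(iocb, from);
            if (ret > 0)
                    ret = netfs_perform_write(iocb, from, NULL);
            inode_unlock(inode);
            return ret;
    }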

Signed-off-by: David Howells <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/netfs/Makefile | 1 +
fs/netfs/buffered_read.c | 48 +++++
fs/netfs/buffered_write.c | 327 +++++++++++++++++++++++++++++++++++
fs/netfs/internal.h | 2 +
include/linux/netfs.h | 5 +
include/trace/events/netfs.h | 70 ++++++++
6 files changed, 453 insertions(+)
create mode 100644 fs/netfs/buffered_write.c

diff --git a/fs/netfs/Makefile b/fs/netfs/Makefile
index ce1197713276..90d76e9bb40d 100644
--- a/fs/netfs/Makefile
+++ b/fs/netfs/Makefile
@@ -2,6 +2,7 @@

netfs-y := \
buffered_read.o \
+ buffered_write.o \
io.o \
iterator.o \
locking.o \
diff --git a/fs/netfs/buffered_read.c b/fs/netfs/buffered_read.c
index 05824f73cfc7..2f06344bba21 100644
--- a/fs/netfs/buffered_read.c
+++ b/fs/netfs/buffered_read.c
@@ -461,3 +461,51 @@ int netfs_write_begin(struct netfs_inode *ctx,
return ret;
}
EXPORT_SYMBOL(netfs_write_begin);
+
+/*
+ * Preload the data into a page we're proposing to write into.
+ */
+int netfs_prefetch_for_write(struct file *file, struct folio *folio,
+ size_t offset, size_t len)
+{
+ struct netfs_io_request *rreq;
+ struct address_space *mapping = folio_file_mapping(folio);
+ struct netfs_inode *ctx = netfs_inode(mapping->host);
+ unsigned long long start = folio_pos(folio);
+ size_t flen = folio_size(folio);
+ int ret;
+
+ _enter("%zx @%llx", flen, start);
+
+ ret = -ENOMEM;
+
+ rreq = netfs_alloc_request(mapping, file, start, flen,
+ NETFS_READ_FOR_WRITE);
+ if (IS_ERR(rreq)) {
+ ret = PTR_ERR(rreq);
+ goto error;
+ }
+
+ rreq->no_unlock_folio = folio_index(folio);
+ __set_bit(NETFS_RREQ_NO_UNLOCK_FOLIO, &rreq->flags);
+ ret = netfs_begin_cache_operation(rreq, ctx);
+ if (ret == -ENOMEM || ret == -EINTR || ret == -ERESTARTSYS)
+ goto error_put;
+
+ netfs_stat(&netfs_n_rh_write_begin);
+ trace_netfs_read(rreq, start, flen, netfs_read_trace_prefetch_for_write);
+
+ /* Set up the output buffer */
+ iov_iter_xarray(&rreq->iter, ITER_DEST, &mapping->i_pages,
+ rreq->start, rreq->len);
+
+ ret = netfs_begin_read(rreq, true);
+ netfs_put_request(rreq, false, netfs_rreq_trace_put_return);
+ return ret;
+
+error_put:
+ netfs_put_request(rreq, false, netfs_rreq_trace_put_discard);
+error:
+ _leave(" = %d", ret);
+ return ret;
+}
diff --git a/fs/netfs/buffered_write.c b/fs/netfs/buffered_write.c
new file mode 100644
index 000000000000..406c3f3666fa
--- /dev/null
+++ b/fs/netfs/buffered_write.c
@@ -0,0 +1,327 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Network filesystem high-level write support.
+ *
+ * Copyright (C) 2023 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells ([email protected])
+ */
+
+#include <linux/export.h>
+#include <linux/fs.h>
+#include <linux/mm.h>
+#include <linux/pagemap.h>
+#include <linux/slab.h>
+#include <linux/pagevec.h>
+#include "internal.h"
+
+/*
+ * Determined write method. Adjust netfs_folio_traces if this is changed.
+ */
+enum netfs_how_to_modify {
+ NETFS_FOLIO_IS_UPTODATE, /* Folio is uptodate already */
+ NETFS_JUST_PREFETCH, /* We have to read the folio anyway */
+ NETFS_WHOLE_FOLIO_MODIFY, /* We're going to overwrite the whole folio */
+ NETFS_MODIFY_AND_CLEAR, /* We can assume there is no data to be downloaded. */
+ NETFS_STREAMING_WRITE, /* Store incomplete data in non-uptodate page. */
+ NETFS_STREAMING_WRITE_CONT, /* Continue streaming write. */
+ NETFS_FLUSH_CONTENT, /* Flush incompatible content. */
+};
+
+static void netfs_set_group(struct folio *folio, struct netfs_group *netfs_group)
+{
+ if (netfs_group && !folio_get_private(folio))
+ folio_attach_private(folio, netfs_get_group(netfs_group));
+}
+
+/*
+ * Decide how we should modify a folio. We might be attempting to do
+ * write-streaming, in which case we don't want to a local RMW cycle if we can
+ * avoid it. If we're doing local caching or content crypto, we award that
+ * priority over avoiding RMW. If the file is open readably, then we also
+ * assume that we may want to read what we wrote.
+ */
+static enum netfs_how_to_modify netfs_how_to_modify(struct netfs_inode *ctx,
+ struct file *file,
+ struct folio *folio,
+ void *netfs_group,
+ size_t flen,
+ size_t offset,
+ size_t len,
+ bool maybe_trouble)
+{
+ struct netfs_folio *finfo = netfs_folio_info(folio);
+ loff_t pos = folio_file_pos(folio);
+
+ _enter("z=%llx", ctx->zero_point);
+
+ if (netfs_folio_group(folio) != netfs_group)
+ return NETFS_FLUSH_CONTENT;
+
+ if (folio_test_uptodate(folio))
+ return NETFS_FOLIO_IS_UPTODATE;
+
+ if (pos >= ctx->zero_point)
+ return NETFS_MODIFY_AND_CLEAR;
+
+ if (!maybe_trouble && offset == 0 && len >= flen)
+ return NETFS_WHOLE_FOLIO_MODIFY;
+
+ if (file->f_mode & FMODE_READ)
+ return NETFS_JUST_PREFETCH;
+
+ if (netfs_is_cache_enabled(ctx))
+ return NETFS_JUST_PREFETCH;
+
+ if (!finfo)
+ return NETFS_STREAMING_WRITE;
+
+ /* We can continue a streaming write only if it continues on from the
+ * previous. If it overlaps, we must flush lest we suffer a partial
+ * copy and disjoint dirty regions.
+ */
+ if (offset == finfo->dirty_offset + finfo->dirty_len)
+ return NETFS_STREAMING_WRITE_CONT;
+ return NETFS_FLUSH_CONTENT;
+}
+
+/*
+ * Grab a folio for writing and lock it.
+ */
+static struct folio *netfs_grab_folio_for_write(struct address_space *mapping,
+ loff_t pos, size_t part)
+{
+ pgoff_t index = pos / PAGE_SIZE;
+
+ return __filemap_get_folio(mapping, index, FGP_WRITEBEGIN,
+ mapping_gfp_mask(mapping));
+}
+
+/**
+ * netfs_perform_write - Copy data into the pagecache.
+ * @iocb: The operation parameters
+ * @iter: The source buffer
+ * @netfs_group: Grouping for dirty pages (eg. ceph snaps).
+ *
+ * Copy data into pagecache pages attached to the inode specified by @iocb.
+ * The caller must hold appropriate inode locks.
+ *
+ * Dirty pages are tagged with a netfs_folio struct if they're not up to date
+ * to indicate the range modified. Dirty pages may also be tagged with a
+ * netfs-specific grouping such that data from an old group gets flushed before
+ * a new one is started.
+ */
+ssize_t netfs_perform_write(struct kiocb *iocb, struct iov_iter *iter,
+ struct netfs_group *netfs_group)
+{
+ struct file *file = iocb->ki_filp;
+ struct inode *inode = file_inode(file);
+ struct address_space *mapping = inode->i_mapping;
+ struct netfs_inode *ctx = netfs_inode(inode);
+ struct netfs_folio *finfo;
+ struct folio *folio;
+ enum netfs_how_to_modify howto;
+ enum netfs_folio_trace trace;
+ unsigned int bdp_flags = (iocb->ki_flags & IOCB_SYNC) ? 0: BDP_ASYNC;
+ ssize_t written = 0, ret;
+ loff_t i_size, pos = iocb->ki_pos, from, to;
+ size_t max_chunk = PAGE_SIZE << MAX_PAGECACHE_ORDER;
+ bool maybe_trouble = false;
+
+ do {
+ size_t flen;
+ size_t offset; /* Offset into pagecache folio */
+ size_t part; /* Bytes to write to folio */
+ size_t copied; /* Bytes copied from user */
+
+ ret = balance_dirty_pages_ratelimited_flags(mapping, bdp_flags);
+ if (unlikely(ret < 0))
+ break;
+
+ offset = pos & (max_chunk - 1);
+ part = min(max_chunk - offset, iov_iter_count(iter));
+
+ /* Bring in the user pages that we will copy from _first_ lest
+ * we hit a nasty deadlock on copying from the same page as
+ * we're writing to, without it being marked uptodate.
+ *
+ * Not only is this an optimisation, but it is also required to
+ * check that the address is actually valid, when atomic
+ * usercopies are used below.
+ *
+ * We rely on the page being held onto long enough by the LRU
+ * that we can grab it below if this causes it to be read.
+ */
+ ret = -EFAULT;
+ if (unlikely(fault_in_iov_iter_readable(iter, part) == part))
+ break;
+
+ ret = -ENOMEM;
+ folio = netfs_grab_folio_for_write(mapping, pos, part);
+ if (!folio)
+ break;
+
+ flen = folio_size(folio);
+ offset = pos & (flen - 1);
+ part = min_t(size_t, flen - offset, part);
+
+ if (signal_pending(current)) {
+ ret = written ? -EINTR : -ERESTARTSYS;
+ goto error_folio_unlock;
+ }
+
+ /* See if we need to prefetch the area we're going to modify.
+ * We need to do this before we get a lock on the folio in case
+ * there's more than one writer competing for the same cache
+ * block.
+ */
+ howto = netfs_how_to_modify(ctx, file, folio, netfs_group,
+ flen, offset, part, maybe_trouble);
+ _debug("howto %u", howto);
+ switch (howto) {
+ case NETFS_JUST_PREFETCH:
+ ret = netfs_prefetch_for_write(file, folio, offset, part);
+ if (ret < 0) {
+ _debug("prefetch = %zd", ret);
+ goto error_folio_unlock;
+ }
+ break;
+ case NETFS_FOLIO_IS_UPTODATE:
+ case NETFS_WHOLE_FOLIO_MODIFY:
+ case NETFS_STREAMING_WRITE_CONT:
+ break;
+ case NETFS_MODIFY_AND_CLEAR:
+ zero_user_segment(&folio->page, 0, offset);
+ break;
+ case NETFS_STREAMING_WRITE:
+ ret = -EIO;
+ if (WARN_ON(folio_get_private(folio)))
+ goto error_folio_unlock;
+ break;
+ case NETFS_FLUSH_CONTENT:
+ trace_netfs_folio(folio, netfs_flush_content);
+ from = folio_pos(folio);
+ to = from + folio_size(folio) - 1;
+ folio_unlock(folio);
+ folio_put(folio);
+ ret = filemap_write_and_wait_range(mapping, from, to);
+ if (ret < 0)
+ goto out;
+ continue;
+ }
+
+ if (mapping_writably_mapped(mapping))
+ flush_dcache_folio(folio);
+
+ copied = copy_folio_from_iter_atomic(folio, offset, part, iter);
+
+ flush_dcache_folio(folio);
+
+ /* Deal with a (partially) failed copy */
+ if (copied == 0) {
+ ret = -EFAULT;
+ goto error_folio_unlock;
+ }
+
+ trace = (enum netfs_folio_trace)howto;
+ switch (howto) {
+ case NETFS_FOLIO_IS_UPTODATE:
+ case NETFS_JUST_PREFETCH:
+ netfs_set_group(folio, netfs_group);
+ break;
+ case NETFS_MODIFY_AND_CLEAR:
+ zero_user_segment(&folio->page, offset + copied, flen);
+ netfs_set_group(folio, netfs_group);
+ folio_mark_uptodate(folio);
+ break;
+ case NETFS_WHOLE_FOLIO_MODIFY:
+ if (unlikely(copied < part)) {
+ maybe_trouble = true;
+ iov_iter_revert(iter, copied);
+ copied = 0;
+ goto retry;
+ }
+ netfs_set_group(folio, netfs_group);
+ folio_mark_uptodate(folio);
+ break;
+ case NETFS_STREAMING_WRITE:
+ if (offset == 0 && copied == flen) {
+ netfs_set_group(folio, netfs_group);
+ folio_mark_uptodate(folio);
+ trace = netfs_streaming_filled_page;
+ break;
+ }
+ finfo = kzalloc(sizeof(*finfo), GFP_KERNEL);
+ if (!finfo) {
+ iov_iter_revert(iter, copied);
+ ret = -ENOMEM;
+ goto error_folio_unlock;
+ }
+ finfo->netfs_group = netfs_get_group(netfs_group);
+ finfo->dirty_offset = offset;
+ finfo->dirty_len = copied;
+ folio_attach_private(folio, (void *)((unsigned long)finfo |
+ NETFS_FOLIO_INFO));
+ break;
+ case NETFS_STREAMING_WRITE_CONT:
+ finfo = netfs_folio_info(folio);
+ finfo->dirty_len += copied;
+ if (finfo->dirty_offset == 0 && finfo->dirty_len == flen) {
+ folio_change_private(folio, finfo->netfs_group);
+ folio_mark_uptodate(folio);
+ kfree(finfo);
+ trace = netfs_streaming_filled_page;
+ }
+ break;
+ default:
+ WARN(true, "Unexpected modify type %u ix=%lx\n",
+ howto, folio_index(folio));
+ ret = -EIO;
+ goto error_folio_unlock;
+ }
+
+ trace_netfs_folio(folio, trace);
+
+ /* Update the inode size if we moved the EOF marker */
+ i_size = i_size_read(inode);
+ pos += copied;
+ if (pos > i_size) {
+ if (ctx->ops->update_i_size) {
+ ctx->ops->update_i_size(inode, pos);
+ } else {
+ i_size_write(inode, pos);
+#if IS_ENABLED(CONFIG_FSCACHE)
+ fscache_update_cookie(ctx->cache, NULL, &pos);
+#endif
+ }
+ }
+ written += copied;
+
+ folio_mark_dirty(folio);
+ retry:
+ folio_unlock(folio);
+ folio_put(folio);
+ folio = NULL;
+
+ cond_resched();
+ } while (iov_iter_count(iter));
+
+out:
+ if (likely(written)) {
+ /* Flush and wait for a write that requires immediate synchronisation. */
+ if (iocb->ki_flags & (IOCB_DSYNC | IOCB_SYNC)) {
+ _debug("dsync");
+ ret = filemap_fdatawait_range(mapping, iocb->ki_pos,
+ iocb->ki_pos + written);
+ }
+
+ iocb->ki_pos += written;
+ }
+
+ _leave(" = %zd [%zd]", written, ret);
+ return written ? written : ret;
+
+error_folio_unlock:
+ folio_unlock(folio);
+ folio_put(folio);
+ goto out;
+}
+EXPORT_SYMBOL(netfs_perform_write);
diff --git a/fs/netfs/internal.h b/fs/netfs/internal.h
index c2a4da8f5efb..f0bd755ef288 100644
--- a/fs/netfs/internal.h
+++ b/fs/netfs/internal.h
@@ -19,6 +19,8 @@
* buffered_read.c
*/
void netfs_rreq_unlock_folios(struct netfs_io_request *rreq);
+int netfs_prefetch_for_write(struct file *file, struct folio *folio,
+ size_t offset, size_t len);

/*
* io.c
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 3df6422488de..81bbd29a6b7b 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -374,6 +374,11 @@ struct netfs_cache_ops {
loff_t *_data_start, size_t *_data_len);
};

+/* High-level write API */
+ssize_t netfs_perform_write(struct kiocb *iocb, struct iov_iter *iter,
+ struct netfs_group *netfs_group);
+
+/* Address operations API */
struct readahead_control;
void netfs_readahead(struct readahead_control *);
int netfs_read_folio(struct file *, struct folio *);
diff --git a/include/trace/events/netfs.h b/include/trace/events/netfs.h
index e03635172760..94793f842000 100644
--- a/include/trace/events/netfs.h
+++ b/include/trace/events/netfs.h
@@ -19,6 +19,7 @@
EM(netfs_read_trace_expanded, "EXPANDED ") \
EM(netfs_read_trace_readahead, "READAHEAD") \
EM(netfs_read_trace_readpage, "READPAGE ") \
+ EM(netfs_read_trace_prefetch_for_write, "PREFETCHW") \
E_(netfs_read_trace_write_begin, "WRITEBEGN")

#define netfs_write_traces \
@@ -100,6 +101,28 @@
EM(netfs_sreq_trace_put_work, "PUT WORK ") \
E_(netfs_sreq_trace_put_terminated, "PUT TERM ")

+#define netfs_folio_traces \
+ /* The first few correspond to enum netfs_how_to_modify */ \
+ EM(netfs_folio_is_uptodate, "mod-uptodate") \
+ EM(netfs_just_prefetch, "mod-prefetch") \
+ EM(netfs_whole_folio_modify, "mod-whole-f") \
+ EM(netfs_modify_and_clear, "mod-n-clear") \
+ EM(netfs_streaming_write, "mod-streamw") \
+ EM(netfs_streaming_write_cont, "mod-streamw+") \
+ EM(netfs_flush_content, "flush") \
+ EM(netfs_streaming_filled_page, "mod-streamw-f") \
+ /* The rest are for writeback */ \
+ EM(netfs_folio_trace_clear, "clear") \
+ EM(netfs_folio_trace_clear_s, "clear-s") \
+ EM(netfs_folio_trace_clear_g, "clear-g") \
+ EM(netfs_folio_trace_kill, "kill") \
+ EM(netfs_folio_trace_mkwrite, "mkwrite") \
+ EM(netfs_folio_trace_mkwrite_plus, "mkwrite+") \
+ EM(netfs_folio_trace_redirty, "redirty") \
+ EM(netfs_folio_trace_redirtied, "redirtied") \
+ EM(netfs_folio_trace_store, "store") \
+ E_(netfs_folio_trace_store_plus, "store+")
+
#ifndef __NETFS_DECLARE_TRACE_ENUMS_ONCE_ONLY
#define __NETFS_DECLARE_TRACE_ENUMS_ONCE_ONLY

@@ -115,6 +138,7 @@ enum netfs_sreq_trace { netfs_sreq_traces } __mode(byte);
enum netfs_failure { netfs_failures } __mode(byte);
enum netfs_rreq_ref_trace { netfs_rreq_ref_traces } __mode(byte);
enum netfs_sreq_ref_trace { netfs_sreq_ref_traces } __mode(byte);
+enum netfs_folio_trace { netfs_folio_traces } __mode(byte);

#endif

@@ -135,6 +159,7 @@ netfs_sreq_traces;
netfs_failures;
netfs_rreq_ref_traces;
netfs_sreq_ref_traces;
+netfs_folio_traces;

/*
* Now redefine the EM() and E_() macros to map the enums to the strings that
@@ -335,6 +360,51 @@ TRACE_EVENT(netfs_sreq_ref,
__entry->ref)
);

+TRACE_EVENT(netfs_folio,
+ TP_PROTO(struct folio *folio, enum netfs_folio_trace why),
+
+ TP_ARGS(folio, why),
+
+ TP_STRUCT__entry(
+ __field(ino_t, ino)
+ __field(pgoff_t, index)
+ __field(unsigned int, nr)
+ __field(enum netfs_folio_trace, why)
+ ),
+
+ TP_fast_assign(
+ __entry->ino = folio->mapping->host->i_ino;
+ __entry->why = why;
+ __entry->index = folio_index(folio);
+ __entry->nr = folio_nr_pages(folio);
+ ),
+
+ TP_printk("i=%05lx ix=%05lx-%05lx %s",
+ __entry->ino, __entry->index, __entry->index + __entry->nr - 1,
+ __print_symbolic(__entry->why, netfs_folio_traces))
+ );
+
+TRACE_EVENT(netfs_write_iter,
+ TP_PROTO(const struct kiocb *iocb, const struct iov_iter *from),
+
+ TP_ARGS(iocb, from),
+
+ TP_STRUCT__entry(
+ __field(unsigned long long, start )
+ __field(size_t, len )
+ __field(unsigned int, flags )
+ ),
+
+ TP_fast_assign(
+ __entry->start = iocb->ki_pos;
+ __entry->len = iov_iter_count(from);
+ __entry->flags = iocb->ki_flags;
+ ),
+
+ TP_printk("WRITE-ITER s=%llx l=%zx f=%x",
+ __entry->start, __entry->len, __entry->flags)
+ );
+
TRACE_EVENT(netfs_write,
TP_PROTO(const struct netfs_io_request *wreq,
enum netfs_write_trace what),

2023-11-17 21:19:01

by David Howells

[permalink] [raw]
Subject: [PATCH v2 24/51] netfs: Allocate multipage folios in the writepath

Allocate a multipage folio when copying data into the pagecache if possible
and if there's sufficient data to warrant it.

Signed-off-by: David Howells <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/netfs/buffered_write.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/netfs/buffered_write.c b/fs/netfs/buffered_write.c
index 406c3f3666fa..4de6a12149e4 100644
--- a/fs/netfs/buffered_write.c
+++ b/fs/netfs/buffered_write.c
@@ -84,14 +84,19 @@ static enum netfs_how_to_modify netfs_how_to_modify(struct netfs_inode *ctx,
}

/*
- * Grab a folio for writing and lock it.
+ * Grab a folio for writing and lock it. Attempt to allocate as large a folio
+ * as possible to hold as much of the remaining length as possible in one go.
*/
static struct folio *netfs_grab_folio_for_write(struct address_space *mapping,
loff_t pos, size_t part)
{
pgoff_t index = pos / PAGE_SIZE;
+ fgf_t fgp_flags = FGP_WRITEBEGIN;

- return __filemap_get_folio(mapping, index, FGP_WRITEBEGIN,
+ if (mapping_large_folio_support(mapping))
+ fgp_flags |= fgf_set_order(pos % PAGE_SIZE + part);
+
+ return __filemap_get_folio(mapping, index, fgp_flags,
mapping_gfp_mask(mapping));
}


2023-11-17 21:19:25

by David Howells

[permalink] [raw]
Subject: [PATCH v2 27/51] netfs: Implement buffered write API

Institute a netfs write helper, netfs_file_write_iter(), to be pointed at
by the network filesystem ->write_iter() call. Make it handle buffered
writes by calling the previously defined netfs_perform_write() to copy the
source data into the pagecache.
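
As an illustration of how a filesystem might consume this (a sketch only:
netfs_start_io_write()/netfs_end_io_write() are the same locking helpers that
netfs_file_write_iter() uses below, while myfs_check_write() and
myfs_current_group() are invented placeholder names), a driver that wants to
pass in a dirty-page group or do its own pre-validation could wrap the locked
helper rather than pointing ->write_iter() straight at netfs_file_write_iter():

	static ssize_t myfs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
	{
		struct inode *inode = file_inode(iocb->ki_filp);
		struct netfs_group *group;
		ssize_t ret;

		ret = netfs_start_io_write(inode);	/* DIO vs buffered exclusion */
		if (ret < 0)
			return ret;

		ret = myfs_check_write(iocb, from);	/* e.g. fetch caps/oplock */
		if (ret == 0)
			ret = generic_write_checks(iocb, from);
		if (ret > 0) {
			group = myfs_current_group(inode);	/* e.g. a snap context */
			ret = netfs_buffered_write_iter_locked(iocb, from, group);
		}

		netfs_end_io_write(inode);
		if (ret > 0)
			ret = generic_write_sync(iocb, ret);
		return ret;
	}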

Signed-off-by: David Howells <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/netfs/buffered_write.c | 83 +++++++++++++++++++++++++++++++++++++++
include/linux/netfs.h | 3 ++
2 files changed, 86 insertions(+)

diff --git a/fs/netfs/buffered_write.c b/fs/netfs/buffered_write.c
index 4de6a12149e4..60e7da53cbd2 100644
--- a/fs/netfs/buffered_write.c
+++ b/fs/netfs/buffered_write.c
@@ -330,3 +330,86 @@ ssize_t netfs_perform_write(struct kiocb *iocb, struct iov_iter *iter,
goto out;
}
EXPORT_SYMBOL(netfs_perform_write);
+
+/**
+ * netfs_buffered_write_iter_locked - write data to a file
+ * @iocb: IO state structure (file, offset, etc.)
+ * @from: iov_iter with data to write
+ * @netfs_group: Grouping for dirty pages (eg. ceph snaps).
+ *
+ * This function does all the work needed for actually writing data to a
+ * file. It does all basic checks, removes SUID from the file, updates
+ * modification times and calls netfs_perform_write() to copy the data
+ * into the pagecache; direct I/O is routed separately by the caller.
+ *
+ * The caller must hold appropriate locks around this function and have called
+ * generic_write_checks() already. The caller is also responsible for doing
+ * any necessary syncing afterwards.
+ *
+ * This function does *not* take care of syncing data in case of O_SYNC write.
+ * A caller has to handle it. This is mainly due to the fact that we want to
+ * avoid syncing under i_rwsem.
+ *
+ * Return:
+ * * number of bytes written, even for truncated writes
+ * * negative error code if no data has been written at all
+ */
+ssize_t netfs_buffered_write_iter_locked(struct kiocb *iocb, struct iov_iter *from,
+ struct netfs_group *netfs_group)
+{
+ struct file *file = iocb->ki_filp;
+ ssize_t ret;
+
+ trace_netfs_write_iter(iocb, from);
+
+ ret = file_remove_privs(file);
+ if (ret)
+ return ret;
+
+ ret = file_update_time(file);
+ if (ret)
+ return ret;
+
+ return netfs_perform_write(iocb, from, netfs_group);
+}
+EXPORT_SYMBOL(netfs_buffered_write_iter_locked);
+
+/**
+ * netfs_file_write_iter - write data to a file
+ * @iocb: IO state structure
+ * @from: iov_iter with data to write
+ *
+ * Perform a write to a file, writing into the pagecache if possible and doing
+ * an unbuffered write instead if not.
+ *
+ * Return:
+ * * Negative error code if no data has been written at all or
+ * vfs_fsync_range() failed for a synchronous write
+ * * Number of bytes written, even for truncated writes
+ */
+ssize_t netfs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
+{
+ struct file *file = iocb->ki_filp;
+ struct inode *inode = file->f_mapping->host;
+ struct netfs_inode *ictx = netfs_inode(inode);
+ ssize_t ret;
+
+ _enter("%llx,%zx,%llx", iocb->ki_pos, iov_iter_count(from), i_size_read(inode));
+
+ if ((iocb->ki_flags & IOCB_DIRECT) ||
+ test_bit(NETFS_ICTX_UNBUFFERED, &ictx->flags))
+ return netfs_unbuffered_write_iter(iocb, from);
+
+ ret = netfs_start_io_write(inode);
+ if (ret < 0)
+ return ret;
+
+ ret = generic_write_checks(iocb, from);
+ if (ret > 0)
+ ret = netfs_buffered_write_iter_locked(iocb, from, NULL);
+ netfs_end_io_write(inode);
+ if (ret > 0)
+ ret = generic_write_sync(iocb, ret);
+ return ret;
+}
+EXPORT_SYMBOL(netfs_file_write_iter);
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 4f9a46a21c28..4cdadd1ce328 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -389,7 +389,10 @@ ssize_t netfs_unbuffered_read_iter(struct kiocb *iocb, struct iov_iter *iter);
/* High-level write API */
ssize_t netfs_perform_write(struct kiocb *iocb, struct iov_iter *iter,
struct netfs_group *netfs_group);
+ssize_t netfs_buffered_write_iter_locked(struct kiocb *iocb, struct iov_iter *from,
+ struct netfs_group *netfs_group);
ssize_t netfs_unbuffered_write_iter(struct kiocb *iocb, struct iov_iter *from);
+ssize_t netfs_file_write_iter(struct kiocb *iocb, struct iov_iter *from);

/* Address operations API */
struct readahead_control;

2023-11-17 21:19:33

by David Howells

[permalink] [raw]
Subject: [PATCH v2 29/51] netfs: Provide netfs_file_read_iter()

Provide a top-level-ish function that can be pointed to directly by the
->read_iter file op.
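
For illustration only ("myfs" is a placeholder, not part of the patch), a
filesystem that needs no extra per-read processing could then point its
file_operations straight at the netfs helpers:

	const struct file_operations myfs_file_operations = {
		.llseek		= generic_file_llseek,
		.read_iter	= netfs_file_read_iter,
		.write_iter	= netfs_file_write_iter,
		/* open/release/mmap/fsync etc. as usual for the filesystem */
	};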

Signed-off-by: David Howells <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/netfs/buffered_read.c | 33 +++++++++++++++++++++++++++++++++
include/linux/netfs.h | 1 +
2 files changed, 34 insertions(+)

diff --git a/fs/netfs/buffered_read.c b/fs/netfs/buffered_read.c
index 374707df6575..ab9f8e123245 100644
--- a/fs/netfs/buffered_read.c
+++ b/fs/netfs/buffered_read.c
@@ -564,3 +564,36 @@ int netfs_prefetch_for_write(struct file *file, struct folio *folio,
_leave(" = %d", ret);
return ret;
}
+
+/**
+ * netfs_file_read_iter - Generic filesystem read routine
+ * @iocb: kernel I/O control block
+ * @iter: destination for the data read
+ *
+ * This is the ->read_iter() routine for all filesystems that can use the page
+ * cache directly.
+ *
+ * The IOCB_NOWAIT flag in iocb->ki_flags indicates that -EAGAIN shall be
+ * returned when no data can be read without waiting for I/O requests to
+ * complete; it doesn't prevent readahead.
+ *
+ * The IOCB_NOIO flag in iocb->ki_flags indicates that no new I/O requests
+ * shall be made for the read or for readahead. When no data can be read,
+ * -EAGAIN shall be returned. When readahead would be triggered, a partial,
+ * possibly empty read shall be returned.
+ *
+ * Return:
+ * * number of bytes copied, even for partial reads
+ * * negative error code (or 0 if IOCB_NOIO) if nothing was read
+ */
+ssize_t netfs_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
+{
+ struct netfs_inode *ictx = netfs_inode(iocb->ki_filp->f_mapping->host);
+
+ if ((iocb->ki_flags & IOCB_DIRECT) ||
+ test_bit(NETFS_ICTX_UNBUFFERED, &ictx->flags))
+ return netfs_unbuffered_read_iter(iocb, iter);
+
+ return filemap_read(iocb, iter, 0);
+}
+EXPORT_SYMBOL(netfs_file_read_iter);
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 80e48af8b72f..2ab989407dcb 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -385,6 +385,7 @@ struct netfs_cache_ops {

/* High-level read API. */
ssize_t netfs_unbuffered_read_iter(struct kiocb *iocb, struct iov_iter *iter);
+ssize_t netfs_file_read_iter(struct kiocb *iocb, struct iov_iter *iter);

/* High-level write API */
ssize_t netfs_perform_write(struct kiocb *iocb, struct iov_iter *iter,

2023-11-17 21:19:38

by David Howells

[permalink] [raw]
Subject: [PATCH v2 31/51] netfs: Provide minimum blocksize parameter

Add a parameter for the minimum blocksize in the netfs_inode struct. This
can be used, for instance, to force I/O alignment for content encryption.
It also forces the use of an RMW cycle if a write we want to do doesn't
meet the block alignment requirements.
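
For instance (a sketch only; the 4KiB figure and the myfs_* name are
illustrative, not part of this patch), a filesystem whose content crypto
works on 4KiB blocks could request 4KiB I/O alignment when it sets up its
netfs context:

	static void myfs_set_io_geometry(struct inode *inode)
	{
		struct netfs_inode *ctx = netfs_inode(inode);

		/* After the filesystem's usual netfs_inode_init() call,
		 * ask that all I/O be aligned to 4KiB blocks.
		 */
		ctx->min_bshift = 12;	/* log2(4096) */
	}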

Signed-off-by: David Howells <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/netfs/buffered_read.c | 26 ++++++++++++++++++++++----
fs/netfs/buffered_write.c | 3 ++-
fs/netfs/direct_read.c | 3 ++-
include/linux/netfs.h | 2 ++
4 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/fs/netfs/buffered_read.c b/fs/netfs/buffered_read.c
index ab9f8e123245..e06461ef0bfa 100644
--- a/fs/netfs/buffered_read.c
+++ b/fs/netfs/buffered_read.c
@@ -527,14 +527,26 @@ int netfs_prefetch_for_write(struct file *file, struct folio *folio,
struct address_space *mapping = folio_file_mapping(folio);
struct netfs_inode *ctx = netfs_inode(mapping->host);
unsigned long long start = folio_pos(folio);
- size_t flen = folio_size(folio);
+ unsigned long long i_size, rstart, end;
+ size_t rlen;
int ret;

- _enter("%zx @%llx", flen, start);
+ DEFINE_READAHEAD(ractl, file, NULL, mapping, folio_index(folio));
+
+ _enter("%zx @%llx", len, start);

ret = -ENOMEM;

- rreq = netfs_alloc_request(mapping, file, start, flen,
+ i_size = i_size_read(mapping->host);
+ end = round_up(start + len, 1U << ctx->min_bshift);
+ if (end > i_size) {
+ unsigned long long limit = round_up(start + len, PAGE_SIZE);
+ end = max(limit, round_up(i_size, PAGE_SIZE));
+ }
+ rstart = round_down(start, 1U << ctx->min_bshift);
+ rlen = end - rstart;
+
+ rreq = netfs_alloc_request(mapping, file, rstart, rlen,
NETFS_READ_FOR_WRITE);
if (IS_ERR(rreq)) {
ret = PTR_ERR(rreq);
@@ -548,7 +560,13 @@ int netfs_prefetch_for_write(struct file *file, struct folio *folio,
goto error_put;

netfs_stat(&netfs_n_rh_write_begin);
- trace_netfs_read(rreq, start, flen, netfs_read_trace_prefetch_for_write);
+ trace_netfs_read(rreq, rstart, rlen, netfs_read_trace_prefetch_for_write);
+
+ /* Expand the request to meet caching requirements and download
+ * preferences.
+ */
+ ractl._nr_pages = folio_nr_pages(folio);
+ netfs_rreq_expand(rreq, &ractl);

/* Set up the output buffer */
iov_iter_xarray(&rreq->iter, ITER_DEST, &mapping->i_pages,
diff --git a/fs/netfs/buffered_write.c b/fs/netfs/buffered_write.c
index 097086d75d1c..4f0feedb357a 100644
--- a/fs/netfs/buffered_write.c
+++ b/fs/netfs/buffered_write.c
@@ -80,7 +80,8 @@ static enum netfs_how_to_modify netfs_how_to_modify(struct netfs_inode *ctx,
if (file->f_mode & FMODE_READ)
return NETFS_JUST_PREFETCH;

- if (netfs_is_cache_enabled(ctx))
+ if (netfs_is_cache_enabled(ctx) ||
+ ctx->min_bshift > 0)
return NETFS_JUST_PREFETCH;

if (!finfo)
diff --git a/fs/netfs/direct_read.c b/fs/netfs/direct_read.c
index 1d26468aafd9..52ad8fa66dd5 100644
--- a/fs/netfs/direct_read.c
+++ b/fs/netfs/direct_read.c
@@ -185,7 +185,8 @@ static ssize_t netfs_unbuffered_read_iter_locked(struct kiocb *iocb, struct iov_
* will then need to pad the request out to the minimum block size.
*/
if (test_bit(NETFS_RREQ_USE_BOUNCE_BUFFER, &rreq->flags)) {
- start = rreq->start;
+ min_bsize = 1ULL << ctx->min_bshift;
+ start = round_down(rreq->start, min_bsize);
end = min_t(unsigned long long,
round_up(rreq->start + rreq->len, min_bsize),
ctx->remote_i_size);
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 02a8ddddc8cd..cb80de66d165 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -141,6 +141,7 @@ struct netfs_inode {
unsigned long flags;
#define NETFS_ICTX_ODIRECT 0 /* The file has DIO in progress */
#define NETFS_ICTX_UNBUFFERED 1 /* I/O should not use the pagecache */
+ unsigned char min_bshift; /* log2 min block size for bounding box or 0 */
};

/*
@@ -463,6 +464,7 @@ static inline void netfs_inode_init(struct netfs_inode *ctx,
ctx->remote_i_size = i_size_read(&ctx->inode);
ctx->zero_point = ctx->remote_i_size;
ctx->flags = 0;
+ ctx->min_bshift = 0;
#if IS_ENABLED(CONFIG_FSCACHE)
ctx->cache = NULL;
#endif

2023-11-17 21:19:55

by David Howells

[permalink] [raw]
Subject: [PATCH v2 35/51] netfs: Support decryption on unbuffered/DIO read

Support unbuffered and direct I/O reads from an encrypted file. This may
require reading more data than was asked for into a bounce buffer and then
copying out just the requested bits. We don't decrypt in-place in the user
buffer lest userspace interfere and muck up the decryption.
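
For instance, assuming the minimum block size has been set to 64KiB for the
crypto (an illustrative figure), a 4KiB DIO read at file offset 10000 gets
expanded to cover the enclosing 64KiB block: the whole block is read into the
bounce buffer, decrypted there and only the 4KiB that was asked for is copied
out to the user buffer.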

Signed-off-by: David Howells <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/netfs/direct_read.c | 10 ++++++++++
fs/netfs/internal.h | 17 +++++++++++++++++
2 files changed, 27 insertions(+)

diff --git a/fs/netfs/direct_read.c b/fs/netfs/direct_read.c
index 52ad8fa66dd5..158719b56900 100644
--- a/fs/netfs/direct_read.c
+++ b/fs/netfs/direct_read.c
@@ -181,6 +181,16 @@ static ssize_t netfs_unbuffered_read_iter_locked(struct kiocb *iocb, struct iov_
iov_iter_advance(iter, orig_count);
}

+ /* If we're going to do decryption or decompression, we're going to
+ * need a bounce buffer - and if the data is misaligned for the crypto
+ * algorithm, we decrypt in place and then copy.
+ */
+ if (test_bit(NETFS_RREQ_CONTENT_ENCRYPTION, &rreq->flags)) {
+ if (!netfs_is_crypto_aligned(rreq, iter))
+ __set_bit(NETFS_RREQ_CRYPT_IN_PLACE, &rreq->flags);
+ __set_bit(NETFS_RREQ_USE_BOUNCE_BUFFER, &rreq->flags);
+ }
+
/* If we're going to use a bounce buffer, we need to set it up. We
* will then need to pad the request out to the minimum block size.
*/
diff --git a/fs/netfs/internal.h b/fs/netfs/internal.h
index fbecfd9b3174..447a67301329 100644
--- a/fs/netfs/internal.h
+++ b/fs/netfs/internal.h
@@ -193,6 +193,23 @@ static inline void netfs_put_group_many(struct netfs_group *netfs_group, int nr)
netfs_group->free(netfs_group);
}

+/*
+ * Check to see if a buffer aligns with the crypto unit block size. If it
+ * doesn't the crypto layer is going to copy all the data - in which case
+ * relying on the crypto op for a free copy is pointless.
+ */
+static inline bool netfs_is_crypto_aligned(struct netfs_io_request *rreq,
+ struct iov_iter *iter)
+{
+ struct netfs_inode *ctx = netfs_inode(rreq->inode);
+ unsigned long align, mask = (1UL << ctx->min_bshift) - 1;
+
+ if (!ctx->min_bshift)
+ return true;
+ align = iov_iter_alignment(iter);
+ return (align & mask) == 0;
+}
+
/*****************************************************************************/
/*
* debug tracing

2023-11-17 21:19:55

by David Howells

[permalink] [raw]
Subject: [PATCH v2 32/51] netfs: Make netfs_skip_folio_read() take account of blocksize

Make netfs_skip_folio_read() take account of blocksize such as crypto
blocksize.
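
As a worked example (illustrative numbers): with a 4KiB minimum block size, a
write of 8KiB at pos 8192 rounds to low = 8192 and high = 16384, exactly
covering the affected blocks, so the read can be skipped; a 200-byte write at
pos 10000 rounds to the same sort of block (8192-12288) without covering it,
so the folio must still be read (or zeroed if the block lies beyond EOF).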

Signed-off-by: David Howells <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/netfs/buffered_read.c | 32 +++++++++++++++++++++-----------
1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/fs/netfs/buffered_read.c b/fs/netfs/buffered_read.c
index e06461ef0bfa..de696aaaefbd 100644
--- a/fs/netfs/buffered_read.c
+++ b/fs/netfs/buffered_read.c
@@ -337,6 +337,7 @@ EXPORT_SYMBOL(netfs_read_folio);

/*
* Prepare a folio for writing without reading first
+ * @ctx: File context
* @folio: The folio being prepared
* @pos: starting position for the write
* @len: length of write
@@ -350,32 +351,41 @@ EXPORT_SYMBOL(netfs_read_folio);
* If any of these criteria are met, then zero out the unwritten parts
* of the folio and return true. Otherwise, return false.
*/
-static bool netfs_skip_folio_read(struct folio *folio, loff_t pos, size_t len,
- bool always_fill)
+static bool netfs_skip_folio_read(struct netfs_inode *ctx, struct folio *folio,
+ loff_t pos, size_t len, bool always_fill)
{
struct inode *inode = folio_inode(folio);
- loff_t i_size = i_size_read(inode);
+ loff_t i_size = i_size_read(inode), low, high;
size_t offset = offset_in_folio(folio, pos);
size_t plen = folio_size(folio);
+ size_t min_bsize = 1UL << ctx->min_bshift;
+
+ if (likely(min_bsize == 1)) {
+ low = folio_file_pos(folio);
+ high = low + plen;
+ } else {
+ low = round_down(pos, min_bsize);
+ high = round_up(pos + len, min_bsize);
+ }

if (unlikely(always_fill)) {
- if (pos - offset + len <= i_size)
- return false; /* Page entirely before EOF */
+ if (low < i_size)
+ return false; /* Some part of the block before EOF */
zero_user_segment(&folio->page, 0, plen);
folio_mark_uptodate(folio);
return true;
}

- /* Full folio write */
- if (offset == 0 && len >= plen)
+ /* Full page write */
+ if (pos == low && high == pos + len)
return true;

- /* Page entirely beyond the end of the file */
- if (pos - offset >= i_size)
+ /* pos beyond last page in the file */
+ if (low >= i_size)
goto zero_out;

/* Write that covers from the start of the folio to EOF or beyond */
- if (offset == 0 && (pos + len) >= i_size)
+ if (pos == low && (pos + len) >= i_size)
goto zero_out;

return false;
@@ -454,7 +464,7 @@ int netfs_write_begin(struct netfs_inode *ctx,
* to preload the granule.
*/
if (!netfs_is_cache_enabled(ctx) &&
- netfs_skip_folio_read(folio, pos, len, false)) {
+ netfs_skip_folio_read(ctx, folio, pos, len, false)) {
netfs_stat(&netfs_n_rh_write_zskip);
goto have_folio_no_wait;
}

2023-11-17 21:21:21

by David Howells

[permalink] [raw]
Subject: [PATCH v2 43/51] cifs: Replace cifs_writedata with a wrapper around netfs_io_subrequest

Replace the cifs_writedata struct with the same wrapper around
netfs_io_subrequest that was used to replace cifs_readdata.

Signed-off-by: David Howells <[email protected]>
cc: Steve French <[email protected]>
cc: Shyam Prasad N <[email protected]>
cc: Rohith Surabattula <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/smb/client/cifsglob.h | 30 +++------------
fs/smb/client/cifsproto.h | 16 ++++++--
fs/smb/client/cifssmb.c | 9 ++---
fs/smb/client/file.c | 79 ++++++++++++++++-----------------------
fs/smb/client/smb2pdu.c | 9 ++---
fs/smb/client/smb2proto.h | 3 +-
6 files changed, 58 insertions(+), 88 deletions(-)

diff --git a/fs/smb/client/cifsglob.h b/fs/smb/client/cifsglob.h
index 94f2411861d0..a41aeb2967bd 100644
--- a/fs/smb/client/cifsglob.h
+++ b/fs/smb/client/cifsglob.h
@@ -238,7 +238,6 @@ struct cifs_fattr;
struct smb3_fs_context;
struct cifs_fid;
struct cifs_io_subrequest;
-struct cifs_writedata;
struct cifs_io_parms;
struct cifs_search_info;
struct cifsInodeInfo;
@@ -413,8 +412,7 @@ struct smb_version_operations {
/* async read from the server */
int (*async_readv)(struct cifs_io_subrequest *);
/* async write to the server */
- int (*async_writev)(struct cifs_writedata *,
- void (*release)(struct kref *));
+ int (*async_writev)(struct cifs_io_subrequest *);
/* sync read from the server */
int (*sync_read)(const unsigned int, struct cifs_fid *,
struct cifs_io_parms *, unsigned int *, char **,
@@ -1442,35 +1440,17 @@ struct cifs_io_subrequest {
#endif
struct cifs_credits credits;

- // TODO: Remove following elements
- struct list_head list;
- struct completion done;
- struct work_struct work;
- struct iov_iter iter;
- __u64 offset;
- unsigned int bytes;
-};
+ enum writeback_sync_modes sync_mode;
+ bool uncached;
+ struct bio_vec *bv;

-/* asynchronous write support */
-struct cifs_writedata {
- struct kref refcount;
+ // TODO: Remove following elements
struct list_head list;
struct completion done;
- enum writeback_sync_modes sync_mode;
struct work_struct work;
- struct cifsFileInfo *cfile;
- struct cifs_aio_ctx *ctx;
struct iov_iter iter;
- struct bio_vec *bv;
__u64 offset;
- pid_t pid;
unsigned int bytes;
- int result;
- struct TCP_Server_Info *server;
-#ifdef CONFIG_CIFS_SMB_DIRECT
- struct smbd_mr *mr;
-#endif
- struct cifs_credits credits;
};

/*
diff --git a/fs/smb/client/cifsproto.h b/fs/smb/client/cifsproto.h
index 1702f95efda1..be46efbbf7ac 100644
--- a/fs/smb/client/cifsproto.h
+++ b/fs/smb/client/cifsproto.h
@@ -590,11 +590,19 @@ static inline void cifs_put_readdata(struct cifs_io_subrequest *rdata)
int cifs_async_readv(struct cifs_io_subrequest *rdata);
int cifs_readv_receive(struct TCP_Server_Info *server, struct mid_q_entry *mid);

-int cifs_async_writev(struct cifs_writedata *wdata,
- void (*release)(struct kref *kref));
+int cifs_async_writev(struct cifs_io_subrequest *wdata);
void cifs_writev_complete(struct work_struct *work);
-struct cifs_writedata *cifs_writedata_alloc(work_func_t complete);
-void cifs_writedata_release(struct kref *refcount);
+struct cifs_io_subrequest *cifs_writedata_alloc(work_func_t complete);
+void cifs_writedata_release(struct cifs_io_subrequest *rdata);
+static inline void cifs_get_writedata(struct cifs_io_subrequest *wdata)
+{
+ refcount_inc(&wdata->subreq.ref);
+}
+static inline void cifs_put_writedata(struct cifs_io_subrequest *wdata)
+{
+ if (refcount_dec_and_test(&wdata->subreq.ref))
+ cifs_writedata_release(wdata);
+}
int cifs_query_mf_symlink(unsigned int xid, struct cifs_tcon *tcon,
struct cifs_sb_info *cifs_sb,
const unsigned char *path, char *pbuf,
diff --git a/fs/smb/client/cifssmb.c b/fs/smb/client/cifssmb.c
index 76005b3d5ffe..14fca3fa3e08 100644
--- a/fs/smb/client/cifssmb.c
+++ b/fs/smb/client/cifssmb.c
@@ -1610,7 +1610,7 @@ CIFSSMBWrite(const unsigned int xid, struct cifs_io_parms *io_parms,
static void
cifs_writev_callback(struct mid_q_entry *mid)
{
- struct cifs_writedata *wdata = mid->callback_data;
+ struct cifs_io_subrequest *wdata = mid->callback_data;
struct cifs_tcon *tcon = tlink_tcon(wdata->cfile->tlink);
unsigned int written;
WRITE_RSP *smb = (WRITE_RSP *)mid->resp_buf;
@@ -1655,8 +1655,7 @@ cifs_writev_callback(struct mid_q_entry *mid)

/* cifs_async_writev - send an async write, and set up mid to handle result */
int
-cifs_async_writev(struct cifs_writedata *wdata,
- void (*release)(struct kref *kref))
+cifs_async_writev(struct cifs_io_subrequest *wdata)
{
int rc = -EACCES;
WRITE_REQ *smb = NULL;
@@ -1723,14 +1722,14 @@ cifs_async_writev(struct cifs_writedata *wdata,
iov[1].iov_len += 4; /* pad bigger by four bytes */
}

- kref_get(&wdata->refcount);
+ cifs_get_writedata(wdata);
rc = cifs_call_async(tcon->ses->server, &rqst, NULL,
cifs_writev_callback, NULL, wdata, 0, NULL);

if (rc == 0)
cifs_stats_inc(&tcon->stats.cifs_stats.num_writes);
else
- kref_put(&wdata->refcount, release);
+ cifs_put_writedata(wdata);

async_writev_out:
cifs_small_buf_release(smb);
diff --git a/fs/smb/client/file.c b/fs/smb/client/file.c
index 385830b02e0a..6b28fea8a980 100644
--- a/fs/smb/client/file.c
+++ b/fs/smb/client/file.c
@@ -2411,10 +2411,10 @@ cifs_get_readable_path(struct cifs_tcon *tcon, const char *name,
}

void
-cifs_writedata_release(struct kref *refcount)
+cifs_writedata_release(struct cifs_io_subrequest *wdata)
{
- struct cifs_writedata *wdata = container_of(refcount,
- struct cifs_writedata, refcount);
+ if (wdata->uncached)
+ kref_put(&wdata->ctx->refcount, cifs_aio_ctx_release);
#ifdef CONFIG_CIFS_SMB_DIRECT
if (wdata->mr) {
smbd_deregister_mr(wdata->mr);
@@ -2433,7 +2433,7 @@ cifs_writedata_release(struct kref *refcount)
* possible that the page was redirtied so re-clean the page.
*/
static void
-cifs_writev_requeue(struct cifs_writedata *wdata)
+cifs_writev_requeue(struct cifs_io_subrequest *wdata)
{
int rc = 0;
struct inode *inode = d_inode(wdata->cfile->dentry);
@@ -2443,7 +2443,7 @@ cifs_writev_requeue(struct cifs_writedata *wdata)

server = tlink_tcon(wdata->cfile->tlink)->ses->server;
do {
- struct cifs_writedata *wdata2;
+ struct cifs_io_subrequest *wdata2;
unsigned int wsize, cur_len;

wsize = server->ops->wp_retry_size(inode);
@@ -2466,7 +2466,7 @@ cifs_writev_requeue(struct cifs_writedata *wdata)
wdata2->sync_mode = wdata->sync_mode;
wdata2->offset = fpos;
wdata2->bytes = cur_len;
- wdata2->iter = wdata->iter;
+ wdata2->iter = wdata->iter;

iov_iter_advance(&wdata2->iter, fpos - wdata->offset);
iov_iter_truncate(&wdata2->iter, wdata2->bytes);
@@ -2488,11 +2488,10 @@ cifs_writev_requeue(struct cifs_writedata *wdata)
rc = -EBADF;
} else {
wdata2->pid = wdata2->cfile->pid;
- rc = server->ops->async_writev(wdata2,
- cifs_writedata_release);
+ rc = server->ops->async_writev(wdata2);
}

- kref_put(&wdata2->refcount, cifs_writedata_release);
+ cifs_put_writedata(wdata2);
if (rc) {
if (is_retryable_error(rc))
continue;
@@ -2511,14 +2510,14 @@ cifs_writev_requeue(struct cifs_writedata *wdata)

if (rc != 0 && !is_retryable_error(rc))
mapping_set_error(inode->i_mapping, rc);
- kref_put(&wdata->refcount, cifs_writedata_release);
+ cifs_put_writedata(wdata);
}

void
cifs_writev_complete(struct work_struct *work)
{
- struct cifs_writedata *wdata = container_of(work,
- struct cifs_writedata, work);
+ struct cifs_io_subrequest *wdata = container_of(work,
+ struct cifs_io_subrequest, work);
struct inode *inode = d_inode(wdata->cfile->dentry);

if (wdata->result == 0) {
@@ -2539,16 +2538,16 @@ cifs_writev_complete(struct work_struct *work)

if (wdata->result != -EAGAIN)
mapping_set_error(inode->i_mapping, wdata->result);
- kref_put(&wdata->refcount, cifs_writedata_release);
+ cifs_put_writedata(wdata);
}

-struct cifs_writedata *cifs_writedata_alloc(work_func_t complete)
+struct cifs_io_subrequest *cifs_writedata_alloc(work_func_t complete)
{
- struct cifs_writedata *wdata;
+ struct cifs_io_subrequest *wdata;

wdata = kzalloc(sizeof(*wdata), GFP_NOFS);
if (wdata != NULL) {
- kref_init(&wdata->refcount);
+ refcount_set(&wdata->subreq.ref, 1);
INIT_LIST_HEAD(&wdata->list);
init_completion(&wdata->done);
INIT_WORK(&wdata->work, complete);
@@ -2729,7 +2728,7 @@ static ssize_t cifs_write_back_from_locked_folio(struct address_space *mapping,
{
struct inode *inode = mapping->host;
struct TCP_Server_Info *server;
- struct cifs_writedata *wdata;
+ struct cifs_io_subrequest *wdata;
struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb);
struct cifs_credits credits_on_stack;
struct cifs_credits *credits = &credits_on_stack;
@@ -2821,10 +2820,9 @@ static ssize_t cifs_write_back_from_locked_folio(struct address_space *mapping,
if (wdata->cfile->invalidHandle)
rc = -EAGAIN;
else
- rc = wdata->server->ops->async_writev(wdata,
- cifs_writedata_release);
+ rc = wdata->server->ops->async_writev(wdata);
if (rc >= 0) {
- kref_put(&wdata->refcount, cifs_writedata_release);
+ cifs_put_writedata(wdata);
goto err_close;
}
} else {
@@ -2834,7 +2832,7 @@ static ssize_t cifs_write_back_from_locked_folio(struct address_space *mapping,
}

err_wdata:
- kref_put(&wdata->refcount, cifs_writedata_release);
+ cifs_put_writedata(wdata);
err_uncredit:
add_credits_and_wake_if(server, credits, 0);
err_close:
@@ -3223,23 +3221,13 @@ int cifs_flush(struct file *file, fl_owner_t id)
return rc;
}

-static void
-cifs_uncached_writedata_release(struct kref *refcount)
-{
- struct cifs_writedata *wdata = container_of(refcount,
- struct cifs_writedata, refcount);
-
- kref_put(&wdata->ctx->refcount, cifs_aio_ctx_release);
- cifs_writedata_release(refcount);
-}
-
static void collect_uncached_write_data(struct cifs_aio_ctx *ctx);

static void
cifs_uncached_writev_complete(struct work_struct *work)
{
- struct cifs_writedata *wdata = container_of(work,
- struct cifs_writedata, work);
+ struct cifs_io_subrequest *wdata = container_of(work,
+ struct cifs_io_subrequest, work);
struct inode *inode = d_inode(wdata->cfile->dentry);
struct cifsInodeInfo *cifsi = CIFS_I(inode);

@@ -3252,11 +3240,11 @@ cifs_uncached_writev_complete(struct work_struct *work)
complete(&wdata->done);
collect_uncached_write_data(wdata->ctx);
/* the below call can possibly free the last ref to aio ctx */
- kref_put(&wdata->refcount, cifs_uncached_writedata_release);
+ cifs_put_writedata(wdata);
}

static int
-cifs_resend_wdata(struct cifs_writedata *wdata, struct list_head *wdata_list,
+cifs_resend_wdata(struct cifs_io_subrequest *wdata, struct list_head *wdata_list,
struct cifs_aio_ctx *ctx)
{
unsigned int wsize;
@@ -3305,8 +3293,7 @@ cifs_resend_wdata(struct cifs_writedata *wdata, struct list_head *wdata_list,
wdata->mr = NULL;
}
#endif
- rc = server->ops->async_writev(wdata,
- cifs_uncached_writedata_release);
+ rc = server->ops->async_writev(wdata);
}
}

@@ -3321,7 +3308,7 @@ cifs_resend_wdata(struct cifs_writedata *wdata, struct list_head *wdata_list,
} while (rc == -EAGAIN);

fail:
- kref_put(&wdata->refcount, cifs_uncached_writedata_release);
+ cifs_put_writedata(wdata);
return rc;
}

@@ -3373,7 +3360,7 @@ cifs_write_from_iter(loff_t fpos, size_t len, struct iov_iter *from,
{
int rc = 0;
size_t cur_len, max_len;
- struct cifs_writedata *wdata;
+ struct cifs_io_subrequest *wdata;
pid_t pid;
struct TCP_Server_Info *server;
unsigned int xid, max_segs = INT_MAX;
@@ -3437,6 +3424,7 @@ cifs_write_from_iter(loff_t fpos, size_t len, struct iov_iter *from,
break;
}

+ wdata->uncached = true;
wdata->sync_mode = WB_SYNC_ALL;
wdata->offset = (__u64)fpos;
wdata->cfile = cifsFileInfo_get(open_file);
@@ -3456,14 +3444,12 @@ cifs_write_from_iter(loff_t fpos, size_t len, struct iov_iter *from,
if (wdata->cfile->invalidHandle)
rc = -EAGAIN;
else
- rc = server->ops->async_writev(wdata,
- cifs_uncached_writedata_release);
+ rc = server->ops->async_writev(wdata);
}

if (rc) {
add_credits_and_wake_if(server, &wdata->credits, 0);
- kref_put(&wdata->refcount,
- cifs_uncached_writedata_release);
+ cifs_put_writedata(wdata);
if (rc == -EAGAIN)
continue;
break;
@@ -3481,7 +3467,7 @@ cifs_write_from_iter(loff_t fpos, size_t len, struct iov_iter *from,

static void collect_uncached_write_data(struct cifs_aio_ctx *ctx)
{
- struct cifs_writedata *wdata, *tmp;
+ struct cifs_io_subrequest *wdata, *tmp;
struct cifs_tcon *tcon;
struct cifs_sb_info *cifs_sb;
struct dentry *dentry = ctx->cfile->dentry;
@@ -3536,8 +3522,7 @@ static void collect_uncached_write_data(struct cifs_aio_ctx *ctx)
ctx->cfile, cifs_sb, &tmp_list,
ctx);

- kref_put(&wdata->refcount,
- cifs_uncached_writedata_release);
+ cifs_put_writedata(wdata);
}

list_splice(&tmp_list, &ctx->list);
@@ -3545,7 +3530,7 @@ static void collect_uncached_write_data(struct cifs_aio_ctx *ctx)
}
}
list_del_init(&wdata->list);
- kref_put(&wdata->refcount, cifs_uncached_writedata_release);
+ cifs_put_writedata(wdata);
}

cifs_stats_bytes_written(tcon, ctx->total_len);
diff --git a/fs/smb/client/smb2pdu.c b/fs/smb/client/smb2pdu.c
index 148652891ead..85a85821390f 100644
--- a/fs/smb/client/smb2pdu.c
+++ b/fs/smb/client/smb2pdu.c
@@ -4520,7 +4520,7 @@ SMB2_read(const unsigned int xid, struct cifs_io_parms *io_parms,
static void
smb2_writev_callback(struct mid_q_entry *mid)
{
- struct cifs_writedata *wdata = mid->callback_data;
+ struct cifs_io_subrequest *wdata = mid->callback_data;
struct cifs_tcon *tcon = tlink_tcon(wdata->cfile->tlink);
struct TCP_Server_Info *server = wdata->server;
unsigned int written;
@@ -4601,8 +4601,7 @@ smb2_writev_callback(struct mid_q_entry *mid)

/* smb2_async_writev - send an async write, and set up mid to handle result */
int
-smb2_async_writev(struct cifs_writedata *wdata,
- void (*release)(struct kref *kref))
+smb2_async_writev(struct cifs_io_subrequest *wdata)
{
int rc = -EACCES, flags = 0;
struct smb2_write_req *req = NULL;
@@ -4734,7 +4733,7 @@ smb2_async_writev(struct cifs_writedata *wdata,
flags |= CIFS_HAS_CREDITS;
}

- kref_get(&wdata->refcount);
+ cifs_get_writedata(wdata);
rc = cifs_call_async(server, &rqst, NULL, smb2_writev_callback, NULL,
wdata, flags, &wdata->credits);

@@ -4746,7 +4745,7 @@ smb2_async_writev(struct cifs_writedata *wdata,
io_parms->offset,
io_parms->length,
rc);
- kref_put(&wdata->refcount, release);
+ cifs_put_writedata(wdata);
cifs_stats_fail_inc(tcon, SMB2_WRITE_HE);
}

diff --git a/fs/smb/client/smb2proto.h b/fs/smb/client/smb2proto.h
index 02ffe5ec9b21..4d3d51e42d3c 100644
--- a/fs/smb/client/smb2proto.h
+++ b/fs/smb/client/smb2proto.h
@@ -189,8 +189,7 @@ extern int SMB2_get_srv_num(const unsigned int xid, struct cifs_tcon *tcon,
extern int smb2_async_readv(struct cifs_io_subrequest *rdata);
extern int SMB2_read(const unsigned int xid, struct cifs_io_parms *io_parms,
unsigned int *nbytes, char **buf, int *buf_type);
-extern int smb2_async_writev(struct cifs_writedata *wdata,
- void (*release)(struct kref *kref));
+extern int smb2_async_writev(struct cifs_io_subrequest *wdata);
extern int SMB2_write(const unsigned int xid, struct cifs_io_parms *io_parms,
unsigned int *nbytes, struct kvec *iov, int n_vec);
extern int SMB2_echo(struct TCP_Server_Info *server);

2023-11-17 21:21:26

by David Howells

[permalink] [raw]
Subject: [PATCH v2 34/51] netfs: Decrypt encrypted content

Implement a facility to provide decryption for encrypted content to a whole
read-request in one go (which might have been stitched together from
disparate sources with divisions that don't match page boundaries).

Note that this doesn't necessarily gain the best throughput if the crypto
block size is equal to or less than the size of a page (in which case we
might be better off decrypting each page as it is read), but it will handle
crypto blocks larger than the size of a page.
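
To show what the new ->decrypt_block() hook might look like on the filesystem
side, here is a sketch using the kernel skcipher API. Everything prefixed
"myfs_" (the crypto-info lookup and the IV derivation) is an invented
assumption for illustration; the hook merely has to decrypt len bytes of
source_sg into dest_sg for the block at pos:

	#include <crypto/skcipher.h>

	static int myfs_decrypt_block(struct netfs_io_request *rreq, loff_t pos, size_t len,
				      struct scatterlist *source_sg, unsigned int n_source,
				      struct scatterlist *dest_sg, unsigned int n_dest)
	{
		struct myfs_crypt_info *ci = myfs_crypt_info(rreq->inode);
		struct skcipher_request *req;
		DECLARE_CRYPTO_WAIT(wait);
		u8 iv[16];
		int ret;

		req = skcipher_request_alloc(ci->tfm, GFP_NOFS);
		if (!req)
			return -ENOMEM;

		myfs_derive_iv(ci, pos, iv);	/* per-block IV keyed on file position */

		skcipher_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP,
					      crypto_req_done, &wait);
		skcipher_request_set_crypt(req, source_sg, dest_sg, len, iv);
		ret = crypto_wait_req(crypto_skcipher_decrypt(req), &wait);
		skcipher_request_free(req);
		return ret;
	}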

Signed-off-by: David Howells <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/netfs/crypto.c | 59 ++++++++++++++++++++++++++++++++++++
fs/netfs/internal.h | 1 +
fs/netfs/io.c | 6 +++-
include/linux/netfs.h | 3 ++
include/trace/events/netfs.h | 2 ++
5 files changed, 70 insertions(+), 1 deletion(-)

diff --git a/fs/netfs/crypto.c b/fs/netfs/crypto.c
index 943d01f430e2..6729bcda4f47 100644
--- a/fs/netfs/crypto.c
+++ b/fs/netfs/crypto.c
@@ -87,3 +87,62 @@ bool netfs_encrypt(struct netfs_io_request *wreq)
wreq->error = ret;
return false;
}
+
+/*
+ * Decrypt the result of a read request.
+ */
+void netfs_decrypt(struct netfs_io_request *rreq)
+{
+ struct netfs_inode *ctx = netfs_inode(rreq->inode);
+ struct scatterlist source_sg[16], dest_sg[16];
+ unsigned int n_source;
+ size_t n, chunk, bsize = 1UL << ctx->crypto_bshift;
+ loff_t pos;
+ int ret;
+
+ trace_netfs_rreq(rreq, netfs_rreq_trace_decrypt);
+ if (rreq->start >= rreq->i_size)
+ return;
+
+ n = min_t(unsigned long long, rreq->len, rreq->i_size - rreq->start);
+
+ _debug("DECRYPT %llx-%llx f=%lx",
+ rreq->start, rreq->start + n, rreq->flags);
+
+ pos = rreq->start;
+ for (; n > 0; n -= chunk, pos += chunk) {
+ chunk = min(n, bsize);
+
+ ret = netfs_iter_to_sglist(&rreq->io_iter, chunk,
+ source_sg, ARRAY_SIZE(source_sg));
+ if (ret < 0)
+ goto error;
+ n_source = ret;
+
+ if (test_bit(NETFS_RREQ_CRYPT_IN_PLACE, &rreq->flags)) {
+ ret = ctx->ops->decrypt_block(rreq, pos, chunk,
+ source_sg, n_source,
+ source_sg, n_source);
+ } else {
+ ret = netfs_iter_to_sglist(&rreq->iter, chunk,
+ dest_sg, ARRAY_SIZE(dest_sg));
+ if (ret < 0)
+ goto error;
+ ret = ctx->ops->decrypt_block(rreq, pos, chunk,
+ source_sg, n_source,
+ dest_sg, ret);
+ }
+
+ if (ret < 0)
+ goto error_failed;
+ }
+
+ return;
+
+error_failed:
+ trace_netfs_failure(rreq, NULL, ret, netfs_fail_decryption);
+error:
+ rreq->error = ret;
+ set_bit(NETFS_RREQ_FAILED, &rreq->flags);
+ return;
+}
diff --git a/fs/netfs/internal.h b/fs/netfs/internal.h
index d3e74ad478ce..fbecfd9b3174 100644
--- a/fs/netfs/internal.h
+++ b/fs/netfs/internal.h
@@ -26,6 +26,7 @@ int netfs_prefetch_for_write(struct file *file, struct folio *folio,
* crypto.c
*/
bool netfs_encrypt(struct netfs_io_request *wreq);
+void netfs_decrypt(struct netfs_io_request *rreq);

/*
* direct_write.c
diff --git a/fs/netfs/io.c b/fs/netfs/io.c
index 36a3f720193a..9887b22e4cb3 100644
--- a/fs/netfs/io.c
+++ b/fs/netfs/io.c
@@ -398,6 +398,9 @@ static void netfs_rreq_assess(struct netfs_io_request *rreq, bool was_async)
return;
}

+ if (!test_bit(NETFS_RREQ_FAILED, &rreq->flags) &&
+ test_bit(NETFS_RREQ_CONTENT_ENCRYPTION, &rreq->flags))
+ netfs_decrypt(rreq);
if (rreq->origin != NETFS_DIO_READ)
netfs_rreq_unlock_folios(rreq);
else
@@ -427,7 +430,8 @@ static void netfs_rreq_work(struct work_struct *work)
static void netfs_rreq_terminated(struct netfs_io_request *rreq,
bool was_async)
{
- if (test_bit(NETFS_RREQ_INCOMPLETE_IO, &rreq->flags) &&
+ if ((test_bit(NETFS_RREQ_INCOMPLETE_IO, &rreq->flags) ||
+ test_bit(NETFS_RREQ_CONTENT_ENCRYPTION, &rreq->flags)) &&
was_async) {
if (!queue_work(system_unbound_wq, &rreq->work))
BUG();
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 639f1f9cb7e0..364361cc93be 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -327,6 +327,9 @@ struct netfs_request_ops {
int (*encrypt_block)(struct netfs_io_request *wreq, loff_t pos, size_t len,
struct scatterlist *source_sg, unsigned int n_source,
struct scatterlist *dest_sg, unsigned int n_dest);
+ int (*decrypt_block)(struct netfs_io_request *rreq, loff_t pos, size_t len,
+ struct scatterlist *source_sg, unsigned int n_source,
+ struct scatterlist *dest_sg, unsigned int n_dest);
};

/*
diff --git a/include/trace/events/netfs.h b/include/trace/events/netfs.h
index 70e2f9a48f24..2f35057602fa 100644
--- a/include/trace/events/netfs.h
+++ b/include/trace/events/netfs.h
@@ -40,6 +40,7 @@
#define netfs_rreq_traces \
EM(netfs_rreq_trace_assess, "ASSESS ") \
EM(netfs_rreq_trace_copy, "COPY ") \
+ EM(netfs_rreq_trace_decrypt, "DECRYPT") \
EM(netfs_rreq_trace_done, "DONE ") \
EM(netfs_rreq_trace_encrypt, "ENCRYPT") \
EM(netfs_rreq_trace_free, "FREE ") \
@@ -75,6 +76,7 @@
#define netfs_failures \
EM(netfs_fail_check_write_begin, "check-write-begin") \
EM(netfs_fail_copy_to_cache, "copy-to-cache") \
+ EM(netfs_fail_decryption, "decryption") \
EM(netfs_fail_dio_read_short, "dio-read-short") \
EM(netfs_fail_dio_read_zero, "dio-read-zero") \
EM(netfs_fail_encryption, "encryption") \

2023-11-17 21:21:39

by David Howells

[permalink] [raw]
Subject: [PATCH v2 36/51] netfs: Support encryption on Unbuffered/DIO write

Support unbuffered and direct I/O writes to an encrypted file. This may
require performing an RMW cycle if the write is not appropriately aligned with
respect to the crypto blocks.
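
As a worked example (figures for illustration only): with the minimum block
size set to 4KiB for the crypto, an unbuffered write of 100 bytes at file
position 5000 gives gap_before = 5000 & 4095 = 904 and
gap_after = (4096 - 5100) & 4095 = 3092, so the bounce buffer spans bytes
4096-8192; the two gaps are filled by an RMW read (or zeroed where they lie
beyond the remote file size) before the whole block is encrypted and written
back.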

Signed-off-by: David Howells <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/netfs/direct_read.c | 2 +-
fs/netfs/direct_write.c | 210 ++++++++++++++++++++++++++++++++++-
fs/netfs/internal.h | 8 ++
fs/netfs/io.c | 117 +++++++++++++++++++
fs/netfs/main.c | 1 +
include/linux/netfs.h | 4 +
include/trace/events/netfs.h | 1 +
7 files changed, 337 insertions(+), 6 deletions(-)

diff --git a/fs/netfs/direct_read.c b/fs/netfs/direct_read.c
index 158719b56900..c01cbe42db8a 100644
--- a/fs/netfs/direct_read.c
+++ b/fs/netfs/direct_read.c
@@ -88,7 +88,7 @@ static int netfs_copy_xarray_to_iter(struct netfs_io_request *rreq,
* If we did a direct read to a bounce buffer (say we needed to decrypt it),
* copy the data obtained to the destination iterator.
*/
-static int netfs_dio_copy_bounce_to_dest(struct netfs_io_request *rreq)
+int netfs_dio_copy_bounce_to_dest(struct netfs_io_request *rreq)
{
struct iov_iter *dest_iter = &rreq->iter;
struct kiocb *iocb = rreq->iocb;
diff --git a/fs/netfs/direct_write.c b/fs/netfs/direct_write.c
index b1a4921ac4a2..f9dea801d6dd 100644
--- a/fs/netfs/direct_write.c
+++ b/fs/netfs/direct_write.c
@@ -23,6 +23,100 @@ static void netfs_cleanup_dio_write(struct netfs_io_request *wreq)
}
}

+/*
+ * Allocate a bunch of pages and add them into the xarray buffer starting at
+ * the given index.
+ */
+static int netfs_alloc_buffer(struct xarray *xa, pgoff_t index, unsigned int nr_pages)
+{
+ struct page *page;
+ unsigned int n;
+ int ret = 0;
+ LIST_HEAD(list);
+
+ n = alloc_pages_bulk_list(GFP_NOIO, nr_pages, &list);
+ if (n < nr_pages) {
+ ret = -ENOMEM;
+ }
+
+ while ((page = list_first_entry_or_null(&list, struct page, lru))) {
+ list_del(&page->lru);
+ page->index = index;
+ ret = xa_insert(xa, index++, page, GFP_NOIO);
+ if (ret < 0)
+ break;
+ }
+
+ while ((page = list_first_entry_or_null(&list, struct page, lru))) {
+ list_del(&page->lru);
+ __free_page(page);
+ }
+ return ret;
+}
+
+/*
+ * Copy all of the data from the source iterator into folios in the destination
+ * xarray. We cannot step through and kmap the source iterator if it's an
+ * iovec, so we have to step through the xarray and drop the RCU lock each
+ * time.
+ */
+static int netfs_copy_iter_to_xarray(struct iov_iter *src, struct xarray *xa,
+ unsigned long long start)
+{
+ struct folio *folio;
+ void *base;
+ pgoff_t index = start / PAGE_SIZE;
+ size_t len, copied, count = iov_iter_count(src);
+
+ XA_STATE(xas, xa, index);
+
+ _enter("%zx", count);
+
+ if (!count)
+ return -EIO;
+
+ len = PAGE_SIZE - offset_in_page(start);
+ rcu_read_lock();
+ xas_for_each(&xas, folio, ULONG_MAX) {
+ size_t offset;
+
+ if (xas_retry(&xas, folio))
+ continue;
+
+ /* There shouldn't be a need to call xas_pause() as no one else
+ * can see the xarray we're iterating over.
+ */
+ rcu_read_unlock();
+
+ offset = offset_in_folio(folio, start);
+ _debug("folio %lx +%zx [%llx]", folio->index, offset, start);
+
+ while (offset < folio_size(folio)) {
+ len = min(count, len);
+
+ base = kmap_local_folio(folio, offset);
+ copied = copy_from_iter(base, len, src);
+ kunmap_local(base);
+ if (copied != len)
+ goto out;
+ count -= len;
+ if (count == 0)
+ goto out;
+
+ start += len;
+ offset += len;
+ len = PAGE_SIZE;
+ }
+
+ rcu_read_lock();
+ }
+
+ rcu_read_unlock();
+out:
+ _leave(" = %zx", count);
+ return count ? -EIO : 0;
+}
+
/*
* Perform an unbuffered write where we may have to do an RMW operation on an
* encrypted file. This can also be used for direct I/O writes.
@@ -31,20 +125,47 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter *
struct netfs_group *netfs_group)
{
struct netfs_io_request *wreq;
+ struct netfs_inode *ctx = netfs_inode(file_inode(iocb->ki_filp));
+ unsigned long long real_size = ctx->remote_i_size;
unsigned long long start = iocb->ki_pos;
unsigned long long end = start + iov_iter_count(iter);
ssize_t ret, n;
- bool async = !is_sync_kiocb(iocb);
+ size_t min_bsize = 1UL << ctx->min_bshift;
+ size_t bmask = min_bsize - 1;
+ size_t gap_before = start & bmask;
+ size_t gap_after = (min_bsize - end) & bmask;
+ bool use_bounce, async = !is_sync_kiocb(iocb);
+ enum {
+ DIRECT_IO, COPY_TO_BOUNCE, ENC_TO_BOUNCE, COPY_THEN_ENC,
+ } buffering;

_enter("");

+ /* The real size must be rounded out to the crypto block size plus
+ * any trailer we might want to attach.
+ */
+ if (real_size && ctx->crypto_bshift) {
+ size_t cmask = 1UL << ctx->crypto_bshift;
+
+ if (real_size < ctx->crypto_trailer)
+ return -EIO;
+ if ((real_size - ctx->crypto_trailer) & cmask)
+ return -EIO;
+ real_size -= ctx->crypto_trailer;
+ }
+
/* We're going to need a bounce buffer if what we transmit is going to
* be different in some way to the source buffer, e.g. because it gets
* encrypted/compressed or because it needs expanding to a block size.
*/
- // TODO
+ use_bounce = test_bit(NETFS_ICTX_ENCRYPTED, &ctx->flags);
+ if (gap_before || gap_after) {
+ if (iocb->ki_flags & IOCB_DIRECT)
+ return -EINVAL;
+ use_bounce = true;
+ }

- _debug("uw %llx-%llx", start, end);
+ _debug("uw %llx-%llx +%zx,%zx", start, end, gap_before, gap_after);

wreq = netfs_alloc_request(iocb->ki_filp->f_mapping, iocb->ki_filp,
start, end - start,
@@ -53,7 +174,57 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter *
if (IS_ERR(wreq))
return PTR_ERR(wreq);

- {
+ if (use_bounce) {
+ unsigned long long bstart = start - gap_before;
+ unsigned long long bend = end + gap_after;
+ pgoff_t first = bstart / PAGE_SIZE;
+ pgoff_t last = (bend - 1) / PAGE_SIZE;
+
+ _debug("bounce %llx-%llx %lx-%lx", bstart, bend, first, last);
+
+ ret = netfs_alloc_buffer(&wreq->bounce, first, last - first + 1);
+ if (ret < 0)
+ goto out;
+
+ iov_iter_xarray(&wreq->io_iter, READ, &wreq->bounce,
+ bstart, bend - bstart);
+
+ if (gap_before || gap_after)
+ async = false; /* We may have to repeat the RMW cycle */
+ }
+
+repeat_rmw_cycle:
+ if (use_bounce) {
+ /* If we're going to need to do an RMW cycle, fill in the gaps
+ * at the ends of the buffer.
+ */
+ if (gap_before || gap_after) {
+ struct iov_iter buffer = wreq->io_iter;
+
+ if ((gap_before && start - gap_before < real_size) ||
+ (gap_after && end < real_size)) {
+ ret = netfs_rmw_read(wreq, iocb->ki_filp,
+ start - gap_before, gap_before,
+ end, end < real_size ? gap_after : 0);
+ if (ret < 0)
+ goto out;
+ }
+
+ if (gap_before && start - gap_before >= real_size)
+ iov_iter_zero(gap_before, &buffer);
+ if (gap_after && end >= real_size) {
+ iov_iter_advance(&buffer, end - start);
+ iov_iter_zero(gap_after, &buffer);
+ }
+ }
+
+ if (!test_bit(NETFS_RREQ_CONTENT_ENCRYPTION, &wreq->flags))
+ buffering = COPY_TO_BOUNCE;
+ else if (!gap_before && !gap_after && netfs_is_crypto_aligned(wreq, iter))
+ buffering = ENC_TO_BOUNCE;
+ else
+ buffering = COPY_THEN_ENC;
+ } else {
/* If this is an async op and we're not using a bounce buffer,
* we have to save the source buffer as the iterator is only
* good until we return. In such a case, extract an iterator
@@ -77,10 +248,25 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter *
}

wreq->io_iter = wreq->iter;
+ buffering = DIRECT_IO;
}

/* Copy the data into the bounce buffer and encrypt it. */
- // TODO
+ if (buffering == COPY_TO_BOUNCE ||
+ buffering == COPY_THEN_ENC) {
+ ret = netfs_copy_iter_to_xarray(iter, &wreq->bounce, wreq->start);
+ if (ret < 0)
+ goto out;
+ wreq->iter = wreq->io_iter;
+ wreq->start -= gap_before;
+ wreq->len += gap_before + gap_after;
+ }
+
+ if (buffering == COPY_THEN_ENC ||
+ buffering == ENC_TO_BOUNCE) {
+ if (!netfs_encrypt(wreq))
+ goto out;
+ }

/* Dispatch the write. */
__set_bit(NETFS_RREQ_UPLOAD_TO_SERVER, &wreq->flags);
@@ -101,6 +287,20 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter *
wait_on_bit(&wreq->flags, NETFS_RREQ_IN_PROGRESS,
TASK_UNINTERRUPTIBLE);

+ /* See if the write failed due to a 3rd party race when doing
+ * an RMW on a partially modified block in an encrypted file.
+ */
+ if (test_and_clear_bit(NETFS_RREQ_REPEAT_RMW, &wreq->flags)) {
+ netfs_clear_subrequests(wreq, false);
+ iov_iter_revert(iter, end - start);
+ wreq->error = 0;
+ wreq->start = start;
+ wreq->len = end - start;
+ wreq->transferred = 0;
+ wreq->submitted = 0;
+ goto repeat_rmw_cycle;
+ }
+
ret = wreq->error;
_debug("waited = %zd", ret);
if (ret == 0) {
diff --git a/fs/netfs/internal.h b/fs/netfs/internal.h
index 447a67301329..782b73b1f5a7 100644
--- a/fs/netfs/internal.h
+++ b/fs/netfs/internal.h
@@ -28,6 +28,11 @@ int netfs_prefetch_for_write(struct file *file, struct folio *folio,
bool netfs_encrypt(struct netfs_io_request *wreq);
void netfs_decrypt(struct netfs_io_request *rreq);

+/*
+ * direct_read.c
+ */
+int netfs_dio_copy_bounce_to_dest(struct netfs_io_request *rreq);
+
/*
* direct_write.c
*/
@@ -38,6 +43,9 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter *
* io.c
*/
int netfs_begin_read(struct netfs_io_request *rreq, bool sync);
+ssize_t netfs_rmw_read(struct netfs_io_request *wreq, struct file *file,
+ unsigned long long start1, size_t len1,
+ unsigned long long start2, size_t len2);

/*
* main.c
diff --git a/fs/netfs/io.c b/fs/netfs/io.c
index 9887b22e4cb3..14a9f3312d3b 100644
--- a/fs/netfs/io.c
+++ b/fs/netfs/io.c
@@ -775,3 +775,120 @@ int netfs_begin_read(struct netfs_io_request *rreq, bool sync)
out:
return ret;
}
+
+static bool netfs_rmw_read_one(struct netfs_io_request *rreq,
+ unsigned long long start, size_t len)
+{
+ struct netfs_inode *ctx = netfs_inode(rreq->inode);
+ struct iov_iter io_iter;
+ unsigned long long pstart, end = start + len;
+ pgoff_t first, last;
+ ssize_t ret;
+ size_t min_bsize = 1UL << ctx->min_bshift;
+
+ /* Determine the block we need to load. */
+ end = round_up(end, min_bsize);
+ start = round_down(start, min_bsize);
+
+ /* Determine the folios we need to insert. */
+ pstart = round_down(start, PAGE_SIZE);
+ first = pstart / PAGE_SIZE;
+ last = DIV_ROUND_UP(end, PAGE_SIZE);
+
+ ret = netfs_add_folios_to_buffer(&rreq->bounce, rreq->mapping,
+ first, last, GFP_NOFS);
+ if (ret < 0) {
+ rreq->error = ret;
+ return false;
+ }
+
+ rreq->start = start;
+ rreq->len = len;
+ rreq->submitted = 0;
+ iov_iter_xarray(&rreq->io_iter, ITER_DEST, &rreq->bounce, start, len);
+
+ io_iter = rreq->io_iter;
+ do {
+ _debug("submit %llx + %zx >= %llx",
+ rreq->start, rreq->submitted, rreq->i_size);
+ if (rreq->start + rreq->submitted >= rreq->i_size)
+ break;
+ if (!netfs_rreq_submit_slice(rreq, &io_iter, &rreq->subreq_counter))
+ break;
+ } while (rreq->submitted < rreq->len);
+
+ if (rreq->submitted < rreq->len) {
+ netfs_put_request(rreq, false, netfs_rreq_trace_put_no_submit);
+ return false;
+ }
+
+ return true;
+}
+
+/*
+ * Begin the process of reading in one or two chunks of data for use by
+ * unbuffered write to perform an RMW cycle. We don't read directly into the
+ * write buffer as this may get called to redo the read in the case that a
+ * conditional write fails due to conflicting 3rd-party modifications.
+ */
+ssize_t netfs_rmw_read(struct netfs_io_request *wreq, struct file *file,
+ unsigned long long start1, size_t len1,
+ unsigned long long start2, size_t len2)
+{
+ struct netfs_io_request *rreq;
+ ssize_t ret;
+
+ _enter("RMW:R=%x %llx-%llx %llx-%llx",
+ rreq->debug_id, start1, start1 + len1 - 1, start2, start2 + len2 - 1);
+
+ rreq = netfs_alloc_request(wreq->mapping, file,
+ start1, start2 - start1 + len2, NETFS_RMW_READ);
+ if (IS_ERR(rreq))
+ return PTR_ERR(rreq);
+
+ INIT_WORK(&rreq->work, netfs_rreq_work);
+
+ rreq->iter = wreq->io_iter;
+ __set_bit(NETFS_RREQ_CRYPT_IN_PLACE, &rreq->flags);
+ __set_bit(NETFS_RREQ_USE_BOUNCE_BUFFER, &rreq->flags);
+
+ /* Chop the reads into slices according to what the netfs wants and
+ * submit each one.
+ */
+ netfs_get_request(rreq, netfs_rreq_trace_get_for_outstanding);
+ atomic_set(&rreq->nr_outstanding, 1);
+ if (len1 && !netfs_rmw_read_one(rreq, start1, len1))
+ goto wait;
+ if (len2)
+ netfs_rmw_read_one(rreq, start2, len2);
+
+wait:
+ /* Keep nr_outstanding incremented so that the ref always belongs to us
+ * and the service code isn't punted off to a random thread pool to
+ * process.
+ */
+ for (;;) {
+ wait_var_event(&rreq->nr_outstanding,
+ atomic_read(&rreq->nr_outstanding) == 1);
+ netfs_rreq_assess(rreq, false);
+ if (atomic_read(&rreq->nr_outstanding) == 1)
+ break;
+ cond_resched();
+ }
+
+ trace_netfs_rreq(wreq, netfs_rreq_trace_wait_ip);
+ wait_on_bit(&wreq->flags, NETFS_RREQ_IN_PROGRESS,
+ TASK_UNINTERRUPTIBLE);
+
+ ret = rreq->error;
+ if (ret == 0 && rreq->submitted < rreq->len) {
+ trace_netfs_failure(rreq, NULL, ret, netfs_fail_short_read);
+ ret = -EIO;
+ }
+
+ if (ret == 0)
+ ret = netfs_dio_copy_bounce_to_dest(rreq);
+
+ netfs_put_request(rreq, false, netfs_rreq_trace_put_return);
+ return ret;
+}
diff --git a/fs/netfs/main.c b/fs/netfs/main.c
index 1cf10f9c4c1f..b335e6a50f9c 100644
--- a/fs/netfs/main.c
+++ b/fs/netfs/main.c
@@ -33,6 +33,7 @@ static const char *netfs_origins[nr__netfs_io_origin] = {
[NETFS_READPAGE] = "RP",
[NETFS_READ_FOR_WRITE] = "RW",
[NETFS_WRITEBACK] = "WB",
+ [NETFS_RMW_READ] = "RM",
[NETFS_UNBUFFERED_WRITE] = "UW",
[NETFS_DIO_READ] = "DR",
[NETFS_DIO_WRITE] = "DW",
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 364361cc93be..c3d1eac1ce51 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -145,6 +145,7 @@ struct netfs_inode {
#define NETFS_ICTX_ENCRYPTED 2 /* The file contents are encrypted */
unsigned char min_bshift; /* log2 min block size for bounding box or 0 */
unsigned char crypto_bshift; /* log2 of crypto block size */
+ unsigned char crypto_trailer; /* Size of crypto trailer */
};

/*
@@ -233,6 +234,7 @@ enum netfs_io_origin {
NETFS_READPAGE, /* This read is a synchronous read */
NETFS_READ_FOR_WRITE, /* This read is to prepare a write */
NETFS_WRITEBACK, /* This write was triggered by writepages */
+ NETFS_RMW_READ, /* This is an unbuffered read for RMW */
NETFS_UNBUFFERED_WRITE, /* This is an unbuffered write */
NETFS_DIO_READ, /* This is a direct I/O read */
NETFS_DIO_WRITE, /* This is a direct I/O write */
@@ -291,6 +293,7 @@ struct netfs_io_request {
#define NETFS_RREQ_BLOCKED 10 /* We blocked */
#define NETFS_RREQ_CONTENT_ENCRYPTION 11 /* Content encryption is in use */
#define NETFS_RREQ_CRYPT_IN_PLACE 12 /* Enc/dec in place in ->io_iter */
+#define NETFS_RREQ_REPEAT_RMW 13 /* Need to repeat RMW cycle */
const struct netfs_request_ops *netfs_ops;
void (*cleanup)(struct netfs_io_request *req);
};
@@ -479,6 +482,7 @@ static inline void netfs_inode_init(struct netfs_inode *ctx,
ctx->flags = 0;
ctx->min_bshift = 0;
ctx->crypto_bshift = 0;
+ ctx->crypto_trailer = 0;
#if IS_ENABLED(CONFIG_FSCACHE)
ctx->cache = NULL;
#endif
diff --git a/include/trace/events/netfs.h b/include/trace/events/netfs.h
index 2f35057602fa..825946f510ee 100644
--- a/include/trace/events/netfs.h
+++ b/include/trace/events/netfs.h
@@ -33,6 +33,7 @@
EM(NETFS_READPAGE, "RP") \
EM(NETFS_READ_FOR_WRITE, "RW") \
EM(NETFS_WRITEBACK, "WB") \
+ EM(NETFS_RMW_READ, "RM") \
EM(NETFS_UNBUFFERED_WRITE, "UW") \
EM(NETFS_DIO_READ, "DR") \
E_(NETFS_DIO_WRITE, "DW")

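For illustration, the buffering-mode selection in the unbuffered write path
above reduces to the following standalone sketch. It is not part of the
patch: the enum names mirror the diff, while choose_buffering() and its
parameters are hypothetical stand-ins for the request flags and helpers
used there.

/* Build with: cc -o buffering buffering.c */
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

enum buffering_mode {
	DIRECT_IO,	/* no bounce buffer: write from the caller's iterator */
	COPY_TO_BOUNCE,	/* copy into the bounce buffer, no encryption */
	ENC_TO_BOUNCE,	/* encrypt straight from the source into the bounce */
	COPY_THEN_ENC,	/* copy into the bounce buffer, then encrypt in place */
};

static enum buffering_mode choose_buffering(bool use_bounce, bool encrypted,
					    size_t gap_before, size_t gap_after,
					    bool crypto_aligned)
{
	if (!use_bounce)
		return DIRECT_IO;
	if (!encrypted)
		return COPY_TO_BOUNCE;
	/* A crypto-aligned write with no RMW gaps can be encrypted directly
	 * from the source iterator into the bounce buffer.
	 */
	if (!gap_before && !gap_after && crypto_aligned)
		return ENC_TO_BOUNCE;
	/* Otherwise the gaps are read in (or zeroed beyond EOF) first and
	 * the whole block is encrypted after the copy.
	 */
	return COPY_THEN_ENC;
}

int main(void)
{
	/* A misaligned write into an encrypted file needs copy-then-encrypt. */
	printf("%d\n", choose_buffering(true, true, 512, 0, false));
	return 0;
}
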
2023-11-17 21:22:27

by David Howells

[permalink] [raw]
Subject: [PATCH v2 39/51] netfs: Rearrange netfs_io_subrequest to put request pointer first

Rearrange the netfs_io_subrequest struct to put the netfs_io_request
pointer (rreq) first. This then allows netfs_io_subrequest to be put in a
union with a pointer to a wrapper around netfs_io_request for cifs.
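As a rough illustration (this is not code from the patch), putting the
request pointer first places it at offset 0 of the subrequest, the same
offset a bare pointer member occupies in a union, so either view can locate
the supervising request without extra indirection. The wrapper type below
is hypothetical and only stands in for the cifs wrapper mentioned above.

#include <stddef.h>
#include <stdio.h>

struct netfs_io_request { unsigned int debug_id; };

struct netfs_io_subrequest {
	struct netfs_io_request *rreq;	/* Supervising I/O request - now first */
	/* ... work, rreq_link, io_iter, start, len, ... */
};

struct cifs_request_wrapper;		/* hypothetical wrapper around the request */

union cifs_io_handle {
	struct netfs_io_subrequest subreq;
	struct cifs_request_wrapper *wrapper;
};

int main(void)
{
	/* Both members start at offset 0, so both print 0 here. */
	printf("subreq.rreq at %zu, wrapper at %zu\n",
	       offsetof(union cifs_io_handle, subreq.rreq),
	       offsetof(union cifs_io_handle, wrapper));
	return 0;
}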

Signed-off-by: David Howells <[email protected]>
cc: Steve French <[email protected]>
cc: Shyam Prasad N <[email protected]>
cc: Rohith Surabattula <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
include/linux/netfs.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 39f885fea383..7d9b61d21a70 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -209,8 +209,8 @@ struct netfs_cache_resources {
* the pages it points to can be relied on to exist for the duration.
*/
struct netfs_io_subrequest {
- struct work_struct work;
struct netfs_io_request *rreq; /* Supervising I/O request */
+ struct work_struct work;
struct list_head rreq_link; /* Link in rreq->subrequests */
struct iov_iter io_iter; /* Iterator for this subrequest */
loff_t start; /* Where to start the I/O */

2023-11-17 21:22:39

by David Howells

[permalink] [raw]
Subject: [PATCH v2 41/51] cifs: Replace cifs_readdata with a wrapper around netfs_io_subrequest

Netfslib has a facility whereby the allocation for netfs_io_subrequest can
be increased so that filesystem-specific data can be tagged on the end.

Prepare to use this by making a struct, cifs_io_subrequest, that wraps
netfs_io_subrequest, and absorb struct cifs_readdata into it.
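Below is a minimal sketch of the embedding pattern this sets up: netfslib
allocates a larger object with struct netfs_io_subrequest at the front and
the filesystem converts between the two views with container_of(). The
allocation helper and the simplified struct fields are stand-ins, not the
actual netfslib interface.

#include <stdlib.h>
#include <stddef.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct netfs_io_subrequest {
	unsigned long long start;	/* generic fields owned by netfslib */
	size_t len;
};

struct cifs_io_subrequest {
	struct netfs_io_subrequest subreq;	/* embedded at the front */
	int result;				/* cifs-private state follows */
};

/* Stand-in for netfslib allocating the enlarged subrequest. */
static struct netfs_io_subrequest *alloc_subrequest(size_t size)
{
	return calloc(1, size);
}

int main(void)
{
	struct netfs_io_subrequest *subreq =
		alloc_subrequest(sizeof(struct cifs_io_subrequest));
	struct cifs_io_subrequest *rdata;

	if (!subreq)
		return 1;
	rdata = container_of(subreq, struct cifs_io_subrequest, subreq);
	rdata->result = 0;	/* filesystem data "tagged on the end" */
	free(rdata);
	return 0;
}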

Signed-off-by: David Howells <[email protected]>
cc: Steve French <[email protected]>
cc: Shyam Prasad N <[email protected]>
cc: Rohith Surabattula <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/smb/client/cifsglob.h | 22 ++++++++++--------
fs/smb/client/cifsproto.h | 9 ++++++--
fs/smb/client/cifssmb.c | 11 ++++-----
fs/smb/client/file.c | 48 ++++++++++++++++++---------------------
fs/smb/client/smb2ops.c | 2 +-
fs/smb/client/smb2pdu.c | 13 ++++++-----
fs/smb/client/smb2proto.h | 2 +-
fs/smb/client/transport.c | 4 ++--
8 files changed, 56 insertions(+), 55 deletions(-)

diff --git a/fs/smb/client/cifsglob.h b/fs/smb/client/cifsglob.h
index 6ffbd81bd109..0f74aa12e6b6 100644
--- a/fs/smb/client/cifsglob.h
+++ b/fs/smb/client/cifsglob.h
@@ -237,7 +237,7 @@ struct dfs_info3_param;
struct cifs_fattr;
struct smb3_fs_context;
struct cifs_fid;
-struct cifs_readdata;
+struct cifs_io_subrequest;
struct cifs_writedata;
struct cifs_io_parms;
struct cifs_search_info;
@@ -411,7 +411,7 @@ struct smb_version_operations {
/* send a flush request to the server */
int (*flush)(const unsigned int, struct cifs_tcon *, struct cifs_fid *);
/* async read from the server */
- int (*async_readv)(struct cifs_readdata *);
+ int (*async_readv)(struct cifs_io_subrequest *);
/* async write to the server */
int (*async_writev)(struct cifs_writedata *,
void (*release)(struct kref *));
@@ -1427,26 +1427,28 @@ struct cifs_aio_ctx {
};

/* asynchronous read support */
-struct cifs_readdata {
- struct kref refcount;
- struct list_head list;
- struct completion done;
+struct cifs_io_subrequest {
+ struct netfs_io_subrequest subreq;
struct cifsFileInfo *cfile;
struct address_space *mapping;
struct cifs_aio_ctx *ctx;
- __u64 offset;
ssize_t got_bytes;
- unsigned int bytes;
pid_t pid;
int result;
- struct work_struct work;
- struct iov_iter iter;
struct kvec iov[2];
struct TCP_Server_Info *server;
#ifdef CONFIG_CIFS_SMB_DIRECT
struct smbd_mr *mr;
#endif
struct cifs_credits credits;
+
+ // TODO: Remove following elements
+ struct list_head list;
+ struct completion done;
+ struct work_struct work;
+ struct iov_iter iter;
+ __u64 offset;
+ unsigned int bytes;
};

/* asynchronous write support */
diff --git a/fs/smb/client/cifsproto.h b/fs/smb/client/cifsproto.h
index d87e2c26cce2..1702f95efda1 100644
--- a/fs/smb/client/cifsproto.h
+++ b/fs/smb/client/cifsproto.h
@@ -581,8 +581,13 @@ void __cifs_put_smb_ses(struct cifs_ses *ses);
extern struct cifs_ses *
cifs_get_smb_ses(struct TCP_Server_Info *server, struct smb3_fs_context *ctx);

-void cifs_readdata_release(struct kref *refcount);
-int cifs_async_readv(struct cifs_readdata *rdata);
+void cifs_readdata_release(struct cifs_io_subrequest *rdata);
+static inline void cifs_put_readdata(struct cifs_io_subrequest *rdata)
+{
+ if (refcount_dec_and_test(&rdata->subreq.ref))
+ cifs_readdata_release(rdata);
+}
+int cifs_async_readv(struct cifs_io_subrequest *rdata);
int cifs_readv_receive(struct TCP_Server_Info *server, struct mid_q_entry *mid);

int cifs_async_writev(struct cifs_writedata *wdata,
diff --git a/fs/smb/client/cifssmb.c b/fs/smb/client/cifssmb.c
index 25503f1a4fd2..76005b3d5ffe 100644
--- a/fs/smb/client/cifssmb.c
+++ b/fs/smb/client/cifssmb.c
@@ -24,6 +24,8 @@
#include <linux/swap.h>
#include <linux/task_io_accounting_ops.h>
#include <linux/uaccess.h>
+#include <linux/netfs.h>
+#include <trace/events/netfs.h>
#include "cifspdu.h"
#include "cifsfs.h"
#include "cifsglob.h"
@@ -1260,12 +1262,11 @@ CIFS_open(const unsigned int xid, struct cifs_open_parms *oparms, int *oplock,
static void
cifs_readv_callback(struct mid_q_entry *mid)
{
- struct cifs_readdata *rdata = mid->callback_data;
+ struct cifs_io_subrequest *rdata = mid->callback_data;
struct cifs_tcon *tcon = tlink_tcon(rdata->cfile->tlink);
struct TCP_Server_Info *server = tcon->ses->server;
struct smb_rqst rqst = { .rq_iov = rdata->iov,
.rq_nvec = 2,
- .rq_iter_size = iov_iter_count(&rdata->iter),
.rq_iter = rdata->iter };
struct cifs_credits credits = { .value = 1, .instance = 0 };

@@ -1310,7 +1311,7 @@ cifs_readv_callback(struct mid_q_entry *mid)

/* cifs_async_readv - send an async write, and set up mid to handle result */
int
-cifs_async_readv(struct cifs_readdata *rdata)
+cifs_async_readv(struct cifs_io_subrequest *rdata)
{
int rc;
READ_REQ *smb = NULL;
@@ -1362,15 +1363,11 @@ cifs_async_readv(struct cifs_readdata *rdata)
rdata->iov[1].iov_base = (char *)smb + 4;
rdata->iov[1].iov_len = get_rfc1002_length(smb);

- kref_get(&rdata->refcount);
rc = cifs_call_async(tcon->ses->server, &rqst, cifs_readv_receive,
cifs_readv_callback, NULL, rdata, 0, NULL);

if (rc == 0)
cifs_stats_inc(&tcon->stats.cifs_stats.num_reads);
- else
- kref_put(&rdata->refcount, cifs_readdata_release);
-
cifs_small_buf_release(smb);
return rc;
}
diff --git a/fs/smb/client/file.c b/fs/smb/client/file.c
index 45ca492c141c..8c9e33efb9a9 100644
--- a/fs/smb/client/file.c
+++ b/fs/smb/client/file.c
@@ -2949,7 +2949,7 @@ static int cifs_writepages_region(struct address_space *mapping,
continue;
}

- folio_batch_release(&fbatch);
+ folio_batch_release(&fbatch);
cond_resched();
} while (wbc->nr_to_write > 0);

@@ -3783,13 +3783,13 @@ cifs_strict_writev(struct kiocb *iocb, struct iov_iter *from)
return written;
}

-static struct cifs_readdata *cifs_readdata_alloc(work_func_t complete)
+static struct cifs_io_subrequest *cifs_readdata_alloc(work_func_t complete)
{
- struct cifs_readdata *rdata;
+ struct cifs_io_subrequest *rdata;

rdata = kzalloc(sizeof(*rdata), GFP_KERNEL);
if (rdata) {
- kref_init(&rdata->refcount);
+ refcount_set(&rdata->subreq.ref, 1);
INIT_LIST_HEAD(&rdata->list);
init_completion(&rdata->done);
INIT_WORK(&rdata->work, complete);
@@ -3799,11 +3799,8 @@ static struct cifs_readdata *cifs_readdata_alloc(work_func_t complete)
}

void
-cifs_readdata_release(struct kref *refcount)
+cifs_readdata_release(struct cifs_io_subrequest *rdata)
{
- struct cifs_readdata *rdata = container_of(refcount,
- struct cifs_readdata, refcount);
-
if (rdata->ctx)
kref_put(&rdata->ctx->refcount, cifs_aio_ctx_release);
#ifdef CONFIG_CIFS_SMB_DIRECT
@@ -3823,16 +3820,16 @@ static void collect_uncached_read_data(struct cifs_aio_ctx *ctx);
static void
cifs_uncached_readv_complete(struct work_struct *work)
{
- struct cifs_readdata *rdata = container_of(work,
- struct cifs_readdata, work);
+ struct cifs_io_subrequest *rdata =
+ container_of(work, struct cifs_io_subrequest, work);

complete(&rdata->done);
collect_uncached_read_data(rdata->ctx);
/* the below call can possibly free the last ref to aio ctx */
- kref_put(&rdata->refcount, cifs_readdata_release);
+ cifs_put_readdata(rdata);
}

-static int cifs_resend_rdata(struct cifs_readdata *rdata,
+static int cifs_resend_rdata(struct cifs_io_subrequest *rdata,
struct list_head *rdata_list,
struct cifs_aio_ctx *ctx)
{
@@ -3900,7 +3897,7 @@ static int cifs_resend_rdata(struct cifs_readdata *rdata,
} while (rc == -EAGAIN);

fail:
- kref_put(&rdata->refcount, cifs_readdata_release);
+ cifs_put_readdata(rdata);
return rc;
}

@@ -3909,7 +3906,7 @@ cifs_send_async_read(loff_t fpos, size_t len, struct cifsFileInfo *open_file,
struct cifs_sb_info *cifs_sb, struct list_head *rdata_list,
struct cifs_aio_ctx *ctx)
{
- struct cifs_readdata *rdata;
+ struct cifs_io_subrequest *rdata;
unsigned int rsize, nsegs, max_segs = INT_MAX;
struct cifs_credits credits_on_stack;
struct cifs_credits *credits = &credits_on_stack;
@@ -3977,7 +3974,7 @@ cifs_send_async_read(loff_t fpos, size_t len, struct cifsFileInfo *open_file,
rdata->ctx = ctx;
kref_get(&ctx->refcount);

- rdata->iter = ctx->iter;
+ rdata->iter = ctx->iter;
iov_iter_truncate(&rdata->iter, cur_len);

rc = adjust_credits(server, &rdata->credits, rdata->bytes);
@@ -3991,7 +3988,7 @@ cifs_send_async_read(loff_t fpos, size_t len, struct cifsFileInfo *open_file,

if (rc) {
add_credits_and_wake_if(server, &rdata->credits, 0);
- kref_put(&rdata->refcount, cifs_readdata_release);
+ cifs_put_readdata(rdata);
if (rc == -EAGAIN)
continue;
break;
@@ -4009,7 +4006,7 @@ cifs_send_async_read(loff_t fpos, size_t len, struct cifsFileInfo *open_file,
static void
collect_uncached_read_data(struct cifs_aio_ctx *ctx)
{
- struct cifs_readdata *rdata, *tmp;
+ struct cifs_io_subrequest *rdata, *tmp;
struct cifs_sb_info *cifs_sb;
int rc;

@@ -4055,8 +4052,7 @@ collect_uncached_read_data(struct cifs_aio_ctx *ctx)
rdata->cfile, cifs_sb,
&tmp_list, ctx);

- kref_put(&rdata->refcount,
- cifs_readdata_release);
+ cifs_put_readdata(rdata);
}

list_splice(&tmp_list, &ctx->list);
@@ -4072,7 +4068,7 @@ collect_uncached_read_data(struct cifs_aio_ctx *ctx)
ctx->total_len += rdata->got_bytes;
}
list_del_init(&rdata->list);
- kref_put(&rdata->refcount, cifs_readdata_release);
+ cifs_put_readdata(rdata);
}

/* mask nodata case */
@@ -4444,8 +4440,8 @@ static void cifs_unlock_folios(struct address_space *mapping, pgoff_t first, pgo

static void cifs_readahead_complete(struct work_struct *work)
{
- struct cifs_readdata *rdata = container_of(work,
- struct cifs_readdata, work);
+ struct cifs_io_subrequest *rdata = container_of(work,
+ struct cifs_io_subrequest, work);
struct folio *folio;
pgoff_t last;
bool good = rdata->result == 0 || (rdata->result == -EAGAIN && rdata->got_bytes);
@@ -4471,7 +4467,7 @@ static void cifs_readahead_complete(struct work_struct *work)
}
rcu_read_unlock();

- kref_put(&rdata->refcount, cifs_readdata_release);
+ cifs_put_readdata(rdata);
}

static void cifs_readahead(struct readahead_control *ractl)
@@ -4511,7 +4507,7 @@ static void cifs_readahead(struct readahead_control *ractl)
*/
while ((nr_pages = ra_pages)) {
unsigned int i, rsize;
- struct cifs_readdata *rdata;
+ struct cifs_io_subrequest *rdata;
struct cifs_credits credits_on_stack;
struct cifs_credits *credits = &credits_on_stack;
struct folio *folio;
@@ -4630,11 +4626,11 @@ static void cifs_readahead(struct readahead_control *ractl)
rdata->offset / PAGE_SIZE,
(rdata->offset + rdata->bytes - 1) / PAGE_SIZE);
/* Fallback to the readpage in error/reconnect cases */
- kref_put(&rdata->refcount, cifs_readdata_release);
+ cifs_put_readdata(rdata);
break;
}

- kref_put(&rdata->refcount, cifs_readdata_release);
+ cifs_put_readdata(rdata);
}

free_xid(xid);
diff --git a/fs/smb/client/smb2ops.c b/fs/smb/client/smb2ops.c
index a959ed2c9b22..ae0acf9c411d 100644
--- a/fs/smb/client/smb2ops.c
+++ b/fs/smb/client/smb2ops.c
@@ -4592,7 +4592,7 @@ handle_read_data(struct TCP_Server_Info *server, struct mid_q_entry *mid,
unsigned int cur_off;
unsigned int cur_page_idx;
unsigned int pad_len;
- struct cifs_readdata *rdata = mid->callback_data;
+ struct cifs_io_subrequest *rdata = mid->callback_data;
struct smb2_hdr *shdr = (struct smb2_hdr *)buf;
int length;
bool use_rdma_mr = false;
diff --git a/fs/smb/client/smb2pdu.c b/fs/smb/client/smb2pdu.c
index 2eb29fa278c3..148652891ead 100644
--- a/fs/smb/client/smb2pdu.c
+++ b/fs/smb/client/smb2pdu.c
@@ -23,6 +23,8 @@
#include <linux/uuid.h>
#include <linux/pagemap.h>
#include <linux/xattr.h>
+#include <linux/netfs.h>
+#include <trace/events/netfs.h>
#include "cifsglob.h"
#include "cifsacl.h"
#include "cifsproto.h"
@@ -4175,7 +4177,7 @@ static inline bool smb3_use_rdma_offload(struct cifs_io_parms *io_parms)
*/
static int
smb2_new_read_req(void **buf, unsigned int *total_len,
- struct cifs_io_parms *io_parms, struct cifs_readdata *rdata,
+ struct cifs_io_parms *io_parms, struct cifs_io_subrequest *rdata,
unsigned int remaining_bytes, int request_type)
{
int rc = -EACCES;
@@ -4267,13 +4269,14 @@ smb2_new_read_req(void **buf, unsigned int *total_len,
static void
smb2_readv_callback(struct mid_q_entry *mid)
{
- struct cifs_readdata *rdata = mid->callback_data;
+ struct cifs_io_subrequest *rdata = mid->callback_data;
struct cifs_tcon *tcon = tlink_tcon(rdata->cfile->tlink);
struct TCP_Server_Info *server = rdata->server;
struct smb2_hdr *shdr =
(struct smb2_hdr *)rdata->iov[0].iov_base;
struct cifs_credits credits = { .value = 0, .instance = 0 };
- struct smb_rqst rqst = { .rq_iov = &rdata->iov[1], .rq_nvec = 1 };
+ struct smb_rqst rqst = { .rq_iov = &rdata->iov[1],
+ .rq_nvec = 1 };

if (rdata->got_bytes) {
rqst.rq_iter = rdata->iter;
@@ -4354,7 +4357,7 @@ smb2_readv_callback(struct mid_q_entry *mid)

/* smb2_async_readv - send an async read, and set up mid to handle result */
int
-smb2_async_readv(struct cifs_readdata *rdata)
+smb2_async_readv(struct cifs_io_subrequest *rdata)
{
int rc, flags = 0;
char *buf;
@@ -4412,13 +4415,11 @@ smb2_async_readv(struct cifs_readdata *rdata)
flags |= CIFS_HAS_CREDITS;
}

- kref_get(&rdata->refcount);
rc = cifs_call_async(server, &rqst,
cifs_readv_receive, smb2_readv_callback,
smb3_handle_read_data, rdata, flags,
&rdata->credits);
if (rc) {
- kref_put(&rdata->refcount, cifs_readdata_release);
cifs_stats_fail_inc(io_parms.tcon, SMB2_READ_HE);
trace_smb3_read_err(0 /* xid */, io_parms.persistent_fid,
io_parms.tcon->tid,
diff --git a/fs/smb/client/smb2proto.h b/fs/smb/client/smb2proto.h
index 46eff9ec302a..02ffe5ec9b21 100644
--- a/fs/smb/client/smb2proto.h
+++ b/fs/smb/client/smb2proto.h
@@ -186,7 +186,7 @@ extern int SMB2_query_acl(const unsigned int xid, struct cifs_tcon *tcon,
extern int SMB2_get_srv_num(const unsigned int xid, struct cifs_tcon *tcon,
u64 persistent_fid, u64 volatile_fid,
__le64 *uniqueid);
-extern int smb2_async_readv(struct cifs_readdata *rdata);
+extern int smb2_async_readv(struct cifs_io_subrequest *rdata);
extern int SMB2_read(const unsigned int xid, struct cifs_io_parms *io_parms,
unsigned int *nbytes, char **buf, int *buf_type);
extern int smb2_async_writev(struct cifs_writedata *wdata,
diff --git a/fs/smb/client/transport.c b/fs/smb/client/transport.c
index 4f717ad7c21b..bae758ec621b 100644
--- a/fs/smb/client/transport.c
+++ b/fs/smb/client/transport.c
@@ -1677,7 +1677,7 @@ __cifs_readv_discard(struct TCP_Server_Info *server, struct mid_q_entry *mid,
static int
cifs_readv_discard(struct TCP_Server_Info *server, struct mid_q_entry *mid)
{
- struct cifs_readdata *rdata = mid->callback_data;
+ struct cifs_io_subrequest *rdata = mid->callback_data;

return __cifs_readv_discard(server, mid, rdata->result);
}
@@ -1687,7 +1687,7 @@ cifs_readv_receive(struct TCP_Server_Info *server, struct mid_q_entry *mid)
{
int length, len;
unsigned int data_offset, data_len;
- struct cifs_readdata *rdata = mid->callback_data;
+ struct cifs_io_subrequest *rdata = mid->callback_data;
char *buf = server->smallbuf;
unsigned int buflen = server->pdu_size + HEADER_PREAMBLE_SIZE(server);
bool use_rdma_mr = false;

2023-11-17 21:22:40

by David Howells

[permalink] [raw]
Subject: [PATCH v2 44/51] cifs: Use more fields from netfs_io_subrequest

Use more fields from netfs_io_subrequest instead of those incorporated into
cifs_io_subrequest from cifs_readdata and cifs_writedata.
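Illustrative sketch only (simplified stand-in structures, not patch code):
the duplicated fields give way to the members of the embedded subrequest,
so rdata->offset becomes rdata->subreq.start, rdata->bytes becomes
rdata->subreq.len (a size_t, hence the %u to %zu format changes) and
rdata->iter becomes rdata->subreq.io_iter.

#include <stdio.h>

struct netfs_io_subrequest {
	unsigned long long start;	/* was rdata->offset */
	size_t len;			/* was rdata->bytes */
	/* struct iov_iter io_iter;	   was rdata->iter */
};

struct cifs_io_subrequest {
	struct netfs_io_subrequest subreq;
	int result;
};

int main(void)
{
	struct cifs_io_subrequest rdata = {
		.subreq = { .start = 4096, .len = 65536 },
	};

	/* Previously: printf("offset=%llu bytes=%u", rdata.offset, rdata.bytes) */
	printf("offset=%llu bytes=%zu\n", rdata.subreq.start, rdata.subreq.len);
	return 0;
}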

Signed-off-by: David Howells <[email protected]>
cc: Steve French <[email protected]>
cc: Shyam Prasad N <[email protected]>
cc: Rohith Surabattula <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/smb/client/cifsglob.h | 3 -
fs/smb/client/cifssmb.c | 52 +++++++++---------
fs/smb/client/file.c | 112 +++++++++++++++++++-------------------
fs/smb/client/smb2ops.c | 4 +-
fs/smb/client/smb2pdu.c | 52 +++++++++---------
fs/smb/client/transport.c | 6 +-
6 files changed, 113 insertions(+), 116 deletions(-)

diff --git a/fs/smb/client/cifsglob.h b/fs/smb/client/cifsglob.h
index a41aeb2967bd..9c2bb0fc58e8 100644
--- a/fs/smb/client/cifsglob.h
+++ b/fs/smb/client/cifsglob.h
@@ -1448,9 +1448,6 @@ struct cifs_io_subrequest {
struct list_head list;
struct completion done;
struct work_struct work;
- struct iov_iter iter;
- __u64 offset;
- unsigned int bytes;
};

/*
diff --git a/fs/smb/client/cifssmb.c b/fs/smb/client/cifssmb.c
index 14fca3fa3e08..112a5a2d95b8 100644
--- a/fs/smb/client/cifssmb.c
+++ b/fs/smb/client/cifssmb.c
@@ -1267,12 +1267,12 @@ cifs_readv_callback(struct mid_q_entry *mid)
struct TCP_Server_Info *server = tcon->ses->server;
struct smb_rqst rqst = { .rq_iov = rdata->iov,
.rq_nvec = 2,
- .rq_iter = rdata->iter };
+ .rq_iter = rdata->subreq.io_iter };
struct cifs_credits credits = { .value = 1, .instance = 0 };

- cifs_dbg(FYI, "%s: mid=%llu state=%d result=%d bytes=%u\n",
+ cifs_dbg(FYI, "%s: mid=%llu state=%d result=%d bytes=%zu\n",
__func__, mid->mid, mid->mid_state, rdata->result,
- rdata->bytes);
+ rdata->subreq.len);

switch (mid->mid_state) {
case MID_RESPONSE_RECEIVED:
@@ -1320,14 +1320,14 @@ cifs_async_readv(struct cifs_io_subrequest *rdata)
struct smb_rqst rqst = { .rq_iov = rdata->iov,
.rq_nvec = 2 };

- cifs_dbg(FYI, "%s: offset=%llu bytes=%u\n",
- __func__, rdata->offset, rdata->bytes);
+ cifs_dbg(FYI, "%s: offset=%llu bytes=%zu\n",
+ __func__, rdata->subreq.start, rdata->subreq.len);

if (tcon->ses->capabilities & CAP_LARGE_FILES)
wct = 12;
else {
wct = 10; /* old style read */
- if ((rdata->offset >> 32) > 0) {
+ if ((rdata->subreq.start >> 32) > 0) {
/* can not handle this big offset for old */
return -EIO;
}
@@ -1342,12 +1342,12 @@ cifs_async_readv(struct cifs_io_subrequest *rdata)

smb->AndXCommand = 0xFF; /* none */
smb->Fid = rdata->cfile->fid.netfid;
- smb->OffsetLow = cpu_to_le32(rdata->offset & 0xFFFFFFFF);
+ smb->OffsetLow = cpu_to_le32(rdata->subreq.start & 0xFFFFFFFF);
if (wct == 12)
- smb->OffsetHigh = cpu_to_le32(rdata->offset >> 32);
+ smb->OffsetHigh = cpu_to_le32(rdata->subreq.start >> 32);
smb->Remaining = 0;
- smb->MaxCount = cpu_to_le16(rdata->bytes & 0xFFFF);
- smb->MaxCountHigh = cpu_to_le32(rdata->bytes >> 16);
+ smb->MaxCount = cpu_to_le16(rdata->subreq.len & 0xFFFF);
+ smb->MaxCountHigh = cpu_to_le32(rdata->subreq.len >> 16);
if (wct == 12)
smb->ByteCount = 0;
else {
@@ -1631,13 +1631,13 @@ cifs_writev_callback(struct mid_q_entry *mid)
* client. OS/2 servers are known to set incorrect
* CountHigh values.
*/
- if (written > wdata->bytes)
+ if (written > wdata->subreq.len)
written &= 0xFFFF;

- if (written < wdata->bytes)
+ if (written < wdata->subreq.len)
wdata->result = -ENOSPC;
else
- wdata->bytes = written;
+ wdata->subreq.len = written;
break;
case MID_REQUEST_SUBMITTED:
case MID_RETRY_NEEDED:
@@ -1668,7 +1668,7 @@ cifs_async_writev(struct cifs_io_subrequest *wdata)
wct = 14;
} else {
wct = 12;
- if (wdata->offset >> 32 > 0) {
+ if (wdata->subreq.start >> 32 > 0) {
/* can not handle big offset for old srv */
return -EIO;
}
@@ -1683,9 +1683,9 @@ cifs_async_writev(struct cifs_io_subrequest *wdata)

smb->AndXCommand = 0xFF; /* none */
smb->Fid = wdata->cfile->fid.netfid;
- smb->OffsetLow = cpu_to_le32(wdata->offset & 0xFFFFFFFF);
+ smb->OffsetLow = cpu_to_le32(wdata->subreq.start & 0xFFFFFFFF);
if (wct == 14)
- smb->OffsetHigh = cpu_to_le32(wdata->offset >> 32);
+ smb->OffsetHigh = cpu_to_le32(wdata->subreq.start >> 32);
smb->Reserved = 0xFFFFFFFF;
smb->WriteMode = 0;
smb->Remaining = 0;
@@ -1701,24 +1701,24 @@ cifs_async_writev(struct cifs_io_subrequest *wdata)

rqst.rq_iov = iov;
rqst.rq_nvec = 2;
- rqst.rq_iter = wdata->iter;
- rqst.rq_iter_size = iov_iter_count(&wdata->iter);
+ rqst.rq_iter = wdata->subreq.io_iter;
+ rqst.rq_iter_size = iov_iter_count(&wdata->subreq.io_iter);

- cifs_dbg(FYI, "async write at %llu %u bytes\n",
- wdata->offset, wdata->bytes);
+ cifs_dbg(FYI, "async write at %llu %zu bytes\n",
+ wdata->subreq.start, wdata->subreq.len);

- smb->DataLengthLow = cpu_to_le16(wdata->bytes & 0xFFFF);
- smb->DataLengthHigh = cpu_to_le16(wdata->bytes >> 16);
+ smb->DataLengthLow = cpu_to_le16(wdata->subreq.len & 0xFFFF);
+ smb->DataLengthHigh = cpu_to_le16(wdata->subreq.len >> 16);

if (wct == 14) {
- inc_rfc1001_len(&smb->hdr, wdata->bytes + 1);
- put_bcc(wdata->bytes + 1, &smb->hdr);
+ inc_rfc1001_len(&smb->hdr, wdata->subreq.len + 1);
+ put_bcc(wdata->subreq.len + 1, &smb->hdr);
} else {
/* wct == 12 */
struct smb_com_writex_req *smbw =
(struct smb_com_writex_req *)smb;
- inc_rfc1001_len(&smbw->hdr, wdata->bytes + 5);
- put_bcc(wdata->bytes + 5, &smbw->hdr);
+ inc_rfc1001_len(&smbw->hdr, wdata->subreq.len + 5);
+ put_bcc(wdata->subreq.len + 5, &smbw->hdr);
iov[1].iov_len += 4; /* pad bigger by four bytes */
}

diff --git a/fs/smb/client/file.c b/fs/smb/client/file.c
index 6b28fea8a980..497429eec942 100644
--- a/fs/smb/client/file.c
+++ b/fs/smb/client/file.c
@@ -2438,8 +2438,8 @@ cifs_writev_requeue(struct cifs_io_subrequest *wdata)
int rc = 0;
struct inode *inode = d_inode(wdata->cfile->dentry);
struct TCP_Server_Info *server;
- unsigned int rest_len = wdata->bytes;
- loff_t fpos = wdata->offset;
+ unsigned int rest_len = wdata->subreq.len;
+ loff_t fpos = wdata->subreq.start;

server = tlink_tcon(wdata->cfile->tlink)->ses->server;
do {
@@ -2464,14 +2464,14 @@ cifs_writev_requeue(struct cifs_io_subrequest *wdata)
}

wdata2->sync_mode = wdata->sync_mode;
- wdata2->offset = fpos;
- wdata2->bytes = cur_len;
- wdata2->iter = wdata->iter;
+ wdata2->subreq.start = fpos;
+ wdata2->subreq.len = cur_len;
+ wdata2->subreq.io_iter = wdata->subreq.io_iter;

- iov_iter_advance(&wdata2->iter, fpos - wdata->offset);
- iov_iter_truncate(&wdata2->iter, wdata2->bytes);
+ iov_iter_advance(&wdata2->subreq.io_iter, fpos - wdata->subreq.start);
+ iov_iter_truncate(&wdata2->subreq.io_iter, wdata2->subreq.len);

- if (iov_iter_is_xarray(&wdata2->iter))
+ if (iov_iter_is_xarray(&wdata2->subreq.io_iter))
/* Check for pages having been redirtied and clean
* them. We can do this by walking the xarray. If
* it's not an xarray, then it's a DIO and we shouldn't
@@ -2505,7 +2505,7 @@ cifs_writev_requeue(struct cifs_io_subrequest *wdata)
} while (rest_len > 0);

/* Clean up remaining pages from the original wdata */
- if (iov_iter_is_xarray(&wdata->iter))
+ if (iov_iter_is_xarray(&wdata->subreq.io_iter))
cifs_pages_write_failed(inode, fpos, rest_len);

if (rc != 0 && !is_retryable_error(rc))
@@ -2522,19 +2522,19 @@ cifs_writev_complete(struct work_struct *work)

if (wdata->result == 0) {
spin_lock(&inode->i_lock);
- cifs_update_eof(CIFS_I(inode), wdata->offset, wdata->bytes);
+ cifs_update_eof(CIFS_I(inode), wdata->subreq.start, wdata->subreq.len);
spin_unlock(&inode->i_lock);
cifs_stats_bytes_written(tlink_tcon(wdata->cfile->tlink),
- wdata->bytes);
+ wdata->subreq.len);
} else if (wdata->sync_mode == WB_SYNC_ALL && wdata->result == -EAGAIN)
return cifs_writev_requeue(wdata);

if (wdata->result == -EAGAIN)
- cifs_pages_write_redirty(inode, wdata->offset, wdata->bytes);
+ cifs_pages_write_redirty(inode, wdata->subreq.start, wdata->subreq.len);
else if (wdata->result < 0)
- cifs_pages_write_failed(inode, wdata->offset, wdata->bytes);
+ cifs_pages_write_failed(inode, wdata->subreq.start, wdata->subreq.len);
else
- cifs_pages_written_back(inode, wdata->offset, wdata->bytes);
+ cifs_pages_written_back(inode, wdata->subreq.start, wdata->subreq.len);

if (wdata->result != -EAGAIN)
mapping_set_error(inode->i_mapping, wdata->result);
@@ -2766,7 +2766,7 @@ static ssize_t cifs_write_back_from_locked_folio(struct address_space *mapping,
}

wdata->sync_mode = wbc->sync_mode;
- wdata->offset = folio_pos(folio);
+ wdata->subreq.start = folio_pos(folio);
wdata->pid = cfile->pid;
wdata->credits = credits_on_stack;
wdata->cfile = cfile;
@@ -2801,7 +2801,7 @@ static ssize_t cifs_write_back_from_locked_folio(struct address_space *mapping,
len = min_t(loff_t, len, max_len);
}

- wdata->bytes = len;
+ wdata->subreq.len = len;

/* We now have a contiguous set of dirty pages, each with writeback
* set; the first page is still locked at this point, but all the rest
@@ -2810,10 +2810,10 @@ static ssize_t cifs_write_back_from_locked_folio(struct address_space *mapping,
folio_unlock(folio);

if (start < i_size) {
- iov_iter_xarray(&wdata->iter, ITER_SOURCE, &mapping->i_pages,
+ iov_iter_xarray(&wdata->subreq.io_iter, ITER_SOURCE, &mapping->i_pages,
start, len);

- rc = adjust_credits(wdata->server, &wdata->credits, wdata->bytes);
+ rc = adjust_credits(wdata->server, &wdata->credits, wdata->subreq.len);
if (rc)
goto err_wdata;

@@ -3232,7 +3232,7 @@ cifs_uncached_writev_complete(struct work_struct *work)
struct cifsInodeInfo *cifsi = CIFS_I(inode);

spin_lock(&inode->i_lock);
- cifs_update_eof(cifsi, wdata->offset, wdata->bytes);
+ cifs_update_eof(cifsi, wdata->subreq.start, wdata->subreq.len);
if (cifsi->netfs.remote_i_size > inode->i_size)
i_size_write(inode, cifsi->netfs.remote_i_size);
spin_unlock(&inode->i_lock);
@@ -3268,19 +3268,19 @@ cifs_resend_wdata(struct cifs_io_subrequest *wdata, struct list_head *wdata_list
* segments
*/
do {
- rc = server->ops->wait_mtu_credits(server, wdata->bytes,
+ rc = server->ops->wait_mtu_credits(server, wdata->subreq.len,
&wsize, &credits);
if (rc)
goto fail;

- if (wsize < wdata->bytes) {
+ if (wsize < wdata->subreq.len) {
add_credits_and_wake_if(server, &credits, 0);
msleep(1000);
}
- } while (wsize < wdata->bytes);
+ } while (wsize < wdata->subreq.len);
wdata->credits = credits;

- rc = adjust_credits(server, &wdata->credits, wdata->bytes);
+ rc = adjust_credits(server, &wdata->credits, wdata->subreq.len);

if (!rc) {
if (wdata->cfile->invalidHandle)
@@ -3426,19 +3426,19 @@ cifs_write_from_iter(loff_t fpos, size_t len, struct iov_iter *from,

wdata->uncached = true;
wdata->sync_mode = WB_SYNC_ALL;
- wdata->offset = (__u64)fpos;
+ wdata->subreq.start = (__u64)fpos;
wdata->cfile = cifsFileInfo_get(open_file);
wdata->server = server;
wdata->pid = pid;
- wdata->bytes = cur_len;
+ wdata->subreq.len = cur_len;
wdata->credits = credits_on_stack;
- wdata->iter = *from;
+ wdata->subreq.io_iter = *from;
wdata->ctx = ctx;
kref_get(&ctx->refcount);

- iov_iter_truncate(&wdata->iter, cur_len);
+ iov_iter_truncate(&wdata->subreq.io_iter, cur_len);

- rc = adjust_credits(server, &wdata->credits, wdata->bytes);
+ rc = adjust_credits(server, &wdata->credits, wdata->subreq.len);

if (!rc) {
if (wdata->cfile->invalidHandle)
@@ -3500,7 +3500,7 @@ static void collect_uncached_write_data(struct cifs_aio_ctx *ctx)
if (wdata->result)
rc = wdata->result;
else
- ctx->total_len += wdata->bytes;
+ ctx->total_len += wdata->subreq.len;

/* resend call if it's a retryable error */
if (rc == -EAGAIN) {
@@ -3515,10 +3515,10 @@ static void collect_uncached_write_data(struct cifs_aio_ctx *ctx)
wdata, &tmp_list, ctx);
else {
iov_iter_advance(&tmp_from,
- wdata->offset - ctx->pos);
+ wdata->subreq.start - ctx->pos);

- rc = cifs_write_from_iter(wdata->offset,
- wdata->bytes, &tmp_from,
+ rc = cifs_write_from_iter(wdata->subreq.start,
+ wdata->subreq.len, &tmp_from,
ctx->cfile, cifs_sb, &tmp_list,
ctx);

@@ -3841,20 +3841,20 @@ static int cifs_resend_rdata(struct cifs_io_subrequest *rdata,
* segments
*/
do {
- rc = server->ops->wait_mtu_credits(server, rdata->bytes,
+ rc = server->ops->wait_mtu_credits(server, rdata->subreq.len,
&rsize, &credits);

if (rc)
goto fail;

- if (rsize < rdata->bytes) {
+ if (rsize < rdata->subreq.len) {
add_credits_and_wake_if(server, &credits, 0);
msleep(1000);
}
- } while (rsize < rdata->bytes);
+ } while (rsize < rdata->subreq.len);
rdata->credits = credits;

- rc = adjust_credits(server, &rdata->credits, rdata->bytes);
+ rc = adjust_credits(server, &rdata->credits, rdata->subreq.len);
if (!rc) {
if (rdata->cfile->invalidHandle)
rc = -EAGAIN;
@@ -3952,17 +3952,17 @@ cifs_send_async_read(loff_t fpos, size_t len, struct cifsFileInfo *open_file,

rdata->server = server;
rdata->cfile = cifsFileInfo_get(open_file);
- rdata->offset = fpos;
- rdata->bytes = cur_len;
+ rdata->subreq.start = fpos;
+ rdata->subreq.len = cur_len;
rdata->pid = pid;
rdata->credits = credits_on_stack;
rdata->ctx = ctx;
kref_get(&ctx->refcount);

- rdata->iter = ctx->iter;
- iov_iter_truncate(&rdata->iter, cur_len);
+ rdata->subreq.io_iter = ctx->iter;
+ iov_iter_truncate(&rdata->subreq.io_iter, cur_len);

- rc = adjust_credits(server, &rdata->credits, rdata->bytes);
+ rc = adjust_credits(server, &rdata->credits, rdata->subreq.len);

if (!rc) {
if (rdata->cfile->invalidHandle)
@@ -4032,8 +4032,8 @@ collect_uncached_read_data(struct cifs_aio_ctx *ctx)
&tmp_list, ctx);
} else {
rc = cifs_send_async_read(
- rdata->offset + got_bytes,
- rdata->bytes - got_bytes,
+ rdata->subreq.start + got_bytes,
+ rdata->subreq.len - got_bytes,
rdata->cfile, cifs_sb,
&tmp_list, ctx);

@@ -4047,7 +4047,7 @@ collect_uncached_read_data(struct cifs_aio_ctx *ctx)
rc = rdata->result;

/* if there was a short read -- discard anything left */
- if (rdata->got_bytes && rdata->got_bytes < rdata->bytes)
+ if (rdata->got_bytes && rdata->got_bytes < rdata->subreq.len)
rc = -ENODATA;

ctx->total_len += rdata->got_bytes;
@@ -4431,16 +4431,16 @@ static void cifs_readahead_complete(struct work_struct *work)
pgoff_t last;
bool good = rdata->result == 0 || (rdata->result == -EAGAIN && rdata->got_bytes);

- XA_STATE(xas, &rdata->mapping->i_pages, rdata->offset / PAGE_SIZE);
+ XA_STATE(xas, &rdata->mapping->i_pages, rdata->subreq.start / PAGE_SIZE);

if (good)
cifs_readahead_to_fscache(rdata->mapping->host,
- rdata->offset, rdata->bytes);
+ rdata->subreq.start, rdata->subreq.len);

- if (iov_iter_count(&rdata->iter) > 0)
- iov_iter_zero(iov_iter_count(&rdata->iter), &rdata->iter);
+ if (iov_iter_count(&rdata->subreq.io_iter) > 0)
+ iov_iter_zero(iov_iter_count(&rdata->subreq.io_iter), &rdata->subreq.io_iter);

- last = (rdata->offset + rdata->bytes - 1) / PAGE_SIZE;
+ last = (rdata->subreq.start + rdata->subreq.len - 1) / PAGE_SIZE;

rcu_read_lock();
xas_for_each(&xas, folio, last) {
@@ -4579,8 +4579,8 @@ static void cifs_readahead(struct readahead_control *ractl)
break;
}

- rdata->offset = ra_index * PAGE_SIZE;
- rdata->bytes = nr_pages * PAGE_SIZE;
+ rdata->subreq.start = ra_index * PAGE_SIZE;
+ rdata->subreq.len = nr_pages * PAGE_SIZE;
rdata->cfile = cifsFileInfo_get(open_file);
rdata->server = server;
rdata->mapping = ractl->mapping;
@@ -4594,10 +4594,10 @@ static void cifs_readahead(struct readahead_control *ractl)
ra_pages -= nr_pages;
ra_index += nr_pages;

- iov_iter_xarray(&rdata->iter, ITER_DEST, &rdata->mapping->i_pages,
- rdata->offset, rdata->bytes);
+ iov_iter_xarray(&rdata->subreq.io_iter, ITER_DEST, &rdata->mapping->i_pages,
+ rdata->subreq.start, rdata->subreq.len);

- rc = adjust_credits(server, &rdata->credits, rdata->bytes);
+ rc = adjust_credits(server, &rdata->credits, rdata->subreq.len);
if (!rc) {
if (rdata->cfile->invalidHandle)
rc = -EAGAIN;
@@ -4608,8 +4608,8 @@ static void cifs_readahead(struct readahead_control *ractl)
if (rc) {
add_credits_and_wake_if(server, &rdata->credits, 0);
cifs_unlock_folios(rdata->mapping,
- rdata->offset / PAGE_SIZE,
- (rdata->offset + rdata->bytes - 1) / PAGE_SIZE);
+ rdata->subreq.start / PAGE_SIZE,
+ (rdata->subreq.start + rdata->subreq.len - 1) / PAGE_SIZE);
/* Fallback to the readpage in error/reconnect cases */
cifs_put_readdata(rdata);
break;
diff --git a/fs/smb/client/smb2ops.c b/fs/smb/client/smb2ops.c
index bd978544d857..5d8566ba4d20 100644
--- a/fs/smb/client/smb2ops.c
+++ b/fs/smb/client/smb2ops.c
@@ -4694,7 +4694,7 @@ handle_read_data(struct TCP_Server_Info *server, struct mid_q_entry *mid,

/* Copy the data to the output I/O iterator. */
rdata->result = cifs_copy_pages_to_iter(pages, pages_len,
- cur_off, &rdata->iter);
+ cur_off, &rdata->subreq.io_iter);
if (rdata->result != 0) {
if (is_offloaded)
mid->mid_state = MID_RESPONSE_MALFORMED;
@@ -4708,7 +4708,7 @@ handle_read_data(struct TCP_Server_Info *server, struct mid_q_entry *mid,
/* read response payload is in buf */
WARN_ONCE(pages && !xa_empty(pages),
"read data can be either in buf or in pages");
- length = copy_to_iter(buf + data_offset, data_len, &rdata->iter);
+ length = copy_to_iter(buf + data_offset, data_len, &rdata->subreq.io_iter);
if (length < 0)
return length;
rdata->got_bytes = data_len;
diff --git a/fs/smb/client/smb2pdu.c b/fs/smb/client/smb2pdu.c
index 85a85821390f..a0b13c27c1c0 100644
--- a/fs/smb/client/smb2pdu.c
+++ b/fs/smb/client/smb2pdu.c
@@ -4218,7 +4218,7 @@ smb2_new_read_req(void **buf, unsigned int *total_len,
struct smbd_buffer_descriptor_v1 *v1;
bool need_invalidate = server->dialect == SMB30_PROT_ID;

- rdata->mr = smbd_register_mr(server->smbd_conn, &rdata->iter,
+ rdata->mr = smbd_register_mr(server->smbd_conn, &rdata->subreq.io_iter,
true, need_invalidate);
if (!rdata->mr)
return -EAGAIN;
@@ -4279,17 +4279,17 @@ smb2_readv_callback(struct mid_q_entry *mid)
.rq_nvec = 1 };

if (rdata->got_bytes) {
- rqst.rq_iter = rdata->iter;
- rqst.rq_iter_size = iov_iter_count(&rdata->iter);
+ rqst.rq_iter = rdata->subreq.io_iter;
+ rqst.rq_iter_size = iov_iter_count(&rdata->subreq.io_iter);
}

WARN_ONCE(rdata->server != mid->server,
"rdata server %p != mid server %p",
rdata->server, mid->server);

- cifs_dbg(FYI, "%s: mid=%llu state=%d result=%d bytes=%u\n",
+ cifs_dbg(FYI, "%s: mid=%llu state=%d result=%d bytes=%zu\n",
__func__, mid->mid, mid->mid_state, rdata->result,
- rdata->bytes);
+ rdata->subreq.len);

switch (mid->mid_state) {
case MID_RESPONSE_RECEIVED:
@@ -4342,13 +4342,13 @@ smb2_readv_callback(struct mid_q_entry *mid)
cifs_stats_fail_inc(tcon, SMB2_READ_HE);
trace_smb3_read_err(0 /* xid */,
rdata->cfile->fid.persistent_fid,
- tcon->tid, tcon->ses->Suid, rdata->offset,
- rdata->bytes, rdata->result);
+ tcon->tid, tcon->ses->Suid, rdata->subreq.start,
+ rdata->subreq.len, rdata->result);
} else
trace_smb3_read_done(0 /* xid */,
rdata->cfile->fid.persistent_fid,
tcon->tid, tcon->ses->Suid,
- rdata->offset, rdata->got_bytes);
+ rdata->subreq.start, rdata->got_bytes);

queue_work(cifsiod_wq, &rdata->work);
release_mid(mid);
@@ -4370,16 +4370,16 @@ smb2_async_readv(struct cifs_io_subrequest *rdata)
unsigned int total_len;
int credit_request;

- cifs_dbg(FYI, "%s: offset=%llu bytes=%u\n",
- __func__, rdata->offset, rdata->bytes);
+ cifs_dbg(FYI, "%s: offset=%llu bytes=%zu\n",
+ __func__, rdata->subreq.start, rdata->subreq.len);

if (!rdata->server)
rdata->server = cifs_pick_channel(tcon->ses);

io_parms.tcon = tlink_tcon(rdata->cfile->tlink);
io_parms.server = server = rdata->server;
- io_parms.offset = rdata->offset;
- io_parms.length = rdata->bytes;
+ io_parms.offset = rdata->subreq.start;
+ io_parms.length = rdata->subreq.len;
io_parms.persistent_fid = rdata->cfile->fid.persistent_fid;
io_parms.volatile_fid = rdata->cfile->fid.volatile_fid;
io_parms.pid = rdata->pid;
@@ -4398,7 +4398,7 @@ smb2_async_readv(struct cifs_io_subrequest *rdata)
shdr = (struct smb2_hdr *)buf;

if (rdata->credits.value > 0) {
- shdr->CreditCharge = cpu_to_le16(DIV_ROUND_UP(rdata->bytes,
+ shdr->CreditCharge = cpu_to_le16(DIV_ROUND_UP(rdata->subreq.len,
SMB2_MAX_BUFFER_SIZE));
credit_request = le16_to_cpu(shdr->CreditCharge) + 8;
if (server->credits >= server->max_credits)
@@ -4408,7 +4408,7 @@ smb2_async_readv(struct cifs_io_subrequest *rdata)
min_t(int, server->max_credits -
server->credits, credit_request));

- rc = adjust_credits(server, &rdata->credits, rdata->bytes);
+ rc = adjust_credits(server, &rdata->credits, rdata->subreq.len);
if (rc)
goto async_readv_out;

@@ -4546,13 +4546,13 @@ smb2_writev_callback(struct mid_q_entry *mid)
* client. OS/2 servers are known to set incorrect
* CountHigh values.
*/
- if (written > wdata->bytes)
+ if (written > wdata->subreq.len)
written &= 0xFFFF;

- if (written < wdata->bytes)
+ if (written < wdata->subreq.len)
wdata->result = -ENOSPC;
else
- wdata->bytes = written;
+ wdata->subreq.len = written;
break;
case MID_REQUEST_SUBMITTED:
case MID_RETRY_NEEDED:
@@ -4583,8 +4583,8 @@ smb2_writev_callback(struct mid_q_entry *mid)
cifs_stats_fail_inc(tcon, SMB2_WRITE_HE);
trace_smb3_write_err(0 /* no xid */,
wdata->cfile->fid.persistent_fid,
- tcon->tid, tcon->ses->Suid, wdata->offset,
- wdata->bytes, wdata->result);
+ tcon->tid, tcon->ses->Suid, wdata->subreq.start,
+ wdata->subreq.len, wdata->result);
if (wdata->result == -ENOSPC)
pr_warn_once("Out of space writing to %s\n",
tcon->tree_name);
@@ -4592,7 +4592,7 @@ smb2_writev_callback(struct mid_q_entry *mid)
trace_smb3_write_done(0 /* no xid */,
wdata->cfile->fid.persistent_fid,
tcon->tid, tcon->ses->Suid,
- wdata->offset, wdata->bytes);
+ wdata->subreq.start, wdata->subreq.len);

queue_work(cifsiod_wq, &wdata->work);
release_mid(mid);
@@ -4625,8 +4625,8 @@ smb2_async_writev(struct cifs_io_subrequest *wdata)
_io_parms = (struct cifs_io_parms) {
.tcon = tcon,
.server = server,
- .offset = wdata->offset,
- .length = wdata->bytes,
+ .offset = wdata->subreq.start,
+ .length = wdata->subreq.len,
.persistent_fid = wdata->cfile->fid.persistent_fid,
.volatile_fid = wdata->cfile->fid.volatile_fid,
.pid = wdata->pid,
@@ -4668,10 +4668,10 @@ smb2_async_writev(struct cifs_io_subrequest *wdata)
*/
if (smb3_use_rdma_offload(io_parms)) {
struct smbd_buffer_descriptor_v1 *v1;
- size_t data_size = iov_iter_count(&wdata->iter);
+ size_t data_size = iov_iter_count(&wdata->subreq.io_iter);
bool need_invalidate = server->dialect == SMB30_PROT_ID;

- wdata->mr = smbd_register_mr(server->smbd_conn, &wdata->iter,
+ wdata->mr = smbd_register_mr(server->smbd_conn, &wdata->subreq.io_iter,
false, need_invalidate);
if (!wdata->mr) {
rc = -EAGAIN;
@@ -4698,7 +4698,7 @@ smb2_async_writev(struct cifs_io_subrequest *wdata)

rqst.rq_iov = iov;
rqst.rq_nvec = 1;
- rqst.rq_iter = wdata->iter;
+ rqst.rq_iter = wdata->subreq.io_iter;
rqst.rq_iter_size = iov_iter_count(&rqst.rq_iter);
#ifdef CONFIG_CIFS_SMB_DIRECT
if (wdata->mr)
@@ -4716,7 +4716,7 @@ smb2_async_writev(struct cifs_io_subrequest *wdata)
#endif

if (wdata->credits.value > 0) {
- shdr->CreditCharge = cpu_to_le16(DIV_ROUND_UP(wdata->bytes,
+ shdr->CreditCharge = cpu_to_le16(DIV_ROUND_UP(wdata->subreq.len,
SMB2_MAX_BUFFER_SIZE));
credit_request = le16_to_cpu(shdr->CreditCharge) + 8;
if (server->credits >= server->max_credits)
diff --git a/fs/smb/client/transport.c b/fs/smb/client/transport.c
index bae758ec621b..3048516573e8 100644
--- a/fs/smb/client/transport.c
+++ b/fs/smb/client/transport.c
@@ -1692,8 +1692,8 @@ cifs_readv_receive(struct TCP_Server_Info *server, struct mid_q_entry *mid)
unsigned int buflen = server->pdu_size + HEADER_PREAMBLE_SIZE(server);
bool use_rdma_mr = false;

- cifs_dbg(FYI, "%s: mid=%llu offset=%llu bytes=%u\n",
- __func__, mid->mid, rdata->offset, rdata->bytes);
+ cifs_dbg(FYI, "%s: mid=%llu offset=%llu bytes=%zu\n",
+ __func__, mid->mid, rdata->subreq.start, rdata->subreq.len);

/*
* read the rest of READ_RSP header (sans Data array), or whatever we
@@ -1798,7 +1798,7 @@ cifs_readv_receive(struct TCP_Server_Info *server, struct mid_q_entry *mid)
length = data_len; /* An RDMA read is already done. */
else
#endif
- length = cifs_read_iter_from_socket(server, &rdata->iter,
+ length = cifs_read_iter_from_socket(server, &rdata->subreq.io_iter,
data_len);
if (length > 0)
rdata->got_bytes += length;

2023-11-17 21:22:43

by David Howells

[permalink] [raw]
Subject: [PATCH v2 49/51] cifs: Remove some code that's no longer used, part 1

Remove some code that was #if'd out with the netfslib conversion. The
changes to file.c are split into parts, as the diff generator otherwise
produces a hard-to-read diff for the part where a big chunk is cut out.

Signed-off-by: David Howells <[email protected]>
cc: Steve French <[email protected]>
cc: Shyam Prasad N <[email protected]>
cc: Rohith Surabattula <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/smb/client/cifsglob.h | 12 -
fs/smb/client/cifsproto.h | 21 --
fs/smb/client/file.c | 640 --------------------------------------
fs/smb/client/fscache.c | 111 -------
fs/smb/client/fscache.h | 58 ----
5 files changed, 842 deletions(-)

diff --git a/fs/smb/client/cifsglob.h b/fs/smb/client/cifsglob.h
index 7e1a64ea9297..d0e9a64862de 100644
--- a/fs/smb/client/cifsglob.h
+++ b/fs/smb/client/cifsglob.h
@@ -1447,18 +1447,6 @@ struct cifs_io_subrequest {
struct smbd_mr *mr;
#endif
struct cifs_credits credits;
-
-#if 0 // TODO: Remove following elements
- struct list_head list;
- struct completion done;
- struct work_struct work;
- struct cifsFileInfo *cfile;
- struct address_space *mapping;
- struct cifs_aio_ctx *ctx;
- enum writeback_sync_modes sync_mode;
- bool uncached;
- struct bio_vec *bv;
-#endif
};

/*
diff --git a/fs/smb/client/cifsproto.h b/fs/smb/client/cifsproto.h
index b0acb72ebeaa..f86a6ce605dc 100644
--- a/fs/smb/client/cifsproto.h
+++ b/fs/smb/client/cifsproto.h
@@ -581,32 +581,11 @@ void __cifs_put_smb_ses(struct cifs_ses *ses);
extern struct cifs_ses *
cifs_get_smb_ses(struct TCP_Server_Info *server, struct smb3_fs_context *ctx);

-#if 0 // TODO Remove
-void cifs_readdata_release(struct cifs_io_subrequest *rdata);
-static inline void cifs_put_readdata(struct cifs_io_subrequest *rdata)
-{
- if (refcount_dec_and_test(&rdata->subreq.ref))
- cifs_readdata_release(rdata);
-}
-#endif
int cifs_async_readv(struct cifs_io_subrequest *rdata);
int cifs_readv_receive(struct TCP_Server_Info *server, struct mid_q_entry *mid);

int cifs_async_writev(struct cifs_io_subrequest *wdata);
void cifs_writev_complete(struct work_struct *work);
-#if 0 // TODO Remove
-struct cifs_io_subrequest *cifs_writedata_alloc(work_func_t complete);
-void cifs_writedata_release(struct cifs_io_subrequest *rdata);
-static inline void cifs_get_writedata(struct cifs_io_subrequest *wdata)
-{
- refcount_inc(&wdata->subreq.ref);
-}
-static inline void cifs_put_writedata(struct cifs_io_subrequest *wdata)
-{
- if (refcount_dec_and_test(&wdata->subreq.ref))
- cifs_writedata_release(wdata);
-}
-#endif
int cifs_query_mf_symlink(unsigned int xid, struct cifs_tcon *tcon,
struct cifs_sb_info *cifs_sb,
const unsigned char *path, char *pbuf,
diff --git a/fs/smb/client/file.c b/fs/smb/client/file.c
index c4084b28e49f..ce14ecb1c1c6 100644
--- a/fs/smb/client/file.c
+++ b/fs/smb/client/file.c
@@ -411,133 +411,6 @@ const struct netfs_request_ops cifs_req_ops = {
.create_write_requests = cifs_create_write_requests,
};

-#if 0 // TODO remove 397
-/*
- * Remove the dirty flags from a span of pages.
- */
-static void cifs_undirty_folios(struct inode *inode, loff_t start, unsigned int len)
-{
- struct address_space *mapping = inode->i_mapping;
- struct folio *folio;
- pgoff_t end;
-
- XA_STATE(xas, &mapping->i_pages, start / PAGE_SIZE);
-
- rcu_read_lock();
-
- end = (start + len - 1) / PAGE_SIZE;
- xas_for_each_marked(&xas, folio, end, PAGECACHE_TAG_DIRTY) {
- if (xas_retry(&xas, folio))
- continue;
- xas_pause(&xas);
- rcu_read_unlock();
- folio_lock(folio);
- folio_clear_dirty_for_io(folio);
- folio_unlock(folio);
- rcu_read_lock();
- }
-
- rcu_read_unlock();
-}
-
-/*
- * Completion of write to server.
- */
-void cifs_pages_written_back(struct inode *inode, loff_t start, unsigned int len)
-{
- struct address_space *mapping = inode->i_mapping;
- struct folio *folio;
- pgoff_t end;
-
- XA_STATE(xas, &mapping->i_pages, start / PAGE_SIZE);
-
- if (!len)
- return;
-
- rcu_read_lock();
-
- end = (start + len - 1) / PAGE_SIZE;
- xas_for_each(&xas, folio, end) {
- if (xas_retry(&xas, folio))
- continue;
- if (!folio_test_writeback(folio)) {
- WARN_ONCE(1, "bad %x @%llx page %lx %lx\n",
- len, start, folio_index(folio), end);
- continue;
- }
-
- folio_detach_private(folio);
- folio_end_writeback(folio);
- }
-
- rcu_read_unlock();
-}
-
-/*
- * Failure of write to server.
- */
-void cifs_pages_write_failed(struct inode *inode, loff_t start, unsigned int len)
-{
- struct address_space *mapping = inode->i_mapping;
- struct folio *folio;
- pgoff_t end;
-
- XA_STATE(xas, &mapping->i_pages, start / PAGE_SIZE);
-
- if (!len)
- return;
-
- rcu_read_lock();
-
- end = (start + len - 1) / PAGE_SIZE;
- xas_for_each(&xas, folio, end) {
- if (xas_retry(&xas, folio))
- continue;
- if (!folio_test_writeback(folio)) {
- WARN_ONCE(1, "bad %x @%llx page %lx %lx\n",
- len, start, folio_index(folio), end);
- continue;
- }
-
- folio_set_error(folio);
- folio_end_writeback(folio);
- }
-
- rcu_read_unlock();
-}
-
-/*
- * Redirty pages after a temporary failure.
- */
-void cifs_pages_write_redirty(struct inode *inode, loff_t start, unsigned int len)
-{
- struct address_space *mapping = inode->i_mapping;
- struct folio *folio;
- pgoff_t end;
-
- XA_STATE(xas, &mapping->i_pages, start / PAGE_SIZE);
-
- if (!len)
- return;
-
- rcu_read_lock();
-
- end = (start + len - 1) / PAGE_SIZE;
- xas_for_each(&xas, folio, end) {
- if (!folio_test_writeback(folio)) {
- WARN_ONCE(1, "bad %x @%llx page %lx %lx\n",
- len, start, folio_index(folio), end);
- continue;
- }
-
- filemap_dirty_folio(folio->mapping, folio);
- folio_end_writeback(folio);
- }
-
- rcu_read_unlock();
-}
-#endif // end netfslib remove 397
-
/*
* Mark as invalid, all open files on tree connections since they
* were closed when session to server was lost.
@@ -2498,92 +2371,6 @@ cifs_update_eof(struct cifsInodeInfo *cifsi, loff_t offset,
netfs_resize_file(&cifsi->netfs, end_of_write);
}

-#if 0 // TODO remove 2483
-static ssize_t
-cifs_write(struct cifsFileInfo *open_file, __u32 pid, const char *write_data,
- size_t write_size, loff_t *offset)
-{
- int rc = 0;
- unsigned int bytes_written = 0;
- unsigned int total_written;
- struct cifs_tcon *tcon;
- struct TCP_Server_Info *server;
- unsigned int xid;
- struct dentry *dentry = open_file->dentry;
- struct cifsInodeInfo *cifsi = CIFS_I(d_inode(dentry));
- struct cifs_io_parms io_parms = {0};
-
- cifs_dbg(FYI, "write %zd bytes to offset %lld of %pd\n",
- write_size, *offset, dentry);
-
- tcon = tlink_tcon(open_file->tlink);
- server = tcon->ses->server;
-
- if (!server->ops->sync_write)
- return -ENOSYS;
-
- xid = get_xid();
-
- for (total_written = 0; write_size > total_written;
- total_written += bytes_written) {
- rc = -EAGAIN;
- while (rc == -EAGAIN) {
- struct kvec iov[2];
- unsigned int len;
-
- if (open_file->invalidHandle) {
- /* we could deadlock if we called
- filemap_fdatawait from here so tell
- reopen_file not to flush data to
- server now */
- rc = cifs_reopen_file(open_file, false);
- if (rc != 0)
- break;
- }
-
- len = min(server->ops->wp_retry_size(d_inode(dentry)),
- (unsigned int)write_size - total_written);
- /* iov[0] is reserved for smb header */
- iov[1].iov_base = (char *)write_data + total_written;
- iov[1].iov_len = len;
- io_parms.pid = pid;
- io_parms.tcon = tcon;
- io_parms.offset = *offset;
- io_parms.length = len;
- rc = server->ops->sync_write(xid, &open_file->fid,
- &io_parms, &bytes_written, iov, 1);
- }
- if (rc || (bytes_written == 0)) {
- if (total_written)
- break;
- else {
- free_xid(xid);
- return rc;
- }
- } else {
- spin_lock(&d_inode(dentry)->i_lock);
- cifs_update_eof(cifsi, *offset, bytes_written);
- spin_unlock(&d_inode(dentry)->i_lock);
- *offset += bytes_written;
- }
- }
-
- cifs_stats_bytes_written(tcon, total_written);
-
- if (total_written > 0) {
- spin_lock(&d_inode(dentry)->i_lock);
- if (*offset > d_inode(dentry)->i_size) {
- i_size_write(d_inode(dentry), *offset);
- d_inode(dentry)->i_blocks = (512 - 1 + *offset) >> 9;
- }
- spin_unlock(&d_inode(dentry)->i_lock);
- }
- mark_inode_dirty_sync(d_inode(dentry));
- free_xid(xid);
- return total_written;
-}
-#endif // end netfslib remove 2483
-
struct cifsFileInfo *find_readable_file(struct cifsInodeInfo *cifs_inode,
bool fsuid_only)
{
@@ -4843,293 +4630,6 @@ int cifs_file_mmap(struct file *file, struct vm_area_struct *vma)
return rc;
}

-#if 0 // TODO remove 4794
-/*
- * Unlock a bunch of folios in the pagecache.
- */
-static void cifs_unlock_folios(struct address_space *mapping, pgoff_t first, pgoff_t last)
-{
- struct folio *folio;
- XA_STATE(xas, &mapping->i_pages, first);
-
- rcu_read_lock();
- xas_for_each(&xas, folio, last) {
- folio_unlock(folio);
- }
- rcu_read_unlock();
-}
-
-static void cifs_readahead_complete(struct work_struct *work)
-{
- struct cifs_io_subrequest *rdata = container_of(work,
- struct cifs_io_subrequest, work);
- struct folio *folio;
- pgoff_t last;
- bool good = rdata->result == 0 || (rdata->result == -EAGAIN && rdata->got_bytes);
-
- XA_STATE(xas, &rdata->mapping->i_pages, rdata->subreq.start / PAGE_SIZE);
-
- if (good)
- cifs_readahead_to_fscache(rdata->mapping->host,
- rdata->subreq.start, rdata->subreq.len);
-
- if (iov_iter_count(&rdata->subreq.io_iter) > 0)
- iov_iter_zero(iov_iter_count(&rdata->subreq.io_iter), &rdata->subreq.io_iter);
-
- last = (rdata->subreq.start + rdata->subreq.len - 1) / PAGE_SIZE;
-
- rcu_read_lock();
- xas_for_each(&xas, folio, last) {
- if (good) {
- flush_dcache_folio(folio);
- folio_mark_uptodate(folio);
- }
- folio_unlock(folio);
- }
- rcu_read_unlock();
-
- cifs_put_readdata(rdata);
-}
-
-static void cifs_readahead(struct readahead_control *ractl)
-{
- struct cifsFileInfo *open_file = ractl->file->private_data;
- struct cifs_sb_info *cifs_sb = CIFS_FILE_SB(ractl->file);
- struct TCP_Server_Info *server;
- unsigned int xid, nr_pages, cache_nr_pages = 0;
- unsigned int ra_pages;
- pgoff_t next_cached = ULONG_MAX, ra_index;
- bool caching = fscache_cookie_enabled(cifs_inode_cookie(ractl->mapping->host)) &&
- cifs_inode_cookie(ractl->mapping->host)->cache_priv;
- bool check_cache = caching;
- pid_t pid;
- int rc = 0;
-
- /* Note that readahead_count() lags behind our dequeuing of pages from
- * the ractl, so we have to keep track for ourselves.
- */
- ra_pages = readahead_count(ractl);
- ra_index = readahead_index(ractl);
-
- xid = get_xid();
-
- if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_RWPIDFORWARD)
- pid = open_file->pid;
- else
- pid = current->tgid;
-
- server = cifs_pick_channel(tlink_tcon(open_file->tlink)->ses);
-
- cifs_dbg(FYI, "%s: file=%p mapping=%p num_pages=%u\n",
- __func__, ractl->file, ractl->mapping, ra_pages);
-
- /*
- * Chop the readahead request up into rsize-sized read requests.
- */
- while ((nr_pages = ra_pages)) {
- unsigned int i;
- struct cifs_io_subrequest *rdata;
- struct cifs_credits credits_on_stack;
- struct cifs_credits *credits = &credits_on_stack;
- struct folio *folio;
- pgoff_t fsize;
- size_t rsize;
-
- /*
- * Find out if we have anything cached in the range of
- * interest, and if so, where the next chunk of cached data is.
- */
- if (caching) {
- if (check_cache) {
- rc = cifs_fscache_query_occupancy(
- ractl->mapping->host, ra_index, nr_pages,
- &next_cached, &cache_nr_pages);
- if (rc < 0)
- caching = false;
- check_cache = false;
- }
-
- if (ra_index == next_cached) {
- /*
- * TODO: Send a whole batch of pages to be read
- * by the cache.
- */
- folio = readahead_folio(ractl);
- fsize = folio_nr_pages(folio);
- ra_pages -= fsize;
- ra_index += fsize;
- if (cifs_readpage_from_fscache(ractl->mapping->host,
- &folio->page) < 0) {
- /*
- * TODO: Deal with cache read failure
- * here, but for the moment, delegate
- * that to readpage.
- */
- caching = false;
- }
- folio_unlock(folio);
- next_cached += fsize;
- cache_nr_pages -= fsize;
- if (cache_nr_pages == 0)
- check_cache = true;
- continue;
- }
- }
-
- if (open_file->invalidHandle) {
- rc = cifs_reopen_file(open_file, true);
- if (rc) {
- if (rc == -EAGAIN)
- continue;
- break;
- }
- }
-
- if (cifs_sb->ctx->rsize == 0)
- cifs_sb->ctx->rsize =
- server->ops->negotiate_rsize(tlink_tcon(open_file->tlink),
- cifs_sb->ctx);
-
- rc = server->ops->wait_mtu_credits(server, cifs_sb->ctx->rsize,
- &rsize, credits);
- if (rc)
- break;
- nr_pages = min_t(size_t, rsize / PAGE_SIZE, ra_pages);
- if (next_cached != ULONG_MAX)
- nr_pages = min_t(size_t, nr_pages, next_cached - ra_index);
-
- /*
- * Give up immediately if rsize is too small to read an entire
- * page. The VFS will fall back to readpage. We should never
- * reach this point however since we set ra_pages to 0 when the
- * rsize is smaller than a cache page.
- */
- if (unlikely(!nr_pages)) {
- add_credits_and_wake_if(server, credits, 0);
- break;
- }
-
- rdata = cifs_readdata_alloc(cifs_readahead_complete);
- if (!rdata) {
- /* best to give up if we're out of mem */
- add_credits_and_wake_if(server, credits, 0);
- break;
- }
-
- rdata->subreq.start = ra_index * PAGE_SIZE;
- rdata->subreq.len = nr_pages * PAGE_SIZE;
- rdata->cfile = cifsFileInfo_get(open_file);
- rdata->server = server;
- rdata->mapping = ractl->mapping;
- rdata->pid = pid;
- rdata->credits = credits_on_stack;
-
- for (i = 0; i < nr_pages; i++) {
- if (!readahead_folio(ractl))
- WARN_ON(1);
- }
- ra_pages -= nr_pages;
- ra_index += nr_pages;
-
- iov_iter_xarray(&rdata->subreq.io_iter, ITER_DEST, &rdata->mapping->i_pages,
- rdata->subreq.start, rdata->subreq.len);
-
- rc = adjust_credits(server, &rdata->credits, rdata->subreq.len);
- if (!rc) {
- if (rdata->cfile->invalidHandle)
- rc = -EAGAIN;
- else
- rc = server->ops->async_readv(rdata);
- }
-
- if (rc) {
- add_credits_and_wake_if(server, &rdata->credits, 0);
- cifs_unlock_folios(rdata->mapping,
- rdata->subreq.start / PAGE_SIZE,
- (rdata->subreq.start + rdata->subreq.len - 1) / PAGE_SIZE);
- /* Fallback to the readpage in error/reconnect cases */
- cifs_put_readdata(rdata);
- break;
- }
-
- cifs_put_readdata(rdata);
- }
-
- free_xid(xid);
-}
-
-/*
- * cifs_readpage_worker must be called with the page pinned
- */
-static int cifs_readpage_worker(struct file *file, struct page *page,
- loff_t *poffset)
-{
- struct inode *inode = file_inode(file);
- struct timespec64 atime, mtime;
- char *read_data;
- int rc;
-
- /* Is the page cached? */
- rc = cifs_readpage_from_fscache(inode, page);
- if (rc == 0)
- goto read_complete;
-
- read_data = kmap(page);
- /* for reads over a certain size could initiate async read ahead */
-
- rc = cifs_read(file, read_data, PAGE_SIZE, poffset);
-
- if (rc < 0)
- goto io_error;
- else
- cifs_dbg(FYI, "Bytes read %d\n", rc);
-
- /* we do not want atime to be less than mtime, it broke some apps */
- atime = inode_set_atime_to_ts(inode, current_time(inode));
- mtime = inode_get_mtime(inode);
- if (timespec64_compare(&atime, &mtime))
- inode_set_atime_to_ts(inode, inode_get_mtime(inode));
-
- if (PAGE_SIZE > rc)
- memset(read_data + rc, 0, PAGE_SIZE - rc);
-
- flush_dcache_page(page);
- SetPageUptodate(page);
- rc = 0;
-
-io_error:
- kunmap(page);
-
-read_complete:
- unlock_page(page);
- return rc;
-}
-
-static int cifs_read_folio(struct file *file, struct folio *folio)
-{
- struct page *page = &folio->page;
- loff_t offset = page_file_offset(page);
- int rc = -EACCES;
- unsigned int xid;
-
- xid = get_xid();
-
- if (file->private_data == NULL) {
- rc = -EBADF;
- free_xid(xid);
- return rc;
- }
-
- cifs_dbg(FYI, "read_folio %p at offset %d 0x%x\n",
- page, (int)offset, (int)offset);
-
- rc = cifs_readpage_worker(file, page, &offset);
-
- free_xid(xid);
- return rc;
-}
-#endif // end netfslib remove 4794
-
static int is_inode_writable(struct cifsInodeInfo *cifs_inode)
{
struct cifsFileInfo *open_file;
@@ -5175,125 +4675,6 @@ bool is_size_safe_to_change(struct cifsInodeInfo *cifsInode, __u64 end_of_file)
return true;
}

-#if 0 // TODO remove 5152
-static int cifs_write_begin(struct file *file, struct address_space *mapping,
- loff_t pos, unsigned len,
- struct page **pagep, void **fsdata)
-{
- int oncethru = 0;
- pgoff_t index = pos >> PAGE_SHIFT;
- loff_t offset = pos & (PAGE_SIZE - 1);
- loff_t page_start = pos & PAGE_MASK;
- loff_t i_size;
- struct page *page;
- int rc = 0;
-
- cifs_dbg(FYI, "write_begin from %lld len %d\n", (long long)pos, len);
-
-start:
- page = grab_cache_page_write_begin(mapping, index);
- if (!page) {
- rc = -ENOMEM;
- goto out;
- }
-
- if (PageUptodate(page))
- goto out;
-
- /*
- * If we write a full page it will be up to date, no need to read from
- * the server. If the write is short, we'll end up doing a sync write
- * instead.
- */
- if (len == PAGE_SIZE)
- goto out;
-
- /*
- * optimize away the read when we have an oplock, and we're not
- * expecting to use any of the data we'd be reading in. That
- * is, when the page lies beyond the EOF, or straddles the EOF
- * and the write will cover all of the existing data.
- */
- if (CIFS_CACHE_READ(CIFS_I(mapping->host))) {
- i_size = i_size_read(mapping->host);
- if (page_start >= i_size ||
- (offset == 0 && (pos + len) >= i_size)) {
- zero_user_segments(page, 0, offset,
- offset + len,
- PAGE_SIZE);
- /*
- * PageChecked means that the parts of the page
- * to which we're not writing are considered up
- * to date. Once the data is copied to the
- * page, it can be set uptodate.
- */
- SetPageChecked(page);
- goto out;
- }
- }
-
- if ((file->f_flags & O_ACCMODE) != O_WRONLY && !oncethru) {
- /*
- * might as well read a page, it is fast enough. If we get
- * an error, we don't need to return it. cifs_write_end will
- * do a sync write instead since PG_uptodate isn't set.
- */
- cifs_readpage_worker(file, page, &page_start);
- put_page(page);
- oncethru = 1;
- goto start;
- } else {
- /* we could try using another file handle if there is one -
- but how would we lock it to prevent close of that handle
- racing with this read? In any case
- this will be written out by write_end so is fine */
- }
-out:
- *pagep = page;
- return rc;
-}
-
-static bool cifs_release_folio(struct folio *folio, gfp_t gfp)
-{
- if (folio_test_private(folio))
- return 0;
- if (folio_test_fscache(folio)) {
- if (current_is_kswapd() || !(gfp & __GFP_FS))
- return false;
- folio_wait_fscache(folio);
- }
- fscache_note_page_release(cifs_inode_cookie(folio->mapping->host));
- return true;
-}
-
-static void cifs_invalidate_folio(struct folio *folio, size_t offset,
- size_t length)
-{
- folio_wait_fscache(folio);
-}
-
-static int cifs_launder_folio(struct folio *folio)
-{
- int rc = 0;
- loff_t range_start = folio_pos(folio);
- loff_t range_end = range_start + folio_size(folio);
- struct writeback_control wbc = {
- .sync_mode = WB_SYNC_ALL,
- .nr_to_write = 0,
- .range_start = range_start,
- .range_end = range_end,
- };
-
- cifs_dbg(FYI, "Launder page: %lu\n", folio->index);
-
- if (folio_clear_dirty_for_io(folio))
- rc = cifs_writepage_locked(&folio->page, &wbc);
-
- folio_wait_fscache(folio);
- return rc;
-}
-#endif // end netfslib remove 5152
-
void cifs_oplock_break(struct work_struct *work)
{
struct cifsFileInfo *cfile = container_of(work, struct cifsFileInfo,
@@ -5383,27 +4764,6 @@ void cifs_oplock_break(struct work_struct *work)
cifs_done_oplock_break(cinode);
}

-#if 0 // TODO remove 5333
-/*
- * The presence of cifs_direct_io() in the address space ops vector
- * allows open() O_DIRECT flags which would have failed otherwise.
- *
- * In the non-cached mode (mount with cache=none), we shunt off direct read and write requests
- * so this method should never be called.
- *
- * Direct IO is not yet supported in the cached mode.
- */
-static ssize_t
-cifs_direct_io(struct kiocb *iocb, struct iov_iter *iter)
-{
- /*
- * FIXME
- * Eventually need to support direct IO for non forcedirectio mounts
- */
- return -EINVAL;
-}
-#endif // netfs end remove 5333
-
static int cifs_swap_activate(struct swap_info_struct *sis,
struct file *swap_file, sector_t *span)
{
diff --git a/fs/smb/client/fscache.c b/fs/smb/client/fscache.c
index e4cb0938fb15..bd9284923cc6 100644
--- a/fs/smb/client/fscache.c
+++ b/fs/smb/client/fscache.c
@@ -136,114 +136,3 @@ void cifs_fscache_release_inode_cookie(struct inode *inode)
cifsi->netfs.cache = NULL;
}
}
-
-#if 0 // TODO remove
-/*
- * Fallback page reading interface.
- */
-static int fscache_fallback_read_page(struct inode *inode, struct page *page)
-{
- struct netfs_cache_resources cres;
- struct fscache_cookie *cookie = cifs_inode_cookie(inode);
- struct iov_iter iter;
- struct bio_vec bvec;
- int ret;
-
- memset(&cres, 0, sizeof(cres));
- bvec_set_page(&bvec, page, PAGE_SIZE, 0);
- iov_iter_bvec(&iter, ITER_DEST, &bvec, 1, PAGE_SIZE);
-
- ret = fscache_begin_read_operation(&cres, cookie);
- if (ret < 0)
- return ret;
-
- ret = fscache_read(&cres, page_offset(page), &iter, NETFS_READ_HOLE_FAIL,
- NULL, NULL);
- fscache_end_operation(&cres);
- return ret;
-}
-
-/*
- * Fallback page writing interface.
- */
-static int fscache_fallback_write_pages(struct inode *inode, loff_t start, size_t len,
- bool no_space_allocated_yet)
-{
- struct netfs_cache_resources cres;
- struct fscache_cookie *cookie = cifs_inode_cookie(inode);
- struct iov_iter iter;
- int ret;
-
- memset(&cres, 0, sizeof(cres));
- iov_iter_xarray(&iter, ITER_SOURCE, &inode->i_mapping->i_pages, start, len);
-
- ret = fscache_begin_write_operation(&cres, cookie);
- if (ret < 0)
- return ret;
-
- ret = cres.ops->prepare_write(&cres, &start, &len, i_size_read(inode),
- no_space_allocated_yet);
- if (ret == 0)
- ret = fscache_write(&cres, start, &iter, NULL, NULL);
- fscache_end_operation(&cres);
- return ret;
-}
-
-/*
- * Retrieve a page from FS-Cache
- */
-int __cifs_readpage_from_fscache(struct inode *inode, struct page *page)
-{
- int ret;
-
- cifs_dbg(FYI, "%s: (fsc:%p, p:%p, i:0x%p\n",
- __func__, cifs_inode_cookie(inode), page, inode);
-
- ret = fscache_fallback_read_page(inode, page);
- if (ret < 0)
- return ret;
-
- /* Read completed synchronously */
- SetPageUptodate(page);
- return 0;
-}
-
-void __cifs_readahead_to_fscache(struct inode *inode, loff_t pos, size_t len)
-{
- cifs_dbg(FYI, "%s: (fsc: %p, p: %llx, l: %zx, i: %p)\n",
- __func__, cifs_inode_cookie(inode), pos, len, inode);
-
- fscache_fallback_write_pages(inode, pos, len, true);
-}
-
-/*
- * Query the cache occupancy.
- */
-int __cifs_fscache_query_occupancy(struct inode *inode,
- pgoff_t first, unsigned int nr_pages,
- pgoff_t *_data_first,
- unsigned int *_data_nr_pages)
-{
- struct netfs_cache_resources cres;
- struct fscache_cookie *cookie = cifs_inode_cookie(inode);
- loff_t start, data_start;
- size_t len, data_len;
- int ret;
-
- ret = fscache_begin_read_operation(&cres, cookie);
- if (ret < 0)
- return ret;
-
- start = first * PAGE_SIZE;
- len = nr_pages * PAGE_SIZE;
- ret = cres.ops->query_occupancy(&cres, start, len, PAGE_SIZE,
- &data_start, &data_len);
- if (ret == 0) {
- *_data_first = data_start / PAGE_SIZE;
- *_data_nr_pages = len / PAGE_SIZE;
- }
-
- fscache_end_operation(&cres);
- return ret;
-}
-#endif
diff --git a/fs/smb/client/fscache.h b/fs/smb/client/fscache.h
index c2c05a778a71..ece1a826adb9 100644
--- a/fs/smb/client/fscache.h
+++ b/fs/smb/client/fscache.h
@@ -74,43 +74,6 @@ static inline void cifs_invalidate_cache(struct inode *inode, unsigned int flags
i_size_read(inode), flags);
}

-#if 0 // TODO remove
-extern int __cifs_fscache_query_occupancy(struct inode *inode,
- pgoff_t first, unsigned int nr_pages,
- pgoff_t *_data_first,
- unsigned int *_data_nr_pages);
-
-static inline int cifs_fscache_query_occupancy(struct inode *inode,
- pgoff_t first, unsigned int nr_pages,
- pgoff_t *_data_first,
- unsigned int *_data_nr_pages)
-{
- if (!cifs_inode_cookie(inode))
- return -ENOBUFS;
- return __cifs_fscache_query_occupancy(inode, first, nr_pages,
- _data_first, _data_nr_pages);
-}
-
-extern int __cifs_readpage_from_fscache(struct inode *pinode, struct page *ppage);
-extern void __cifs_readahead_to_fscache(struct inode *pinode, loff_t pos, size_t len);
-
-
-static inline int cifs_readpage_from_fscache(struct inode *inode,
- struct page *page)
-{
- if (cifs_inode_cookie(inode))
- return __cifs_readpage_from_fscache(inode, page);
- return -ENOBUFS;
-}
-
-static inline void cifs_readahead_to_fscache(struct inode *inode,
- loff_t pos, size_t len)
-{
- if (cifs_inode_cookie(inode))
- __cifs_readahead_to_fscache(inode, pos, len);
-}
-#endif
-
#else /* CONFIG_CIFS_FSCACHE */
static inline
void cifs_fscache_fill_coherency(struct inode *inode,
@@ -127,27 +90,6 @@ static inline void cifs_fscache_unuse_inode_cookie(struct inode *inode, bool upd
static inline struct fscache_cookie *cifs_inode_cookie(struct inode *inode) { return NULL; }
static inline void cifs_invalidate_cache(struct inode *inode, unsigned int flags) {}

-#if 0 // TODO remove
-static inline int cifs_fscache_query_occupancy(struct inode *inode,
- pgoff_t first, unsigned int nr_pages,
- pgoff_t *_data_first,
- unsigned int *_data_nr_pages)
-{
- *_data_first = ULONG_MAX;
- *_data_nr_pages = 0;
- return -ENOBUFS;
-}
-
-static inline int
-cifs_readpage_from_fscache(struct inode *inode, struct page *page)
-{
- return -ENOBUFS;
-}
-
-static inline
-void cifs_readahead_to_fscache(struct inode *inode, loff_t pos, size_t len) {}
-#endif
-
#endif /* CONFIG_CIFS_FSCACHE */

#endif /* _CIFS_FSCACHE_H */

2023-11-17 21:22:46

by David Howells

[permalink] [raw]
Subject: [PATCH v2 50/51] cifs: Remove some code that's no longer used, part 2

Remove some code that was #if'd out with the netfslib conversion. The removal
is split into parts for file.c as the diff generator otherwise produces a
hard-to-read diff for the part where a big chunk is cut out.

Signed-off-by: David Howells <[email protected]>
cc: Steve French <[email protected]>
cc: Shyam Prasad N <[email protected]>
cc: Rohith Surabattula <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/smb/client/file.c | 694 +------------------------------------------
1 file changed, 1 insertion(+), 693 deletions(-)
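
For reference, here is a minimal standalone sketch (not taken from the tree;
the marker comment only mirrors the "#if 0 // TODO remove" style used in
part 1) of the bracketing that this patch now deletes outright.  Since the
preprocessor already drops such blocks, removing them changes no object code:

	#include <stdio.h>

	#if 0 // TODO remove (placeholder marker, not a real line number)
	static int legacy_helper(void)
	{
		return -1;	/* never compiled; blocks like this are what get deleted */
	}
	#endif

	int main(void)
	{
		printf("only code outside the #if 0 block is built\n");
		return 0;
	}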

diff --git a/fs/smb/client/file.c b/fs/smb/client/file.c
index ce14ecb1c1c6..ffa8c59109ab 100644
--- a/fs/smb/client/file.c
+++ b/fs/smb/client/file.c
@@ -2575,699 +2575,6 @@ cifs_get_readable_path(struct cifs_tcon *tcon, const char *name,
return -ENOENT;
}

-#if 0 // TODO remove 2773
-void
-cifs_writedata_release(struct cifs_io_subrequest *wdata)
-{
- if (wdata->uncached)
- kref_put(&wdata->ctx->refcount, cifs_aio_ctx_release);
-#ifdef CONFIG_CIFS_SMB_DIRECT
- if (wdata->mr) {
- smbd_deregister_mr(wdata->mr);
- wdata->mr = NULL;
- }
-#endif
-
- if (wdata->cfile)
- cifsFileInfo_put(wdata->cfile);
-
- kfree(wdata);
-}
-
-/*
- * Write failed with a retryable error. Resend the write request. It's also
- * possible that the page was redirtied so re-clean the page.
- */
-static void
-cifs_writev_requeue(struct cifs_io_subrequest *wdata)
-{
- int rc = 0;
- struct inode *inode = d_inode(wdata->cfile->dentry);
- struct TCP_Server_Info *server;
- unsigned int rest_len = wdata->subreq.len;
- loff_t fpos = wdata->subreq.start;
-
- server = tlink_tcon(wdata->cfile->tlink)->ses->server;
- do {
- struct cifs_io_subrequest *wdata2;
- unsigned int wsize, cur_len;
-
- wsize = server->ops->wp_retry_size(inode);
- if (wsize < rest_len) {
- if (wsize < PAGE_SIZE) {
- rc = -EOPNOTSUPP;
- break;
- }
- cur_len = min(round_down(wsize, PAGE_SIZE), rest_len);
- } else {
- cur_len = rest_len;
- }
-
- wdata2 = cifs_writedata_alloc(cifs_writev_complete);
- if (!wdata2) {
- rc = -ENOMEM;
- break;
- }
-
- wdata2->sync_mode = wdata->sync_mode;
- wdata2->subreq.start = fpos;
- wdata2->subreq.len = cur_len;
- wdata2->subreq.io_iter = wdata->subreq.io_iter;
-
- iov_iter_advance(&wdata2->subreq.io_iter, fpos - wdata->subreq.start);
- iov_iter_truncate(&wdata2->subreq.io_iter, wdata2->subreq.len);
-
- if (iov_iter_is_xarray(&wdata2->subreq.io_iter))
- /* Check for pages having been redirtied and clean
- * them. We can do this by walking the xarray. If
- * it's not an xarray, then it's a DIO and we shouldn't
- * be mucking around with the page bits.
- */
- cifs_undirty_folios(inode, fpos, cur_len);
-
- rc = cifs_get_writable_file(CIFS_I(inode), FIND_WR_ANY,
- &wdata2->cfile);
- if (!wdata2->cfile) {
- cifs_dbg(VFS, "No writable handle to retry writepages rc=%d\n",
- rc);
- if (!is_retryable_error(rc))
- rc = -EBADF;
- } else {
- wdata2->pid = wdata2->cfile->pid;
- rc = server->ops->async_writev(wdata2);
- }
-
- cifs_put_writedata(wdata2);
- if (rc) {
- if (is_retryable_error(rc))
- continue;
- fpos += cur_len;
- rest_len -= cur_len;
- break;
- }
-
- fpos += cur_len;
- rest_len -= cur_len;
- } while (rest_len > 0);
-
- /* Clean up remaining pages from the original wdata */
- if (iov_iter_is_xarray(&wdata->subreq.io_iter))
- cifs_pages_write_failed(inode, fpos, rest_len);
-
- if (rc != 0 && !is_retryable_error(rc))
- mapping_set_error(inode->i_mapping, rc);
- cifs_put_writedata(wdata);
-}
-
-void
-cifs_writev_complete(struct work_struct *work)
-{
- struct cifs_io_subrequest *wdata = container_of(work,
- struct cifs_io_subrequest, work);
- struct inode *inode = d_inode(wdata->cfile->dentry);
-
- if (wdata->result == 0) {
- spin_lock(&inode->i_lock);
- cifs_update_eof(CIFS_I(inode), wdata->subreq.start, wdata->subreq.len);
- spin_unlock(&inode->i_lock);
- cifs_stats_bytes_written(tlink_tcon(wdata->cfile->tlink),
- wdata->subreq.len);
- } else if (wdata->sync_mode == WB_SYNC_ALL && wdata->result == -EAGAIN)
- return cifs_writev_requeue(wdata);
-
- if (wdata->result == -EAGAIN)
- cifs_pages_write_redirty(inode, wdata->subreq.start, wdata->subreq.len);
- else if (wdata->result < 0)
- cifs_pages_write_failed(inode, wdata->subreq.start, wdata->subreq.len);
- else
- cifs_pages_written_back(inode, wdata->subreq.start, wdata->subreq.len);
-
- if (wdata->result != -EAGAIN)
- mapping_set_error(inode->i_mapping, wdata->result);
- cifs_put_writedata(wdata);
-}
-
-struct cifs_io_subrequest *cifs_writedata_alloc(work_func_t complete)
-{
- struct cifs_io_subrequest *wdata;
-
- wdata = kzalloc(sizeof(*wdata), GFP_NOFS);
- if (wdata != NULL) {
- refcount_set(&wdata->subreq.ref, 1);
- INIT_LIST_HEAD(&wdata->list);
- init_completion(&wdata->done);
- INIT_WORK(&wdata->work, complete);
- }
- return wdata;
-}
-
-static int cifs_partialpagewrite(struct page *page, unsigned from, unsigned to)
-{
- struct address_space *mapping = page->mapping;
- loff_t offset = (loff_t)page->index << PAGE_SHIFT;
- char *write_data;
- int rc = -EFAULT;
- int bytes_written = 0;
- struct inode *inode;
- struct cifsFileInfo *open_file;
-
- if (!mapping || !mapping->host)
- return -EFAULT;
-
- inode = page->mapping->host;
-
- offset += (loff_t)from;
- write_data = kmap(page);
- write_data += from;
-
- if ((to > PAGE_SIZE) || (from > to)) {
- kunmap(page);
- return -EIO;
- }
-
- /* racing with truncate? */
- if (offset > mapping->host->i_size) {
- kunmap(page);
- return 0; /* don't care */
- }
-
- /* check to make sure that we are not extending the file */
- if (mapping->host->i_size - offset < (loff_t)to)
- to = (unsigned)(mapping->host->i_size - offset);
-
- rc = cifs_get_writable_file(CIFS_I(mapping->host), FIND_WR_ANY,
- &open_file);
- if (!rc) {
- bytes_written = cifs_write(open_file, open_file->pid,
- write_data, to - from, &offset);
- cifsFileInfo_put(open_file);
- /* Does mm or vfs already set times? */
- simple_inode_init_ts(inode);
- if ((bytes_written > 0) && (offset))
- rc = 0;
- else if (bytes_written < 0)
- rc = bytes_written;
- else
- rc = -EFAULT;
- } else {
- cifs_dbg(FYI, "No writable handle for write page rc=%d\n", rc);
- if (!is_retryable_error(rc))
- rc = -EIO;
- }
-
- kunmap(page);
- return rc;
-}
-
-/*
- * Extend the region to be written back to include subsequent contiguously
- * dirty pages if possible, but don't sleep while doing so.
- */
-static void cifs_extend_writeback(struct address_space *mapping,
- long *_count,
- loff_t start,
- int max_pages,
- size_t max_len,
- unsigned int *_len)
-{
- struct folio_batch batch;
- struct folio *folio;
- unsigned int psize, nr_pages;
- size_t len = *_len;
- pgoff_t index = (start + len) / PAGE_SIZE;
- bool stop = true;
- unsigned int i;
- XA_STATE(xas, &mapping->i_pages, index);
-
- folio_batch_init(&batch);
-
- do {
- /* Firstly, we gather up a batch of contiguous dirty pages
- * under the RCU read lock - but we can't clear the dirty flags
- * there if any of those pages are mapped.
- */
- rcu_read_lock();
-
- xas_for_each(&xas, folio, ULONG_MAX) {
- stop = true;
- if (xas_retry(&xas, folio))
- continue;
- if (xa_is_value(folio))
- break;
- if (folio_index(folio) != index)
- break;
- if (!folio_try_get_rcu(folio)) {
- xas_reset(&xas);
- continue;
- }
- nr_pages = folio_nr_pages(folio);
- if (nr_pages > max_pages)
- break;
-
- /* Has the page moved or been split? */
- if (unlikely(folio != xas_reload(&xas))) {
- folio_put(folio);
- break;
- }
-
- if (!folio_trylock(folio)) {
- folio_put(folio);
- break;
- }
- if (!folio_test_dirty(folio) || folio_test_writeback(folio)) {
- folio_unlock(folio);
- folio_put(folio);
- break;
- }
-
- max_pages -= nr_pages;
- psize = folio_size(folio);
- len += psize;
- stop = false;
- if (max_pages <= 0 || len >= max_len || *_count <= 0)
- stop = true;
-
- index += nr_pages;
- if (!folio_batch_add(&batch, folio))
- break;
- if (stop)
- break;
- }
-
- if (!stop)
- xas_pause(&xas);
- rcu_read_unlock();
-
- /* Now, if we obtained any pages, we can shift them to being
- * writable and mark them for caching.
- */
- if (!folio_batch_count(&batch))
- break;
-
- for (i = 0; i < folio_batch_count(&batch); i++) {
- folio = batch.folios[i];
- /* The folio should be locked, dirty and not undergoing
- * writeback from the loop above.
- */
- if (!folio_clear_dirty_for_io(folio))
- WARN_ON(1);
- folio_start_writeback(folio);
-
- *_count -= folio_nr_pages(folio);
- folio_unlock(folio);
- }
-
- folio_batch_release(&batch);
- cond_resched();
- } while (!stop);
-
- *_len = len;
-}
-
-/*
- * Write back the locked page and any subsequent non-locked dirty pages.
- */
-static ssize_t cifs_write_back_from_locked_folio(struct address_space *mapping,
- struct writeback_control *wbc,
- struct folio *folio,
- loff_t start, loff_t end)
-{
- struct inode *inode = mapping->host;
- struct TCP_Server_Info *server;
- struct cifs_io_subrequest *wdata;
- struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb);
- struct cifs_credits credits_on_stack;
- struct cifs_credits *credits = &credits_on_stack;
- struct cifsFileInfo *cfile = NULL;
- unsigned int xid, len;
- loff_t i_size = i_size_read(inode);
- size_t max_len, wsize;
- long count = wbc->nr_to_write;
- int rc;
-
- /* The folio should be locked, dirty and not undergoing writeback. */
- folio_start_writeback(folio);
-
- count -= folio_nr_pages(folio);
- len = folio_size(folio);
-
- xid = get_xid();
- server = cifs_pick_channel(cifs_sb_master_tcon(cifs_sb)->ses);
-
- rc = cifs_get_writable_file(CIFS_I(inode), FIND_WR_ANY, &cfile);
- if (rc) {
- cifs_dbg(VFS, "No writable handle in writepages rc=%d\n", rc);
- goto err_xid;
- }
-
- rc = server->ops->wait_mtu_credits(server, cifs_sb->ctx->wsize,
- &wsize, credits);
- if (rc != 0)
- goto err_close;
-
- wdata = cifs_writedata_alloc(cifs_writev_complete);
- if (!wdata) {
- rc = -ENOMEM;
- goto err_uncredit;
- }
-
- wdata->sync_mode = wbc->sync_mode;
- wdata->subreq.start = folio_pos(folio);
- wdata->pid = cfile->pid;
- wdata->credits = credits_on_stack;
- wdata->cfile = cfile;
- wdata->server = server;
- cfile = NULL;
-
- /* Find all consecutive lockable dirty pages, stopping when we find a
- * page that is not immediately lockable, is not dirty or is missing,
- * or we reach the end of the range.
- */
- if (start < i_size) {
- /* Trim the write to the EOF; the extra data is ignored. Also
- * put an upper limit on the size of a single storedata op.
- */
- max_len = wsize;
- max_len = min_t(unsigned long long, max_len, end - start + 1);
- max_len = min_t(unsigned long long, max_len, i_size - start);
-
- if (len < max_len) {
- int max_pages = INT_MAX;
-
-#ifdef CONFIG_CIFS_SMB_DIRECT
- if (server->smbd_conn)
- max_pages = server->smbd_conn->max_frmr_depth;
-#endif
- max_pages -= folio_nr_pages(folio);
-
- if (max_pages > 0)
- cifs_extend_writeback(mapping, &count, start,
- max_pages, max_len, &len);
- }
- len = min_t(loff_t, len, max_len);
- }
-
- wdata->subreq.len = len;
-
- /* We now have a contiguous set of dirty pages, each with writeback
- * set; the first page is still locked at this point, but all the rest
- * have been unlocked.
- */
- folio_unlock(folio);
-
- if (start < i_size) {
- iov_iter_xarray(&wdata->subreq.io_iter, ITER_SOURCE, &mapping->i_pages,
- start, len);
-
- rc = adjust_credits(wdata->server, &wdata->credits, wdata->subreq.len);
- if (rc)
- goto err_wdata;
-
- if (wdata->cfile->invalidHandle)
- rc = -EAGAIN;
- else
- rc = wdata->server->ops->async_writev(wdata);
- if (rc >= 0) {
- cifs_put_writedata(wdata);
- goto err_close;
- }
- } else {
- /* The dirty region was entirely beyond the EOF. */
- cifs_pages_written_back(inode, start, len);
- rc = 0;
- }
-
-err_wdata:
- cifs_put_writedata(wdata);
-err_uncredit:
- add_credits_and_wake_if(server, credits, 0);
-err_close:
- if (cfile)
- cifsFileInfo_put(cfile);
-err_xid:
- free_xid(xid);
- if (rc == 0) {
- wbc->nr_to_write = count;
- rc = len;
- } else if (is_retryable_error(rc)) {
- cifs_pages_write_redirty(inode, start, len);
- } else {
- cifs_pages_write_failed(inode, start, len);
- mapping_set_error(mapping, rc);
- }
- /* Indication to update ctime and mtime as close is deferred */
- set_bit(CIFS_INO_MODIFIED_ATTR, &CIFS_I(inode)->flags);
- return rc;
-}
-
-/*
- * write a region of pages back to the server
- */
-static int cifs_writepages_region(struct address_space *mapping,
- struct writeback_control *wbc,
- loff_t start, loff_t end, loff_t *_next)
-{
- struct folio_batch fbatch;
- int skips = 0;
-
- folio_batch_init(&fbatch);
- do {
- int nr;
- pgoff_t index = start / PAGE_SIZE;
-
- nr = filemap_get_folios_tag(mapping, &index, end / PAGE_SIZE,
- PAGECACHE_TAG_DIRTY, &fbatch);
- if (!nr)
- break;
-
- for (int i = 0; i < nr; i++) {
- ssize_t ret;
- struct folio *folio = fbatch.folios[i];
-
-redo_folio:
- start = folio_pos(folio); /* May regress with THPs */
-
- /* At this point we hold neither the i_pages lock nor the
- * page lock: the page may be truncated or invalidated
- * (changing page->mapping to NULL), or even swizzled
- * back from swapper_space to tmpfs file mapping
- */
- if (wbc->sync_mode != WB_SYNC_NONE) {
- ret = folio_lock_killable(folio);
- if (ret < 0)
- goto write_error;
- } else {
- if (!folio_trylock(folio))
- goto skip_write;
- }
-
- if (folio_mapping(folio) != mapping ||
- !folio_test_dirty(folio)) {
- start += folio_size(folio);
- folio_unlock(folio);
- continue;
- }
-
- if (folio_test_writeback(folio) ||
- folio_test_fscache(folio)) {
- folio_unlock(folio);
- if (wbc->sync_mode == WB_SYNC_NONE)
- goto skip_write;
-
- folio_wait_writeback(folio);
-#ifdef CONFIG_CIFS_FSCACHE
- folio_wait_fscache(folio);
-#endif
- goto redo_folio;
- }
-
- if (!folio_clear_dirty_for_io(folio))
- /* We hold the page lock - it should've been dirty. */
- WARN_ON(1);
-
- ret = cifs_write_back_from_locked_folio(mapping, wbc, folio, start, end);
- if (ret < 0)
- goto write_error;
-
- start += ret;
- continue;
-
-write_error:
- folio_batch_release(&fbatch);
- *_next = start;
- return ret;
-
-skip_write:
- /*
- * Too many skipped writes, or need to reschedule?
- * Treat it as a write error without an error code.
- */
- if (skips >= 5 || need_resched()) {
- ret = 0;
- goto write_error;
- }
-
- /* Otherwise, just skip that folio and go on to the next */
- skips++;
- start += folio_size(folio);
- continue;
- }
-
- folio_batch_release(&fbatch);
- cond_resched();
- } while (wbc->nr_to_write > 0);
-
- *_next = start;
- return 0;
-}
-
-/*
- * Write some of the pending data back to the server
- */
-static int cifs_writepages(struct address_space *mapping,
- struct writeback_control *wbc)
-{
- loff_t start, next;
- int ret;
-
- /* We have to be careful as we can end up racing with setattr()
- * truncating the pagecache since the caller doesn't take a lock here
- * to prevent it.
- */
-
- if (wbc->range_cyclic) {
- start = mapping->writeback_index * PAGE_SIZE;
- ret = cifs_writepages_region(mapping, wbc, start, LLONG_MAX, &next);
- if (ret == 0) {
- mapping->writeback_index = next / PAGE_SIZE;
- if (start > 0 && wbc->nr_to_write > 0) {
- ret = cifs_writepages_region(mapping, wbc, 0,
- start, &next);
- if (ret == 0)
- mapping->writeback_index =
- next / PAGE_SIZE;
- }
- }
- } else if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX) {
- ret = cifs_writepages_region(mapping, wbc, 0, LLONG_MAX, &next);
- if (wbc->nr_to_write > 0 && ret == 0)
- mapping->writeback_index = next / PAGE_SIZE;
- } else {
- ret = cifs_writepages_region(mapping, wbc,
- wbc->range_start, wbc->range_end, &next);
- }
-
- return ret;
-}
-
-static int
-cifs_writepage_locked(struct page *page, struct writeback_control *wbc)
-{
- int rc;
- unsigned int xid;
-
- xid = get_xid();
-/* BB add check for wbc flags */
- get_page(page);
- if (!PageUptodate(page))
- cifs_dbg(FYI, "ppw - page not up to date\n");
-
- /*
- * Set the "writeback" flag, and clear "dirty" in the radix tree.
- *
- * A writepage() implementation always needs to do either this,
- * or re-dirty the page with "redirty_page_for_writepage()" in
- * the case of a failure.
- *
- * Just unlocking the page will cause the radix tree tag-bits
- * to fail to update with the state of the page correctly.
- */
- set_page_writeback(page);
-retry_write:
- rc = cifs_partialpagewrite(page, 0, PAGE_SIZE);
- if (is_retryable_error(rc)) {
- if (wbc->sync_mode == WB_SYNC_ALL && rc == -EAGAIN)
- goto retry_write;
- redirty_page_for_writepage(wbc, page);
- } else if (rc != 0) {
- SetPageError(page);
- mapping_set_error(page->mapping, rc);
- } else {
- SetPageUptodate(page);
- }
- end_page_writeback(page);
- put_page(page);
- free_xid(xid);
- return rc;
-}
-
-static int cifs_write_end(struct file *file, struct address_space *mapping,
- loff_t pos, unsigned len, unsigned copied,
- struct page *page, void *fsdata)
-{
- int rc;
- struct inode *inode = mapping->host;
- struct cifsFileInfo *cfile = file->private_data;
- struct cifs_sb_info *cifs_sb = CIFS_SB(cfile->dentry->d_sb);
- struct folio *folio = page_folio(page);
- __u32 pid;
-
- if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_RWPIDFORWARD)
- pid = cfile->pid;
- else
- pid = current->tgid;
-
- cifs_dbg(FYI, "write_end for page %p from pos %lld with %d bytes\n",
- page, pos, copied);
-
- if (folio_test_checked(folio)) {
- if (copied == len)
- folio_mark_uptodate(folio);
- folio_clear_checked(folio);
- } else if (!folio_test_uptodate(folio) && copied == PAGE_SIZE)
- folio_mark_uptodate(folio);
-
- if (!folio_test_uptodate(folio)) {
- char *page_data;
- unsigned offset = pos & (PAGE_SIZE - 1);
- unsigned int xid;
-
- xid = get_xid();
- /* this is probably better than directly calling
- partialpage_write since in this function the file handle is
- known which we might as well leverage */
- /* BB check if anything else missing out of ppw
- such as updating last write time */
- page_data = kmap(page);
- rc = cifs_write(cfile, pid, page_data + offset, copied, &pos);
- /* if (rc < 0) should we set writebehind rc? */
- kunmap(page);
-
- free_xid(xid);
- } else {
- rc = copied;
- pos += copied;
- set_page_dirty(page);
- }
-
- if (rc > 0) {
- spin_lock(&inode->i_lock);
- if (pos > inode->i_size) {
- i_size_write(inode, pos);
- inode->i_blocks = (512 - 1 + pos) >> 9;
- }
- spin_unlock(&inode->i_lock);
- }
-
- unlock_page(page);
- put_page(page);
- /* Indication to update ctime and mtime as close is deferred */
- set_bit(CIFS_INO_MODIFIED_ATTR, &CIFS_I(inode)->flags);
-
- return rc;
-}
-#endif // End netfs removal 2773
-
/*
* Flush data on a strict file.
*/
@@ -4582,6 +3889,7 @@ cifs_read(struct file *file, char *read_data, size_t read_size, loff_t *offset)
}
#endif // end netfslib remove 4633

+
static vm_fault_t cifs_page_mkwrite(struct vm_fault *vmf)
{
return netfs_page_mkwrite(vmf, NULL);