2024-03-06 22:40:38

by David Howells

[permalink] [raw]
Subject: [RFC PATCH] mm: Replace ->launder_folio() with flush and wait

Here's a patch to have a go at getting rid of ->launder_folio(). Since it's
failable and cannot guarantee that pages in the range are removed, I've tried
to replace laundering with just flush-and-wait, dropping the folio lock around
the I/O.

Are there any tests for launder_folio as triggering it is tricky?

David
---
mm: Replace ->launder_folio() with flush and wait

invalidate_inode_pages2_range() and its wrappers are frequently used to
invalidate overlapping folios prior to and after doing direct I/O. This
calls ->launder_folio() to flush dirty folios out to the backing store,
keeping the folio lock across the I/O - presumably to prevent the folio
from being redirtied and thereby prevent it from being removed.

However... If we're doing this prior to doing DIO on a file, there may be
nothing preventing an mmapped write from recreating and redirtying the
folio the moment it is removed from the mapping lest the kernel deadlock on
doing DIO to/from a buffer mmapped from the target file.

Further, invalidate_inode_pages2_range() is permitted to fail - and half
the callers don't even check to see if it *did* fail, probably not
unreasonably.

In which case, there's no need to keep the folio lock and no need for a
special op. Instead, drop the lock and flush the range covering the page
to the end of the range to be invalidated (with a flag set in the wbc to
say what we're doing) and then wait for it to complete.

That said, there are other places that use invalidate_inode_pages2_range()
and most of those don't implement launder_page, so add a flag to only do
this in those filesystems that implemented it: 9p, afs, cifs, fuse, nfs and
orangefs.

Signed-off-by: David Howells <[email protected]>
cc: Matthew Wilcox <[email protected]>
cc: Christoph Hellwig <[email protected]>
cc: Andrew Morton <[email protected]>
cc: Alexander Viro <[email protected]>
cc: Christian Brauner <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
Documentation/filesystems/locking.rst | 8 ---
Documentation/filesystems/vfs.rst | 6 --
fs/9p/vfs_addr.c | 1
fs/9p/vfs_inode.c | 1
fs/afs/file.c | 1
fs/afs/inode.c | 1
fs/fuse/dir.c | 2
fs/fuse/file.c | 17 -------
fs/netfs/buffered_write.c | 76 ----------------------------------
fs/nfs/file.c | 23 ----------
fs/nfs/inode.c | 1
fs/nfs/nfstrace.h | 1
fs/orangefs/inode.c | 2
fs/smb/client/file.c | 23 ----------
fs/smb/client/inode.c | 1
include/linux/fs.h | 1
include/linux/netfs.h | 1
include/linux/pagemap.h | 6 ++
include/linux/writeback.h | 1
mm/truncate.c | 35 +++++++++------
20 files changed, 36 insertions(+), 172 deletions(-)

diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
index d5bf4b6b7509..139554d1ab51 100644
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@ -262,7 +262,6 @@ prototypes::
int (*direct_IO)(struct kiocb *, struct iov_iter *iter);
int (*migrate_folio)(struct address_space *, struct folio *dst,
struct folio *src, enum migrate_mode);
- int (*launder_folio)(struct folio *);
bool (*is_partially_uptodate)(struct folio *, size_t from, size_t count);
int (*error_remove_folio)(struct address_space *, struct folio *);
int (*swap_activate)(struct swap_info_struct *sis, struct file *f, sector_t *span)
@@ -288,7 +287,6 @@ release_folio: yes
free_folio: yes
direct_IO:
migrate_folio: yes (both)
-launder_folio: yes
is_partially_uptodate: yes
error_remove_folio: yes
swap_activate: no
@@ -394,12 +392,6 @@ try_to_free_buffers().
->free_folio() is called when the kernel has dropped the folio
from the page cache.

-->launder_folio() may be called prior to releasing a folio if
-it is still found to be dirty. It returns zero if the folio was successfully
-cleaned, or an error value if not. Note that in order to prevent the folio
-getting mapped back in and redirtied, it needs to be kept locked
-across the entire operation.
-
->swap_activate() will be called to prepare the given file for swap. It
should perform any validation and preparation necessary to ensure that
writes can be performed with minimal memory allocation. It should call
diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
index eebcc0f9e2bc..b2af9ee6515a 100644
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@ -818,7 +818,6 @@ cache in your filesystem. The following members are defined:
ssize_t (*direct_IO)(struct kiocb *, struct iov_iter *iter);
int (*migrate_folio)(struct mapping *, struct folio *dst,
struct folio *src, enum migrate_mode);
- int (*launder_folio) (struct folio *);

bool (*is_partially_uptodate) (struct folio *, size_t from,
size_t count);
@@ -1012,11 +1011,6 @@ cache in your filesystem. The following members are defined:
folio to this function. migrate_folio should transfer any private
data across and update any references that it has to the folio.

-``launder_folio``
- Called before freeing a folio - it writes back the dirty folio.
- To prevent redirtying the folio, it is kept locked during the
- whole operation.
-
``is_partially_uptodate``
Called by the VM when reading a file through the pagecache when
the underlying blocksize is smaller than the size of the folio.
diff --git a/fs/9p/vfs_addr.c b/fs/9p/vfs_addr.c
index 047855033d32..967c81b7aa73 100644
--- a/fs/9p/vfs_addr.c
+++ b/fs/9p/vfs_addr.c
@@ -141,7 +141,6 @@ const struct address_space_operations v9fs_addr_operations = {
.dirty_folio = netfs_dirty_folio,
.release_folio = netfs_release_folio,
.invalidate_folio = netfs_invalidate_folio,
- .launder_folio = netfs_launder_folio,
.direct_IO = noop_direct_IO,
.writepages = netfs_writepages,
};
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 32572982f72e..e085e520cb12 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -263,6 +263,7 @@ int v9fs_init_inode(struct v9fs_session_info *v9ses,
simple_inode_init_ts(inode);
inode->i_mapping->a_ops = &v9fs_addr_operations;
inode->i_private = NULL;
+ mapping_set_launder_folios(inode->i_mapping);

switch (mode & S_IFMT) {
case S_IFIFO:
diff --git a/fs/afs/file.c b/fs/afs/file.c
index ef2cc8f565d2..dfd8f60f5e1f 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -54,7 +54,6 @@ const struct address_space_operations afs_file_aops = {
.read_folio = netfs_read_folio,
.readahead = netfs_readahead,
.dirty_folio = netfs_dirty_folio,
- .launder_folio = netfs_launder_folio,
.release_folio = netfs_release_folio,
.invalidate_folio = netfs_invalidate_folio,
.migrate_folio = filemap_migrate_folio,
diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index 94fc049aff58..4041d9a4dae8 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -104,6 +104,7 @@ static int afs_inode_init_from_status(struct afs_operation *op,
inode->i_fop = &afs_file_operations;
inode->i_mapping->a_ops = &afs_file_aops;
mapping_set_large_folios(inode->i_mapping);
+ mapping_set_launder_folios(inode->i_mapping);
break;
case AFS_FTYPE_DIR:
inode->i_mode = S_IFDIR | (status->mode & S_IALLUGO);
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index d19cbf34c634..d87ee76bc578 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -1978,7 +1978,7 @@ int fuse_do_setattr(struct dentry *dentry, struct iattr *attr,

/*
* Only call invalidate_inode_pages2() after removing
- * FUSE_NOWRITE, otherwise fuse_launder_folio() would deadlock.
+ * FUSE_NOWRITE, otherwise writeback would deadlock.
*/
if ((is_truncate || !is_wb) &&
S_ISREG(inode->i_mode) && oldsize != outarg.attr.size) {
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 148a71b8b4d0..4933fe0b3269 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -2393,21 +2393,6 @@ static int fuse_write_end(struct file *file, struct address_space *mapping,
return copied;
}

-static int fuse_launder_folio(struct folio *folio)
-{
- int err = 0;
- if (folio_clear_dirty_for_io(folio)) {
- struct inode *inode = folio->mapping->host;
-
- /* Serialize with pending writeback for the same page */
- fuse_wait_on_page_writeback(inode, folio->index);
- err = fuse_writepage_locked(&folio->page);
- if (!err)
- fuse_wait_on_page_writeback(inode, folio->index);
- }
- return err;
-}
-
/*
* Write back dirty data/metadata now (there may not be any suitable
* open files later for data)
@@ -3227,7 +3212,6 @@ static const struct address_space_operations fuse_file_aops = {
.readahead = fuse_readahead,
.writepage = fuse_writepage,
.writepages = fuse_writepages,
- .launder_folio = fuse_launder_folio,
.dirty_folio = filemap_dirty_folio,
.bmap = fuse_bmap,
.direct_IO = fuse_direct_IO,
@@ -3241,6 +3225,7 @@ void fuse_init_file_inode(struct inode *inode, unsigned int flags)

inode->i_fop = &fuse_file_operations;
inode->i_data.a_ops = &fuse_file_aops;
+ mapping_set_launder_folios(inode->i_mapping);

INIT_LIST_HEAD(&fi->write_files);
INIT_LIST_HEAD(&fi->queued_writes);
diff --git a/fs/netfs/buffered_write.c b/fs/netfs/buffered_write.c
index 9a0d32e4b422..56aeb7e70bc8 100644
--- a/fs/netfs/buffered_write.c
+++ b/fs/netfs/buffered_write.c
@@ -908,7 +908,7 @@ static ssize_t netfs_write_back_from_locked_folio(struct address_space *mapping,
_enter(",%lx,%llx-%llx,%u", folio->index, start, end, caching);

wreq = netfs_alloc_request(mapping, NULL, start, folio_size(folio),
- NETFS_WRITEBACK);
+ wbc->for_launder ? NETFS_LAUNDER_WRITE : NETFS_WRITEBACK);
if (IS_ERR(wreq)) {
folio_unlock(folio);
return PTR_ERR(wreq);
@@ -1181,77 +1181,3 @@ int netfs_writepages(struct address_space *mapping,
return ret;
}
EXPORT_SYMBOL(netfs_writepages);
-
-/*
- * Deal with the disposition of a laundered folio.
- */
-static void netfs_cleanup_launder_folio(struct netfs_io_request *wreq)
-{
- if (wreq->error) {
- pr_notice("R=%08x Laundering error %d\n", wreq->debug_id, wreq->error);
- mapping_set_error(wreq->mapping, wreq->error);
- }
-}
-
-/**
- * netfs_launder_folio - Clean up a dirty folio that's being invalidated
- * @folio: The folio to clean
- *
- * This is called to write back a folio that's being invalidated when an inode
- * is getting torn down. Ideally, writepages would be used instead.
- */
-int netfs_launder_folio(struct folio *folio)
-{
- struct netfs_io_request *wreq;
- struct address_space *mapping = folio->mapping;
- struct netfs_folio *finfo = netfs_folio_info(folio);
- struct netfs_group *group = netfs_folio_group(folio);
- struct bio_vec bvec;
- unsigned long long i_size = i_size_read(mapping->host);
- unsigned long long start = folio_pos(folio);
- size_t offset = 0, len;
- int ret = 0;
-
- if (finfo) {
- offset = finfo->dirty_offset;
- start += offset;
- len = finfo->dirty_len;
- } else {
- len = folio_size(folio);
- }
- len = min_t(unsigned long long, len, i_size - start);
-
- wreq = netfs_alloc_request(mapping, NULL, start, len, NETFS_LAUNDER_WRITE);
- if (IS_ERR(wreq)) {
- ret = PTR_ERR(wreq);
- goto out;
- }
-
- if (!folio_clear_dirty_for_io(folio))
- goto out_put;
-
- trace_netfs_folio(folio, netfs_folio_trace_launder);
-
- _debug("launder %llx-%llx", start, start + len - 1);
-
- /* Speculatively write to the cache. We have to fix this up later if
- * the store fails.
- */
- wreq->cleanup = netfs_cleanup_launder_folio;
-
- bvec_set_folio(&bvec, folio, len, offset);
- iov_iter_bvec(&wreq->iter, ITER_SOURCE, &bvec, 1, len);
- __set_bit(NETFS_RREQ_UPLOAD_TO_SERVER, &wreq->flags);
- ret = netfs_begin_write(wreq, true, netfs_write_trace_launder);
-
-out_put:
- folio_detach_private(folio);
- netfs_put_group(group);
- kfree(finfo);
- netfs_put_request(wreq, false, netfs_rreq_trace_put_return);
-out:
- folio_wait_fscache(folio);
- _leave(" = %d", ret);
- return ret;
-}
-EXPORT_SYMBOL(netfs_launder_folio);
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index 8577ccf621f5..6efe0af3ba80 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -484,28 +484,6 @@ static void nfs_check_dirty_writeback(struct folio *folio,
*dirty = true;
}

-/*
- * Attempt to clear the private state associated with a page when an error
- * occurs that requires the cached contents of an inode to be written back or
- * destroyed
- * - Called if either PG_private or fscache is set on the page
- * - Caller holds page lock
- * - Return 0 if successful, -error otherwise
- */
-static int nfs_launder_folio(struct folio *folio)
-{
- struct inode *inode = folio->mapping->host;
- int ret;
-
- dfprintk(PAGECACHE, "NFS: launder_folio(%ld, %llu)\n",
- inode->i_ino, folio_pos(folio));
-
- folio_wait_fscache(folio);
- ret = nfs_wb_folio(inode, folio);
- trace_nfs_launder_folio_done(inode, folio, ret);
- return ret;
-}
-
static int nfs_swap_activate(struct swap_info_struct *sis, struct file *file,
sector_t *span)
{
@@ -564,7 +542,6 @@ const struct address_space_operations nfs_file_aops = {
.invalidate_folio = nfs_invalidate_folio,
.release_folio = nfs_release_folio,
.migrate_folio = nfs_migrate_folio,
- .launder_folio = nfs_launder_folio,
.is_dirty_writeback = nfs_check_dirty_writeback,
.error_remove_folio = generic_error_remove_folio,
.swap_activate = nfs_swap_activate,
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index ebb8d60e1152..f2f8cf00a442 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -491,6 +491,7 @@ nfs_fhget(struct super_block *sb, struct nfs_fh *fh, struct nfs_fattr *fattr)
inode->i_fop = NFS_SB(sb)->nfs_client->rpc_ops->file_ops;
inode->i_data.a_ops = &nfs_file_aops;
nfs_inode_init_regular(nfsi);
+ mapping_set_launder_folios(inode->i_mapping);
} else if (S_ISDIR(inode->i_mode)) {
inode->i_op = NFS_SB(sb)->nfs_client->rpc_ops->dir_inode_ops;
inode->i_fop = &nfs_dir_operations;
diff --git a/fs/nfs/nfstrace.h b/fs/nfs/nfstrace.h
index afedb449b54f..f0e8c0fb9447 100644
--- a/fs/nfs/nfstrace.h
+++ b/fs/nfs/nfstrace.h
@@ -1039,7 +1039,6 @@ DEFINE_NFS_FOLIO_EVENT(nfs_writeback_folio);
DEFINE_NFS_FOLIO_EVENT_DONE(nfs_writeback_folio_done);

DEFINE_NFS_FOLIO_EVENT(nfs_invalidate_folio);
-DEFINE_NFS_FOLIO_EVENT_DONE(nfs_launder_folio_done);

TRACE_EVENT(nfs_aop_readahead,
TP_PROTO(
diff --git a/fs/orangefs/inode.c b/fs/orangefs/inode.c
index 085912268442..a9ea40261832 100644
--- a/fs/orangefs/inode.c
+++ b/fs/orangefs/inode.c
@@ -626,7 +626,6 @@ static const struct address_space_operations orangefs_address_operations = {
.invalidate_folio = orangefs_invalidate_folio,
.release_folio = orangefs_release_folio,
.free_folio = orangefs_free_folio,
- .launder_folio = orangefs_launder_folio,
.direct_IO = orangefs_direct_IO,
};

@@ -980,6 +979,7 @@ static const struct inode_operations orangefs_file_inode_operations = {
static int orangefs_init_iops(struct inode *inode)
{
inode->i_mapping->a_ops = &orangefs_address_operations;
+ mapping_set_launder_folios(inode->i_mapping);

switch (inode->i_mode & S_IFMT) {
case S_IFREG:
diff --git a/fs/smb/client/file.c b/fs/smb/client/file.c
index f391c9b803d8..4fd871ba00f9 100644
--- a/fs/smb/client/file.c
+++ b/fs/smb/client/file.c
@@ -4858,27 +4858,6 @@ static void cifs_invalidate_folio(struct folio *folio, size_t offset,
folio_wait_fscache(folio);
}

-static int cifs_launder_folio(struct folio *folio)
-{
- int rc = 0;
- loff_t range_start = folio_pos(folio);
- loff_t range_end = range_start + folio_size(folio);
- struct writeback_control wbc = {
- .sync_mode = WB_SYNC_ALL,
- .nr_to_write = 0,
- .range_start = range_start,
- .range_end = range_end,
- };
-
- cifs_dbg(FYI, "Launder page: %lu\n", folio->index);
-
- if (folio_clear_dirty_for_io(folio))
- rc = cifs_writepage_locked(&folio->page, &wbc);
-
- folio_wait_fscache(folio);
- return rc;
-}
-
void cifs_oplock_break(struct work_struct *work)
{
struct cifsFileInfo *cfile = container_of(work, struct cifsFileInfo,
@@ -5057,7 +5036,6 @@ const struct address_space_operations cifs_addr_ops = {
.release_folio = cifs_release_folio,
.direct_IO = cifs_direct_io,
.invalidate_folio = cifs_invalidate_folio,
- .launder_folio = cifs_launder_folio,
.migrate_folio = filemap_migrate_folio,
/*
* TODO: investigate and if useful we could add an is_dirty_writeback
@@ -5080,6 +5058,5 @@ const struct address_space_operations cifs_addr_ops_smallbuf = {
.dirty_folio = netfs_dirty_folio,
.release_folio = cifs_release_folio,
.invalidate_folio = cifs_invalidate_folio,
- .launder_folio = cifs_launder_folio,
.migrate_folio = filemap_migrate_folio,
};
diff --git a/fs/smb/client/inode.c b/fs/smb/client/inode.c
index d02f8ba29cb5..24c1c2610e04 100644
--- a/fs/smb/client/inode.c
+++ b/fs/smb/client/inode.c
@@ -56,6 +56,7 @@ static void cifs_set_ops(struct inode *inode)
inode->i_data.a_ops = &cifs_addr_ops_smallbuf;
else
inode->i_data.a_ops = &cifs_addr_ops;
+ mapping_set_launder_folios(inode->i_mapping);
break;
case S_IFDIR:
if (IS_AUTOMOUNT(inode)) {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 1fbc72c5f112..ded54555ab30 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -432,7 +432,6 @@ struct address_space_operations {
*/
int (*migrate_folio)(struct address_space *, struct folio *dst,
struct folio *src, enum migrate_mode);
- int (*launder_folio)(struct folio *);
bool (*is_partially_uptodate) (struct folio *, size_t from,
size_t count);
void (*is_dirty_writeback) (struct folio *, bool *dirty, bool *wb);
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 100cbb261269..e566d31afb04 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -410,7 +410,6 @@ int netfs_unpin_writeback(struct inode *inode, struct writeback_control *wbc);
void netfs_clear_inode_writeback(struct inode *inode, const void *aux);
void netfs_invalidate_folio(struct folio *folio, size_t offset, size_t length);
bool netfs_release_folio(struct folio *folio, gfp_t gfp);
-int netfs_launder_folio(struct folio *folio);

/* VMA operations API. */
vm_fault_t netfs_page_mkwrite(struct vm_fault *vmf, struct netfs_group *netfs_group);
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 2df35e65557d..27ea8b9139a3 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -207,6 +207,7 @@ enum mapping_flags {
AS_STABLE_WRITES, /* must wait for writeback before modifying
folio contents */
AS_UNMOVABLE, /* The mapping cannot be moved, ever */
+ AS_LAUNDER_FOLIOS, /* Use laundering in invalidate_inode_pages2*() */
};

/**
@@ -323,6 +324,11 @@ static inline bool mapping_unmovable(struct address_space *mapping)
return test_bit(AS_UNMOVABLE, &mapping->flags);
}

+static inline void mapping_set_launder_folios(struct address_space *mapping)
+{
+ set_bit(AS_LAUNDER_FOLIOS, &mapping->flags);
+}
+
static inline gfp_t mapping_gfp_mask(struct address_space * mapping)
{
return mapping->gfp_mask;
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index 9845cb62e40b..7a1b79b344fa 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -62,6 +62,7 @@ struct writeback_control {
unsigned for_reclaim:1; /* Invoked from the page allocator */
unsigned range_cyclic:1; /* range_start is cyclic */
unsigned for_sync:1; /* sync(2) WB_SYNC_ALL writeback */
+ unsigned for_launder:1; /* Called from invalidate_inode_pages2_range() */
unsigned unpinned_netfs_wb:1; /* Cleared I_PINNING_NETFS_WB */

/*
diff --git a/mm/truncate.c b/mm/truncate.c
index 725b150e47ac..cb92759c4d8d 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -586,15 +586,6 @@ static int invalidate_complete_folio2(struct address_space *mapping,
return 0;
}

-static int folio_launder(struct address_space *mapping, struct folio *folio)
-{
- if (!folio_test_dirty(folio))
- return 0;
- if (folio->mapping != mapping || mapping->a_ops->launder_folio == NULL)
- return 0;
- return mapping->a_ops->launder_folio(folio);
-}
-
/**
* invalidate_inode_pages2_range - remove range of pages from an address_space
* @mapping: the address_space
@@ -657,13 +648,29 @@ int invalidate_inode_pages2_range(struct address_space *mapping,
unmap_mapping_folio(folio);
BUG_ON(folio_mapped(folio));

- ret2 = folio_launder(mapping, folio);
- if (ret2 == 0) {
- if (!invalidate_complete_folio2(mapping, folio))
+ if (folio_test_dirty(folio) &&
+ test_bit(AS_LAUNDER_FOLIOS, &mapping->flags)) {
+ struct writeback_control wbc = {
+ .sync_mode = WB_SYNC_ALL,
+ .nr_to_write = LLONG_MAX,
+ .range_start = folio_pos(folio),
+ .range_end = (end + 1ULL) * PAGE_SIZE - 1,
+ .for_launder = true,
+ };
+
+ folio_unlock(folio);
+ ret2 = filemap_fdatawrite_wbc(mapping, &wbc);
+ folio_lock(folio);
+ if (ret2 < 0)
+ ret = ret2;
+ else if (!invalidate_complete_folio2(mapping, folio) &&
+ ret == 0)
+ ret2 = -EBUSY;
+ } else {
+ if (!invalidate_complete_folio2(mapping, folio) &&
+ ret == 0)
ret2 = -EBUSY;
}
- if (ret2 < 0)
- ret = ret2;
folio_unlock(folio);
}
folio_batch_remove_exceptionals(&fbatch);



2024-03-07 10:36:44

by David Howells

[permalink] [raw]
Subject: Re: [RFC PATCH] mm: Replace ->launder_folio() with flush and wait

Matthew Wilcox <[email protected]> wrote:

> On Wed, Mar 06, 2024 at 10:39:37PM +0000, David Howells wrote:
> > Here's a patch to have a go at getting rid of ->launder_folio(). Since it's
> > failable and cannot guarantee that pages in the range are removed, I've tried
> > to replace laundering with just flush-and-wait, dropping the folio lock around
> > the I/O.
>
> My sense is that ->launder_folio doesn't actually need to be replaced.
>
> commit e3db7691e9f3dff3289f64e3d98583e28afe03db
> Author: Trond Myklebust <[email protected]>
> Date: Wed Jan 10 23:15:39 2007 -0800
>
> [PATCH] NFS: Fix race in nfs_release_page()
>
> NFS: Fix race in nfs_release_page()
>
> invalidate_inode_pages2() may find the dirty bit has been set on a page
> owing to the fact that the page may still be mapped after it was locked.
> Only after the call to unmap_mapping_range() are we sure that the page
> can no longer be dirtied.
> In order to fix this, NFS has hooked the releasepage() method and tries
> to write the page out between the call to unmap_mapping_range() and the
> call to remove_mapping(). This, however leads to deadlocks in the page
> reclaim code, where the page may be locked without holding a reference
> to the inode or dentry.
>
> Fix is to add a new address_space_operation, launder_page(), which will
> attempt to write out a dirty page without releasing the page lock.
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> I don't understand why this couldn't've been solved by page_mkwrite.
> NFS did later add nfs_vm_page_mkwrite in July 2007, and maybe it's just
> not needed any more? I haven't looked into it enough to make sure,
> but my belief is that we should be able to get rid of it.

Okay, it's slightly more complex than I thought - and I'm not sure all callers
are actually using it correctly. There are some additional interesting cases
I've found, beyond the pre-/post-DIO case:

(1) NFS relies on it to retry the write before stripping the pages in the
case where a writeback error occurs. I think this can probably be dealt
with by sticking a filemap_fdatawrite() call before the invalidation.
I'm not sure if this would incur the deadlock with the page reclaim code
of which Trond speaks.

(2) invalidate_inode_pages2() is used in some places to effect invalidation
of the pagecache in the case where the server tells us that a third party
modified the server copy of a file. What the right behaviour should be
here, I'm not sure, but at the moment, any dirty data will get laundered
back to the server. Possibly it should be simply invalidated locally or
the user asked how they want to handle the divergence.

Some filesystems use invalidate_remote_inode() instead which seems to
leave the dirty pages in place locally.

If it is desirous to save the dirty data, then filemap_fdatawrite()
could be deployed before invalidating the pages.

(3) Fuse uses invalidate_inode_pages2() in fuse_do_setattr() to strip all the
pages from an inode that has had its size changed, laundering any page
that's still dirty. This would seem to be excessive, but maybe Miklos
had a reason for doing it that way.

There are some places that should perhaps be using kiocb_invalidate_pages()
and kiocb_invalidate_post_direct_write() instead - assuming Christoph has no
objection to the latter function being exported.

David


2024-03-07 16:16:17

by David Howells

[permalink] [raw]
Subject: [RFC PATCH v2] mm: Kill ->launder_folio()

invalidate_inode_pages2_range() and its wrappers are frequently used to
invalidate overlapping folios prior to and after doing direct I/O. This
calls ->launder_folio() to flush dirty folios out to the backing store,
keeping the folio lock across the I/O - presumably to prevent the folio
from being redirtied and thereby prevent it from being removed.

However... If we're doing this prior to doing DIO on a file, there may be
nothing preventing an mmapped write from recreating and redirtying the
folio the moment it is removed from the mapping lest the kernel deadlock on
doing DIO to/from a buffer mmapped from the target file.

Further, invalidate_inode_pages2_range() is permitted to fail - and half
the callers don't even check to see if it *did* fail, probably not
unreasonably.

In which case, there's no point doing the laundry there; better to call
something like filemap_fdatawrite() beforehand. If mmap is going to
interfere, we can't stop it.

There are some other cases in which this is used:

(1) In fuse_do_setattr(), when the size of a file is changed. Calling
invalidate_inode_pages2() here is probably the wrong thing to do as
the preceding truncate_pagecache() should do the appropriate page
trimming and this would just seem to reduce performance for no good
reason.

(2) In some network filesystems, when the server informs the client of a
third-party modification to a file, the local pagecache is zapped with
invalidate_inode_pages2() rather than invalidate_remote_inode(). The
former writes back the dirty data whereas the latter retains it plus
surrounding obsolete data in the same folio. Maybe this should be
done by filemap_fdatawrite() followed by invalidate_inode_pages2().

Possibly, ->page_mkwrite() could be used to hold off mmap writes until
remote invalidation-induced writeback is achieved.

(3) In NFS, this is used to attempt to save the data when some sort of
fatal error occurs. It may be sufficient to do a filemap_fdatawrite()
before calling invalidate_inode_pages2(). nfs_writepages() can
observe the error state and do the laundering thing. Again, maybe,
->page_mkwrite() could be used to hold off mmap writes until the
pagecache has been invalidated.

Note that this only affects 9p, afs, cifs, fuse, nfs and orangefs.

Signed-off-by: David Howells <[email protected]>
cc: Trond Myklebust <[email protected]>
cc: Matthew Wilcox <[email protected]>
cc: Christoph Hellwig <[email protected]>
cc: Andrew Morton <[email protected]>
cc: Alexander Viro <[email protected]>
cc: Christian Brauner <[email protected]>
cc: Jeff Layton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
Documentation/filesystems/locking.rst | 8 --
Documentation/filesystems/vfs.rst | 6 -
fs/9p/vfs_addr.c | 2
fs/afs/file.c | 1
fs/afs/internal.h | 1
fs/afs/write.c | 10 --
fs/ceph/file.c | 6 -
fs/fuse/file.c | 16 ----
fs/netfs/buffered_write.c | 74 --------------------
fs/netfs/main.c | 1
fs/nfs/file.c | 23 ------
fs/nfs/inode.c | 4 -
fs/nfs/nfstrace.h | 1
fs/orangefs/inode.c | 1
fs/smb/client/file.c | 122 ----------------------------------
include/linux/fs.h | 1
include/linux/netfs.h | 2
include/trace/events/netfs.h | 3
mm/truncate.c | 25 +-----
19 files changed, 14 insertions(+), 293 deletions(-)

diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
index d5bf4b6b7509..139554d1ab51 100644
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@ -262,7 +262,6 @@ prototypes::
int (*direct_IO)(struct kiocb *, struct iov_iter *iter);
int (*migrate_folio)(struct address_space *, struct folio *dst,
struct folio *src, enum migrate_mode);
- int (*launder_folio)(struct folio *);
bool (*is_partially_uptodate)(struct folio *, size_t from, size_t count);
int (*error_remove_folio)(struct address_space *, struct folio *);
int (*swap_activate)(struct swap_info_struct *sis, struct file *f, sector_t *span)
@@ -288,7 +287,6 @@ release_folio: yes
free_folio: yes
direct_IO:
migrate_folio: yes (both)
-launder_folio: yes
is_partially_uptodate: yes
error_remove_folio: yes
swap_activate: no
@@ -394,12 +392,6 @@ try_to_free_buffers().
->free_folio() is called when the kernel has dropped the folio
from the page cache.

-->launder_folio() may be called prior to releasing a folio if
-it is still found to be dirty. It returns zero if the folio was successfully
-cleaned, or an error value if not. Note that in order to prevent the folio
-getting mapped back in and redirtied, it needs to be kept locked
-across the entire operation.
-
->swap_activate() will be called to prepare the given file for swap. It
should perform any validation and preparation necessary to ensure that
writes can be performed with minimal memory allocation. It should call
diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
index eebcc0f9e2bc..b2af9ee6515a 100644
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@ -818,7 +818,6 @@ cache in your filesystem. The following members are defined:
ssize_t (*direct_IO)(struct kiocb *, struct iov_iter *iter);
int (*migrate_folio)(struct mapping *, struct folio *dst,
struct folio *src, enum migrate_mode);
- int (*launder_folio) (struct folio *);

bool (*is_partially_uptodate) (struct folio *, size_t from,
size_t count);
@@ -1012,11 +1011,6 @@ cache in your filesystem. The following members are defined:
folio to this function. migrate_folio should transfer any private
data across and update any references that it has to the folio.

-``launder_folio``
- Called before freeing a folio - it writes back the dirty folio.
- To prevent redirtying the folio, it is kept locked during the
- whole operation.
-
``is_partially_uptodate``
Called by the VM when reading a file through the pagecache when
the underlying blocksize is smaller than the size of the folio.
diff --git a/fs/9p/vfs_addr.c b/fs/9p/vfs_addr.c
index 047855033d32..5a943c122d83 100644
--- a/fs/9p/vfs_addr.c
+++ b/fs/9p/vfs_addr.c
@@ -89,7 +89,6 @@ static int v9fs_init_request(struct netfs_io_request *rreq, struct file *file)
bool writing = (rreq->origin == NETFS_READ_FOR_WRITE ||
rreq->origin == NETFS_WRITEBACK ||
rreq->origin == NETFS_WRITETHROUGH ||
- rreq->origin == NETFS_LAUNDER_WRITE ||
rreq->origin == NETFS_UNBUFFERED_WRITE ||
rreq->origin == NETFS_DIO_WRITE);

@@ -141,7 +140,6 @@ const struct address_space_operations v9fs_addr_operations = {
.dirty_folio = netfs_dirty_folio,
.release_folio = netfs_release_folio,
.invalidate_folio = netfs_invalidate_folio,
- .launder_folio = netfs_launder_folio,
.direct_IO = noop_direct_IO,
.writepages = netfs_writepages,
};
diff --git a/fs/afs/file.c b/fs/afs/file.c
index ef2cc8f565d2..dfd8f60f5e1f 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -54,7 +54,6 @@ const struct address_space_operations afs_file_aops = {
.read_folio = netfs_read_folio,
.readahead = netfs_readahead,
.dirty_folio = netfs_dirty_folio,
- .launder_folio = netfs_launder_folio,
.release_folio = netfs_release_folio,
.invalidate_folio = netfs_invalidate_folio,
.migrate_folio = filemap_migrate_folio,
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 6ce5a612937c..b93aa026daa4 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -916,7 +916,6 @@ struct afs_operation {
loff_t pos;
loff_t size;
loff_t i_size;
- bool laundering; /* Laundering page, PG_writeback not set */
} store;
struct {
struct iattr *attr;
diff --git a/fs/afs/write.c b/fs/afs/write.c
index 74402d95a884..1bc26466eb72 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -75,8 +75,7 @@ static void afs_store_data_success(struct afs_operation *op)
op->ctime = op->file[0].scb.status.mtime_client;
afs_vnode_commit_status(op, &op->file[0]);
if (!afs_op_error(op)) {
- if (!op->store.laundering)
- afs_pages_written_back(vnode, op->store.pos, op->store.size);
+ afs_pages_written_back(vnode, op->store.pos, op->store.size);
afs_stat_v(vnode, n_stores);
atomic_long_add(op->store.size, &afs_v2net(vnode)->n_store_bytes);
}
@@ -91,8 +90,7 @@ static const struct afs_operation_ops afs_store_data_operation = {
/*
* write to a file
*/
-static int afs_store_data(struct afs_vnode *vnode, struct iov_iter *iter, loff_t pos,
- bool laundering)
+static int afs_store_data(struct afs_vnode *vnode, struct iov_iter *iter, loff_t pos)
{
struct afs_operation *op;
struct afs_wb_key *wbk = NULL;
@@ -123,7 +121,6 @@ static int afs_store_data(struct afs_vnode *vnode, struct iov_iter *iter, loff_t
op->file[0].modification = true;
op->store.pos = pos;
op->store.size = size;
- op->store.laundering = laundering;
op->flags |= AFS_OPERATION_UNINTR;
op->ops = &afs_store_data_operation;

@@ -168,8 +165,7 @@ static void afs_upload_to_server(struct netfs_io_subrequest *subreq)
subreq->rreq->debug_id, subreq->debug_index, subreq->io_iter.count);

trace_netfs_sreq(subreq, netfs_sreq_trace_submit);
- ret = afs_store_data(vnode, &subreq->io_iter, subreq->start,
- subreq->rreq->origin == NETFS_LAUNDER_WRITE);
+ ret = afs_store_data(vnode, &subreq->io_iter, subreq->start);
netfs_write_subrequest_terminated(subreq, ret < 0 ? ret : subreq->len,
false);
}
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index abe8028d95bf..6af96c154f81 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -1450,11 +1450,9 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter,

ceph_fscache_invalidate(inode, true);

- ret2 = invalidate_inode_pages2_range(inode->i_mapping,
- pos >> PAGE_SHIFT,
- (pos + count - 1) >> PAGE_SHIFT);
+ ret2 = kiocb_invalidate_pages(iocb, count);
if (ret2 < 0)
- doutc(cl, "invalidate_inode_pages2_range returned %d\n",
+ doutc(cl, "kiocb_invalidate_pages returned %d\n",
ret2);

flags = /* CEPH_OSD_FLAG_ORDERSNAP | */ CEPH_OSD_FLAG_WRITE;
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 148a71b8b4d0..43411f0d3ce1 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -2393,21 +2393,6 @@ static int fuse_write_end(struct file *file, struct address_space *mapping,
return copied;
}

-static int fuse_launder_folio(struct folio *folio)
-{
- int err = 0;
- if (folio_clear_dirty_for_io(folio)) {
- struct inode *inode = folio->mapping->host;
-
- /* Serialize with pending writeback for the same page */
- fuse_wait_on_page_writeback(inode, folio->index);
- err = fuse_writepage_locked(&folio->page);
- if (!err)
- fuse_wait_on_page_writeback(inode, folio->index);
- }
- return err;
-}
-
/*
* Write back dirty data/metadata now (there may not be any suitable
* open files later for data)
@@ -3227,7 +3212,6 @@ static const struct address_space_operations fuse_file_aops = {
.readahead = fuse_readahead,
.writepage = fuse_writepage,
.writepages = fuse_writepages,
- .launder_folio = fuse_launder_folio,
.dirty_folio = filemap_dirty_folio,
.bmap = fuse_bmap,
.direct_IO = fuse_direct_IO,
diff --git a/fs/netfs/buffered_write.c b/fs/netfs/buffered_write.c
index 9a0d32e4b422..4bdd427035de 100644
--- a/fs/netfs/buffered_write.c
+++ b/fs/netfs/buffered_write.c
@@ -1181,77 +1181,3 @@ int netfs_writepages(struct address_space *mapping,
return ret;
}
EXPORT_SYMBOL(netfs_writepages);
-
-/*
- * Deal with the disposition of a laundered folio.
- */
-static void netfs_cleanup_launder_folio(struct netfs_io_request *wreq)
-{
- if (wreq->error) {
- pr_notice("R=%08x Laundering error %d\n", wreq->debug_id, wreq->error);
- mapping_set_error(wreq->mapping, wreq->error);
- }
-}
-
-/**
- * netfs_launder_folio - Clean up a dirty folio that's being invalidated
- * @folio: The folio to clean
- *
- * This is called to write back a folio that's being invalidated when an inode
- * is getting torn down. Ideally, writepages would be used instead.
- */
-int netfs_launder_folio(struct folio *folio)
-{
- struct netfs_io_request *wreq;
- struct address_space *mapping = folio->mapping;
- struct netfs_folio *finfo = netfs_folio_info(folio);
- struct netfs_group *group = netfs_folio_group(folio);
- struct bio_vec bvec;
- unsigned long long i_size = i_size_read(mapping->host);
- unsigned long long start = folio_pos(folio);
- size_t offset = 0, len;
- int ret = 0;
-
- if (finfo) {
- offset = finfo->dirty_offset;
- start += offset;
- len = finfo->dirty_len;
- } else {
- len = folio_size(folio);
- }
- len = min_t(unsigned long long, len, i_size - start);
-
- wreq = netfs_alloc_request(mapping, NULL, start, len, NETFS_LAUNDER_WRITE);
- if (IS_ERR(wreq)) {
- ret = PTR_ERR(wreq);
- goto out;
- }
-
- if (!folio_clear_dirty_for_io(folio))
- goto out_put;
-
- trace_netfs_folio(folio, netfs_folio_trace_launder);
-
- _debug("launder %llx-%llx", start, start + len - 1);
-
- /* Speculatively write to the cache. We have to fix this up later if
- * the store fails.
- */
- wreq->cleanup = netfs_cleanup_launder_folio;
-
- bvec_set_folio(&bvec, folio, len, offset);
- iov_iter_bvec(&wreq->iter, ITER_SOURCE, &bvec, 1, len);
- __set_bit(NETFS_RREQ_UPLOAD_TO_SERVER, &wreq->flags);
- ret = netfs_begin_write(wreq, true, netfs_write_trace_launder);
-
-out_put:
- folio_detach_private(folio);
- netfs_put_group(group);
- kfree(finfo);
- netfs_put_request(wreq, false, netfs_rreq_trace_put_return);
-out:
- folio_wait_fscache(folio);
- _leave(" = %d", ret);
- return ret;
-}
-EXPORT_SYMBOL(netfs_launder_folio);
diff --git a/fs/netfs/main.c b/fs/netfs/main.c
index 5e77618a7940..dd4bbdc1a4d0 100644
--- a/fs/netfs/main.c
+++ b/fs/netfs/main.c
@@ -33,7 +33,6 @@ static const char *netfs_origins[nr__netfs_io_origin] = {
[NETFS_READ_FOR_WRITE] = "RW",
[NETFS_WRITEBACK] = "WB",
[NETFS_WRITETHROUGH] = "WT",
- [NETFS_LAUNDER_WRITE] = "LW",
[NETFS_UNBUFFERED_WRITE] = "UW",
[NETFS_DIO_READ] = "DR",
[NETFS_DIO_WRITE] = "DW",
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index 8577ccf621f5..6efe0af3ba80 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -484,28 +484,6 @@ static void nfs_check_dirty_writeback(struct folio *folio,
*dirty = true;
}

-/*
- * Attempt to clear the private state associated with a page when an error
- * occurs that requires the cached contents of an inode to be written back or
- * destroyed
- * - Called if either PG_private or fscache is set on the page
- * - Caller holds page lock
- * - Return 0 if successful, -error otherwise
- */
-static int nfs_launder_folio(struct folio *folio)
-{
- struct inode *inode = folio->mapping->host;
- int ret;
-
- dfprintk(PAGECACHE, "NFS: launder_folio(%ld, %llu)\n",
- inode->i_ino, folio_pos(folio));
-
- folio_wait_fscache(folio);
- ret = nfs_wb_folio(inode, folio);
- trace_nfs_launder_folio_done(inode, folio, ret);
- return ret;
-}
-
static int nfs_swap_activate(struct swap_info_struct *sis, struct file *file,
sector_t *span)
{
@@ -564,7 +542,6 @@ const struct address_space_operations nfs_file_aops = {
.invalidate_folio = nfs_invalidate_folio,
.release_folio = nfs_release_folio,
.migrate_folio = nfs_migrate_folio,
- .launder_folio = nfs_launder_folio,
.is_dirty_writeback = nfs_check_dirty_writeback,
.error_remove_folio = generic_error_remove_folio,
.swap_activate = nfs_swap_activate,
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index ebb8d60e1152..898f65784fae 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -1162,8 +1162,10 @@ void nfs_file_clear_open_context(struct file *filp)
* We fatal error on write before. Try to writeback
* every page again.
*/
- if (ctx->error < 0)
+ if (ctx->error < 0) {
+ filemap_fdatawrite(inode->i_mapping);
invalidate_inode_pages2(inode->i_mapping);
+ }
filp->private_data = NULL;
put_nfs_open_context_sync(ctx);
}
diff --git a/fs/nfs/nfstrace.h b/fs/nfs/nfstrace.h
index afedb449b54f..f0e8c0fb9447 100644
--- a/fs/nfs/nfstrace.h
+++ b/fs/nfs/nfstrace.h
@@ -1039,7 +1039,6 @@ DEFINE_NFS_FOLIO_EVENT(nfs_writeback_folio);
DEFINE_NFS_FOLIO_EVENT_DONE(nfs_writeback_folio_done);

DEFINE_NFS_FOLIO_EVENT(nfs_invalidate_folio);
-DEFINE_NFS_FOLIO_EVENT_DONE(nfs_launder_folio_done);

TRACE_EVENT(nfs_aop_readahead,
TP_PROTO(
diff --git a/fs/orangefs/inode.c b/fs/orangefs/inode.c
index 085912268442..d78876d08175 100644
--- a/fs/orangefs/inode.c
+++ b/fs/orangefs/inode.c
@@ -626,7 +626,6 @@ static const struct address_space_operations orangefs_address_operations = {
.invalidate_folio = orangefs_invalidate_folio,
.release_folio = orangefs_release_folio,
.free_folio = orangefs_free_folio,
- .launder_folio = orangefs_launder_folio,
.direct_IO = orangefs_direct_IO,
};

diff --git a/fs/smb/client/file.c b/fs/smb/client/file.c
index f391c9b803d8..f5d5efeba5e9 100644
--- a/fs/smb/client/file.c
+++ b/fs/smb/client/file.c
@@ -2561,64 +2561,6 @@ struct cifs_writedata *cifs_writedata_alloc(work_func_t complete)
return wdata;
}

-static int cifs_partialpagewrite(struct page *page, unsigned from, unsigned to)
-{
- struct address_space *mapping = page->mapping;
- loff_t offset = (loff_t)page->index << PAGE_SHIFT;
- char *write_data;
- int rc = -EFAULT;
- int bytes_written = 0;
- struct inode *inode;
- struct cifsFileInfo *open_file;
-
- if (!mapping || !mapping->host)
- return -EFAULT;
-
- inode = page->mapping->host;
-
- offset += (loff_t)from;
- write_data = kmap(page);
- write_data += from;
-
- if ((to > PAGE_SIZE) || (from > to)) {
- kunmap(page);
- return -EIO;
- }
-
- /* racing with truncate? */
- if (offset > mapping->host->i_size) {
- kunmap(page);
- return 0; /* don't care */
- }
-
- /* check to make sure that we are not extending the file */
- if (mapping->host->i_size - offset < (loff_t)to)
- to = (unsigned)(mapping->host->i_size - offset);
-
- rc = cifs_get_writable_file(CIFS_I(mapping->host), FIND_WR_ANY,
- &open_file);
- if (!rc) {
- bytes_written = cifs_write(open_file, open_file->pid,
- write_data, to - from, &offset);
- cifsFileInfo_put(open_file);
- /* Does mm or vfs already set times? */
- simple_inode_init_ts(inode);
- if ((bytes_written > 0) && (offset))
- rc = 0;
- else if (bytes_written < 0)
- rc = bytes_written;
- else
- rc = -EFAULT;
- } else {
- cifs_dbg(FYI, "No writable handle for write page rc=%d\n", rc);
- if (!is_retryable_error(rc))
- rc = -EIO;
- }
-
- kunmap(page);
- return rc;
-}
-
/*
* Extend the region to be written back to include subsequent contiguously
* dirty pages if possible, but don't sleep while doing so.
@@ -3001,47 +2943,6 @@ static int cifs_writepages(struct address_space *mapping,
return ret;
}

-static int
-cifs_writepage_locked(struct page *page, struct writeback_control *wbc)
-{
- int rc;
- unsigned int xid;
-
- xid = get_xid();
-/* BB add check for wbc flags */
- get_page(page);
- if (!PageUptodate(page))
- cifs_dbg(FYI, "ppw - page not up to date\n");
-
- /*
- * Set the "writeback" flag, and clear "dirty" in the radix tree.
- *
- * A writepage() implementation always needs to do either this,
- * or re-dirty the page with "redirty_page_for_writepage()" in
- * the case of a failure.
- *
- * Just unlocking the page will cause the radix tree tag-bits
- * to fail to update with the state of the page correctly.
- */
- set_page_writeback(page);
-retry_write:
- rc = cifs_partialpagewrite(page, 0, PAGE_SIZE);
- if (is_retryable_error(rc)) {
- if (wbc->sync_mode == WB_SYNC_ALL && rc == -EAGAIN)
- goto retry_write;
- redirty_page_for_writepage(wbc, page);
- } else if (rc != 0) {
- SetPageError(page);
- mapping_set_error(page->mapping, rc);
- } else {
- SetPageUptodate(page);
- }
- end_page_writeback(page);
- put_page(page);
- free_xid(xid);
- return rc;
-}
-
static int cifs_write_end(struct file *file, struct address_space *mapping,
loff_t pos, unsigned len, unsigned copied,
struct page *page, void *fsdata)
@@ -4858,27 +4759,6 @@ static void cifs_invalidate_folio(struct folio *folio, size_t offset,
folio_wait_fscache(folio);
}

-static int cifs_launder_folio(struct folio *folio)
-{
- int rc = 0;
- loff_t range_start = folio_pos(folio);
- loff_t range_end = range_start + folio_size(folio);
- struct writeback_control wbc = {
- .sync_mode = WB_SYNC_ALL,
- .nr_to_write = 0,
- .range_start = range_start,
- .range_end = range_end,
- };
-
- cifs_dbg(FYI, "Launder page: %lu\n", folio->index);
-
- if (folio_clear_dirty_for_io(folio))
- rc = cifs_writepage_locked(&folio->page, &wbc);
-
- folio_wait_fscache(folio);
- return rc;
-}
-
void cifs_oplock_break(struct work_struct *work)
{
struct cifsFileInfo *cfile = container_of(work, struct cifsFileInfo,
@@ -5057,7 +4937,6 @@ const struct address_space_operations cifs_addr_ops = {
.release_folio = cifs_release_folio,
.direct_IO = cifs_direct_io,
.invalidate_folio = cifs_invalidate_folio,
- .launder_folio = cifs_launder_folio,
.migrate_folio = filemap_migrate_folio,
/*
* TODO: investigate and if useful we could add an is_dirty_writeback
@@ -5080,6 +4959,5 @@ const struct address_space_operations cifs_addr_ops_smallbuf = {
.dirty_folio = netfs_dirty_folio,
.release_folio = cifs_release_folio,
.invalidate_folio = cifs_invalidate_folio,
- .launder_folio = cifs_launder_folio,
.migrate_folio = filemap_migrate_folio,
};
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 1fbc72c5f112..ded54555ab30 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -432,7 +432,6 @@ struct address_space_operations {
*/
int (*migrate_folio)(struct address_space *, struct folio *dst,
struct folio *src, enum migrate_mode);
- int (*launder_folio)(struct folio *);
bool (*is_partially_uptodate) (struct folio *, size_t from,
size_t count);
void (*is_dirty_writeback) (struct folio *, bool *dirty, bool *wb);
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 100cbb261269..7bd2aeccb299 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -232,7 +232,6 @@ enum netfs_io_origin {
NETFS_READ_FOR_WRITE, /* This read is to prepare a write */
NETFS_WRITEBACK, /* This write was triggered by writepages */
NETFS_WRITETHROUGH, /* This write was made by netfs_perform_write() */
- NETFS_LAUNDER_WRITE, /* This is triggered by ->launder_folio() */
NETFS_UNBUFFERED_WRITE, /* This is an unbuffered write */
NETFS_DIO_READ, /* This is a direct I/O read */
NETFS_DIO_WRITE, /* This is a direct I/O write */
@@ -410,7 +409,6 @@ int netfs_unpin_writeback(struct inode *inode, struct writeback_control *wbc);
void netfs_clear_inode_writeback(struct inode *inode, const void *aux);
void netfs_invalidate_folio(struct folio *folio, size_t offset, size_t length);
bool netfs_release_folio(struct folio *folio, gfp_t gfp);
-int netfs_launder_folio(struct folio *folio);

/* VMA operations API. */
vm_fault_t netfs_page_mkwrite(struct vm_fault *vmf, struct netfs_group *netfs_group);
diff --git a/include/trace/events/netfs.h b/include/trace/events/netfs.h
index 447a8c21cf57..57ed767f0230 100644
--- a/include/trace/events/netfs.h
+++ b/include/trace/events/netfs.h
@@ -25,7 +25,6 @@

#define netfs_write_traces \
EM(netfs_write_trace_dio_write, "DIO-WRITE") \
- EM(netfs_write_trace_launder, "LAUNDER ") \
EM(netfs_write_trace_unbuffered_write, "UNB-WRITE") \
EM(netfs_write_trace_writeback, "WRITEBACK") \
E_(netfs_write_trace_writethrough, "WRITETHRU")
@@ -36,7 +35,6 @@
EM(NETFS_READ_FOR_WRITE, "RW") \
EM(NETFS_WRITEBACK, "WB") \
EM(NETFS_WRITETHROUGH, "WT") \
- EM(NETFS_LAUNDER_WRITE, "LW") \
EM(NETFS_UNBUFFERED_WRITE, "UW") \
EM(NETFS_DIO_READ, "DR") \
E_(NETFS_DIO_WRITE, "DW")
@@ -131,7 +129,6 @@
EM(netfs_folio_trace_end_copy, "end-copy") \
EM(netfs_folio_trace_filled_gaps, "filled-gaps") \
EM(netfs_folio_trace_kill, "kill") \
- EM(netfs_folio_trace_launder, "launder") \
EM(netfs_folio_trace_mkwrite, "mkwrite") \
EM(netfs_folio_trace_mkwrite_plus, "mkwrite+") \
EM(netfs_folio_trace_read_gaps, "read-gaps") \
diff --git a/mm/truncate.c b/mm/truncate.c
index 725b150e47ac..dab17b926991 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -561,10 +561,10 @@ static int invalidate_complete_folio2(struct address_space *mapping,
struct folio *folio)
{
if (folio->mapping != mapping)
- return 0;
+ return -EBUSY;

if (!filemap_release_folio(folio, GFP_KERNEL))
- return 0;
+ return -EBUSY;

spin_lock(&mapping->host->i_lock);
xa_lock_irq(&mapping->i_pages);
@@ -579,20 +579,11 @@ static int invalidate_complete_folio2(struct address_space *mapping,
spin_unlock(&mapping->host->i_lock);

filemap_free_folio(mapping, folio);
- return 1;
+ return 0;
failed:
xa_unlock_irq(&mapping->i_pages);
spin_unlock(&mapping->host->i_lock);
- return 0;
-}
-
-static int folio_launder(struct address_space *mapping, struct folio *folio)
-{
- if (!folio_test_dirty(folio))
- return 0;
- if (folio->mapping != mapping || mapping->a_ops->launder_folio == NULL)
- return 0;
- return mapping->a_ops->launder_folio(folio);
+ return -EBUSY;
}

/**
@@ -657,12 +648,8 @@ int invalidate_inode_pages2_range(struct address_space *mapping,
unmap_mapping_folio(folio);
BUG_ON(folio_mapped(folio));

- ret2 = folio_launder(mapping, folio);
- if (ret2 == 0) {
- if (!invalidate_complete_folio2(mapping, folio))
- ret2 = -EBUSY;
- }
- if (ret2 < 0)
+ ret2 = invalidate_complete_folio2(mapping, folio);
+ if (ret2)
ret = ret2;
folio_unlock(folio);
}