This patch series fixes the issue I described here:
https://www.spinics.net/lists/linux-block/msg38274.html
Essentially, the issue is that journal_finish_inode_data_buffers() operates
on the entire address space of each inode associated with a given
journal entry. If we have an inode to which we are constantly appending
dirty pages, we can end up waiting indefinitely in
journal_finish_inode_data_buffers().
This series improves the situation in ext4 by scoping the dirty range of
each inode associated with a given transaction. Other users of jbd2
that don't (yet?) take advantage of this scoping (ocfs2) will continue to
have the old behavior.
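To illustrate the idea, here is a minimal userspace sketch, not the code in
this series. Every name in it (struct model_inode, model_add_range(),
model_wait_whole(), model_wait_scoped(), range_demo()) is hypothetical and
exists only for this example. It assumes that a transaction records one
inclusive byte range per inode and that commit waits only on that range.

```c
#include <assert.h>

/* Hypothetical userspace model; none of these names are the jbd2 API. */
struct model_inode {
	long long size;        /* current file size in bytes (keeps growing) */
	long long dirty_start; /* first dirty byte recorded for the transaction */
	long long dirty_end;   /* last dirty byte recorded (inclusive) */
	int has_range;         /* 0 until the first range is recorded */
};

/* Record [start, end] as dirtied under the transaction being modeled. */
static void model_add_range(struct model_inode *in,
			    long long start, long long end)
{
	if (!in->has_range) {
		in->dirty_start = start;
		in->dirty_end = end;
		in->has_range = 1;
		return;
	}
	/* Widen the recorded range so it covers the new write too. */
	if (start < in->dirty_start)
		in->dirty_start = start;
	if (end > in->dirty_end)
		in->dirty_end = end;
}

/* Bytes a commit waits on when it covers the whole address space. */
static long long model_wait_whole(const struct model_inode *in)
{
	return in->size;
}

/* Bytes a commit waits on when it is scoped to the recorded range. */
static long long model_wait_scoped(const struct model_inode *in)
{
	return in->has_range ? in->dirty_end - in->dirty_start + 1 : 0;
}

static void range_demo(void)
{
	struct model_inode in = { 0 };

	/* The modeled transaction dirties two 4096-byte pages. */
	in.size = 12288;
	model_add_range(&in, 4096, 8191);
	model_add_range(&in, 8192, 12287);

	/* Appends that arrive later grow the file but not the range. */
	in.size += 100 * 4096;

	assert(model_wait_scoped(&in) == 8192);
	assert(model_wait_whole(&in) == 421888);
}
```

In this model the whole-address-space wait grows with every append, while
the scoped wait stays fixed at the two pages the transaction touched.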
Ross Zwisler (3):
mm: add filemap_fdatawait_range_keep_errors()
jbd2: introduce jbd2_inode dirty range scoping
ext4: use jbd2_inode dirty range scoping
fs/ext4/ext4_jbd2.h | 12 +++++------
fs/ext4/inode.c | 13 +++++++++---
fs/ext4/move_extent.c | 3 ++-
fs/jbd2/commit.c | 26 +++++++++++++++++------
fs/jbd2/journal.c | 2 ++
fs/jbd2/transaction.c | 49 ++++++++++++++++++++++++-------------------
include/linux/fs.h | 2 ++
include/linux/jbd2.h | 22 +++++++++++++++++++
mm/filemap.c | 22 +++++++++++++++++++
9 files changed, 114 insertions(+), 37 deletions(-)
--
2.22.0.410.gd8fdbe21b5-goog
In the spirit of filemap_fdatawait_range() and
filemap_fdatawait_keep_errors(), introduce
filemap_fdatawait_range_keep_errors(), which both takes a range upon
which to wait and does not clear errors from the address space.
Signed-off-by: Ross Zwisler <[email protected]>
---
include/linux/fs.h | 2 ++
mm/filemap.c | 22 ++++++++++++++++++++++
2 files changed, 24 insertions(+)
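Note (illustration only, not part of the patch): a minimal userspace sketch
of why a flusher might prefer a check that keeps errors over one that clears
them. The names struct model_mapping, model_check_errors(),
model_check_and_keep_errors() and errors_demo() are hypothetical, and the
single boolean stands in for the error state of the address space.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of an address space with one sticky error flag. */
struct model_mapping {
	bool eio; /* set when a writeback error was recorded */
};

/* Clearing check: reports the error once, then forgets it. */
static int model_check_errors(struct model_mapping *m)
{
	if (m->eio) {
		m->eio = false;
		return -EIO;
	}
	return 0;
}

/* Keep-errors check: reports the error and leaves it recorded. */
static int model_check_and_keep_errors(const struct model_mapping *m)
{
	return m->eio ? -EIO : 0;
}

static void errors_demo(void)
{
	struct model_mapping a = { .eio = true };
	struct model_mapping b = { .eio = true };

	/* Flusher uses the clearing check: a later caller misses the error. */
	assert(model_check_errors(&a) == -EIO);
	assert(model_check_errors(&a) == 0);

	/* Flusher uses the keep-errors check: a later caller still sees it. */
	assert(model_check_and_keep_errors(&b) == -EIO);
	assert(model_check_errors(&b) == -EIO);
}
```

In this model, a background waiter that does not act on the error itself
leaves it in place for whichever later caller does.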
diff --git a/include/linux/fs.h b/include/linux/fs.h
index f7fdfe93e25d3..79fec8a8413f4 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2712,6 +2712,8 @@ extern int filemap_flush(struct address_space *);
extern int filemap_fdatawait_keep_errors(struct address_space *mapping);
extern int filemap_fdatawait_range(struct address_space *, loff_t lstart,
loff_t lend);
+extern int filemap_fdatawait_range_keep_errors(struct address_space *mapping,
+ loff_t start_byte, loff_t end_byte);
static inline int filemap_fdatawait(struct address_space *mapping)
{
diff --git a/mm/filemap.c b/mm/filemap.c
index df2006ba0cfa5..e87252ca0835a 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -553,6 +553,28 @@ int filemap_fdatawait_range(struct address_space *mapping, loff_t start_byte,
}
EXPORT_SYMBOL(filemap_fdatawait_range);
+/**
+ * filemap_fdatawait_range_keep_errors - wait for writeback to complete
+ * @mapping: address space structure to wait for
+ * @start_byte: offset in bytes where the range starts
+ * @end_byte: offset in bytes where the range ends (inclusive)
+ *
+ * Walk the list of under-writeback pages of the given address space in the
+ * given range and wait for all of them. Unlike filemap_fdatawait_range(),
+ * this function does not clear error status of the address space.
+ *
+ * Use this function if callers don't handle errors themselves. Expected
+ * call sites are system-wide / filesystem-wide data flushers, e.g. sync(2)
+ * and fsfreeze(8).
+ */
+int filemap_fdatawait_range_keep_errors(struct address_space *mapping,
+ loff_t start_byte, loff_t end_byte)
+{
+ __filemap_fdatawait_range(mapping, start_byte, end_byte);
+ return filemap_check_and_keep_errors(mapping);
+}
+EXPORT_SYMBOL(filemap_fdatawait_range_keep_errors);
+
/**
* file_fdatawait_range - wait for writeback to complete
* @file: file pointing to address space structure to wait for
--
2.22.0.410.gd8fdbe21b5-goog