Some of the higher layers like iomap takes inode_lock() when calling
generic_write_sync().
Also writeback already happens from other paths without inode lock,
so it's difficult to say that we really need sync_mapping_buffers() to
take any inode locking here. Having said that, let's add
generic_buffer_fsync() implementation in buffer.c with no
inode_lock/unlock() for now so that filesystems like ext2 and
ext4's nojournal mode can use it.
Ext4 when got converted to iomap for direct-io already copied it's own
variant of __generic_file_fsync() without lock. Hence let's add a helper
API and use it both in ext2 and ext4.
Later we can review other filesystems as well to see if we can make
generic_buffer_fsync() which does not take any inode_lock() as the
default path.
Tested-by: Disha Goel <[email protected]>
Signed-off-by: Ritesh Harjani (IBM) <[email protected]>
---
fs/buffer.c | 43 +++++++++++++++++++++++++++++++++++++
include/linux/buffer_head.h | 2 ++
2 files changed, 45 insertions(+)
diff --git a/fs/buffer.c b/fs/buffer.c
index 9e1e2add541e..df98f1966a71 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -593,6 +593,49 @@ int sync_mapping_buffers(struct address_space *mapping)
}
EXPORT_SYMBOL(sync_mapping_buffers);
+/**
+ * generic_buffer_fsync - generic buffer fsync implementation
+ * for simple filesystems with no inode lock
+ *
+ * @file: file to synchronize
+ * @start: start offset in bytes
+ * @end: end offset in bytes (inclusive)
+ * @datasync: only synchronize essential metadata if true
+ *
+ * This is a generic implementation of the fsync method for simple
+ * filesystems which track all non-inode metadata in the buffers list
+ * hanging off the address_space structure.
+ */
+int generic_buffer_fsync(struct file *file, loff_t start, loff_t end,
+ bool datasync)
+{
+ struct inode *inode = file->f_mapping->host;
+ int err;
+ int ret;
+
+ err = file_write_and_wait_range(file, start, end);
+ if (err)
+ return err;
+
+ ret = sync_mapping_buffers(inode->i_mapping);
+ if (!(inode->i_state & I_DIRTY_ALL))
+ goto out;
+ if (datasync && !(inode->i_state & I_DIRTY_DATASYNC))
+ goto out;
+
+ err = sync_inode_metadata(inode, 1);
+ if (ret == 0)
+ ret = err;
+
+out:
+ /* check and advance again to catch errors after syncing out buffers */
+ err = file_check_and_advance_wb_err(file);
+ if (ret == 0)
+ ret = err;
+ return ret;
+}
+EXPORT_SYMBOL(generic_buffer_fsync);
+
/*
* Called when we've recently written block `bblock', and it is known that
* `bblock' was for a buffer_boundary() buffer. This means that the block at
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 8f14dca5fed7..3170d0792d52 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -211,6 +211,8 @@ int inode_has_buffers(struct inode *);
void invalidate_inode_buffers(struct inode *);
int remove_inode_buffers(struct inode *inode);
int sync_mapping_buffers(struct address_space *mapping);
+int generic_buffer_fsync(struct file *file, loff_t start, loff_t end,
+ bool datasync);
void clean_bdev_aliases(struct block_device *bdev, sector_t block,
sector_t len);
static inline void clean_bdev_bh_alias(struct buffer_head *bh)
--
2.39.2
Looks good:
Reviewed-by: Christoph Hellwig <[email protected]>