2024-03-15 13:02:10

by Zhang Yi

[permalink] [raw]
Subject: [PATCH v2 00/10] xfs/iomap: fix non-atomic clone operation and don't update size when zeroing range post eof

From: Zhang Yi <[email protected]>

Changes since v1:
- Make xfs_bmapi_convert_delalloc() allocate the target offset and
  drop the writeback helper xfs_convert_blocks().
- Don't use xfs_iomap_write_direct() to convert delalloc blocks in the
  post-EOF zeroing case; use xfs_bmapi_convert_delalloc() instead.
- Fix two off-by-one issues when converting delalloc blocks.
- Add a separate patch to drop the buffered write failure handling in
  zeroing and unsharing.
- Add a comment to emphasize that updating i_size should be done under
  the folio lock.
- Make iomap_write_end() return a boolean, and do some cleanups in the
  buffered write begin path.

This patch series fixes a problem of exposing zeroed data on xfs caused
by the non-atomic clone operation. I found this problem while developing
the ext4 buffered IO iomap conversion (ext4 relies on this fix [1]);
for the root cause and the discussion about the solution, please see
[2]. Once the problem is fixed, iomap_zero_range() no longer needs to
update i_size, so ext4 can use it to zero a partial block, e.g. the EOF
block during truncate [3] (a rough sketch of such a caller follows the
links below).

[1] https://lore.kernel.org/linux-ext4/[email protected]/
[2] https://lore.kernel.org/linux-ext4/[email protected]/
[3] https://lore.kernel.org/linux-ext4/[email protected]/
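
For illustration only, a minimal sketch of the kind of ext4 caller this
enables once the series lands; the helper and ext4_iomap_buffered_ops
are hypothetical names, not code from [3]:

	/*
	 * Hypothetical ext4-style truncate helper: zero the tail of the
	 * partial EOF block. Safe after this series because
	 * iomap_zero_range() no longer bumps i_size past the new EOF.
	 */
	static int example_zero_eof_block(struct inode *inode, loff_t new_size)
	{
		loff_t tail = round_up(new_size, i_blocksize(inode)) - new_size;

		if (!tail)
			return 0;
		return iomap_zero_range(inode, new_size, tail, NULL,
					&ext4_iomap_buffered_ops);
	}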

Thanks,
Yi.

Zhang Yi (10):
xfs: match lock mode in xfs_buffered_write_iomap_begin()
xfs: allow xfs_bmapi_convert_delalloc() to pass NULL seq
xfs: make xfs_bmapi_convert_delalloc() to allocate the target offset
xfs: drop xfs_convert_blocks()
xfs: convert delayed extents to unwritten when zeroing post eof blocks
iomap: drop the write failure handles when unsharing and zeroing
iomap: don't increase i_size if it's not a write operation
iomap: use a new variable to handle the written bytes in
iomap_write_iter()
iomap: make block_write_end() return a boolean
iomap: do some small logical cleanup in buffered write

fs/iomap/buffered-io.c | 101 ++++++++++++++++++++++-----------------
fs/xfs/libxfs/xfs_bmap.c | 40 ++++++++++++++--
fs/xfs/xfs_aops.c | 54 ++++++---------------
fs/xfs/xfs_iomap.c | 39 +++++++++++++--
4 files changed, 142 insertions(+), 92 deletions(-)

--
2.39.2



2024-03-15 13:02:17

by Zhang Yi

[permalink] [raw]
Subject: [PATCH v2 04/10] xfs: drop xfs_convert_blocks()

From: Zhang Yi <[email protected]>

Since xfs_bmapi_convert_delalloc() now makes sure the target offset
gets converted, it's time to drop xfs_convert_blocks().

Signed-off-by: Zhang Yi <[email protected]>
---
fs/xfs/xfs_aops.c | 41 ++++++++++++++---------------------------
1 file changed, 14 insertions(+), 27 deletions(-)

diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 376ec0993943..6479e0dac69d 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -233,32 +233,6 @@ xfs_imap_valid(
return true;
}

-/*
- * Pass in a dellalloc extent and convert it to real extents, return the real
- * extent that maps offset_fsb in wpc->iomap.
- *
- * The current page is held locked so nothing could have removed the block
- * backing offset_fsb, although it could have moved from the COW to the data
- * fork by another thread.
- */
-static int
-xfs_convert_blocks(
- struct iomap_writepage_ctx *wpc,
- struct xfs_inode *ip,
- int whichfork,
- loff_t offset)
-{
- unsigned *seq;
-
- if (whichfork == XFS_COW_FORK)
- seq = &XFS_WPC(wpc)->cow_seq;
- else
- seq = &XFS_WPC(wpc)->data_seq;
-
- return xfs_bmapi_convert_delalloc(ip, whichfork, offset,
- &wpc->iomap, seq);
-}
-
static int
xfs_map_blocks(
struct iomap_writepage_ctx *wpc,
@@ -276,6 +250,7 @@ xfs_map_blocks(
struct xfs_iext_cursor icur;
int retries = 0;
int error = 0;
+ unsigned int *seq;

if (xfs_is_shutdown(mp))
return -EIO;
@@ -373,7 +348,19 @@ xfs_map_blocks(
trace_xfs_map_blocks_found(ip, offset, count, whichfork, &imap);
return 0;
allocate_blocks:
- error = xfs_convert_blocks(wpc, ip, whichfork, offset);
+ /*
+ * Convert a delalloc extent to a real one. The current page is held
+ * locked so nothing could have removed the block backing offset_fsb,
+ * although it could have moved from the COW to the data fork by another
+ * thread.
+ */
+ if (whichfork == XFS_COW_FORK)
+ seq = &XFS_WPC(wpc)->cow_seq;
+ else
+ seq = &XFS_WPC(wpc)->data_seq;
+
+ error = xfs_bmapi_convert_delalloc(ip, whichfork, offset,
+ &wpc->iomap, seq);
if (error) {
/*
* If we failed to find the extent in the COW fork we might have
--
2.39.2


2024-03-15 13:02:26

by Zhang Yi

[permalink] [raw]
Subject: [PATCH v2 03/10] xfs: make xfs_bmapi_convert_delalloc() to allocate the target offset

From: Zhang Yi <[email protected]>

A single call to xfs_bmapi_convert_delalloc() may convert only part of
the delalloc extent, so it can take multiple invocations to allocate
the target offset. xfs_convert_blocks() adds a loop to do this job and
we call it in the writeback path, but xfs_convert_blocks() isn't a
common helper. Let's do the looping in xfs_bmapi_convert_delalloc()
itself, preparing for converting post-EOF delalloc blocks in the
buffered write begin path.
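
With the loop folded in, a single invocation is enough to guarantee
that the target offset gets converted. A hedged sketch of the resulting
call pattern (passing a NULL seq relies on patch 02 of this series;
'ip' and 'offset' come from the surrounding context and the snippet is
illustrative only):

	struct iomap iomap;
	int error;

	/*
	 * Convert whatever delalloc extent currently backs 'offset'; on
	 * success the returned iomap is guaranteed to cover 'offset'.
	 */
	error = xfs_bmapi_convert_delalloc(ip, XFS_DATA_FORK, offset,
					   &iomap, NULL);
	if (error)
		return error;
	/* here: iomap.offset <= offset < iomap.offset + iomap.length */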

Signed-off-by: Zhang Yi <[email protected]>
---
fs/xfs/libxfs/xfs_bmap.c | 34 ++++++++++++++++++++++++++++++++--
fs/xfs/xfs_aops.c | 15 +--------------
2 files changed, 33 insertions(+), 16 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 07dc35de8ce5..042e8d3ab0ba 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -4516,8 +4516,8 @@ xfs_bmapi_write(
* invocations to allocate the target offset if a large enough physical extent
* is not available.
*/
-int
-xfs_bmapi_convert_delalloc(
+static int
+__xfs_bmapi_convert_delalloc(
struct xfs_inode *ip,
int whichfork,
xfs_off_t offset,
@@ -4648,6 +4648,36 @@ xfs_bmapi_convert_delalloc(
return error;
}

+/*
+ * Pass in a delalloc extent and convert it to real extents, return the real
+ * extent that maps offset_fsb in iomap.
+ */
+int
+xfs_bmapi_convert_delalloc(
+ struct xfs_inode *ip,
+ int whichfork,
+ loff_t offset,
+ struct iomap *iomap,
+ unsigned int *seq)
+{
+ int error;
+
+ /*
+ * Attempt to allocate whatever delalloc extent currently backs offset
+ * and put the result into iomap. Allocate in a loop because it may
+ * take several attempts to allocate real blocks for a contiguous
+ * delalloc extent if free space is sufficiently fragmented.
+ */
+ do {
+ error = __xfs_bmapi_convert_delalloc(ip, whichfork, offset,
+ iomap, seq);
+ if (error)
+ return error;
+ } while (iomap->offset + iomap->length <= offset);
+
+ return 0;
+}
+
int
xfs_bmapi_remap(
struct xfs_trans *tp,
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 813f85156b0c..376ec0993943 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -248,7 +248,6 @@ xfs_convert_blocks(
int whichfork,
loff_t offset)
{
- int error;
unsigned *seq;

if (whichfork == XFS_COW_FORK)
@@ -256,20 +255,8 @@ xfs_convert_blocks(
else
seq = &XFS_WPC(wpc)->data_seq;

- /*
- * Attempt to allocate whatever delalloc extent currently backs offset
- * and put the result into wpc->iomap. Allocate in a loop because it
- * may take several attempts to allocate real blocks for a contiguous
- * delalloc extent if free space is sufficiently fragmented.
- */
- do {
- error = xfs_bmapi_convert_delalloc(ip, whichfork, offset,
+ return xfs_bmapi_convert_delalloc(ip, whichfork, offset,
&wpc->iomap, seq);
- if (error)
- return error;
- } while (wpc->iomap.offset + wpc->iomap.length <= offset);
-
- return 0;
}

static int
--
2.39.2


2024-03-15 13:03:06

by Zhang Yi

[permalink] [raw]
Subject: [PATCH v2 08/10] iomap: use a new variable to handle the written bytes in iomap_write_iter()

From: Zhang Yi <[email protected]>

In iomap_write_iter(), the status variable used to receive the return
value from iomap_write_end() is confusing; replace it with a new
'written' variable to represent the bytes written in each cycle. No
logic changes.

Signed-off-by: Zhang Yi <[email protected]>
---
fs/iomap/buffered-io.c | 33 +++++++++++++++++----------------
1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index e9112dc78d15..291648c61a32 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -851,7 +851,7 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
loff_t length = iomap_length(iter);
size_t chunk = PAGE_SIZE << MAX_PAGECACHE_ORDER;
loff_t pos = iter->pos;
- ssize_t written = 0;
+ ssize_t total_written = 0;
long status = 0;
struct address_space *mapping = iter->inode->i_mapping;
unsigned int bdp_flags = (iter->flags & IOMAP_NOWAIT) ? BDP_ASYNC : 0;
@@ -862,6 +862,7 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
size_t offset; /* Offset into folio */
size_t bytes; /* Bytes to write to folio */
size_t copied; /* Bytes copied from user */
+ size_t written; /* Bytes that have been written */

bytes = iov_iter_count(i);
retry:
@@ -906,7 +907,7 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
flush_dcache_folio(folio);

copied = copy_folio_from_iter_atomic(folio, offset, bytes, i);
- status = iomap_write_end(iter, pos, bytes, copied, folio);
+ written = iomap_write_end(iter, pos, bytes, copied, folio);

/*
* Update the in-memory inode size after copying the data into
@@ -916,22 +917,22 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
* unlock and release the folio.
*/
old_size = iter->inode->i_size;
- if (pos + status > old_size) {
- i_size_write(iter->inode, pos + status);
+ if (pos + written > old_size) {
+ i_size_write(iter->inode, pos + written);
iter->iomap.flags |= IOMAP_F_SIZE_CHANGED;
}
- __iomap_put_folio(iter, pos, status, folio);
+ __iomap_put_folio(iter, pos, written, folio);

if (old_size < pos)
pagecache_isize_extended(iter->inode, old_size, pos);
- if (status < bytes)
- iomap_write_failed(iter->inode, pos + status,
- bytes - status);
- if (unlikely(copied != status))
- iov_iter_revert(i, copied - status);
+ if (written < bytes)
+ iomap_write_failed(iter->inode, pos + written,
+ bytes - written);
+ if (unlikely(copied != written))
+ iov_iter_revert(i, copied - written);

cond_resched();
- if (unlikely(status == 0)) {
+ if (unlikely(written == 0)) {
/*
* A short copy made iomap_write_end() reject the
* thing entirely. Might be memory poisoning
@@ -945,17 +946,17 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
goto retry;
}
} else {
- pos += status;
- written += status;
- length -= status;
+ pos += written;
+ total_written += written;
+ length -= written;
}
} while (iov_iter_count(i) && length);

if (status == -EAGAIN) {
- iov_iter_revert(i, written);
+ iov_iter_revert(i, total_written);
return -EAGAIN;
}
- return written ? written : status;
+ return total_written ? total_written : status;
}

ssize_t
--
2.39.2


2024-03-15 13:03:13

by Zhang Yi

[permalink] [raw]
Subject: [PATCH v2 06/10] iomap: drop the write failure handles when unsharing and zeroing

From: Zhang Yi <[email protected]>

Unsharing and zeroing can only happen within EOF, so there is never a
need to perform post-EOF pagecache truncation if write begin fails. In
addition, a partial write can theoretically never happen in
iomap_write_end() for these operations, so remove both failure handlers
from the common helpers and let iomap_write_iter() handle the buffered
write failure itself.
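
For context, a sketch of why a partial write cannot theoretically
happen for these two operations (reasoning only, not code from this
patch):

	/*
	 * iomap_unshare_iter() and iomap_zero_iter() do, in essence:
	 *
	 *	status = iomap_write_begin(iter, pos, bytes, &folio);
	 *	...
	 *	bytes = iomap_write_end(iter, pos, bytes, bytes, folio);
	 *
	 * Both pass copied == len, so the short-copy bailout in
	 * __iomap_write_end() ("copied < len && !uptodate") can never
	 * trigger, and iomap_write_end() cannot return a partial length.
	 */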

Signed-off-by: Zhang Yi <[email protected]>
---
fs/iomap/buffered-io.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 093c4515b22a..7e32a204650b 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -786,7 +786,6 @@ static int iomap_write_begin(struct iomap_iter *iter, loff_t pos,

out_unlock:
__iomap_put_folio(iter, pos, 0, folio);
- iomap_write_failed(iter->inode, pos, len);

return status;
}
@@ -863,8 +862,6 @@ static size_t iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,

if (old_size < pos)
pagecache_isize_extended(iter->inode, old_size, pos);
- if (ret < len)
- iomap_write_failed(iter->inode, pos + ret, len - ret);
return ret;
}

@@ -912,8 +909,10 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
}

status = iomap_write_begin(iter, pos, bytes, &folio);
- if (unlikely(status))
+ if (unlikely(status)) {
+ iomap_write_failed(iter->inode, pos, bytes);
break;
+ }
if (iter->iomap.flags & IOMAP_F_STALE)
break;

@@ -927,6 +926,9 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
copied = copy_folio_from_iter_atomic(folio, offset, bytes, i);
status = iomap_write_end(iter, pos, bytes, copied, folio);

+ if (status < bytes)
+ iomap_write_failed(iter->inode, pos + status,
+ bytes - status);
if (unlikely(copied != status))
iov_iter_revert(i, copied - status);

--
2.39.2


2024-03-15 13:03:19

by Zhang Yi

[permalink] [raw]
Subject: [PATCH v2 01/10] xfs: match lock mode in xfs_buffered_write_iomap_begin()

From: Zhang Yi <[email protected]>

Commit 1aa91d9c9933 ("xfs: Add async buffered write support") replaced
xfs_ilock(XFS_ILOCK_EXCL) with xfs_ilock_for_iomap() when locking the
inode for writing, and a new variable, lockmode, is used to indicate
the lock mode. Although lockmode should always be XFS_ILOCK_EXCL, it's
still better to use this variable instead of using XFS_ILOCK_EXCL
directly when unlocking the inode.

Fixes: 1aa91d9c9933 ("xfs: Add async buffered write support")
Signed-off-by: Zhang Yi <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
fs/xfs/xfs_iomap.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 18c8f168b153..ccf83e72d8ca 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1149,13 +1149,13 @@ xfs_buffered_write_iomap_begin(
* them out if the write happens to fail.
*/
seq = xfs_iomap_inode_sequence(ip, IOMAP_F_NEW);
- xfs_iunlock(ip, XFS_ILOCK_EXCL);
+ xfs_iunlock(ip, lockmode);
trace_xfs_iomap_alloc(ip, offset, count, allocfork, &imap);
return xfs_bmbt_to_iomap(ip, iomap, &imap, flags, IOMAP_F_NEW, seq);

found_imap:
seq = xfs_iomap_inode_sequence(ip, 0);
- xfs_iunlock(ip, XFS_ILOCK_EXCL);
+ xfs_iunlock(ip, lockmode);
return xfs_bmbt_to_iomap(ip, iomap, &imap, flags, 0, seq);

found_cow:
@@ -1165,17 +1165,17 @@ xfs_buffered_write_iomap_begin(
if (error)
goto out_unlock;
seq = xfs_iomap_inode_sequence(ip, IOMAP_F_SHARED);
- xfs_iunlock(ip, XFS_ILOCK_EXCL);
+ xfs_iunlock(ip, lockmode);
return xfs_bmbt_to_iomap(ip, iomap, &cmap, flags,
IOMAP_F_SHARED, seq);
}

xfs_trim_extent(&cmap, offset_fsb, imap.br_startoff - offset_fsb);
- xfs_iunlock(ip, XFS_ILOCK_EXCL);
+ xfs_iunlock(ip, lockmode);
return xfs_bmbt_to_iomap(ip, iomap, &cmap, flags, 0, seq);

out_unlock:
- xfs_iunlock(ip, XFS_ILOCK_EXCL);
+ xfs_iunlock(ip, lockmode);
return error;
}

--
2.39.2


2024-03-15 13:03:22

by Zhang Yi

[permalink] [raw]
Subject: [PATCH v2 10/10] iomap: do some small logical cleanup in buffered write

From: Zhang Yi <[email protected]>

Since iomap_write_end() can never return a partial write length, the
comparison between written, copied and bytes becomes useless; just
merge those failure branches into the one that handles written == 0.

Signed-off-by: Zhang Yi <[email protected]>
---
fs/iomap/buffered-io.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 004673ea8bc1..f2fb89056259 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -937,11 +937,6 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)

if (old_size < pos)
pagecache_isize_extended(iter->inode, old_size, pos);
- if (written < bytes)
- iomap_write_failed(iter->inode, pos + written,
- bytes - written);
- if (unlikely(copied != written))
- iov_iter_revert(i, copied - written);

cond_resched();
if (unlikely(written == 0)) {
@@ -951,6 +946,9 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
* halfway through, might be a race with munmap,
* might be severe memory pressure.
*/
+ iomap_write_failed(iter->inode, pos, bytes);
+ iov_iter_revert(i, copied);
+
if (chunk > PAGE_SIZE)
chunk /= 2;
if (copied) {
--
2.39.2


2024-03-15 13:04:53

by Zhang Yi

[permalink] [raw]
Subject: [PATCH v2 07/10] iomap: don't increase i_size if it's not a write operation

From: Zhang Yi <[email protected]>

Increasing i_size in iomap_zero_range() and iomap_unshare_iter() is not
needed; the caller should handle it. In particular, when truncating a
partial block, we should not increase i_size beyond the new EOF here.
This doesn't affect xfs and gfs2 now because they set the new file size
after zeroing out, so a transient increase in i_size doesn't matter,
but it will affect ext4 because it sets the file size before
truncating. So move the i_size updating logic to iomap_write_iter().
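
To illustrate the hazard this avoids (a hypothetical ext4-style
sequence, not code from this series):

	/*
	 * ext4-style truncate order: shrink i_size first, then zero the
	 * tail of the now-partial EOF block.
	 *
	 *	i_size_write(inode, new_size);
	 *	iomap_zero_range(inode, new_size, tail_len, NULL, ops);
	 *
	 * If iomap_write_end() still performed "if (pos + ret > i_size)
	 * i_size_write(inode, pos + ret)", zeroing the block tail would
	 * push i_size back beyond new_size. With the update moved to
	 * iomap_write_iter(), only real buffered writes grow i_size.
	 */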

Signed-off-by: Zhang Yi <[email protected]>
---
fs/iomap/buffered-io.c | 50 +++++++++++++++++++++---------------------
1 file changed, 25 insertions(+), 25 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 7e32a204650b..e9112dc78d15 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -837,32 +837,13 @@ static size_t iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
size_t copied, struct folio *folio)
{
const struct iomap *srcmap = iomap_iter_srcmap(iter);
- loff_t old_size = iter->inode->i_size;
- size_t ret;
-
- if (srcmap->type == IOMAP_INLINE) {
- ret = iomap_write_end_inline(iter, folio, pos, copied);
- } else if (srcmap->flags & IOMAP_F_BUFFER_HEAD) {
- ret = block_write_end(NULL, iter->inode->i_mapping, pos, len,
- copied, &folio->page, NULL);
- } else {
- ret = __iomap_write_end(iter->inode, pos, len, copied, folio);
- }
-
- /*
- * Update the in-memory inode size after copying the data into the page
- * cache. It's up to the file system to write the updated size to disk,
- * preferably after I/O completion so that no stale data is exposed.
- */
- if (pos + ret > old_size) {
- i_size_write(iter->inode, pos + ret);
- iter->iomap.flags |= IOMAP_F_SIZE_CHANGED;
- }
- __iomap_put_folio(iter, pos, ret, folio);

- if (old_size < pos)
- pagecache_isize_extended(iter->inode, old_size, pos);
- return ret;
+ if (srcmap->type == IOMAP_INLINE)
+ return iomap_write_end_inline(iter, folio, pos, copied);
+ if (srcmap->flags & IOMAP_F_BUFFER_HEAD)
+ return block_write_end(NULL, iter->inode->i_mapping, pos, len,
+ copied, &folio->page, NULL);
+ return __iomap_write_end(iter->inode, pos, len, copied, folio);
}

static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
@@ -877,6 +858,7 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)

do {
struct folio *folio;
+ loff_t old_size;
size_t offset; /* Offset into folio */
size_t bytes; /* Bytes to write to folio */
size_t copied; /* Bytes copied from user */
@@ -926,6 +908,22 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
copied = copy_folio_from_iter_atomic(folio, offset, bytes, i);
status = iomap_write_end(iter, pos, bytes, copied, folio);

+ /*
+ * Update the in-memory inode size after copying the data into
+ * the page cache. It's up to the file system to write the
+ * updated size to disk, preferably after I/O completion so that
+ * no stale data is exposed. Only once that's done can we
+ * unlock and release the folio.
+ */
+ old_size = iter->inode->i_size;
+ if (pos + status > old_size) {
+ i_size_write(iter->inode, pos + status);
+ iter->iomap.flags |= IOMAP_F_SIZE_CHANGED;
+ }
+ __iomap_put_folio(iter, pos, status, folio);
+
+ if (old_size < pos)
+ pagecache_isize_extended(iter->inode, old_size, pos);
if (status < bytes)
iomap_write_failed(iter->inode, pos + status,
bytes - status);
@@ -1298,6 +1296,7 @@ static loff_t iomap_unshare_iter(struct iomap_iter *iter)
bytes = folio_size(folio) - offset;

bytes = iomap_write_end(iter, pos, bytes, bytes, folio);
+ __iomap_put_folio(iter, pos, bytes, folio);
if (WARN_ON_ONCE(bytes == 0))
return -EIO;

@@ -1362,6 +1361,7 @@ static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
folio_mark_accessed(folio);

bytes = iomap_write_end(iter, pos, bytes, bytes, folio);
+ __iomap_put_folio(iter, pos, bytes, folio);
if (WARN_ON_ONCE(bytes == 0))
return -EIO;

--
2.39.2


2024-03-15 13:05:53

by Zhang Yi

[permalink] [raw]
Subject: [PATCH v2 09/10] iomap: make block_write_end() return a boolean

From: Zhang Yi <[email protected]>

For now, we can make sure iomap_write_end() always returns either 0 or
the copied bytes, so instead of returning the written bytes, convert it
to return a boolean that indicates whether the copied bytes have been
written to the pagecache.

Signed-off-by: Zhang Yi <[email protected]>
---
fs/iomap/buffered-io.c | 50 +++++++++++++++++++++++++++---------------
1 file changed, 32 insertions(+), 18 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 291648c61a32..004673ea8bc1 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -790,7 +790,7 @@ static int iomap_write_begin(struct iomap_iter *iter, loff_t pos,
return status;
}

-static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
+static bool __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
size_t copied, struct folio *folio)
{
flush_dcache_folio(folio);
@@ -807,14 +807,14 @@ static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
* redo the whole thing.
*/
if (unlikely(copied < len && !folio_test_uptodate(folio)))
- return 0;
+ return false;
iomap_set_range_uptodate(folio, offset_in_folio(folio, pos), len);
iomap_set_range_dirty(folio, offset_in_folio(folio, pos), copied);
filemap_dirty_folio(inode->i_mapping, folio);
- return copied;
+ return true;
}

-static size_t iomap_write_end_inline(const struct iomap_iter *iter,
+static void iomap_write_end_inline(const struct iomap_iter *iter,
struct folio *folio, loff_t pos, size_t copied)
{
const struct iomap *iomap = &iter->iomap;
@@ -829,21 +829,32 @@ static size_t iomap_write_end_inline(const struct iomap_iter *iter,
kunmap_local(addr);

mark_inode_dirty(iter->inode);
- return copied;
}

-/* Returns the number of bytes copied. May be 0. Cannot be an errno. */
-static size_t iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
+/*
+ * Returns true if all copied bytes have been written to the pagecache,
+ * otherwise return false.
+ */
+static bool iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
size_t copied, struct folio *folio)
{
const struct iomap *srcmap = iomap_iter_srcmap(iter);
+ bool ret = true;

- if (srcmap->type == IOMAP_INLINE)
- return iomap_write_end_inline(iter, folio, pos, copied);
- if (srcmap->flags & IOMAP_F_BUFFER_HEAD)
- return block_write_end(NULL, iter->inode->i_mapping, pos, len,
- copied, &folio->page, NULL);
- return __iomap_write_end(iter->inode, pos, len, copied, folio);
+ if (srcmap->type == IOMAP_INLINE) {
+ iomap_write_end_inline(iter, folio, pos, copied);
+ } else if (srcmap->flags & IOMAP_F_BUFFER_HEAD) {
+ size_t bh_written;
+
+ bh_written = block_write_end(NULL, iter->inode->i_mapping, pos,
+ len, copied, &folio->page, NULL);
+ WARN_ON_ONCE(bh_written != copied && bh_written != 0);
+ ret = bh_written == copied;
+ } else {
+ ret = __iomap_write_end(iter->inode, pos, len, copied, folio);
+ }
+
+ return ret;
}

static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
@@ -907,7 +918,8 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
flush_dcache_folio(folio);

copied = copy_folio_from_iter_atomic(folio, offset, bytes, i);
- written = iomap_write_end(iter, pos, bytes, copied, folio);
+ written = iomap_write_end(iter, pos, bytes, copied, folio) ?
+ copied : 0;

/*
* Update the in-memory inode size after copying the data into
@@ -1285,6 +1297,7 @@ static loff_t iomap_unshare_iter(struct iomap_iter *iter)
int status;
size_t offset;
size_t bytes = min_t(u64, SIZE_MAX, length);
+ bool ret;

status = iomap_write_begin(iter, pos, bytes, &folio);
if (unlikely(status))
@@ -1296,9 +1309,9 @@ static loff_t iomap_unshare_iter(struct iomap_iter *iter)
if (bytes > folio_size(folio) - offset)
bytes = folio_size(folio) - offset;

- bytes = iomap_write_end(iter, pos, bytes, bytes, folio);
+ ret = iomap_write_end(iter, pos, bytes, bytes, folio);
__iomap_put_folio(iter, pos, bytes, folio);
- if (WARN_ON_ONCE(bytes == 0))
+ if (WARN_ON_ONCE(!ret))
return -EIO;

cond_resched();
@@ -1347,6 +1360,7 @@ static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
int status;
size_t offset;
size_t bytes = min_t(u64, SIZE_MAX, length);
+ bool ret;

status = iomap_write_begin(iter, pos, bytes, &folio);
if (status)
@@ -1361,9 +1375,9 @@ static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
folio_zero_range(folio, offset, bytes);
folio_mark_accessed(folio);

- bytes = iomap_write_end(iter, pos, bytes, bytes, folio);
+ ret = iomap_write_end(iter, pos, bytes, bytes, folio);
__iomap_put_folio(iter, pos, bytes, folio);
- if (WARN_ON_ONCE(bytes == 0))
+ if (WARN_ON_ONCE(!ret))
return -EIO;

pos += bytes;
--
2.39.2


2024-03-18 02:51:32

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v2 04/10] xfs: drop xfs_convert_blocks()

Maybe just fold this into the previous patch?

Otherwise this looks good to me.

2024-03-18 06:38:47

by Zhang Yi

[permalink] [raw]
Subject: Re: [PATCH v2 04/10] xfs: drop xfs_convert_blocks()

On 2024/3/18 9:33, Christoph Hellwig wrote:
> Maybe just fold this into the previous patch?
>
> Otherwise this looks good to me.
>
Okay, it's fine by me to fold them.

Thanks,
Yi.