2024-03-20 11:14:34

by Zhang Yi

[permalink] [raw]
Subject: [PATCH v4 6/9] iomap: don't increase i_size if it's not a write operation

From: Zhang Yi <[email protected]>

Increase i_size in iomap_zero_range() and iomap_unshare_iter() is not
needed, the caller should handle it. Especially, when truncate partial
block, we should not increase i_size beyond the new EOF here. It doesn't
affect xfs and gfs2 now because they set the new file size after zero
out, it doesn't matter that a transient increase in i_size, but it will
affect ext4 because it set file size before truncate. So move the i_size
updating logic to iomap_write_iter().

Signed-off-by: Zhang Yi <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
---
fs/iomap/buffered-io.c | 50 +++++++++++++++++++++---------------------
1 file changed, 25 insertions(+), 25 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 7e32a204650b..e9112dc78d15 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -837,32 +837,13 @@ static size_t iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
size_t copied, struct folio *folio)
{
const struct iomap *srcmap = iomap_iter_srcmap(iter);
- loff_t old_size = iter->inode->i_size;
- size_t ret;
-
- if (srcmap->type == IOMAP_INLINE) {
- ret = iomap_write_end_inline(iter, folio, pos, copied);
- } else if (srcmap->flags & IOMAP_F_BUFFER_HEAD) {
- ret = block_write_end(NULL, iter->inode->i_mapping, pos, len,
- copied, &folio->page, NULL);
- } else {
- ret = __iomap_write_end(iter->inode, pos, len, copied, folio);
- }
-
- /*
- * Update the in-memory inode size after copying the data into the page
- * cache. It's up to the file system to write the updated size to disk,
- * preferably after I/O completion so that no stale data is exposed.
- */
- if (pos + ret > old_size) {
- i_size_write(iter->inode, pos + ret);
- iter->iomap.flags |= IOMAP_F_SIZE_CHANGED;
- }
- __iomap_put_folio(iter, pos, ret, folio);

- if (old_size < pos)
- pagecache_isize_extended(iter->inode, old_size, pos);
- return ret;
+ if (srcmap->type == IOMAP_INLINE)
+ return iomap_write_end_inline(iter, folio, pos, copied);
+ if (srcmap->flags & IOMAP_F_BUFFER_HEAD)
+ return block_write_end(NULL, iter->inode->i_mapping, pos, len,
+ copied, &folio->page, NULL);
+ return __iomap_write_end(iter->inode, pos, len, copied, folio);
}

static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
@@ -877,6 +858,7 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)

do {
struct folio *folio;
+ loff_t old_size;
size_t offset; /* Offset into folio */
size_t bytes; /* Bytes to write to folio */
size_t copied; /* Bytes copied from user */
@@ -926,6 +908,22 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
copied = copy_folio_from_iter_atomic(folio, offset, bytes, i);
status = iomap_write_end(iter, pos, bytes, copied, folio);

+ /*
+ * Update the in-memory inode size after copying the data into
+ * the page cache. It's up to the file system to write the
+ * updated size to disk, preferably after I/O completion so that
+ * no stale data is exposed. Only once that's done can we
+ * unlock and release the folio.
+ */
+ old_size = iter->inode->i_size;
+ if (pos + status > old_size) {
+ i_size_write(iter->inode, pos + status);
+ iter->iomap.flags |= IOMAP_F_SIZE_CHANGED;
+ }
+ __iomap_put_folio(iter, pos, status, folio);
+
+ if (old_size < pos)
+ pagecache_isize_extended(iter->inode, old_size, pos);
if (status < bytes)
iomap_write_failed(iter->inode, pos + status,
bytes - status);
@@ -1298,6 +1296,7 @@ static loff_t iomap_unshare_iter(struct iomap_iter *iter)
bytes = folio_size(folio) - offset;

bytes = iomap_write_end(iter, pos, bytes, bytes, folio);
+ __iomap_put_folio(iter, pos, bytes, folio);
if (WARN_ON_ONCE(bytes == 0))
return -EIO;

@@ -1362,6 +1361,7 @@ static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
folio_mark_accessed(folio);

bytes = iomap_write_end(iter, pos, bytes, bytes, folio);
+ __iomap_put_folio(iter, pos, bytes, folio);
if (WARN_ON_ONCE(bytes == 0))
return -EIO;

--
2.39.2



2024-04-19 06:11:01

by Chandan Babu R

[permalink] [raw]
Subject: Re: [PATCH v4 6/9] iomap: don't increase i_size if it's not a write operation

On Wed, Mar 20, 2024 at 07:05:45 PM +0800, Zhang Yi wrote:
> From: Zhang Yi <[email protected]>
>
> Increase i_size in iomap_zero_range() and iomap_unshare_iter() is not
> needed, the caller should handle it. Especially, when truncate partial
> block, we should not increase i_size beyond the new EOF here. It doesn't
> affect xfs and gfs2 now because they set the new file size after zero
> out, it doesn't matter that a transient increase in i_size, but it will
> affect ext4 because it set file size before truncate. So move the i_size
> updating logic to iomap_write_iter().
>

This patch causes generic/522 to consistently fail when using the following
fstest configuration,

TEST_DEV=/dev/loop16
TEST_LOGDEV=/dev/loop13
SCRATCH_DEV_POOL="/dev/loop5 /dev/loop6 /dev/loop7 /dev/loop8 /dev/loop9 /dev/loop10 /dev/loop11 /dev/loop12"
MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1, -lsize=1g'
MOUNT_OPTIONS='-o usrquota,grpquota,prjquota'
TEST_FS_MOUNT_OPTS="$TEST_FS_MOUNT_OPTS -o usrquota,grpquota,prjquota"
TEST_FS_MOUNT_OPTS="-o logdev=/dev/loop13"
SCRATCH_LOGDEV=/dev/loop15
USE_EXTERNAL=yes
LOGWRITES_DEV=/dev/loop15

--
Chandan

2024-04-19 08:15:07

by Zhang Yi

[permalink] [raw]
Subject: Re: [PATCH v4 6/9] iomap: don't increase i_size if it's not a write operation

On 2024/4/19 14:07, Chandan Babu R wrote:
> On Wed, Mar 20, 2024 at 07:05:45 PM +0800, Zhang Yi wrote:
>> From: Zhang Yi <[email protected]>
>>
>> Increase i_size in iomap_zero_range() and iomap_unshare_iter() is not
>> needed, the caller should handle it. Especially, when truncate partial
>> block, we should not increase i_size beyond the new EOF here. It doesn't
>> affect xfs and gfs2 now because they set the new file size after zero
>> out, it doesn't matter that a transient increase in i_size, but it will
>> affect ext4 because it set file size before truncate. So move the i_size
>> updating logic to iomap_write_iter().
>>
>
> This patch causes generic/522 to consistently fail when using the following
> fstest configuration,
>
> TEST_DEV=/dev/loop16
> TEST_LOGDEV=/dev/loop13
> SCRATCH_DEV_POOL="/dev/loop5 /dev/loop6 /dev/loop7 /dev/loop8 /dev/loop9 /dev/loop10 /dev/loop11 /dev/loop12"
> MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1, -lsize=1g'
> MOUNT_OPTIONS='-o usrquota,grpquota,prjquota'
> TEST_FS_MOUNT_OPTS="$TEST_FS_MOUNT_OPTS -o usrquota,grpquota,prjquota"
> TEST_FS_MOUNT_OPTS="-o logdev=/dev/loop13"
> SCRATCH_LOGDEV=/dev/loop15
> USE_EXTERNAL=yes
> LOGWRITES_DEV=/dev/loop15
>

Sorry for the regression, I didn't notice this issue last time I ran
this test. It looks like the 004 patch doesn't work for some cases.
I can reproduce it on my machine now, and I'll take a look at this
and fix it soon.

Thanks,
Yi.


2024-04-23 11:30:29

by Zhang Yi

[permalink] [raw]
Subject: Re: [PATCH v4 6/9] iomap: don't increase i_size if it's not a write operation

On 2024/4/19 14:07, Chandan Babu R wrote:
> On Wed, Mar 20, 2024 at 07:05:45 PM +0800, Zhang Yi wrote:
>> From: Zhang Yi <[email protected]>
>>
>> Increase i_size in iomap_zero_range() and iomap_unshare_iter() is not
>> needed, the caller should handle it. Especially, when truncate partial
>> block, we should not increase i_size beyond the new EOF here. It doesn't
>> affect xfs and gfs2 now because they set the new file size after zero
>> out, it doesn't matter that a transient increase in i_size, but it will
>> affect ext4 because it set file size before truncate. So move the i_size
>> updating logic to iomap_write_iter().
>>
>
> This patch causes generic/522 to consistently fail when using the following
> fstest configuration,
>
> TEST_DEV=/dev/loop16
> TEST_LOGDEV=/dev/loop13
> SCRATCH_DEV_POOL="/dev/loop5 /dev/loop6 /dev/loop7 /dev/loop8 /dev/loop9 /dev/loop10 /dev/loop11 /dev/loop12"
> MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1, -lsize=1g'
> MOUNT_OPTIONS='-o usrquota,grpquota,prjquota'
> TEST_FS_MOUNT_OPTS="$TEST_FS_MOUNT_OPTS -o usrquota,grpquota,prjquota"
> TEST_FS_MOUNT_OPTS="-o logdev=/dev/loop13"
> SCRATCH_LOGDEV=/dev/loop15
> USE_EXTERNAL=yes
> LOGWRITES_DEV=/dev/loop15
>

Hello!

The root cause of the problem is caused by patch 4/9, this patch is fine,
I've revised the patch 4/9 and send it out separately in reply to v4. I've
tested that on my machine for over 100 rounds on generic/522 and it hasn't
failed again. Please take a look at the v5 patch for details and test it
on your machine.

https://lore.kernel.org/linux-xfs/[email protected]/T/#m2da33e643b642071aa20077321e2c431f5a64e38

Thanks,
Yi.