If ext4 converts an inline file to extents when applying writes under
delayed allocation that exceed the available inline storage, one or
more delayed allocated extents may be stored in the extent status cache
with an accompanying increase in the reserved block count. If the file
is subsequently truncated before writeback occurs, that inode's delayed
allocated extents will not be removed from the extent status cache and
the reserved block count will not be reduced as required after
truncation. At minimum, unexpected ENOSPC conditions can occur.
Eric Whitney (2):
ext4: remove extent cache entries when truncating inline data
ext4: enforce buffer head state assertion in ext4_da_map_blocks
fs/ext4/inline.c | 19 +++++++++++++++++++
fs/ext4/inode.c | 15 +++++++++------
2 files changed, 28 insertions(+), 6 deletions(-)
--
2.20.1
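
For readers who want to exercise the scenario described above outside of
xfstests, the sequence of operations can be sketched from userspace as
shown below. This is only a sketch under stated assumptions:
inline_then_truncate() and its parameters are hypothetical names that are
not part of this series. Whether it reaches the affected path depends on
the file living on an ext4 filesystem with the inline_data feature and
delayed allocation enabled, and on the chosen sizes relative to the
available inline storage.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of this series.
 *
 * Writes `small` bytes (intended to fit in inline storage), rewrites the
 * file with `large` bytes (intended to exceed inline storage), and then
 * truncates to `new_size` without calling fsync(), so writeback has most
 * likely not run yet. The file is unlinked before returning.
 *
 * Returns the file size observed after truncation, or -1 on error.
 */
static long long inline_then_truncate(const char *path, size_t small,
				      size_t large, off_t new_size)
{
	struct stat st;
	long long ret = -1;
	char *buf;
	int fd;

	assert(small < large);

	buf = malloc(large);
	if (!buf)
		return -1;
	memset(buf, 'a', large);

	fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0644);
	if (fd < 0)
		goto out_free;

	/* First write: small enough that it may be stored inline. */
	if (pwrite(fd, buf, small, 0) != (ssize_t)small)
		goto out_close;
	/* Second write: large enough that inline storage may be exceeded. */
	if (pwrite(fd, buf, large, 0) != (ssize_t)large)
		goto out_close;
	/* Truncate before any explicit flush of the dirty data. */
	if (ftruncate(fd, new_size) != 0)
		goto out_close;
	if (fstat(fd, &st) == 0)
		ret = (long long)st.st_size;

out_close:
	close(fd);
	unlink(path);
out_free:
	free(buf);
	return ret;
}
```

A call such as inline_then_truncate(path, 32, 8192, 0) only drives the
write/write/truncate sequence; it does not detect a leaked reservation by
itself. generic/130 remains the reproducer named in this series.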
Remove the code that re-initializes a buffer head with an invalid block
number and BH_New and BH_Delay bits when a matching delayed and not
unwritten block has been found in the extent status cache. Replace it
with assertions that verify the buffer head already has this state
correctly set. The current code masked an inline data truncation bug
that left stale entries in the extent status cache. With this change,
generic/130 can be used to reproduce and detect that bug.
Signed-off-by: Eric Whitney <[email protected]>
---
fs/ext4/inode.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index d8de607849df..c795184153d8 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1718,13 +1718,16 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
 		}

 		/*
-		 * Delayed extent could be allocated by fallocate.
-		 * So we need to check it.
+		 * the buffer head associated with a delayed and not unwritten
+		 * block found in the extent status cache must contain an
+		 * invalid block number and have its BH_New and BH_Delay bits
+		 * set, reflecting the state assigned when the block was
+		 * initially delayed allocated
 		 */
-		if (ext4_es_is_delayed(&es) && !ext4_es_is_unwritten(&es)) {
-			map_bh(bh, inode->i_sb, invalid_block);
-			set_buffer_new(bh);
-			set_buffer_delay(bh);
+		if (ext4_es_is_delonly(&es)) {
+			BUG_ON(bh->b_blocknr != invalid_block);
+			BUG_ON(!buffer_new(bh));
+			BUG_ON(!buffer_delay(bh));
 			return 0;
 		}

--
2.20.1
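
The state check introduced by this patch can be illustrated with a small
userspace model. This is a sketch, not kernel code: every model_* and
MODEL_* name below is hypothetical, the flag values are arbitrary, and
assert() stands in for BUG_ON(). It assumes, as the replaced condition in
the diff suggests, that "delayed only" means delayed and not unwritten.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for extent status flags; values are arbitrary. */
enum model_es_flag {
	MODEL_ES_WRITTEN   = 1u << 0,
	MODEL_ES_UNWRITTEN = 1u << 1,
	MODEL_ES_DELAYED   = 1u << 2,
	MODEL_ES_HOLE      = 1u << 3,
};

/* Hypothetical stand-in for the buffer head fields the patch inspects. */
struct model_bh {
	uint64_t blocknr;
	bool is_new;	/* models BH_New */
	bool is_delay;	/* models BH_Delay */
};

/* Sentinel modeling the "invalid block number" from the commit message. */
#define MODEL_INVALID_BLOCK (~(uint64_t)0xffff)

/* Delayed and not unwritten, matching the condition the patch replaces. */
static bool model_es_is_delonly(unsigned int status)
{
	return (status & MODEL_ES_DELAYED) && !(status & MODEL_ES_UNWRITTEN);
}

/* True when the buffer head carries the state set at delayed allocation. */
static bool model_bh_state_ok(const struct model_bh *bh)
{
	return bh->blocknr == MODEL_INVALID_BLOCK && bh->is_new && bh->is_delay;
}

/*
 * Models the patched branch: on a delayed-only cache hit the buffer head
 * state is verified rather than rewritten. Returns true on such a hit.
 */
static bool model_da_lookup_hit(unsigned int status, const struct model_bh *bh)
{
	if (!model_es_is_delonly(status))
		return false;
	assert(model_bh_state_ok(bh));
	return true;
}
```

The point of the model is the design choice described in the commit
message: verifying the state instead of silently re-establishing it means
a stale cache entry paired with an unprepared buffer head is reported
rather than masked.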
Conditionally remove all cached extents belonging to an inode
when truncating its inline data. It's only necessary to attempt to
remove cached extents when a conversion from inline to extent storage
has been initiated (!EXT4_STATE_MAY_INLINE_DATA). This avoids
unnecessary es lock overhead in the more common inline case.
Signed-off-by: Eric Whitney <[email protected]>
---
fs/ext4/inline.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index 70cb64db33f7..49b0b4fcea6d 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -7,6 +7,7 @@
 #include <linux/iomap.h>
 #include <linux/fiemap.h>
 #include <linux/iversion.h>
+#include <linux/backing-dev.h>

 #include "ext4_jbd2.h"
 #include "ext4.h"
@@ -1903,6 +1904,24 @@ int ext4_inline_data_truncate(struct inode *inode, int *has_inline)
 	EXT4_I(inode)->i_disksize = i_size;

 	if (i_size < inline_size) {
+		/*
+		 * if there's inline data to truncate and this file was
+		 * converted to extents after that inline data was written,
+		 * the extent status cache must be cleared to avoid leaving
+		 * behind stale delayed allocated extent entries
+		 */
+		if (!ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA)) {
+retry:
+			err = ext4_es_remove_extent(inode, 0, EXT_MAX_BLOCKS);
+			if (err == -ENOMEM) {
+				cond_resched();
+				congestion_wait(BLK_RW_ASYNC, HZ/50);
+				goto retry;
+			}
+			if (err)
+				goto out_error;
+		}
+
 		/* Clear the content in the xattr space. */
 		if (inline_size > EXT4_MIN_INLINE_DATA_SIZE) {
 			if ((err = ext4_xattr_ibody_find(inode, &i, &is)) != 0)
--
2.20.1
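
The retry-on-ENOMEM pattern used in the hunk above can be sketched in
userspace as follows. This is an illustration under assumptions, not the
kernel implementation: retry_while_enomem(), fail_n_times() and
always_eio() are hypothetical names, sched_yield() stands in for
cond_resched(), and a 20 ms nanosleep() stands in for the HZ/50 wait.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical operation type: returns 0 or a negative errno value. */
typedef int (*retry_op_fn)(void *ctx);

/*
 * Calls op(ctx) until it returns something other than -ENOMEM, yielding
 * the CPU and sleeping briefly between attempts. Any other result,
 * success or failure, is passed back to the caller unchanged.
 */
static int retry_while_enomem(retry_op_fn op, void *ctx)
{
	/* 1/50 of a second, modeling the HZ/50 timeout in the patch. */
	const struct timespec backoff = { .tv_sec = 0, .tv_nsec = 20 * 1000 * 1000 };
	int err;

	assert(op != NULL);

	for (;;) {
		err = op(ctx);
		if (err != -ENOMEM)
			return err;
		sched_yield();
		nanosleep(&backoff, NULL);
	}
}

/* Example operation: fails with -ENOMEM until the counter reaches zero. */
static int fail_n_times(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -ENOMEM;
	}
	return 0;
}

/* Example operation: fails with an error that must not be retried. */
static int always_eio(void *ctx)
{
	(void)ctx;
	return -EIO;
}
```

As in the patch, only -ENOMEM is treated as transient; every other error
ends the loop at once and is handled by the caller (the "goto out_error"
path in the hunk).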
On Thu, 19 Aug 2021 10:49:25 -0400, Eric Whitney wrote:
> If ext4 converts an inline file to extents when applying writes under
> delayed allocation that exceed the available inline storage, one or
> more delayed allocated extents may be stored in the extent status cache
> with an accompanying increase in the reserved block count. If the file
> is subsequently truncated before writeback occurs, that inode's delayed
> allocated extents will not be removed from the extent status cache and
> the reserved block count will not be reduced as required after
> truncation. At minimum, unexpected ENOSPC conditions can occur.
>
> [...]
Applied, thanks!
[1/2] ext4: remove extent cache entries when truncating inline data
commit: 0add491df4e5e2c8cc6eeeaa6dbcca50f932090c
[2/2] ext4: enforce buffer head state assertion in ext4_da_map_blocks
commit: 948ca5f30e1df0c11eb5b0f410b9ceb97fa77ad9
Best regards,
--
Theodore Ts'o <[email protected]>