From: Ye Bin <[email protected]>
Diff v4 Vs v3:
1. Fix checkpatch warning.
2. Fix alignment issues about patch [2-3]
Diff v3 Vs v2:
1. Add fixes tag and rename label 'out' for first patch.
2. Do not split string across lines for second patch.
3. Just check pending tree if empty, drop clear code for third patch.
Diff V2 vs V1:
Use ext4_error() when detect 'i_reserved_data_blocks' and pending tree abnormal.
Ye Bin (3):
ext4: fix incorrect calculate 'reserved' in '__es_remove_extent' when
enable bigalloc feature
ext4: record error when detect abnormal 'i_reserved_data_blocks'
ext4: add check pending tree when evict inode
fs/ext4/extents_status.c | 3 ++-
fs/ext4/super.c | 13 +++++++++----
2 files changed, 11 insertions(+), 5 deletions(-)
--
2.31.1
From: Ye Bin <[email protected]>
If 'i_reserved_data_blocks' is not cleared which mean something wrong with
code, free space accounting is likely wrong, according to Jan Kara's advice
use ext4_error() to record this abnormal let fsck to repair and also we can
capture this issue.
Signed-off-by: Ye Bin <[email protected]>
Reviewed-by: Eric Whitney <[email protected]>
---
fs/ext4/super.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 840e0a614959..4b2d257d3845 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1387,10 +1387,10 @@ static void ext4_destroy_inode(struct inode *inode)
}
if (EXT4_I(inode)->i_reserved_data_blocks)
- ext4_msg(inode->i_sb, KERN_ERR,
- "Inode %lu (%p): i_reserved_data_blocks (%u) not cleared!",
- inode->i_ino, EXT4_I(inode),
- EXT4_I(inode)->i_reserved_data_blocks);
+ ext4_error(inode->i_sb,
+ "Inode %lu (%p): i_reserved_data_blocks (%u) not cleared!",
+ inode->i_ino, EXT4_I(inode),
+ EXT4_I(inode)->i_reserved_data_blocks);
}
static void init_once(void *foo)
--
2.31.1
From: Ye Bin <[email protected]>
Syzbot found the following issue:
BUG: memory leak
unreferenced object 0xffff8881bde17420 (size 32):
comm "rep", pid 2327, jiffies 4295381963 (age 32.265s)
hex dump (first 32 bytes):
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<00000000ac6d38f8>] __insert_pending+0x13c/0x2d0
[<00000000d717de3b>] ext4_es_insert_delayed_block+0x399/0x4e0
[<000000004be03913>] ext4_da_map_blocks.constprop.0+0x739/0xfa0
[<00000000885a832a>] ext4_da_get_block_prep+0x10c/0x440
[<0000000029b7f8ef>] __block_write_begin_int+0x28d/0x860
[<00000000e182ebc3>] ext4_da_write_inline_data_begin+0x2d1/0xf30
[<00000000ced0c8a2>] ext4_da_write_begin+0x612/0x860
[<000000008d5f27fa>] generic_perform_write+0x215/0x4d0
[<00000000552c1cde>] ext4_buffered_write_iter+0x101/0x3b0
[<0000000052177ae8>] do_iter_readv_writev+0x19f/0x340
[<000000004b9de834>] do_iter_write+0x13b/0x650
[<00000000e2401b9b>] iter_file_splice_write+0x5a5/0xab0
[<0000000023aa5d90>] direct_splice_actor+0x103/0x1e0
[<0000000089e00fc1>] splice_direct_to_actor+0x2c9/0x7b0
[<000000004386851e>] do_splice_direct+0x159/0x280
[<00000000b567e609>] do_sendfile+0x932/0x1200
Above issue fixed by
commit 1b8f787ef547 ("ext4: fix warning in 'ext4_da_release_space'")
in this scene. To make things better add check pending tree when evict
inode.
According to Eric Whitney's suggestion, bigalloc + inline is still in
development so we just add test for this situation, there isn't need to
add code to free pending tree entry.
Reported-by: [email protected]
Signed-off-by: Ye Bin <[email protected]>
---
fs/ext4/super.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 4b2d257d3845..15b6634975e7 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1391,6 +1391,11 @@ static void ext4_destroy_inode(struct inode *inode)
"Inode %lu (%p): i_reserved_data_blocks (%u) not cleared!",
inode->i_ino, EXT4_I(inode),
EXT4_I(inode)->i_reserved_data_blocks);
+
+ if (!RB_EMPTY_ROOT(&EXT4_I(inode)->i_pending_tree.root))
+ ext4_error(inode->i_sb,
+ "Inode %lu (%p): i_pending_tree not empty!",
+ inode->i_ino, EXT4_I(inode));
}
static void init_once(void *foo)
--
2.31.1
From: Ye Bin <[email protected]>
Syzbot report issue as follows:
EXT4-fs error (device loop0): ext4_validate_block_bitmap:398: comm rep:
bg 0: block 5: invalid block bitmap
EXT4-fs (loop0): Delayed block allocation failed for inode 18 at logical
offset 0 with max blocks 32 with error 28
EXT4-fs (loop0): This should not happen!! Data will be lost
EXT4-fs (loop0): Total free blocks count 0
EXT4-fs (loop0): Free/Dirty block details
EXT4-fs (loop0): free_blocks=0
EXT4-fs (loop0): dirty_blocks=32
EXT4-fs (loop0): Block reservation details
EXT4-fs (loop0): i_reserved_data_blocks=2
EXT4-fs (loop0): Inode 18 (00000000845cd634):
i_reserved_data_blocks (1) not cleared!
Above issue happens as follows:
Assume:
sbi->s_cluster_ratio = 16
Step1:
Insert delay block [0, 31] -> ei->i_reserved_data_blocks=2
Step2:
ext4_writepages
mpage_map_and_submit_extent -> return failed
mpage_release_unused_pages -> to release [0, 30]
ext4_es_remove_extent -> remove lblk=0 end=30
__es_remove_extent -> len1=0 len2=31-30=1
__es_remove_extent:
...
if (len2 > 0) {
...
if (len1 > 0) {
...
} else {
es->es_lblk = end + 1;
es->es_len = len2;
...
}
if (count_reserved)
count_rsvd(inode, lblk, ...);
goto out; -> will return but didn't calculate 'reserved'
...
Step3:
ext4_destroy_inode -> trigger "i_reserved_data_blocks (1) not cleared!"
To solve above issue if 'len2>0' call 'get_rsvd()' before goto out.
Reported-by: [email protected]
Fixes: 8fcc3a580651 ("ext4: rework reserved cluster accounting when invalidating pages")
Signed-off-by: Ye Bin <[email protected]>
---
fs/ext4/extents_status.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index cd0a861853e3..7ada374ff27d 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -1371,7 +1371,7 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
if (count_reserved)
count_rsvd(inode, lblk, orig_es.es_len - len1 - len2,
&orig_es, &rc);
- goto out;
+ goto out_get_reserved;
}
if (len1 > 0) {
@@ -1413,6 +1413,7 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
}
}
+out_get_reserved:
if (count_reserved)
*reserved = get_rsvd(inode, end, es, &rc);
out:
--
2.31.1
* Ye Bin <[email protected]>:
> From: Ye Bin <[email protected]>
>
> Syzbot report issue as follows:
> EXT4-fs error (device loop0): ext4_validate_block_bitmap:398: comm rep:
> bg 0: block 5: invalid block bitmap
> EXT4-fs (loop0): Delayed block allocation failed for inode 18 at logical
> offset 0 with max blocks 32 with error 28
> EXT4-fs (loop0): This should not happen!! Data will be lost
>
> EXT4-fs (loop0): Total free blocks count 0
> EXT4-fs (loop0): Free/Dirty block details
> EXT4-fs (loop0): free_blocks=0
> EXT4-fs (loop0): dirty_blocks=32
> EXT4-fs (loop0): Block reservation details
> EXT4-fs (loop0): i_reserved_data_blocks=2
> EXT4-fs (loop0): Inode 18 (00000000845cd634):
> i_reserved_data_blocks (1) not cleared!
>
> Above issue happens as follows:
> Assume:
> sbi->s_cluster_ratio = 16
> Step1:
> Insert delay block [0, 31] -> ei->i_reserved_data_blocks=2
> Step2:
> ext4_writepages
> mpage_map_and_submit_extent -> return failed
> mpage_release_unused_pages -> to release [0, 30]
> ext4_es_remove_extent -> remove lblk=0 end=30
> __es_remove_extent -> len1=0 len2=31-30=1
> __es_remove_extent:
> ...
> if (len2 > 0) {
> ...
> if (len1 > 0) {
> ...
> } else {
> es->es_lblk = end + 1;
> es->es_len = len2;
> ...
> }
> if (count_reserved)
> count_rsvd(inode, lblk, ...);
> goto out; -> will return but didn't calculate 'reserved'
> ...
> Step3:
> ext4_destroy_inode -> trigger "i_reserved_data_blocks (1) not cleared!"
>
> To solve above issue if 'len2>0' call 'get_rsvd()' before goto out.
>
> Reported-by: [email protected]
> Fixes: 8fcc3a580651 ("ext4: rework reserved cluster accounting when invalidating pages")
> Signed-off-by: Ye Bin <[email protected]>
> ---
> fs/ext4/extents_status.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
> index cd0a861853e3..7ada374ff27d 100644
> --- a/fs/ext4/extents_status.c
> +++ b/fs/ext4/extents_status.c
> @@ -1371,7 +1371,7 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
> if (count_reserved)
> count_rsvd(inode, lblk, orig_es.es_len - len1 - len2,
> &orig_es, &rc);
> - goto out;
> + goto out_get_reserved;
> }
>
> if (len1 > 0) {
> @@ -1413,6 +1413,7 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
> }
> }
>
> +out_get_reserved:
> if (count_reserved)
> *reserved = get_rsvd(inode, end, es, &rc);
> out:
> --
OK, this looks good.
As before, feel free to add:
Reviewed-by: Eric Whitney <[email protected]>
> 2.31.1
>
* Ye Bin <[email protected]>:
> From: Ye Bin <[email protected]>
>
> Syzbot found the following issue:
> BUG: memory leak
> unreferenced object 0xffff8881bde17420 (size 32):
> comm "rep", pid 2327, jiffies 4295381963 (age 32.265s)
> hex dump (first 32 bytes):
> 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> backtrace:
> [<00000000ac6d38f8>] __insert_pending+0x13c/0x2d0
> [<00000000d717de3b>] ext4_es_insert_delayed_block+0x399/0x4e0
> [<000000004be03913>] ext4_da_map_blocks.constprop.0+0x739/0xfa0
> [<00000000885a832a>] ext4_da_get_block_prep+0x10c/0x440
> [<0000000029b7f8ef>] __block_write_begin_int+0x28d/0x860
> [<00000000e182ebc3>] ext4_da_write_inline_data_begin+0x2d1/0xf30
> [<00000000ced0c8a2>] ext4_da_write_begin+0x612/0x860
> [<000000008d5f27fa>] generic_perform_write+0x215/0x4d0
> [<00000000552c1cde>] ext4_buffered_write_iter+0x101/0x3b0
> [<0000000052177ae8>] do_iter_readv_writev+0x19f/0x340
> [<000000004b9de834>] do_iter_write+0x13b/0x650
> [<00000000e2401b9b>] iter_file_splice_write+0x5a5/0xab0
> [<0000000023aa5d90>] direct_splice_actor+0x103/0x1e0
> [<0000000089e00fc1>] splice_direct_to_actor+0x2c9/0x7b0
> [<000000004386851e>] do_splice_direct+0x159/0x280
> [<00000000b567e609>] do_sendfile+0x932/0x1200
>
> Above issue fixed by
> commit 1b8f787ef547 ("ext4: fix warning in 'ext4_da_release_space'")
> in this scene. To make things better add check pending tree when evict
> inode.
> According to Eric Whitney's suggestion, bigalloc + inline is still in
> development so we just add test for this situation, there isn't need to
> add code to free pending tree entry.
>
> Reported-by: [email protected]
> Signed-off-by: Ye Bin <[email protected]>
> ---
> fs/ext4/super.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 4b2d257d3845..15b6634975e7 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -1391,6 +1391,11 @@ static void ext4_destroy_inode(struct inode *inode)
> "Inode %lu (%p): i_reserved_data_blocks (%u) not cleared!",
> inode->i_ino, EXT4_I(inode),
> EXT4_I(inode)->i_reserved_data_blocks);
> +
> + if (!RB_EMPTY_ROOT(&EXT4_I(inode)->i_pending_tree.root))
> + ext4_error(inode->i_sb,
> + "Inode %lu (%p): i_pending_tree not empty!",
> + inode->i_ino, EXT4_I(inode));
> }
>
> static void init_once(void *foo)
> --
Looks good. Feel free to add:
Reviewed-by: Eric Whitney <[email protected]>
> 2.31.1
>
On Thu, Dec 08, 2022 at 11:34:24AM +0800, Ye Bin wrote:
> From: Ye Bin <[email protected]>
>
Thanks, applied with an edit commit description to make it clearer
what's being fixed.
ext4: fix reserved cluster accounting in __es_remove_extent()
When bigalloc is enabled, reserved cluster accounting for delayed
allocation is handled in extent_status.c. With a corrupted file
system, it's possible for this accounting to be incorrect,
dsicovered by Syzbot:
....
In general, it's better to explain what is being changed and why, and
put the big messy Syzbot change after the English description of the
change. Remember, what's important is that we make ext4 better ---
not that we are getting rid of a Syzbot report. When someone reads
the commit description later, what they will care about is how the
code has been improved.
Cheers,
- Ted
> Syzbot report issue as follows:
> EXT4-fs error (device loop0): ext4_validate_block_bitmap:398: comm rep:
> bg 0: block 5: invalid block bitmap
> EXT4-fs (loop0): Delayed block allocation failed for inode 18 at logical
> offset 0 with max blocks 32 with error 28
> EXT4-fs (loop0): This should not happen!! Data will be lost
>
> EXT4-fs (loop0): Total free blocks count 0
> EXT4-fs (loop0): Free/Dirty block details
> EXT4-fs (loop0): free_blocks=0
> EXT4-fs (loop0): dirty_blocks=32
> EXT4-fs (loop0): Block reservation details
> EXT4-fs (loop0): i_reserved_data_blocks=2
> EXT4-fs (loop0): Inode 18 (00000000845cd634):
> i_reserved_data_blocks (1) not cleared!
>
> Above issue happens as follows:
> Assume:
> sbi->s_cluster_ratio = 16
> Step1:
> Insert delay block [0, 31] -> ei->i_reserved_data_blocks=2
> Step2:
> ext4_writepages
> mpage_map_and_submit_extent -> return failed
> mpage_release_unused_pages -> to release [0, 30]
> ext4_es_remove_extent -> remove lblk=0 end=30
> __es_remove_extent -> len1=0 len2=31-30=1
> __es_remove_extent:
> ...
> if (len2 > 0) {
> ...
> if (len1 > 0) {
> ...
> } else {
> es->es_lblk = end + 1;
> es->es_len = len2;
> ...
> }
> if (count_reserved)
> count_rsvd(inode, lblk, ...);
> goto out; -> will return but didn't calculate 'reserved'
> ...
> Step3:
> ext4_destroy_inode -> trigger "i_reserved_data_blocks (1) not cleared!"
>
> To solve above issue if 'len2>0' call 'get_rsvd()' before goto out.
>
> Reported-by: [email protected]
> Fixes: 8fcc3a580651 ("ext4: rework reserved cluster accounting when invalidating pages")
> Signed-off-by: Ye Bin <[email protected]>
> ---
> fs/ext4/extents_status.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
> index cd0a861853e3..7ada374ff27d 100644
> --- a/fs/ext4/extents_status.c
> +++ b/fs/ext4/extents_status.c
> @@ -1371,7 +1371,7 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
> if (count_reserved)
> count_rsvd(inode, lblk, orig_es.es_len - len1 - len2,
> &orig_es, &rc);
> - goto out;
> + goto out_get_reserved;
> }
>
> if (len1 > 0) {
> @@ -1413,6 +1413,7 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
> }
> }
>
> +out_get_reserved:
> if (count_reserved)
> *reserved = get_rsvd(inode, end, es, &rc);
> out:
> --
> 2.31.1
>
On Thu, Dec 08, 2022 at 11:34:26AM +0800, Ye Bin wrote:
>
> Above issue fixed by
> commit 1b8f787ef547 ("ext4: fix warning in 'ext4_da_release_space'")
> in this scene. To make things better add check pending tree when evict
> inode.
> According to Eric Whitney's suggestion, bigalloc + inline is still in
> development so we just add test for this situation, there isn't need to
> add code to free pending tree entry.
The i_pending_tree is an in-memory data structure, and so it's not
appropriate to call ext4_error(), because there will be nothing for
fsck to fix.
If you really want to a bug to be noticed, you could use a ext4_msg
plus a WARN_ON(); but an ext4_error() is really not appropriate.
Cheers,
- Ted