On 2024/5/10 19:52, Luis Henriques (SUSE) wrote:
> When doing fast_commit replay an infinite loop may occur due to an
> uninitialized extent_status struct. ext4_ext_determine_insert_hole() does
> not detect the replay and calls ext4_es_find_extent_range(), which will
> return immediately without initializing the 'es' variable.
>
> Because 'es' contains garbage, an integer overflow may happen causing an
> infinite loop in this function, easily reproducible using fstest generic/039.
>
> This commit fixes this issue by detecting the replay in function
> ext4_ext_determine_insert_hole(). It also adds initialization code to the
> error path in function ext4_es_find_extent_range().
>
> Thanks to Zhang Yi, for figuring out the real problem!
>
> Fixes: 8016e29f4362 ("ext4: fast commit recovery path")
> Signed-off-by: Luis Henriques (SUSE) <[email protected]>
> ---
> Hi!
>
> Two comments:
> 1) The change in ext4_ext_map_blocks() could probably use the min_not_zero
> macro instead. I decided not to do so simply because I wasn't sure if
> that would be safe, but I'm fine changing that if you think it is.
>
> 2) I thought about returning 'EXT_MAX_BLOCKS' instead of '0' in
> ext4_lblk_t ext4_ext_determine_insert_hole(), which would then avoid
> the extra change to ext4_ext_map_blocks(). '0' sounds like the right
> value to return, but I'm also OK using 'EXT_MAX_BLOCKS' instead.
>
> And again thanks to Zhang Yi for pointing me the *real* problem!
>
> fs/ext4/extents.c | 6 +++++-
> fs/ext4/extents_status.c | 5 ++++-
> 2 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index e57054bdc5fd..b5bfcb6c18a0 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -4052,6 +4052,9 @@ static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode,
> ext4_lblk_t hole_start, len;
> struct extent_status es;
>
> + if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
> + return 0;
> +
Sorry, I think it's may not correct. When replaying the jouranl, although
we don't use the extent statue tree, we still need to query the accurate
hole length, e.g. please see skip_hole(). If you do this, the hole length
becomes incorrect, right?
Thanks,
Yi.
> hole_start = lblk;
> len = ext4_ext_find_hole(inode, path, &hole_start);
> again:
> @@ -4226,7 +4229,8 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
> len = ext4_ext_determine_insert_hole(inode, path, map->m_lblk);
>
> map->m_pblk = 0;
> - map->m_len = min_t(unsigned int, map->m_len, len);
> + if (len > 0)
> + map->m_len = min_t(unsigned int, map->m_len, len);
> goto out;
> }
>
> diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
> index 4a00e2f019d9..acb9616ca119 100644
> --- a/fs/ext4/extents_status.c
> +++ b/fs/ext4/extents_status.c
> @@ -310,8 +310,11 @@ void ext4_es_find_extent_range(struct inode *inode,
> ext4_lblk_t lblk, ext4_lblk_t end,
> struct extent_status *es)
> {
> - if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
> + if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) {
> + /* Initialize extent to zero */
> + es->es_lblk = es->es_len = es->es_pblk = 0;
> return;
> + }
>
> trace_ext4_es_find_extent_range_enter(inode, lblk);
>
>
On Sat 11 May 2024 02:24:17 PM +08, Zhang Yi wrote;
> On 2024/5/10 19:52, Luis Henriques (SUSE) wrote:
>> When doing fast_commit replay an infinite loop may occur due to an
>> uninitialized extent_status struct. ext4_ext_determine_insert_hole() does
>> not detect the replay and calls ext4_es_find_extent_range(), which will
>> return immediately without initializing the 'es' variable.
>>
>> Because 'es' contains garbage, an integer overflow may happen causing an
>> infinite loop in this function, easily reproducible using fstest generic/039.
>>
>> This commit fixes this issue by detecting the replay in function
>> ext4_ext_determine_insert_hole(). It also adds initialization code to the
>> error path in function ext4_es_find_extent_range().
>>
>> Thanks to Zhang Yi, for figuring out the real problem!
>>
>> Fixes: 8016e29f4362 ("ext4: fast commit recovery path")
>> Signed-off-by: Luis Henriques (SUSE) <[email protected]>
>> ---
>> Hi!
>>
>> Two comments:
>> 1) The change in ext4_ext_map_blocks() could probably use the min_not_zero
>> macro instead. I decided not to do so simply because I wasn't sure if
>> that would be safe, but I'm fine changing that if you think it is.
>>
>> 2) I thought about returning 'EXT_MAX_BLOCKS' instead of '0' in
>> ext4_lblk_t ext4_ext_determine_insert_hole(), which would then avoid
>> the extra change to ext4_ext_map_blocks(). '0' sounds like the right
>> value to return, but I'm also OK using 'EXT_MAX_BLOCKS' instead.
>>
>> And again thanks to Zhang Yi for pointing me the *real* problem!
>>
>> fs/ext4/extents.c | 6 +++++-
>> fs/ext4/extents_status.c | 5 ++++-
>> 2 files changed, 9 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
>> index e57054bdc5fd..b5bfcb6c18a0 100644
>> --- a/fs/ext4/extents.c
>> +++ b/fs/ext4/extents.c
>> @@ -4052,6 +4052,9 @@ static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode,
>> ext4_lblk_t hole_start, len;
>> struct extent_status es;
>>
>> + if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
>> + return 0;
>> +
>
> Sorry, I think it's may not correct. When replaying the jouranl, although
> we don't use the extent statue tree, we still need to query the accurate
> hole length, e.g. please see skip_hole(). If you do this, the hole length
> becomes incorrect, right?
Thank you for your review (and sorry for my delay replying).
So, I see three different options to follow your suggestion:
1) Initialize 'es' immediately when declaring it in function
ext4_ext_determine_insert_hole():
es.es_lblk = es.es_len = es.es_pblk = 0;
2) Initialize 'es' only in ext4_es_find_extent_range() when checking if an
fc replay is in progress (my patch was already doing something like
that):
if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) {
/* Initialize extent to zero */
es->es_lblk = es->es_len = es->es_pblk = 0;
return;
}
3) Remove the check for fc replay in function ext4_es_find_extent_range(),
which will then unconditionally call __es_find_extent_range(). This
will effectively also initialize the 'es' fields to '0' and, because
__es_tree_search() will return NULL (at least in generic/039 test!),
nothing else will be done.
Since all these 3 options seem to have the same result, I believe option
1) is probably the best as it initializes the structure shortly after it's
declaration. Would you agree? Or did I misunderstood you?
Cheers,
--
Luis
>
> Thanks,
> Yi.
>
>> hole_start = lblk;
>> len = ext4_ext_find_hole(inode, path, &hole_start);
>> again:
>> @@ -4226,7 +4229,8 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
>> len = ext4_ext_determine_insert_hole(inode, path, map->m_lblk);
>>
>> map->m_pblk = 0;
>> - map->m_len = min_t(unsigned int, map->m_len, len);
>> + if (len > 0)
>> + map->m_len = min_t(unsigned int, map->m_len, len);
>> goto out;
>> }
>>
>> diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
>> index 4a00e2f019d9..acb9616ca119 100644
>> --- a/fs/ext4/extents_status.c
>> +++ b/fs/ext4/extents_status.c
>> @@ -310,8 +310,11 @@ void ext4_es_find_extent_range(struct inode *inode,
>> ext4_lblk_t lblk, ext4_lblk_t end,
>> struct extent_status *es)
>> {
>> - if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
>> + if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) {
>> + /* Initialize extent to zero */
>> + es->es_lblk = es->es_len = es->es_pblk = 0;
>> return;
>> + }
>>
>> trace_ext4_es_find_extent_range_enter(inode, lblk);
>>
>>
On 2024/5/14 21:04, Luis Henriques wrote:
> On Sat 11 May 2024 02:24:17 PM +08, Zhang Yi wrote;
>
>> On 2024/5/10 19:52, Luis Henriques (SUSE) wrote:
>>> When doing fast_commit replay an infinite loop may occur due to an
>>> uninitialized extent_status struct. ext4_ext_determine_insert_hole() does
>>> not detect the replay and calls ext4_es_find_extent_range(), which will
>>> return immediately without initializing the 'es' variable.
>>>
>>> Because 'es' contains garbage, an integer overflow may happen causing an
>>> infinite loop in this function, easily reproducible using fstest generic/039.
>>>
>>> This commit fixes this issue by detecting the replay in function
>>> ext4_ext_determine_insert_hole(). It also adds initialization code to the
>>> error path in function ext4_es_find_extent_range().
>>>
>>> Thanks to Zhang Yi, for figuring out the real problem!
>>>
>>> Fixes: 8016e29f4362 ("ext4: fast commit recovery path")
>>> Signed-off-by: Luis Henriques (SUSE) <[email protected]>
>>> ---
>>> Hi!
>>>
>>> Two comments:
>>> 1) The change in ext4_ext_map_blocks() could probably use the min_not_zero
>>> macro instead. I decided not to do so simply because I wasn't sure if
>>> that would be safe, but I'm fine changing that if you think it is.
>>>
>>> 2) I thought about returning 'EXT_MAX_BLOCKS' instead of '0' in
>>> ext4_lblk_t ext4_ext_determine_insert_hole(), which would then avoid
>>> the extra change to ext4_ext_map_blocks(). '0' sounds like the right
>>> value to return, but I'm also OK using 'EXT_MAX_BLOCKS' instead.
>>>
>>> And again thanks to Zhang Yi for pointing me the *real* problem!
>>>
>>> fs/ext4/extents.c | 6 +++++-
>>> fs/ext4/extents_status.c | 5 ++++-
>>> 2 files changed, 9 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
>>> index e57054bdc5fd..b5bfcb6c18a0 100644
>>> --- a/fs/ext4/extents.c
>>> +++ b/fs/ext4/extents.c
>>> @@ -4052,6 +4052,9 @@ static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode,
>>> ext4_lblk_t hole_start, len;
>>> struct extent_status es;
>>>
>>> + if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
>>> + return 0;
>>> +
>>
>> Sorry, I think it's may not correct. When replaying the jouranl, although
>> we don't use the extent statue tree, we still need to query the accurate
>> hole length, e.g. please see skip_hole(). If you do this, the hole length
>> becomes incorrect, right?
>
> Thank you for your review (and sorry for my delay replying).
>
> So, I see three different options to follow your suggestion:
>
> 1) Initialize 'es' immediately when declaring it in function
> ext4_ext_determine_insert_hole():
>
> es.es_lblk = es.es_len = es.es_pblk = 0;
>
> 2) Initialize 'es' only in ext4_es_find_extent_range() when checking if an
> fc replay is in progress (my patch was already doing something like
> that):
>
> if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) {
> /* Initialize extent to zero */
> es->es_lblk = es->es_len = es->es_pblk = 0;
> return;
> }
>
> 3) Remove the check for fc replay in function ext4_es_find_extent_range(),
> which will then unconditionally call __es_find_extent_range(). This
> will effectively also initialize the 'es' fields to '0' and, because
> __es_tree_search() will return NULL (at least in generic/039 test!),
> nothing else will be done.
>
> Since all these 3 options seem to have the same result, I believe option
> 1) is probably the best as it initializes the structure shortly after it's
> declaration. Would you agree? Or did I misunderstood you?
>
Both 1 and 2 are looks fine to me, but I would prefer to initialize it
unconditionally in ext4_es_find_extent_range().
@@ -310,6 +310,8 @@ void ext4_es_find_extent_range(struct inode *inode,
ext4_lblk_t lblk, ext4_lblk_t end,
struct extent_status *es)
{
+ es->es_lblk = es->es_len = es->es_pblk = 0;
+
if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
return;
Thanks,
Yi.
On Wed 15 May 2024 12:59:26 PM +08, Zhang Yi wrote;
> On 2024/5/14 21:04, Luis Henriques wrote:
>> On Sat 11 May 2024 02:24:17 PM +08, Zhang Yi wrote;
>>
>>> On 2024/5/10 19:52, Luis Henriques (SUSE) wrote:
>>>> When doing fast_commit replay an infinite loop may occur due to an
>>>> uninitialized extent_status struct. ext4_ext_determine_insert_hole() does
>>>> not detect the replay and calls ext4_es_find_extent_range(), which will
>>>> return immediately without initializing the 'es' variable.
>>>>
>>>> Because 'es' contains garbage, an integer overflow may happen causing an
>>>> infinite loop in this function, easily reproducible using fstest generic/039.
>>>>
>>>> This commit fixes this issue by detecting the replay in function
>>>> ext4_ext_determine_insert_hole(). It also adds initialization code to the
>>>> error path in function ext4_es_find_extent_range().
>>>>
>>>> Thanks to Zhang Yi, for figuring out the real problem!
>>>>
>>>> Fixes: 8016e29f4362 ("ext4: fast commit recovery path")
>>>> Signed-off-by: Luis Henriques (SUSE) <[email protected]>
>>>> ---
>>>> Hi!
>>>>
>>>> Two comments:
>>>> 1) The change in ext4_ext_map_blocks() could probably use the min_not_zero
>>>> macro instead. I decided not to do so simply because I wasn't sure if
>>>> that would be safe, but I'm fine changing that if you think it is.
>>>>
>>>> 2) I thought about returning 'EXT_MAX_BLOCKS' instead of '0' in
>>>> ext4_lblk_t ext4_ext_determine_insert_hole(), which would then avoid
>>>> the extra change to ext4_ext_map_blocks(). '0' sounds like the right
>>>> value to return, but I'm also OK using 'EXT_MAX_BLOCKS' instead.
>>>>
>>>> And again thanks to Zhang Yi for pointing me the *real* problem!
>>>>
>>>> fs/ext4/extents.c | 6 +++++-
>>>> fs/ext4/extents_status.c | 5 ++++-
>>>> 2 files changed, 9 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
>>>> index e57054bdc5fd..b5bfcb6c18a0 100644
>>>> --- a/fs/ext4/extents.c
>>>> +++ b/fs/ext4/extents.c
>>>> @@ -4052,6 +4052,9 @@ static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode,
>>>> ext4_lblk_t hole_start, len;
>>>> struct extent_status es;
>>>>
>>>> + if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
>>>> + return 0;
>>>> +
>>>
>>> Sorry, I think it's may not correct. When replaying the jouranl, although
>>> we don't use the extent statue tree, we still need to query the accurate
>>> hole length, e.g. please see skip_hole(). If you do this, the hole length
>>> becomes incorrect, right?
>>
>> Thank you for your review (and sorry for my delay replying).
>>
>> So, I see three different options to follow your suggestion:
>>
>> 1) Initialize 'es' immediately when declaring it in function
>> ext4_ext_determine_insert_hole():
>>
>> es.es_lblk = es.es_len = es.es_pblk = 0;
>>
>> 2) Initialize 'es' only in ext4_es_find_extent_range() when checking if an
>> fc replay is in progress (my patch was already doing something like
>> that):
>>
>> if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) {
>> /* Initialize extent to zero */
>> es->es_lblk = es->es_len = es->es_pblk = 0;
>> return;
>> }
>>
>> 3) Remove the check for fc replay in function ext4_es_find_extent_range(),
>> which will then unconditionally call __es_find_extent_range(). This
>> will effectively also initialize the 'es' fields to '0' and, because
>> __es_tree_search() will return NULL (at least in generic/039 test!),
>> nothing else will be done.
>>
>> Since all these 3 options seem to have the same result, I believe option
>> 1) is probably the best as it initializes the structure shortly after it's
>> declaration. Would you agree? Or did I misunderstood you?
>>
>
> Both 1 and 2 are looks fine to me, but I would prefer to initialize it
> unconditionally in ext4_es_find_extent_range().
>
> @@ -310,6 +310,8 @@ void ext4_es_find_extent_range(struct inode *inode,
> ext4_lblk_t lblk, ext4_lblk_t end,
> struct extent_status *es)
> {
> + es->es_lblk = es->es_len = es->es_pblk = 0;
> +
> if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
> return;
Thank you, Yi. I'll send out v2 shortly. Although, to be fair, the real
patch author shouldn't be me. :-)
Cheers,
--
Luis