LinuxLists.cc - Re: [PATCH v3 03/26] ext4: correct the hole length returned by ext4_map

2024-05-09 15:16:56

Subject: Re: [PATCH v3 03/26] ext4: correct the hole length returned by ext4_map_blocks()

On Sat 27 Jan 2024 09:58:02 AM +08, Zhang Yi wrote;
<...>
> +static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode,
> + struct ext4_ext_path *path,
> + ext4_lblk_t lblk)
> +{
> + ext4_lblk_t hole_start, len;
> + struct extent_status es;
> +
> + hole_start = lblk;
> + len = ext4_ext_find_hole(inode, path, &hole_start);
> +again:
> + ext4_es_find_extent_range(inode, &ext4_es_is_delayed, hole_start,
> + hole_start + len - 1, &es);
> + if (!es.es_len)
> + goto insert_hole;
> +
> + /*
> + * There's a delalloc extent in the hole, handle it if the delalloc
> + * extent is in front of, behind and straddle the queried range.
> + */
> + if (lblk >= es.es_lblk + es.es_len) {
> + /*
> + * The delalloc extent is in front of the queried range,
> + * find again from the queried start block.
> + */
> + len -= lblk - hole_start;
> + hole_start = lblk;
> + goto again;

It's looks like it's easy to trigger an infinite loop here using fstest
generic/039. If I understand it correctly (which doesn't happen as often
as I'd like), this is due to an integer overflow in the 'if' condition,
and should be fixed with the patch below.

From 3117af2f8dacad37a2722850421f31075ae9e88d Mon Sep 17 00:00:00 2001
From: "Luis Henriques (SUSE)" <[email protected]>
Date: Thu, 9 May 2024 15:53:01 +0100
Subject: [PATCH] ext4: fix infinite loop caused by integer overflow

An integer overflow will happen if the extent_status len is set to
EXT_MAX_BLOCKS (0xffffffff). This may cause an infinite loop in function
ext4_ext_determine_insert_hole(), easily reproducible using fstest
generic/039.

Fixes: 6430dea07e85 ("ext4: correct the hole length returned by ext4_map_blocks()")
Signed-off-by: Luis Henriques (SUSE) <[email protected]>
---
fs/ext4/extents.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index e57054bdc5fd..193121b394f9 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4064,7 +4064,7 @@ static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode,
* There's a delalloc extent in the hole, handle it if the delalloc
* extent is in front of, behind and straddle the queried range.
*/
- if (lblk >= es.es_lblk + es.es_len) {
+ if (lblk >= ((__u64) es.es_lblk) + es.es_len) {
/*
* The delalloc extent is in front of the queried range,
* find again from the queried start block.

2024-05-09 16:41:11

by Theodore Ts'o

[permalink] [raw]

Subject: Re: [PATCH v3 03/26] ext4: correct the hole length returned by ext4_map_blocks()

On Thu, May 09, 2024 at 04:16:34PM +0100, Luis Henriques wrote:
>
> It's looks like it's easy to trigger an infinite loop here using fstest
> generic/039. If I understand it correctly (which doesn't happen as often
> as I'd like), this is due to an integer overflow in the 'if' condition,
> and should be fixed with the patch below.

Thanks for the report. However, I can't reproduce the failure, and
looking at generic/039, I don't see how it could be relevant to the
code path in question. Generic/039 creates a test symlink with two
hard links in the same directory, syncs the file system, and then
removes one of the hard links, and then drops access to the block
device using dmflakey. So I don't see how the extent code would be
involved at all. Are you sure that you have the correct test listed?

Looking at the code in question in fs/ext4/extents.c:

again:
ext4_es_find_extent_range(inode, &ext4_es_is_delayed, hole_start,
hole_start + len - 1, &es);
if (!es.es_len)
goto insert_hole;

* There's a delalloc extent in the hole, handle it if the delalloc
* extent is in front of, behind and straddle the queried range.
*/
- if (lblk >= es.es_lblk + es.es_len) {
+ if (lblk >= ((__u64) es.es_lblk) + es.es_len) {
/*
* The delalloc extent is in front of the queried range,
* find again from the queried start block.
len -= lblk - hole_start;
hole_start = lblk;
goto again;

lblk and es.es_lblk are both __u32. So the infinite loop is
presumably because es.es_lblk + es.es_len has overflowed. This should
never happen(tm), and in fact we have a test for this case which
*should* have gotten tripped when ext4_es_find_extent_range() calls
__es_tree_search() in fs/ext4/extents_status.c:

static inline ext4_lblk_t ext4_es_end(struct extent_status *es)
{
BUG_ON(es->es_lblk + es->es_len < es->es_lblk);
return es->es_lblk + es->es_len - 1;
}

So the patch is harmless, and I can see how it might fix what you were
seeing --- but I'm a bit nervous that I can't reproduce it and the
commit description claims that it reproduces easily; and we should
have never allowed the entry to have gotten introduced into the
extents status tree in the first place, and if it had been introduced,
it should have been caught before it was returned by
ext4_es_find_extent_range().

Can you give more details about the reproducer; can you double check
the test id, and how easily you can trigger the failure, and what is
the hardware you used to run the test?

Many thanks,

- Ted

2024-05-09 17:24:01

by Luis Henriques

[permalink] [raw]

Subject: Re: [PATCH v3 03/26] ext4: correct the hole length returned by ext4_map_blocks()

On Thu 09 May 2024 12:39:53 PM -04, Theodore Ts'o wrote;

> On Thu, May 09, 2024 at 04:16:34PM +0100, Luis Henriques wrote:
>>
>> It's looks like it's easy to trigger an infinite loop here using fstest
>> generic/039. If I understand it correctly (which doesn't happen as often
>> as I'd like), this is due to an integer overflow in the 'if' condition,
>> and should be fixed with the patch below.
>
> Thanks for the report. However, I can't reproduce the failure, and
> looking at generic/039, I don't see how it could be relevant to the
> code path in question. Generic/039 creates a test symlink with two
> hard links in the same directory, syncs the file system, and then
> removes one of the hard links, and then drops access to the block
> device using dmflakey. So I don't see how the extent code would be
> involved at all. Are you sure that you have the correct test listed?

Yep, I just retested and it's definitely generic/039. I'm using a simple
test environment, with virtme-ng.

> Looking at the code in question in fs/ext4/extents.c:
>
> again:
> ext4_es_find_extent_range(inode, &ext4_es_is_delayed, hole_start,
> hole_start + len - 1, &es);
> if (!es.es_len)
> goto insert_hole;
>
> * There's a delalloc extent in the hole, handle it if the delalloc
> * extent is in front of, behind and straddle the queried range.
> */
> - if (lblk >= es.es_lblk + es.es_len) {
> + if (lblk >= ((__u64) es.es_lblk) + es.es_len) {
> /*
> * The delalloc extent is in front of the queried range,
> * find again from the queried start block.
> len -= lblk - hole_start;
> hole_start = lblk;
> goto again;
>
> lblk and es.es_lblk are both __u32. So the infinite loop is
> presumably because es.es_lblk + es.es_len has overflowed. This should
> never happen(tm), and in fact we have a test for this case which

If I instrument the code, I can see that es.es_len is definitely set to
EXT_MAX_BLOCKS, which will overflow.

> *should* have gotten tripped when ext4_es_find_extent_range() calls
> __es_tree_search() in fs/ext4/extents_status.c:
>
> static inline ext4_lblk_t ext4_es_end(struct extent_status *es)
> {
> BUG_ON(es->es_lblk + es->es_len < es->es_lblk);
> return es->es_lblk + es->es_len - 1;
> }
>
> So the patch is harmless, and I can see how it might fix what you were
> seeing --- but I'm a bit nervous that I can't reproduce it and the
> commit description claims that it reproduces easily; and we should
> have never allowed the entry to have gotten introduced into the
> extents status tree in the first place, and if it had been introduced,
> it should have been caught before it was returned by
> ext4_es_find_extent_range().
>
> Can you give more details about the reproducer; can you double check
> the test id, and how easily you can trigger the failure, and what is
> the hardware you used to run the test?

So, here's few more details that may clarify, and that I should have added
to the commit description:

When the test hangs, the test is blocked mounting the flakey device:

mount -t ext4 -o acl,user_xattr /dev/mapper/flakey-test /mnt/scratch

which will eventually call into ext4_ext_map_blocks(), triggering the bug.

Also, some more code instrumentation shows that after the call to
ext4_ext_find_hole(), the 'hole_start' will be set to '1' and 'len' to
'0xfffffffe'. This '0xfffffffe' value is a bit odd, but it comes from the
fact that, in ext4_ext_find_hole(), the call to
ext4_ext_next_allocated_block() will return EXT_MAX_BLOCKS and 'len' will
thus be set to 'EXT_MAX_BLOCKS - 1'.

Does this make sense?

Cheers,
--
Luis

2024-05-10 03:40:07

by Zhang Yi

[permalink] [raw]

Subject: Re: [PATCH v3 03/26] ext4: correct the hole length returned by ext4_map_blocks()

On 2024/5/10 1:23, Luis Henriques wrote:
> On Thu 09 May 2024 12:39:53 PM -04, Theodore Ts'o wrote;
>
>> On Thu, May 09, 2024 at 04:16:34PM +0100, Luis Henriques wrote:
>>>
>>> It's looks like it's easy to trigger an infinite loop here using fstest
>>> generic/039. If I understand it correctly (which doesn't happen as often
>>> as I'd like), this is due to an integer overflow in the 'if' condition,
>>> and should be fixed with the patch below.
>>
>> Thanks for the report. However, I can't reproduce the failure, and
>> looking at generic/039, I don't see how it could be relevant to the
>> code path in question. Generic/039 creates a test symlink with two
>> hard links in the same directory, syncs the file system, and then
>> removes one of the hard links, and then drops access to the block
>> device using dmflakey. So I don't see how the extent code would be
>> involved at all. Are you sure that you have the correct test listed?
>
> Yep, I just retested and it's definitely generic/039. I'm using a simple
> test environment, with virtme-ng.
>
>> Looking at the code in question in fs/ext4/extents.c:
>>
>> again:
>> ext4_es_find_extent_range(inode, &ext4_es_is_delayed, hole_start,
>> hole_start + len - 1, &es);
>> if (!es.es_len)
>> goto insert_hole;
>>
>> * There's a delalloc extent in the hole, handle it if the delalloc
>> * extent is in front of, behind and straddle the queried range.
>> */
>> - if (lblk >= es.es_lblk + es.es_len) {
>> + if (lblk >= ((__u64) es.es_lblk) + es.es_len) {
>> /*
>> * The delalloc extent is in front of the queried range,
>> * find again from the queried start block.
>> len -= lblk - hole_start;
>> hole_start = lblk;
>> goto again;
>>
>> lblk and es.es_lblk are both __u32. So the infinite loop is
>> presumably because es.es_lblk + es.es_len has overflowed. This should
>> never happen(tm), and in fact we have a test for this case which
>
> If I instrument the code, I can see that es.es_len is definitely set to
> EXT_MAX_BLOCKS, which will overflow.
>

Thanks for the report. After looking at the code, I think the root
cause of this issue is the variable es was not initialized on replaying
fast commit. ext4_es_find_extent_range() will return directly when
EXT4_FC_REPLAY flag is set, and then the es.len becomes stall.

I can always reproduce this issue on generic/039 with
MKFS_OPTIONS="-O fast_commit".

This uninitialization problem originally existed in the old
ext4_ext_put_gap_in_cache(), but it didn't trigger any real problem
since we never check and use extent cache when replaying fast commit.
So I suppose the correct fix would be to unconditionally initialize
the es variable.

Thanks,
Yi.

>> *should* have gotten tripped when ext4_es_find_extent_range() calls
>> __es_tree_search() in fs/ext4/extents_status.c:
>>
>> static inline ext4_lblk_t ext4_es_end(struct extent_status *es)
>> {
>> BUG_ON(es->es_lblk + es->es_len < es->es_lblk);
>> return es->es_lblk + es->es_len - 1;
>> }
>>
>> So the patch is harmless, and I can see how it might fix what you were
>> seeing --- but I'm a bit nervous that I can't reproduce it and the
>> commit description claims that it reproduces easily; and we should
>> have never allowed the entry to have gotten introduced into the
>> extents status tree in the first place, and if it had been introduced,
>> it should have been caught before it was returned by
>> ext4_es_find_extent_range().
>>
>> Can you give more details about the reproducer; can you double check
>> the test id, and how easily you can trigger the failure, and what is
>> the hardware you used to run the test?
>
> So, here's few more details that may clarify, and that I should have added
> to the commit description:
>
> When the test hangs, the test is blocked mounting the flakey device:
>
> mount -t ext4 -o acl,user_xattr /dev/mapper/flakey-test /mnt/scratch
>
> which will eventually call into ext4_ext_map_blocks(), triggering the bug.
>
> Also, some more code instrumentation shows that after the call to
> ext4_ext_find_hole(), the 'hole_start' will be set to '1' and 'len' to
> '0xfffffffe'. This '0xfffffffe' value is a bit odd, but it comes from the
> fact that, in ext4_ext_find_hole(), the call to
> ext4_ext_next_allocated_block() will return EXT_MAX_BLOCKS and 'len' will
> thus be set to 'EXT_MAX_BLOCKS - 1'.
>
> Does this make sense?
>
> Cheers,
>

2024-05-10 09:42:04

by Luis Henriques

[permalink] [raw]

Subject: Re: [PATCH v3 03/26] ext4: correct the hole length returned by ext4_map_blocks()

On Fri 10 May 2024 11:39:48 AM +08, Zhang Yi wrote;

> On 2024/5/10 1:23, Luis Henriques wrote:
>> On Thu 09 May 2024 12:39:53 PM -04, Theodore Ts'o wrote;
>>
>>> On Thu, May 09, 2024 at 04:16:34PM +0100, Luis Henriques wrote:
>>>>
>>>> It's looks like it's easy to trigger an infinite loop here using fstest
>>>> generic/039. If I understand it correctly (which doesn't happen as often
>>>> as I'd like), this is due to an integer overflow in the 'if' condition,
>>>> and should be fixed with the patch below.
>>>
>>> Thanks for the report. However, I can't reproduce the failure, and
>>> looking at generic/039, I don't see how it could be relevant to the
>>> code path in question. Generic/039 creates a test symlink with two
>>> hard links in the same directory, syncs the file system, and then
>>> removes one of the hard links, and then drops access to the block
>>> device using dmflakey. So I don't see how the extent code would be
>>> involved at all. Are you sure that you have the correct test listed?
>>
>> Yep, I just retested and it's definitely generic/039. I'm using a simple
>> test environment, with virtme-ng.
>>
>>> Looking at the code in question in fs/ext4/extents.c:
>>>
>>> again:
>>> ext4_es_find_extent_range(inode, &ext4_es_is_delayed, hole_start,
>>> hole_start + len - 1, &es);
>>> if (!es.es_len)
>>> goto insert_hole;
>>>
>>> * There's a delalloc extent in the hole, handle it if the delalloc
>>> * extent is in front of, behind and straddle the queried range.
>>> */
>>> - if (lblk >= es.es_lblk + es.es_len) {
>>> + if (lblk >= ((__u64) es.es_lblk) + es.es_len) {
>>> /*
>>> * The delalloc extent is in front of the queried range,
>>> * find again from the queried start block.
>>> len -= lblk - hole_start;
>>> hole_start = lblk;
>>> goto again;
>>>
>>> lblk and es.es_lblk are both __u32. So the infinite loop is
>>> presumably because es.es_lblk + es.es_len has overflowed. This should
>>> never happen(tm), and in fact we have a test for this case which
>>
>> If I instrument the code, I can see that es.es_len is definitely set to
>> EXT_MAX_BLOCKS, which will overflow.
>>
>
> Thanks for the report. After looking at the code, I think the root
> cause of this issue is the variable es was not initialized on replaying
> fast commit. ext4_es_find_extent_range() will return directly when
> EXT4_FC_REPLAY flag is set, and then the es.len becomes stall.
>
> I can always reproduce this issue on generic/039 with
> MKFS_OPTIONS="-O fast_commit".
>
> This uninitialization problem originally existed in the old
> ext4_ext_put_gap_in_cache(), but it didn't trigger any real problem
> since we never check and use extent cache when replaying fast commit.
> So I suppose the correct fix would be to unconditionally initialize
> the es variable.

Oh, you're absolutely right -- the extent_status 'es' struct isn't being
initialized in that case. I totally failed to see that. And yes, I also
failed to mention I had 'fast_commit' feature enabled, sorry!

Thanks a lot for figuring this out, Yi. I'm looking at this code and
trying to understand if it would be safe to call __es_find_extent_range()
when EXT4_FC_REPLAY is in progress. Probably not, and probably better to
simply do:

es->es_lblk = es->es_len = es->es_pblk = 0;

in that case. I'll send out a patch later today.

Cheers,
--
Luis