Hi,
Syzbot found an issue [1] in fallocate() that looks to me like a loss of precision.
The C reproducer [2] calls fallocate() and passes the size 0xffeffeff000ul, and
offset 0x1000000ul, which is then used to calculate the first_block and
stop_block using ext4_lblk_t type (u32). I think this gets the MSB of the size
truncated and leads to invalid calculations, and eventually his BUG() in
https://elixir.bootlin.com/linux/v5.16.11/source/fs/ext4/indirect.c#L1244
The issue can be reproduced on 5.17.0-rc5, but I don't think it's a new
regression. I spent some time debugging it, but could spot anything obvious.
Can someone have a look please.
[1] https://syzkaller.appspot.com/bug?id=b80bd9cf348aac724a4f4dff251800106d721331
[2] https://syzkaller.appspot.com/text?tag=ReproC&x=14ba0238700000
--
Thanks,
Tadeusz
On 22/02/24 05:12PM, Tadeusz Struk wrote:
>
> Hi,
> Syzbot found an issue [1] in fallocate() that looks to me like a loss of precision.
> The C reproducer [2] calls fallocate() and passes the size 0xffeffeff000ul, and
> offset 0x1000000ul, which is then used to calculate the first_block and
> stop_block using ext4_lblk_t type (u32). I think this gets the MSB of the size
> truncated and leads to invalid calculations, and eventually his BUG() in
> https://elixir.bootlin.com/linux/v5.16.11/source/fs/ext4/indirect.c#L1244
> The issue can be reproduced on 5.17.0-rc5, but I don't think it's a new
> regression. I spent some time debugging it, but could spot anything obvious.
> Can someone have a look please.
I did look into it a little. Below are some of my observations.
If nobody gets to it before me, I can spend sometime next week to verify it's
correctness.
So I think based on the warning log before kernel BUG_ON() [1], it looks like it
has the problem with ext4_block_to_path() calculation with end offset.
It seems it is not fitting into triple block ptrs calculation.
<log>
======
EXT4-fs warning (device sda1): ext4_block_to_path:107: block 1074791436 > max in inode 1137
But ideally it should fit in (right?) since we do make sure if end >= max_block;
then we set it to end = max_block.
Then looking at how we calculate max_block is below
max_block = (EXT4_SB(inode->i_sb)->s_bitmap_maxbytes + blocksize-1) >> EXT4_BLOCK_SIZE_BITS(inode->i_sb);
So looking closely I think there _could be_ a off by 1 calculation error in
above. So I think above calculation is for max_len and not max_end_block.
I think max_block should be max_end_block which should be this -
max_end_block = ((EXT4_SB(inode->i_sb)->s_bitmap_maxbytes + blocksize-1) >> EXT4_BLOCK_SIZE_BITS(inode->i_sb))-1;
But I haven't yet verified it completely. This is just my initial thought.
Maybe others can confirm too. Or maybe there is more then one problem.
But somehow above looks more likely to me.
I can verify this sometime next week when I get back to it.
But thanks for reporting the issue :)
-ritesh
[1] https://syzkaller.appspot.com/bug?id=b80bd9cf348aac724a4f4dff251800106d721331
>
>
> [1] https://syzkaller.appspot.com/bug?id=b80bd9cf348aac724a4f4dff251800106d721331
> [2] https://syzkaller.appspot.com/text?tag=ReproC&x=14ba0238700000
>
> --
> Thanks,
> Tadeusz
On 2/25/22 09:10, Ritesh Harjani wrote:
> On 22/02/24 05:12PM, Tadeusz Struk wrote:
>>
>> Hi,
>> Syzbot found an issue [1] in fallocate() that looks to me like a loss of precision.
>> The C reproducer [2] calls fallocate() and passes the size 0xffeffeff000ul, and
>> offset 0x1000000ul, which is then used to calculate the first_block and
>> stop_block using ext4_lblk_t type (u32). I think this gets the MSB of the size
>> truncated and leads to invalid calculations, and eventually his BUG() in
>> https://elixir.bootlin.com/linux/v5.16.11/source/fs/ext4/indirect.c#L1244
>> The issue can be reproduced on 5.17.0-rc5, but I don't think it's a new
>> regression. I spent some time debugging it, but could spot anything obvious.
>> Can someone have a look please.
>
> I did look into it a little. Below are some of my observations.
> If nobody gets to it before me, I can spend sometime next week to verify it's
> correctness.
>
> So I think based on the warning log before kernel BUG_ON() [1], it looks like it
> has the problem with ext4_block_to_path() calculation with end offset.
> It seems it is not fitting into triple block ptrs calculation.
>
> <log>
> ======
> EXT4-fs warning (device sda1): ext4_block_to_path:107: block 1074791436 > max in inode 1137
>
> But ideally it should fit in (right?) since we do make sure if end >= max_block;
> then we set it to end = max_block.
>
> Then looking at how we calculate max_block is below
> max_block = (EXT4_SB(inode->i_sb)->s_bitmap_maxbytes + blocksize-1) >> EXT4_BLOCK_SIZE_BITS(inode->i_sb);
>
> So looking closely I think there _could be_ a off by 1 calculation error in
> above. So I think above calculation is for max_len and not max_end_block.
>
> I think max_block should be max_end_block which should be this -
> max_end_block = ((EXT4_SB(inode->i_sb)->s_bitmap_maxbytes + blocksize-1) >> EXT4_BLOCK_SIZE_BITS(inode->i_sb))-1;
>
>
> But I haven't yet verified it completely. This is just my initial thought.
> Maybe others can confirm too. Or maybe there is more then one problem.
> But somehow above looks more likely to me.
I tried the above and it does help.
>
> I can verify this sometime next week when I get back to it.
> But thanks for reporting the issue :)
Next week is perfectly fine. Thanks for looking into it.
--
Thanks,
Tadeusz
On 2/25/22 11:19, Tadeusz Struk wrote:
>> I can verify this sometime next week when I get back to it.
>> But thanks for reporting the issue :)
>
> Next week is perfectly fine. Thanks for looking into it.
Hi Ritesh,
Did you have chance to look into this?
If you want I can send a patch that fixes the off by 1 calculation error.
--
Thanks,
Tadeusz
On 22/03/02 03:14PM, Tadeusz Struk wrote:
> On 2/25/22 11:19, Tadeusz Struk wrote:
> > > I?can?verify?this?sometime?next?week?when?I?get?back?to?it.
> > > But?thanks?for?reporting?the?issue :)
> >
> > Next?week?is?perfectly?fine.?Thanks?for?looking?into?it.
>
> Hi Ritesh,
> Did you have chance to look into this?
> If you want I can send a patch that fixes the off by 1 calculation error.
Hi Tadeusz,
I wanted to look at that path a bit more before sending that patch.
Last analysis by me was more of a cursory look at the kernel dmesg log which you
had shared.
In case if you want to pursue that issue and spend time on it, then feel free to
do it.
I got pulled into number of other things last week and this week. So didn't get
a chance to look into it yet. I hope to look into this soon (if no one else
picks up :))
-ritesh
>
> --
> Thanks,
> Tadeusz
On 3/3/22 06:56, Ritesh Harjani wrote:
> On 22/03/02 03:14PM, Tadeusz Struk wrote:
>> On 2/25/22 11:19, Tadeusz Struk wrote:
>>>> I can verify this sometime next week when I get back to it.
>>>> But thanks for reporting the issue :)
>>>
>>> Next week is perfectly fine. Thanks for looking into it.
>>
>> Hi Ritesh,
>> Did you have chance to look into this?
>> If you want I can send a patch that fixes the off by 1 calculation error.
>
> Hi Tadeusz,
>
> I wanted to look at that path a bit more before sending that patch.
> Last analysis by me was more of a cursory look at the kernel dmesg log which you
> had shared.
>
> In case if you want to pursue that issue and spend time on it, then feel free to
> do it.
>
> I got pulled into number of other things last week and this week. So didn't get
> a chance to look into it yet. I hope to look into this soon (if no one else
> picks up :))
I'm not familiar with the internals of ext4 implementation so I would rather
have someone who knows it look at it.
I will wait till the end of this week and if there will be fix for it by then
I will send a patch fixing the off by 1 issue to get the ball rolling.
--
Thanks,
Tadeusz
On 22/03/03 07:37AM, Tadeusz Struk wrote:
> On 3/3/22 06:56, Ritesh Harjani wrote:
> > On 22/03/02 03:14PM, Tadeusz Struk wrote:
> > > On 2/25/22 11:19, Tadeusz Struk wrote:
> > > > > I?can?verify?this?sometime?next?week?when?I?get?back?to?it.
> > > > > But?thanks?for?reporting?the?issue :)
> > > >
> > > > Next?week?is?perfectly?fine.?Thanks?for?looking?into?it.
> > >
> > > Hi Ritesh,
> > > Did you have chance to look into this?
> > > If you want I can send a patch that fixes the off by 1 calculation error.
> >
> > Hi Tadeusz,
> >
> > I wanted to look at that path a bit more before sending that patch.
> > Last analysis by me was more of a cursory look at the kernel dmesg log which you
> > had shared.
> >
> > In case if you want to pursue that issue and spend time on it, then feel free to
> > do it.
> >
> > I got pulled into number of other things last week and this week. So didn't get
> > a chance to look into it yet. I hope to look into this soon (if no one else
> > picks up :))
>
> I'm not familiar with the internals of ext4 implementation so I would rather
> have someone who knows it look at it.
No problem. I am willing to look into this anyways.
btw, this issue could be seen easily with below cmd on non-extent ext4 FS.
sudo xfs_io -f -c "truncate 0x4010040c000" -c "fsync" -c "fpunch 0x1000000 0xffefffff000" testfile
-ritesh
On 22/03/04 01:38AM, Ritesh Harjani wrote:
> On 22/03/03 07:37AM, Tadeusz Struk wrote:
> > On 3/3/22 06:56, Ritesh Harjani wrote:
> > > On 22/03/02 03:14PM, Tadeusz Struk wrote:
> > > > On 2/25/22 11:19, Tadeusz Struk wrote:
> > > > > > I?can?verify?this?sometime?next?week?when?I?get?back?to?it.
> > > > > > But?thanks?for?reporting?the?issue :)
> > > > >
> > > > > Next?week?is?perfectly?fine.?Thanks?for?looking?into?it.
> > > >
> > > > Hi Ritesh,
> > > > Did you have chance to look into this?
> > > > If you want I can send a patch that fixes the off by 1 calculation error.
> > >
> > > Hi Tadeusz,
> > >
> > > I wanted to look at that path a bit more before sending that patch.
> > > Last analysis by me was more of a cursory look at the kernel dmesg log which you
> > > had shared.
> > >
> > > In case if you want to pursue that issue and spend time on it, then feel free to
> > > do it.
> > >
> > > I got pulled into number of other things last week and this week. So didn't get
> > > a chance to look into it yet. I hope to look into this soon (if no one else
> > > picks up :))
> >
> > I'm not familiar with the internals of ext4 implementation so I would rather
> > have someone who knows it look at it.
>
> No problem. I am willing to look into this anyways.
> btw, this issue could be seen easily with below cmd on non-extent ext4 FS.
>
> sudo xfs_io -f -c "truncate 0x4010040c000" -c "fsync" -c "fpunch 0x1000000 0xffefffff000" testfile
Just FYI - The change which we discussed to fix the max_block to max_end_block, is not correct.
Since it will still leave 1 block at the end after punch operation, if the file has s_bitmap_maxbytes size.
This is due to the fact that, "end" is expected to be 1 block after the end of last block.
Will try look into it to see how can we fix this.
1210 /**
1211 * ext4_ind_remove_space - remove space from the range
1212 * @handle: JBD handle for this transaction
1213 * @inode: inode we are dealing with
1214 * @start: First block to remove
1215 * @end: One block after the last block to remove (exclusive)
1216 *
1217 * Free the blocks in the defined range (end is exclusive endpoint of
1218 * range). This is used by ext4_punch_hole().
1219 */
1220 int ext4_ind_remove_space(handle_t *handle, struct inode *inode,
1221 ext4_lblk_t start, ext4_lblk_t end)
-ritesh
On 3/6/22 21:45, Ritesh Harjani wrote:
> On 22/03/04 01:38AM, Ritesh Harjani wrote:
>> On 22/03/03 07:37AM, Tadeusz Struk wrote:
>>> On 3/3/22 06:56, Ritesh Harjani wrote:
>>>> On 22/03/02 03:14PM, Tadeusz Struk wrote:
>>>>> On 2/25/22 11:19, Tadeusz Struk wrote:
>>>>>>> I can verify this sometime next week when I get back to it.
>>>>>>> But thanks for reporting the issue :)
>>>>>>
>>>>>> Next week is perfectly fine. Thanks for looking into it.
>>>>>
>>>>> Hi Ritesh,
>>>>> Did you have chance to look into this?
>>>>> If you want I can send a patch that fixes the off by 1 calculation error.
>>>>
>>>> Hi Tadeusz,
>>>>
>>>> I wanted to look at that path a bit more before sending that patch.
>>>> Last analysis by me was more of a cursory look at the kernel dmesg log which you
>>>> had shared.
>>>>
>>>> In case if you want to pursue that issue and spend time on it, then feel free to
>>>> do it.
>>>>
>>>> I got pulled into number of other things last week and this week. So didn't get
>>>> a chance to look into it yet. I hope to look into this soon (if no one else
>>>> picks up :))
>>>
>>> I'm not familiar with the internals of ext4 implementation so I would rather
>>> have someone who knows it look at it.
>>
>> No problem. I am willing to look into this anyways.
>> btw, this issue could be seen easily with below cmd on non-extent ext4 FS.
>>
>> sudo xfs_io -f -c "truncate 0x4010040c000" -c "fsync" -c "fpunch 0x1000000 0xffefffff000" testfile
>
> Just FYI - The change which we discussed to fix the max_block to max_end_block, is not correct.
> Since it will still leave 1 block at the end after punch operation, if the file has s_bitmap_maxbytes size.
> This is due to the fact that, "end" is expected to be 1 block after the end of last block.
>
> Will try look into it to see how can we fix this.
>
> 1210 /**
> 1211 * ext4_ind_remove_space - remove space from the range
> 1212 * @handle: JBD handle for this transaction
> 1213 * @inode: inode we are dealing with
> 1214 * @start: First block to remove
> 1215 * @end: One block after the last block to remove (exclusive)
> 1216 *
> 1217 * Free the blocks in the defined range (end is exclusive endpoint of
> 1218 * range). This is used by ext4_punch_hole().
> 1219 */
> 1220 int ext4_ind_remove_space(handle_t *handle, struct inode *inode,
> 1221 ext4_lblk_t start, ext4_lblk_t end)
>
Hi Ritesh,
Thanks for update. Let me know if I can be of any help to you.
--
Thanks,
Tadeusz
On 3/6/22 21:45, Ritesh Harjani wrote:
> Just FYI - The change which we discussed to fix the max_block to max_end_block, is not correct.
> Since it will still leave 1 block at the end after punch operation, if the file has s_bitmap_maxbytes size.
> This is due to the fact that, "end" is expected to be 1 block after the end of last block.
>
> Will try look into it to see how can we fix this.
>
> 1210 /**
> 1211 * ext4_ind_remove_space - remove space from the range
> 1212 * @handle: JBD handle for this transaction
> 1213 * @inode: inode we are dealing with
> 1214 * @start: First block to remove
> 1215 * @end: One block after the last block to remove (exclusive)
> 1216 *
> 1217 * Free the blocks in the defined range (end is exclusive endpoint of
> 1218 * range). This is used by ext4_punch_hole().
> 1219 */
> 1220 int ext4_ind_remove_space(handle_t *handle, struct inode *inode,
> 1221 ext4_lblk_t start, ext4_lblk_t end)
>
> -ritesh
Hi Ritesh,
I tried to bisect it and went as far back as 4.4 and the issue still triggers there.
I couldn't build anything older than that with my compiler, but I suspect that
the issue exists even before 3.0 where ext4_fallocate() has been introduced.
--
Thanks,
Tadeusz
On 3/6/22 21:45, Ritesh Harjani wrote:
> Just FYI - The change which we discussed to fix the max_block to max_end_block, is not correct.
> Since it will still leave 1 block at the end after punch operation, if the file has s_bitmap_maxbytes size.
> This is due to the fact that, "end" is expected to be 1 block after the end of last block.
>
> Will try look into it to see how can we fix this.
>
> 1210 /**
> 1211 * ext4_ind_remove_space - remove space from the range
> 1212 * @handle: JBD handle for this transaction
> 1213 * @inode: inode we are dealing with
> 1214 * @start: First block to remove
> 1215 * @end: One block after the last block to remove (exclusive)
So in that case, in ext4_punch_hole(), what should be done is:
if offset + length > sbi->s_bitmap_maxbytes - inode->i_sb->s_blocksize
it either needs to update the length to:
length = sbi->s_bitmap_maxbytes - inode->i_sb->s_blocksize - offset;
or -ENOSPC should be returned, which would be more consistent with the `man 2
fallocate`:
"ERRORS:
...
ENOSPC There is not enough space left on the device containing the file
referred to by fd."
Please let me know if my reckoning is correct, and if so which option you
prefer and I will follow with a patch.
--
Thanks,
Tadeusz