2.6.32-longterm review patch. If anyone has any objections, please let me know.
------------------
From: Brian Foster <[email protected]>
commit 97795d2a5b8d3c8dc4365d4bd3404191840453ba upstream.
If we hit a condition where we have allocated metadata blocks that
were not appropriately reserved, we risk underflow of
ei->i_reserved_meta_blocks. In turn, this can throw
sbi->s_dirtyclusters_counter significantly out of whack and undermine
the nondelalloc fallback logic in ext4_nonda_switch(). Warn if this
occurs and set i_allocated_meta_blocks to avoid this problem.
This condition is reproduced by xfstests 270 against ext2 with
delalloc enabled:
Mar 28 08:58:02 localhost kernel: [ 171.526344] EXT4-fs (loop1): delayed block allocation failed for inode 14 at logical offset 64486 with max blocks 64 with error -28
Mar 28 08:58:02 localhost kernel: [ 171.526346] EXT4-fs (loop1): This should not happen!! Data will be lost
270 ultimately fails with an inconsistent filesystem and requires an
fsck to repair. The cause of the error is an underflow in
ext4_da_update_reserve_space() due to an unreserved meta block
allocation.
Signed-off-by: Brian Foster <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Willy Tarreau <[email protected]>
---
fs/ext4/inode.c | 9 +++++++++
1 files changed, 9 insertions(+), 0 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 72ba88f..efe6363 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1112,6 +1112,15 @@ void ext4_da_update_reserve_space(struct inode *inode,
used = ei->i_reserved_data_blocks;
}
+ if (unlikely(ei->i_allocated_meta_blocks > ei->i_reserved_meta_blocks)) {
+ ext4_msg(inode->i_sb, KERN_NOTICE, "%s: ino %lu, allocated %d "
+ "with only %d reserved metadata blocks\n", __func__,
+ inode->i_ino, ei->i_allocated_meta_blocks,
+ ei->i_reserved_meta_blocks);
+ WARN_ON(1);
+ ei->i_allocated_meta_blocks = ei->i_reserved_meta_blocks;
+ }
+
/* Update per-inode reservations */
ei->i_reserved_data_blocks -= used;
used += ei->i_allocated_meta_blocks;
--
1.7.2.1.45.g54fbc
On Tue, Oct 02, 2012 at 12:53:47AM +0200, Willy Tarreau wrote:
> 2.6.32-longterm review patch. If anyone has any objections, please let me know.
>
> ------------------
>
> From: Brian Foster <[email protected]>
>
> commit 97795d2a5b8d3c8dc4365d4bd3404191840453ba upstream.
>
> If we hit a condition where we have allocated metadata blocks that
> were not appropriately reserved, we risk underflow of
> ei->i_reserved_meta_blocks. In turn, this can throw
> sbi->s_dirtyclusters_counter significantly out of whack and undermine
> the nondelalloc fallback logic in ext4_nonda_switch(). Warn if this
> occurs and set i_allocated_meta_blocks to avoid this problem.
>
> This condition is reproduced by xfstests 270 against ext2 with
> delalloc enabled:
>
> Mar 28 08:58:02 localhost kernel: [ 171.526344] EXT4-fs (loop1): delayed block allocation failed for inode 14 at logical offset 64486 with max blocks 64 with error -28
> Mar 28 08:58:02 localhost kernel: [ 171.526346] EXT4-fs (loop1): This should not happen!! Data will be lost
>
> 270 ultimately fails with an inconsistent filesystem and requires an
> fsck to repair. The cause of the error is an underflow in
> ext4_da_update_reserve_space() due to an unreserved meta block
> allocation.
[...]
> + if (unlikely(ei->i_allocated_meta_blocks > ei->i_reserved_meta_blocks)) {
> + ext4_msg(inode->i_sb, KERN_NOTICE, "%s: ino %lu, allocated %d "
> + "with only %d reserved metadata blocks\n", __func__,
> + inode->i_ino, ei->i_allocated_meta_blocks,
> + ei->i_reserved_meta_blocks);
> + WARN_ON(1);
> + ei->i_allocated_meta_blocks = ei->i_reserved_meta_blocks;
> + }
[...]
This seems to be working around a bug elsewhere. Has the underlying
bug been fixed in mainline yet?
Ben.
--
Ben Hutchings
We get into the habit of living before acquiring the habit of thinking.
- Albert Camus
On 10/04/2012 05:55 PM, Ben Hutchings wrote:
> On Tue, Oct 02, 2012 at 12:53:47AM +0200, Willy Tarreau wrote:
>> 2.6.32-longterm review patch. If anyone has any objections, please let me know.
>>
>> ------------------
>>
>> From: Brian Foster <[email protected]>
>>
>> commit 97795d2a5b8d3c8dc4365d4bd3404191840453ba upstream.
>>
>> If we hit a condition where we have allocated metadata blocks that
>> were not appropriately reserved, we risk underflow of
>> ei->i_reserved_meta_blocks. In turn, this can throw
>> sbi->s_dirtyclusters_counter significantly out of whack and undermine
>> the nondelalloc fallback logic in ext4_nonda_switch(). Warn if this
>> occurs and set i_allocated_meta_blocks to avoid this problem.
>>
>> This condition is reproduced by xfstests 270 against ext2 with
>> delalloc enabled:
>>
>> Mar 28 08:58:02 localhost kernel: [ 171.526344] EXT4-fs (loop1): delayed block allocation failed for inode 14 at logical offset 64486 with max blocks 64 with error -28
>> Mar 28 08:58:02 localhost kernel: [ 171.526346] EXT4-fs (loop1): This should not happen!! Data will be lost
>>
>> 270 ultimately fails with an inconsistent filesystem and requires an
>> fsck to repair. The cause of the error is an underflow in
>> ext4_da_update_reserve_space() due to an unreserved meta block
>> allocation.
> [...]
>> + if (unlikely(ei->i_allocated_meta_blocks > ei->i_reserved_meta_blocks)) {
>> + ext4_msg(inode->i_sb, KERN_NOTICE, "%s: ino %lu, allocated %d "
>> + "with only %d reserved metadata blocks\n", __func__,
>> + inode->i_ino, ei->i_allocated_meta_blocks,
>> + ei->i_reserved_meta_blocks);
>> + WARN_ON(1);
>> + ei->i_allocated_meta_blocks = ei->i_reserved_meta_blocks;
>> + }
> [...]
>
> This seems to be working around a bug elsewhere. Has the underlying
> bug been fixed in mainline yet?
>
Yes, the bug was fixed in:
03179fe92318e7934c180d96f12eff2cb36ef7b6
ext4: undo ext4_calc_metadata_amount if we fail to claim space
Brian
> Ben.
>
On Fri, Oct 05, 2012 at 07:59:11AM -0400, Brian Foster wrote:
> On 10/04/2012 05:55 PM, Ben Hutchings wrote:
> > On Tue, Oct 02, 2012 at 12:53:47AM +0200, Willy Tarreau wrote:
> >> 2.6.32-longterm review patch. If anyone has any objections, please let me know.
> >>
> >> ------------------
> >>
> >> From: Brian Foster <[email protected]>
> >>
> >> commit 97795d2a5b8d3c8dc4365d4bd3404191840453ba upstream.
> >>
> >> If we hit a condition where we have allocated metadata blocks that
> >> were not appropriately reserved, we risk underflow of
> >> ei->i_reserved_meta_blocks. In turn, this can throw
> >> sbi->s_dirtyclusters_counter significantly out of whack and undermine
> >> the nondelalloc fallback logic in ext4_nonda_switch(). Warn if this
> >> occurs and set i_allocated_meta_blocks to avoid this problem.
> >>
> >> This condition is reproduced by xfstests 270 against ext2 with
> >> delalloc enabled:
> >>
> >> Mar 28 08:58:02 localhost kernel: [ 171.526344] EXT4-fs (loop1): delayed block allocation failed for inode 14 at logical offset 64486 with max blocks 64 with error -28
> >> Mar 28 08:58:02 localhost kernel: [ 171.526346] EXT4-fs (loop1): This should not happen!! Data will be lost
> >>
> >> 270 ultimately fails with an inconsistent filesystem and requires an
> >> fsck to repair. The cause of the error is an underflow in
> >> ext4_da_update_reserve_space() due to an unreserved meta block
> >> allocation.
> > [...]
> >> + if (unlikely(ei->i_allocated_meta_blocks > ei->i_reserved_meta_blocks)) {
> >> + ext4_msg(inode->i_sb, KERN_NOTICE, "%s: ino %lu, allocated %d "
> >> + "with only %d reserved metadata blocks\n", __func__,
> >> + inode->i_ino, ei->i_allocated_meta_blocks,
> >> + ei->i_reserved_meta_blocks);
> >> + WARN_ON(1);
> >> + ei->i_allocated_meta_blocks = ei->i_reserved_meta_blocks;
> >> + }
> > [...]
> >
> > This seems to be working around a bug elsewhere. Has the underlying
> > bug been fixed in mainline yet?
> >
>
> Yes, the bug was fixed in:
>
> 03179fe92318e7934c180d96f12eff2cb36ef7b6
> ext4: undo ext4_calc_metadata_amount if we fail to claim space
So should we merge this one instead/too ?
Willy
On 10/05/2012 08:37 AM, Willy Tarreau wrote:
> On Fri, Oct 05, 2012 at 07:59:11AM -0400, Brian Foster wrote:
>> On 10/04/2012 05:55 PM, Ben Hutchings wrote:
>>> On Tue, Oct 02, 2012 at 12:53:47AM +0200, Willy Tarreau wrote:
>>>> 2.6.32-longterm review patch. If anyone has any objections, please let me know.
>>>>
>>>> ------------------
>>>>
>>>> From: Brian Foster <[email protected]>
>>>>
>>>> commit 97795d2a5b8d3c8dc4365d4bd3404191840453ba upstream.
>>>>
>>>> If we hit a condition where we have allocated metadata blocks that
>>>> were not appropriately reserved, we risk underflow of
>>>> ei->i_reserved_meta_blocks. In turn, this can throw
>>>> sbi->s_dirtyclusters_counter significantly out of whack and undermine
>>>> the nondelalloc fallback logic in ext4_nonda_switch(). Warn if this
>>>> occurs and set i_allocated_meta_blocks to avoid this problem.
>>>>
>>>> This condition is reproduced by xfstests 270 against ext2 with
>>>> delalloc enabled:
>>>>
>>>> Mar 28 08:58:02 localhost kernel: [ 171.526344] EXT4-fs (loop1): delayed block allocation failed for inode 14 at logical offset 64486 with max blocks 64 with error -28
>>>> Mar 28 08:58:02 localhost kernel: [ 171.526346] EXT4-fs (loop1): This should not happen!! Data will be lost
>>>>
>>>> 270 ultimately fails with an inconsistent filesystem and requires an
>>>> fsck to repair. The cause of the error is an underflow in
>>>> ext4_da_update_reserve_space() due to an unreserved meta block
>>>> allocation.
>>> [...]
>>>> + if (unlikely(ei->i_allocated_meta_blocks > ei->i_reserved_meta_blocks)) {
>>>> + ext4_msg(inode->i_sb, KERN_NOTICE, "%s: ino %lu, allocated %d "
>>>> + "with only %d reserved metadata blocks\n", __func__,
>>>> + inode->i_ino, ei->i_allocated_meta_blocks,
>>>> + ei->i_reserved_meta_blocks);
>>>> + WARN_ON(1);
>>>> + ei->i_allocated_meta_blocks = ei->i_reserved_meta_blocks;
>>>> + }
>>> [...]
>>>
>>> This seems to be working around a bug elsewhere. Has the underlying
>>> bug been fixed in mainline yet?
>>>
>>
>> Yes, the bug was fixed in:
>>
>> 03179fe92318e7934c180d96f12eff2cb36ef7b6
>> ext4: undo ext4_calc_metadata_amount if we fail to claim space
>
> So should we merge this one instead/too ?
>
>From the perspective of the bug, I think you would want both patches. I
should probably defer to Ted if he proposed this latter change for stable...
Brian
> Willy
>
On Fri, 2012-10-05 at 07:59 -0400, Brian Foster wrote:
> On 10/04/2012 05:55 PM, Ben Hutchings wrote:
> > On Tue, Oct 02, 2012 at 12:53:47AM +0200, Willy Tarreau wrote:
> >> 2.6.32-longterm review patch. If anyone has any objections, please let me know.
> >>
> >> ------------------
> >>
> >> From: Brian Foster <[email protected]>
> >>
> >> commit 97795d2a5b8d3c8dc4365d4bd3404191840453ba upstream.
> >>
> >> If we hit a condition where we have allocated metadata blocks that
> >> were not appropriately reserved, we risk underflow of
> >> ei->i_reserved_meta_blocks. In turn, this can throw
> >> sbi->s_dirtyclusters_counter significantly out of whack and undermine
> >> the nondelalloc fallback logic in ext4_nonda_switch(). Warn if this
> >> occurs and set i_allocated_meta_blocks to avoid this problem.
> >>
> >> This condition is reproduced by xfstests 270 against ext2 with
> >> delalloc enabled:
> >>
> >> Mar 28 08:58:02 localhost kernel: [ 171.526344] EXT4-fs (loop1): delayed block allocation failed for inode 14 at logical offset 64486 with max blocks 64 with error -28
> >> Mar 28 08:58:02 localhost kernel: [ 171.526346] EXT4-fs (loop1): This should not happen!! Data will be lost
> >>
> >> 270 ultimately fails with an inconsistent filesystem and requires an
> >> fsck to repair. The cause of the error is an underflow in
> >> ext4_da_update_reserve_space() due to an unreserved meta block
> >> allocation.
> > [...]
> >> + if (unlikely(ei->i_allocated_meta_blocks > ei->i_reserved_meta_blocks)) {
> >> + ext4_msg(inode->i_sb, KERN_NOTICE, "%s: ino %lu, allocated %d "
> >> + "with only %d reserved metadata blocks\n", __func__,
> >> + inode->i_ino, ei->i_allocated_meta_blocks,
> >> + ei->i_reserved_meta_blocks);
> >> + WARN_ON(1);
> >> + ei->i_allocated_meta_blocks = ei->i_reserved_meta_blocks;
> >> + }
> > [...]
> >
> > This seems to be working around a bug elsewhere. Has the underlying
> > bug been fixed in mainline yet?
> >
>
> Yes, the bug was fixed in:
>
> 03179fe92318e7934c180d96f12eff2cb36ef7b6
> ext4: undo ext4_calc_metadata_amount if we fail to claim space
OK, and that's been applied to stable as:
3.2: d9af293 ext4: undo ext4_calc_metadata_amount if we fail to claim space
3.4: c0ce1fd ext4: undo ext4_calc_metadata_amount if we fail to claim space
3.5: 564dfa3 ext4: undo ext4_calc_metadata_amount if we fail to claim space
Presumably it will need some backporting for older versions.
Ben.
--
Ben Hutchings
You can't have everything. Where would you put it?
On Sun, Oct 07, 2012 at 02:47:22AM +0100, Ben Hutchings wrote:
(...)
> > > This seems to be working around a bug elsewhere. Has the underlying
> > > bug been fixed in mainline yet?
> > >
> >
> > Yes, the bug was fixed in:
> >
> > 03179fe92318e7934c180d96f12eff2cb36ef7b6
> > ext4: undo ext4_calc_metadata_amount if we fail to claim space
>
> OK, and that's been applied to stable as:
>
> 3.2: d9af293 ext4: undo ext4_calc_metadata_amount if we fail to claim space
> 3.4: c0ce1fd ext4: undo ext4_calc_metadata_amount if we fail to claim space
> 3.5: 564dfa3 ext4: undo ext4_calc_metadata_amount if we fail to claim space
>
> Presumably it will need some backporting for older versions.
OK. I have checked the code, and it changed significantly since. I can
still see the logic there, but function names and calculations differ,
so I'd rather defer this patch for next version.
Willy