2008-03-25 18:56:22

by Aneesh Kumar K.V

[permalink] [raw]
Subject: [RFC PATCH] ext2: Retry block allocation if new blocks are allocated from system zone.

If the block allocator gets blocks out of system zone ext2 calls
ext2_error. But if the file system is mounted with errors=continue
retry block allocation. We need to mark the system zone
blocks as in use to make sure retry don't pick them again

System zone is the block range mapping block bitmap, inode bitmap
and inode table.

Signed-off-by: Aneesh Kumar K.V <[email protected]>
---
fs/ext2/balloc.c | 16 +++++++++++-----
1 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/fs/ext2/balloc.c b/fs/ext2/balloc.c
index e7b2baf..501399e 100644
--- a/fs/ext2/balloc.c
+++ b/fs/ext2/balloc.c
@@ -149,11 +149,12 @@ read_block_bitmap(struct super_block *sb, unsigned int block_group)
block_group, le32_to_cpu(desc->bg_block_bitmap));
return NULL;
}
- if (!ext2_valid_block_bitmap(sb, desc, block_group, bh)) {
- brelse(bh);
- return NULL;
- }

+ ext2_valid_block_bitmap(sb, desc, block_group, bh);
+ /*
+ * file system mounted not to panic on error,
+ * continue with corrput bitmap
+ */
return bh;
}

@@ -1381,7 +1382,12 @@ allocated:
"Allocating block in system zone - "
"blocks from "E2FSBLK", length %lu",
ret_block, num);
- goto out;
+ /*
+ * ext2_try_to_allocate marked the blocks we allocated
+ * as in use. So we may want to selectively
+ * mark some of the blocks as free
+ */
+ goto retry_alloc;
}

performed_allocation = 1;
--
1.5.5.rc0.16.g02b00.dirty



2008-03-25 18:56:23

by Aneesh Kumar K.V

[permalink] [raw]
Subject: [RFC PATCH] ext4: Retry block allocation if new blocks are allocated from system zone.

If the block allocator gets blocks out of system zone ext4 calls
ext4_error. But if the file system is mounted with errors=continue
retry block allocation. We need to mark the system zone blocks as
in use to make sure retry don't pick them again

System zone is the block range mapping block bitmap, inode bitmap and inode
table.

Signed-off-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Mingming Cao <[email protected]>
---
fs/ext4/balloc.c | 17 +++++++++++------
fs/ext4/mballoc.c | 47 ++++++++++++++++++++++++++++++++++++-----------
2 files changed, 47 insertions(+), 17 deletions(-)

diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
index 1e6c5dc..0a7249b 100644
--- a/fs/ext4/balloc.c
+++ b/fs/ext4/balloc.c
@@ -289,11 +289,11 @@ read_block_bitmap(struct super_block *sb, ext4_group_t block_group)
(int)block_group, (unsigned long long)bitmap_blk);
return NULL;
}
- if (!ext4_valid_block_bitmap(sb, desc, block_group, bh)) {
- put_bh(bh);
- return NULL;
- }
-
+ ext4_valid_block_bitmap(sb, desc, block_group, bh);
+ /*
+ * file system mounted not to panic on error,
+ * continue with corrput bitmap
+ */
return bh;
}
/*
@@ -1779,7 +1779,12 @@ allocated:
"Allocating block in system zone - "
"blocks from %llu, length %lu",
ret_block, num);
- goto out;
+ /*
+ * claim_block marked the blocks we allocated
+ * as in use. So we may want to selectively
+ * mark some of the blocks as free
+ */
+ goto retry_alloc;
}

performed_allocation = 1;
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 340daae..7718826 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -3040,7 +3040,7 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,
struct ext4_sb_info *sbi;
struct super_block *sb;
ext4_fsblk_t block;
- int err;
+ int err, len;

BUG_ON(ac->ac_status != AC_STATUS_FOUND);
BUG_ON(ac->ac_b_ex.fe_len <= 0);
@@ -3074,14 +3074,27 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,
+ ac->ac_b_ex.fe_start
+ le32_to_cpu(es->s_first_data_block);

- if (block == ext4_block_bitmap(sb, gdp) ||
- block == ext4_inode_bitmap(sb, gdp) ||
- in_range(block, ext4_inode_table(sb, gdp),
- EXT4_SB(sb)->s_itb_per_group)) {
-
+ len = ac->ac_b_ex.fe_len;
+ if (in_range(ext4_block_bitmap(sb, gdp), block, len) ||
+ in_range(ext4_inode_bitmap(sb, gdp), block, len) ||
+ in_range(block, ext4_inode_table(sb, gdp),
+ EXT4_SB(sb)->s_itb_per_group) ||
+ in_range(block + len - 1, ext4_inode_table(sb, gdp),
+ EXT4_SB(sb)->s_itb_per_group)) {
ext4_error(sb, __func__,
"Allocating block in system zone - block = %llu",
block);
+ /* File system mounted not to panic on error
+ * Fix the bitmap and repeat the block allocation
+ * We leak some of the blocks here.
+ */
+ mb_set_bits(sb_bgl_lock(sbi, ac->ac_b_ex.fe_group),
+ bitmap_bh->b_data, ac->ac_b_ex.fe_start,
+ ac->ac_b_ex.fe_len);
+ err = ext4_journal_dirty_metadata(handle, bitmap_bh);
+ if (!err)
+ err = -EAGAIN;
+ goto out_err;
}
#ifdef AGGRESSIVE_CHECK
{
@@ -4331,7 +4344,6 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle,

ac->ac_op = EXT4_MB_HISTORY_ALLOC;
ext4_mb_normalize_request(ac, ar);
-
repeat:
/* allocate space in core */
ext4_mb_regular_allocator(ac);
@@ -4345,10 +4357,21 @@ repeat:
}

if (likely(ac->ac_status == AC_STATUS_FOUND)) {
- ext4_mb_mark_diskspace_used(ac, handle);
- *errp = 0;
- block = ext4_grp_offs_to_block(sb, &ac->ac_b_ex);
- ar->len = ac->ac_b_ex.fe_len;
+ *errp = ext4_mb_mark_diskspace_used(ac, handle);
+ if (*errp == -EAGAIN) {
+ ac->ac_b_ex.fe_group = 0;
+ ac->ac_b_ex.fe_start = 0;
+ ac->ac_b_ex.fe_len = 0;
+ ac->ac_status = AC_STATUS_CONTINUE;
+ goto repeat;
+ } else if (*errp) {
+ ac->ac_b_ex.fe_len = 0;
+ ar->len = 0;
+ ext4_mb_show_ac(ac);
+ } else {
+ block = ext4_grp_offs_to_block(sb, &ac->ac_b_ex);
+ ar->len = ac->ac_b_ex.fe_len;
+ }
} else {
freed = ext4_mb_discard_preallocations(sb, ac->ac_o_ex.fe_len);
if (freed)
@@ -4535,6 +4558,8 @@ do_more:
ext4_error(sb, __func__,
"Freeing blocks in system zone - "
"Block = %lu, count = %lu", block, count);
+ /* err = 0. ext4_std_error should be a no op */
+ goto error_return;
}

BUFFER_TRACE(bitmap_bh, "getting write access");
--
1.5.5.rc0.16.g02b00.dirty


2008-03-25 18:56:23

by Aneesh Kumar K.V

[permalink] [raw]
Subject: [RFC PATCH] ext3: Retry block allocation if new blocks are allocated from system zone.

If the block allocator gets blocks out of system zone ext3 calls
ext3_error. But if the file system is mounted with errors=continue
retry block allocation. We need to mark the system zone blocks as
in use to make sure retry don't pick them again

System zone is the block range mapping block bitmap, inode bitmap and inode
table.

Signed-off-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Mingming Cao <[email protected]>
---
fs/ext3/balloc.c | 16 +++++++++++-----
1 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/fs/ext3/balloc.c b/fs/ext3/balloc.c
index da0cb2c..874362e 100644
--- a/fs/ext3/balloc.c
+++ b/fs/ext3/balloc.c
@@ -164,10 +164,11 @@ read_block_bitmap(struct super_block *sb, unsigned int block_group)
block_group, le32_to_cpu(desc->bg_block_bitmap));
return NULL;
}
- if (!ext3_valid_block_bitmap(sb, desc, block_group, bh)) {
- brelse(bh);
- return NULL;
- }
+ ext3_valid_block_bitmap(sb, desc, block_group, bh);
+ /*
+ * file system mounted not to panic on error,
+ * continue with corrput bitmap
+ */
return bh;
}
/*
@@ -1642,7 +1643,12 @@ allocated:
"Allocating block in system zone - "
"blocks from "E3FSBLK", length %lu",
ret_block, num);
- goto out;
+ /*
+ * claim_block marked the blocks we allocated
+ * as in use. So we may want to selectively
+ * mark some of the blocks as free
+ */
+ goto retry_alloc;
}

performed_allocation = 1;
--
1.5.5.rc0.16.g02b00.dirty


2008-03-25 23:57:19

by Andreas Dilger

[permalink] [raw]
Subject: Re: [RFC PATCH] ext2: Retry block allocation if new blocks are allocated from system zone.

On Mar 26, 2008 00:26 +0530, Aneesh Kumar K.V wrote:
> If the block allocator gets blocks out of system zone ext2 calls
> ext2_error. But if the file system is mounted with errors=continue
> retry block allocation. We need to mark the system zone
> blocks as in use to make sure retry don't pick them again
>
> System zone is the block range mapping block bitmap, inode bitmap
> and inode table.

Nack * 3. It appears that this will spin in a loop, because retry_alloc:
will not change the group that is being used, and extN_valid_block_bitmap()
does nothing to fix up the bitmap to avoid re-allocating the same blocks.

Instead, it seems safest that extN_new_blocks() should skip the group
entirely after marking the group as having no free blocks, so the allocator
will no longer choose it. This also means that when freeing blocks it
needs to watch out for underflow of the ->bg_free_blocks_count value,
which is itself a good safety measure.

> Signed-off-by: Aneesh Kumar K.V <[email protected]>
> ---
> fs/ext2/balloc.c | 16 +++++++++++-----
> 1 files changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/fs/ext2/balloc.c b/fs/ext2/balloc.c
> index e7b2baf..501399e 100644
> --- a/fs/ext2/balloc.c
> +++ b/fs/ext2/balloc.c
> @@ -149,11 +149,12 @@ read_block_bitmap(struct super_block *sb, unsigned int block_group)
> block_group, le32_to_cpu(desc->bg_block_bitmap));
> return NULL;
> }
> - if (!ext2_valid_block_bitmap(sb, desc, block_group, bh)) {
> - brelse(bh);
> - return NULL;
> - }
>
> + ext2_valid_block_bitmap(sb, desc, block_group, bh);
> + /*
> + * file system mounted not to panic on error,
> + * continue with corrput bitmap
> + */
> return bh;
> }
>
> @@ -1381,7 +1382,12 @@ allocated:
> "Allocating block in system zone - "
> "blocks from "E2FSBLK", length %lu",
> ret_block, num);
> - goto out;
> + /*
> + * ext2_try_to_allocate marked the blocks we allocated
> + * as in use. So we may want to selectively
> + * mark some of the blocks as free
> + */
> + goto retry_alloc;
> }
>
> performed_allocation = 1;
> --
> 1.5.5.rc0.16.g02b00.dirty

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


2008-03-26 07:38:15

by Aneesh Kumar K.V

[permalink] [raw]
Subject: Re: [RFC PATCH] ext2: Retry block allocation if new blocks are allocated from system zone.

On Tue, Mar 25, 2008 at 05:56:49PM -0600, Andreas Dilger wrote:
> On Mar 26, 2008 00:26 +0530, Aneesh Kumar K.V wrote:
> > If the block allocator gets blocks out of system zone ext2 calls
> > ext2_error. But if the file system is mounted with errors=continue
> > retry block allocation. We need to mark the system zone
> > blocks as in use to make sure retry don't pick them again
> >
> > System zone is the block range mapping block bitmap, inode bitmap
> > and inode table.
>
> Nack * 3. It appears that this will spin in a loop, because retry_alloc:
> will not change the group that is being used, and extN_valid_block_bitmap()
> does nothing to fix up the bitmap to avoid re-allocating the same blocks.


As also discussed on irc, ext2_try_to_allocate(fs/ext2/balloc.c:738)
marks the blocks allocated as in use. So we will not really spin in
loop here. Same is true for ext3 and ext4. (fs/ext4/balloc.c:1066) for
mballoc it is done as a part of patch.

>
> Instead, it seems safest that extN_new_blocks() should skip the group
> entirely after marking the group as having no free blocks, so the allocator
> will no longer choose it. This also means that when freeing blocks it
> needs to watch out for underflow of the ->bg_free_blocks_count value,
> which is itself a good safety measure.

Agreed. As you suggested on irc, i will try to introduce EXT4_BG_CORRUPT
and ignore the group which has EXT4_BG_CORRUPT set in block allocation.
I would also try to do e2fsck changes that will clear this flag after
fixing the bitmap corruption.

-aneesh