ext4: Fix s_dirtyblocks_counter if block allocation failed with nodelalloc
From: Akira Fujita <[email protected]>
If block allocation fails after the claimed blocks have been marked as
dirty blocks with nodelalloc, we have to subtract these blocks from
s_dirtyblocks_counter in the error handling.
Otherwise s_dirtyblocks_counter becomes wrong, so that the
filesystem's free block count decreases incorrectly.
This issue was reported as an ext4 online defrag bug by Li Zefan.
http://marc.info/?l=linux-ext4&m=122697235715170&w=2
Reported-by: Li Zefan <[email protected]>
Signed-off-by: Akira Fujita <[email protected]>
---
mballoc.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff -X linux-2.6.28-rc6-ext4/Documentation/dontdiff -upNr linux-2.6.28-rc6-ext4/fs/ext4/mballoc.c linux-2.6.28-rc6-mballoc-fix/fs/ext4/mballoc.c
--- linux-2.6.28-rc6-ext4/fs/ext4/mballoc.c 2008-12-01 11:44:28.000000000 +0900
+++ linux-2.6.28-rc6-mballoc-fix/fs/ext4/mballoc.c 2008-12-01 12:04:06.000000000 +0900
@@ -4495,12 +4495,18 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t
if (!ac) {
ar->len = 0;
*errp = -ENOMEM;
+ if (!EXT4_I(ar->inode)->i_delalloc_reserved_flag)
+ percpu_counter_sub(&sbi->s_dirtyblocks_counter,
+ reserv_blks);
goto out1;
}
*errp = ext4_mb_initialize_context(ac, ar);
if (*errp) {
ar->len = 0;
+ if (!(ac->ac_flags & EXT4_MB_DELALLOC_RESERVED))
+ percpu_counter_sub(&sbi->s_dirtyblocks_counter,
+ reserv_blks);
goto out2;
}
@@ -4541,6 +4547,9 @@ repeat:
if (freed)
goto repeat;
*errp = -ENOSPC;
+ if (!(ac->ac_flags & EXT4_MB_DELALLOC_RESERVED))
+ percpu_counter_sub(&sbi->s_dirtyblocks_counter,
+ reserv_blks);
ac->ac_b_ex.fe_len = 0;
ar->len = 0;
ext4_mb_show_ac(ac);
On Mon, Dec 01, 2008 at 07:21:54PM +0900, Akira Fujita wrote:
> ext4: Fix s_dirty_blocks_counter if block allocation failed with nodelalloc
>
> From: Akira Fujita <[email protected]>
>
> If block allocation failed after marking claimed blocks as dirty blocks
> with nodelalloc, we have to subtract these blocks from
> s_dirty_blocks_counter in error handling.
> Otherwise s_dirty_blocks_counter goes wrong so that
> filesystem's free blocks decreases incorrectly.
Why did the block allocation fail? With delayed allocation, ENOSPC
should not happen during block allocation. That would mean we did
something wrong in block reservation.
>
> This issue was reported as ext4 online defrag's bug by Li Zefan.
> http://marc.info/?l=linux-ext4&m=122697235715170&w=2
-aneesh
Hi Aneesh,
Aneesh Kumar K.V wrote:
> On Mon, Dec 01, 2008 at 07:21:54PM +0900, Akira Fujita wrote:
>> ext4: Fix s_dirty_blocks_counter if block allocation failed with nodelalloc
>>
>> From: Akira Fujita <[email protected]>
>>
>> If block allocation failed after marking claimed blocks as dirty blocks
>> with nodelalloc, we have to subtract these blocks from
>> s_dirty_blocks_counter in error handling.
>> Otherwise s_dirty_blocks_counter goes wrong so that
>> filesystem's free blocks decreases incorrectly.
>
> Why did the block allocation fail ? With delayed allocation ENOSPC
> should not happen during block allocation. That would mean we did
> something wrong in block reservation.
My case was *nodelalloc* and the FS was almost full.
This problem occurs when multiple defrags run within a short time.
Usually defrag releases the temporary inode's blocks with iput,
so the FS free block count recovers, but the contiguous blocks do not
become available until the next journal commit.
So we cannot re-use the contiguous blocks immediately.
There are enough free blocks in the FS, so
ext4_claim_free_blocks marks the claimed blocks as dirty,
but ext4_mb_regular_allocator cannot find enough blocks,
so ext4_mb_new_blocks returns ENOSPC without decreasing the dirty blocks.
Regards,
Akira Fujita
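
The accounting problem described in the message above can be illustrated with a minimal userspace sketch. This is only a model under stated assumptions: struct fs_model, claim(), alloc_blocks() and available_after_failures() are hypothetical names invented for the illustration, not kernel code, and the "contiguous_ok" flag simply stands in for the allocator failing to find a usable extent.

```c
#include <assert.h>

/* Hypothetical userspace model; these are not the kernel's structures. */
struct fs_model {
    long free_blocks;   /* blocks not yet allocated on disk */
    long dirty_blocks;  /* blocks claimed but not yet allocated */
};

/* Blocks that a new request may still claim. */
static long available(const struct fs_model *fs)
{
    return fs->free_blocks - fs->dirty_blocks;
}

/* Claim blocks up front, as the nodelalloc path does before allocating. */
static int claim(struct fs_model *fs, long n)
{
    assert(n > 0);
    if (available(fs) < n)
        return -1;
    fs->dirty_blocks += n;
    return 0;
}

/*
 * Try to allocate n blocks. contiguous_ok == 0 models the allocator
 * failing after the claim succeeded; release_on_error toggles the fix.
 */
static int alloc_blocks(struct fs_model *fs, long n, int contiguous_ok,
                        int release_on_error)
{
    if (claim(fs, n) != 0)
        return -1;
    if (!contiguous_ok) {
        if (release_on_error)
            fs->dirty_blocks -= n;  /* give the claim back */
        return -1;
    }
    /* Success: the claim turns into a real allocation. */
    fs->dirty_blocks -= n;
    fs->free_blocks -= n;
    return 0;
}

/* Run several failing allocations on a 100-block model filesystem. */
static long available_after_failures(int rounds, long n, int release_on_error)
{
    struct fs_model fs = { 100, 0 };

    for (int i = 0; i < rounds; i++)
        alloc_blocks(&fs, n, 0, release_on_error);
    return available(&fs);
}
```

In this model, available_after_failures(3, 10, 0) returns 70 although nothing was allocated, while available_after_failures(3, 10, 1) returns 100.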
On Thu, Dec 04, 2008 at 10:27:25AM +0900, Akira Fujita wrote:
>
> Hi Aneesh,
> Aneesh Kumar K.V wrote:
>> On Mon, Dec 01, 2008 at 07:21:54PM +0900, Akira Fujita wrote:
>>> ext4: Fix s_dirty_blocks_counter if block allocation failed with nodelalloc
>>>
>>> From: Akira Fujita <[email protected]>
>>>
>>> If block allocation failed after marking claimed blocks as dirty blocks
>>> with nodelalloc, we have to subtract these blocks from
>>> s_dirty_blocks_counter in error handling.
>>> Otherwise s_dirty_blocks_counter goes wrong so that
>>> filesystem's free blocks decreases incorrectly.
>>
>> Why did the block allocation fail ? With delayed allocation ENOSPC
>> should not happen during block allocation. That would mean we did
>> something wrong in block reservation.
>
> My case was *nodelalloc* and FS was almost full.
> This problem occurs in multiple defrag running in short time.
> Usually defrag releases temporary inode's blocks with iput,
> then FS free blocks are recover but contiguous blocks do not recover
> until next journal commit.
> so we can not re-use contiguous blocks immediately.
> There are enough free blocks in FS so that
> ext4_claim_free_blocks marks claimed blocks as dirty,
> but ext4_regular_allocator can not find enough blocks,
> so mb_new_blocks returns ENOSPC without decreasing dirty blocks.
>
OK, how about doing the check once in ext4_mb_new_blocks?
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index bacc2f4..22d31c3 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4550,7 +4550,7 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle,
}
if (ar->len == 0) {
*errp = -EDQUOT;
- return 0;
+ goto out3;
}
inquota = ar->len;
@@ -4623,6 +4623,13 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle,
out1:
if (ar->len < inquota)
DQUOT_FREE_BLOCK(ar->inode, inquota - ar->len);
+out3:
+ if (!ar->len) {
+ if (!EXT4_I(ar->inode)->i_delalloc_reserved_flag)
+ /* release all the reserved blocks if non delalloc */
+ percpu_counter_sub(&sbi->s_dirtyblocks_counter,
+ reserv_blks);
+ }
return block;
}
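
The idea in the patch above, releasing the claim once at a common exit point instead of at every failure site, can be sketched in userspace C as follows. Everything here is an assumption made for illustration: struct counters, enum fail_at and new_blocks() are hypothetical, the failure points are injected by the caller, and 1000 is just a placeholder block number. For the delalloc case the sketch assumes the reservation was made earlier by the caller and leaves it untouched.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the superblock's dirty block counter. */
struct counters {
    long dirty_blocks;
};

/* Failure injected by the caller, for illustration only. */
enum fail_at { FAIL_NONE, FAIL_QUOTA, FAIL_NOMEM, FAIL_NOSPC };

static long new_blocks(struct counters *c, long want, int delalloc_reserved,
                       enum fail_at fail, int *errp)
{
    long reserved = 0;  /* blocks this call added to dirty_blocks */
    long len = want;    /* number of blocks actually obtained */
    long block = 0;

    assert(want > 0);
    *errp = 0;

    /* Without delalloc, the blocks are claimed here, before allocating. */
    if (!delalloc_reserved) {
        c->dirty_blocks += want;
        reserved = want;
    }

    if (fail == FAIL_QUOTA) {
        len = 0;
        *errp = -EDQUOT;
        goto out;
    }
    if (fail == FAIL_NOMEM) {
        len = 0;
        *errp = -ENOMEM;
        goto out;
    }
    if (fail == FAIL_NOSPC) {
        len = 0;
        *errp = -ENOSPC;
        goto out;
    }

    block = 1000;                 /* placeholder block number */
    c->dirty_blocks -= reserved;  /* the claim became a real allocation */

out:
    /* Single check: nothing was allocated, so drop this call's claim. */
    if (len == 0 && !delalloc_reserved)
        c->dirty_blocks -= reserved;
    return block;
}
```

Because every failure path jumps to the same label, a newly added error return cannot forget the release as long as it sets len to 0 and uses goto out.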
Aneesh Kumar K.V wrote:
> On Thu, Dec 04, 2008 at 10:27:25AM +0900, Akira Fujita wrote:
>> Hi Aneesh,
>> Aneesh Kumar K.V wrote:
>>> On Mon, Dec 01, 2008 at 07:21:54PM +0900, Akira Fujita wrote:
>>>> ext4: Fix s_dirty_blocks_counter if block allocation failed with nodelalloc
>>>>
>>>> From: Akira Fujita <[email protected]>
>>>>
>>>> If block allocation failed after marking claimed blocks as dirty blocks
>>>> with nodelalloc, we have to subtract these blocks from
>>>> s_dirty_blocks_counter in error handling.
>>>> Otherwise s_dirty_blocks_counter goes wrong so that
>>>> filesystem's free blocks decreases incorrectly.
>>> Why did the block allocation fail ? With delayed allocation ENOSPC
>>> should not happen during block allocation. That would mean we did
>>> something wrong in block reservation.
>> My case was *nodelalloc* and FS was almost full.
>> This problem occurs in multiple defrag running in short time.
>> Usually defrag releases temporary inode's blocks with iput,
>> then FS free blocks are recover but contiguous blocks do not recover
>> until next journal commit.
>> so we can not re-use contiguous blocks immediately.
>> There are enough free blocks in FS so that
>> ext4_claim_free_blocks marks claimed blocks as dirty,
>> but ext4_regular_allocator can not find enough blocks,
>> so mb_new_blocks returns ENOSPC without decreasing dirty blocks.
>>
> ok how about doing the check once in ext4_mb_new_blocks.
It works fine. Thank you.
Tested-by: Akira Fujita <[email protected]>
Akira Fujita wrote:
>
> Aneesh Kumar K.V wrote:
>> On Thu, Dec 04, 2008 at 10:27:25AM +0900, Akira Fujita wrote:
>>> Hi Aneesh,
>>> Aneesh Kumar K.V wrote:
>>>> On Mon, Dec 01, 2008 at 07:21:54PM +0900, Akira Fujita wrote:
>>>>> ext4: Fix s_dirty_blocks_counter if block allocation failed with
>>>>> nodelalloc
>>>>>
>>>>> From: Akira Fujita <[email protected]>
>>>>>
>>>>> If block allocation failed after marking claimed blocks as dirty
>>>>> blocks
>>>>> with nodelalloc, we have to subtract these blocks from
>>>>> s_dirty_blocks_counter in error handling.
>>>>> Otherwise s_dirty_blocks_counter goes wrong so that
>>>>> filesystem's free blocks decreases incorrectly.
>>>> Why did the block allocation fail ? With delayed allocation ENOSPC
>>>> should not happen during block allocation. That would mean we did
>>>> something wrong in block reservation.
>>> My case was *nodelalloc* and FS was almost full.
>>> This problem occurs in multiple defrag running in short time.
>>> Usually defrag releases temporary inode's blocks with iput,
>>> then FS free blocks are recover but contiguous blocks do not recover
>>> until next journal commit.
>>> so we can not re-use contiguous blocks immediately.
>>> There are enough free blocks in FS so that
>>> ext4_claim_free_blocks marks claimed blocks as dirty,
>>> but ext4_regular_allocator can not find enough blocks,
>>> so mb_new_blocks returns ENOSPC without decreasing dirty blocks.
>>>
>> ok how about doing the check once in ext4_mb_new_blocks.
>
> It works fine. Thank you.
> Tested-by: Akira Fujita <[email protected]>
>
I'll test it when I'm free and have access to the test machine. :)
But not this week.