2009-02-25 06:40:24

by Kazuya Mio

Subject: double free of blocks occurred during online defrag

Hi Aneesh,

When I remove a file on which online defrag is running, the following errors
occur after the file descriptor is closed:

Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8):
ext4_mb_release_inode_pa: free 2048, pa_free 1562
Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
double-free of inode 0's block 802817(bit 0 in group 98)
Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
double-free of inode 0's block 802818(bit 1 in group 98)
Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
double-free of inode 0's block 802819(bit 2 in group 98)
Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
double-free of inode 0's block 802820(bit 3 in group 98)
Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
double-free of inode 0's block 802821(bit 4 in group 98)
Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
double-free of inode 0's block 802822(bit 5 in group 98)
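
As a sanity check on the log, the block numbers and the "bit N in group 98" values are consistent with each other if one assumes a 1 KiB block size (8192 blocks per group, first data block 1). These values are an assumption; the report does not state the filesystem geometry. A minimal sketch of the arithmetic, with a made-up helper name:

```python
# Assumptions (not stated in the report): 1 KiB block size, so one block
# group holds 8 * 1024 = 8192 blocks, and the first data block is 1.
BLOCK_SIZE = 1024
BLOCKS_PER_GROUP = 8 * BLOCK_SIZE
FIRST_DATA_BLOCK = 1


def group_and_bit(block):
    """Map a filesystem block number to (group, bit in that group's bitmap)."""
    rel = block - FIRST_DATA_BLOCK
    return rel // BLOCKS_PER_GROUP, rel % BLOCKS_PER_GROUP


for blk in range(802817, 802823):
    print(blk, group_and_bit(blk))  # 802817 -> (98, 0) ... 802822 -> (98, 5)
```

Under that assumption the reported blocks are the first six blocks of group 98.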

For this reason, online defrag calls ext4_discard_preallocations() at the end
of ext4_defrag() to avoid the double-free error.
However, the error above has not occurred since I applied your patch posted on
Nov 6th, 2008, because it has the same cause as the problem in your report.
http://marc.info/?l=linux-ext4&m=122599787406193&w=4

What is the status of this patch?

Regards,
Kazuya Mio


2009-02-25 10:45:08

by Aneesh Kumar K.V

Subject: Re: double free of blocks occurred during online defrag

On Wed, Feb 25, 2009 at 03:39:52PM +0900, Kazuya Mio wrote:
> Hi Aneesh,
>
> When I remove the file that is running online defrag, the following error occurs
> after closing the file descriptor:
>
> Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8):
> ext4_mb_release_inode_pa: free 2048, pa_free 1562
> Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
> double-free of inode 0's block 802817(bit 0 in group 98)
> Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
> double-free of inode 0's block 802818(bit 1 in group 98)
> Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
> double-free of inode 0's block 802819(bit 2 in group 98)
> Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
> double-free of inode 0's block 802820(bit 3 in group 98)
> Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
> double-free of inode 0's block 802821(bit 4 in group 98)
> Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
> double-free of inode 0's block 802822(bit 5 in group 98)
>
> So, online defrag calls ext4_discard_preallocations() at the end of
> ext4_defrag() to avoid double-free error.
> However, above error hasn't occurred since applying your patch posted on Nov
> 6th, 2008 because this error is caused by the same reason of your report.
> http://marc.info/?l=linux-ext4&m=122599787406193&w=4
>
> What is the status of this patch?

We dropped the patch because I found that the double free in my case was
not exactly due to the cause described in the patch above.

I asked to drop the patch in

http://article.gmane.org/gmane.comp.file-systems.ext4/10199

I also found that the patch is not completely correct. The meta-data
blocks which are added to the free_list are not allocated from any
prealloc space.

So what you are seeing may be a different problem that the patch merely
hides. I guess you will have to look more closely at why the
double-free is happening in your case.

-aneesh

2009-02-25 11:00:06

by Aneesh Kumar K.V

Subject: Re: double free of blocks occurred during online defrag

On Wed, Feb 25, 2009 at 04:14:46PM +0530, Aneesh Kumar K.V wrote:
> On Wed, Feb 25, 2009 at 03:39:52PM +0900, Kazuya Mio wrote:
> > Hi Aneesh,
> >
> > When I remove the file that is running online defrag, the following error occurs
> > after closing the file descriptor:
> >
> > Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8):
> > ext4_mb_release_inode_pa: free 2048, pa_free 1562
> > Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
> > double-free of inode 0's block 802817(bit 0 in group 98)
> > Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
> > double-free of inode 0's block 802818(bit 1 in group 98)
> > Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
> > double-free of inode 0's block 802819(bit 2 in group 98)
> > Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
> > double-free of inode 0's block 802820(bit 3 in group 98)
> > Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
> > double-free of inode 0's block 802821(bit 4 in group 98)
> > Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
> > double-free of inode 0's block 802822(bit 5 in group 98)
> >
> > So, online defrag calls ext4_discard_preallocations() at the end of
> > ext4_defrag() to avoid double-free error.
> > However, above error hasn't occurred since applying your patch posted on Nov
> > 6th, 2008 because this error is caused by the same reason of your report.
> > http://marc.info/?l=linux-ext4&m=122599787406193&w=4
> >
> > What is the status of this patch?
>
> We dropped the patch because I found that the double free in my case was
> not exactly due the explanation given in the patch above.
>
> I asked to drop the patch in
>
> http://article.gmane.org/gmane.comp.file-systems.ext4/10199
>
> I also found that the patch is not completely correct. The meta-data
> blocks which are added to the free_list are not allocated from any
> prealloc space.
>
> So what you are seeing may be a different problem which the patch is
> hiding from happening. I guess you will have to look more closely at why the
> double-free is happening in your case.

I found one case of double-free, but I am not sure how the above patch
helps to avoid it. Anyhow, here is the case:


a) We have inode prealloc space. We allocate some blocks out of it
for data.
b) We later free the data blocks. That means we mark the bits in both the
bitmap and the buddy as free.
c) Now we want to discard the prealloc space. We look at the bitmap and
try to mark the blocks that are free in the bitmap as free in the buddy. But
since those blocks are already marked free in the buddy, we hit the
double-free case.

To fix this we would have to scan all the inode prealloc space of the group,
and if the blocks belong to an inode prealloc space we should not mark
them free in the buddy.
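
The a) to c) sequence above can be sketched with a toy model. This is only an illustration under simplifying assumptions: two flat bit lists stand in for the on-disk bitmap and the buddy, a set bit means "in use", and the names buddy_free and demo are made up rather than taken from mballoc.

```python
def buddy_free(buddy, bit, errors):
    """Clear a bit in the buddy; record the bit if it was already clear."""
    if not buddy[bit]:
        errors.append(bit)  # the kernel would report a double-free here
    buddy[bit] = 0


def demo():
    bitmap = [0] * 8  # on-disk bitmap, 1 = allocated
    buddy = [1] * 8   # the whole range is reserved as inode prealloc space
    errors = []
    # a) allocate blocks 0..2 out of the prealloc space for data
    for bit in range(3):
        bitmap[bit] = 1
    # b) free the data blocks: cleared in both bitmap and buddy
    for bit in range(3):
        bitmap[bit] = 0
        buddy_free(buddy, bit, errors)
    # c) discard the prealloc space: every block that is clear in the
    #    bitmap is marked free in the buddy, including blocks 0..2 again
    for bit in range(8):
        if not bitmap[bit]:
            buddy_free(buddy, bit, errors)
    return errors


print(demo())  # [0, 1, 2]
```

In this model only the blocks that were allocated and then freed in step b) are reported; the never-used tail of the prealloc space is released cleanly.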

-aneesh

2009-02-26 09:49:48

by Aneesh Kumar K.V

Subject: Re: double free of blocks occurred during online defrag

On Wed, Feb 25, 2009 at 04:29:44PM +0530, Aneesh Kumar K.V wrote:
> On Wed, Feb 25, 2009 at 04:14:46PM +0530, Aneesh Kumar K.V wrote:
> > On Wed, Feb 25, 2009 at 03:39:52PM +0900, Kazuya Mio wrote:
> > > Hi Aneesh,
> > >
> > > When I remove the file that is running online defrag, the following error occurs
> > > after closing the file descriptor:
> > >
> > > Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8):
> > > ext4_mb_release_inode_pa: free 2048, pa_free 1562
> > > Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
> > > double-free of inode 0's block 802817(bit 0 in group 98)
> > > Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
> > > double-free of inode 0's block 802818(bit 1 in group 98)
> > > Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
> > > double-free of inode 0's block 802819(bit 2 in group 98)
> > > Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
> > > double-free of inode 0's block 802820(bit 3 in group 98)
> > > Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
> > > double-free of inode 0's block 802821(bit 4 in group 98)
> > > Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
> > > double-free of inode 0's block 802822(bit 5 in group 98)
> > >
> > > So, online defrag calls ext4_discard_preallocations() at the end of
> > > ext4_defrag() to avoid double-free error.
> > > However, above error hasn't occurred since applying your patch posted on Nov
> > > 6th, 2008 because this error is caused by the same reason of your report.
> > > http://marc.info/?l=linux-ext4&m=122599787406193&w=4
> > >
> > > What is the status of this patch?
> >
> > We dropped the patch because I found that the double free in my case was
> > not exactly due the explanation given in the patch above.
> >
> > I asked to drop the patch in
> >
> > http://article.gmane.org/gmane.comp.file-systems.ext4/10199
> >
> > I also found that the patch is not completely correct. The meta-data
> > blocks which are added to the free_list are not allocated from any
> > prealloc space.
> >
> > So what you are seeing may be a different problem which the patch is
> > hiding from happening. I guess you will have to look more closely at why the
> > double-free is happening in your case.
>
> I found one case of double-free , but not sure how the above patch is
> helping to avoid that. Any how here is the case:
>
>
> a) We have inode prealloc space. We allocated some blocks out of that
> for data
> b) We later free the data blocks. That means we mark the bits in bitmap
> and buddy as free.
> c) Now we want to discard the prealloc space. We look at the bitmap and
> try mark the blocks which are free in bitmap as free in buddy. But since
> the blocks are already marked free in buddy we hit the double free case.
>
> To fix this we will have to scan all the inode prealloc space of the group
> and if he blocks belong to the inode prealloc space we should not mark
> them free in buddy.
>

OK, I guess this won't happen, because the only way to free data blocks is via
truncate, and truncate causes the prealloc space to be discarded. In other
words, we throw away the prealloc space before we free the blocks.

-aneesh


2009-02-26 22:38:11

by Mike Snitzer

Subject: Re: double free of blocks occurred during online defrag

On Wed, Feb 25, 2009 at 5:44 AM, Aneesh Kumar K.V wrote:
> On Wed, Feb 25, 2009 at 03:39:52PM +0900, Kazuya Mio wrote:
>> Hi Aneesh,
>>
>> When I remove the file that is running online defrag, the following error occurs
>> after closing the file descriptor:
>>
>> Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8):
>> ext4_mb_release_inode_pa: free 2048, pa_free 1562
>> Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
>> double-free of inode 0's block 802817(bit 0 in group 98)
>> Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
>> double-free of inode 0's block 802818(bit 1 in group 98)
>> Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
>> double-free of inode 0's block 802819(bit 2 in group 98)
>> Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
>> double-free of inode 0's block 802820(bit 3 in group 98)
>> Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
>> double-free of inode 0's block 802821(bit 4 in group 98)
>> Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
>> double-free of inode 0's block 802822(bit 5 in group 98)
>>
>> So, online defrag calls ext4_discard_preallocations() at the end of
>> ext4_defrag() to avoid double-free error.
>> However, above error hasn't occurred since applying your patch posted on Nov
>> 6th, 2008 because this error is caused by the same reason of your report.
>> http://marc.info/?l=linux-ext4&m=122599787406193&w=4
>>
>> What is the status of this patch?
>
> We dropped the patch because I found that the double free in my case was
> not exactly due the explanation given in the patch above.
>
> I asked to drop the patch in
>
> http://article.gmane.org/gmane.comp.file-systems.ext4/10199
>
> I also found that the patch is not completely correct. The meta-data
> blocks which are added to the free_list are not allocated from any
> prealloc space.

Aneesh,

I occasionally hit the "Double free" ext4_error in
ext4_mb_free_metadata() as well as the (free != pa->pa_free)
ext4_error in ext4_mb_release_inode_pa().

I'm exploring your patch to see if it eliminates my problems.

I'm missing why the meta-data blocks on the free_list are a concern.
Yes, ext4_mb_generate_from_freelist() will set all these
free'd-but-not-yet-committed metadata blocks in the temporary bitmap.
ext4_mb_release_inode_pa() will then use that bitmap, which happens to
have extra bits set for metadata blocks, but because those bits are
set they will continue to be left unchanged in the buddy bitmap
(because AFAIK ext4_mb_release_inode_pa()'s search only cares about
regions of the pa space that are _not_ set in the temporary bitmap).

So why do these metadata blocks' bits being set in the temporary
bitmap _really_ matter for the purposes of maintaining correctness in
the buddy bitmap?
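
The argument above can be illustrated with a small sketch. It assumes, as described in this mail, that the release path only acts on runs of clear bits inside the PA range of the temporary bitmap. The function name free_runs and the bit layout are made up for illustration, and placing a metadata block inside the PA range is purely hypothetical.

```python
def free_runs(bitmap, start, end):
    """Return (first_bit, length) for each run of clear bits in [start, end)."""
    runs = []
    bit = start
    while bit < end:
        if bitmap[bit]:
            bit += 1
            continue
        run_start = bit
        while bit < end and not bitmap[bit]:
            bit += 1
        runs.append((run_start, bit - run_start))
    return runs


on_disk = [1, 1, 0, 0, 0, 0, 0, 0]  # 1 = in use
temp = list(on_disk)
temp[4] = 1  # hypothetical freed-but-not-yet-committed metadata block

print(free_runs(on_disk, 0, 8))  # [(2, 6)]
print(free_runs(temp, 0, 8))     # [(2, 2), (5, 3)]
```

In this model the extra set bit splits the run, so bit 4 is never handed to the buddy as free; it only changes which runs are released.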

thanks,
Mike

2009-03-02 08:33:55

by Kazuya Mio

Subject: Re: double free of blocks occurred during online defrag

Hi Aneesh,

Thanks for your reply.

Aneesh Kumar K.V wrote:
> We dropped the patch because I found that the double free in my case was
> not exactly due the explanation given in the patch above.
>
> I asked to drop the patch in
>
> http://article.gmane.org/gmane.comp.file-systems.ext4/10199
>
> I also found that the patch is not completely correct. The meta-data
> blocks which are added to the free_list are not allocated from any
> prealloc space.
>
> So what you are seeing may be a different problem which the patch is
> hiding from happening. I guess you will have to look more closely at why the
> double-free is happening in your case.
>
> -aneesh
>

I thought that the problem would be solved if we checked whether the on-disk
and buddy bitmaps differ when the inode prealloc space is released.
However, my understanding was incorrect.
I will continue examining this problem.

Best regards,
Kazuya Mio

2009-03-06 08:24:08

by Kazuya Mio

Subject: Re: double free of blocks occurred during online defrag

Aneesh Kumar K.V wrote:
> On Wed, Feb 25, 2009 at 03:39:52PM +0900, Kazuya Mio wrote:
>> Hi Aneesh,
>>
>> When I remove the file that is running online defrag, the following error occurs
>> after closing the file descriptor:
>>
>> Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8):
>> ext4_mb_release_inode_pa: free 2048, pa_free 1562
>> Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
>> double-free of inode 0's block 802817(bit 0 in group 98)
>> Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
>> double-free of inode 0's block 802818(bit 1 in group 98)
>> Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
>> double-free of inode 0's block 802819(bit 2 in group 98)
>> Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
>> double-free of inode 0's block 802820(bit 3 in group 98)
>> Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
>> double-free of inode 0's block 802821(bit 4 in group 98)
>> Jan 22 17:06:52 G3-OPC-SVR2 kernel: EXT4-fs error (device hda8): mb_free_blocks:
>> double-free of inode 0's block 802822(bit 5 in group 98)
>>
>> So, online defrag calls ext4_discard_preallocations() at the end of
>> ext4_defrag() to avoid double-free error.
>> However, above error hasn't occurred since applying your patch posted on Nov
>> 6th, 2008 because this error is caused by the same reason of your report.
>> http://marc.info/?l=linux-ext4&m=122599787406193&w=4
>>
>> What is the status of this patch?
>
> We dropped the patch because I found that the double free in my case was
> not exactly due the explanation given in the patch above.
>
> I asked to drop the patch in
>
> http://article.gmane.org/gmane.comp.file-systems.ext4/10199
>
> I also found that the patch is not completely correct. The meta-data
> blocks which are added to the free_list are not allocated from any
> prealloc space.
>
> So what you are seeing may be a different problem which the patch is
> hiding from happening. I guess you will have to look more closely at why the
> double-free is happening in your case.
>
> -aneesh

I looked into the double-free error I had reported and found that it is
caused by online defrag. The steps that lead to the double-free error are as
follows:

1. We have two files. "DATA" denotes data blocks, "used PA" denotes the part
of the preallocation space (PA) that has been allocated, and "free PA" denotes
the part of the PA that is still free.

file1: [ DATA1 | used PA1 | free PA1 ]
file2: [ DATA2 | used PA2 | free PA2 ]

2. Defrag exchanges the data blocks. The blocks exchanged are DATA and used PA.

file1: [ DATA2 | used PA2 | free PA1 ]
file2: [ DATA1 | used PA1 | free PA2 ]

3. When file1 is closed after being removed, ext4_truncate() is called. DATA2
and used PA2 are freed via ext4_truncate(). ext4_discard_preallocations() is
also called via ext4_truncate(). But online defrag does not change the PA
list, so ext4_discard_preallocations() frees the PA of file1 (used PA1 and
free PA1).

file1: [ FREE SPACE(DATA2) | FREE SPACE(used PA2) | FREE SPACE(free PA1) ]
file2: [ DATA1 | FREE SPACE(used PA1) | free PA2 ]

4. When file2 is closed, ext4_discard_preallocations() is called via
ext4_release_file(). However, used PA2 has already been freed, so a
double-free error occurs.

file1: [ FREE SPACE(DATA2) | *DOUBLE FREE(used PA2)* | FREE SPACE(free PA1) ]
file2: [ DATA1 | FREE SPACE(used PA1) | FREE SPACE(free PA2) ]
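
The four steps above can be replayed in a toy model. This is a sketch under simplifying assumptions, not a mirror of mballoc: one small block group, a set bit means "in use", discarding a PA returns to the buddy only the blocks that are clear in the on-disk bitmap, and the DATA1/DATA2 blocks outside the PAs are left out. All class, method and field names are made up for illustration.

```python
class Group:
    """Toy block group: a set bit means the block is in use."""

    def __init__(self, nblocks):
        self.bitmap = [0] * nblocks  # on-disk bitmap
        self.buddy = [0] * nblocks   # in-core view, also covers PA space
        self.double_frees = []

    def buddy_free(self, bit):
        if not self.buddy[bit]:
            self.double_frees.append(bit)  # kernel would log a double-free
        self.buddy[bit] = 0

    def new_pa(self, start, length, used):
        """Reserve [start, start+length); allocate the first `used` blocks."""
        for bit in range(start, start + length):
            self.buddy[bit] = 1
        for bit in range(start, start + used):
            self.bitmap[bit] = 1
        return {"start": start, "len": length, "free": length - used}

    def truncate(self, blocks):
        """Free data blocks in both the bitmap and the buddy."""
        for bit in blocks:
            self.bitmap[bit] = 0
            self.buddy_free(bit)

    def discard_pa(self, pa):
        """Free in the buddy every PA block that is clear in the bitmap."""
        found = 0
        for bit in range(pa["start"], pa["start"] + pa["len"]):
            if not self.bitmap[bit]:
                self.buddy_free(bit)
                found += 1
        return found


def run_scenario():
    g = Group(16)
    # Step 1: each file has a PA of 8 blocks, 4 of them used.
    file1 = {"blocks": [0, 1, 2, 3], "pa": g.new_pa(0, 8, used=4)}
    file2 = {"blocks": [8, 9, 10, 11], "pa": g.new_pa(8, 8, used=4)}
    # Step 2: defrag swaps the blocks but leaves the PA lists alone.
    file1["blocks"], file2["blocks"] = file2["blocks"], file1["blocks"]
    # Step 3: file1 is removed: its PA is discarded and its blocks freed.
    g.discard_pa(file1["pa"])
    g.truncate(file1["blocks"])
    # Step 4: file2 is closed: its PA still covers the blocks freed above.
    found = g.discard_pa(file2["pa"])
    return g.double_frees, found, file2["pa"]["free"]


double_frees, found, pa_free = run_scenario()
print("double-freed bits:", double_frees)  # [8, 9, 10, 11]
print("free", found, "pa_free", pa_free)   # free 8 pa_free 4
```

In this model the used part of PA2 is double-freed when file2 is closed, and the number of blocks found free in the PA range (8) no longer matches the PA's own free count (4), which resembles the "free 2048, pa_free 1562" message in the original report.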

To prevent the double-free error, I decided to keep calling
ext4_discard_preallocations() in defrag, as it does now. If defrag exchanged
the PA lists only once, after all blocks have been exchanged, an aborted
defrag would still lead to a double-free error. Exchanging the lists after
every page (4KB) sounds good in that respect, but it would change the PA
lists many times, which I think would be expensive.

Any comment on this?

Regards,
Kazuya Mio