2010-03-27 15:07:36

by Theodore Ts'o

Subject: FIEMAP sometimes returns bad information for delalloc extents


I was monitoring the progress of a distributed download program, and saw
the following output from two runs of filefrag taken a few seconds
apart:

8 790 8825663 8825551 65
9 855 0 8825727 319 unknown,delalloc
10 1174 8798367 318 128

7 790 8825663 8825559 69
8 1174 8798367 8825731 128

The length of the delalloc extent, 319, is bogus. The 319 seems to come
from 1174 - 855. But it's not actually the number of delayed
allocation blocks, as we can see when the blocks finally get written;
apparently it was only 4 blocks long.
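
The arithmetic behind those two numbers can be checked with a small sketch. It assumes the second and last columns of the listings above are the logical start block and the length in blocks; the struct and helper names are hypothetical and only illustrate the calculation.

```c
#include <assert.h>

/* One row of the listing: logical start block and length in blocks.
 * Hypothetical type, used only for this illustration. */
struct extent_row {
    unsigned long logical;
    unsigned long length;
};

/* First logical block after the extent. */
static unsigned long extent_end(struct extent_row e)
{
    return e.logical + e.length;
}

/* Number of blocks between the end of `prev` and the start of `next`. */
static unsigned long gap_blocks(struct extent_row prev, struct extent_row next)
{
    assert(next.logical >= extent_end(prev));
    return next.logical - extent_end(prev);
}
```

With the first run's extent at 790 (length 65) and the next mapped extent at 1174, gap_blocks() gives 1174 - 855 = 319, which is the reported delalloc length. In the second run the extent at 790 has grown from 65 to 69 blocks, so only 859 - 855 = 4 blocks past block 855 were actually written.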

I don't have time to try to track this down just now, and it's not all
that serious a bug since delalloc regions are by definition highly
transient, but it does look like we have a bug here.

- Ted


2010-03-27 17:54:35

by Andreas Dilger

Subject: Re: FIEMAP sometimes returns bad information for delalloc extents

On 2010-03-27, at 09:07, Theodore Ts'o wrote:
> I was monitoring the progress of a distributed download program, and saw
> the following output from two runs of filefrag taken a few seconds
> apart:
>
> 8 790 8825663 8825551 65
> 9 855 0 8825727 319 unknown,delalloc
> 10 1174 8798367 318 128
>
> 7 790 8825663 8825559 69
> 8 1174 8798367 8825731 128
>
> The length of the delalloc extent, 319, is bogus. The 319 seems to come
> from 1174 - 855. But it's not actually the number of delayed
> allocation blocks, as we can see when the blocks finally get written;
> apparently it was only 4 blocks long.

I'm surprised it shows anything at all for delalloc blocks, since
AFAIK FIEMAP is only walking the extent tree. It would be interesting
if it walked the VM pagetable for unallocated extents in the file, and
beyond i_size.

> I don't have time to try to track this down just now, and it's not all
> that serious a bug since delalloc regions are by definition highly
> transient, but it does look like we have a bug here.


Cheers, Andreas
--
Andreas Dilger
Principal Engineer, Lustre Group
Oracle Corporation Canada Inc.


2010-03-27 18:35:25

by Eric Sandeen

Subject: Re: FIEMAP sometimes returns bad information for delalloc extents

Andreas Dilger wrote:
> On 2010-03-27, at 09:07, Theodore Ts'o wrote:
>> I was monitoring the progress of a distributed download program, and saw
>> the following output from two runs of filefrag taken a few seconds
>> apart:
>>
>> 8 790 8825663 8825551 65
>> 9 855 0 8825727 319 unknown,delalloc
>> 10 1174 8798367 318 128
>>
>> 7 790 8825663 8825559 69
>> 8 1174 8798367 8825731 128
>>
>> The length of the delalloc extent, 319, is bogus. The 319 seems to come
>> from 1174 - 855. But it's not actually the number of delayed
>> allocation blocks, as we can see when the blocks finally get written;
>> apparently it was only 4 blocks long.
>
> I'm surprised it shows anything at all for delalloc blocks, since AFAIK
> FIEMAP is only walking the extent tree. It would be interesting if it
> walked the VM pagetable for unallocated extents in the file, and beyond
> i_size.

It does this in the callback for ext4_ext_walk_space:

    if (newex->ec_type == EXT4_EXT_CACHE_GAP) {
        ...
        page = find_get_page(inode->i_mapping, offset);
        ...
        bh = page_buffers(page);
        ...
        if (buffer_delay(bh)) {
            flags |= FIEMAP_EXTENT_DELALLOC;
            ...

so it was an attempt, at least, to flag which extents are delalloc.
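
From userspace, that flag shows up in fe_flags of each extent returned by the FS_IOC_FIEMAP ioctl. The following is a minimal sketch, not the filefrag implementation: the function names and the MAX_EXTENTS limit are made up for illustration, and it assumes a Linux system with the kernel UAPI headers installed.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/fs.h>      /* FS_IOC_FIEMAP */
#include <linux/fiemap.h>  /* struct fiemap, FIEMAP_EXTENT_* */

#define MAX_EXTENTS 32     /* arbitrary limit chosen for this sketch */

/* Count the extents in `ext` that have every bit of `flag` set. */
static int count_flagged(const struct fiemap_extent *ext, unsigned int n,
                         unsigned int flag)
{
    int count = 0;

    assert(ext != NULL || n == 0);
    for (unsigned int i = 0; i < n; i++)
        if ((ext[i].fe_flags & flag) == flag)
            count++;
    return count;
}

/* Map up to MAX_EXTENTS extents of the open file `fd` and return how many
 * are flagged delalloc, or -1 on error. */
static int count_delalloc_extents(int fd)
{
    size_t size = sizeof(struct fiemap) +
                  MAX_EXTENTS * sizeof(struct fiemap_extent);
    struct fiemap *fm = calloc(1, size);
    int count;

    if (fm == NULL)
        return -1;

    fm->fm_start = 0;
    fm->fm_length = FIEMAP_MAX_OFFSET;  /* map the whole file */
    fm->fm_flags = 0;                   /* no FIEMAP_FLAG_SYNC, so dirty
                                           data is not flushed first */
    fm->fm_extent_count = MAX_EXTENTS;

    if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
        free(fm);
        return -1;
    }

    count = count_flagged(fm->fm_extents, fm->fm_mapped_extents,
                          FIEMAP_EXTENT_DELALLOC);
    free(fm);
    return count;
}
```

Calling count_delalloc_extents() on a file descriptor opened read-only is enough. Passing FIEMAP_FLAG_SYNC in fm_flags instead asks the kernel to sync the file before mapping, in which case delalloc extents would normally not be reported.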

FWIW, on xfs, xfs_bmap initially would cause a file flush; it never even
tried to report delalloc until fiemap came along ...

-Eric