2011-06-14 15:41:27

by Allison Henderson

[permalink] [raw]
Subject: xfstests 252 failure

Hi all,

I just wanted to get some ideas moving on this question before too much
time goes by. Ext4 is currently failing xfstest 252, test number 12.
Currently test 12 is:

$XFS_IO_PROG $xfs_io_opt -f -c "truncate 20k" \
-c "$alloc_cmd 0 20k" \
-c "pwrite 8k 4k" -c "fsync" \
-c "$zero_cmd 4k 12k" \
-c "$map_cmd -v" $testfile | $filter_cmd
[ $? -ne 0 ]&& die_now

and the output is:

12. unwritten -> data -> unwritten
0: [0..7]: unwritten
1: [8..31]: hole
2: [32..39]: unwritten

Ext4 gets data extents here instead of unwritten extents. I did some
investigating and it looks like the fsync command causes the extents to
be written out before the punch hole operation starts. It looks like
what happens is that when an unwritten extent gets written to, it doesnt
always split the extent. If the extent is small enough, then it just
zeros out the portions that are not written to, and the whole extent
becomes a written extent. Im not sure if that is incorrect or if we
need to change the test to not compare the extent types.

It looks to me that the code in ext4 that does this is supposed to be an
optimization to help reduce fragmentation. We could change the filters
to print just "extent" instead of "unwritten" or "data", but I realize
that probably makes the test a lot less effective for xfs. If anyone
can think of some more elegant fixes, please let me know. Thx!

Allison Henderson



2011-06-14 16:06:46

by Eric Sandeen

[permalink] [raw]
Subject: Re: xfstests 252 failure

On 6/14/11 10:41 AM, Allison Henderson wrote:
> Hi all,
>
> I just wanted to get some ideas moving on this question before too
> much time goes by. Ext4 is currently failing xfstest 252, test number
> 12. Currently test 12 is:
>
> $XFS_IO_PROG $xfs_io_opt -f -c "truncate 20k" \
> -c "$alloc_cmd 0 20k" \
> -c "pwrite 8k 4k" -c "fsync" \
> -c "$zero_cmd 4k 12k" \
> -c "$map_cmd -v" $testfile | $filter_cmd
> [ $? -ne 0 ]&& die_now

so the file should go through these steps:
(H=hole, P=prealloc, D=data)

0k 20k
| H | H | H | H | H | (truncate)
| P | P | P | P | P | (alloc_cmd)
| P | P | D | P | P | (pwrite)
<fsync> (fsync)
| P | H | H | H | P | (punch)

> and the output is:
>
> 12. unwritten -> data -> unwritten
> 0: [0..7]: unwritten
> 1: [8..31]: hole
> 2: [32..39]: unwritten
>
> Ext4 gets data extents here instead of unwritten extents.

so it's like this?

0: [0..7]: data
1: [8..31]: hole
2: [32..39]: data

> I did some
> investigating and it looks like the fsync command causes the extents
> to be written out before the punch hole operation starts. It looks
> like what happens is that when an unwritten extent gets written to,
> it doesnt always split the extent. If the extent is small enough,
> then it just zeros out the portions that are not written to, and the
> whole extent becomes a written extent. Im not sure if that is
> incorrect or if we need to change the test to not compare the extent
> types.

Yes, it does do that IIRC.

I probably need to look closer, but any test which expects exact
layouts from a filesystem after a series of operations is probably
expecting too much...

>From a data integrity perspective, written zeros is as good as a hole is
as good as preallocated space, so I suppose those should all be acceptable,
though I guess "punch" should result in holes exactly as requested.

> It looks to me that the code in ext4 that does this is supposed to be
> an optimization to help reduce fragmentation. We could change the
> filters to print just "extent" instead of "unwritten" or "data", but
> I realize that probably makes the test a lot less effective for xfs.
> If anyone can think of some more elegant fixes, please let me know.
> Thx!

Josef, what do you think? It's your test originally. :)

But if the test is supposed to verify hole punching behavior,
then to me, making unwritten extents equivalent to data extents
full of 0s should be fine. Perhaps an md5sum step would be
in order to verify that it is in fact all 0s.

-Eric

> Allison Henderson
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html


2011-06-14 18:42:02

by Josef Bacik

[permalink] [raw]
Subject: Re: xfstests 252 failure

On 06/14/2011 12:06 PM, Eric Sandeen wrote:
> On 6/14/11 10:41 AM, Allison Henderson wrote:
>> Hi all,
>>
>> I just wanted to get some ideas moving on this question before too
>> much time goes by. Ext4 is currently failing xfstest 252, test number
>> 12. Currently test 12 is:
>>
>> $XFS_IO_PROG $xfs_io_opt -f -c "truncate 20k" \
>> -c "$alloc_cmd 0 20k" \
>> -c "pwrite 8k 4k" -c "fsync" \
>> -c "$zero_cmd 4k 12k" \
>> -c "$map_cmd -v" $testfile | $filter_cmd
>> [ $? -ne 0 ]&& die_now
>
> so the file should go through these steps:
> (H=hole, P=prealloc, D=data)
>
> 0k 20k
> | H | H | H | H | H | (truncate)
> | P | P | P | P | P | (alloc_cmd)
> | P | P | D | P | P | (pwrite)
> <fsync> (fsync)
> | P | H | H | H | P | (punch)
>
>> and the output is:
>>
>> 12. unwritten -> data -> unwritten
>> 0: [0..7]: unwritten
>> 1: [8..31]: hole
>> 2: [32..39]: unwritten
>>
>> Ext4 gets data extents here instead of unwritten extents.
>
> so it's like this?
>
> 0: [0..7]: data
> 1: [8..31]: hole
> 2: [32..39]: data
>
>> I did some
>> investigating and it looks like the fsync command causes the extents
>> to be written out before the punch hole operation starts. It looks
>> like what happens is that when an unwritten extent gets written to,
>> it doesnt always split the extent. If the extent is small enough,
>> then it just zeros out the portions that are not written to, and the
>> whole extent becomes a written extent. Im not sure if that is
>> incorrect or if we need to change the test to not compare the extent
>> types.
>
> Yes, it does do that IIRC.
>
> I probably need to look closer, but any test which expects exact
> layouts from a filesystem after a series of operations is probably
> expecting too much...
>
> From a data integrity perspective, written zeros is as good as a hole is
> as good as preallocated space, so I suppose those should all be acceptable,
> though I guess "punch" should result in holes exactly as requested.
>
>> It looks to me that the code in ext4 that does this is supposed to be
>> an optimization to help reduce fragmentation. We could change the
>> filters to print just "extent" instead of "unwritten" or "data", but
>> I realize that probably makes the test a lot less effective for xfs.
>> If anyone can think of some more elegant fixes, please let me know.
>> Thx!
>
> Josef, what do you think? It's your test originally. :)
>

Yes, a test that was really only meant to test the block based fiemap
since they all act in a dumb and easy to verify way. I think if we want
to keep this test we should probably have it just recognize these little
optimizations so it doesn't freak out. Thanks,

Josef

2011-06-14 19:37:50

by Allison Henderson

[permalink] [raw]
Subject: Re: xfstests 252 failure

On 06/14/2011 11:41 AM, Josef Bacik wrote:
> On 06/14/2011 12:06 PM, Eric Sandeen wrote:
>> On 6/14/11 10:41 AM, Allison Henderson wrote:
>>> Hi all,
>>>
>>> I just wanted to get some ideas moving on this question before too
>>> much time goes by. Ext4 is currently failing xfstest 252, test number
>>> 12. Currently test 12 is:
>>>
>>> $XFS_IO_PROG $xfs_io_opt -f -c "truncate 20k" \
>>> -c "$alloc_cmd 0 20k" \
>>> -c "pwrite 8k 4k" -c "fsync" \
>>> -c "$zero_cmd 4k 12k" \
>>> -c "$map_cmd -v" $testfile | $filter_cmd
>>> [ $? -ne 0 ]&& die_now
>>
>> so the file should go through these steps:
>> (H=hole, P=prealloc, D=data)
>>
>> 0k 20k
>> | H | H | H | H | H | (truncate)
>> | P | P | P | P | P | (alloc_cmd)
>> | P | P | D | P | P | (pwrite)
>> <fsync> (fsync)
>> | P | H | H | H | P | (punch)
>>
>>> and the output is:
>>>
>>> 12. unwritten -> data -> unwritten
>>> 0: [0..7]: unwritten
>>> 1: [8..31]: hole
>>> 2: [32..39]: unwritten
>>>
>>> Ext4 gets data extents here instead of unwritten extents.
>>
>> so it's like this?
>>
>> 0: [0..7]: data
>> 1: [8..31]: hole
>> 2: [32..39]: data
>>
>>> I did some
>>> investigating and it looks like the fsync command causes the extents
>>> to be written out before the punch hole operation starts. It looks
>>> like what happens is that when an unwritten extent gets written to,
>>> it doesnt always split the extent. If the extent is small enough,
>>> then it just zeros out the portions that are not written to, and the
>>> whole extent becomes a written extent. Im not sure if that is
>>> incorrect or if we need to change the test to not compare the extent
>>> types.
>>
>> Yes, it does do that IIRC.
>>
>> I probably need to look closer, but any test which expects exact
>> layouts from a filesystem after a series of operations is probably
>> expecting too much...
>>
>> From a data integrity perspective, written zeros is as good as a hole is
>> as good as preallocated space, so I suppose those should all be acceptable,
>> though I guess "punch" should result in holes exactly as requested.
>>
>>> It looks to me that the code in ext4 that does this is supposed to be
>>> an optimization to help reduce fragmentation. We could change the
>>> filters to print just "extent" instead of "unwritten" or "data", but
>>> I realize that probably makes the test a lot less effective for xfs.
>>> If anyone can think of some more elegant fixes, please let me know.
>>> Thx!
>>
>> Josef, what do you think? It's your test originally. :)
>>
>
> Yes, a test that was really only meant to test the block based fiemap
> since they all act in a dumb and easy to verify way. I think if we want
> to keep this test we should probably have it just recognize these little
> optimizations so it doesn't freak out. Thanks,
>
> Josef

Alrighty then, so it sounds like we should adjust the filters to only
recognize extents and holes, and then add a checksum to the punched
files. I think that seems pretty straight forward. I already have a
patch set out there that is adding more punch hole tests, so I can add
these changes in with it if everyone is ok with that. Thx!

Allison Henderson