2013-03-03 16:11:16

by Zheng Liu

[permalink] [raw]
Subject: [PATCH 0/2] fixup bugs in e2fsprogs that is reported by xfstests #218

Hi all,

This patch series tries to fixup bugs in e2fsprogs that is reported by
xfstests #218 (This patch [1] need to be applied to fix a bug for #218).
The first commit tries to fix a e4defrag bug, which makes e4defrag to
defrag a ext4 file when orig_physcial_cnt == donor_physical_cnt. The
second one fixes a bug in filefrag that miss calculates contiguous
extents.

After applied this patch series, #218 can pass in the following cass:

4k Block
4K Block w/ dioread_nolock

The following cases will fail because in #218 it use 'dd' command to
create a contiguous file. But it could be failed. After using 'xfs_io'
command, this failure disappears.

4K Block w/ ^flex_bg, nodelalloc
4K Block w/ ^has_journal

The following cases fail because of other reasons.

* 4K Block w/ data=journal
218 3s ... - output mismatch (see 218.out.bad)
--- 218.out 2013-02-07 22:56:14.000000000 +0800
+++ 218.out.bad 2013-03-04 00:15:43.000000000 +0800
@@ -10,7 +10,7 @@
After: 1
Write backwards sync, but contiguous - should defrag to 1 extent
Before: 10
-After: 1
+After: 10
Write backwards sync leaving holes - defrag should do nothing
Before: 16
...
(Run 'diff -u 218.out 218.out.bad' to see the entire diff)
Ran: 218
Failures: 218
Failed 1 of 1 tests

* 4k Block w/ metadata_csum
FSTYP -- ext4
PLATFORM -- Linux/x86_64 lz-desktop 3.8.0
MKFS_OPTIONS -- -O metadata_csum /dev/sda2
MOUNT_OPTIONS -- -o acl,user_xattr /dev/sda2 /mnt/sda2

218 2s ... [failed, exit status 1] - output mismatch (see 218.out.bad)
--- 218.out 2013-02-07 22:56:14.000000000 +0800
+++ 218.out.bad 2013-03-03 22:37:37.000000000 +0800
@@ -10,10 +10,7 @@
After: 1
Write backwards sync, but contiguous - should defrag to 1 extent
Before: 10
-After: 1
-Write backwards sync leaving holes - defrag should do nothing
-Before: 16
-After: 16
...
(Run 'diff -u 218.out 218.out.bad' to see the entire diff)
Ran: 218
Failures: 218
Failed 1 of 1 tests

kernel: EXT4-fs error (device sda2): __ext4_ext_check_block:480: inode #12: comm
filefrag: bad header/extent: extent tree corrupted - magic f30a, entries 1, max
340(340), depth 0(0)

* 1k Block
FSTYP -- ext4
PLATFORM -- Linux/x86_64 lz-desktop 3.8.0
MKFS_OPTIONS -- -b 1024 /dev/sda2
MOUNT_OPTIONS -- -o acl,user_xattr /dev/sda2 /mnt/sda2

218 2s ... - output mismatch (see 218.out.bad)
--- 218.out 2013-02-07 22:56:14.000000000 +0800
+++ 218.out.bad 2013-03-03 22:41:58.000000000 +0800
@@ -9,7 +9,7 @@
Before: 1
After: 1
Write backwards sync, but contiguous - should defrag to 1 extent
-Before: 10
+Before: 7
After: 1
Write backwards sync leaving holes - defrag should do nothing
...
(Run 'diff -u 218.out 218.out.bad' to see the entire diff)
Ran: 218
Failures: 218
Failed 1 of 1 tests

1. http://permalink.gmane.org/gmane.comp.file-systems.ext4/37366

Regards,
- Zheng

Zheng Liu (2):
e4defrag: defrag a file when orig_physical_cnt == donor_physical_cnt
filefrag: count a contiguous extent when both logical and physical
blocks are contiguous

misc/e4defrag.c | 2 +-
misc/filefrag.c | 49 ++++++++++++++++++++++++++++++++-----------------
2 files changed, 33 insertions(+), 18 deletions(-)

--
1.7.12.rc2.18.g61b472e



2013-03-03 16:11:20

by Zheng Liu

[permalink] [raw]
Subject: [PATCH 1/2] e4defrag: defrag a file when orig_physical_cnt == donor_physical_cnt

From: Zheng Liu <[email protected]>

When orig_physical_cnt == donor_physical_cnt, we need to defrag a file
because a file could be written backwards. So that will make it look
like a contiguous extent but actually the physical blocks are reversed.

This case can be created artificially just like xfstests #218 does. The
following script can create it.

for I in `seq 9 -1 0`; do
dd if=/dev/zero of=$testfile bs=4k count=1 conv=notrunc \
seek=$I oflag=sync &>/dev/null
done
sudo filefrag -v $testfile
sudo e4defrag -v $testfile
sudo filefrag -v $testfile

Before applied this patch, the result of running script is like:

Filesystem type is: ef53
File size of /mnt/sda3/testfile is 40960 (10 blocks of 4096 bytes)
ext: logical_offset: physical_offset: length: expected: flags:
0: 0.. 0: 34825.. 34825: 1:
1: 1.. 1: 34824.. 34824: 1: 34826:
2: 2.. 2: 34823.. 34823: 1: 34825:
3: 3.. 3: 34822.. 34822: 1: 34824:
4: 4.. 4: 34821.. 34821: 1: 34823:
5: 5.. 5: 34820.. 34820: 1: 34822:
6: 6.. 6: 34819.. 34819: 1: 34821:
7: 7.. 7: 34818.. 34818: 1: 34820:
8: 8.. 8: 34817.. 34817: 1: 34819:
9: 9.. 9: 34816.. 34816: 1: 34818: eof
/mnt/sda3/testfile: 10 extents found
ext4 defragmentation for /mnt/sda3/testfile
[1/1]/mnt/sda3/testfile: 100% extents: 10 -> 10 [ OK ]
Success: [1/1]
Filesystem type is: ef53
File size of /mnt/sda3/testfile is 40960 (10 blocks of 4096 bytes)
ext: logical_offset: physical_offset: length: expected: flags:
0: 0.. 0: 34825.. 34825: 1:
1: 1.. 1: 34824.. 34824: 1: 34826:
2: 2.. 2: 34823.. 34823: 1: 34825:
3: 3.. 3: 34822.. 34822: 1: 34824:
4: 4.. 4: 34821.. 34821: 1: 34823:
5: 5.. 5: 34820.. 34820: 1: 34822:
6: 6.. 6: 34819.. 34819: 1: 34821:
7: 7.. 7: 34818.. 34818: 1: 34820:
8: 8.. 8: 34817.. 34817: 1: 34819:
9: 9.. 9: 34816.. 34816: 1: 34818: eof
/mnt/sda3/testfile: 10 extents found

After applied it, the result of running script looks like:
Filesystem type is: ef53
File size of /mnt/sda3/testfile is 40960 (10 blocks of 4096 bytes)
ext: logical_offset: physical_offset: length: expected: flags:
0: 0.. 0: 34313.. 34313: 1:
1: 1.. 1: 34312.. 34312: 1: 34314:
2: 2.. 2: 34311.. 34311: 1: 34313:
3: 3.. 3: 34310.. 34310: 1: 34312:
4: 4.. 4: 34309.. 34309: 1: 34311:
5: 5.. 5: 34308.. 34308: 1: 34310:
6: 6.. 6: 34307.. 34307: 1: 34309:
7: 7.. 7: 34306.. 34306: 1: 34308:
8: 8.. 8: 34305.. 34305: 1: 34307:
9: 9.. 9: 34304.. 34304: 1: 34306: eof
/mnt/sda3/testfile: 10 extents found
ext4 defragmentation for /mnt/sda3/testfile
[1/1]/mnt/sda3/testfile: 100% extents: 10 -> 1 [ OK ]
Success: [1/1]
Filesystem type is: ef53
File size of /mnt/sda3/testfile is 40960 (10 blocks of 4096 bytes)
ext: logical_offset: physical_offset: length: expected: flags:
0: 0.. 9: 33794.. 33803: 10: eof
/mnt/sda3/testfile: 1 extent found

Signed-off-by: Zheng Liu <[email protected]>
---
misc/e4defrag.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/misc/e4defrag.c b/misc/e4defrag.c
index 4b31d03..ae4e8d8 100644
--- a/misc/e4defrag.c
+++ b/misc/e4defrag.c
@@ -1649,7 +1649,7 @@ check_improvement:
}

if (file_frags_start <= best ||
- orig_physical_cnt <= donor_physical_cnt) {
+ orig_physical_cnt < donor_physical_cnt) {
printf("\033[79;0H\033[K[%u/%u]%s:\t%3d%%",
defraged_file_count, total_count, file, 100);
if (mode_flag & DETAIL)
--
1.7.12.rc2.18.g61b472e


2013-03-03 16:11:24

by Zheng Liu

[permalink] [raw]
Subject: [PATCH 2/2] filefrag: count a contiguous extent when both logical and physical blocks are contiguous

From: Zheng Liu <[email protected]>

This commit fixes a bug in filefrag that filefrag miss calculates
contiguous extent when physical blocks are contiguous but logical blocks
aren't. This case can be created by xfstests #218 or the following
script.

for I in `seq 0 2 31`; do
dd if=/dev/zero of=$testfile bs=4k count=1 conv=notrunc \
seek=$I oflag=sync &>/dev/null
done

Meanwhile this commit prints expected logical block.

Signed-off-by: Zheng Liu <[email protected]>
---
misc/filefrag.c | 49 ++++++++++++++++++++++++++++++++-----------------
1 file changed, 32 insertions(+), 17 deletions(-)

diff --git a/misc/filefrag.c b/misc/filefrag.c
index ee03a07..3af938f 100644
--- a/misc/filefrag.c
+++ b/misc/filefrag.c
@@ -116,16 +116,17 @@ static int get_bmap(int fd, unsigned long block, unsigned long *phy_blk)

static void print_extent_header(void)
{
- printf(" ext: %*s %*s length: %*s flags:\n",
+ printf(" ext: %*s %*s length: %*s %*s flags:\n",
logical_width * 2 + 3,
"logical_offset:",
physical_width * 2 + 3, "physical_offset:",
- physical_width + 1,
- "expected:");
+ physical_width + 1, "log_expected:",
+ physical_width + 1, "phy_expected:");
}

static void print_extent_info(struct fiemap_extent *fm_extent, int cur_ex,
- unsigned long long expected, int blk_shift,
+ unsigned long long lexpected,
+ unsigned long long pexpected, int blk_shift,
ext2fs_struct_stat *st)
{
unsigned long long physical_blk;
@@ -133,6 +134,8 @@ static void print_extent_info(struct fiemap_extent *fm_extent, int cur_ex,
unsigned long long ext_len;
unsigned long long ext_blks;
char flags[256] = "";
+ char logex_flags[32] = "";
+ char phyex_flags[32] = "";

/* For inline data all offsets should be in bytes, not blocks */
if (fm_extent->fe_flags & FIEMAP_EXTENT_DATA_INLINE)
@@ -143,11 +146,20 @@ static void print_extent_info(struct fiemap_extent *fm_extent, int cur_ex,
logical_blk = fm_extent->fe_logical >> blk_shift;
physical_blk = fm_extent->fe_physical >> blk_shift;

- if (expected)
- sprintf(flags, ext_fmt == hex_fmt ? "%*llx: " : "%*llu: ",
- physical_width, expected >> blk_shift);
+ if (lexpected)
+ sprintf(logex_flags, ext_fmt == hex_fmt ? "%*llx: " : "%*llu: ",
+ physical_width, lexpected >> blk_shift);
else
- sprintf(flags, "%.*s ", physical_width, " ");
+ sprintf(logex_flags, "%.*s ", physical_width, " ");
+
+ if (pexpected)
+ sprintf(phyex_flags, ext_fmt == hex_fmt ? "%*llx: " : "%*llu: ",
+ physical_width, pexpected >> blk_shift);
+ else
+ sprintf(phyex_flags, "%.*s ", physical_width, " ");
+
+ strcat(flags, logex_flags);
+ strcat(flags, phyex_flags);

if (fm_extent->fe_flags & FIEMAP_EXTENT_UNKNOWN)
strcat(flags, "unknown,");
@@ -188,7 +200,7 @@ static int filefrag_fiemap(int fd, int blk_shift, int *num_extents,
struct fiemap_extent *fm_ext = &fiemap->fm_extents[0];
int count = (sizeof(buf) - sizeof(*fiemap)) /
sizeof(struct fiemap_extent);
- unsigned long long expected = 0;
+ unsigned long long lexpected = 0, pexpected = 0;
unsigned long flags = 0;
unsigned int i;
static int fiemap_incompat_printed;
@@ -230,18 +242,21 @@ static int filefrag_fiemap(int fd, int blk_shift, int *num_extents,

for (i = 0; i < fiemap->fm_mapped_extents; i++) {
if (fm_ext[i].fe_logical != 0 &&
- fm_ext[i].fe_physical != expected) {
+ (fm_ext[i].fe_logical != lexpected ||
+ fm_ext[i].fe_physical != pexpected)) {
tot_extents++;
} else {
- expected = 0;
+ lexpected = 0;
+ pexpected = 0;
if (!tot_extents)
tot_extents = 1;
}
if (verbose)
- print_extent_info(&fm_ext[i], n, expected,
- blk_shift, st);
+ print_extent_info(&fm_ext[i], n, lexpected,
+ pexpected, blk_shift, st);

- expected = fm_ext[i].fe_physical + fm_ext[i].fe_length;
+ lexpected = fm_ext[i].fe_logical + fm_ext[i].fe_length;
+ pexpected = fm_ext[i].fe_physical + fm_ext[i].fe_length;
if (fm_ext[i].fe_flags & FIEMAP_EXTENT_LAST)
last = 1;
n++;
@@ -308,7 +323,7 @@ static int filefrag_fibmap(int fd, int blk_shift, int *num_extents,
if (force_extent && last_block != 0 &&
(block != last_block + 1 ||
fm_ext.fe_logical + fm_ext.fe_length != logical)) {
- print_extent_info(&fm_ext, *num_extents - 1,
+ print_extent_info(&fm_ext, *num_extents - 1, logical,
(last_block + 1) * st->st_blksize,
blk_shift, st);
fm_ext.fe_logical = logical;
@@ -325,7 +340,7 @@ static int filefrag_fibmap(int fd, int blk_shift, int *num_extents,
}

if (force_extent)
- print_extent_info(&fm_ext, *num_extents - 1,
+ print_extent_info(&fm_ext, *num_extents - 1, logical,
last_block * st->st_blksize, blk_shift, st);

return count;
@@ -339,7 +354,7 @@ static void frag_report(const char *filename)
long fd;
unsigned long numblocks;
int data_blocks_per_cyl = 1;
- int num_extents = 1, expected = ~0;
+ int num_extents = 1, expected = 0;
int is_ext2 = 0;
static dev_t last_device;
unsigned int flags;
--
1.7.12.rc2.18.g61b472e


2013-03-13 19:54:26

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH 1/2] e4defrag: defrag a file when orig_physical_cnt == donor_physical_cnt

On Mon, Mar 04, 2013 at 12:26:17AM +0800, Zheng Liu wrote:
> From: Zheng Liu <[email protected]>
>
> When orig_physical_cnt == donor_physical_cnt, we need to defrag a file
> because a file could be written backwards. So that will make it look
> like a contiguous extent but actually the physical blocks are reversed.

The problem with your change is in the case where orig_physical_cnt
and donor_physical_cnt are 1 (i.e., the file is perfectly defragged),
we will still try to swap the extents.

The fundamental problem is that we are using a metric which is flawed;
in the case of the following file:

> File size of /mnt/sda3/testfile is 40960 (10 blocks of 4096 bytes)
> ext: logical_offset: physical_offset: length: expected: flags:
> 0: 0.. 0: 34825.. 34825: 1:
> 1: 1.. 1: 34824.. 34824: 1: 34826:
> 2: 2.. 2: 34823.. 34823: 1: 34825:
> 3: 3.. 3: 34822.. 34822: 1: 34824:
> 4: 4.. 4: 34821.. 34821: 1: 34823:
> 5: 5.. 5: 34820.. 34820: 1: 34822:
> 6: 6.. 6: 34819.. 34819: 1: 34821:
> 7: 7.. 7: 34818.. 34818: 1: 34820:
> 8: 8.. 8: 34817.. 34817: 1: 34819:
> 9: 9.. 9: 34816.. 34816: 1: 34818: eof

We should be counting this as having 10 extents, not 1.

- Ted

2013-03-13 20:16:16

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH 2/2] filefrag: count a contiguous extent when both logical and physical blocks are contiguous

On Mon, Mar 04, 2013 at 12:26:18AM +0800, Zheng Liu wrote:
> From: Zheng Liu <[email protected]>
>
> This commit fixes a bug in filefrag that filefrag miss calculates
> contiguous extent when physical blocks are contiguous but logical blocks
> aren't. This case can be created by xfstests #218 or the following
> script.
>
> for I in `seq 0 2 31`; do
> dd if=/dev/zero of=$testfile bs=4k count=1 conv=notrunc \
> seek=$I oflag=sync &>/dev/null
> done
>
> Meanwhile this commit prints expected logical block.

Hmm, this (and your previous patch) fundamentally raises the question
of what do we call an "extent", doesn't it?

Ignoring for now the question of what xfstests #218 is expecting (if
we disagree with what's "best", we should have a discussion with the
other fs maintainers, and in the worst case, make our own version of
the test), the question is how should defragmentation handle sparse
files? In general, sparse files imply that random access workload, so
whether or not the file is contiguous doesn't really matter much.

If we want to optimize the time to copy said sparse file, and if we
assume that by the time we are defragging said sparse file, we are
done doing writes which will allocate new blocks, then having defrag
optimize the file so that when the extents are sorted by logical block
number, the physical block numbers are contiguous, then that's
probably the best "figure of merit" to use. And I'll note that right
now that's what filefrag is reporting, and what I think e4defrag
should be changed to use when deciding whether the donor file is
"better" than the original file.

But that's not necessarily the only way to measure extents, and the
current e4defrag code is clearly of the opinion that if the file is
using a contiguous region of blocks, even if the blocks were allocated
"backwards", that there's no point defragging the file, since after
all, if the file was written in such a random order with respect to
logical block numbers, it will probably be read in a random order, so
keeping the file blocks used contiguous to minimize free block
fragmentation is the best thing to shoot for.

It's not clear that there's one right answer, but things will be a lot
less confusing if we can agree amongst ourselves what answer we want
to use --- and then if we need to either change the xfstests, or maybe
create an option to filefrag to calculate the number of fragments that
the xfstest is expecting. But we should first decide what is the
right thing, and then we can see whether or not what it matches what
the test is demanding.

- Ted

2013-03-14 12:39:24

by Zheng Liu

[permalink] [raw]
Subject: Re: [PATCH 2/2] filefrag: count a contiguous extent when both logical and physical blocks are contiguous

On Wed, Mar 13, 2013 at 04:16:11PM -0400, Theodore Ts'o wrote:
> On Mon, Mar 04, 2013 at 12:26:18AM +0800, Zheng Liu wrote:
> > From: Zheng Liu <[email protected]>
> >
> > This commit fixes a bug in filefrag that filefrag miss calculates
> > contiguous extent when physical blocks are contiguous but logical blocks
> > aren't. This case can be created by xfstests #218 or the following
> > script.
> >
> > for I in `seq 0 2 31`; do
> > dd if=/dev/zero of=$testfile bs=4k count=1 conv=notrunc \
> > seek=$I oflag=sync &>/dev/null
> > done
> >
> > Meanwhile this commit prints expected logical block.
>
> Hmm, this (and your previous patch) fundamentally raises the question
> of what do we call an "extent", doesn't it?
>
> Ignoring for now the question of what xfstests #218 is expecting (if
> we disagree with what's "best", we should have a discussion with the
> other fs maintainers, and in the worst case, make our own version of
> the test), the question is how should defragmentation handle sparse
> files? In general, sparse files imply that random access workload, so
> whether or not the file is contiguous doesn't really matter much.
>
> If we want to optimize the time to copy said sparse file, and if we
> assume that by the time we are defragging said sparse file, we are
> done doing writes which will allocate new blocks, then having defrag
> optimize the file so that when the extents are sorted by logical block
> number, the physical block numbers are contiguous, then that's
> probably the best "figure of merit" to use. And I'll note that right
> now that's what filefrag is reporting, and what I think e4defrag
> should be changed to use when deciding whether the donor file is
> "better" than the original file.
>
> But that's not necessarily the only way to measure extents, and the
> current e4defrag code is clearly of the opinion that if the file is
> using a contiguous region of blocks, even if the blocks were allocated
> "backwards", that there's no point defragging the file, since after
> all, if the file was written in such a random order with respect to
> logical block numbers, it will probably be read in a random order, so
> keeping the file blocks used contiguous to minimize free block
> fragmentation is the best thing to shoot for.
>
> It's not clear that there's one right answer, but things will be a lot
> less confusing if we can agree amongst ourselves what answer we want
> to use --- and then if we need to either change the xfstests, or maybe
> create an option to filefrag to calculate the number of fragments that
> the xfstest is expecting. But we should first decide what is the
> right thing, and then we can see whether or not what it matches what
> the test is demanding.

Thanks for the explanation. Indeed the key problem is how to define an
extent. I agree with you that we shouldn't only match what the test
expects, and just let test case pass.

Regards,
- Zheng

2013-03-17 23:11:18

by Dave Chinner

[permalink] [raw]
Subject: Re: [PATCH 2/2] filefrag: count a contiguous extent when both logical and physical blocks are contiguous

On Thu, Mar 14, 2013 at 08:54:42PM +0800, Zheng Liu wrote:
> On Wed, Mar 13, 2013 at 04:16:11PM -0400, Theodore Ts'o wrote:
> > On Mon, Mar 04, 2013 at 12:26:18AM +0800, Zheng Liu wrote:
> > > From: Zheng Liu <[email protected]>
> > >
> > > This commit fixes a bug in filefrag that filefrag miss calculates
> > > contiguous extent when physical blocks are contiguous but logical blocks
> > > aren't. This case can be created by xfstests #218 or the following
> > > script.
> > >
> > > for I in `seq 0 2 31`; do
> > > dd if=/dev/zero of=$testfile bs=4k count=1 conv=notrunc \
> > > seek=$I oflag=sync &>/dev/null
> > > done
> > >
> > > Meanwhile this commit prints expected logical block.
> >
> > Hmm, this (and your previous patch) fundamentally raises the question
> > of what do we call an "extent", doesn't it?
> >
> > Ignoring for now the question of what xfstests #218 is expecting (if
> > we disagree with what's "best", we should have a discussion with the
> > other fs maintainers, and in the worst case, make our own version of
> > the test), the question is how should defragmentation handle sparse
> > files? In general, sparse files imply that random access workload, so
> > whether or not the file is contiguous doesn't really matter much.
> >
> > If we want to optimize the time to copy said sparse file, and if we
> > assume that by the time we are defragging said sparse file, we are
> > done doing writes which will allocate new blocks, then having defrag
> > optimize the file so that when the extents are sorted by logical block
> > number, the physical block numbers are contiguous, then that's
> > probably the best "figure of merit" to use. And I'll note that right
> > now that's what filefrag is reporting, and what I think e4defrag
> > should be changed to use when deciding whether the donor file is
> > "better" than the original file.
> >
> > But that's not necessarily the only way to measure extents, and the
> > current e4defrag code is clearly of the opinion that if the file is
> > using a contiguous region of blocks, even if the blocks were allocated
> > "backwards", that there's no point defragging the file, since after
> > all, if the file was written in such a random order with respect to
> > logical block numbers, it will probably be read in a random order, so
> > keeping the file blocks used contiguous to minimize free block
> > fragmentation is the best thing to shoot for.
> >
> > It's not clear that there's one right answer, but things will be a lot
> > less confusing if we can agree amongst ourselves what answer we want
> > to use --- and then if we need to either change the xfstests, or maybe
> > create an option to filefrag to calculate the number of fragments that
> > the xfstest is expecting. But we should first decide what is the
> > right thing, and then we can see whether or not what it matches what
> > the test is demanding.
>
> Thanks for the explanation. Indeed the key problem is how to define an
> extent. I agree with you that we shouldn't only match what the test
> expects, and just let test case pass.

AFAIC, there is no ambiguity into what defines an extent. An extent
defines a single logical offset to physical block relationship. As
soon as either logical offset or physical block are no longer able
to be described by the relationship, you need to define a new
relationship. That's the definition every filesystem uses for
recording extents on disk, and that gives a pretty solid basis for
what everyone expects filefrag to report to users as an extent.

IOWs, a sparse file is simply a bunch of extent records that have
discontiguous logical offsets. Similarly, a fragmented file is
simply a bunch of extent records that are physically discontiguous.

Therefore, the definition of "sparse" and "fragmented" is not
dependent on the definition of "extent", rather it is dependent on
the relationship between extents and how the filesystem itself
defines that relationship.

So, back to test 218. If you have 10 extents that are logically
contiguous, but physically discontiguous, is that a fragmented file?

The question 218 is asking is not whether those 10 extents were
allocated backwards (that's just an underhanded trick to make the
XFS allocator create a fragmented file), but rather whether the
defragmenter can recognise files that match the above definition
of a fragmented file and hence defragment them down to a single
extent.

Similarly, 218 then creates sparse files and determines whether the
defragmenter behaves as expected for a sparse file. That is, you
cannot physically merge the extents because they are discontiguous,
and hence the file should not be touched by the defragmenter.

Hence I think you are reading way too much into the test case. It's
testing a basic premise of defragmenter behaviour - recognising
sparse and/or fragmented files and dealing with them appropriately.
If you want to change how e2defrag defines "fragmented" and "sparse"
to be something more complex and different to what 218 expects, then
you'll need a new e2defrag specific test.

Cheers,

Dave.
--
Dave Chinner
[email protected]

2013-03-19 18:44:48

by Phillip Susi

[permalink] [raw]
Subject: Re: [PATCH 2/2] filefrag: count a contiguous extent when both logical and physical blocks are contiguous

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 3/13/2013 4:16 PM, Theodore Ts'o wrote:
> If we want to optimize the time to copy said sparse file, and if
> we assume that by the time we are defragging said sparse file, we
> are done doing writes which will allocate new blocks, then having
> defrag optimize the file so that when the extents are sorted by
> logical block number, the physical block numbers are contiguous,
> then that's probably the best "figure of merit" to use. And I'll
> note that right now that's what filefrag is reporting, and what I
> think e4defrag should be changed to use when deciding whether the
> donor file is "better" than the original file.

That's certainly exactly what I expect to get after defragging a file.

> But that's not necessarily the only way to measure extents, and
> the current e4defrag code is clearly of the opinion that if the
> file is using a contiguous region of blocks, even if the blocks
> were allocated "backwards", that there's no point defragging the
> file, since after all, if the file was written in such a random
> order with respect to logical block numbers, it will probably be
> read in a random order, so keeping the file blocks used contiguous
> to minimize free block fragmentation is the best thing to shoot
> for.

That seems wrong to me. It shouldn't matter what condition the file
starts in; the end should always be a file with blocks allocated in
the proper order. If there's no point in defragging a randomly
accessed file, then the user shouldn't bother running defrag on it,
and conversely, if they do, it should perform the task they requested.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (MingW32)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJRSLIeAAoJEJrBOlT6nu7526UH/RTFmZWDXgrzGUwbbBJ1IN6t
aylOR1Wg3SWKEp5Cs717iHCNPMosF31bHkHG+MbpoSvJpAfRPZ7svTDTbzfBZ17F
dLKLOAYx3KRBeX8TSYIC7e5E/qTg1C6wmiS1Rlab5boeRkmUL/kP7WPpCgKHpydZ
tgUtzO7npCWVxP3eG9NnOR9y9U0PHuS3B9Q5S2vhbeqpxQnEi4cphoSw1bOJVcCj
RkddmEUuj+usDl8kHwxlyoiRhcwMUY59UuTqEWqgDkntQL5DTdX//AfrUdW3PBmB
DKIC4dCaoivXoKarKRuLaHM04rldOvZdxNFFNnoKlnQMhh4i82RnEYSio26778c=
=0Iea
-----END PGP SIGNATURE-----