I have an x86_64 kernel with i386 userspace. e4defrag fails on the
EXT4_IOC_MOVE_EXT ioctl because it is not wired up for the compat
case. It seems that struct move_extent is compat save, only types
with fixed widths are used:
{
__u32 reserved; /* should be zero */
__u32 donor_fd; /* donor file descriptor */
__u64 orig_start; /* logical start offset in block for orig */
__u64 donor_start; /* logical start offset in block for donor */
__u64 len; /* block length to be moved */
__u64 moved_len; /* moved block length */
};
Lets just wire up EXT4_IOC_MOVE_EXT for the compat case.
Signed-off-by: Christian Borntraeger <[email protected]>
CCed: Akira Fujita <[email protected]>
---
fs/ext4/ioctl.c | 2 ++
1 file changed, 2 insertions(+)
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -375,6 +375,8 @@ long ext4_compat_ioctl(struct file *file
break;
case EXT4_IOC_GROUP_ADD:
break;
+ case EXT4_IOC_MOVE_EXT:
+ break;
default:
return -ENOIOCTLCMD;
}
On 03/07/2010 03:32 PM, Christian Borntraeger wrote:
> I have an x86_64 kernel with i386 userspace. e4defrag fails on the
> EXT4_IOC_MOVE_EXT ioctl because it is not wired up for the compat
> case. It seems that struct move_extent is compat save, only types
> with fixed widths are used:
> {
> __u32 reserved; /* should be zero */
> __u32 donor_fd; /* donor file descriptor */
> __u64 orig_start; /* logical start offset in block for orig */
> __u64 donor_start; /* logical start offset in block for donor */
> __u64 len; /* block length to be moved */
> __u64 moved_len; /* moved block length */
> };
>
> Lets just wire up EXT4_IOC_MOVE_EXT for the compat case.
>
> Signed-off-by: Christian Borntraeger<[email protected]>
> CCed: Akira Fujita<[email protected]>
I'm curious, what is the overall deployment status of ext4 defragging?
I actually worked on this problem years ago[1], and am hopeful that I
will see defragging in a Linux distribution sometime in my lifetime! ;-)
Looking at Fedora rawhide, I do not see anything resembling e4defrag in
any of the RPM packages like e2fsprogs.
Thanks to everyone working on this,
Jeff
[1] http://linux.yyz.us/misc/ext2meta.c
Jeff Garzik wrote:
> On 03/07/2010 03:32 PM, Christian Borntraeger wrote:
>> I have an x86_64 kernel with i386 userspace. e4defrag fails on the
>> EXT4_IOC_MOVE_EXT ioctl because it is not wired up for the compat
>> case. It seems that struct move_extent is compat save, only types
>> with fixed widths are used:
>> {
>> __u32 reserved; /* should be zero */
>> __u32 donor_fd; /* donor file descriptor */
>> __u64 orig_start; /* logical start offset in block for
>> orig */
>> __u64 donor_start; /* logical start offset in block for
>> donor */
>> __u64 len; /* block length to be moved */
>> __u64 moved_len; /* moved block length */
>> };
>>
>> Lets just wire up EXT4_IOC_MOVE_EXT for the compat case.
>>
>> Signed-off-by: Christian Borntraeger<[email protected]>
>> CCed: Akira Fujita<[email protected]>
>
> I'm curious, what is the overall deployment status of ext4 defragging? I
> actually worked on this problem years ago[1], and am hopeful that I will
> see defragging in a Linux distribution sometime in my lifetime! ;-)
on ext4 you mean, I guess - you could use XFS if defragging is a high priority,
see xfs_fsr(8)
> Looking at Fedora rawhide, I do not see anything resembling e4defrag in
> any of the RPM packages like e2fsprogs.
I had it for a while, but with the various problems and general uneasiness
with the code so far, I took it back out lest people lose data to it.
-Eric
> Thanks to everyone working on this,
>
> Jeff
>
>
>
>
> [1] http://linux.yyz.us/misc/ext2meta.c
Christian Borntraeger wrote:
> I have an x86_64 kernel with i386 userspace. e4defrag fails on the
> EXT4_IOC_MOVE_EXT ioctl because it is not wired up for the compat
> case. It seems that struct move_extent is compat save, only types
> with fixed widths are used:
and they're well-aligned.
> {
> __u32 reserved; /* should be zero */
> __u32 donor_fd; /* donor file descriptor */
> __u64 orig_start; /* logical start offset in block for orig */
> __u64 donor_start; /* logical start offset in block for donor */
> __u64 len; /* block length to be moved */
> __u64 moved_len; /* moved block length */
> };
>
> Lets just wire up EXT4_IOC_MOVE_EXT for the compat case.
>
> Signed-off-by: Christian Borntraeger <[email protected]>
> CCed: Akira Fujita <[email protected]>
Looks fine to me.
Reviewed-by: Eric Sandeen <[email protected]>
> ---
> fs/ext4/ioctl.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> --- a/fs/ext4/ioctl.c
> +++ b/fs/ext4/ioctl.c
> @@ -375,6 +375,8 @@ long ext4_compat_ioctl(struct file *file
> break;
> case EXT4_IOC_GROUP_ADD:
> break;
> + case EXT4_IOC_MOVE_EXT:
> + break;
> default:
> return -ENOIOCTLCMD;
> }
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
Am Montag 08 M?rz 2010 00:27:58 schrieb Jeff Garzik:
> I'm curious, what is the overall deployment status of ext4 defragging?
> I actually worked on this problem years ago[1], and am hopeful that I
> will see defragging in a Linux distribution sometime in my lifetime! ;-)
I had to make the following patch. (There was the same patch already on the
e2fsprogs mailing list, but I cannot find that at the moment. Would be good
if this patch can make it into the next version).
--- a/misc/e4defrag.c
+++ b/misc/e4defrag.c
@@ -1605,7 +1605,7 @@ static int file_defrag(const char *file, const struct stat64 *buf,
return 0;
}
- fd = open64(file, O_RDONLY);
+ fd = open64(file, O_RDWR);
if (fd < 0) {
if (mode_flag & DETAIL) {
PRINT_FILE_NAME(file);
I did use it on two machines and it does not seem to break anything.
Some bigger things are missing in the e4defrag tool:
- make the indirect to extend change for old files - e.g. converted ext3
file systems. Adding the MIGRATE ioctl does not look very hard, but who knows?
- defragging directories (and also remove deleted entries)
- free space optimization is only done as a side effect
- overall layout considerations (e.g. putting files close to its directory or
use the atime to move often used files to the beginning of a disk etc.)
Hi Christian,
(2010/03/08 5:32), Christian Borntraeger wrote:
> I have an x86_64 kernel with i386 userspace. e4defrag fails on the
> EXT4_IOC_MOVE_EXT ioctl because it is not wired up for the compat
> case. It seems that struct move_extent is compat save, only types
> with fixed widths are used:
> {
> __u32 reserved; /* should be zero */
> __u32 donor_fd; /* donor file descriptor */
> __u64 orig_start; /* logical start offset in block for orig */
> __u64 donor_start; /* logical start offset in block for donor */
> __u64 len; /* block length to be moved */
> __u64 moved_len; /* moved block length */
> };
>
> Lets just wire up EXT4_IOC_MOVE_EXT for the compat case.
>
> Signed-off-by: Christian Borntraeger<[email protected]>
> CCed: Akira Fujita<[email protected]>
>
> ---
> fs/ext4/ioctl.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> --- a/fs/ext4/ioctl.c
> +++ b/fs/ext4/ioctl.c
> @@ -375,6 +375,8 @@ long ext4_compat_ioctl(struct file *file
> break;
> case EXT4_IOC_GROUP_ADD:
> break;
> + case EXT4_IOC_MOVE_EXT:
> + break;
> default:
> return -ENOIOCTLCMD;
> }
Looks good for me.
Thank you for your work.
Regards,
Akira Fujita
Christian Borntraeger wrote:
> Some bigger things are missing in the e4defrag tool:
> ...
> - overall layout considerations (e.g. putting files close to its directory or
> use the atime to move often used files to the beginning of a disk etc.)
Shouldn't oft-used files be placed closer to the middle? If you place
them at the beginning of the file, it's only possible for the head-stack
to be close to the file from the inner direction. Place them in the
middle and it's possible for the head-stack to be close from the outer
direction, too, which sounds like a doubling of probability. It seems
that it's the least frequently used files that should be placed at one
end of the disk or the other.
Am Montag 08 M?rz 2010 16:33:46 schrieb David Newall:
> Christian Borntraeger wrote:
> > Some bigger things are missing in the e4defrag tool:
> > ...
> > - overall layout considerations (e.g. putting files close to its directory or
> > use the atime to move often used files to the beginning of a disk etc.)
>
> Shouldn't oft-used files be placed closer to the middle? If you place
> them at the beginning of the file, it's only possible for the head-stack
> to be close to the file from the inner direction. Place them in the
> middle and it's possible for the head-stack to be close from the outer
> direction, too, which sounds like a doubling of probability. It seems
> that it's the least frequently used files that should be placed at one
> end of the disk or the other.
>
Maybe. This was just an example of things that you can come
up with and which are not yet possible up in e4defrag.
Christian
David Newall wrote:
> Christian Borntraeger wrote:
>> Some bigger things are missing in the e4defrag tool:
>> ...
>> - overall layout considerations (e.g. putting files close to its
>> directory or
>> use the atime to move often used files to the beginning of a disk etc.)
>
> Shouldn't oft-used files be placed closer to the middle? If you place
> them at the beginning of the file, it's only possible for the head-stack
> to be close to the file from the inner direction. Place them in the
> middle and it's possible for the head-stack to be close from the outer
> direction, too, which sounds like a doubling of probability. It seems
> that it's the least frequently used files that should be placed at one
> end of the disk or the other.
No. Your logic would be correct if rotating disks had
similar speed at all locations. Current disks are much
faster at the 0 end than at the middle or highest address.
It is not unusual to see 2x difference in transfer speed
so you always want the important stuff as low as possible.
jim
On Mon, Mar 8, 2010 at 11:22 AM, jim owens <[email protected]> wrote:
> David Newall wrote:
>> Christian Borntraeger wrote:
>>> Some bigger things are missing in the e4defrag tool:
>>> ...
>>> - overall layout considerations (e.g. putting files close to its
>>> directory or
>>> ? use the atime to move often used files to the beginning of a disk etc.)
>>
>> Shouldn't oft-used files be placed closer to the middle? ?If you place
>> them at the beginning of the file, it's only possible for the head-stack
>> to be close to the file from the inner direction. ?Place them in the
>> middle and it's possible for the head-stack to be close from the outer
>> direction, too, which sounds like a doubling of probability. ?It seems
>> that it's the least frequently used files that should be placed at one
>> end of the disk or the other.
>
> No. ?Your logic would be correct if rotating disks had
> similar speed at all locations. ?Current disks are much
> faster at the 0 end than at the middle or highest address.
>
> It is not unusual to see 2x difference in transfer speed
> so you always want the important stuff as low as possible.
>
> jim
Jim, I should know this, but is sector 0 on the outside edge, or the inner edge?
I assume outer so that the linear speed of the platter under the head
is faster and thus more data per second is passing under the head.
Greg
Greg Freemyer wrote:
>
> Jim, I should know this, but is sector 0 on the outside edge, or the inner edge?
>
> I assume outer so that the linear speed of the platter under the head
> is faster and thus more data per second is passing under the head.
Correct. AFAIK everyone starts 0 at the outer edge for that reason,
it makes for better benchmarks ;)
When I only worried about a few OEM drives, I used to read the zone
geometry from the drive to see where each speed transition was as the
density decreased. But that is just not worth the effort in linux
filesystems IMO, it is enough to pack low.
jim
On Mon, 08 Mar 2010 11:22:15 EST, jim owens said:
> No. Your logic would be correct if rotating disks had
> similar speed at all locations. Current disks are much
> faster at the 0 end than at the middle or highest address.
>
> It is not unusual to see 2x difference in transfer speed
> so you always want the important stuff as low as possible.
On the flip side, seek time is so much larger than the time spent
actually reading that minimizing the seeks will improve total throughput
more. Sure, maybe you spend 0.05ms reading instead of 0.1ms - but if
the seek took 0.75ms rather than 0.5ms you're still in the hole.
[email protected] wrote:
> On Mon, 08 Mar 2010 11:22:15 EST, jim owens said:
>
>> No. Your logic would be correct if rotating disks had
>> similar speed at all locations. Current disks are much
>> faster at the 0 end than at the middle or highest address.
>>
>> It is not unusual to see 2x difference in transfer speed
>> so you always want the important stuff as low as possible.
>
> On the flip side, seek time is so much larger than the time spent
> actually reading that minimizing the seeks will improve total throughput
> more. Sure, maybe you spend 0.05ms reading instead of 0.1ms - but if
> the seek took 0.75ms rather than 0.5ms you're still in the hole.
Agree that seek proximity is important, but seeking at the low end
for N blocks is also faster than seeking the same N blocks at the
slow high end of the disk.
So my point is you want to pack the data together when you can
at the low address end of the disk to maximize performance.
And we should include some free space in the fast zone because
much of our performance involves short-lived writes.
If you are filling and accessing all of the disk, then maybe
"pack from the middle" will average out better, but we hope
most filesystems have a small hot subset. If you are often
accessing outside the hot zone a lot then you probably will
get end-to-end seek penalties anyway if you are defragging
to an "old, free, hot, free, old" layout.
jim
After thinking about it overnight, I realized I think in terms
of 1 drive is 1 filesystem. That is a fatal trap for defragment.
> When I only worried about a few OEM drives, I used to read the zone
> geometry from the drive to see where each speed transition was as the
> density decreased. But that is just not worth the effort in linux
> filesystems IMO, it is enough to pack low.
So I retract that we don't care about zone geometry, we need to
care deeply, but not in the sense of how moving short distances
on a drive affects the performance. What we need to ensure is
that the placer algorithm does not span across partitions as in:
["/" 100GB created] [300GB other] [100G LVM added to "/"]
so the filesystem thinks it is 200GB contiguous and the
defragmenter thinks address 90GB is closer to address 110 GB
than 90GB is to 50GB.
jim
jim owens wrote:
> No. Your logic would be correct if rotating disks had
> similar speed at all locations. Current disks are much
> faster at the 0 end than at the middle or highest address.
>
I think my logic is still correct, although I wished I had said "closer
to the middle." In fact, simplistic ideas for placement of files are
unlikely to produce fabulous results (and that includes placing commonly
used files towards the middle of the disk, say at the inside edge of the
outermost zone.) The effort that BSD went to in FFS, placing
directories with files and meta-data in cylinder groups, illustrates
that disk performance is a sophisticated problem.
Why don't we use BSD FFS/FFS2?
Ted, can you consider this patch for ext4?
From: Christian Borntraeger <[email protected]>
I have an x86_64 kernel with i386 userspace. e4defrag fails on the
EXT4_IOC_MOVE_EXT ioctl because it is not wired up for the compat
case. It seems that
struct move_extent is compat save, only types with fixed widths are used.
{
__u32 reserved; /* should be zero */
__u32 donor_fd; /* donor file descriptor */
__u64 orig_start; /* logical start offset in block for orig */
__u64 donor_start; /* logical start offset in block for donor */
__u64 len; /* block length to be moved */
__u64 moved_len; /* moved block length */
};
Signed-off-by: Christian Borntraeger <[email protected]>
Reviewed-by: Eric Sandeen <[email protected]>
Acked-by: Akira Fujita <[email protected]>
---
fs/ext4/ioctl.c | 2 ++
1 file changed, 2 insertions(+)
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -375,6 +375,8 @@ long ext4_compat_ioctl(struct file *file
break;
case EXT4_IOC_GROUP_ADD:
break;
+ case EXT4_IOC_MOVE_EXT:
+ break;
default:
return -ENOIOCTLCMD;
}
On Sun, Mar 07, 2010 at 09:32:10PM +0100, Christian Borntraeger wrote:
> I have an x86_64 kernel with i386 userspace. e4defrag fails on the
> EXT4_IOC_MOVE_EXT ioctl because it is not wired up for the compat
> case. It seems that struct move_extent is compat save, only types
> with fixed widths are used:
> {
> __u32 reserved; /* should be zero */
> __u32 donor_fd; /* donor file descriptor */
> __u64 orig_start; /* logical start offset in block for orig */
> __u64 donor_start; /* logical start offset in block for donor */
> __u64 len; /* block length to be moved */
> __u64 moved_len; /* moved block length */
> };
>
> Lets just wire up EXT4_IOC_MOVE_EXT for the compat case.
Thanks, added to the ext4 patch queue.
- Ted