2007-01-16 12:08:08

by Takashi Sato

Subject: [RFC][PATCH 0/3] ext4 online defrag (ver 0.2)

Hi

I have updated the online defrag patches to add a new function that
places multiple files close together on disk. This helps
applications that read many small files. Our goal is to reduce
OS boot time by placing the files read during boot
close together.

Implementation:
All the files under the directory specified by
"e4defrag -r directory-name" are placed close together.
The modifications are as follows.
1. Add a new ioctl(EXT4_IOC_DEFRAG) which returns the first physical
block number of the specified file. The command uses this ioctl
to get the specified directory's first physical block number.

2. Add a new field "goal" to the ext4_ext_defrag_data structure,
which is passed as the argument to the existing ioctl(EXT4_IOC_DEFRAG).
The kernel starts searching for free blocks
from "goal". The command passes the physical block number
obtained in step 1 to this ioctl.

struct ext4_ext_defrag_data {
        loff_t start_offset;    /* start offset to defrag, in bytes */
        loff_t defrag_size;     /* size to defrag, in bytes */
        ext4_fsblk_t goal;      /* block offset for allocation */
};
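
As a rough illustration of steps 1 and 2 above, the following userspace
sketch fills in a request that covers a whole file and uses the
directory's first physical block as the goal. The example_* type names
and make_defrag_request() are hypothetical, and the 64-bit stand-ins for
loff_t and ext4_fsblk_t are an assumption; only the three field names
come from the structure above.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel types (assumption: both are 64-bit). */
typedef int64_t  example_loff_t;
typedef uint64_t example_fsblk_t;

/* Mirrors the fields of ext4_ext_defrag_data as posted in this mail. */
struct example_defrag_data {
    example_loff_t  start_offset; /* start offset to defrag, in bytes */
    example_loff_t  defrag_size;  /* size to defrag, in bytes */
    example_fsblk_t goal;         /* block where the free-block search starts */
};

/*
 * Build a request covering a whole file of file_size bytes, with the
 * directory's first physical block (from step 1) as the allocation goal.
 */
static struct example_defrag_data
make_defrag_request(example_loff_t file_size, example_fsblk_t dir_first_block)
{
    struct example_defrag_data req;

    assert(file_size >= 0);
    req.start_offset = 0;
    req.defrag_size = file_size;
    req.goal = dir_first_block;
    return req;
}
```

The command would then hand such a request to ioctl(EXT4_IOC_DEFRAG) on
the open file. The ioctl number is defined by the patches and is not
reproduced here.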

Current status:
These patches are experimental, so they still have many issues and
items to improve. Even so, I think the approach is worth reviewing.

Dependencies:
My patches depend on Alex's multi-block allocation patches
for Linux 2.6.19-rc6:
"[RFC] delayed allocation, mballoc, etc"
http://marc.theaimsgroup.com/?l=linux-ext4&m=116493228301966&w=2

Outstanding issues:
When an extent block is full and there is no space
for an additional extent, the new extent cannot be inserted and the defrag
fails in my current implementation.

Items to improve:
- Optimize the depth of the extent tree and the number of extent blocks
after defragmentation.
- The blocks on the temporary inode are moved to the original inode
one page at a time in the current implementation. I have to tune
the number of pages moved per step for performance.
- Support files mapped by indirect blocks.

Next steps:
I will update my patches to solve the problem described under
"Outstanding issues" above in early February.

Any review comments or test results are welcome.

Summary of patches:
*These patches apply on top of Alex's patches.
"[RFC] delayed allocation, mballoc, etc"
http://marc.theaimsgroup.com/?l=linux-ext4&m=116493228301966&w=2

[PATCH 1/3] Allocate new contiguous blocks with Alex's mballoc
- Search for contiguous free blocks and allocate them for the temporary
inode with Alex's multi-block allocation.

[PATCH 2/3] Move the file data to the new blocks
- Move the blocks on the temporary inode to the original inode
one page at a time.

[PATCH 3/3] Online defrag command
- The defrag command. Usage is as follows:
o Put multiple files closer together.
# e4defrag -r directory-name
o Defragment a single file.
# e4defrag file-name
o Defragment all files on an ext4 filesystem.
# e4defrag device-name

Cheers, Takashi


2007-01-16 19:21:37

by Andreas Dilger

Subject: Re: [RFC][PATCH 0/3] ext4 online defrag (ver 0.2)

On Jan 16, 2007 21:03 +0900, [email protected] wrote:
> 1. Add new ioctl(EXT4_IOC_DEFRAG) which returns the first physical
> block number of the specified file. With this ioctl, a command
> gets the specified directory's.

Maybe I don't understand, but how is this different from the long-time
FIBMAP ioctl?

> 2. The new entry "goal" is added on ext4_ext_defrag_data structure
> which is passed to existing ioctl(EXT4_IOC_DEFRAG)
> as the argument. The kernel starts searching the free blocks
> from "goal". The command passes the physical block number
> gotten in the above step(1) to the ioctl.
>
> struct ext4_ext_defrag_data {
> loff_t start_offset; /* start offset to defrag in byte */
> loff_t defrag_size; /* size of defrag in bytes */
> ext4_fsblk_t goal; /* block offset for allocation */
> };

Two things of note:
- presumably the start_offset and defrag_size should be multiples of the
filesystem blocksize? If they are not, is it an error or are they
adjusted to cover whole blocks?
- in previous defrag discussions (i.e. XFS defrag), it was desirable to
allow specifying different types of goals (e.g. hard, soft, kernel picks).
We may as well have a structure that allows these to be specified, instead
of having to change the interface afterward.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

2007-01-16 20:48:17

by Joel Becker

Subject: Re: [RFC][PATCH 0/3] ext4 online defrag (ver 0.2)

On Tue, Jan 16, 2007 at 12:21:34PM -0700, Andreas Dilger wrote:
> On Jan 16, 2007 21:03 +0900, [email protected] wrote:
> > 2. The new entry "goal" is added on ext4_ext_defrag_data structure
> > which is passed to existing ioctl(EXT4_IOC_DEFRAG)
> > as the argument. The kernel starts searching the free blocks
> > from "goal". The command passes the physical block number
> > gotten in the above step(1) to the ioctl.
> >
> > struct ext4_ext_defrag_data {
> > loff_t start_offset; /* start offset to defrag in byte */
> > loff_t defrag_size; /* size of defrag in bytes */
> > ext4_fsblk_t goal; /* block offset for allocation */
> > };
>
> Two things of note:
> - presumably the start_offset and defrag_size should be multiples of the
> filesystem blocksize? If they are not, is it an error or are they
> adjusted to cover whole blocks?

In fact, why aren't the units in blocks? The filesystem isn't
really going to deal with anything smaller, is it?

Joel

--

Life's Little Instruction Book #335

"Every so often, push your luck."

Joel Becker
Principal Software Developer
Oracle
E-mail: [email protected]
Phone: (650) 506-8127

2007-01-17 11:23:27

by Takashi Sato

Subject: Re: [RFC][PATCH 0/3] ext4 online defrag (ver 0.2)

Hi,

> On Jan 16, 2007 21:03 +0900, [email protected] wrote:
>> 1. Add new ioctl(EXT4_IOC_DEFRAG) which returns the first physical
>> block number of the specified file. With this ioctl, a command
>> gets the specified directory's.
>
> Maybe I don't understand, but how is this different from the long-time
> FIBMAP ioctl?

I can use FIBMAP instead of my new ioctl.
You are right. I should have used FIBMAP ioctl...

>> struct ext4_ext_defrag_data {
>> loff_t start_offset; /* start offset to defrag in byte */
>> loff_t defrag_size; /* size of defrag in bytes */
>> ext4_fsblk_t goal; /* block offset for allocation */
>> };
>
> Two things of note:
> - presumably the start_offset and defrag_size should be multiples of the
> filesystem blocksize? If they are not, is it an error or are they
> adjusted to cover whole blocks?

If the values are not multiples of the blocksize,
the kernel adjusts them to cover whole blocks.

But I think it is not clean for the unit of goal to differ from that of
start_offset and defrag_size. I will change their units to blocks
in the next update.
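
To make "adjusted to cover whole blocks" concrete, here is a minimal
sketch of one way a byte range can be widened to block units: round the
start down and the end up. This is an assumption about the rounding, not
the code from the patches; struct block_range and bytes_to_blocks() are
hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* A range expressed in filesystem blocks. */
struct block_range {
    uint64_t first_block; /* index of the first block touched */
    uint64_t block_count; /* number of blocks touched */
};

/*
 * Convert a byte range to the smallest block range that covers it.
 * The start is rounded down and the end is rounded up.
 */
static struct block_range
bytes_to_blocks(uint64_t start_offset, uint64_t size, uint32_t block_size)
{
    struct block_range r;

    assert(block_size > 0);
    r.first_block = start_offset / block_size;
    if (size == 0) {
        r.block_count = 0; /* an empty range touches no blocks */
        return r;
    }
    /* Index of the block holding the last byte of the range. */
    uint64_t last_block = (start_offset + size - 1) / block_size;
    r.block_count = last_block - r.first_block + 1;
    return r;
}
```

With a 4096-byte block size, a 200-byte range starting at offset 4000
straddles a block boundary and therefore covers two blocks.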

> - in previous defrag discussions (i.e. XFS defrag), it was desirable to
> allow specifying different types of goals (e.g. hard, soft, kernel picks).
> We may as well have a structure that allows these to be specified, instead
> of having to change the interface afterward.

Let me see... Do you mean the following discussions?
http://marc.theaimsgroup.com/?l=linux-ext4&m=116161490908645&w=2
http://marc.theaimsgroup.com/?l=linux-ext4&m=116184475306761&w=2

Cheers, Takashi

2007-01-19 05:19:56

by Takashi Sato

Subject: Re: [RFC][PATCH 0/3] ext4 online defrag (ver 0.2)

Hi,

>> On Jan 16, 2007 21:03 +0900, [email protected] wrote:
>>> 1. Add new ioctl(EXT4_IOC_DEFRAG) which returns the first physical
>>> block number of the specified file. With this ioctl, a command
>>> gets the specified directory's.
>>
>> Maybe I don't understand, but how is this different from the long-time
>> FIBMAP ioctl?
>
> I can use FIBMAP instead of my new ioctl.
> You are right. I should have used FIBMAP ioctl...

I have to get the physical block number of the specified directory.
But FIBMAP is available only for a regular file, not for a directory.
So I will use my new ioctl.
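
For reference, a minimal sketch of how a userspace tool queries the
first physical block of a regular file with FIBMAP. The helper name
first_physical_block() is hypothetical. FIBMAP takes a pointer to an int
holding the logical block index and overwrites it with the physical
block number; it normally requires CAP_SYS_RAWIO.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/fs.h>  /* FIBMAP */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Store the physical block number of logical block 0 of `path` in
 * *physical. Returns 0 on success and -1 on any failure (open error,
 * missing privilege, or a file type the ioctl does not support).
 */
static int first_physical_block(const char *path, int *physical)
{
    int fd, rc;
    int block = 0; /* in: logical block index, out: physical block number */

    assert(path != NULL && physical != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    rc = ioctl(fd, FIBMAP, &block);
    close(fd);
    if (rc < 0)
        return -1;
    *physical = block;
    return 0;
}
```

A result of 0 in *physical means logical block 0 is not mapped (a hole).
Called on a directory, the ioctl is expected to fail for the reason
given above.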

Cheers, Takashi

2007-01-19 11:33:52

by Andreas Dilger

Subject: Re: [RFC][PATCH 0/3] ext4 online defrag (ver 0.2)

On Jan 19, 2007 14:19 +0900, Takashi Sato wrote:
> >>On Jan 16, 2007 21:03 +0900, [email protected] wrote:
> >>>1. Add new ioctl(EXT4_IOC_DEFRAG) which returns the first physical
> >>> block number of the specified file. With this ioctl, a command
> >>> gets the specified directory's.
> >>
> >>Maybe I don't understand, but how is this different from the long-time
> >>FIBMAP ioctl?
> >
> >I can use FIBMAP instead of my new ioctl.
> >You are right. I should have used FIBMAP ioctl...
>
> I have to get the physical block number of the specified directory.
> But FIBMAP is available only for a regular file, not for a directory.
> So I will use my new ioctl.

Though it might make sense to implement FIBMAP for a directory, to keep
it consistent and allow user-space tools like "filefrag" to work on
directories also.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

2007-01-19 12:00:50

by Takashi Sato

Subject: Re: [RFC][PATCH 0/3] ext4 online defrag (ver 0.2)

Hi,

Thank you for your comment.

>> >>>1. Add new ioctl(EXT4_IOC_DEFRAG) which returns the first physical
>> >>> block number of the specified file. With this ioctl, a command
>> >>> gets the specified directory's.
>> >>
>> >>Maybe I don't understand, but how is this different from the long-time
>> >>FIBMAP ioctl?
>> >
>> >I can use FIBMAP instead of my new ioctl.
>> >You are right. I should have used FIBMAP ioctl...
>>
>> I have to get the physical block number of the specified directory.
>> But FIBMAP is available only for a regular file, not for a directory.
>> So I will use my new ioctl.
>
> Though it might make sense to implement FIBMAP for a directory, to keep
> it consistent and allow user-space tools like "filefrag" to work on
> directories also.

That sounds good.
I think it will also be useful for other tools which use FIBMAP.
So I will consider implementing FIBMAP for directories.

Cheers, Takashi