2012-06-21 09:40:59

by Zheng Liu

Subject: [RFC] ext4: add an io-tree to track block allocation

Hi all,

At this year's ext4 workshop, a new idea called io-tree was proposed to
solve some problems in ext4 [1]. Here is a summary of the problems that
io-tree is meant to solve:
1. reserve quota calculation in bigalloc
2. simplify punch hole implementation
3. simplify fiemap implementation
4. SEEK_DATA/HOLE implementation

Meanwhile, with io-tree, some code can be improved as follows:
1. accelerate get_block functions
2. simplify uninitialized extent conversion
3. fine granularity locking (extent lock)

My plan for implementing io-tree is divided into three steps, which are
described in detail below.

* Step 1
The following problems will be solved in this step:
1. reserve quota calculation in bigalloc
2. simplify punch hole implementation
3. simplify fiemap implementation
4. SEEK_DATA/HOLE implementation

A patch set called status extent tree has already been submitted to the
mailing list by Yongqiang and Allison, and it simplifies the fiemap
implementation. However, it only works when delayed allocation is
enabled. I will pick up this work. I have rebased the patch set onto
3.5-rc3 and renamed it to extent status tree, as Darrick advised.

Next I will try to solve the above problems and make it run in
nodelalloc mode.
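
To illustrate how SEEK_DATA/SEEK_HOLE could be answered from a set of
tracked extents, here is a minimal userspace sketch. It is not the patch
set's code: the names es_extent, es_seek_data and es_seek_hole are
hypothetical, and a sorted array stands in for the in-kernel tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one run of logical blocks that holds data. */
struct es_extent {
    uint64_t start; /* first logical block of the run */
    uint64_t len;   /* number of blocks, always > 0 */
};

/*
 * Extents are assumed sorted by start and non-overlapping.
 * Returns the first block >= pos that holds data, or end if there is none.
 */
uint64_t es_seek_data(const struct es_extent *es, size_t n,
                      uint64_t pos, uint64_t end)
{
    assert(es != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (es[i].start + es[i].len <= pos)
            continue;               /* extent lies entirely before pos */
        if (es[i].start >= end)
            break;                  /* remaining extents are past end */
        return es[i].start > pos ? es[i].start : pos;
    }
    return end;
}

/*
 * Returns the first block >= pos that is not covered by any extent,
 * clamped to end (the end of the file counts as a hole).
 */
uint64_t es_seek_hole(const struct es_extent *es, size_t n,
                      uint64_t pos, uint64_t end)
{
    assert(es != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (es[i].start + es[i].len <= pos)
            continue;               /* extent lies entirely before pos */
        if (es[i].start > pos)
            break;                  /* a gap begins exactly at pos */
        pos = es[i].start + es[i].len; /* skip over this data run */
    }
    return pos < end ? pos : end;
}
```

Both functions work in block units; a real implementation would convert
to and from byte offsets and would walk the tree instead of an array.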

* Step 2
To be improved:
1. accelerate get_block functions
2. simplify uninitialized extent conversion

For the above improvements, a status member will be added to each extent
in the extent status tree to indicate its current status. I think the
status values include delalloc, allocated, uninit, and hole. The
get_block functions can then look up the extent status tree first, which
accelerates get_block. Meanwhile, uninitialized extent conversion can be
modified to reduce lock contention on i_mutex.
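
As a rough illustration of the status member and the lookup-first path,
here is a userspace sketch. All names (es_status, es_entry, es_lookup,
es_map_block) are hypothetical, a sorted array replaces the tree, and
the fast path only answers for extents marked allocated; everything else
would fall back to the normal extent lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The four states mentioned above; the identifiers are assumptions. */
enum es_status { ES_HOLE, ES_DELALLOC, ES_UNINIT, ES_ALLOCATED };

struct es_entry {
    uint64_t lblk;          /* first logical block */
    uint64_t len;           /* number of blocks, > 0 */
    uint64_t pblk;          /* first physical block, if mapped */
    enum es_status status;
};

/* Binary search over entries sorted by lblk with no overlaps. */
const struct es_entry *es_lookup(const struct es_entry *tab, size_t n,
                                 uint64_t lblk)
{
    size_t lo = 0, hi = n;

    assert(tab != NULL || n == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (lblk < tab[mid].lblk)
            hi = mid;
        else if (lblk >= tab[mid].lblk + tab[mid].len)
            lo = mid + 1;
        else
            return &tab[mid];
    }
    return NULL; /* block is not tracked */
}

/*
 * Fast path for a get_block-style caller: returns true and fills *pblk
 * only when the cached entry is ALLOCATED. A false result means the
 * caller has to take the slow path.
 */
bool es_map_block(const struct es_entry *tab, size_t n,
                  uint64_t lblk, uint64_t *pblk)
{
    const struct es_entry *e = es_lookup(tab, n, lblk);

    if (e == NULL || e->status != ES_ALLOCATED)
        return false;
    *pblk = e->pblk + (lblk - e->lblk);
    return true;
}
```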

* Step 3
To be done:
1. fine granularity locking (extent lock)

Currently ext4 performs some operations while holding i_mutex. After
adding the extent status tree, we can avoid taking this lock as much as
possible. It seems that a new member needs to be added to indicate the
type of locking. A range can be locked in shared or exclusive mode, and,
while a range is locked, it cannot be merged by other processes or by
extent locks of other types.
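
The shared/exclusive rule for range locks can be sketched as follows.
This is only an illustration under several assumptions: the names
(rl_mode, rl_table, rl_trylock, rl_unlock) are hypothetical, held locks
live in a small fixed table rather than in the extent status tree, and
there is no internal synchronization or waiting, which a real
implementation would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum rl_mode { RL_SHARED, RL_EXCLUSIVE };

struct rl_lock {
    uint64_t start;     /* first block of the locked range */
    uint64_t len;       /* number of blocks */
    enum rl_mode mode;
    bool held;
};

#define RL_MAX 16       /* arbitrary table size for the sketch */

struct rl_table {
    struct rl_lock slot[RL_MAX];
};

/* True when [start, start + len) intersects the range held by l. */
static bool rl_overlap(const struct rl_lock *l, uint64_t start, uint64_t len)
{
    return l->start < start + len && start < l->start + l->len;
}

/*
 * Tries to lock [start, start + len). Overlapping shared locks may
 * coexist; any overlap that involves an exclusive lock is a conflict.
 * Returns a slot index >= 0 on success, -1 on conflict or a full table.
 */
int rl_trylock(struct rl_table *t, uint64_t start, uint64_t len,
               enum rl_mode mode)
{
    int free_slot = -1;

    assert(t != NULL && len > 0);
    for (int i = 0; i < RL_MAX; i++) {
        const struct rl_lock *l = &t->slot[i];

        if (!l->held) {
            if (free_slot < 0)
                free_slot = i;  /* remember the first unused slot */
            continue;
        }
        if (rl_overlap(l, start, len) &&
            (mode == RL_EXCLUSIVE || l->mode == RL_EXCLUSIVE))
            return -1;
    }
    if (free_slot < 0)
        return -1;
    t->slot[free_slot] = (struct rl_lock){ start, len, mode, true };
    return free_slot;
}

/* Releases the lock previously returned by rl_trylock(). */
void rl_unlock(struct rl_table *t, int idx)
{
    assert(t != NULL && idx >= 0 && idx < RL_MAX);
    t->slot[idx].held = false;
}
```

Keeping each held range as its own record is one simple way to make sure
a locked range is never merged with a neighbour while the lock is held.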

Dave Chinner said that range locks might be useful in xfs too, so I
will try to make the extent locking as generic as possible after
step 3.

Please review this RFC. Any feedback is appreciated. Thanks.

In addition, I remember that at the ext4 workshop Ted mentioned that a
big extent tree has been implemented to improve the extent cache. So,
once both the big extent tree and io-tree are done, we need to consider
whether to merge them.

1. http://www.spinics.net/lists/linux-ext4/msg31742.html

Regards,
Zheng


2012-06-21 11:12:39

by Yongqiang Yang

Subject: Re: [RFC] ext4: add an io-tree to track block allocation

On Thu, Jun 21, 2012 at 5:46 PM, Zheng Liu wrote:
> Hi all,
>
> This year at ext4 workshop a new idea that calls io-tree is proposed to
> solve some problems in ext4 [1]. I summarize the problems that are
> needed to solve by io-tree in here:
> 1. reserve quota calculation in bigalloc
> 2. simplify punch hole implementation
> 3. simplify fiemap implementation
> 4. SEEK_DATA/HOLE implementation
Actually, we can also accelerate ext4_da_write_cache_pages by looking
up the extent status tree rather than the page cache. This was one of
the aims of the original patch set.

>
> Meanwhile with io-tree, some codes can be improved as following:
> 1. accelerate get_block functions
> 2. simplify uninitialized extent conversion
> 3. fine granularity locking (extent lock)
>
> I make a plan to implement io-tree that can be divided into three-steps.
> Now I describe it in detailed.
>
> * Step 1
> The following problems will be solved in this step:
> 1. reserve quota calculation in bigalloc
> 2. simplify punch hole implementation
> 3. simplify fiemap implementation
> 4. SEEK_DATA/HOLE implementation
>
> Currently a patch set has been submitted to the mailing list by
> Yongqiang and Allison, which called status extent tree, and it has
> simplified fiemap implementation. But it only works when delay
As I recall, reserving quota for bigalloc was also resolved in the
original patch set. Was it sent out? If not, I can send the patch
to you if you need it :-)

> allocation is enabled. I will pick up this work. Now I have rebased
> this patch set to 3.5-rc3, and renamed it to extent status tree as
> Darrick advised.
>
> Next I will try to solve the above problems and make it run in
> nodelalloc mode.
>
> * Step 2
> To be improved:
> 1. accelerate get_block functions
> 2. simplify uninitialized extent conversion
IMHO ext4_da_write_cache_pages can be improved in this step.

Yongqiang.
>
> For the above improvements, a status member will be added in extent
> status tree to indicate the current status of this extent. I think that
> the status includes delalloc, allocated, uninit, and hole. Then we can
> let get_block functions to lookup extent status tree firstly to
> accelerate get_block. Meanwhile uninitialized extent conversion can be
> modified to reduce lock contention of i_mutex.
>
> * Step 3
> To be done:
> 1. fine granularity locking (extent lock)
>
> Now in ext4 it does some operations with i_mutex locking. After adding
> extent status tree, we can avoid to take this lock as much as possible.
> It seems that a new member needs to be added to indicate the type of
> locking. We can take a range lock with shared or exclusive, and, when a
> range is locked, it cannot be merged by other processes and other types
> extent lock.
>
> Dave Chinner said that maybe range lock can be used in xfs too. So I
> will try to implement a generic extent locking as much as possible after
> step 3.
>
> Please review this RFC, and any feedbacks are appreciated. Thanks.
>
> In addition, I remember that at ext4 workshop Ted mentions that a big
> extent tree has been implemented to improve extent cache. So we need to
> consider whether need to merge big extent tree and io-tree or not after
> both big extent tree and io-tree have been done.
>
> 1. http://www.spinics.net/lists/linux-ext4/msg31742.html
>
> Regards,
> Zheng



--
Best Wishes
Yongqiang Yang

2012-06-21 11:40:58

by Zheng Liu

Subject: Re: [RFC] ext4: add an io-tree to track block allocation

On Thu, Jun 21, 2012 at 07:04:31PM +0800, Yongqiang Yang wrote:
> On Thu, Jun 21, 2012 at 5:46 PM, Zheng Liu wrote:
> > Hi all,
> >
> > This year at ext4 workshop a new idea that calls io-tree is proposed to
> > solve some problems in ext4 [1].  I summarize the problems that are
> > needed to solve by io-tree in here:
> > 1. reserve quota calculation in bigalloc
> > 2. simplify punch hole implementation
> > 3. simplify fiemap implementation
> > 4. SEEK_DATA/HOLE implementation
> Actually, we can accelerate
> ext4_da_write_cache_pages by looking up extent status tree rather
> than page cache. This is one of aims of the original patch sets.

Thanks for the feedback. I will add it to my TODO list.

>
> >
> > Meanwhile with io-tree, some codes can be improved as following:
> > 1. accelerate get_block functions
> > 2. simplify uninitialized extent conversion
> > 3. fine granularity locking (extent lock)
> >
> > I make a plan to implement io-tree that can be divided into three-steps.
> > Now I describe it in detailed.
> >
> > * Step 1
> > The following problems will be solved in this step:
> > 1. reserve quota calculation in bigalloc
> > 2. simplify punch hole implementation
> > 3. simplify fiemap implementation
> > 4. SEEK_DATA/HOLE implementation
> >
> > Currently a patch set has been submitted to the mailing list by
> > Yongqiang and Allison, which called status extent tree, and it has
> > simplified fiemap implementation.  But it only works when delay
> In my memory reserveing quota for bigalloc is also resolved in the
> original patch sets. Was it sent out? If not, I can send the patch
> to you if you need it:-)

I think that this patch is 'ext4: reimplement
ext4_find_delay_alloc_range on status extent tree'. Right?

>
> > allocation is enabled.  I will pick up this work.  Now I have rebased
> > this patch set to 3.5-rc3, and renamed it to extent status tree as
> > Darrick advised.
> >
> > Next I will try to solve the above problems and make it run in
> > nodelalloc mode.
> >
> > * Step 2
> > To be improved:
> > 1. accelerate get_block functions
> > 2. simplify uninitialized extent conversion
> IMHO ext4_da_write_cache_pages can be improved in this step.
>
> Yongqiang.
> >
> > For the above improvements, a status member will be added in extent
> > status tree to indicate the current status of this extent.  I think that
> > the status includes delalloc, allocated, uninit, and hole.  Then we can
> > let get_block functions to lookup extent status tree firstly to
> > accelerate get_block.  Meanwhile uninitialized extent conversion can be
> > modified to reduce lock contention of i_mutex.
> >
> > * Step 3
> > To be done:
> > 1. fine granularity locking (extent lock)
> >
> > Now in ext4 it does some operations with i_mutex locking.  After adding
> > extent status tree, we can avoid to take this lock as much as possible.
> > It seems that a new member needs to be added to indicate the type of
> > locking.  We can take a range lock with shared or exclusive, and, when a
> > range is locked, it cannot be merged by other processes and other types
> > extent lock.
> >
> > Dave Chinner said that maybe range lock can be used in xfs too.  So I
> > will try to implement a generic extent locking as much as possible after
> > step 3.
> >
> > Please review this RFC, and any feedbacks are appreciated.  Thanks.
> >
> > In addition, I remember that at ext4 workshop Ted mentions that a big
> > extent tree has been implemented to improve extent cache.  So we need to
> > consider whether need to merge big extent tree and io-tree or not after
> > both big extent tree and io-tree have been done.
> >
> > 1. http://www.spinics.net/lists/linux-ext4/msg31742.html
> >
> > Regards,
> > Zheng
>
>
>
> --
> Best Wishes
> Yongqiang Yang