2002-10-17 20:26:13

by christophe varoqui

Subject: block allocators and LVMs

hello,

reading the recent threads about FS block allocator algorithms and about
possible integration of a new volume management framework, I wondered if
the role of intelligent block allocators and/or online FS defragmentation
could be replaced by a block remapper in the LVM subsystem.

On one hand, online defrag seems hard to achieve (Tru64 advfs still can't
get it right after 4 years) and intelligently allocating blocks can be
costly (not to mention possibly useless on a heavily fragmented logical
volume) ... on the other hand, the pvmove envisioned by M. Thornber seems
quite able to handle the extend remapping.

The block device layer is pretty well positioned to know about disk head
seeks and could do IO accounting per extend. This information seems
sufficient to order extends efficiently.

Am I completely out of my mind ?
Obviously, I would be very proud if not, but I can handle responses like
"crap. just stop thinking man"

.
cvaroqui


2002-10-18 11:20:37

by Joe Thornber

Subject: Re: block allocators and LVMs

Christophe,

On Thu, Oct 17, 2002 at 10:30:05PM +0200, christophe varoqui wrote:
> hello,
>
> reading the recent threads about FS block allocator algorithms and about
> possible integration of a new volume management framework, I wondered if
> the role of intelligent block allocators and/or online FS defragmentation
> could be replaced by a block remapper in the LVM subsystem.

Crazy idea :)

I think this is best left to the fs to handle, mainly because the
blocks that the fs deals with are so small. You would end up with
*huge* remapping tables. Also you would need to spend a lot of time
collecting the information necessary to calculate the remapping; to do
it properly you'd need to record an ordering of data accesses, not
just io counts (i.e. so you could build a Markov chain).
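
As a rough illustration of why ordering matters more than plain counts,
here is a minimal Python sketch that turns an access trace into first-order
transition probabilities. The trace, the region numbers and the function
names are all hypothetical; nothing here comes from an existing tool.

```python
from collections import Counter, defaultdict

def transition_counts(trace):
    """Count how often an access to region a is immediately followed by b."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(trace, trace[1:]):
        counts[prev][nxt] += 1
    return counts

def transition_probabilities(counts):
    """Normalise each row of counts into first-order Markov probabilities."""
    probs = {}
    for src, followers in counts.items():
        total = sum(followers.values())
        probs[src] = {dst: n / total for dst, n in followers.items()}
    return probs

# Hypothetical trace: region numbers touched by successive I/Os.
trace = [3, 7, 3, 7, 3, 1, 3, 7]
io_counts = Counter(trace)
probs = transition_probabilities(transition_counts(trace))
print(io_counts)
print(probs)
```

The io counts only say that regions 3 and 7 are busy; the transition table
additionally says that 7 is always followed by 3, which is the kind of
information a remapper would need in order to place them next to each other.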

> Am I completely out of my mind ?

Not completely, I think the statistics could be extremely valuable for
gauging the performance of different filesystems. It would be very
easy to write a little dm target that just records all the information
in a spare block device, this target could then be layered over any
block device for testing.
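
The layering idea can be mimicked in user space. The sketch below is not a
dm target and uses no device-mapper API; it is only a Python stand-in, with
a hypothetical RecordingDevice class that forwards reads and writes to a
backing file-like object and appends one record per request to a log (a
plain list here, standing in for the spare block device).

```python
import io

class RecordingDevice:
    """Pass-through wrapper that logs every read and write it forwards."""

    def __init__(self, backing, log):
        self.backing = backing  # any seekable binary file-like object
        self.log = log          # list standing in for the spare device

    def read(self, offset, length):
        self.log.append(("R", offset, length))
        self.backing.seek(offset)
        return self.backing.read(length)

    def write(self, offset, data):
        self.log.append(("W", offset, len(data)))
        self.backing.seek(offset)
        return self.backing.write(data)

# Layer the recorder over an in-memory 4 KiB "device" and issue two requests.
log = []
dev = RecordingDevice(io.BytesIO(bytes(4096)), log)
dev.write(512, b"abc")
data = dev.read(512, 3)
print(data, log)
```

Because the wrapper only needs seek/read/write, the same recorder can sit on
top of any backing object, which is the property that makes the approach
useful for comparing filesystems.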

- Joe

2002-10-18 12:53:16

by christophe varoqui

Subject: Re: block allocators and LVMs

In reply to Joe Thornber <[email protected]>:

> Christophe,
> > the role of intelligent block allocators and/or online FS
> > defragmentation could be replaced by a block remapper in the
> > LVM subsystem.
>
> Crazy idea :)
>
> I think this is best left to the fs to handle, mainly because the
> blocks that the fs deals with are so small. You would end up with
> *huge* remapping tables. Also you would need to spend a lot of time
> collecting the information necessary to calculate the remapping; to do
> it properly you'd need to record an ordering of data accesses, not
> just io counts (i.e. so you could build a Markov chain).
>

I realize I didn't pick the right words (from my poor English
dictionary) : I meant an extend remapper rather than a block remapper.
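
To put rough numbers on the difference in granularity, here is a small
Python calculation. Every figure in it is an assumption chosen for
illustration: a 100 GiB volume, 4 KiB filesystem blocks, 4 MiB extents and
8 bytes per remapping entry.

```python
KIB = 1024
MIB = 1024 ** 2
GIB = 1024 ** 3

def table_size(volume_bytes, unit_bytes, entry_bytes=8):
    """Return (number of entries, table size in bytes) for one mapping unit."""
    entries = volume_bytes // unit_bytes
    return entries, entries * entry_bytes

# Assumed sizes, see the paragraph above.
block_entries, block_table = table_size(100 * GIB, 4 * KIB)
extent_entries, extent_table = table_size(100 * GIB, 4 * MIB)

print(f"per block : {block_entries} entries, {block_table // MIB} MiB")
print(f"per extent: {extent_entries} entries, {extent_table // KIB} KiB")
```

Under these assumptions the per-block table needs about 26 million entries
(200 MiB) while the per-extent table needs 25,600 entries (200 KiB).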

As far as I can see, this task can be done entirely from userland :

o per-extend IO counters exported from kernel-space can be turned into
  a list of extends sorted by activity

o an lvdisplay-like tool gives the mapping extend<->physical blocks

o a scheduled job in user-space should be able to massage this info to
  decide which low-access-rate extends to move to the border of the
  platter and which high-access-rate extends to pack together ... all
  in one run that can be scheduled at a low-activity period (cron
  defrag way)

The algorithm could be something along the lines of :

while top_user_queue_not_empty
do
    extend = dequeue_lowest_user_extend
    if extend_in_good_spot
    then
        move_extend_to_corner_destination
        find_highest_user_extend_in_bad_spot
        move_this_extend_to_freed_good_spot
    fi
done
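
The loop above can be sketched as runnable Python under a few stated
assumptions: physical positions are numbered slots, a fixed set of slots
counts as "good", some free slots exist at the edge of the disk, and the
function only produces a list of moves rather than moving any data. The
names plan_moves, good_slots and free_corner_slots are hypothetical, and the
guard that stops when no hotter extent remains is an addition to the
pseudocode.

```python
def plan_moves(io_counts, slot_of, good_slots, free_corner_slots):
    """Return (extent, from_slot, to_slot) moves that swap cold extents out
    of good slots and hot extents into them."""
    slot_of = dict(slot_of)               # work on a copy
    free = sorted(free_corner_slots)      # highest-numbered slot is used first
    coldest_first = sorted(io_counts, key=io_counts.get)
    hot_misplaced = [e for e in reversed(coldest_first)
                     if slot_of[e] not in good_slots]
    moves = []
    for extent in coldest_first:
        if not free or not hot_misplaced:
            break
        if slot_of[extent] not in good_slots:
            continue                      # cold extent is already out of the way
        hot = hot_misplaced[0]
        if io_counts[hot] <= io_counts[extent]:
            break                         # added guard: nothing hotter is left
        hot_misplaced.pop(0)
        good, corner = slot_of[extent], free.pop()
        moves.append((extent, good, corner))     # park the cold extent
        moves.append((hot, slot_of[hot], good))  # reuse the freed good slot
        slot_of[extent], slot_of[hot] = corner, good
    return moves

# Hypothetical counters and layout: slots 0 and 1 are "good", 8 and 9 are free.
io_counts = {"a": 5, "b": 900, "c": 40, "d": 700}
slot_of = {"a": 0, "b": 5, "c": 1, "d": 2}
moves = plan_moves(io_counts, slot_of, good_slots={0, 1},
                   free_corner_slots={8, 9})
print(moves)
```

A real tool would hand each planned move to something like pvmove and would
have to cope with counters changing while the plan is being executed.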


This sort of extend reordering is done in some big Storage Arrays like
StorageWorks EVA110 (as far as I know : they are very secretive on the
subject).

2002-10-18 14:57:57

by Joe Thornber

Subject: Re: block allocators and LVMs

On Fri, Oct 18, 2002 at 03:04:24PM +0200, [email protected] wrote:
> I realize I didn't pick the right words (from my poor English
> dictionary) : I meant an extend remapper rather than a block remapper.

extent remapper.

> As far as I can see, this task can be done entirely from userland :
>
> o per-extend IO counters exported from kernel-space can be turned into
>   a list of extends sorted by activity
>
> o an lvdisplay-like tool gives the mapping extend<->physical blocks
>
> o a scheduled job in user-space should be able to massage this info to
>   decide which low-access-rate extends to move to the border of the
>   platter and which high-access-rate extends to pack together ... all
>   in one run that can be scheduled at a low-activity period (cron
>   defrag way)
>
> The algorithm could be something along the lines of :
>
> while top_user_queue_not_empty
> do
>     extend = dequeue_lowest_user_extend
>     if extend_in_good_spot
>     then
>         move_extend_to_corner_destination
>         find_highest_user_extend_in_bad_spot
>         move_this_extend_to_freed_good_spot
>     fi
> done

What you describe could be very beneficial, especially if you start
striping the high bandwidth areas. However in no way could this be
described as 'online FS defragmentation'.

- Joe

2002-10-18 15:45:54

by christophe varoqui

Subject: Re: block allocators and LVMs

On Friday 18 October 2002 17:03, you wrote:
> On Fri, Oct 18, 2002 at 03:04:24PM +0200, [email protected] wrote:
> > I realize I didn't pick the right words (from my poor English
> > dictionary) : I meant an extend remapper rather than a block remapper.
>
> extent remapper.
>
oops, first disillusionment :)

>
> What you describe could be very beneficial, especially if you start
> striping the high bandwidth areas. However in no way could this be
> described as 'online FS defragmentation'.
>
I realize the whole concept is different, but could extent remapping alleviate
the need for an *intelligent* FS block allocator, since it ensures the best
statistical-average IO performance?

I can even imagine an *intelligent* FS block allocator being
counterproductive in the case of a heavily fragmented LV (extents out of
order ...) because, after all, block allocators seem to take for granted that
the device is linearly mapped over the physical disk.
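
A small Python sketch of that point, with an assumed 4 MiB extent size and
two made-up extent maps: blocks that are neighbours in the logical volume
stay neighbours on disk only when the map is linear.

```python
EXTENT_SIZE = 4 * 1024 * 1024  # assumed extent size in bytes

def to_physical(logical_offset, extent_map, extent_size=EXTENT_SIZE):
    """Translate a logical byte offset through a logical->physical extent map."""
    index, within = divmod(logical_offset, extent_size)
    return extent_map[index] * extent_size + within

linear = [0, 1, 2, 3]      # logical extent i lives in physical extent i
fragmented = [7, 2, 9, 0]  # hypothetical out-of-order layout

# Two logically adjacent 4 KiB blocks that straddle an extent boundary.
last_of_0 = EXTENT_SIZE - 4096
first_of_1 = EXTENT_SIZE

gap_linear = to_physical(first_of_1, linear) - to_physical(last_of_0, linear)
gap_fragmented = (to_physical(first_of_1, fragmented)
                  - to_physical(last_of_0, fragmented))
print(gap_linear, gap_fragmented)
```

With the linear map the two blocks are 4096 bytes apart; with the fragmented
map the second block lies almost six extents before the first, so an
allocator that placed them side by side to avoid a seek gains nothing.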

This whole point fades into mud if the block group object of an extN FS is
mapped over an LVM extent. But even in this case the *simple* allocator would
be best in combination with an extent remapper.

I don't claim an extent remapper can replace the FS block allocator, but
only *complex* block allocators ... and particularly those that make the
false assumption that the underlying block device is made of linear,
consecutive physical blocks.

cva

2002-10-18 16:08:38

by Joe Thornber

Subject: Re: block allocators and LVMs

On Fri, Oct 18, 2002 at 05:51:47PM +0200, christophe varoqui wrote:
> I realize the whole concept is different, but could extent remapping alleviate
> the need for an *intelligent* FS block allocator, since it ensures the best
> statistical-average IO performance?

Yes, you would probably find an intelligent fs allocator and an
automatically remapping lvm would fight each other. You'd have to
choose one or the other.

- Joe