2007-04-30 12:09:38

by Theodore Ts'o

Subject: Re: Ext2/3 block remapping tool

On Fri, Apr 27, 2007 at 12:09:42PM -0600, Andreas Dilger wrote:
> I'd prefer that such functionality be integrated with Takashi's online
> defrag tool, since it needs virtually the same functionality. For that
> matter, this is also very similar to the block-mapped -> extents tool
> from Aneesh. It doesn't make sense to have so many separate tools for
> users, especially if they start interfering with each other (i.e. defrag
> undoes the remapping done by your tool).

Yep, in fact, I'm really glad that Jan is working on the remapping
tool because if the on-line defrag kernel interfaces don't have the
right support for it, then that means we need to fix the on-line
defrag patches. :-)

While we're at it, someone want to start thinking about on-line
shrinking of ext4 filesystems? Again, the same block remapping
interfaces for defrag and file access optimizations should also be
useful for shrinking filesystems (even if some of the files that need
to be relocated are being actively used). If not, that probably means
we got the interface wrong.

- Ted


2007-04-30 12:19:56

by Jan Kara

Subject: Re: Ext2/3 block remapping tool

On Mon 30-04-07 08:09:30, Theodore Tso wrote:
> On Fri, Apr 27, 2007 at 12:09:42PM -0600, Andreas Dilger wrote:
> > I'd prefer that such functionality be integrated with Takashi's online
> > defrag tool, since it needs virtually the same functionality. For that
> > matter, this is also very similar to the block-mapped -> extents tool
> > from Aneesh. It doesn't make sense to have so many separate tools for
> > users, especially if they start interfering with each other (i.e. defrag
> > undoes the remapping done by your tool).
>
> Yep, in fact, I'm really glad that Jan is working on the remapping
> tool because if the on-line defrag kernel interfaces don't have the
> right support for it, then that means we need to fix the on-line
> defrag patches. :-)
;-) That was exactly the reason I wrote the userspace program: so that
I have something in hand when we start discussing what the kernel
interface will look like.

> While we're at it, someone want to start thinking about on-line
> shrinking of ext4 filesystems? Again, the same block remapping
> interfaces for defrag and file access optimizations should also be
> useful for shrinking filesystems (even if some of the files that need
> to be relocated are being actively used). If not, that probably means
> we got the interface wrong.
Yes, that's a good idea. Currently it seems to me that block+inode
relocation (which we also need for defrag) would be enough to support
filesystem shrinking. Actually, in ancient times (some 6-7 years ago) I
wrote online filesystem shrinking for ext2. The patch is probably
unusably obsolete by now, but I can still dig it out and see which
functions I needed at that time.

Honza
--
Jan Kara
SuSE CR Labs

2007-05-01 06:19:46

by Andreas Dilger

Subject: Re: Ext2/3 block remapping tool

On Apr 30, 2007 08:09 -0400, Theodore Tso wrote:
> On Fri, Apr 27, 2007 at 12:09:42PM -0600, Andreas Dilger wrote:
> > I'd prefer that such functionality be integrated with Takashi's online
> > defrag tool, since it needs virtually the same functionality. For that
> > matter, this is also very similar to the block-mapped -> extents tool
> > from Aneesh. It doesn't make sense to have so many separate tools for
> > users, especially if they start interfering with each other (i.e. defrag
> > undoes the remapping done by your tool).
>
> While we're at it, someone want to start thinking about on-line
> shrinking of ext4 filesystems? Again, the same block remapping
> interfaces for defrag and file access optimizations should also be
> useful for shrinking filesystems (even if some of the files that need
> to be relocated are being actively used). If not, that probably means
> we got the interface wrong.

Except one other issue with online shrinking is that we need to move
inodes on occasion and this poses a bunch of other problems over just
remapping the data blocks.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

2007-05-01 15:28:13

by Theodore Ts'o

Subject: Re: Ext2/3 block remapping tool

On Tue, May 01, 2007 at 12:01:42AM -0600, Andreas Dilger wrote:
> Except one other issue with online shrinking is that we need to move
> inodes on occasion and this poses a bunch of other problems over just
> remapping the data blocks.

Well, I did say "necessary", and not "sufficient". But yes, moving
inodes, especially if the inode is currently open gets interesting. I
don't think there are that many user space applications that would
notice or care if the st_ino of an open file changed out from under
them, but there are obviously userspace applications, such as tar,
that would most definitely care.

- Ted

2007-05-01 18:52:53

by Andreas Dilger

Subject: Re: Ext2/3 block remapping tool

On May 01, 2007 11:28 -0400, Theodore Tso wrote:
> On Tue, May 01, 2007 at 12:01:42AM -0600, Andreas Dilger wrote:
> > Except one other issue with online shrinking is that we need to move
> > inodes on occasion and this poses a bunch of other problems over just
> > remapping the data blocks.
>
> Well, I did say "necessary", and not "sufficient". But yes, moving
> inodes, especially if the inode is currently open gets interesting. I
> don't think there are that many user space applications that would
> notice or care if the st_ino of an open file changed out from under
> them, but there are obviously userspace applications, such as tar,
> that would most definitely care.

I think "rm -r" does a LOT of this kind of operation, like:

stat(.); stat(foo); chdir(foo); stat(.); unlink(*); chdir(..); stat(.)

I think "find" does the same to avoid security problems with malicious
path manipulation.
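
The stat/chdir/stat sequence above can be sketched as follows. This is a minimal Python illustration of the idea, not the actual coreutils or findutils code; the names same_inode, checked_chdir and remove_tree are hypothetical. The point is that the tool compares (st_dev, st_ino) before and after each chdir, so an inode number that changes between the two calls looks exactly like a malicious directory swap.

```python
import os


def same_inode(a: os.stat_result, b: os.stat_result) -> bool:
    """Two stat results name the same file only if device and inode match."""
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def checked_chdir(name: str) -> None:
    """chdir into `name`, then verify we landed in the directory we examined."""
    before = os.lstat(name)      # stat(foo), without following symlinks
    os.chdir(name)               # chdir(foo)
    after = os.stat(".")         # stat(.)
    if not same_inode(before, after):
        # Either someone swapped the directory under us, or (the case
        # discussed in this thread) the inode was renumbered in between.
        raise RuntimeError(f"{name!r} changed identity during descent")


def remove_tree(name: str) -> None:
    """Hypothetical, simplified recursive removal in the style of rm -r."""
    parent = os.stat(".")        # stat(.) before descending
    checked_chdir(name)
    for entry in os.listdir("."):
        if os.path.isdir(entry) and not os.path.islink(entry):
            remove_tree(entry)
        else:
            os.unlink(entry)     # unlink(*)
    os.chdir("..")               # chdir(..)
    if not same_inode(parent, os.stat(".")):
        raise RuntimeError("parent directory changed identity during descent")
    os.rmdir(name)
```

If an inode were moved between the lstat() and the stat(".") in checked_chdir, the comparison would fail even though nothing malicious happened.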

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

2007-05-01 22:18:43

by Theodore Ts'o

Subject: Re: Ext2/3 block remapping tool

On Tue, May 01, 2007 at 12:52:49PM -0600, Andreas Dilger wrote:
> I think "rm -r" does a LOT of this kind of operation, like:
>
> stat(.); stat(foo); chdir(foo); stat(.); unlink(*); chdir(..); stat(.)
>
> I think "find" does the same to avoid security problems with malicious
> path manipulation.

Yep, so if you're doing an rm -rf (or any other recursive descent)
while we're doing an on-line shrink, it's going to fail. I suppose we
could have an in-core inode mapping table that would continue to remap
inode numbers until the next reboot. I'm not sure we would want to
keep the inode remapping indefinitely, although if we don't, it could
end up screwing up NFS as well. Not sure I care, though. :-)
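
One way to picture such an in-core inode mapping table is the sketch below. It is only an illustration in Python under stated assumptions, not a proposed kernel interface: the class InodeRemapTable and its methods are hypothetical, the table lives purely in memory (so it is gone after a reboot), and it assumes that a number vacated by a move is not handed out again while the table exists. User space keeps seeing the number it first saw, while lookups by that number are redirected to wherever the inode lives now.

```python
class InodeRemapTable:
    """Hypothetical in-memory table of inodes moved by an online shrink."""

    def __init__(self) -> None:
        # number user space knows -> number the inode has on disk now
        self._current: dict[int, int] = {}
        # number the inode has on disk now -> number user space knows
        self._reported: dict[int, int] = {}

    def record_move(self, src: int, dst: int) -> None:
        """Note that the inode stored at `src` now lives at `dst`."""
        # If this inode was already moved once, keep reporting the number
        # it had before the first move rather than the intermediate one.
        reported = self._reported.pop(src, src)
        self._current[reported] = dst
        self._reported[dst] = reported

    def disk_number(self, ino: int) -> int:
        """Translate a number held by user space (or an NFS handle)."""
        return self._current.get(ino, ino)

    def user_number(self, ino: int) -> int:
        """Translate an on-disk number into the st_ino to report."""
        return self._reported.get(ino, ino)


table = InodeRemapTable()
table.record_move(5000, 12)   # shrink relocates inode 5000 to slot 12
table.record_move(12, 7)      # a later pass moves it again
print(table.user_number(7), table.disk_number(5000))
```

Dropping the table at reboot is what would break long-lived references such as NFS file handles, since they would still carry the old number.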

- Ted