2002-10-29 16:37:29

by Theodore Ts'o

Subject: [PATCH] 2/11 Ext2/3 Updates: Extended attributes, ACL, etc.


Ext2/3 forward compatibility: on-line resizing

This patch adds forward compatibility with future filesystems that can
be grown dynamically. It does so by using an alternate algorithm for
storing the block group descriptors, which is also slightly more
efficient in that it uses a little less disk space. Currently, the ext2
filesystem format requires either relocating the inode table or
reserving space before doing the on-line resize. The new scheme is
documented in "Planned Extensions to the Ext2/3 Filesystem", by Stephen
Tweedie and me (see:
http://www.usenix.org/publications/library/proceedings/usenix02/tech/freenix/tso.html)
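
As a rough illustration of the lookup this patch introduces, here is a
minimal user-space sketch that mirrors the arithmetic of descriptor_loc()
in the diff below. The struct layout type, its field names, and the
has_meta_bg flag are hypothetical stand-ins for the superblock fields and
the feature test; only the two return expressions are taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-in for the superblock fields the patch reads. */
struct layout {
    unsigned long first_data_block;  /* mirrors s_first_data_block */
    unsigned long first_meta_bg;     /* mirrors s_first_meta_bg */
    unsigned long desc_per_block;    /* group descriptors per block */
    unsigned long blocks_per_group;  /* blocks in each block group */
    bool has_meta_bg;                /* stands in for the INCOMPAT_META_BG test */
};

/* Return the block number that holds group descriptor block `nr`. */
unsigned long descriptor_loc(const struct layout *l,
                             unsigned long logic_sb_block,
                             unsigned long nr)
{
    unsigned long bg;

    /* Old layout: descriptor blocks follow the superblock contiguously. */
    if (!l->has_meta_bg || nr < l->first_meta_bg)
        return logic_sb_block + nr + 1;

    /* New layout: the descriptor block lives in the first block group
     * that it describes, one block past that group's first block. */
    bg = l->desc_per_block * nr;
    return l->first_data_block + 1 + bg * l->blocks_per_group;
}
```

Assuming, purely for illustration, first_data_block = 1, 32 descriptors
per block, 8192 blocks per group, first_meta_bg = 2 and logic_sb_block = 1,
descriptor blocks 0 and 1 stay at blocks 2 and 3, while descriptor block 2
moves to block 1 + 1 + (64 * 8192) = 524290.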

fs/ext2/super.c | 22 ++++++++++++++++++++--
fs/ext3/super.c | 22 ++++++++++++++++++++--
include/linux/ext2_fs.h | 7 +++++--
include/linux/ext3_fs.h | 6 +++++-
4 files changed, 50 insertions(+), 7 deletions(-)

diff -Nru a/fs/ext2/super.c b/fs/ext2/super.c
--- a/fs/ext2/super.c Tue Oct 29 09:54:31 2002
+++ b/fs/ext2/super.c Tue Oct 29 09:54:31 2002
@@ -476,12 +476,29 @@
return res;
}

+static unsigned long descriptor_loc(struct super_block *sb,
+ unsigned long logic_sb_block,
+ int nr)
+{
+ struct ext2_sb_info *sbi = EXT2_SB(sb);
+ unsigned long bg, first_data_block, first_meta_bg;
+
+ first_data_block = le32_to_cpu(sbi->s_es->s_first_data_block);
+ first_meta_bg = le32_to_cpu(sbi->s_es->s_first_meta_bg);
+
+ if (!EXT2_HAS_INCOMPAT_FEATURE(sb, EXT2_FEATURE_INCOMPAT_META_BG) ||
+ nr < first_meta_bg)
+ return (logic_sb_block + nr + 1);
+ bg = sbi->s_desc_per_block * nr;
+ return (first_data_block + 1 + (bg * sbi->s_blocks_per_group));
+}
+
static int ext2_fill_super(struct super_block *sb, void *data, int silent)
{
struct buffer_head * bh;
struct ext2_sb_info * sbi;
struct ext2_super_block * es;
- unsigned long sb_block = 1;
+ unsigned long block, sb_block = 1;
unsigned long logic_sb_block = get_sb_block(&data);
unsigned long offset = 0;
unsigned long def_mount_opts;
@@ -689,7 +706,8 @@
goto failed_mount;
}
for (i = 0; i < db_count; i++) {
- sbi->s_group_desc[i] = sb_bread(sb, logic_sb_block + i + 1);
+ block = descriptor_loc(sb, logic_sb_block, i);
+ sbi->s_group_desc[i] = sb_bread(sb, block);
if (!sbi->s_group_desc[i]) {
for (j = 0; j < i; j++)
brelse (sbi->s_group_desc[j]);
diff -Nru a/fs/ext3/super.c b/fs/ext3/super.c
--- a/fs/ext3/super.c Tue Oct 29 09:54:31 2002
+++ b/fs/ext3/super.c Tue Oct 29 09:54:31 2002
@@ -929,6 +929,23 @@
return res;
}

+static unsigned long descriptor_loc(struct super_block *sb,
+ unsigned long logic_sb_block,
+ int nr)
+{
+ struct ext3_sb_info *sbi = EXT3_SB(sb);
+ unsigned long bg, first_data_block, first_meta_bg;
+
+ first_data_block = le32_to_cpu(sbi->s_es->s_first_data_block);
+ first_meta_bg = le32_to_cpu(sbi->s_es->s_first_meta_bg);
+
+ if (!EXT3_HAS_INCOMPAT_FEATURE(sb, EXT3_FEATURE_INCOMPAT_META_BG) ||
+ nr < first_meta_bg)
+ return (logic_sb_block + nr + 1);
+ bg = sbi->s_desc_per_block * nr;
+ return (first_data_block + 1 + (bg * sbi->s_blocks_per_group));
+}
+

static int ext3_fill_super (struct super_block *sb, void *data, int silent)
{
@@ -936,7 +953,7 @@
struct ext3_super_block *es = 0;
struct ext3_sb_info *sbi;
unsigned long sb_block = get_sb_block(&data);
- unsigned long logic_sb_block = 1;
+ unsigned long block, logic_sb_block = 1;
unsigned long offset = 0;
unsigned long journal_inum = 0;
unsigned long def_mount_opts;
@@ -1161,7 +1178,8 @@
goto failed_mount;
}
for (i = 0; i < db_count; i++) {
- sbi->s_group_desc[i] = sb_bread(sb, logic_sb_block + i + 1);
+ block = descriptor_loc(sb, logic_sb_block, i);
+ sbi->s_group_desc[i] = sb_bread(sb, block);
if (!sbi->s_group_desc[i]) {
printk (KERN_ERR "EXT3-fs: "
"can't read group descriptor %d\n", i);
diff -Nru a/include/linux/ext2_fs.h b/include/linux/ext2_fs.h
--- a/include/linux/ext2_fs.h Tue Oct 29 09:54:31 2002
+++ b/include/linux/ext2_fs.h Tue Oct 29 09:54:31 2002
@@ -422,7 +422,8 @@
__u8 s_reserved_char_pad;
__u16 s_reserved_word_pad;
__u32 s_default_mount_opts;
- __u32 s_reserved[191]; /* Padding to the end of the block */
+ __u32 s_first_meta_bg; /* First metablock block group */
+ __u32 s_reserved[190]; /* Padding to the end of the block */
};

/*
@@ -485,10 +486,12 @@
#define EXT2_FEATURE_INCOMPAT_FILETYPE 0x0002
#define EXT3_FEATURE_INCOMPAT_RECOVER 0x0004
#define EXT3_FEATURE_INCOMPAT_JOURNAL_DEV 0x0008
+#define EXT2_FEATURE_INCOMPAT_META_BG 0x0010
#define EXT2_FEATURE_INCOMPAT_ANY 0xffffffff

#define EXT2_FEATURE_COMPAT_SUPP 0
-#define EXT2_FEATURE_INCOMPAT_SUPP EXT2_FEATURE_INCOMPAT_FILETYPE
+#define EXT2_FEATURE_INCOMPAT_SUPP (EXT2_FEATURE_INCOMPAT_FILETYPE| \
+ EXT2_FEATURE_INCOMPAT_META_BG)
#define EXT2_FEATURE_RO_COMPAT_SUPP (EXT2_FEATURE_RO_COMPAT_SPARSE_SUPER| \
EXT2_FEATURE_RO_COMPAT_LARGE_FILE| \
EXT2_FEATURE_RO_COMPAT_BTREE_DIR)
diff -Nru a/include/linux/ext3_fs.h b/include/linux/ext3_fs.h
--- a/include/linux/ext3_fs.h Tue Oct 29 09:54:31 2002
+++ b/include/linux/ext3_fs.h Tue Oct 29 09:54:31 2002
@@ -450,7 +450,8 @@
__u8 s_reserved_char_pad;
__u16 s_reserved_word_pad;
__u32 s_default_mount_opts;
- __u32 s_reserved[191]; /* Padding to the end of the block */
+ __u32 s_first_meta_bg; /* First metablock block group */
+ __u32 s_reserved[190]; /* Padding to the end of the block */
};

#ifdef __KERNEL__
@@ -529,8 +530,11 @@
#define EXT3_FEATURE_INCOMPAT_FILETYPE 0x0002
#define EXT3_FEATURE_INCOMPAT_RECOVER 0x0004 /* Needs recovery */
#define EXT3_FEATURE_INCOMPAT_JOURNAL_DEV 0x0008 /* Journal device */
+#define EXT3_FEATURE_INCOMPAT_META_BG 0x0010

#define EXT3_FEATURE_COMPAT_SUPP 0
+#define EXT2_FEATURE_INCOMPAT_SUPP (EXT2_FEATURE_INCOMPAT_FILETYPE| \
+ EXT2_FEATURE_INCOMPAT_META_BG)
#define EXT3_FEATURE_INCOMPAT_SUPP (EXT3_FEATURE_INCOMPAT_FILETYPE| \
EXT3_FEATURE_INCOMPAT_RECOVER)
#define EXT3_FEATURE_RO_COMPAT_SUPP (EXT3_FEATURE_RO_COMPAT_SPARSE_SUPER| \


2002-10-29 17:16:55

by Jeff Garzik

Subject: Re: [PATCH] 2/11 Ext2/3 Updates: Extended attributes, ACL, etc.

Theodore Ts'o wrote:

>Ext2/3 forward compatibility: on-line resizing
>
>
Is the interface for this going to be ext2meta? Al and sct seemed to
agree that that was the best way to act upon the filesystem metadata
while it's online... I'll probably be updating that for 2.5.x VFS
changes in a few weeks; that will provide safe online defrag and a good
interface for other metadata interaction.

>This patch adds forward compatibility with future filesystems that can
>be grown dynamically. It does so by using an alternate algorithm for
>storing the block group descriptors, which is also slightly more
>efficient in that it uses a little less disk space. Currently, the ext2
>filesystem format requires either relocating the inode table or
>reserving space before doing the on-line resize. The new scheme is
>documented in "Planned Extensions to the Ext2/3 Filesystem", by Stephen
>Tweedie and me (see:
>http://www.usenix.org/publications/library/proceedings/usenix02/tech/freenix/tso.html)
>
>
It would be nice if this paper were available to everybody, and not
passworded.

Jeff




2002-10-31 03:36:07

by Theodore Ts'o

Subject: Re: [PATCH] 2/11 Ext2/3 Updates: Extended attributes, ACL, etc.

On Tue, Oct 29, 2002 at 12:22:46PM -0500, Jeff Garzik wrote:
> >(see:
> >http://www.usenix.org/publications/library/proceedings/usenix02/tech/freenix/tso.html)
> >
> It would be nice if this paper were available to everybody, and not
> passworded.

I've made it available here:

http://e2fsprogs.sourceforge.net/extensions-ext23

- Ted

2002-11-01 03:15:35

by Theodore Ts'o

Subject: Re: [PATCH] 2/11 Ext2/3 Updates: Extended attributes, ACL, etc.

On Tue, Oct 29, 2002 at 12:22:46PM -0500, Jeff Garzik wrote:
> Theodore Ts'o wrote:
>
> >Ext2/3 forward compatibility: on-line resizing
> >
> >
> Is the interface for this going to be ext2meta? Al and sct seemed
> to agree that that was the best way to act upon the filesystem
> metadata while it's online... I'll probably be updating that for
> 2.5.x VFS changes in a few weeks; that will provide safe online
> defrag and a good interface for other metadata interaction.

I'm not sure ext2meta will be sufficient. It's not just a matter of
modifying the on-disk metadata, as would be needed for defrag; I
would also need to modify some of the in-core data structures of the
ext2/3 filesystem. For example, when you resize the filesystem, you
need to increase the number of group descriptors, which means you
need to kmalloc, copy, and then kfree sbi->s_group_desc out from
under the mounted filesystem.
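
To make the allocate/copy/free step concrete, here is a minimal
user-space sketch. It uses calloc/free in place of kmalloc/kfree, the
struct sb_info type and grow_group_desc() are hypothetical names, and it
deliberately ignores the locking needed to do this safely against a
mounted filesystem, which is the hard part.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the in-core superblock info. */
struct sb_info {
    void **group_desc;   /* one pointer per group descriptor block */
    size_t db_count;     /* number of entries in group_desc */
};

/* Grow the table to new_count entries. Returns 0 on success, -1 on failure. */
int grow_group_desc(struct sb_info *sbi, size_t new_count)
{
    void **new_tab, **old_tab;

    if (new_count <= sbi->db_count)
        return 0;                       /* nothing to grow */

    new_tab = calloc(new_count, sizeof(*new_tab));
    if (!new_tab)
        return -1;                      /* old table is left untouched */

    /* Copy the existing entries; the new tail stays zeroed. */
    if (sbi->db_count)
        memcpy(new_tab, sbi->group_desc,
               sbi->db_count * sizeof(*new_tab));

    /* Swap in the new table, then release the old one. In the kernel
     * this swap would have to be serialized against every reader. */
    old_tab = sbi->group_desc;
    sbi->group_desc = new_tab;
    sbi->db_count = new_count;
    free(old_tab);
    return 0;
}
```

The window between publishing the new table and freeing the old one is
where a concurrent reader could still hold the stale pointer, which is
why the locking question matters.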

No doubt ext2meta could be modified so it could "reach out and touch"
internal ext2/3 filesystem data structures in core. But the locking
issues involved get really messy.

My original plan was to adapt Andreas Dilger's on-line resizing patch
to use the new block group layout, which would obviate the need to
take the filesystem off-line and run ext2prepare first. I'm not
opposed to trying to do it via ext2meta, but it seems like it might
get complicated and hairy quite quickly.

- Ted

2002-11-01 03:28:49

by Alexander Viro

Subject: Re: [PATCH] 2/11 Ext2/3 Updates: Extended attributes, ACL, etc.



On Thu, 31 Oct 2002, Theodore Ts'o wrote:

> I'm not sure ext2meta will be sufficient. It's not just a matter of
> modifying the on-disk metadata, as would be needed for defrag; I
> would also need to modify some of the in-core data structures of the
> ext2/3 filesystem. For example, when you resize the filesystem, you
> need to increase the number of group descriptors, which means you
> need to kmalloc, copy, and then kfree sbi->s_group_desc out from
> under the mounted filesystem.
>
> No doubt ext2meta could be modified so it could "reach out and touch"
> internal ext2/3 filesystem data structures in core. But the locking
> issues involved get really messy.

For all practical purposes, ext2meta is part of ext2 - same driver,
two filesystem types. Locking isn't that scary, BTW - I'd looked
into that some time ago and it looked feasible.