2014-06-11 08:43:29

by Akira Fujita

[permalink] [raw]
Subject: [PATCH 3/3] mke2fs: prevent creation of unmountable ext4 with large flex_bg count

In mke2fs command, if flex_bg count is too large to filesystem blocks count,
unmountable ext4 which has the out of filesystem block offset is created (Case1).
Moreover this large flex_bg count causes an unintentional metadata layout
(bmap-imap-itable-bmap-imap-itable .. in block group) (Case2).

To fix these issues and keep healthy flex_bg layout, disallow creating ext4
with obviously large flex_bg count to filesystem blocks count.

Steps to reproduce:
(Case1)
1.
# mke2fs -t ext4 -b 4096 -O ^resize_inode -G $((2**20)) DEV 2130483

2.
# mount -t ext4 DEV MP
mount: wrong fs type, bad option, bad superblock on /dev/sdb4,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so

3.
# dumpe2fs DEV
<snip>
Block count: 2130483
<snip>
Flex block group size: 1048576
<snip>
Group 65: (Blocks 2129920-2130482) [INODE_UNINIT]
Checksum 0x4cb3, unused inodes 8080
Block bitmap at 67 (bg #0 + 67), Inode bitmap at 1048643 (bg #32 + 67)
Inode table at 2129979-2130483 (+59) <---- 2130483 is out of FS!
65535 free blocks, 8080 free inodes, 0 directories, 8080 unused inodes
Free blocks:
Free inodes: 525201-533280

(Case2)
1.
# mke2fs -t ext4 -G 2147483648 DEV 3145728

2.
# debugfs -R stats DEV
<snip>
Block count: 786432
<snip>
Flex block group size: 2147483648
<snip>
Group 0: block bitmap at 193, inode bitmap at 194, inode table at 195 <--
20233 free blocks, 8181 free inodes, 2 used directories, 8181 unused inodes
[Checksum 0xa597]
Group 1: block bitmap at 707, inode bitmap at 708, inode table at 709 <--
32575 free blocks, 8192 free inodes, 0 used directories, 8192 unused inodes
[Inode not init, Block not init, Checksum 0x196f]
Group 2: block bitmap at 1221, inode bitmap at 1222, inode table at 1223 <--
32768 free blocks, 8192 free inodes, 0 used directories, 8192 unused inodes
[Inode not init, Block not init, Checksum 0x856f]
<snip>

Signed-off-by: Akira Fujita <[email protected]>
---
lib/ext2fs/initialize.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/lib/ext2fs/initialize.c b/lib/ext2fs/initialize.c
index 75fbf8e..34753d0 100644
--- a/lib/ext2fs/initialize.c
+++ b/lib/ext2fs/initialize.c
@@ -91,8 +91,10 @@ errcode_t ext2fs_initialize(const char *name, int flags,
unsigned int rem;
unsigned int overhead = 0;
unsigned int ipg;
+ unsigned int flexbg_size;
dgrp_t i;
blk64_t free_blocks;
+ blk64_t flexbg_overhead;
blk_t numblocks;
int rsv_gdt;
int csum_flag;
@@ -419,6 +421,28 @@ ipg_retry:
}

/*
+ * Calculate the flex_bg related metadata blocks count.
+ * It includes the boot block, the super block,
+ * the block group descriptors, the reserved gdt blocks,
+ * the block bitmaps, the inode bitmaps and the inode tables.
+ * This is a simple check, so that the backup superblock and
+ * other feature related blocks are not considered.
+ */
+ flexbg_size = 1 << fs->super->s_log_groups_per_flex;
+ flexbg_overhead = super->s_first_data_block + 1 +
+ fs->desc_blocks + super->s_reserved_gdt_blocks +
+ (__u64)flexbg_size * (2 + fs->inode_blocks_per_group);
+
+ /*
+ * Disallow creating ext4 which breaks flex_bg metadata layout
+ * obviously.
+ */
+ if (flexbg_overhead > ext2fs_blocks_count(fs->super)) {
+ retval = EXT2_ET_INVALID_ARGUMENT;
+ goto cleanup;
+ }
+
+ /*
* At this point we know how big the filesystem will be. So
* we can do any and all allocations that depend on the block
* count.


2014-06-11 19:01:20

by Andreas Dilger

[permalink] [raw]
Subject: Re: [PATCH 3/3] mke2fs: prevent creation of unmountable ext4 with large flex_bg count


On Jun 11, 2014, at 2:38 AM, Akira Fujita <[email protected]> wrote:

> In mke2fs command, if flex_bg count is too large to filesystem blocks count,
> unmountable ext4 which has the out of filesystem block offset is created (Case1).
> Moreover this large flex_bg count causes an unintentional metadata layout
> (bmap-imap-itable-bmap-imap-itable .. in block group) (Case2).
>
> To fix these issues and keep healthy flex_bg layout, disallow creating ext4
> with obviously large flex_bg count to filesystem blocks count.

Patch looks good to me.
Reviewed-by: Andreas Dilger <[email protected]>

This also reminds me of my previous flex_bg patch:
[PATCH][RFC] mke2fs: handle flex_bg collision with backup descriptors
http://permalink.gmane.org/gmane.comp.file-systems.ext4/42298

which fixes the "bmap-imap-itable-bmap-imap-itable" problem when a large
flex_bg size is used. Sadly, there were no comments on that patch.

Cheers, Andreas

> Steps to reproduce:
> (Case1)
> 1.
> # mke2fs -t ext4 -b 4096 -O ^resize_inode -G $((2**20)) DEV 2130483
>
> 2.
> # mount -t ext4 DEV MP
> mount: wrong fs type, bad option, bad superblock on /dev/sdb4,
> missing codepage or helper program, or other error
> In some cases useful info is found in syslog - try
> dmesg | tail or so
>
> 3.
> # dumpe2fs DEV
> <snip>
> Block count: 2130483
> <snip>
> Flex block group size: 1048576
> <snip>
> Group 65: (Blocks 2129920-2130482) [INODE_UNINIT]
> Checksum 0x4cb3, unused inodes 8080
> Block bitmap at 67 (bg #0 + 67), Inode bitmap at 1048643 (bg #32 + 67)
> Inode table at 2129979-2130483 (+59) <---- 2130483 is out of FS!
> 65535 free blocks, 8080 free inodes, 0 directories, 8080 unused inodes
> Free blocks:
> Free inodes: 525201-533280
>
> (Case2)
> 1.
> # mke2fs -t ext4 -G 2147483648 DEV 3145728
>
> 2.
> # debugfs -R stats DEV
> <snip>
> Block count: 786432
> <snip>
> Flex block group size: 2147483648
> <snip>
> Group 0: block bitmap at 193, inode bitmap at 194, inode table at 195 <--
> 20233 free blocks, 8181 free inodes, 2 used directories, 8181 unused inodes
> [Checksum 0xa597]
> Group 1: block bitmap at 707, inode bitmap at 708, inode table at 709 <--
> 32575 free blocks, 8192 free inodes, 0 used directories, 8192 unused inodes
> [Inode not init, Block not init, Checksum 0x196f]
> Group 2: block bitmap at 1221, inode bitmap at 1222, inode table at 1223 <--
> 32768 free blocks, 8192 free inodes, 0 used directories, 8192 unused inodes
> [Inode not init, Block not init, Checksum 0x856f]
> <snip>
>
> Signed-off-by: Akira Fujita <[email protected]>
> ---
> lib/ext2fs/initialize.c | 24 ++++++++++++++++++++++++
> 1 file changed, 24 insertions(+)
> diff --git a/lib/ext2fs/initialize.c b/lib/ext2fs/initialize.c
> index 75fbf8e..34753d0 100644
> --- a/lib/ext2fs/initialize.c
> +++ b/lib/ext2fs/initialize.c
> @@ -91,8 +91,10 @@ errcode_t ext2fs_initialize(const char *name, int flags,
> unsigned int rem;
> unsigned int overhead = 0;
> unsigned int ipg;
> + unsigned int flexbg_size;
> dgrp_t i;
> blk64_t free_blocks;
> + blk64_t flexbg_overhead;
> blk_t numblocks;
> int rsv_gdt;
> int csum_flag;
> @@ -419,6 +421,28 @@ ipg_retry:
> }
>
> /*
> + * Calculate the flex_bg related metadata blocks count.
> + * It includes the boot block, the super block,
> + * the block group descriptors, the reserved gdt blocks,
> + * the block bitmaps, the inode bitmaps and the inode tables.
> + * This is a simple check, so that the backup superblock and
> + * other feature related blocks are not considered.
> + */
> + flexbg_size = 1 << fs->super->s_log_groups_per_flex;
> + flexbg_overhead = super->s_first_data_block + 1 +
> + fs->desc_blocks + super->s_reserved_gdt_blocks +
> + (__u64)flexbg_size * (2 + fs->inode_blocks_per_group);
> +
> + /*
> + * Disallow creating ext4 which breaks flex_bg metadata layout
> + * obviously.
> + */
> + if (flexbg_overhead > ext2fs_blocks_count(fs->super)) {
> + retval = EXT2_ET_INVALID_ARGUMENT;
> + goto cleanup;
> + }
> +
> + /*
> * At this point we know how big the filesystem will be. So
> * we can do any and all allocations that depend on the block
> * count.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html


Cheers, Andreas






Attachments:
signature.asc (833.00 B)
Message signed with OpenPGP using GPGMail

2014-06-12 23:50:29

by Darrick J. Wong

[permalink] [raw]
Subject: Re: [PATCH 3/3] mke2fs: prevent creation of unmountable ext4 with large flex_bg count

On Wed, Jun 11, 2014 at 01:01:29PM -0600, Andreas Dilger wrote:
>
> On Jun 11, 2014, at 2:38 AM, Akira Fujita <[email protected]> wrote:
>
> > In mke2fs command, if flex_bg count is too large to filesystem blocks count,
> > unmountable ext4 which has the out of filesystem block offset is created (Case1).
> > Moreover this large flex_bg count causes an unintentional metadata layout
> > (bmap-imap-itable-bmap-imap-itable .. in block group) (Case2).
> >
> > To fix these issues and keep healthy flex_bg layout, disallow creating ext4
> > with obviously large flex_bg count to filesystem blocks count.
>
> Patch looks good to me.
> Reviewed-by: Andreas Dilger <[email protected]>
>
> This also reminds me of my previous flex_bg patch:
> [PATCH][RFC] mke2fs: handle flex_bg collision with backup descriptors
> http://permalink.gmane.org/gmane.comp.file-systems.ext4/42298
>
> which fixes the "bmap-imap-itable-bmap-imap-itable" problem when a large
> flex_bg size is used. Sadly, there were no comments on that patch.

I had wondered if you were planning to address the FIXMEs in the patch, but
then forgot to ever follow up... :/

--D

>
> Cheers, Andreas
>
> > Steps to reproduce:
> > (Case1)
> > 1.
> > # mke2fs -t ext4 -b 4096 -O ^resize_inode -G $((2**20)) DEV 2130483
> >
> > 2.
> > # mount -t ext4 DEV MP
> > mount: wrong fs type, bad option, bad superblock on /dev/sdb4,
> > missing codepage or helper program, or other error
> > In some cases useful info is found in syslog - try
> > dmesg | tail or so
> >
> > 3.
> > # dumpe2fs DEV
> > <snip>
> > Block count: 2130483
> > <snip>
> > Flex block group size: 1048576
> > <snip>
> > Group 65: (Blocks 2129920-2130482) [INODE_UNINIT]
> > Checksum 0x4cb3, unused inodes 8080
> > Block bitmap at 67 (bg #0 + 67), Inode bitmap at 1048643 (bg #32 + 67)
> > Inode table at 2129979-2130483 (+59) <---- 2130483 is out of FS!
> > 65535 free blocks, 8080 free inodes, 0 directories, 8080 unused inodes
> > Free blocks:
> > Free inodes: 525201-533280
> >
> > (Case2)
> > 1.
> > # mke2fs -t ext4 -G 2147483648 DEV 3145728
> >
> > 2.
> > # debugfs -R stats DEV
> > <snip>
> > Block count: 786432
> > <snip>
> > Flex block group size: 2147483648
> > <snip>
> > Group 0: block bitmap at 193, inode bitmap at 194, inode table at 195 <--
> > 20233 free blocks, 8181 free inodes, 2 used directories, 8181 unused inodes
> > [Checksum 0xa597]
> > Group 1: block bitmap at 707, inode bitmap at 708, inode table at 709 <--
> > 32575 free blocks, 8192 free inodes, 0 used directories, 8192 unused inodes
> > [Inode not init, Block not init, Checksum 0x196f]
> > Group 2: block bitmap at 1221, inode bitmap at 1222, inode table at 1223 <--
> > 32768 free blocks, 8192 free inodes, 0 used directories, 8192 unused inodes
> > [Inode not init, Block not init, Checksum 0x856f]
> > <snip>
> >
> > Signed-off-by: Akira Fujita <[email protected]>
> > ---
> > lib/ext2fs/initialize.c | 24 ++++++++++++++++++++++++
> > 1 file changed, 24 insertions(+)
> > diff --git a/lib/ext2fs/initialize.c b/lib/ext2fs/initialize.c
> > index 75fbf8e..34753d0 100644
> > --- a/lib/ext2fs/initialize.c
> > +++ b/lib/ext2fs/initialize.c
> > @@ -91,8 +91,10 @@ errcode_t ext2fs_initialize(const char *name, int flags,
> > unsigned int rem;
> > unsigned int overhead = 0;
> > unsigned int ipg;
> > + unsigned int flexbg_size;
> > dgrp_t i;
> > blk64_t free_blocks;
> > + blk64_t flexbg_overhead;
> > blk_t numblocks;
> > int rsv_gdt;
> > int csum_flag;
> > @@ -419,6 +421,28 @@ ipg_retry:
> > }
> >
> > /*
> > + * Calculate the flex_bg related metadata blocks count.
> > + * It includes the boot block, the super block,
> > + * the block group descriptors, the reserved gdt blocks,
> > + * the block bitmaps, the inode bitmaps and the inode tables.
> > + * This is a simple check, so that the backup superblock and
> > + * other feature related blocks are not considered.
> > + */
> > + flexbg_size = 1 << fs->super->s_log_groups_per_flex;
> > + flexbg_overhead = super->s_first_data_block + 1 +
> > + fs->desc_blocks + super->s_reserved_gdt_blocks +
> > + (__u64)flexbg_size * (2 + fs->inode_blocks_per_group);
> > +
> > + /*
> > + * Disallow creating ext4 which breaks flex_bg metadata layout
> > + * obviously.
> > + */
> > + if (flexbg_overhead > ext2fs_blocks_count(fs->super)) {
> > + retval = EXT2_ET_INVALID_ARGUMENT;
> > + goto cleanup;
> > + }
> > +
> > + /*
> > * At this point we know how big the filesystem will be. So
> > * we can do any and all allocations that depend on the block
> > * count.
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > the body of a message to [email protected]
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
> Cheers, Andreas
>
>
>
>
>



2014-06-13 21:45:11

by Andreas Dilger

[permalink] [raw]
Subject: Re: [PATCH 3/3] mke2fs: prevent creation of unmountable ext4 with large flex_bg count

On Jun 12, 2014, at 5:50 PM, Darrick J. Wong <[email protected]> wrote:
> On Wed, Jun 11, 2014 at 01:01:29PM -0600, Andreas Dilger wrote:
>>
>> This also reminds me of my previous flex_bg patch:
>> [PATCH][RFC] mke2fs: handle flex_bg collision with backup descriptors
>> http://permalink.gmane.org/gmane.comp.file-systems.ext4/42298
>>
>> which fixes the "bmap-imap-itable-bmap-imap-itable" problem when a large
>> flex_bg size is used. Sadly, there were no comments on that patch.
>
> I had wondered if you were planning to address the FIXMEs in the patch, but
> then forgot to ever follow up... :/

Well, I was thinking that the patch was good enough to land as-is. It fixes 99% the problem for flex_bg size up to 65536, but only 95% of the groups for flex_bg of 131072. I guess I'll send it out again without [RFC], and if people start using flex_bg >= 131072 (which is enough for all the metadata in a 16TB chunk) then we can work out the final details. I suspect that the sparse_super2 feature would probably become more prevalent and avoid this problem entirely.

Cheers, Andreas






Attachments:
signature.asc (833.00 B)
Message signed with OpenPGP using GPGMail

2014-07-06 02:43:59

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH 3/3] mke2fs: prevent creation of unmountable ext4 with large flex_bg count

On Wed, Jun 11, 2014 at 08:38:10AM +0000, Akira Fujita wrote:
> In mke2fs command, if flex_bg count is too large to filesystem blocks count,
> unmountable ext4 which has the out of filesystem block offset is created (Case1).
> Moreover this large flex_bg count causes an unintentional metadata layout
> (bmap-imap-itable-bmap-imap-itable .. in block group) (Case2).
>
> To fix these issues and keep healthy flex_bg layout, disallow creating ext4
> with obviously large flex_bg count to filesystem blocks count.

Applied, thanks.

- Ted