2009-01-28 00:27:21

by Etienne Lorrain

Subject: on disk format: value of bg_inode_table_hi?

Hello,

I have created an ext4 filesystem on a 64 MB USB disk with "mkfs.ext4 /dev/sdb" on Debian lenny (no partition). I have debugfs 1.41.3 (12-Oct-2008), so
probably an up-to-date mkfs.ext4.
When I analyse the filesystem using my own tools, I seem to read at least
some bg_inode_table_hi values which are non-zero, on such a very small
filesystem.
Because the superblock s_feature_incompat has the 64BIT flag set, I assumed
I should use those _hi fields.
If I ignore the value of bg_inode_table_hi, which I read as 512, I can at
least analyse the few files I have put into it with debugfs (strangely, I
cannot mount that filesystem under Debian).

Here is the log I get from my debug tools (sorry, long lines):
#### disk_analyse disk 2 i.e. EBIOS 0x01: (nb found = 6):

## open_filesystem Disk 2 part 0 type 0xE, read first 4 Kbytes: name already set to 'floppy'

E2FS_get_parameter: Filesystem name: 'testext4' Filesystem opened (inode size 128, inodes_per_group 1912).

FSname 'testext4': byte_per_block 1024 bytes, sector_per_block 2, first_data_block 1, inodes_count 15296, blocks_count 61,056.

open_filesystem() success, Scan root directory: E2FS_get_sector_chain (inode 2): [E2FS_read_inode: inode 2, group 0, block 1] [bg_inode_table_hi = 512, bg_inode_table_lo = 273] i_blocks_lo 2 i_size_lo 1024: [create for lba 498] [read_analyse_chain: indirect blocknr == 0, level 1] [read_analyse_chain: indirect blocknr == 0, level 2] [read_analyse_chain: indirect blocknr == 0, level 3] 2 sectors at 498, OK

[ignore: '/.'] [ignore: '/..'] [ignore: '/lost+found'] [is vmlinuz with header: '/vmlinuz-2.6.26-1-686'] [E2FS_read_inode: inode 12, group 0, block 11] [bg_inode_table_hi = 512, bg_inode_table_lo = 273] [E2FS_treat_directory: adding file 'vmlinuz-2.6.26-1-686' size 1505936 bytes] [is initrd: '/initrd.img-2.6.26-1-686'] [E2FS_read_inode: inode 13, group 0, block 12] [bg_inode_table_hi = 512, bg_inode_table_lo = 273] [E2FS_treat_directory: adding file 'initrd.img-2.6.26-1-686' size 7182236 bytes] [E2FS_read_inode: inode 14, group 0, block 13] [bg_inode_table_hi = 512, bg_inode_table_lo = 273] [E2FS_treat_directory: storing 'boot' directory at inode 14 size 1024] [E2FS_treat_directory: need to read more sectors] [file_treat: done, read 1024 bytes]

(main dir size 1024)

Scan /boot directory: E2FS_get_sector_chain (inode 14): [E2FS_read_inode: inode 14, group 0, block 13] [bg_inode_table_hi = 512, bg_inode_table_lo = 273] i_blocks_lo 2 i_size_lo 1024: [create for lba 21816] [read_analyse_chain: indirect blocknr == 0, level 1] [read_analyse_chain: indirect blocknr == 0, level 2] [read_analyse_chain: indirect blocknr == 0, level 3] 2 sectors at 21816, OK

[ignore: '/boot/.'] [ignore: '/boot/..'] [is vmlinuz with header: '/boot/vmlinuz-2.6.26-1-686'] [E2FS_read_inode: inode 15, group 0, block 14] [bg_inode_table_hi = 512, bg_inode_table_lo = 273] [E2FS_treat_directory: adding file 'vmlinuz-2.6.26-1-686' size 1505936 bytes] [is initrd: '/boot/initrd.img-2.6.26-1-686'] [E2FS_read_inode: inode 16, group 0, block 15] [bg_inode_table_hi = 512, bg_inode_table_lo = 273] [E2FS_treat_directory: adding file 'initrd.img-2.6.26-1-686' size 7182236 bytes] [E2FS_treat_directory: need to read more sectors] [file_treat: done, read 1024 bytes]

(/ScanPath directory size 1024) , end scan.



So my question:
Is that a bug on my side, a mis-interpretation of the use of that bg_inode_table_hi field, or a problem somewhere else?

Thanks for any answer,
Etienne.





2009-01-28 18:01:59

by Andreas Dilger

Subject: Re: on disk format: value of bg_inode_table_hi?

On Jan 28, 2009 00:20 +0000, Etienne Lorrain wrote:
> I have created an ext4 fs on a 64 Mb USB disk by "mkfs.ext4 /dev/sdb" on
> debian lenny (no partition). I have debugfs 1.41.3 (12-Oct-2008), so
> probably an up-to-date mkfs.ext4.
> When I analyse the filesystem using my own tools, I seem to read at least
> some bg_inode_table_hi values which are not null, on such a very small
> filesystem.
> Because the superblock s_feature_incompat has the 64BITS set, I assumed
> I should use those _hi fields.
> If I ignore the value of bg_inode_table_hi that I read as 512, I can at
> least analyse the few files I have put into it with debugfs (I strangely
> cannot mount that filesystem under debian).

Does "e2fsck -f" fix this problem? It definitely should.


Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


2009-01-28 20:47:11

by Etienne Lorrain

Subject: Re: on disk format: value of bg_inode_table_hi?

> From: Andreas Dilger <[email protected]>
> Date: Wednesday 28 January 2009, 18:01
> On Jan 28, 2009 00:20 +0000, Etienne Lorrain wrote:
> > I have created an ext4 fs on a 64 Mb USB disk by
> > "mkfs.ext4 /dev/sdb" on debian lenny (no partition).
> > When I analyse the filesystem using my own tools, I
> > seem to read at least some bg_inode_table_hi values which
> > are not null, on such a very small filesystem.
>
> Does "e2fsck -f" fix this problem? It definitely
> should.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.

Well, e2fsck does not see anything:
etienne-laptop:~# e2fsck -f /dev/sdc
e2fsck 1.41.3 (12-Oct-2008)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
testext4: 16/15296 files (43.8% non-contiguous), 24210/61056 blocks
etienne-laptop:~# umount /mnt/disk
umount: /mnt/disk: not mounted
etienne-laptop:~# mount -t ext4dev /dev/sdc /mnt/disk/
mount: wrong fs type, bad option, bad superblock on /dev/sdc,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so

But after some reading, I used debugfs to "set_super_value s_flags 4",
and now the filesystem mounts under Debian.

My problem remains: I still read a non-zero bg_inode_table_hi, equal to 512.
I am using a copy of struct ext4_group_desc from ext4fs.h,
and I can read the right inode, but only if I ignore that bg_inode_table_hi
value. I thought I should use this _hi field because of the 64BIT bit in
the superblock; maybe I should not.
I recreated the filesystem on the same device, this time from Fedora 10
on ia32, and I got the same bg_inode_table_hi:

## open_filesystem Disk 2 part 0 type 0xE, read first 4 Kbytes: name already set to 'floppy'

E2FS_get_parameter: Filesystem name: '' Filesystem opened (inode size 128, inodes_per_group 1912).

FSname '': byte_per_block 1024 bytes, sector_per_block 2, first_data_block 1, inodes_count 15296, blocks_count 61,056.

open_filesystem() success, Scan root directory: E2FS_get_sector_chain (inode 2): [E2FS_read_inode: inode 2, group 0, block 1] [bg_inode_table_hi = 512, bg_inode_table_lo = 273] i_blocks_lo 2 i_size_lo 1024: [create for lba 498] [read_analyse_chain: indirect blocknr == 0, level 1] [read_analyse_chain: indirect blocknr == 0, level 2] [read_analyse_chain: indirect blocknr == 0, level 3] 2 sectors at 498, OK

[ignore: '/.'] [ignore: '/..'] [ignore: '/lost+found'] [ignore: '/mkfs.log'] [E2FS_treat_directory: need to read more sectors] [file_treat: done, read 1024 bytes]

(main dir size 1024) , end scan.



In short, should I ignore bg_inode_table_hi when analysing an ext4 filesystem,
because it is planned to be activated by a bit other than EXT4_FEATURE_INCOMPAT_64BIT?

Thanks,
Etienne.





2009-01-29 05:12:28

by Theodore Ts'o

Subject: Re: on disk format: value of bg_inode_table_hi?

On Wed, Jan 28, 2009 at 08:47:09PM +0000, Etienne Lorrain wrote:
> > From: Andreas Dilger <[email protected]>
> > Date: Wednesday 28 January 2009, 18:01
> > On Jan 28, 2009 00:20 +0000, Etienne Lorrain wrote:
> > > I have created an ext4 fs on a 64 Mb USB disk by
> > > "mkfs.ext4 /dev/sdb" on debian lenny (no partition).
> > > When I analyse the filesystem using my own tools, I
> > > seem to read at least some bg_inode_table_hi values which
> > > are not null, on such a very small filesystem.

bg_inode_table_hi doesn't exist on a small filesystem. You need to
check the superblock: if INCOMPAT_64BIT is not set, or if INCOMPAT_64BIT
is set and s_desc_size is 32, then only the first 32 bytes of struct
ext4_group_desc are in use, and those look exactly the same as
ext2_group_desc.

So there is no problem here. Just a misunderstanding of the
filesystem format.

- Ted
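
A minimal reader-side sketch of the rule in the message above, in Python (picked only for illustration; the thread itself contains no code). The field offsets (bg_inode_table_lo at byte 8, bg_inode_table_hi at byte 40) and the 0x0080 value for INCOMPAT_64BIT are taken from the ext4 on-disk layout as commonly documented and should be checked against the kernel headers. The function names and the synthetic descriptor bytes are hypothetical. The demo also shows one plausible origin of the value 512: with descriptors 32 bytes apart, byte 40 of descriptor 0 is byte 8 of descriptor 1, that is, the next group's bg_inode_table_lo. With 1912 inodes of 128 bytes and 1024-byte blocks an inode table spans 239 blocks, and 273 + 239 = 512, which would be consistent with inode tables packed back to back. That last point is an inference from the log, not something confirmed in the thread.

```python
import struct

EXT4_FEATURE_INCOMPAT_64BIT = 0x0080  # bit in s_feature_incompat
EXT2_MIN_DESC_SIZE = 32               # size of the classic ext2/ext3 descriptor
BG_INODE_TABLE_LO_OFFSET = 8          # bytes 8..11 of a descriptor
BG_INODE_TABLE_HI_OFFSET = 40         # bytes 40..43, present only in long descriptors


def group_desc_size(feature_incompat: int, s_desc_size: int) -> int:
    """Return the on-disk stride between two group descriptors."""
    if not feature_incompat & EXT4_FEATURE_INCOMPAT_64BIT:
        return EXT2_MIN_DESC_SIZE
    # Assumption of this sketch: values below 32 are treated as 32.
    return max(s_desc_size, EXT2_MIN_DESC_SIZE)


def inode_table_block(desc_table: bytes, group: int, desc_size: int) -> int:
    """Return the first block of the inode table of `group`."""
    base = group * desc_size
    (lo,) = struct.unpack_from("<I", desc_table, base + BG_INODE_TABLE_LO_OFFSET)
    hi = 0
    # Only read the _hi field if the descriptor is long enough to contain it.
    if desc_size >= BG_INODE_TABLE_HI_OFFSET + 4:
        (hi,) = struct.unpack_from("<I", desc_table, base + BG_INODE_TABLE_HI_OFFSET)
    return (hi << 32) | lo


def make_desc(size: int, table_lo: int, table_hi: int = 0) -> bytes:
    """Build a synthetic descriptor with only the inode table fields filled in."""
    desc = bytearray(size)
    struct.pack_into("<I", desc, BG_INODE_TABLE_LO_OFFSET, table_lo)
    if size >= BG_INODE_TABLE_HI_OFFSET + 4:
        struct.pack_into("<I", desc, BG_INODE_TABLE_HI_OFFSET, table_hi)
    return bytes(desc)


# Two synthetic 32-byte descriptors; 273 and 512 are the values seen in the log.
table = make_desc(32, 273) + make_desc(32, 512)

size = group_desc_size(EXT4_FEATURE_INCOMPAT_64BIT, 32)
correct_block = inode_table_block(table, 0, size)

# What a reader sees if it overlays a 64-byte struct on descriptor 0:
(misread_hi,) = struct.unpack_from("<I", table, BG_INODE_TABLE_HI_OFFSET)

print(size, correct_block, misread_hi)
```

The point of the design is that the descriptor stride is decided once from the superblock, and every later access goes through it rather than through sizeof of a copied struct.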

2009-01-29 11:47:31

by Etienne Lorrain

Subject: Re: on disk format: value of bg_inode_table_hi?

Theodore Tso <[email protected]> wrote:
> bg_inode_table_hi doesn't exist on a small filesystem.
> You need to take a look at if INCOMPAT_64BIT is not set,
> or if INCOMPAT_64BIT is set and s_desc_size is 32, then
> only the first 32 bytes of struct ext4_group_desc are
> in use --- which look exactly the same as ext2_group_desc.
>
> So there is no problem here. Just a misunderstanding of
> the filesystem format.
>
> - Ted

Thanks a lot; and sorry, I should have guessed myself that
the field is not there when s_desc_size is too small,
whatever the value of INCOMPAT_64BIT.

May I ask for confirmation of these two points on the list too:
- the inode_table is an array in which each inode occupies
superblock->s_inode_size bytes (or EXT2_GOOD_OLD_INODE_SIZE),
whatever the values of s_min_extra_isize, s_want_extra_isize
and inode->i_extra_isize, because the "extra" size of an
inode is not stored in the inode_table.

- If I only access the ext4 filesystem read-only, I should
see no difference whichever of these superblock flags are set:
EXT4_FEATURE_INCOMPAT_META_BG, EXT4_FEATURE_INCOMPAT_MMP,
EXT4_FEATURE_INCOMPAT_FLEX_BG. That is, I do not want to
allocate any block, so I should not refuse to "mount" an
ext4 filesystem whatever the value of these flags.
I do not know anything about EXT4_FEATURE_INCOMPAT_MMP...

Thanks in advance,
Etienne.
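
The first point in the message above can be sketched as follows, assuming the usual ext2/ext4 convention that inode numbers start at 1 and that each slot in the inode table is s_inode_size bytes wide (EXT2_GOOD_OLD_INODE_SIZE, 128, for revision 0 filesystems). As far as I understand the format, any extra inode fields live inside that same slot, so the stride does not depend on i_extra_isize. The names locate_inode, inode_record_size and InodeLocation are hypothetical. The numbers in the demo (1912 inodes per group, 128-byte inodes, 1024-byte blocks, inode table at block 273) come from the log earlier in the thread.

```python
from typing import NamedTuple, Sequence

EXT2_GOOD_OLD_REV = 0
EXT2_GOOD_OLD_INODE_SIZE = 128


class InodeLocation(NamedTuple):
    group: int   # block group holding the inode
    index: int   # zero-based slot within that group's inode table
    block: int   # filesystem block containing the inode
    offset: int  # byte offset of the inode inside that block


def inode_record_size(s_rev_level: int, s_inode_size: int) -> int:
    """Return the stride of the inode table in bytes."""
    if s_rev_level == EXT2_GOOD_OLD_REV:
        return EXT2_GOOD_OLD_INODE_SIZE
    return s_inode_size


def locate_inode(ino: int, inodes_per_group: int, record_size: int,
                 block_size: int, inode_table_blocks: Sequence[int]) -> InodeLocation:
    """Map an inode number to the block and offset where it is stored.

    inode_table_blocks[g] is the first inode table block of group g,
    as read from the group descriptors.
    """
    if ino < 1:
        raise ValueError("inode numbers start at 1")
    group, index = divmod(ino - 1, inodes_per_group)
    byte_pos = index * record_size
    block = inode_table_blocks[group] + byte_pos // block_size
    return InodeLocation(group, index, block, byte_pos % block_size)


record = inode_record_size(s_rev_level=1, s_inode_size=128)
root = locate_inode(2, 1912, record, 1024, [273])
boot = locate_inode(14, 1912, record, 1024, [273])
print(root)
print(boot)
```

The indices computed here (1 for inode 2, 13 for inode 14) match the "block 1" and "block 13" figures printed by the log, which appear to be slot numbers rather than block numbers.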




2009-01-30 04:19:46

by Theodore Ts'o

Subject: Re: on disk format: value of bg_inode_table_hi?

On Thu, Jan 29, 2009 at 11:47:29AM +0000, Etienne Lorrain wrote:
> - If I only access the ext4 filesystem readonly, I do not
> have any difference considering the flags of the superblock:
> EXT4_FEATURE_INCOMPAT_META_BG, EXT4_FEATURE_INCOMPAT_MMP,
> EXT4_FEATURE_INCOMPAT_FLEX_BG.

meta_bg is an INCOMPAT feature because kernels that don't understand
this option won't be able to find the inode table blocks, and would
crash and burn.

flex_bg is an INCOMPAT feature because older kernels do an explicit
sanity check to make sure a block group's metadata is in the block
group. This one could have been ro_incompat, I suppose, since the
worst that would happen with an ro mount on those older kernels is
they would cause an error when the filesystem is mounted. On the
other hand, if the filesystem was also marked "panic and reboot on
error", some users might consider it unfriendly that mounting such a
filesystem on an older kernel would cause an immediate reboot.

Given that there are some security extremists who consider the
possibility of filesystem images that cause a reboot as a "security
bug" that is worthy of assignment of a CVE and urgent bug reports forcing
other developers to drop everything and address said "security bug",
it's probably best that flex_bg is an INCOMPAT feature. :-)

- Ted
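
In line with the explanation above, a reader would normally refuse a filesystem that sets an INCOMPAT bit it does not implement, even when it only reads. A minimal sketch of such a check follows. The flag values are the ones I believe the ext4 headers define and should be verified there. The function name unsupported_incompat and the SUPPORTED mask are hypothetical, and the example flag combinations are made up rather than read from the filesystem in this thread.

```python
from typing import List

# Names of s_feature_incompat bits (values to be checked against the headers).
EXT4_FEATURE_INCOMPAT = {
    0x0001: "compression",
    0x0002: "filetype",
    0x0004: "recover",
    0x0008: "journal_dev",
    0x0010: "meta_bg",
    0x0040: "extents",
    0x0080: "64bit",
    0x0100: "mmp",
    0x0200: "flex_bg",
}

# Hypothetical reader that understands these layouts and nothing else.
SUPPORTED = 0x0002 | 0x0040 | 0x0080 | 0x0200


def unsupported_incompat(feature_incompat: int, supported: int) -> List[str]:
    """List the INCOMPAT bits that are set but not handled by this reader."""
    missing = []
    for bit in range(32):
        mask = 1 << bit
        if feature_incompat & mask and not supported & mask:
            missing.append(EXT4_FEATURE_INCOMPAT.get(mask, "unknown(0x%04x)" % mask))
    return missing


def can_open(feature_incompat: int, supported: int = SUPPORTED) -> bool:
    """Open the filesystem only if every INCOMPAT bit is understood."""
    return not unsupported_incompat(feature_incompat, supported)


print(unsupported_incompat(0x0002 | 0x0040 | 0x0080 | 0x0200, SUPPORTED))
print(unsupported_incompat(0x0002 | 0x0010, SUPPORTED))
```

A reader that wants to handle meta_bg or flex_bg adds the bit to its mask only once it can actually find the descriptors and inode tables under that layout.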