2007-05-03 19:08:07

by Jose R. Santos

Subject: Creating a >32bit blocks filesystem.

Hi folks,

I've been trying to test a patch to set JBD2_FEATURE_INCOMPAT_64BIT, but
I'm stuck since there doesn't seem to be a way to create an ext4
filesystem that has more than 32bit blocks. It seems like e2fsprogs +
Ted's patches don't support greater than 32bit block numbers, while the
e2fsprogs 64bit patches from BULL create the filesystem but the kernel
seems unable to mount it.

Am I missing something?

-JRS


2007-05-04 14:59:45

by Valerie Clement

Subject: Re: Creating a >32bit blocks filesystem.

Jose R. Santos wrote:
> Hi folks,
>
> I've been trying to test a patch to set JBD2_FEATURE_INCOMPAT_64BIT, but
> I'm stuck since there doesn't seem to be a way to create an ext4
> filesystem that has more than 32bit blocks. It seems like e2fsprogs +
> Ted's patches don't support greater than 32bit block numbers, while the
> e2fsprogs 64bit patches from BULL create the filesystem but the kernel
> seems unable to mount it.
>
> Am I missing something?
>
> -JRS

Hi Jose,
I began to port our modifications done for the 64-bit support against
the new version of e2fsprogs Ted posted at the beginning of the week.
Note that it is *just* for test use as it breaks the backwards
compatibility.
I did a few tests with a kernel 2.6.17-rc7 and it seems to work, at
least mkfs, debugfs and fsck tools.

Get the new version of e2fsprogs at
ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/e2fsprogs-interim/e2fsprogs-1.39-tyt3
and apply the patchset in attachment.

Hope this helps,
Valérie


Attachments:
e2fsprogs-1.39-tyt3-64bit-patches.tar.gz (49.54 kB)

2007-05-07 16:19:54

by Jose R. Santos

Subject: Re: Creating a >32bit blocks filesystem.

On Fri, 04 May 2007 16:58:35 +0200
Valerie Clement <[email protected]> wrote:
> Hi Jose,
> I began to port our modifications done for the 64-bit support against
> the new version of e2fsprogs Ted posted at the beginning of the week.
> Note that it is *just* for test use as it breaks the backwards
> compatibility.
> I did a few tests with a kernel 2.6.17-rc7 and it seems to work, at
> least mkfs, debugfs and fsck tools.

Hi Valerie,

I tried the patches and while the tools such as mkfs and debugfs seem
to work fine, I am still unable to mount a filesystem with block
numbers exceeding 32 bits. I am testing on a 2.6.21.1 kernel with the
ext4 patches from:

ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/ext4-patches/2.6.21-ext4-1

The following error shows up on the kernel log:

[12145.598822] EXT4-fs error (device dm-2): ext4_check_descriptors: Block bitmap for group 0 not in group (block 18446744069414584320)!
[12145.670781] EXT4-fs: group descriptors corrupted!

So it's failing very early in ext4_check_descriptors(). The high 32 bits
of block_bitmap for the first group seem to be all set to 1s.

Thanks

-JRS
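
The block number in the log above can be decoded by hand: 18446744069414584320 is 0xFFFFFFFF00000000, so the upper 32 bits are all ones and the lower 32 bits are zero. A minimal sketch of that decoding follows. The helper names (blk_hi32, blk_lo32, blk_join, blk_in_group) are hypothetical and are not taken from the kernel or e2fsprogs, and the 32768 blocks per group used in the usage note is an assumption (the default for a 4KB block size).

```c
#include <assert.h>
#include <stdint.h>

/* Upper 32 bits of a 64-bit block number. */
uint32_t blk_hi32(uint64_t blk)
{
    return (uint32_t)(blk >> 32);
}

/* Lower 32 bits of a 64-bit block number. */
uint32_t blk_lo32(uint64_t blk)
{
    return (uint32_t)(blk & 0xFFFFFFFFu);
}

/* Rebuild a 64-bit block number from its two 32-bit halves. */
uint64_t blk_join(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/*
 * Hypothetical range check in the spirit of the "not in group" message:
 * nonzero if blk lies inside the group starting at first_block and
 * spanning blocks_per_group blocks.
 */
int blk_in_group(uint64_t blk, uint64_t first_block, uint64_t blocks_per_group)
{
    assert(blocks_per_group > 0);
    return blk >= first_block && (blk - first_block) < blocks_per_group;
}
```

With these helpers, blk_hi32(18446744069414584320ULL) is 0xFFFFFFFF and blk_lo32() of the same value is 0, and blk_in_group() rejects it for a group 0 of 32768 blocks, which matches the symptom reported in the log.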

2007-05-07 18:36:59

by Jose R. Santos

Subject: Re: Creating a >32bit blocks filesystem.

On Mon, 7 May 2007 11:19:52 -0500
"Jose R. Santos" <[email protected]> wrote:
> I tried the patches and while the tools such as mkfs and debugfs seem
> to work fine, I am still unable to mount a filesystem with block
> numbers exceeding 32 bits.

Correction,

I tested debugfs on a 32bit blocks filesystem. Both debugfs and fsck
failed when attempting to use a 64bit filesystem.

-JRS

2007-05-09 12:18:00

by Valerie Clement

Subject: Re: Creating a >32bit blocks filesystem.

Jose R. Santos wrote:
> Hi Valerie,
>
> I tried the patches and while the tools such as mkfs and debugfs seem
> to work fine, I am still unable to mount a filesystem with block
> numbers exceeding 32 bits. I am testing on a 2.6.21.1 kernel with the
> ext4 patches from:
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/ext4-patches/2.6.21-ext4-1
>
> The following error shows up on the kernel log:
>
> [12145.598822] EXT4-fs error (device dm-2): ext4_check_descriptors: Block bitmap for group 0 not in group (block 18446744069414584320)!
> [12145.670781] EXT4-fs: group descriptors corrupted!
>
> So it's failing very early in ext4_check_descriptors(). The high 32 bits
> of block_bitmap for the first group seem to be all set to 1s.
>
Hi Jose,
I'm sorry to reply so late, I had a day off yesterday.
I only tested the tools on an x86_64 system. Could it be an endianness issue?
On which platform do you run?
Valérie

2007-05-09 13:55:48

by Jose R. Santos

Subject: Re: Creating a >32bit blocks filesystem.

On Wed, 09 May 2007 14:16:55 +0200
Valerie Clement <[email protected]> wrote:
> Hi Jose,
> I'm sorry to reply so late, I had a day off yesterday.
> I only tested the tools on an x86_64 system. Could it be an endianness issue?
> On which platform do you run?
> Valérie

Hi Valerie,

I think this has more to do with the fact that I'm on a 32bit
architecture and there are still a couple places where blocks are
represented using "unsigned long". I'm trying to get access to a 64bit
arch to confirm this.

-JRS

2007-05-09 14:58:06

by Valerie Clement

Subject: Re: Creating a >32bit blocks filesystem.

Jose R. Santos wrote:
> I think this has more to do with the fact that I'm on a 32bit
> architecture and there are still a couple places where blocks are
> represented using "unsigned long". I'm trying to get access to a 64bit
> arch to confirm this.
>
> -JRS
>
Oh, I didn't catch that you use a 32-bit system.
On 32-bit architectures, the page cache index size imposes a 16TB limit
on the filesystem size (with 4KB blocksize). So you need a 64-bit system
for your test.
Valérie
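
The 16TB figure above is plain arithmetic: assuming the page cache index is a 32-bit value on a 32-bit architecture and pages are 4KB, at most 2^32 pages of 4096 bytes can be addressed, which is 2^44 bytes. A minimal sketch of the calculation, with hypothetical function names that do not come from the kernel:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes addressable through a page cache index that is index_bits wide,
 * with pages of page_size bytes. index_bits is capped at 32 so the
 * product cannot overflow 64 bits.
 */
uint64_t page_cache_limit_bytes(unsigned index_bits, uint32_t page_size)
{
    assert(index_bits > 0 && index_bits <= 32);
    assert(page_size > 0);
    /* number of distinct page indexes times the size of one page */
    return ((uint64_t)1 << index_bits) * page_size;
}

/* The same limit expressed as a count of filesystem blocks. */
uint64_t page_cache_limit_blocks(unsigned index_bits, uint32_t page_size,
                                 uint32_t block_size)
{
    assert(block_size > 0);
    return page_cache_limit_bytes(index_bits, page_size) / block_size;
}
```

For index_bits = 32 and 4096-byte pages this gives 17592186044416 bytes (16TB), or 2^32 blocks of 4KB, so under these assumptions no block number that needs more than 32 bits can be reached on such a system.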

2007-05-09 15:25:28

by Eric Sandeen

Subject: Re: Creating a >32bit blocks filesystem.

Valerie Clement wrote:
> Jose R. Santos wrote:
>> I think this has more to do with the fact that I'm on a 32bit
>> architecture and there are still a couple places where blocks are
>> represented using "unsigned long". I'm trying to get access to a 64bit
>> arch to confirm this.
>>
>> -JRS
>>
> Oh, I didn't catch that you use a 32-bit system.
> On 32-bit architectures, the page cache index size imposes a 16TB limit
> on the filesystem size (with 4KB blocksize). So you need a 64-bit system
> for your test.
> Valérie

hm, the mount never should have gotten far enough to fail due to this,
should it have?

Jose, what exactly failed? I see references to debugfs failing, but
also kernel logs...

Things like debugfs will have issues with very large block devices due
to maximum file size restrictions on 32-bit platforms, due to the page
cache issue Valerie mentions... But trying to open it should give EFBIG
I'd think?

And mounting such a filesystem on a 32-bit system should also get
rejected early (and cleanly).

Jose, you mentioned that some blocks are still "unsigned long" on
32-bits... they shouldn't be, the LBD work should have fixed all those
long ago. But there is still the 16TB page cache limit in force.

-Eric

2007-05-09 15:29:41

by Jose R. Santos

Subject: Re: Creating a >32bit blocks filesystem.

On Wed, 09 May 2007 16:57:01 +0200
Valerie Clement <[email protected]> wrote:

> Jose R. Santos wrote:
> > I think this has more to do with the fact that I'm on a 32bit
> > architecture and there are still a couple places where blocks are
> > represented using "unsigned long". I'm trying to get access to a 64bit
> > arch to confirm this.
> >
> > -JRS
> >
> Oh, I didn't catch that you use a 32-bit system.
> On 32-bit architectures, the page cache index size imposes a 16TB limit
> on the filesystem size (with 4KB blocksize). So you need a 64-bit system
> for your test.
> Valérie

Thanks for the info. Would the page cache limitation also restrict
debugfs and dumpe2fs from reading information on a large filesystem?
Would a check need to be in place to limit the use of e2fsprogs on
32bit architectures using large block devices?

-JRS

2007-05-09 16:36:21

by Jose R. Santos

Subject: Re: Creating a >32bit blocks filesystem.

On Wed, 09 May 2007 10:21:44 -0500
Eric Sandeen <[email protected]> wrote:

> Valerie Clement wrote:
> > Jose R. Santos wrote:
> >> I think this has more to do with the fact that I'm on a 32bit
> >> architecture and there are still a couple places where blocks are
> >> represented using "unsigned long". I'm trying to get access to a 64bit
> >> arch to confirm this.
> >>
> >> -JRS
> >>
> > Oh, I didn't catch that you use a 32-bit system.
> > On 32-bit architectures, the page cache index size imposes a 16TB limit
> > on the filesystem size (with 4KB blocksize). So you need a 64-bit system
> > for your test.
> > Valérie
>
> hm, the mount never should have gotten far enough to fail due to this,
> should it have?
>
> Jose, what exactly failed? I see references to debugfs failing, but
> also kernel logs...

debugfs 1.39-tyt3 (29-Apr-2007)
/dev/mapper/testdb: Can't read an inode bitmap while reading inode bitmap

dumpe2fs 1.39-tyt3 (29-Apr-2007)
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: 1377370d-bc15-42c0-90bc-50e86bd86198
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal resize_inode dir_index filetype 64bit spar
...
Journal backup: inode blocks
misc/dumpe2fs: Attempt to read block from filesystem resulted in short read while reading journal inode


> Jose, you mentioned that some blocks are still "unsigned long" on
> 32-bits... they shouldn't be, the LBD work should have fixed all those
> long ago. But there is still the 16TB page cache limit in force.

Found this in mke2fs.c
unsigned long blocks = EXT2_BLOCKS_COUNT(fs->super);
unsigned long start;

which are later used to wipe out any MD RAID metadata at the end of the
device, and in parse_extended_opts()

unsigned long resize, bpg, rsv_groups;
...
if (resize <= EXT2_BLOCKS_COUNT(param)) {

Neither of these cases should affect what I was doing, since I'm not
resizing and my device did not have any RAID metadata on the partition.
I believe I spotted similar cases yesterday in libext2fs but I don't
recall where. I will check.

-JRS
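
To illustrate why "unsigned long" is a problem for block counts on a 32-bit architecture, here is a minimal sketch of the truncation. It uses uint32_t as a stand-in for a 32-bit unsigned long so that the effect is the same on any host; the type and function names are hypothetical and not part of e2fsprogs.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for "unsigned long" as it behaves on a 32-bit architecture. */
typedef uint32_t ulong32_t;

/* What a 64-bit block count becomes when stored in a 32-bit unsigned long. */
ulong32_t store_in_ulong32(uint64_t blocks)
{
    return (ulong32_t)blocks;   /* the upper 32 bits are silently dropped */
}

/* Nonzero if the value survives the round trip through 32 bits. */
int fits_in_ulong32(uint64_t blocks)
{
    return (uint64_t)store_in_ulong32(blocks) == blocks;
}

/*
 * Conversion for code that still takes a 32-bit count: aborts via
 * assert() instead of truncating silently.
 */
ulong32_t checked_to_ulong32(uint64_t blocks)
{
    assert(fits_in_ulong32(blocks));
    return (ulong32_t)blocks;
}
```

For example, a count of 2^32 + 16 blocks is stored as 16, so any computation based on it (such as finding the last blocks of the device) would start from the wrong place.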

2007-05-09 16:44:22

by Andreas Dilger

Subject: Re: Creating a >32bit blocks filesystem.

On May 09, 2007 16:57 +0200, Valerie Clement wrote:
> Jose R. Santos wrote:
> >I think this has more to do with the fact that I'm on a 32bit
> >architecture and there are still a couple places where blocks are
> >represented using "unsigned long". I'm trying to get access to a 64bit
> >arch to confirm this.

> Oh, I didn't catch that you use a 32-bit system.
> On 32-bit architectures, the page cache index size imposes a 16TB limit
> on the filesystem size (with 4KB blocksize). So you need a 64-bit system
> for your test.

The mke2fs code should warn the user in this case that the filesystem will
not be usable on 32-bit systems. I believe mke2fs already checks PAGE_SIZE
in order to validate blocksize > 4096 filesystem requests, and a simple
check for "sizeof(long)" to see if it is a 32-bit system would be enough.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
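
A check along the lines suggested above could be sketched as follows. This is not mke2fs code: the function names are hypothetical, it assumes the page cache index is as wide as a long, and it assumes the block size divides the page size. It does not try to settle the exact boundary case of a filesystem of precisely 16TB.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Nonzero if a filesystem of blocks_count blocks cannot be addressed
 * through a page cache index that is long_bytes wide. Assumes
 * block_size <= page_size and that block_size divides page_size.
 */
int unusable_with_long_size(uint64_t blocks_count, uint32_t block_size,
                            uint32_t page_size, size_t long_bytes)
{
    uint64_t max_pages, blocks_per_page;

    assert(block_size > 0 && block_size <= page_size);
    assert(page_size % block_size == 0);

    /* A 64-bit (or wider) index covers every 64-bit block number. */
    if (long_bytes * CHAR_BIT >= 64)
        return 0;

    max_pages = (uint64_t)1 << (long_bytes * CHAR_BIT);
    blocks_per_page = page_size / block_size;
    return blocks_count > max_pages * blocks_per_page;
}

/* The same check using the size of long on the machine running it. */
int unusable_on_this_host(uint64_t blocks_count, uint32_t block_size,
                          uint32_t page_size)
{
    return unusable_with_long_size(blocks_count, block_size, page_size,
                                   sizeof(long));
}
```

A tool could call unusable_with_long_size(blocks, blocksize, pagesize, 4) to warn that the new filesystem will not be usable on 32-bit systems even when it is being created on a 64-bit one.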

2007-05-09 17:05:19

by Eric Sandeen

Subject: Re: Creating a >32bit blocks filesystem.

Jose R. Santos wrote:
> On Wed, 09 May 2007 10:21:44 -0500
> Eric Sandeen <[email protected]> wrote:

>> Jose, you mentioned that some blocks are still "unsigned long" on
>> 32-bits... they shouldn't be, the LBD work should have fixed all those
>> long ago. But there is still the 16TB page cache limit in force.
>
> Found this in mke2fs.c
> unsigned long blocks = EXT2_BLOCKS_COUNT(fs->super);
> unsigned long start;

Ah, ok, I thought you were talking about kernelspace...

yeah, that looks like a problem. And it's from a patch called
"use_64bit_block_numbers" *grin*

there are a few related typedefs in e2fsprogs, such as blk_t, which is __u64

should we be using those for block number containers?

-Eric

2007-05-09 17:21:16

by Eric Sandeen

Subject: Re: Creating a >32bit blocks filesystem.

Valerie Clement wrote:

> Get the new version of e2fsprogs at
> ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/e2fsprogs-interim/e2fsprogs-1.39-tyt3
> and apply the patchset in attachment.
>
> Hope this helps,
> Valérie

Valérie, this looks a bit odd in 02_use_64bit_io, in debugfs.c:

@@ -149,13 +149,13 @@ void do_open_filesys(int argc, char **ar
 			data_filename = optarg;
 			break;
 
 		case 'b':
-			blocksize = parse_ulong(optarg, argv[0],
+			blocksize = parse_ullong(optarg, argv[0],
 						"block size", &err);

and same for similar code in main() in debugfs.c.

Surely the block *size* doesn't need to be 64 bits :) I guess that got
accidentally inherited from e2fsprogs-1.39 where blocksize is type
blk_t, which turned into 64 bits now... I think "int" will be fine. :)

Thanks,
-Eric
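
As a rough illustration of the point that a block size fits comfortably in an int, a command-line block size could be parsed and validated like this. This is a sketch only, not the debugfs parse_ulong() helper: the function name is hypothetical, and the accepted range of 1024 to 65536 bytes is an assumption about what a sensible block size looks like.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define MIN_BLOCK_SIZE 1024L    /* assumed lower bound */
#define MAX_BLOCK_SIZE 65536L   /* assumed upper bound */

/*
 * Parse a block size argument into an int.
 * Returns the block size, or -1 if arg is not a number, has trailing
 * characters, is out of range, or is not a power of two.
 */
int parse_block_size(const char *arg)
{
    char *end;
    long val;

    assert(arg != NULL);

    errno = 0;
    val = strtol(arg, &end, 0);
    if (errno != 0 || end == arg || *end != '\0')
        return -1;
    if (val < MIN_BLOCK_SIZE || val > MAX_BLOCK_SIZE)
        return -1;
    if ((val & (val - 1)) != 0)     /* not a power of two */
        return -1;
    return (int)val;
}
```

Because the value is range-checked before the cast, the result always fits in an int, and nothing 64 bits wide is needed for the block size itself.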