2012-11-11 22:27:18

by George Spelvin

Subject: mke2fs -O 64bit -E resize=<anything> divides by 0

I'm using v1.43-WIP-2012-09-22-10-g41bf599, last commit Oct. 14.

I'm trying to create a file system with 64bit support and specify a
maximum resize limit of 64 TiB = 2^34 blocks = 17179869184.

(gdb) run -n -t ext4 -O 64bit -E resize=4294967296 /dev/md1
Starting program: /root/e2fsprogs/misc/mke2fs -n -t ext4 -O 64bit -E resize=4294967295 /dev/md1
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
mke2fs 1.43-WIP (22-Sep-2012)

Program received signal SIGFPE, Arithmetic exception.
0x0000000000405f5a in parse_extended_opts (opts=<optimized out>,
param=0x64e200) at mke2fs.c:800
800 gdpb = EXT2_DESC_PER_BLOCK(param);

The issue is that

#define EXT2_DESC_PER_BLOCK(s) (EXT2_BLOCK_SIZE(s) / EXT2_DESC_SIZE(s))
#define EXT2_DESC_SIZE(s) \
((EXT2_SB(s)->s_feature_incompat & EXT4_FEATURE_INCOMPAT_64BIT) ? \
(s)->s_desc_size : EXT2_MIN_DESC_SIZE)

and s_desc_size is 0 because parse_extended_opts is called from PRS which
is called very early in main() at line 2320, while s_desc_size is set up
in ext2fs_initialize, which is not called from main() until mke2fs.c:2353.

As a temporary workaround, I notice that ext2fs_initialize sets s_desc_size to
the fixed value EXT2_MIN_DESC_SIZE_64BIT, so I changed the #define as follows:

#define EXT2_DESC_SIZE(s) \
((EXT2_SB(s)->s_feature_incompat & EXT4_FEATURE_INCOMPAT_64BIT) ? \
(s)->s_desc_size ?: EXT2_MIN_DESC_SIZE_64BIT : EXT2_MIN_DESC_SIZE)

... which seems to work.
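
For anyone who wants to see the failure mode in isolation, here is a minimal sketch of the same fallback written in standard C (the `?:` form above is a GNU extension). The struct `mock_super` and the helper names are hypothetical stand-ins, not the real e2fsprogs definitions; only the three constants mirror the values from the ext2 headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the real e2fsprogs definitions. */
#define INCOMPAT_64BIT      0x0080u  /* EXT4_FEATURE_INCOMPAT_64BIT */
#define MIN_DESC_SIZE       32u      /* EXT2_MIN_DESC_SIZE */
#define MIN_DESC_SIZE_64BIT 64u      /* EXT2_MIN_DESC_SIZE_64BIT */

struct mock_super {
    uint32_t feature_incompat;
    uint16_t desc_size;   /* still 0 before ext2fs_initialize() has run */
    uint32_t block_size;
};

/* Descriptor size, falling back to the 64-bit minimum when the
 * superblock field has not been filled in yet. */
static uint32_t desc_size(const struct mock_super *sb)
{
    if (!(sb->feature_incompat & INCOMPAT_64BIT))
        return MIN_DESC_SIZE;
    return sb->desc_size ? sb->desc_size : MIN_DESC_SIZE_64BIT;
}

/* Descriptors per block; the assert documents the invariant that the
 * unpatched macro violated (a zero divisor). */
static uint32_t desc_per_block(const struct mock_super *sb)
{
    uint32_t size = desc_size(sb);
    assert(size != 0);
    return sb->block_size / size;
}
```

With a 4K block size, the 64bit flag set and `desc_size` still 0, `desc_per_block()` yields 64 instead of dividing by zero; without the flag it yields 128.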


(One point that occurred to me while wrestling with this is that the
default resize limit of initial size * 1000 should perhaps be clamped
to 2^32 if 64bit is not enabled.)
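
To make that suggestion concrete, a rough sketch of the clamp might look like the following. This is only an illustration of the idea, not code from mke2fs; the function name `default_resize_limit` and its parameters are invented here, and the 1000x factor is taken from the documented default.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: default resize limit in blocks, given the
 * initial size in blocks and whether the 64bit feature is enabled. */
static uint64_t default_resize_limit(uint64_t initial_blocks, int has_64bit)
{
    const uint64_t max32 = UINT64_C(1) << 32;  /* 2^32 blocks */
    uint64_t limit;

    /* Documented default: 1000x the initial size (saturating). */
    if (initial_blocks > UINT64_MAX / 1000)
        limit = UINT64_MAX;
    else
        limit = initial_blocks * 1000;

    /* Without 64bit, block numbers cannot exceed 32 bits. */
    if (!has_64bit && limit > max32)
        limit = max32;

    assert(has_64bit || limit <= max32);
    return limit;
}
```

For example, an initial size of 2^23 blocks (32 GiB with 4K blocks) would be clamped to 2^32 blocks without 64bit, and left at 8388608000 blocks with it.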


2012-11-12 01:54:11

by Andreas Dilger

Subject: Re: mke2fs -O 64bit -E resize=<anything> divides by 0

On 2012-11-11, at 3:27 PM, George Spelvin wrote:
> I'm using v1.43-WIP-2012-09-22-10-g41bf599, last commit Oct. 14.
>
> I'm trying to create a file system with 64bit support and specify a
> maximum resize limit of 64 TiB = 2^34 blocks = 17179869184.
>
> (gdb) run -n -t ext4 -O 64bit -E resize=4294967296 /dev/md1
> Starting program: /root/e2fsprogs/misc/mke2fs -n -t ext4 -O 64bit -E resize=4294967295 /dev/md1
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> mke2fs 1.43-WIP (22-Sep-2012)
>
> Program received signal SIGFPE, Arithmetic exception.
> 0x0000000000405f5a in parse_extended_opts (opts=<optimized out>,
> param=0x64e200) at mke2fs.c:800
> 800 gdpb = EXT2_DESC_PER_BLOCK(param);

This is definitely a bug in the code to do a divide-by-zero.

However, it should be pointed out that the "resize" option does not
make sense for filesystems larger than 16TB. The mechanism used for
resizing beyond 16TB is different and does not need to reserve blocks.

Cheers, Andreas.


2012-11-12 04:32:45

by George Spelvin

Subject: Re: mke2fs -O 64bit -E resize=<anything> divides by 0

> However, it should be pointed out that the "resize" option does not
> make sense for filesystems larger than 16TB. The mechanism used for
> resizing beyond 16TB is different and does not need to reserve blocks.

Ah, thank you. Is this tidbit documented anywhere?

After my patch, I ran the command, with a resize=<> option.
Was that number simply ignored?

(I'm presently bulk-copying into the partition, so I can re-mkfs
and restart the bulk copy if necessary.)

2012-11-12 04:37:38

by Eric Sandeen

Subject: Re: mke2fs -O 64bit -E resize=<anything> divides by 0

On 11/11/12 7:54 PM, Andreas Dilger wrote:
> On 2012-11-11, at 3:27 PM, George Spelvin wrote:
>> I'm using v1.43-WIP-2012-09-22-10-g41bf599, last commit Oct. 14.
>>
>> I'm trying to create a file system with 64bit support and specify a
>> maximum resize limit of 64 TiB = 2^34 blocks = 17179869184.
>>
>> (gdb) run -n -t ext4 -O 64bit -E resize=4294967296 /dev/md1
>> Starting program: /root/e2fsprogs/misc/mke2fs -n -t ext4 -O 64bit -E resize=4294967295 /dev/md1
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
>> mke2fs 1.43-WIP (22-Sep-2012)
>>
>> Program received signal SIGFPE, Arithmetic exception.
>> 0x0000000000405f5a in parse_extended_opts (opts=<optimized out>,
>> param=0x64e200) at mke2fs.c:800
>> 800 gdpb = EXT2_DESC_PER_BLOCK(param);
>
> This is definitely a bug in the code to do a divide-by-zero.
>
> However, it should be pointed out that the "resize" option does not
> make sense for filesystems larger than 16TB. The mechanism used for
> resizing beyond 16TB is different and does not need to reserve blocks.

In fairness to the reporter, nothing in the existing ext4 documentation,
AFAICT, mentions this. (but then -O 64bit isn't really documented at all)

And given that the poor reporter is re-making his whole filesystem just
because he found out that he can't grow past 16T:
"(wow, was *that* a nasty surprise)"
it's understandable that he's trying to give it a rather large resize=
value this time around.

This is one of those dark corners of weird behavior that could really use
some formal docs, at least. :(

-Eric


2012-11-12 06:32:09

by George Spelvin

Subject: Re: mke2fs -O 64bit -E resize=<anything> divides by 0

> And given that the poor reporter is re-making his whole filesystem just
> because he found out that he can't grow past 16T:
> "(wow, was *that* a nasty surprise)"
> it's understandable that he's trying to give it a rather large resize=
> value this time around.

Er... I was trying to give a *small* resize= value, actually, given that
the default is documented as 1000x the FS size at creation. I don't have
a good mental model of what the space reservation entails, but given
*that* kind of default fudge factor, I added a small integer multiple
on top of what I thought would plausibly happen.

2012-11-15 15:38:35

by George Spelvin

Subject: Re: mke2fs -O 64bit -E resize=<anything> divides by 0

Just to follow up to this thread so that anyone searching archives
will know: DO NOT DO THIS, IT IS BUGGY. (As of today's mke2fs 1.43-WIP.)


Asking for preallocated space boils down to reserving space in the block
group descriptor table (both the primary and all backups) for the final
total number of block groups.

A block group is as many blocks as can be controlled by a 1-block
allocation bitmap. So with 4K blocks, that's 32K blocks, or 128 MiB.

Each descriptor is 32 bytes (or 64 bytes for 64-bit), so the largest
possible 32-bit FS, of 2^32 blocks, requires 2^17 block groups, which
requires 2^22 bytes of block group descriptor table. That's 2^10 =
1024 blocks of 2^12 = 4K size.
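
The arithmetic above can be checked with a few lines of C. This is just a sketch of the calculation as I described it; the function names are made up for the illustration and are not e2fsprogs APIs.

```c
#include <assert.h>
#include <stdint.h>

/* One bitmap block has block_size * 8 bits, one bit per block. */
static uint64_t blocks_per_group(uint64_t block_size)
{
    return block_size * 8;
}

/* Number of block groups needed to cover total_blocks (rounded up). */
static uint64_t group_count(uint64_t total_blocks, uint64_t block_size)
{
    uint64_t bpg = blocks_per_group(block_size);
    return (total_blocks + bpg - 1) / bpg;
}

/* Blocks of group descriptor table needed for that many groups. */
static uint64_t gdt_blocks(uint64_t groups, uint64_t desc_size,
                           uint64_t block_size)
{
    uint64_t per_block;

    assert(desc_size != 0 && block_size % desc_size == 0);
    per_block = block_size / desc_size;
    return (groups + per_block - 1) / per_block;
}
```

With 4K blocks, `blocks_per_group(4096)` is 32768 (128 MiB per group), `group_count(1ULL << 32, 4096)` is 131072 = 2^17, and `gdt_blocks(131072, 32, 4096)` is 1024.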

mke2fs keeps track of the reserved blocks by allocating them to a
special inode (#7), with each reserved area getting one indirect block,
since that corresponds to the maximum possible size.


But here's the bug! It turns out that mke2fs *cannot* preallocate more
than 1024 blocks of block group descriptor table, so the maximum
growth is 16 TiB on 32-bit, or 8 TiB on 64-bit (where the descriptors
are twice as large).

(Note that this is the size *in addition to* the current size, not
the final total.)
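
Working the limit backwards from the 1024-block cap gives the growth figures quoted above. Again, this is only a sketch of the arithmetic; `max_growth_bytes` is a name invented for this example, and the 1024 figure is passed in as a parameter rather than derived from mke2fs itself.

```c
#include <assert.h>
#include <stdint.h>

/* Maximum additional file system size (in bytes) that a given number
 * of reserved descriptor-table blocks can describe. */
static uint64_t max_growth_bytes(uint64_t reserved_gdt_blocks,
                                 uint64_t desc_size, uint64_t block_size)
{
    uint64_t desc_per_block, groups, blocks_per_group;

    assert(desc_size != 0 && block_size % desc_size == 0);
    desc_per_block = block_size / desc_size;       /* 128 or 64 at 4K */
    groups = reserved_gdt_blocks * desc_per_block; /* extra block groups */
    blocks_per_group = block_size * 8;             /* one bitmap block */
    return groups * blocks_per_group * block_size;
}
```

With 4K blocks, `max_growth_bytes(1024, 32, 4096)` is 2^44 bytes (16 TiB) and `max_growth_bytes(1024, 64, 4096)` is 2^43 bytes (8 TiB).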

For 32-bit file systems, this is of course not a problem. The "1000x
default growth" documented in mke2fs really means that, if you create
a file system of 16 GiB or larger, it preallocates to the 16 TiB max.

However, when using a 64bit file system, you can sensibly ask for more
preallocation. But if you do, (as of today; I expect Ted will at least
make it fail in future) mke2fs silently truncates the request to the
maximum it can supply.


Now, I was trying to reallocate from 10 TB to 22 TB, a 12 TB increase,
which is above the 8 TiB limit.

It turns out that there's a second bug, in resize2fs, which notices the
preallocated space and tries to use it, but when it's not big enough,
it does things wrong and destroys some inodes (if flex_bg is also
enabled, which it always is for ext4).


I expect these all to get fixed fairly soon, but please, nobody else have
my data-loss experience.