2015-08-27 01:29:04

by Chris Hunter

Subject: errors following ext3 to ext4 conversion

I attempted to convert an ext3 filesystem to ext4. I am using e2fsprogs
(1.42.12.wc1 (15-Sep-2014)). I ran command tune2fs tune2fs -O
extents,uninit_bg,dir_index /dev/DEV -o acl,user_xattr /dev/DEV.
I then encountered errors (below) when running read-only e2fsck. I have
not mounted the filesystem.
Is it possible to back out the ext3/ext4 changes?
Do tune2fs changes take effect immediately or the next time the
filesystem is mounted?


e2fsck shows a variety of errors:
Pass 1: Checking inodes, blocks, and sizes
Inode 118843400, end of extent exceeds allowed value
(logical block 1409, physical block 3803034390, len 976)
Inode 118843400, end of extent exceeds allowed value
(logical block 2385, physical block 3803056554, len 4294966945)
Inode 118843400, i_size is 8331264, should be 5771264. Fix? no
Inode 118843400, i_blocks is 16328, should be 11312. Fix? no
(...)

Pass 2: Checking directory structure
Problem in HTREE directory inode 118843400 (/O/0/d0): bad block number 2030.
Problem in HTREE directory inode 118843400 (/O/0/d0): bad block number 2031.
Problem in HTREE directory inode 118843400 (/O/0/d0): bad block number 2032.
Problem in HTREE directory inode 118843400 (/O/0/d0): bad block number 2033.

Pass 4: Checking reference counts
Unattached inode 26
Unattached inode 27
Unattached inode 28

regards,
chris hunter
[email protected]


2015-08-27 03:39:52

by Theodore Ts'o

Subject: Re: errors following ext3 to ext4 conversion

On Wed, Aug 26, 2015 at 08:53:13PM -0400, Chris Hunter wrote:
> I attempted to convert ext3 to ext4 filesystem. I am using e2fsprogs
> (1.42.12.wc1 (15-Sep-2014)). I ran command tune2fs tune2fs -O
> extents,uninit_bg,dir_index /dev/DEV -o acl,user_xattr /dev/DEV.
> I then encountered errors (below) when running read-only e2fsck. I have not
> mounted the filesystem.

I'd suggest using

debugfs -w -R "features ^extents,^uninit_bg" /dev/DEV

and then try re-running the read-only e2fsck. Did you try running a
read-only e2fsck before you tried using the tune2fs command?

Merely turning on the extents feature doesn't actually convert any
files to use extents. So if e2fsck is showing errors like this:

> e2fsck shows a variety of errors:
> Pass 1: Checking inodes, blocks, and sizes
> Inode 118843400, end of extent exceeds allowed value
> (logical block 1409, physical block 3803034390, len 976)
> Inode 118843400, end of extent exceeds allowed value
> (logical block 2385, physical block 3803056554, len 4294966945)

This suggests that the file system was likely corrupted before you
tried converting the file system, since there should not have been any
extent-mapped files in an ext3 file system.
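
One way to see what a feature flip would actually change is to read the
feature list from the superblock first. The following is a minimal sketch,
assuming the "Filesystem features:" and "Filesystem state:" lines printed by
"dumpe2fs -h"; the helper names and the sample text are hypothetical and are
not taken from the file system discussed in this thread.

```python
def parse_superblock_summary(text):
    """Return (features, state) parsed from `dumpe2fs -h`-style text."""
    features, state = set(), None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "key: value" line
        key = key.strip()
        if key == "Filesystem features":
            features = set(value.split())
        elif key == "Filesystem state":
            state = value.strip()
    return features, state


def already_enabled(features, requested):
    """List the requested features that are already set, sorted by name."""
    return sorted(set(requested) & features)


# Hypothetical sample, shaped like dumpe2fs -h output (not real data).
SAMPLE = """\
Filesystem state:         clean
Filesystem features:      has_journal ext_attr dir_index filetype extents sparse_super large_file
"""

features, state = parse_superblock_summary(SAMPLE)
print(state, already_enabled(features, ["extents", "uninit_bg", "dir_index"]))
```

If "extents" shows up as already enabled before any tune2fs run, the file
system was not a stock ext3 file system to begin with.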

Regards,

- Ted

2015-08-27 03:43:48

by Eric Sandeen

Subject: Re: errors following ext3 to ext4 conversion

On 8/26/15 10:39 PM, Theodore Ts'o wrote:

> Merely turning on the extents feature doesn't actually convert any
> files to use extents. So if e2fsck is showing errors like this:
>
>> e2fsck shows a variety of errors:
>> Pass 1: Checking inodes, blocks, and sizes
>> Inode 118843400, end of extent exceeds allowed value
>> (logical block 1409, physical block 3803034390, len 976)
>> Inode 118843400, end of extent exceeds allowed value
>> (logical block 2385, physical block 3803056554, len 4294966945)
>
> This suggests that the file system was likely corrupted before you
> tried converting the file system, since there should not have been any
> extent-mapped files in an ext3 file system.

Hm, do we not require a freshly-fsck'd fs (tm) prior to a conversion attempt,
like we do (I think) for resize?

That might be a good idea ...

-Eric


2015-08-27 04:15:42

by Darrick J. Wong

Subject: Re: errors following ext3 to ext4 conversion

On Wed, Aug 26, 2015 at 10:43:45PM -0500, Eric Sandeen wrote:
> On 8/26/15 10:39 PM, Theodore Ts'o wrote:
>
> > Merely turning on the extents feature doesn't actually convert any
> > files to use extents. So if e2fsck is showing errors like this:
> >
> >> e2fsck shows a variety of errors:
> >> Pass 1: Checking inodes, blocks, and sizes
> >> Inode 118843400, end of extent exceeds allowed value
> >> (logical block 1409, physical block 3803034390, len 976)
> >> Inode 118843400, end of extent exceeds allowed value
> >> (logical block 2385, physical block 3803056554, len 4294966945)
> >
> > This suggests that the file system was likely corrupted before you
> > tried converting the file system, since there should not have been any
> > extent-mapped files in an ext3 file system.
>
> Hm, do we not require a freshly-fsck'd fs (tm) prior to a conversion attempt,
> like we do (I think) for resize?

tune2fs is not as strict as resize2fs; iirc resize whines if it finds ERROR
status, lack of VALID status, or it having been too long since the last fsck,
whereas tune2fs only cares that the fs is marked VALID.

(Scary, if you think about it...)
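
The difference described above can be sketched as two small predicates. This
is only a model of the description, not the actual resize2fs or tune2fs code:
the flag values are the superblock state bits from ext2_fs.h, while the
function names are hypothetical and "too long since the last fsck" is
approximated here, as an assumption, by "last check is older than the last
mount".

```python
EXT2_VALID_FS = 0x0001  # file system was cleanly unmounted
EXT2_ERROR_FS = 0x0002  # errors were detected


def resize_style_check(s_state, s_lastcheck, s_mtime):
    """Return the reasons a resize2fs-like tool would ask for e2fsck -f."""
    reasons = []
    if s_state & EXT2_ERROR_FS:
        reasons.append("ERROR flag set")
    if not (s_state & EXT2_VALID_FS):
        reasons.append("VALID flag not set")
    if s_lastcheck < s_mtime:  # assumption: stands in for "too long ago"
        reasons.append("not checked since last mount")
    return reasons


def tune_style_check(s_state):
    """The looser check: only the VALID flag matters."""
    return [] if s_state & EXT2_VALID_FS else ["VALID flag not set"]


# A file system marked VALID but mounted after its last check.
print(resize_style_check(EXT2_VALID_FS, s_lastcheck=100, s_mtime=200))
print(tune_style_check(EXT2_VALID_FS))
```

The same superblock passes the looser check while failing the stricter one,
which is the gap being pointed out.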

--D

>
> That might be a good idea ...
>
> -Eric

2015-08-27 18:58:50

by Theodore Ts'o

Subject: Re: errors following ext3 to ext4 conversion

On Wed, Aug 26, 2015 at 09:15:36PM -0700, Darrick J. Wong wrote:
>
> tune2fs is not as strict as resize2fs; iirc resize whines if it finds ERROR
> status, lack of VALID status, or it having been too long since the last fsck,
> whereas tune2fs only cares that the fs is marked VALID.
>
> (Scary, if you think about it...)

Originally tune2fs was for things like changing the number of reserved
blocks in the superblock, or setting a label, etc. Things for which
subtle file system corruptions wouldn't be that big of a deal.

Even for setting feature flags, tune2fs doesn't make any fundamental
changes to the file system other than flipping a few bits. So for
Chris, the good news is that undoing the tune2fs changes is relatively
easy if all he's done since then is to run a read-only e2fsck -n.
We just have to flip a few bits. (Note, the reason why I didn't
include ^dir_index is that most ext3 file systems created using
non-paleolithic versions of e2fsprogs will have dir_index turned on
already.)

But now that we have some tune2fs operations that do resize2fs-like
operations, we probably should add checks for those more risky
operations. And even though flipping feature flags isn't very scary
in and of itself, maybe we should require those checks for that case too
--- although we have historically supported adding things like the
extents flag, or even the journal when converting from ext2 to ext3,
while the file system was mounted.

I suspect that would fill Eric's heart with horror, but the ability to
migrate the root file system from ext2 to ext3 while it was mounted
(i.e., just run "tune2fs -O has_journal /dev/rootfs" and reboot) was
something Stephen Tweedie added, so at least at one point Red Hat was
more adventurous about what it would support in terms of file system
upgrades without using mkfs. :-)

- Ted

2015-08-27 19:28:15

by Chris Hunter

Subject: Re: errors following ext3 to ext4 conversion

Hi,
Thanks for the response. As Theodore pointed out, the filesystem already
had extents. I ran the tune2fs command on a filesystem that had
previously been converted from ext3 to ext4. I undid features (via
tune2fs -O ^<flag>) but the read-only fsck errors persist.

Can you elaborate on the risky tune2fs options? I assume you mean
changes that can't be undone, or that are unsafe?

regards,
chris hunter
[email protected]

2015-08-27 21:34:07

by Eric Sandeen

Subject: Re: errors following ext3 to ext4 conversion

On 8/27/15 1:58 PM, Theodore Ts'o wrote:
> I suspect that would fill Eric's heart with horror, but the ability to
> migrate the root file system from ext2 to ext3 while it was mounted
> (i.e., just run "tune2fs -O has_journal /dev/rootfs" and reboot) was
> something Stephen Tweedie added, so at least at one point Red Hat was
> more adventurous about what it would support in terms of file system
> upgrades without using mkfs. :-)
>
> - Ted

Oh, it doesn't fill me with *that* much horror. ;)

TBH, my big problem with the ext3->ext4 "migration" is that you wind
up with a mongrel filesystem which mkfs.ext4 would never create,
populated with files of varying runtime limitations and capabilities,
depending on whether they were created before or after the "migration."

Adding a journal and rebooting at least gets you to a pretty standard,
predictable, and tested result. ;)

-Eric

2015-08-27 22:46:55

by Theodore Ts'o

Subject: Re: errors following ext3 to ext4 conversion

On Thu, Aug 27, 2015 at 03:28:12PM -0400, Chris Hunter wrote:
> Hi,
> Thanks for the response. As Theodore pointed out, the filesystem already had
> extents. I ran the tune2fs command on a filesystem that had previously been
> converted from ext3 to ext4. I undid features (via tune2fs -O ^<flag>) but
> the read-only fsck errors persist.

Woah! If the file system already had been converted from ext3 to
ext4, why were you running the tune2fs command in the first place? I
had thought the tune2fs command was your _attempt_ to convert the
filesystem from ext3 to ext4. It was on that basis that I suggested
that you use the "tune2fs -O ^<flag>" command to revert those changes.

> Can you elaborate on the risky tune2fs options. I assume you mean changes
> that can't be undone ? or unsafe ?

There are commands to change the inode size (for example) that need to
allocate more blocks and then rewrite the inode table. These commands
are risky if your file system was corrupted before you attempt to run
the tune2fs command. For similar reasons, resize2fs forces you to do
a read/write check of the file system (so that the last fsck time
can be updated, so resize2fs can *verify* that you ran the e2fsck
command, instead of lazily claiming you did when you didn't :-)
*before* you run resize2fs.

The main issue here is that you want to make sure the file system is
in a stable state *before* you try to make involved changes. At this
point, I'm confused about what flags you had set on your file system
before you ran your tune2fs command, so it's hard to know what to
suggest. But it's highly likely that no matter what was going on, the
fact that your file system was corrupted has nothing to do with the
tune2fs command. The tune2fs -O command only flips a few bits in the
superblock. It's highly likely that your file system was corrupted
*before* you ran the tune2fs command.

It's for that reason that it's tempting to require an e2fsck before
running a tune2fs -O command. Unfortunately, in the past we've
allowed this even for mounted file systems, and if you know what you
are doing, and your file system is in a sane state, it's awfully
convenient to turn on certain features even while the file system is
mounted.

The problem is that if you are an enterprise distribution who has to
pay for staffing a help desk, or you're someone who's trying to help
someone who is asking for advice on linux-ext4, it's awfully tempting
to assume that users are idiots. And sometimes the problem isn't that
they are, but that bug reports are ambiguous.
For example, what was the state of your file system before you ran the
tune2fs command? Was it a stock ext3 file system, or had you already
converted it to ext4? If so, how? Was the file system mounted at any
of these steps? (Running e2fsck on a mounted file system is often
going to lead to misleading file system problem reports.)

So people who do this a lot often feel that tools have to be dumbed
down, because otherwise it becomes a support nightmare....

BTW, this is probably a good time to ask if your backups are up to
date. Because regardless of what happened, it's likely you will have
suffered at least some data loss...

- Ted

2015-08-28 16:09:43

by Andreas Dilger

Subject: Re: errors following ext3 to ext4 conversion

On Aug 26, 2015, at 6:53 PM, Chris Hunter <[email protected]> wrote:
>
> I attempted to convert ext3 to ext4 filesystem. I am using e2fsprogs (1.42.12.wc1 (15-Sep-2014)).

This is the Lustre-patched e2fsprogs.

When was this filesystem formatted? Lustre itself has been using
an ext4 format for a long time already (extents, uninit_bg, dir_index),
so unless it was something really ancient, these features should have
been on already. Was there some prior corruption that started you
down this road?

> I ran command tune2fs tune2fs -O extents,uninit_bg,dir_index /dev/DEV -o acl,user_xattr /dev/DEV.
> I then encountered errors (below) when running read-only e2fsck. I have not mounted the filesystem.
> Is it possible to backout the ext3/ext4 changes ?
> Do tune2fs changes take effect immediately or next time filesystem is mounted?
>
>
> e2fsck shows a variety of errors:
> Pass 1: Checking inodes, blocks, and sizes
> Inode 118843400, end of extent exceeds allowed value
> (logical block 1409, physical block 3803034390, len 976)
> Inode 118843400, end of extent exceeds allowed value
> (logical block 2385, physical block 3803056554, len 4294966945)

This is a bit strange, because the end of the first extent (1409 + 976)
matches the start of the next one (2385) so that should be correct?
Definitely the "4294966945" (= 0xfffffea1 = -351) length is incorrect
and the victim of some corruption. It isn't even some random bit flip
so I have no idea how "-351" got in there.
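
The arithmetic above can be checked with a short sketch. It uses only the
numbers from the e2fsck output; the helper name as_signed32 is made up for
illustration.

```python
def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


# (logical block, physical block, length) as reported by e2fsck.
first = (1409, 3803034390, 976)
second = (2385, 3803056554, 4294966945)

# The first extent ends exactly where the second one starts.
print(first[0] + first[2] == second[0])

# The second length only makes sense as a negative number.
print(hex(second[2]), as_signed32(second[2]))
```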

> Inode 118843400, i_size is 8331264, should be 5771264. Fix? no
> Inode 118843400, i_blocks is 16328, should be 11312. Fix? no

It looks like you've lost an even 2500KB = 625 blocks, or 5016 blocks,
off the end of this directory, depending on whether i_size or i_blocks
should be trusted.
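
A quick way to reproduce these figures, as a sketch under two assumptions
that are not stated in the e2fsck output: a 4096-byte block size (implied by
"2500KB = 625 blocks") and i_blocks being counted in 512-byte units.

```python
BLOCK_SIZE = 4096   # assumption: implied by "2500KB = 625 blocks"
SECTOR_SIZE = 512   # assumption: i_blocks is counted in 512-byte units

size_gap = 8331264 - 5771264     # i_size: reported minus expected, in bytes
blocks_gap = 16328 - 11312       # i_blocks: reported minus expected

# Gap according to i_size, in KB and in file system blocks.
print(size_gap // 1024, size_gap // BLOCK_SIZE)

# Gap according to i_blocks, raw and converted to file system blocks.
print(blocks_gap, blocks_gap * SECTOR_SIZE // BLOCK_SIZE)
```

Under those assumptions the two fields disagree by two file system blocks
(625 versus 627), so they do not describe exactly the same loss.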

> (...)
>
> Pass 2: Checking directory structure
> Problem in HTREE directory inode 118843400 (/O/0/d0): bad block number 2030.
> Problem in HTREE directory inode 118843400 (/O/0/d0): bad block number 2031.
> Problem in HTREE directory inode 118843400 (/O/0/d0): bad block number 2032.
> Problem in HTREE directory inode 118843400 (/O/0/d0): bad block number 2033.

This looks like a Lustre OST filesystem layout. Are the other d*
directories also corrupted?

> Pass 4: Checking reference counts
> Unattached inode 26
> Unattached inode 27
> Unattached inode 28
>
> regards,
> chris hunter
> [email protected]


Cheers, Andreas






2015-08-28 16:23:57

by Chris Hunter

Subject: Re: errors following ext3 to ext4 conversion

Hi Theodore,

You are correct there were human errors in these filesystem changes (eg.
insufficient pre-checks/validation). I don't expect the software to
compensate for poor planning.
For future reference, I desire a better understanding of risks involved
in using tune2fs. I incorrectly assumed that tune2fs doesn't "do"
anything until a filesystem is mounted.
Assuming you start with a clean filesystem, which tune2fs commands are
the most dangerous? Which are the safest?

thanks,
chris hunter
[email protected]
"Experience is the teacher of all things." Julius Caesar


2015-08-28 18:32:15

by Theodore Ts'o

Subject: Re: errors following ext3 to ext4 conversion

On Fri, Aug 28, 2015 at 12:23:54PM -0400, Chris Hunter wrote:
> Hi Theodore,
>
> You are correct there were human errors in these filesystem changes (eg.
> insufficient pre-checks/validation). I don't expect the software to
> compensate for poor planning.
> For future reference, I desire a better understanding of risks involved in
> using tune2fs. I incorrectly assumed that tune2fs doesn't "do" anything
> until a filesystem is mounted.
> Assuming you start with a clean filesystem, which tune2fs commands are the
> most dangerous ? which are the safest ?

In the e2fsprogs 1.42.x series (which is what you are using), the only
really "dangerous" option is "tune2fs -I", which changes the inode
size and requires moving blocks around, and which could potentially
cause data loss if you lose power while the tune2fs command is
running.

You're posting on the ext4 developers' mailing list, and there have
been discussions to add more "resize2fs-like" features to tune2fs in
the 1.43 development branch of e2fsprogs. For the *most* part the
commands in tune2fs are "safe" in that they don't touch anything other
than the superblock --- although if the file system is mounted, they
can result in changes right away.

The one command which can have more traps for the unwary is "tune2fs
-O". It only modifies the superblock, but in some cases, tune2fs will
ask you to run e2fsck afterwards. We do try to mask out the dangerous
options --- for example, we don't allow you to turn off the extents
option, since that can cause problems. Someone who accidentally enabled
the extents flag on a file system that previously didn't have it can,
while it is unmounted, turn it off using debugfs, but that _is_ dangerous so
we don't allow tune2fs to do so.
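
The masking idea can be sketched as follows. This is an illustration only:
the set of clearable features below is a small assumed subset, not the real
table inside tune2fs, and plan_feature_change is a hypothetical helper. The
one fact taken from this thread is that extents is not in the clearable set.

```python
# Illustrative subset only; NOT the actual table used by tune2fs.
CLEARABLE_VIA_TUNE2FS = {"has_journal", "dir_index", "uninit_bg"}


def plan_feature_change(spec):
    """Split a "-O"-style spec into (to_set, to_clear, refused) lists."""
    to_set, to_clear, refused = [], [], []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        if item.startswith("^"):
            name = item[1:]
            if name in CLEARABLE_VIA_TUNE2FS:
                to_clear.append(name)
            else:
                refused.append(name)  # e.g. extents: left to debugfs
        else:
            to_set.append(item)
    return to_set, to_clear, refused


print(plan_feature_change("^extents,^uninit_bg,dir_index"))
```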

So we do try to make tune2fs safer for non-experts. In this
particular case, I got a bit frustrated because we got the typical bug
report, and I apologize for my tone. For some reason, users tend not
to give us relevant information (such as this is a Lustre disk) as
background context, and in some cases, outright *wrong* information
(that this was an ext3 file system that you were converting to ext4;
from what you've said later, apparently this already had some ext4
features because it was used as a part of Lustre). This is the sort
of thing that gives support engineers nightmares, and it's why
people like myself who are developers sometimes don't make good
support engineers --- we don't have the protective assumptions from
long experience that everything a user asking for help tells us may be
lies. :-)

- Ted