LinuxLists.cc - Bug in ext3

2001-11-15 14:25:14

Subject: Bug in ext3

I recently compiled support for ext3 into the kernel (2.4.15-pre4) and
booted that kernel onto a system that didn't have any ext3 partitions.
On boot I got these messages:

JBD: no valid journal superblock found
JBD: no valid journal superblock found
EXT3-fs: error loading journal.

This was when it mounted the root filesystem. After several minutes of
disk activity on /, I got this:

EXT2-fs error (device sd(8,20)): ext2_free_blocks: Freeing blocks not in datazone - block = 4294965248, count = 6872

Then it was mounted ro. Nothing happened to any of the other partitions
(including the 250gig RAID partition), just the root partition. I am
assuming that even though it was not an ext3 partition, the ext3
driver took control of it. I tried fsck and remount,rw, but the problem
kept occurring (although with a different message, saying that when
removing a file the bit was already cleared).

Rebuilding without ext3 solves the symptoms, but of course not the real
problem.
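
The block number in the EXT2-fs error is suspicious on its face: 4294965248
equals 2^32 - 2048, which is what -2048 looks like as an unsigned 32-bit
value, while the dumpe2fs output posted later in the thread reports a first
block of 0 and a block count of only 2028288. A minimal sketch of a range
check of the kind that would produce this message follows; the helper names
are hypothetical and this is not the kernel's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: nonzero when [block, block + count) falls outside
 * the range [first_data_block, blocks_count). */
int outside_datazone(uint32_t block, uint32_t count,
                     uint32_t first_data_block, uint32_t blocks_count)
{
    assert(first_data_block <= blocks_count);
    /* Use 64-bit arithmetic so block + count cannot wrap around. */
    uint64_t end = (uint64_t)block + count;
    return block < first_data_block || end > blocks_count;
}

/* Hypothetical helper: interpret a 32-bit pattern as a signed value,
 * without relying on implementation-defined conversions. */
int64_t as_signed32(uint32_t value)
{
    return value > INT32_MAX ? (int64_t)value - 4294967296LL : (int64_t)value;
}
```

With the values from this report, outside_datazone(4294965248u, 6872, 0,
2028288) is true, and as_signed32(4294965248u) is -2048.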

Ben

--
.----------=======-=-======-=========-----------=====------------=-=-----.
/ Ben Collins -- Debian GNU/Linux \
` [email protected] -- [email protected] -- [email protected] '
`---=========------=======-------------=-=-----=-===-======-------=--=---'

2001-11-15 17:23:56

by Andrew Morton

Subject: Re: Bug in ext3

Ben Collins wrote:
>
> I recently compiled support for ext3 into the kernel (2.4.15-pre4) and
> booted that kernel onto a system that didn't have any ext3 partitions.
> On boot I got these messages:
>
> JBD: no valid journal superblock found
> JBD: no valid journal superblock found
> EXT3-fs: error loading journal.
>

It sounds like the superblock claims to be an ext3 fs, but something
has scrogged the journal file.

e2fsck should have removed the journal in this situation, with
the message "*** ext3 journal has been deleted - filesystem is
now ext2 only ***".

Please send the output of dumpe2fs, and of `fsck -fy'.

Probably you can repair it by booting with `init=/bin/sh' and
running `tune2fs -O ^has-journal' against the fs, and then fscking
it.

2001-11-15 20:35:18

Subject: Re: Bug in ext3

On Thu, Nov 15, 2001 at 09:22:53AM -0800, Andrew Morton wrote:
> Ben Collins wrote:
> >
> > I recently compiled support for ext3 into the kernel (2.4.15-pre4) and
> > booted that kernel onto a system that didn't have any ext3 partitions.
> > On boot I got these messages:
> >
> > JBD: no valid journal superblock found
> > JBD: no valid journal superblock found
> > EXT3-fs: error loading journal.
> >
>
> It sounds like the superblock claims to be an ext3 fs, but something
> has scrogged the journal file.
>
> e2fsck should have removed the journal in this situation, with
> the message "*** ext3 journal has been deleted - filesystem is
> now ext2 only ***".
>
> Please send the output of dumpe2fs, and of `fsck -fy'.

No, it has always been an ext2 filesystem, and never was ext3. Fsck
shows no errors. The point being that I do _not_ want my root filesystem
to be ext3, but I do want ext3 built into the kernel. That case should
not cause a problem like I have seen.

Ben

--
.----------=======-=-======-=========-----------=====------------=-=-----.
/ Ben Collins -- Debian GNU/Linux \
` [email protected] -- [email protected] -- [email protected] '
`---=========------=======-------------=-=-----=-===-======-------=--=---'

2001-11-15 20:49:28

by Andrew Morton

Subject: Re: Bug in ext3

Ben Collins wrote:
>
> > Please send the output of dumpe2fs, and of `fsck -fy'.
>
> No, it has always been an ext2 filesystem, and never was ext3. Fsck
> shows no errors. The point being that I do _not_ want my root filesystem
> to be ext3, but I do want ext3 built into the kernel. That case should
> not cause a problem like I have seen.
>

ext3 thinks that the filesystem's superblock has the
EXT3_FEATURE_COMPAT_HAS_JOURNAL bit set in the s_feature_compat
field of the on-disk superblock.

It's probable that that bit _is_ set. ext2 will never notice it.

Please: the dumpe2fs output?
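
For illustration, the test described above can be sketched in userspace
against a raw 1024-byte superblock image. The flag value 0x0004 is the one
quoted later in this thread; the byte offset of s_feature_compat (92) and
the little-endian encoding are assumptions taken from the usual ext2 on-disk
layout, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed offset of s_feature_compat inside the 1024-byte superblock. */
#define SB_OFF_FEATURE_COMPAT      92
/* Value of EXT3_FEATURE_COMPAT_HAS_JOURNAL as quoted in this thread. */
#define FEATURE_COMPAT_HAS_JOURNAL 0x0004u

/* Read a little-endian 32-bit value regardless of host byte order. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return nonzero if the superblock image advertises a journal. */
int sb_has_journal(const unsigned char *sb)
{
    assert(sb != NULL);
    return (read_le32(sb + SB_OFF_FEATURE_COMPAT) &
            FEATURE_COMPAT_HAS_JOURNAL) != 0;
}
```

The ext2 driver never looks at this compat bit, which is why the same
superblock mounts without complaint on a kernel that lacks ext3.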

2001-11-15 21:03:08

Subject: Re: Bug in ext3

On Thu, Nov 15, 2001 at 12:48:26PM -0800, Andrew Morton wrote:
> Ben Collins wrote:
> >
> > > Please send the output of dumpe2fs, and of `fsck -fy'.
> >
> > No, it has always been an ext2 filesystem, and never was ext3. Fsck
> > shows no errors. The point being that I do _not_ want my root filesystem
> > to be ext3, but I do want ext3 built into the kernel. That case should
> > not cause a problem like I have seen.
> >
>
> ext3 thinks that the filesystem's superblock has the
> EXT3_FEATURE_COMPAT_HAS_JOURNAL bit set in the s_feature_compat
> field of the on-disk superblock.
>
> It's probable that that bit _is_ set. ext2 will never notice it.
>
> Please: the dumpe2fs output?

Seems it does have the field set. I guess the bug is then that if there
is no journal, then it should fail to mount it, so ext2 will take over.
Is there any reason to mount a partition as ext3 if there is no journal
to be found?

Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: <none>
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal filetype sparse_super
Filesystem state: not clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 1015808
Block count: 2028288
Reserved block count: 101414
Free blocks: 372624
Free inodes: 690438
First block: 0
Block size: 4096
Fragment size: 4096
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 16384
Inode blocks per group: 512
Last mount time: Thu Nov 15 10:07:12 2001
Last write time: Thu Nov 15 15:55:23 2001
Mount count: 2
Maximum mount count: 20
Last checked: Thu Nov 15 08:48:40 2001
Check interval: 15552000 (6 months)
Next check after: Tue May 14 09:48:40 2002
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 128

--
.----------=======-=-======-=========-----------=====------------=-=-----.
/ Ben Collins -- Debian GNU/Linux \
` [email protected] -- [email protected] -- [email protected] '
`---=========------=======-------------=-=-----=-===-======-------=--=---'

2001-11-15 21:11:29

by Andrew Morton

Subject: Re: Bug in ext3

Ben Collins wrote:
>
> Seems it does have the field set. I guess the bug is then that if there
> is no journal, then it should fail to mount it, so ext2 will take over.
> Is there any reason to mount a partition as ext3 if there is no journal
> to be found?
>
> Filesystem volume name: <none>
> Last mounted on: <not available>
> Filesystem UUID: <none>
> Filesystem magic number: 0xEF53
> Filesystem revision #: 1 (dynamic)
> Filesystem features: has_journal filetype sparse_super
> Filesystem state: not clean
> Errors behavior: Continue
> Filesystem OS type: Linux
> Inode count: 1015808
> Block count: 2028288
> Reserved block count: 101414
> Free blocks: 372624
> Free inodes: 690438
> First block: 0
> Block size: 4096
> Fragment size: 4096
> Blocks per group: 32768
> Fragments per group: 32768
> Inodes per group: 16384
> Inode blocks per group: 512
> Last mount time: Thu Nov 15 10:07:12 2001
> Last write time: Thu Nov 15 15:55:23 2001
> Mount count: 2
> Maximum mount count: 20
> Last checked: Thu Nov 15 08:48:40 2001
> Check interval: 15552000 (6 months)
> Next check after: Tue May 14 09:48:40 2002
> Reserved blocks uid: 0 (user root)
> Reserved blocks gid: 0 (group root)
> First inode: 11
> Inode size: 128

Are you running a current version of e2fsprogs? 1.25?

If you are, then this indicates that the filesystem has has_journal
set, but it doesn't have a journal inode. That is certainly something
which e2fsck should detect and fix. This may be a fsck bug.

You should be able to fix this with `tune2fs -O ^has-journal' on
the unmounted or readonly fs.

2001-11-15 22:00:09

by Andreas Dilger

Subject: Re: Bug in ext3

On Nov 15, 2001 16:02 -0500, Ben Collins wrote:
> On Thu, Nov 15, 2001 at 12:48:26PM -0800, Andrew Morton wrote:
> > ext3 thinks that the filesystem's superblock has the
> > EXT3_FEATURE_COMPAT_HAS_JOURNAL bit set in the s_feature_compat
> > field of the on-disk superblock.
> >
> > It's probable that that bit _is_ set. ext2 will never notice it.
> >
> > Please: the dumpe2fs output?
>
> Seems it does have the field set. I guess the bug is then that if there
> is no journal, then it should fail to mount it, so ext2 will take over.
> Is there any reason to mount a partition as ext3 if there is no journal
> to be found?

It _does_ fail to mount the filesystem as ext3, and the ext2 code properly
mounts it. You can see this because the error message you got (in your
previous posting) said "EXT2-fs: ..." so the error came from ext2.

Please run e2fsck (1.25) to clear this up. It may be that you have other
corruption in your filesystem. If you are sure you _never_ tried ext3
on this filesystem before, yet the has_journal bit is set, this could
be an indication of memory or cable problems.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2001-11-15 22:05:29

Subject: Re: Bug in ext3

On Thu, Nov 15, 2001 at 01:10:31PM -0800, Andrew Morton wrote:
>
> Are you running a current version of e2fsprogs? 1.25?
>
> If you are, then this indicates that the filesystem has has_journal
> set, but it doesn't have a journal inode. That is certainly something
> which e2fsck should detect and fix. This may be a fsck bug.
>
> You should be able to fix this with `tune2fs -O ^has-journal' on
> the unmounted or readonly fs.

Actually, it's 1.18. I'll upgrade e2fsprogs. At the same time, would it
not be prudent to make ext3 fail to mount if it cannot setup the
journal? If it decides to keep going in the event that there is no
journal, it should not break like it did.

Ben

--
.----------=======-=-======-=========-----------=====------------=-=-----.
/ Ben Collins -- Debian GNU/Linux \
` [email protected] -- [email protected] -- [email protected] '
`---=========------=======-------------=-=-----=-===-======-------=--=---'

2001-11-15 22:07:59

Subject: Re: Bug in ext3

On Thu, Nov 15, 2001 at 02:58:03PM -0700, Andreas Dilger wrote:
> On Nov 15, 2001 16:02 -0500, Ben Collins wrote:
> > On Thu, Nov 15, 2001 at 12:48:26PM -0800, Andrew Morton wrote:
> > > ext3 thinks that the filesystem's superblock has the
> > > EXT3_FEATURE_COMPAT_HAS_JOURNAL bit set in the s_feature_compat
> > > field of the on-disk superblock.
> > >
> > > It's probable that that bit _is_ set. ext2 will never notice it.
> > >
> > > Please: the dumpe2fs output?
> >
> > Seems it does have the field set. I guess the bug is then that if there
> > is no journal, then it should fail to mount it, so ext2 will take over.
> > Is there any reason to mount a partition as ext3 if there is no journal
> > to be found?
>
> It _does_ fail to mount the filesystem as ext3, and the ext2 code properly
> mounts it. You can see this because the error message you got (in your
> previous posting) said "EXT2-fs: ..." so the error came from ext2.
>
> Please run e2fsck (1.25) to clear this up. It may be that you have other
> corruption in your filesystem. If you are sure you _never_ tried ext3
> on this filesystem before, yet the has_journal bit is set, this could
> be an indication of memory or cable problems.

Uh, something corrupted it. Believe me, there is no other corruption.
I've reverted to a non-ext3 kernel, and after a day of serious IO, no
problems have shown. So something is wrong, and it isn't my filesystem
(the erroneous flag needs to be cleared, yes, but the fact remains that
there is a problem in this case).

Ben

--
.----------=======-=-======-=========-----------=====------------=-=-----.
/ Ben Collins -- Debian GNU/Linux \
` [email protected] -- [email protected] -- [email protected] '
`---=========------=======-------------=-=-----=-===-======-------=--=---'

2001-11-15 22:49:30

Subject: Re: Bug in ext3

On Thu, Nov 15, 2001 at 05:06:28PM -0500, Ben Collins wrote:
> On Thu, Nov 15, 2001 at 02:58:03PM -0700, Andreas Dilger wrote:
> > On Nov 15, 2001 16:02 -0500, Ben Collins wrote:
> > > On Thu, Nov 15, 2001 at 12:48:26PM -0800, Andrew Morton wrote:
> > > > ext3 thinks that the filesystem's superblock has the
> > > > EXT3_FEATURE_COMPAT_HAS_JOURNAL bit set in the s_feature_compat
> > > > field of the on-disk superblock.
> > > >
> > > > It's probable that that bit _is_ set. ext2 will never notice it.
> > > >
> > > > Please: the dumpe2fs output?
> > >
> > > Seems it does have the field set. I guess the bug is then that if there
> > > is no journal, then it should fail to mount it, so ext2 will take over.
> > > Is there any reason to mount a partition as ext3 if there is no journal
> > > to be found?
> >
> > It _does_ fail to mount the filesystem as ext3, and the ext2 code properly
> > mounts it. You can see this because the error message you got (in your
> > previous posting) said "EXT2-fs: ..." so the error came from ext2.
> >
> > Please run e2fsck (1.25) to clear this up. It may be that you have other
> > corruption in your filesystem. If you are sure you _never_ tried ext3
> > on this filesystem before, yet the has_journal bit is set, this could
> > be an indication of memory or cable problems.
>
> Uh, something corrupted it. Believe me, there is no other corruption.
> I've reverted to a non-ext3 kernel, and after a day of serious IO, no
> problems have shown. So something is wrong, and it isn't my filesystem
> (the erroneous flag needs to be cleared, yes, but the fact remains that
> there is a problem in this case).

Yes, true.

Can you try again with a journal in this FS with ext3?

2001-11-15 22:59:02

Subject: Re: Bug in ext3

On Thu, Nov 15, 2001 at 02:49:04PM -0800, Mike Fedyk wrote:
> > >
> > > Please run e2fsck (1.25) to clear this up. It may be that you have other
> > > corruption in your filesystem. If you are sure you _never_ tried ext3
> > > on this filesystem before, yet the has_journal bit is set, this could
> > > be an indication of memory or cable problems.
> >
> > Uh, something corrupted it. Believe me, there is no other corruption.
> > I've reverted to a non-ext3 kernel, and after a day of serious IO, no
> > problems have shown. So something is wrong, and it isn't my filesystem
> > (the erroneous flag needs to be cleared, yes, but the fact remains that
> > there is a problem in this case).
>
> Yes, true.
>
> Can you try again with a journal in this FS with ext3?

Can't really do that. This is a production system. I will however try to
reproduce it on another system. If I can do that, I'll try your
suggestion.

Ben

--
.----------=======-=-======-=========-----------=====------------=-=-----.
/ Ben Collins -- Debian GNU/Linux \
` [email protected] -- [email protected] -- [email protected] '
`---=========------=======-------------=-=-----=-===-======-------=--=---'

2001-11-15 23:23:21

by Andreas Dilger

Subject: Re: Bug in ext3

On Nov 15, 2001 17:06 -0500, Ben Collins wrote:
> On Thu, Nov 15, 2001 at 02:58:03PM -0700, Andreas Dilger wrote:
> > Please run e2fsck (1.25) to clear this up. It may be that you have other
> > corruption in your filesystem. If you are sure you _never_ tried ext3
> > on this filesystem before, yet the has_journal bit is set, this could
> > be an indication of memory or cable problems.
>
> Uh, something corrupted it. Believe me, there is no other corruption.
> I've reverted to a non-ext3 kernel, and after a day of serious IO, no
> problems have shown. So something is wrong, and it isn't my filesystem
> (the erroneous flag needs to be cleared, yes, but the fact remains that
> there is a problem in this case).

I don't disagree that something corrupted it, but it is hard to tell from
here what it could be. Looking at ext3_read_super(), it is pretty much
a read-only path, except journal recovery. If, for some reason, you had
an old, unrecovered ext3 journal in the fs, it is possible that recovering
from it would corrupt your fs by writing old data into the fs.

This _shouldn't_ happen with newer kernels, but with old 2.2 ext3 code
this was a possibility. Also, with old e2fsck code (1.18 was right at the
very beginning when ext3 support was being added) it is possible that it
didn't fail because of the has_journal flag, but it wasn't smart enough
to detect and remove an old corrupt journal. I'm not saying this is a
likely scenario either, but we don't have much to go on.

Are you _absolutely_ sure that there was never any ext3 testing done a
long time ago on this fs? Looking at it closer, it seems unlikely that
random corruption could cause something like the above scenario, because
there are many checks before we get to journal recovery.

Looking at the ext3_read_super() path, we call ext3_load_journal(),
which calls journal_wipe() and journal_load(), which both call
load_superblock()->journal_get_superblock(). That
gives us the two "JBD: no valid journal superblock found" messages,
and both return errors before doing anything else. Then we get the message
"EXT3-fs: error loading journal." and return without mounting the fs.

Hmm, there is a possibility that journal_destroy() calling
journal_update_superblock() is scribbling data into the first block of
the old "journal". Stephen, Andrew, we need to exit from
journal_get_superblock() with j_sb_buffer = NULL, and then check for
this in journal_destroy() so we don't call journal_update_superblock().
How does the below patch look?

Ben, unfortunately, the dumpe2fs 1.18 output doesn't show the journal
fields. When you upgrade to e2fsprogs 1.25, can you run dumpe2fs -h <dev>
again and send the output, so we can see what is all set? It appears
that at least the "has_journal" flag and a s_journal_inum are set for it
to get as far as it did. The new e2fsck will clean that up properly.

Cheers, Andreas
=========================================================================
--- ext3/fs/jbd/journal.c Thu Nov 1 00:47:55 2001
+++ linux/fs/jbd/journal.c Thu Nov 15 16:09:46 2001
@@ -900,7 +900,7 @@
 	if (sb->s_header.h_magic != htonl(JFS_MAGIC_NUMBER) ||
 	    sb->s_blocksize != htonl(journal->j_blocksize)) {
 		printk(KERN_WARNING "JBD: no valid journal superblock found\n");
-		return -EINVAL;
+		goto error;
 	}

 	switch(ntohl(sb->s_header.h_blocktype)) {
@@ -912,17 +912,22 @@
 		break;
 	default:
 		printk(KERN_WARNING "JBD: unrecognised superblock format ID\n");
-		return -EINVAL;
+		goto error;
 	}

 	if (ntohl(sb->s_maxlen) < journal->j_maxlen)
 		journal->j_maxlen = ntohl(sb->s_maxlen);
 	else if (ntohl(sb->s_maxlen) > journal->j_maxlen) {
 		printk (KERN_WARNING "JBD: journal file too short\n");
-		return -EINVAL;
+		goto error;
 	}

 	return 0;
+
+error:
+	brelse(bh);
+	journal->j_sb_buffer = NULL;
+	return -EINVAL;
 }

 /*
@@ -1027,7 +1032,8 @@
 	/* We can now mark the journal as empty. */
 	journal->j_tail = 0;
 	journal->j_tail_sequence = ++journal->j_transaction_sequence;
-	journal_update_superblock(journal, 1);
+	if (journal->j_sb_buffer)
+		journal_update_superblock(journal, 1);

 	if (journal->j_inode)
 		iput(journal->j_inode);
@@ -1716,7 +1710,6 @@
 	journal_destroy_caches();
 }

+MODULE_LICENSE("GPL");
 module_init(journal_init);
 module_exit(journal_exit);
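
The intent of the patch can be modeled outside the kernel. The sketch below
is a toy, not JBD code: struct toy_journal and both functions are
hypothetical stand-ins, with sb_buffer playing the role of
journal->j_sb_buffer and a counter standing in for the write done by
journal_update_superblock().

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for journal_t: only the fields the guard depends on. */
struct toy_journal {
    void *sb_buffer;        /* models journal->j_sb_buffer */
    int superblock_writes;  /* counts simulated superblock write-backs */
};

/* Models journal_get_superblock(): on a bad superblock, drop the buffer
 * so that later teardown has nothing to write into. */
int toy_get_superblock(struct toy_journal *j, void *buffer, int valid)
{
    assert(j != NULL);
    if (!valid) {
        j->sb_buffer = NULL;  /* corresponds to the patch's "error:" path */
        return -1;            /* the kernel code returns -EINVAL here */
    }
    j->sb_buffer = buffer;
    return 0;
}

/* Models the patched journal_destroy(): write back only if a validated
 * superblock buffer is still held. */
void toy_destroy(struct toy_journal *j)
{
    assert(j != NULL);
    if (j->sb_buffer)
        j->superblock_writes++;
}
```

In this model, a failed toy_get_superblock() followed by toy_destroy()
leaves superblock_writes at 0, which is the behaviour the patch aims for:
nothing is written into a block that never passed validation.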

--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2001-11-15 23:39:31

Subject: Re: Bug in ext3

On Thu, Nov 15, 2001 at 04:21:49PM -0700, Andreas Dilger wrote:
> On Nov 15, 2001 17:06 -0500, Ben Collins wrote:
> > On Thu, Nov 15, 2001 at 02:58:03PM -0700, Andreas Dilger wrote:
> > > Please run e2fsck (1.25) to clear this up. It may be that you have other
> > > corruption in your filesystem. If you are sure you _never_ tried ext3
> > > on this filesystem before, yet the has_journal bit is set, this could
> > > be an indication of memory or cable problems.
> >
> > Uh, something corrupted it. Believe me, there is no other corruption.
> > I've reverted to a non-ext3 kernel, and after a day of serious IO, no
> > problems have shown. So something is wrong, and it isn't my filesystem
> > (the erroneous flag needs to be cleared, yes, but the fact remains that
> > there is a problem in this case).
>
> I don't disagree that something corrupted it, but it is hard to tell from
> here what it could be. Looking at ext3_read_super(), it is pretty much
> a read-only path, except journal recovery. If, for some reason, you had
> an old, unrecovered ext3 journal in the fs, it is possible that recovering
> from it would corrupt your fs by writing old data into the fs.
>
> This _shouldn't_ happen with newer kernels, but with old 2.2 ext3 code
> this was a possibility. Also, with old e2fsck code (1.18 was right at the
> very beginning when ext3 support was being added) it is possible that it
> didn't fail because of the has_journal flag, but it wasn't smart enough
> to detect and remove an old corrupt journal. I'm not saying this is a
> likely scenario either, but we don't have much to go on.

I won't say that I am absolutely 100% sure that ext3 was never tried on
this filesystem. I am pretty certain, but I'm guessing it doesn't really
make much difference at this point. Your scenario of the corruption
makes sense. I'll see if I can test your patch at some point (but I most
likely cannot).

Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: <none>
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal filetype sparse_super
Filesystem state: not clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 1015808
Block count: 2028288
Reserved block count: 101414
Free blocks: 368490
Free inodes: 688732
First block: 0
Block size: 4096
Fragment size: 4096
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 16384
Inode blocks per group: 512
Last mount time: Thu Nov 15 10:07:12 2001
Last write time: Thu Nov 15 18:29:55 2001
Mount count: 2
Maximum mount count: 20
Last checked: Thu Nov 15 08:48:40 2001
Check interval: 15552000 (6 months)
Next check after: Tue May 14 09:48:40 2002
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 128
Journal UUID: <none>
Journal inode: 48
Journal device: 0x0000
First orphan inode: 0

--
.----------=======-=-======-=========-----------=====------------=-=-----.
/ Ben Collins -- Debian GNU/Linux \
` [email protected] -- [email protected] -- [email protected] '
`---=========------=======-------------=-=-----=-===-======-------=--=---'

2001-11-16 00:09:24

by Andreas Dilger

Subject: Re: Bug in ext3

On Nov 15, 2001 18:38 -0500, Ben Collins wrote:
> I won't say that I am absolutely 100% sure that ext3 was never tried on
> this filesystem. I am pretty certain, but I'm guessing it doesn't really
> make much difference at this point. Your scenario of the corruption
> makes sense. I'll see if I can test your patch at some point (but I most
> likely cannot).
>
> Filesystem features: has_journal filetype sparse_super
> ...
> Journal inode: 48

What I'm thinking happened here is at some point long ago (maybe before
the server was put into production, who knows) someone tested ext3 on
it. When they were done, they deleted the .journal file (inode #48)
but e2fsck didn't clean up the superblock journal fields, and it went
unnoticed until now.

The other alternative is that you have some sort of random single-bit
data corruption going on (the journal inode is also a single bit set,
48 = 0x30, but a different bit than has_journal, = 0x0004).

In any case, with e2fsprogs 1.18 (and probably _only_ that version)
it doesn't complain about has_journal, but it also doesn't check if the
journal is bad and clean it up. When you try to start with an ext3-aware
kernel, it conspires to corrupt inode 48 when it tries to unload the
journal, even when it knows the journal is bad.

What would be interesting to correlate is what inode 48 is (probably a
directory, or you wouldn't have noticed it at all), with the corruption
problems you are having while ext3 is loaded.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2001-11-16 00:56:24

Subject: Re: Bug in ext3

On Thu, Nov 15, 2001 at 05:07:42PM -0700, Andreas Dilger wrote:
> On Nov 15, 2001 18:38 -0500, Ben Collins wrote:
> > I won't say that I am absolutely 100% sure that ext3 was never tried on
> > this filesystem. I am pretty certain, but I'm guessing it doesn't really
> > make much difference at this point. Your scenario of the corruption
> > makes sense. I'll see if I can test your patch at some point (but I most
> > likely cannot).
> >
> > Filesystem features: has_journal filetype sparse_super
> > ...
> > Journal inode: 48
>
> What I'm thinking happened here is at some point long ago (maybe before
> the server was put into production, who knows) someone tested ext3 on
> it. When they were done, they deleted the .journal file (inode #48)
> but e2fsck didn't clean up the superblock journal fields, and it went
> unnoticed until now.
>
> The other alternative is that you have some sort of random single-bit
> data corruption going on (the journal inode is also a single bit set,
> 48 = 0x30, but a different bit than has_journal, = 0x0004).
>
> In any case, with e2fsprogs 1.18 (and probably _only_ that version)
> it doesn't complain about has_journal, but it also doesn't check if the
> journal is bad and clean it up. When you try to start with an ext3-aware
> kernel, it conspires to corrupt inode 48 when it tries to unload the
> journal, even when it knows the journal is bad.
>
> What would be interesting to correlate is what inode 48 is (probably a
> directory, or you wouldn't have noticed it at all), with the corruption
> problems you are having while ext3 is loaded.

48 /usr/lib/perl5/5.005/File/Copy.pm

Since this file is pretty small, I can only assume that it overwrote
some adjacent files. There is some corruption in this file (luckily in
the comment area :) starting at the 25th byte, and extending 12 bytes in
length. Here's the values from hexedit:

00 00 00 01 00 00 00 00 00 00 00 00

--
.----------=======-=-======-=========-----------=====------------=-=-----.
/ Ben Collins -- Debian GNU/Linux \
` [email protected] -- [email protected] -- [email protected] '
`---=========------=======-------------=-=-----=-===-======-------=--=---'

2001-11-16 03:11:07

by Andreas Dilger

Subject: Re: Bug in ext3

On Nov 15, 2001 19:55 -0500, Ben Collins wrote:
> On Thu, Nov 15, 2001 at 05:07:42PM -0700, Andreas Dilger wrote:
> > What would be interesting to correlate is what inode 48 is (probably a
> > directory, or you wouldn't have noticed it at all), with the corruption
> > problems you are having while ext3 is loaded.
>
> 48 /usr/lib/perl5/5.005/File/Copy.pm
>
> Since this file is pretty small, I can only assume that it overwrote
> some adjacent files. There is some corruption in this file (luckily in
> the comment area :) starting at the 25th byte, and extending 12 bytes in
> length. Here's the values from hexedit:
>
> 00 00 00 01 00 00 00 00 00 00 00 00

No, this would be the only expected corruption - there are 3 32-bit
fields that get written to disk in journal_update_superblock(),
and these are consistent with that.
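
Read as three big-endian 32-bit values (JBD keeps its superblock in network
byte order, as the htonl() calls in the patch earlier in the thread
suggest), the 12 bytes Ben reported decode to 1, 0 and 0. The sketch below
does that decoding. The struct and function names are made up for
illustration, and the mapping of the three values to s_sequence, s_start
and s_errno is an assumption based on the JBD superblock layout, not
something stated in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Three consecutive big-endian 32-bit fields; the member comments give the
 * JBD superblock fields they are assumed to correspond to. */
struct jsb_fields {
    uint32_t sequence;  /* assumed s_sequence */
    uint32_t start;     /* assumed s_start */
    uint32_t err;       /* assumed s_errno */
};

/* Read a big-endian 32-bit value regardless of host byte order. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Decode 12 bytes into the three fields. */
struct jsb_fields decode_jsb_fields(const unsigned char *bytes)
{
    struct jsb_fields f;

    assert(bytes != NULL);
    f.sequence = read_be32(bytes);
    f.start = read_be32(bytes + 4);
    f.err = read_be32(bytes + 8);
    return f;
}
```

A sequence of 1 with a start of 0 is what one would expect if the journal
had just been marked empty, which fits the journal_destroy() code touched
by the patch.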

That means the source of the other corruption is unknown.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2001-11-16 03:21:27

Subject: Re: Bug in ext3

On Thu, Nov 15, 2001 at 08:09:16PM -0700, Andreas Dilger wrote:
> On Nov 15, 2001 19:55 -0500, Ben Collins wrote:
> > On Thu, Nov 15, 2001 at 05:07:42PM -0700, Andreas Dilger wrote:
> > > What would be interesting to correlate is what inode 48 is (probably a
> > > directory, or you wouldn't have noticed it at all), with the corruption
> > > problems you are having while ext3 is loaded.
> >
> > 48 /usr/lib/perl5/5.005/File/Copy.pm
> >
> > Since this file is pretty small, I can only assume that it overwrote
> > some adjacent files. There is some corruption in this file (luckily in
> > the comment area :) starting at the 25th byte, and extending 12 bytes in
> > length. Here's the values from hexedit:
> >
> > 00 00 00 01 00 00 00 00 00 00 00 00
>
> No, this would be the only expected corruption - there are 3 32-bit
> fields that get written to disk in journal_update_superblock(),
> and these are consistent with that.
>
> That means the source of the other corruption is unknown.

The "other" corruption only occurred while booted with the ext3-enabled
kernel. They haven't appeared under the non-ext3 kernel at all. Even
after it got mounted read-only, performing an fsck, and remounting
read-write, it would reoccur over and over. So this "other" corruption
doesn't even sound like it can be caused by the scenario you described
(which sounds like a one shot problem).

Ben

--
.----------=======-=-======-=========-----------=====------------=-=-----.
/ Ben Collins -- Debian GNU/Linux \
` [email protected] -- [email protected] -- [email protected] '
`---=========------=======-------------=-=-----=-===-======-------=--=---'

2001-11-16 03:37:58

Subject: Re: Bug in ext3

On Thu, Nov 15, 2001 at 10:20:55PM -0500, Ben Collins wrote:
> On Thu, Nov 15, 2001 at 08:09:16PM -0700, Andreas Dilger wrote:
> > On Nov 15, 2001 19:55 -0500, Ben Collins wrote:
> > > On Thu, Nov 15, 2001 at 05:07:42PM -0700, Andreas Dilger wrote:
> > > > What would be interesting to correlate is what inode 48 is (probably a
> > > > directory, or you wouldn't have noticed it at all), with the corruption
> > > > problems you are having while ext3 is loaded.
> > >
> > > 48 /usr/lib/perl5/5.005/File/Copy.pm
> > >
> > > Since this file is pretty small, I can only assume that it overwrote
> > > some adjacent files. There is some corruption in this file (luckily in
> > > the comment area :) starting at the 25th byte, and extending 12 bytes in
> > > length. Here's the values from hexedit:
> > >
> > > 00 00 00 01 00 00 00 00 00 00 00 00
> >
> > No, this would be the only expected corruption - there are 3 32-bit
> > fields that get written to disk in journal_update_superblock(),
> > and these are consistent with that.
> >
> > That means the source of the other corruption is unknown.
>
> The "other" corruption only occurred while booted with the ext3-enabled
> kernel. They haven't appeared under the non-ext3 kernel at all. Even
> after it got mounted read-only, performing an fsck, and remounting
> read-write, it would reoccur over and over. So this "other" corruption
> doesn't even sound like it can be caused by the scenario you described
> (which sounds like a one shot problem).

Ben, just for kicks, try ext3 with 2.4.14 instead of 2.4.15-pre4.

Have you been able to reproduce this on another machine?

Also, post the output of lspci -vv.

Mike

2001-11-16 04:52:26

by Andreas Dilger

Subject: Re: Bug in ext3

On Nov 15, 2001 22:20 -0500, Ben Collins wrote:
> On Thu, Nov 15, 2001 at 08:09:16PM -0700, Andreas Dilger wrote:
> > That means the source of the other corruption is unknown.
>
> The "other" corruption only occurred while booted with the ext3-enabled
> kernel. They haven't appeared under the non-ext3 kernel at all. Even
> after it got mounted read-only, performing an fsck, and remounting
> read-write, it would reoccur over and over. So this "other" corruption
> doesn't even sound like it can be caused by the scenario you described
> (which sounds like a one shot problem).

I'm not saying it wasn't caused by ext3, I'm just saying that it is
unknown.

The problem I described _would_ happen each time a mount of the filesystem
as ext3 was attempted, but it would only affect the single file given
as the journal inode (i.e. inode #48 in your case).

You say that remounting the root fs read-only, running fsck, and then
remounting read-write would cause the filesystem to be corrupted? In that
case, the problem _has_ to be in ext2, because the root filesystem was
mounted as ext2, so there is _no_ way that ext3 could corrupt it again
(the ext2 and ext3 code paths are completely separate).

Note also, that if you have a serious corruption in the filesystem,
and you run e2fsck on it while it is mounted (read only, presumably),
then you need to reboot the system after e2fsck is done. Otherwise,
the kernel cache may not be consistent with what is on disk, and anything
that gets written to the disk again from the kernel has the possibility
of corrupting the filesystem.

Normally, this doesn't happen, but it is possible. With the directories
in the page cache (since 2.4.5 or so), the block device and the directory
pages are not coherent. This makes it much more important to reboot
if e2fsck makes major changes to the filesystem. Regular files have
been in the page cache for a long time, but e2fsck does not change the
contents of files, so it is not normally an issue.

This _might_ be worked around by e2fsck calling an ioctl to flush
the kernel caches, but that would probably be complex, and given the
complexity of system startup scripts today it is hard to limit the
number of affected directories/processes. An alternative would be to put
e2fsck into the initrd, so that root can be checked before it is mounted,
but that is also a change to the way systems are started.

There _may_ also be some issues with the block devices in page cache (i.e.
a disconnect in the inode tables or bitmaps between what the block device
user (e2fsck) sees and what the kernel sees), but that would be an outright
bug.
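
As a rough illustration of the hazard described above, here is a toy
Python model of a cache that is never invalidated. It is not kernel code:
ToyBlockCache, fsck_repair and the block name "dir_block" are hypothetical
names made up for the sketch, and a dict stands in for the disk.

```python
class ToyBlockCache:
    """Toy model: a 'disk' plus a kernel-side cache that is never invalidated."""

    def __init__(self, disk: dict):
        self.disk = disk
        self.cache = {}

    def read(self, block: str) -> str:
        # The kernel reads through its cache and keeps the copy afterwards.
        if block not in self.cache:
            self.cache[block] = self.disk[block]
        return self.cache[block]

    def modify_and_write_back(self, block: str, suffix: str) -> None:
        # A later change is applied to the cached copy, then flushed to disk.
        self.cache[block] = self.read(block) + suffix
        self.disk[block] = self.cache[block]


def fsck_repair(disk: dict, block: str, fixed: str) -> None:
    # Hypothetical stand-in for e2fsck: writes to the device directly,
    # without touching the kernel's cached copy.
    disk[block] = fixed


disk = {"dir_block": "bad-entry"}
kernel = ToyBlockCache(disk)
kernel.read("dir_block")                            # stale copy is now cached
fsck_repair(disk, "dir_block", "good-entry")        # on-disk repair
kernel.modify_and_write_back("dir_block", "+new")   # flush of the stale copy
print(disk["dir_block"])
```

In this model the repair is lost as soon as the stale cached copy is
written back. Starting with an empty cache after the repair, which is what
a reboot amounts to, avoids that.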

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2001-11-16 14:52:58

by Stephen C. Tweedie

Subject: Re: Bug in ext3

Hi,

On Thu, Nov 15, 2001 at 04:02:32PM -0500, Ben Collins wrote:
>
> Seems it does have the field set. I guess the bug is then that if there
> is no journal, then it should fail to mount it, so ext2 will take over.

It _did_ fail to mount it, and ext2 _did_ take over: that's why, when
ext2 found something else wrong with the filesystem, the errors
looked like

EXT2-fs error (device sd(8,20)): ext2_free_blocks: Freeing blocks not
in datazone - block = 4294965248, count = 6872

which were marked as ext2 warnings, not ext3 warnings.

Cheers,
Stephen
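
The block number in that error message is worth a quick look. The Python
sketch below shows that it sits just under the 32-bit limit, and that it
fails a simple range test. The in_datazone() function is a simplified,
assumed form of such a test, not the kernel's code, and the 0xFFFFFFFF
block count is only the largest value a 32-bit field could hold.

```python
BLOCK = 4294965248  # block number from the EXT2-fs error
COUNT = 6872        # count from the same message


def as_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as two's-complement signed."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def in_datazone(block: int, count: int, first_data_block: int,
                blocks_count: int) -> bool:
    """Simplified range test (assumed form): the run must fit in the fs."""
    return block >= first_data_block and block + count <= blocks_count


print(hex(BLOCK))          # 0xfffff800
print(as_signed32(BLOCK))  # -2048
print(in_datazone(BLOCK, COUNT, first_data_block=0, blocks_count=0xFFFFFFFF))
```

A value of 0xfffff800 is -2048 when read as a signed 32-bit number, so it
looks more like garbage in a block pointer than like a real block address.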

2001-11-16 18:39:07

by Stephen C. Tweedie

Subject: Re: Bug in ext3

Hi,

On Thu, Nov 15, 2001 at 04:21:49PM -0700, Andreas Dilger wrote:

> I don't disagree that something corrupted it, but it is hard to tell from
> here what it could be. Looking at ext3_read_super(), it is pretty much
> a read-only path, except journal recovery. If, for some reason, you had
> an old, unrecovered ext3 journal in the fs, it is possible that recovering
> from it would corrupt your fs by writing old data into the fs.

Except that can only happen once we have found a journal to recover,
and the error that ext3 spat out indicated that it couldn't find a
journal at all and was giving up.

> Looking at the ext3_read_super() path, we call ext3_load_journal(),
> which calls journal_wipe() and journal_load(), which both call
> journal_load()->load_superblock()->journal_get_superblock(), which
> gives us the two "JBD: no valid journal superblock found" messages,
> and return errors before doing anything else. Then we get the message
> "EXT3-fs: error loading journal." and return without mounting the fs.

Indeed.

> Hmm, there is a possibility that journal_destroy() calling
> journal_update_superblock() is scribbling data into the first block of
> the old "journal". Stephen, Andrew, we need to exit from the
> journal_get_superblock() with j_sb_buffer = NULL, and then check for
> this in journal_destroy() so we don't call journal_update_superblock().
> How does the below patch look?

Looks OK. I've done a slightly better version which catches a couple
of extra cases but it's basically the same solution. I've also added
a tiny patch to prevent a failed journal_wipe() from being followed by
a journal_load() attempt, so we don't get the same error twice.

Patch below.

Cheers,
Stephen

Attachments:

(No filename) (1.68 kB)
ext3-bad-super.diff (3.49 kB)
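
The attached patch is not reproduced here. The idea discussed in the
message can be modelled in a few lines of Python: a failed superblock read
leaves no buffer behind, journal_destroy() checks for that before writing,
and a failed wipe is not followed by a load. This is a sketch of the
described logic only. The Journal class, the load_journal() helper and the
values written are made up for illustration; the function names mirror
those in the thread but the bodies are not the kernel's.

```python
import struct

JBD_MAGIC = bytes.fromhex("c03b3998")  # JBD superblock magic, big-endian


class Journal:
    """Toy stand-in for journal_t; only the fields this sketch needs."""

    def __init__(self, first_block: bytearray):
        self.first_block = first_block  # first block of the "journal" file
        self.sb_buffer = None           # stands in for j_sb_buffer


def journal_get_superblock(journal: Journal) -> bool:
    """Attach the superblock buffer only if the magic number matches."""
    if bytes(journal.first_block[:4]) != JBD_MAGIC:
        journal.sb_buffer = None  # failure leaves no buffer behind
        return False
    journal.sb_buffer = journal.first_block
    return True


def journal_update_superblock(journal: Journal) -> None:
    """Write three 32-bit fields at offset 24 (illustrative values)."""
    journal.sb_buffer[24:36] = struct.pack(">III", 1, 0, 0)


def journal_destroy(journal: Journal) -> None:
    """Only write to the block if a valid superblock was ever found."""
    if journal.sb_buffer is not None:
        journal_update_superblock(journal)


def load_journal(journal: Journal, wipe: bool) -> bool:
    """Hypothetical caller: skip the load step once the wipe step fails."""
    if wipe and not journal_get_superblock(journal):  # journal_wipe() path
        return False
    return journal_get_superblock(journal)            # journal_load() path


ordinary_file = bytearray(b"# File/Copy.pm - not a journal\n".ljust(64, b"#"))
before = bytes(ordinary_file)
journal = Journal(ordinary_file)
ok = load_journal(journal, wipe=True)
journal_destroy(journal)
print(ok, bytes(ordinary_file) == before)
```

With the guard in place, a file that merely happens to be named as the
journal inode is left untouched when the mount attempt is abandoned.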

2001-11-16 18:45:28

by Stephen C. Tweedie

Subject: Re: Bug in ext3

Hi,

On Fri, Nov 16, 2001 at 06:38:37PM +0000, Stephen C. Tweedie wrote:
> Looks OK. I've done a slightly better version which catches a couple
> of extra cases but it's basically the same solution. I've also added
> a tiny patch to prevent a failed journal_wipe() from being followed by
> a journal_load() attempt, so we don't get the same error twice.

This definitely fixes that error path: I just get one, clean error
now, and no corruption of the file that was masquerading as the
journal. It doesn't properly release the journal inode, though, so we
oops on a later ext2 mount as we think we already have the (ext3)
inode in cache. Fix to follow.

Ben, thanks for this --- this level of corrupt journal is something
that hasn't been tested in this way before.

--Stephen