2008-07-16 02:31:28

by Gary Hawco

Subject: Add ext4__allow_read-only_mounts_with_corrupted_block_group_checksums patch

Ted,

Just recompiled 2.6.26 with the newest snapshot (15-July 2008/2310GMT).
uninit_bg had been enabled.

To test kernel I did: tune2fs -O ^uninit_bg /dev/sda6 to remove uninit_bg
feature.
Was prompted to run e2fsck on system.

Rebooted and e2fsck saw errors and did a forced fsck without having to
reboot. All is well thus far.

Then I did the reverse: tune2fs -O uninit_bg /dev/sda6 to re-enable uninit_bg
feature.
Again, I was prompted to run e2fsck on system.

Rebooted. This time I got a kernel panic that started with:
EXT4-fs: ext4_check_descriptors: Checksum for group 2 failed (10368!=32990)
EXT4-fs: group descriptors corrupted!

Then the partitions available are listed followed by
No filesystem could mount root, tried: ext4dev ext2
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(8,6)

I was under the impression that the newest patch would handle this or am I
mistaken?

Thanks,
Gary



2008-07-16 09:29:17

by Andreas Dilger

Subject: Re: Add ext4__allow_read-only_mounts_with_corrupted_block_group_checksums patch

On Jul 15, 2008 19:31 +0000, Gary Hawco wrote:
> Just recompiled 2.6.26 with the newest snapshot (15-July 2008/2310GMT).
> uninit_bg had been enabled.
>
> To test kernel I did: tune2fs -O ^uninit_bg /dev/sda6 to remove uninit_bg
> feature.
> Was prompted to run e2fsck on system.
>
> Rebooted and e2fsck saw errors and did a forced fsck without having to
> reboot. All is well thus far.

You shouldn't really have rebooted at that point, just run e2fsck on
the filesystem to clear the checksums.

> Then I did the reverse: tune2fs -O uninit_bg /dev/sda6 to re-enable uninit_bg
> feature. Again, I was prompted to run e2fsck on system.

Again, you should have just run e2fsck on the filesystem without rebooting,
in order to update the checksums.

> Rebooted. This time I got a kernel panic that started with:
> EXT4-fs: ext4_check_descriptors: Checksum for group 2 failed (10368!=32990)
> EXT4-fs: group descriptors corrupted!

The kernel thinks the filesystem is corrupted, so a full e2fsck needs to
be run. However, e2fsck is probably sitting on that unmountable filesystem.

> I was under the impression that the newest patch would handle this or am I
> mistaken?

We probably need to allow the kernel to mount the filesystem read-only in
this case in order to run e2fsck.
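
To make the proposed behaviour concrete, here is a toy model of the policy
described above. It is only a sketch and not the kernel code or the actual
patch: the function name, the message wording for the read-only case, and the
use of plain integer pairs for the two checksum values are all assumptions
made for illustration.

```python
from typing import List, Tuple


def check_descriptors(
    checksums: List[Tuple[int, int]], read_only: bool
) -> Tuple[bool, List[str]]:
    """Toy model: decide whether a mount may proceed.

    `checksums` holds one pair of checksum values per block group; a pair
    that does not match marks that group descriptor as corrupted.
    Returns (mount_allowed, messages).
    """
    messages: List[str] = []
    corrupted = False
    for group, (a, b) in enumerate(checksums):
        if a != b:
            corrupted = True
            messages.append(f"Checksum for group {group} failed ({a}!={b})")

    if not corrupted:
        return True, messages

    if read_only:
        # Hypothetical wording: let the mount continue so e2fsck can run.
        messages.append(
            "group descriptors corrupted, continuing read-only so e2fsck can run"
        )
        return True, messages

    # A read-write mount is refused, as in the log above.
    messages.append("group descriptors corrupted!")
    return False, messages
```

With a mismatch in group 2, the model allows the mount only when read_only
is true; a read-write mount is still rejected.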

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


2008-07-16 13:34:10

by Theodore Ts'o

Subject: Re: Add ext4__allow_read-only_mounts_with_corrupted_block_group_checksums patch

On Wed, Jul 16, 2008 at 03:29:14AM -0600, Andreas Dilger wrote:
> > Rebooted and e2fsck saw errors and did a forced fsck without having to
> > reboot. All is well thus far.
>
> You shouldn't really have rebooted at that point, just run e2fsck on
> the filesystem to clear the checksums.

This was the root filesystem so he couldn't have run e2fsck on the
mounted filesystem.

> > Then I did the reverse: tune2fs -O uninit_bg /dev/sda6 to re-enable uninit_bg
> > feature. Again, I was prompted to run e2fsck on system.
>
> Again, you should have just run e2fsck on the filesystem without rebooting,
> in order to update the checksums.

Ditto.

> > I was under the impression that the newest patch would handle this or am I
> > mistaken?
>
> We probably need to allow the kernel to mount the filesystem read-only in
> this case in order to run e2fsck.

That's what the patch I added in the latest snapshot was supposed to
do. I admit I haven't had a chance to test it yet, but the code looks
like it should do the right thing. You are sure that you are mounting
the root filesystem read-only (there is "ro" on the boot command line)
and that this is the kernel with this patch applied, right?
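
One quick way to double-check the "ro" question is to look at the kernel
command line the system was booted with. The helper below is a minimal
sketch; the name root_mount_mode is made up for this example, and it assumes
that when both "ro" and "rw" appear, the one given last is the one that
takes effect.

```python
from typing import Optional


def root_mount_mode(cmdline: str) -> Optional[str]:
    """Return "ro" or "rw" according to the last such flag on the command
    line, or None if neither flag is present."""
    mode = None
    for token in cmdline.split():
        # Only bare "ro"/"rw" tokens count; "root=..." and friends are ignored.
        if token in ("ro", "rw"):
            mode = token
    return mode
```

On a running Linux system the string can be read from /proc/cmdline and
passed to this function; a result of "ro" is what the read-only mount path
needs.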

- Ted

2008-07-18 12:59:38

by Theodore Ts'o

Subject: Re: Add ext4__allow_read-only_mounts_with_corrupted_block_group_checksums patch

On Tue, Jul 15, 2008 at 07:31:26PM +0000, Gary Hawco wrote:
>
> Just recompiled 2.6.26 with the newest snapshot (15-July 2008/2310GMT).
> uninit_bg had been enabled.

I also just realized that while I had pushed the commit with the
patch, I had forgotten to include
ext4__allow_read-only_mounts_with_corrupted_block_group_checksums in
the series file. Doh!

I just pushed out a series file which includes this, although that one
also is rebased against 2.6.26-git6 and removed all of the files from
the stable portion of the queue (since Linus has accepted them for the
merge window).

If you add

ext4__allow_read-only_mounts_with_corrupted_block_group_checksums

to the end of your series file, then the patch will actually be
applied. :-)
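
For anyone scripting this, the edit can be sketched as follows. This is an
illustration only: the function name ensure_in_series is made up, and it
assumes a series file with one patch name per line, matching whole lines
only.

```python
PATCH = "ext4__allow_read-only_mounts_with_corrupted_block_group_checksums"


def ensure_in_series(series_text: str, patch_name: str = PATCH) -> str:
    """Return the series file contents with patch_name appended as the last
    entry, unless a line with exactly that name is already present."""
    entries = [line.strip() for line in series_text.splitlines()]
    if patch_name in entries:
        return series_text  # already listed, nothing to do
    # Make sure the existing content ends with a newline before appending.
    if series_text and not series_text.endswith("\n"):
        series_text += "\n"
    return series_text + patch_name + "\n"
```

Reading the series file, passing its contents through this function, and
writing the result back is safe to repeat, since a second run leaves the
text unchanged.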

Oops, and sorry for not noticing this sooner.

- Ted