LinuxLists.cc - Massive filesystem corruption

2008-12-20 18:14:50

Subject: Massive filesystem corruption

Hi,
I've lost my ext4 partition with a 2.6.27 vanilla kernel:

root@ubuntu:~# mount -t ext4dev /dev/sda1 /mnt
mount: wrong fs type, bad option, bad superblock on /dev/sda1,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so

root@ubuntu:~# dmesg | tail -1
[ 4874.514703] VFS: Can't find ext4 filesystem on dev sda1.
root@ubuntu:~# e2fsck /dev/sda1
e2fsck 1.41.3 (12-Oct-2008)
e2fsck: Superblock invalid, trying backup blocks...
/dev/sda1 was not cleanly unmounted, check forced.
Pass 1: Checking inodes, blocks, and sizes
Error1: Corrupt extent header on inode 107192
Aborted (core dumped)
root@ubuntu:~# gdb -q --args e2fsck /dev/sda1
(gdb) run
Starting program: /sbin/e2fsck /dev/sda1
[Thread debugging using libthread_db enabled]
e2fsck 1.41.3 (12-Oct-2008)
/sbin/e2fsck: Superblock invalid, trying backup blocks...
/dev/sda1 was not cleanly unmounted, check forced.
Pass 1: Checking inodes, blocks, and sizes
Error1: Corrupt extent header on inode 107192
[New Thread 0xb7e46700 (LWP 12878)]

Program received signal SIGABRT, Aborted.
[Switching to Thread 0xb7e46700 (LWP 12878)]
0xb8031430 in __kernel_vsyscall ()
(gdb) backtrace
#0 0xb8031430 in __kernel_vsyscall ()
#1 0xb7e8c880 in raise () from /lib/tls/i686/cmov/libc.so.6
#2 0xb7e8e248 in abort () from /lib/tls/i686/cmov/libc.so.6
#3 0x0805397b in scan_extent_node (ctx=0x9193038, pctx=0xbf830d7c,
   pb=0xbf830c5c, start_block=0, ehandle=0x91b8170) at
   /build/buildd/e2fsprogs-1.41.3/e2fsck/pass1.c:1700
#4 0x08054c02 in check_blocks (ctx=0x9193038, pctx=0xbf830d7c,
   block_buf=0x91acff0 "\225\"\005") at
   /build/buildd/e2fsprogs-1.41.3/e2fsck/pass1.c:1773
#5 0x080565ca in e2fsck_pass1 (ctx=0x9193038) at
   /build/buildd/e2fsprogs-1.41.3/e2fsck/pass1.c:1030
#6 0x08050063 in e2fsck_run (ctx=0x9193038) at
   /build/buildd/e2fsprogs-1.41.3/e2fsck/e2fsck.c:215
#7 0x0804e4b8 in main (argc=Cannot access memory at address 0x324e
   ) at /build/buildd/e2fsprogs-1.41.3/e2fsck/unix.c:1278
(gdb)

If you know how I can read, fix, or debug it, please answer soon:
I need that disk space and will reformat the partition in a few days.

2008-12-20 19:27:31

by Eric Sandeen

Subject: Re: Massive filesystem corruption

Matteo Croce wrote:
> Hi,
> i've lost my ext4 partition with a 2.6.27 vanilla kernel:
>
> root@ubuntu:~# mount -t ext4dev /dev/sda1 /mnt
> mount: wrong fs type, bad option, bad superblock on /dev/sda1,
> missing codepage or helper program, or other error
> In some cases useful info is found in syslog - try
> dmesg | tail or so

What happened between the last successful mount and this failure?

> root@ubuntu:~# dmesg | tail -1
> [ 4874.514703] VFS: Can't find ext4 filesystem on dev sda1.

Was there anything before that? (e.g. check dmesg | tail -n 10?)

What does the beginning of the fs look like? Maybe you can put the first
16k or so from dd somewhere, or run it through hexdump -C, to see if
something else stomped on this partition.
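
A minimal sketch of that check, written in Python instead of dd and hexdump. It reads the first 16 KiB of a device or image and looks for the ext2/3/4 superblock magic, 0xEF53, stored little-endian at byte 1080 (offset 56 within the primary superblock, which starts at byte 1024). The function names are illustrative assumptions and do not come from e2fsprogs.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # primary superblock starts 1 KiB into the partition
MAGIC_OFFSET = 56          # position of s_magic within the superblock
EXT_MAGIC = 0xEF53         # shared by ext2, ext3 and ext4


def read_head(path, size=16 * 1024):
    """Return the first `size` bytes of a device or image file."""
    with open(path, "rb") as f:
        return f.read(size)


def has_ext_magic(head):
    """True if the buffer carries the ext superblock magic at byte 1080."""
    pos = SUPERBLOCK_OFFSET + MAGIC_OFFSET
    if len(head) < pos + 2:
        return False
    (magic,) = struct.unpack_from("<H", head, pos)
    return magic == EXT_MAGIC


def hexdump(data, width=16):
    """Yield lines similar to `hexdump -C`: offset, hex bytes, ASCII."""
    for off in range(0, len(data), width):
        chunk = data[off:off + width]
        hexpart = " ".join(f"{b:02x}" for b in chunk)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        yield f"{off:08x}  {hexpart:<{width * 3}} |{text}|"
```

Run as root against the partition, for example `has_ext_magic(read_head("/dev/sda1"))`. A False result would be consistent with the primary superblock having been overwritten, and printing the `hexdump` lines may show what was written in its place.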

> root@ubuntu:~# e2fsck /dev/sda1
> e2fsck 1.41.3 (12-Oct-2008)
> e2fsck: Superblock invalid, trying backup blocks...
> /dev/sda1 was not cleanly unmounted, check forced.
> Pass 1: Checking inodes, blocks, and sizes
> Error1: Corrupt extent header on inode 107192
> Aborted (core dumped)
> root@ubuntu:~# gdb -q --args e2fsck /dev/sda1
> (gdb) run
> Starting program: /sbin/e2fsck /dev/sda1
> [Thread debugging using libthread_db enabled]
> e2fsck 1.41.3 (12-Oct-2008)
> /sbin/e2fsck: Superblock invalid, trying backup blocks...
> /dev/sda1 was not cleanly unmounted, check forced.
> Pass 1: Checking inodes, blocks, and sizes
> Error1: Corrupt extent header on inode 107192
> [New Thread 0xb7e46700 (LWP 12878)]
>
> Program received signal SIGABRT, Aborted.

well, this was an explicit abort():

    if (pctx->errcode) {
        printf("Error1: %s on inode %u\n",
               error_message(pctx->errcode), pctx->ino);
        abort();
    }

... I guess that error is not handled yet.
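
For context on what "Corrupt extent header" can mean, here is a rough sketch of the kind of sanity check involved. It assumes the standard ext4 on-disk extent header: 12 little-endian bytes holding magic (0xF30A), entry count, maximum entries, tree depth, and generation, located at the start of the inode's i_block area. This is not the libext2fs code, the checks are a subset chosen for illustration, and the function names are hypothetical.

```python
import struct

EXT4_EXT_MAGIC = 0xF30A      # expected eh_magic value
HEADER_FORMAT = "<HHHHI"     # magic, entries, max, depth, generation
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 12 bytes


def parse_extent_header(raw):
    """Decode the 12-byte extent header at the start of `raw`."""
    if len(raw) < HEADER_SIZE:
        raise ValueError("need at least 12 bytes for an extent header")
    magic, entries, max_entries, depth, generation = struct.unpack_from(
        HEADER_FORMAT, raw, 0
    )
    return {
        "magic": magic,
        "entries": entries,
        "max": max_entries,
        "depth": depth,
        "generation": generation,
    }


def extent_header_problems(raw):
    """Return a list of human-readable problems; empty if none were found."""
    header = parse_extent_header(raw)
    problems = []
    if header["magic"] != EXT4_EXT_MAGIC:
        problems.append("bad magic")
    if header["entries"] > header["max"]:
        problems.append("entries exceed max")
    return problems
```

Feeding it the first 12 bytes of the i_block area of inode 107192 would show whether the header was zeroed, overwritten with unrelated data, or only partly damaged.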

can you open the fs with debugfs, and try

debugfs> stat <107192>

and/or

debugfs> dump <107192> /some/path/to/dumpfile

and maybe we can see what's wrong with this inode. If it's the only one
then perhaps it can be nuked w/ debugfs and fsck will continue.
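
The same debugfs requests can also be issued non-interactively with the -R option, which runs a single request and exits. The sketch below wraps that in Python. The helper names are hypothetical, and the optional -c (catastrophic mode, read-only, bitmaps not loaded) is an assumption about what may help on a damaged filesystem, not something tested against this one.

```python
import subprocess


def debugfs_argv(device, request, catastrophic=False):
    """Build the argv for a single read-only debugfs request."""
    argv = ["debugfs"]
    if catastrophic:
        argv.append("-c")  # skip reading bitmaps; forces read-only
    argv += ["-R", request, device]
    return argv


def run_debugfs(device, request, catastrophic=False):
    """Run one debugfs request and return whatever it printed to stdout."""
    proc = subprocess.run(
        debugfs_argv(device, request, catastrophic),
        capture_output=True,
        text=True,
    )
    return proc.stdout


if __name__ == "__main__":
    # Inode number taken from the e2fsck error above.
    print(run_debugfs("/dev/sda1", "stat <107192>", catastrophic=True))
```

Because the primary superblock is reported invalid, debugfs may also need to be pointed at a backup superblock with -s together with -b for the block size. Clearing the inode would further require opening the filesystem read-write (-w) and a request such as clri, which is deliberately left out of this read-only sketch.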

-Eric

2008-12-21 02:16:19

by Matteo Croce

Subject: Re: Massive filesystem corruption

On Saturday 20 December 2008 20:27:24 Eric Sandeen wrote:
> Matteo Croce wrote:
> > Hi,
> > i've lost my ext4 partition with a 2.6.27 vanilla kernel:
> >
> > root@ubuntu:~# mount -t ext4dev /dev/sda1 /mnt
> > mount: wrong fs type, bad option, bad superblock on /dev/sda1,
> > missing codepage or helper program, or other error
> > In some cases useful info is found in syslog - try
> > dmesg | tail or so
>
> What happened between the last successful mount and this failure?

A system freeze (the mouse hung, etc.)

> > root@ubuntu:~# dmesg | tail -1
> > [ 4874.514703] VFS: Can't find ext4 filesystem on dev sda1.
>
> Was there anything before that? (e.g. check dmesg | tail -n 10?)

Nothing relevant, just USB and other drivers loading.

> What does the beginning of the fs look like? Maybe you can put the first
> 16k or so from dd somewhere, or run it through hexdump -C, to see if
> something else stomped on this partition.

I'll check it

> > root@ubuntu:~# e2fsck /dev/sda1
> > e2fsck 1.41.3 (12-Oct-2008)
> > e2fsck: Superblock invalid, trying backup blocks...
> > /dev/sda1 was not cleanly unmounted, check forced.
> > Pass 1: Checking inodes, blocks, and sizes
> > Error1: Corrupt extent header on inode 107192
> > Aborted (core dumped)
> > root@ubuntu:~# gdb -q --args e2fsck /dev/sda1
> > (gdb) run
> > Starting program: /sbin/e2fsck /dev/sda1
> > [Thread debugging using libthread_db enabled]
> > e2fsck 1.41.3 (12-Oct-2008)
> > /sbin/e2fsck: Superblock invalid, trying backup blocks...
> > /dev/sda1 was not cleanly unmounted, check forced.
> > Pass 1: Checking inodes, blocks, and sizes
> > Error1: Corrupt extent header on inode 107192
> > [New Thread 0xb7e46700 (LWP 12878)]
> >
> > Program received signal SIGABRT, Aborted.
>
> well, this was an explicit abort():
>
>     if (pctx->errcode) {
>         printf("Error1: %s on inode %u\n",
>                error_message(pctx->errcode), pctx->ino);
>         abort();
>     }
>
> ... I guess that error is not handled yet.
>
> can you open the fs with debugfs, and try
>
> debugfs> stat <107192>
>
> and/or
>
> debugfs> dump <107192> /some/path/to/dumpfile
>
> and maybe we can see what's wrong with this inode. If it's the only one
> then perhaps it can be nuked w/ debugfs and fsck will continue.
>
> -Eric

debugfs is new to me; do you have some docs I can read?

2008-12-21 03:23:22

by Eric Sandeen

Subject: Re: Massive filesystem corruption

Matteo Croce wrote:

> debugfs is new to me, have you some docs for me to read?

sure, "man debugfs"

-Eric

2008-12-21 05:08:50

by Nick Dokos

Subject: Re: Massive filesystem corruption

Matteo Croce wrote:

> ...
> debugfs is new to me, have you some docs for me to read?
>

The debugfs man page gives summary descriptions of all the commands. I
am not aware of any other documentation. If for some reason the man page
is not installed locally, you can try e.g.

http://linux.die.net/man/8/debugfs