2008-03-13 18:34:45

by Christian Kujau

[permalink] [raw]
Subject: e2fsck not fixing all corruptions on the first run?

Hi,

after a few unclean shutdown, I noticed that my ext4 filesystem got
mounted with:

EXT4-fs warning: mounting fs with errors, running e2fsck is recommended

So I decided to unmount and check it with a fairly current version of
e2fsck, git show tells me:

# git show
commit d6a525282f4b3ece90c271d470b27ef7b6853817
Merge: 0ea8aa6... e91c9d4...
Author: Theodore Ts'o <[email protected]>
Date: Fri Feb 29 09:21:29 2008 -0500

e2fsck indeed does find a few corruptions, fixes them. The next
run, e2fsck says "clean" of course. But when I ran with -f, e2fsck did
find more (small) corruptions, see the full log here:

http://nerdbynature.de/bits/e2fsprogs.pu/md4.log
http://nerdbynature.de/bits/e2fsprogs.pu/

Q: How comes e2fsck would skip a few things during the first run? If it
(technically) has to, couldn't it NOT mark the fs clean so that one
knows it has to run fsck again?

Thanks,
Christian.
--
BOFH excuse #274:

It was OK before you touched it.


2008-03-13 20:54:38

by Theodore Ts'o

[permalink] [raw]
Subject: Re: e2fsck not fixing all corruptions on the first run?

On Thu, Mar 13, 2008 at 07:34:40PM +0100, Christian Kujau wrote:
> e2fsck indeed does find a few corruptions, fixes them. The next run, e2fsck
> says "clean" of course. But when I ran with -f, e2fsck did find more
> (small) corruptions, see the full log here:
>
> http://nerdbynature.de/bits/e2fsprogs.pu/md4.log
> http://nerdbynature.de/bits/e2fsprogs.pu/

Hmm. What I really need is *first* e2fsck run, to see what it did.

> Q: How comes e2fsck would skip a few things during the first run? If it
> (technically) has to, couldn't it NOT mark the fs clean so that one
> knows it has to run fsck again?

It's supposed to fix everything in one run. If it doesn't, it's a
bug. It usually means that either it attempted to fix a corruption,
and didn't fix it correctly, or some corruption confused it enough
that it didn't spot another corruption in the same run.

But as a result, that's why it's most useful to know what it found and
fixed in the first e2fsck run. Seeing both e2fsck logs is most useful
when trying to determine why it wasn't able to fix everything in one
go.

- Ted

2008-03-13 21:22:31

by Christian Kujau

[permalink] [raw]
Subject: Re: e2fsck not fixing all corruptions on the first run?

On Thu, 13 Mar 2008, Theodore Tso wrote:
>> http://nerdbynature.de/bits/e2fsprogs.pu/md4.log
>> http://nerdbynature.de/bits/e2fsprogs.pu/
>
> Hmm. What I really need is *first* e2fsck run, to see what it did.

The md4.log is the first e2fsck run:

1) mounted fs, noticed error message in dmesg
2) umounted fs
3) ran e2fsck, right on top of this md4.log file...

The 2nd run is the one after the "/dev/md4: clean" message in md4.log

Thanks,
Christian.
--
BOFH excuse #265:

The mouse escaped.

2008-03-13 21:47:46

by Theodore Ts'o

[permalink] [raw]
Subject: Re: e2fsck not fixing all corruptions on the first run?

On Thu, Mar 13, 2008 at 10:22:25PM +0100, Christian Kujau wrote:
> On Thu, 13 Mar 2008, Theodore Tso wrote:
>>> http://nerdbynature.de/bits/e2fsprogs.pu/md4.log
>>> http://nerdbynature.de/bits/e2fsprogs.pu/
>>
>> Hmm. What I really need is *first* e2fsck run, to see what it did.
>
> The md4.log is the first e2fsck run:
>
> 1) mounted fs, noticed error message in dmesg
> 2) umounted fs
> 3) ran e2fsck, right on top of this md4.log file...
>
> The 2nd run is the one after the "/dev/md4: clean" message in md4.log

Oh, right. Sorry, I missed that. I think I know what's going on.
The problem is we're not supporting long symlinks which are in extents
format. That's something which we changed, but we're going to change
back (since we need it to support filesystems with > 2**32 blocks).

I'll make e2fsprogs flexible enough to support both forms of long symlinks.

- Ted

2008-03-13 21:59:58

by Christian Kujau

[permalink] [raw]
Subject: Re: e2fsck not fixing all corruptions on the first run?

On Thu, 13 Mar 2008, Theodore Tso wrote:
> Oh, right. Sorry, I missed that. I think I know what's going on.
> The problem is we're not supporting long symlinks which are in extents
> format. That's something which we changed, but we're going to change
> back (since we need it to support filesystems with > 2**32 blocks).

When you say "changed", you mean the on-disk format changed? If so, I
could (should?) just mkfs again, since ext4 is in flux anyway and nobody
said the on-disk format was frozen. IOW, I don't know if it's wise to put
in quirks just because of some old test-environment. When ext4 is
(offically) released, everybody will have the "right" format and e2fsck
will work as expected, no?

thanks,
C.
--
BOFH excuse #441:

Hash table has woodworm

2008-03-13 22:17:24

by Theodore Ts'o

[permalink] [raw]
Subject: Re: e2fsck not fixing all corruptions on the first run?

On Thu, Mar 13, 2008 at 10:59:55PM +0100, Christian Kujau wrote:
> On Thu, 13 Mar 2008, Theodore Tso wrote:
>> Oh, right. Sorry, I missed that. I think I know what's going on.
>> The problem is we're not supporting long symlinks which are in extents
>> format. That's something which we changed, but we're going to change
>> back (since we need it to support filesystems with > 2**32 blocks).
>
> When you say "changed", you mean the on-disk format changed? If so, I could
> (should?) just mkfs again, since ext4 is in flux anyway and nobody said the
> on-disk format was frozen. IOW, I don't know if it's wise to put in quirks
> just because of some old test-environment. When ext4 is (offically)
> released, everybody will have the "right" format and e2fsck will work as
> expected, no?

Technically, I guess you could say the on-disk format changed, but as
I mentioned, it's going to change back to the original format. It's
not a big deal. The question is for long symlinks, where the symlink
can't fit in the 60 bytes in the i_block[] array, whether the block
number should be encoded as a single integer as i_block[0], as it is
under ext3, or whether it should be encoded as an extents structure.

Right now, the kernel you are using is encoding it as an extents
structure. E2fsprogs doesn't think that's legal for symlink, so it
cleared them. The kernel I am running encodes symlinks the same way
as ext3; but we're planning on changing it back to the way your kernel
is currently doing things, as it turns out you need to use the extents
format if the block number is larger than 32 bits.

So it's simply a matter of teaching e2fsprogs how to understand a long
symlink if the block number was encoded as an extent structure.

So no, there's no need to reformat your filesystem. It's just that
right now, if you create a symlink where the target is larger than 60
bytes, e2fsck will think it is an invalid symlink and delete it. The
fix for that will be coming shortly, becuase this is clearly very
annoying.

- Ted

2008-03-14 03:21:14

by Theodore Ts'o

[permalink] [raw]
Subject: [PATCH, E2FSPROGS] e2fsck: Support long symlinks which use extents

Signed-off-by: "Theodore Ts'o" <[email protected]>
---
e2fsck/e2fsck.h | 2 +-
e2fsck/pass1.c | 32 +++++++++++++++++++++++++++++---
e2fsck/pass2.c | 2 +-
3 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/e2fsck/e2fsck.h b/e2fsck/e2fsck.h
index be6efe1..21208e0 100644
--- a/e2fsck/e2fsck.h
+++ b/e2fsck/e2fsck.h
@@ -429,7 +429,7 @@ extern void e2fsck_setup_tdb_icount(e2fsck_t ctx, int flags,
extern void e2fsck_use_inode_shortcuts(e2fsck_t ctx, int bool);
extern int e2fsck_pass1_check_device_inode(ext2_filsys fs,
struct ext2_inode *inode);
-extern int e2fsck_pass1_check_symlink(ext2_filsys fs,
+extern int e2fsck_pass1_check_symlink(ext2_filsys fs, ext2_ino_t ino,
struct ext2_inode *inode, char *buf);
extern void e2fsck_clear_inode(e2fsck_t ctx, ext2_ino_t ino,
struct ext2_inode *inode, int restart_flag,
diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
index c598205..8638989 100644
--- a/e2fsck/pass1.c
+++ b/e2fsck/pass1.c
@@ -163,17 +163,42 @@ int e2fsck_pass1_check_device_inode(ext2_filsys fs EXT2FS_ATTR((unused)),
* Check to make sure a symlink inode is real. Returns 1 if the symlink
* checks out, 0 if not.
*/
-int e2fsck_pass1_check_symlink(ext2_filsys fs, struct ext2_inode *inode,
- char *buf)
+int e2fsck_pass1_check_symlink(ext2_filsys fs, ext2_ino_t ino,
+ struct ext2_inode *inode, char *buf)
{
unsigned int len;
int i;
blk_t blocks;
+ ext2_extent_handle_t handle;
+ struct ext2_extent_info info;
+ struct ext2fs_extent extent;

if ((inode->i_size_high || inode->i_size == 0) ||
(inode->i_flags & EXT2_INDEX_FL))
return 0;

+ if (inode->i_flags & EXT4_EXTENTS_FL) {
+ if (inode->i_size > fs->blocksize)
+ return 0;
+ if (ext2fs_extent_open(fs, ino, &handle))
+ return 0;
+ i = 0;
+ if (ext2fs_extent_get_info(handle, &info) ||
+ (info.num_entries != 1) ||
+ (info.max_depth != 0))
+ goto exit_extent;
+ if (ext2fs_extent_get(handle, EXT2_EXTENT_ROOT, &extent) ||
+ (extent.e_lblk != 0) ||
+ (extent.e_len != 1) ||
+ (extent.e_pblk < fs->super->s_first_data_block) ||
+ (extent.e_pblk >= fs->super->s_blocks_count))
+ goto exit_extent;
+ i = 1;
+ exit_extent:
+ ext2fs_extent_free(handle);
+ return i;
+ }
+
blocks = ext2fs_inode_data_blocks(fs, inode);
if (blocks) {
if ((inode->i_size >= fs->blocksize) ||
@@ -910,7 +935,8 @@ void e2fsck_pass1(e2fsck_t ctx)
check_size(ctx, &pctx);
ctx->fs_blockdev_count++;
} else if (LINUX_S_ISLNK (inode->i_mode) &&
- e2fsck_pass1_check_symlink(fs, inode, block_buf)) {
+ e2fsck_pass1_check_symlink(fs, ino, inode,
+ block_buf)) {
check_immutable(ctx, &pctx);
ctx->fs_symlinks_count++;
if (ext2fs_inode_data_blocks(fs, inode) == 0) {
diff --git a/e2fsck/pass2.c b/e2fsck/pass2.c
index a336755..56f352b 100644
--- a/e2fsck/pass2.c
+++ b/e2fsck/pass2.c
@@ -1218,7 +1218,7 @@ extern int e2fsck_process_bad_inode(e2fsck_t ctx, ext2_ino_t dir,
&& !e2fsck_pass1_check_device_inode(fs, &inode))
problem = PR_2_BAD_SOCKET;
else if (LINUX_S_ISLNK(inode.i_mode)
- && !e2fsck_pass1_check_symlink(fs, &inode, buf)) {
+ && !e2fsck_pass1_check_symlink(fs, ino, &inode, buf)) {
problem = PR_2_INVALID_SYMLINK;
}

--
1.5.4.1.144.gdfee-dirty


2008-03-14 03:21:16

by Theodore Ts'o

[permalink] [raw]
Subject: [PATCH, E2FSPROGS] e2fsck: Check for fast symlinks that have EXTENTS_FL set

These shouldn't show up in the wild, but if they do, e2fsck will offer
to clear them.

Signed-off-by: "Theodore Ts'o" <[email protected]>
---
e2fsck/pass1.c | 8 ++++++++
e2fsck/problem.c | 5 +++++
e2fsck/problem.h | 3 +++
3 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
index 8638989..50e38e1 100644
--- a/e2fsck/pass1.c
+++ b/e2fsck/pass1.c
@@ -713,6 +713,14 @@ void e2fsck_pass1(e2fsck_t ctx)
}
}

+ if (extent_fs && (inode->i_flags & EXT4_EXTENTS_FL) &&
+ LINUX_S_ISLNK(inode->i_mode) &&
+ !ext2fs_inode_has_valid_blocks(inode) &&
+ fix_problem(ctx, PR_1_FAST_SYMLINK_EXTENT_FL, &pctx)) {
+ inode->i_flags &= ~EXT4_EXTENTS_FL;
+ e2fsck_write_inode(ctx, ino, inode, "pass1");
+ }
+
if (ino == EXT2_BAD_INO) {
struct process_block_struct pb;

diff --git a/e2fsck/problem.c b/e2fsck/problem.c
index afed4fe..d3e2fd7 100644
--- a/e2fsck/problem.c
+++ b/e2fsck/problem.c
@@ -819,6 +819,11 @@ static struct e2fsck_problem problem_table[] = {
N_("@i %i missing EXTENT_FL, but is in extents format\n"),
PROMPT_FIX, PR_PREEN_OK },

+ /* Fast symlink has EXTENTS_FL set */
+ { PR_1_FAST_SYMLINK_EXTENT_FL,
+ N_("Fast symlink %i has EXTENT_FL set. "),
+ PROMPT_CLEAR, 0 },
+
/* Pass 1b errors */

/* Pass 1B: Rescan for duplicate/bad blocks */
diff --git a/e2fsck/problem.h b/e2fsck/problem.h
index d5d1a78..48a4a2b 100644
--- a/e2fsck/problem.h
+++ b/e2fsck/problem.h
@@ -476,6 +476,9 @@ struct problem_context {
/* inode missing EXTENTS_FL, but is an extent inode */
#define PR_1_UNSET_EXTENT_FL 0x01005C

+/* Fast symlink has EXTENTS_FL set */
+#define PR_1_FAST_SYMLINK_EXTENT_FL 0x01005D
+
/*
* Pass 1b errors
*/
--
1.5.4.1.144.gdfee-dirty


2008-03-14 07:17:06

by Christian Kujau

[permalink] [raw]
Subject: Re: e2fsck not fixing all corruptions on the first run?

On Thu, 13 Mar 2008, Theodore Tso wrote:
> Technically, I guess you could say the on-disk format changed, but as
> I mentioned, it's going to change back to the original format. It's
> not a big deal. The question is for long symlinks, where the symlink
> can't fit in the 60 bytes in the i_block[] array, whether the block
> number should be encoded as a single integer as i_block[0], as it is
> under ext3, or whether it should be encoded as an extents structure.

Thanks again for the explanation (and for the fix, I shall try it soon)!

Christian.
--
BOFH excuse #399:

We are a 100% Microsoft Shop.