2009-10-12 13:52:12

by Felipe Contreras

Subject: Weird ext4 bug: 256P used?

This is what I get with 'du -x --max-depth=3 | sort -n'.

140735340884184 ./var/lib/yum
140735340910320 ./usr/include
140735341711632 ./var/lib
140735342038956 ./var
140735344736432 ./usr
281470691009304 .

I did 'touch /forcefsck', rebooted, and didn't get any error, so I
guess at least the basic checks are passing.

Thoughts?

--
Felipe Contreras


2009-10-12 22:29:54

by Theodore Ts'o

Subject: Re: Weird ext4 bug: 256P used?

On Mon, Oct 12, 2009 at 04:51:31PM +0300, Felipe Contreras wrote:
> This is what I get with 'du -x --max-depth=3 | sort -n'.
>
> 140735340884184 ./var/lib/yum
> 140735340910320 ./usr/include
> 140735341711632 ./var/lib
> 140735342038956 ./var
> 140735344736432 ./usr
> 281470691009304 .
>
> I did 'touch /forcefsck', rebooted, and didn't get any error, so I
> guess at least the basic checks are passing.

So if you do "du -x | sort -n", what's the deepest directory that
shows a very large size, and can you find the files that seem to be
responsible for these large du reports?

- Ted

2009-10-12 23:04:21

by Felipe Contreras

Subject: Re: Weird ext4 bug: 256P used?

On Tue, Oct 13, 2009 at 1:29 AM, Theodore Tso <[email protected]> wrote:
> On Mon, Oct 12, 2009 at 04:51:31PM +0300, Felipe Contreras wrote:
>> This is what I get with 'du -x --max-depth=3 | sort -n'.
>>
>> 140735340884184       ./var/lib/yum
>> 140735340910320       ./usr/include
>> 140735341711632       ./var/lib
>> 140735342038956       ./var
>> 140735344736432       ./usr
>> 281470691009304       .
>>
>> I did 'touch /forcefsck', rebooted, and didn't get any error, so I
>> guess at least the basic checks are passing.
>
> So if you do "du -x | sort -n", what's the deepest directory that
> shows a very large size, and can you find the files that seem to be
> responsible for these large du reports?

This is the result:
140735340871696 ./var/lib/yum/yumdb/s/160f96bb8689bae7bed1f8801385845d47913ace-skype-2.0.0.72-fc5-i586
140735340872268 ./var/lib/yum/yumdb/s
140735340884168 ./var/lib/yum/yumdb
140735340884184 ./var/lib/yum
140735340910320 ./usr/include
140735341713520 ./var/lib
140735342037100 ./var
140735344736432 ./usr
281470690029776 .

However, there's no file so big:
ls -lh /var/lib/yum/yumdb/s/160f96bb8689bae7bed1f8801385845d47913ace-skype-2.0.0.72-fc5-i586
total 12K
-rw-r--r-- 1 root root 24 2009-07-27 20:52 from_repo
-rw-r--r-- 1 root root 4 2009-07-27 20:52 reason
-rw-r--r-- 1 root root 2 2009-07-27 20:52 releasever

However, there's something messed up with the uid/gid:

ls -ld /var/lib/yum/yumdb/s/160f96bb8689bae7bed1f8801385845d47913ace-skype-2.0.0.72-fc5-i586
drwxr-xr-x 2 4294901760 16711680 4096 2009-07-27 20:52
/var/lib/yum/yumdb/s/160f96bb8689bae7bed1f8801385845d47913ace-skype-2.0.0.72-fc5-i586

ls -l /usr/include/autosprintf.h
-rw-r--r-- 1 4294901760 16711680 4096 2009-06-23 03:53
/usr/include/autosprintf.h

Apparently these are the two files with the problem, and it seems to
be related to the wrong directory size.

--
Felipe Contreras
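
The odd uid/gid values become easier to read when split into 16-bit halves, which is how ext2/ext3/ext4 store them on Linux (a low half in the main inode and a high half in the OS-dependent area). The following is only a minimal sketch for reading the numbers above; the helper names id_high and id_low are made up for illustration and are not part of e2fsprogs.

```c
#include <stdint.h>

/* Hypothetical helper: upper 16 bits of a 32-bit uid or gid. */
uint16_t id_high(uint32_t id)
{
    return (uint16_t)(id >> 16);
}

/* Hypothetical helper: lower 16 bits of a 32-bit uid or gid. */
uint16_t id_low(uint32_t id)
{
    return (uint16_t)(id & 0xFFFFU);
}
```

For the values shown by ls, uid 4294901760 is 0xFFFF0000 (high half 0xFFFF, low half 0) and gid 16711680 is 0x00FF0000 (high half 0x00FF, low half 0). A low half of 0 would be consistent with files that were owned by root and had only the high halves overwritten, though the listing alone does not prove that.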

2009-10-12 23:12:51

by Theodore Ts'o

Subject: Re: Weird ext4 bug: 256P used?

On Tue, Oct 13, 2009 at 02:02:31AM +0300, Felipe Contreras wrote:
>
> However, there's no file so big:
> ls -lh /var/lib/yum/yumdb/s/160f96bb8689bae7bed1f8801385845d47913ace-skype-2.0.0.72-fc5-i586
> total 12K
> -rw-r--r-- 1 root root 24 2009-07-27 20:52 from_repo
> -rw-r--r-- 1 root root 4 2009-07-27 20:52 reason
> -rw-r--r-- 1 root root 2 2009-07-27 20:52 releasever
>
> However, there's something messed up with the uid/gid:
>
> ls -ld /var/lib/yum/yumdb/s/160f96bb8689bae7bed1f8801385845d47913ace-skype-2.0.0.72-fc5-i586
> drwxr-xr-x 2 4294901760 16711680 4096 2009-07-27 20:52
> /var/lib/yum/yumdb/s/160f96bb8689bae7bed1f8801385845d47913ace-skype-2.0.0.72-fc5-i586

OK, how about the output of

stat /var/lib/yum/yumdb/s/160f96bb8689bae7bed1f8801385845d47913ace-skype-2.\
0.0.72-fc5-i586

and

stat /var/lib/yum/yumdb/s/160f96bb8689bae7bed1f8801385845d47913ace-skype-2.\
0.0.72-fc5-i586/*

Thanks,

- Ted

2009-10-12 23:29:37

by Felipe Contreras

Subject: Re: Weird ext4 bug: 256P used?

On Tue, Oct 13, 2009 at 2:12 AM, Theodore Tso <[email protected]> wrote:
> On Tue, Oct 13, 2009 at 02:02:31AM +0300, Felipe Contreras wrote:
>>
>> However, there's no file so big:
>> ls -lh /var/lib/yum/yumdb/s/160f96bb8689bae7bed1f8801385845d47913ace-skype-2.0.0.72-fc5-i586
>> total 12K
>> -rw-r--r-- 1 root root 24 2009-07-27 20:52 from_repo
>> -rw-r--r-- 1 root root  4 2009-07-27 20:52 reason
>> -rw-r--r-- 1 root root  2 2009-07-27 20:52 releasever
>>
>> However, there's something messed up with the uid/gid:
>>
>> ls -ld /var/lib/yum/yumdb/s/160f96bb8689bae7bed1f8801385845d47913ace-skype-2.0.0.72-fc5-i586
>> drwxr-xr-x 2 4294901760 16711680 4096 2009-07-27 20:52
>> /var/lib/yum/yumdb/s/160f96bb8689bae7bed1f8801385845d47913ace-skype-2.0.0.72-fc5-i586
>
> OK, how about the output of
>
> stat /var/lib/yum/yumdb/s/160f96bb8689bae7bed1f8801385845d47913ace-skype-2.\
> 0.0.72-fc5-i586

stat /var/lib/yum/yumdb/s/160f96bb8689bae7bed1f8801385845d47913ace-skype-2.0.0.72-fc5-i586
File: `/var/lib/yum/yumdb/s/160f96bb8689bae7bed1f8801385845d47913ace-skype-2.0.0.72-fc5-i586'
Size: 4096 Blocks: 281470681743368 IO Block: 4096 directory
Device: fe00h/65024d Inode: 141257 Links: 2
Access: (0755/drwxr-xr-x) Uid: (4294901760/ UNKNOWN) Gid: (16711680/ UNKNOWN)
Access: 2009-10-12 16:36:22.326920328 +0300
Modify: 2009-07-27 20:52:23.000000000 +0300
Change: 2009-07-27 20:52:23.000000000 +0300

> and
>
> stat /var/lib/yum/yumdb/s/160f96bb8689bae7bed1f8801385845d47913ace-skype-2.\
> 0.0.72-fc5-i586/*

stat /var/lib/yum/yumdb/s/160f96bb8689bae7bed1f8801385845d47913ace-skype-2.0.0.72-fc5-i586/*
File: `/var/lib/yum/yumdb/s/160f96bb8689bae7bed1f8801385845d47913ace-skype-2.0.0.72-fc5-i586/from_repo'
Size: 24 Blocks: 8 IO Block: 4096 regular file
Device: fe00h/65024d Inode: 144750 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2009-07-27 20:52:23.931517024 +0300
Modify: 2009-07-27 20:52:23.931517024 +0300
Change: 2009-07-27 20:52:23.931517024 +0300
File: `/var/lib/yum/yumdb/s/160f96bb8689bae7bed1f8801385845d47913ace-skype-2.0.0.72-fc5-i586/reason'
Size: 4 Blocks: 8 IO Block: 4096 regular file
Device: fe00h/65024d Inode: 144824 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2009-07-27 20:52:23.931517024 +0300
Modify: 2009-07-27 20:52:23.931517024 +0300
Change: 2009-07-27 20:52:23.931517024 +0300
File: `/var/lib/yum/yumdb/s/160f96bb8689bae7bed1f8801385845d47913ace-skype-2.0.0.72-fc5-i586/releasever'
Size: 2 Blocks: 8 IO Block: 4096 regular file
Device: fe00h/65024d Inode: 144913 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2009-07-27 20:52:23.931517024 +0300
Modify: 2009-07-27 20:52:23.932517304 +0300
Change: 2009-07-27 20:52:23.932517304 +0300

--
Felipe Contreras

2009-10-13 01:05:59

by Theodore Ts'o

Subject: Re: Weird ext4 bug: 256P used?

On Tue, Oct 13, 2009 at 02:27:48AM +0300, Felipe Contreras wrote:
>
> stat /var/lib/yum/yumdb/s/160f96bb8689bae7bed1f8801385845d47913ace-skype-2.0.0.72-fc5-i586
> File: `/var/lib/yum/yumdb/s/160f96bb8689bae7bed1f8801385845d47913ace-skype-2.0.0.72-fc5-i586'
> Size: 4096 Blocks: 281470681743368 IO Block: 4096 directory
> Device: fe00h/65024d Inode: 141257 Links: 2
> Access: (0755/drwxr-xr-x) Uid: (4294901760/ UNKNOWN) Gid: (16711680/ UNKNOWN)
> Access: 2009-10-12 16:36:22.326920328 +0300
> Modify: 2009-07-27 20:52:23.000000000 +0300
> Change: 2009-07-27 20:52:23.000000000 +0300

OK, I see what's going on. i_blocks_hi is getting set to 0xFFFF.

So I definitely see the bug in e2fsck in not reporting and fixing the
problem (and I'll fix that). I'm not sure how i_blocks_hi got set to
that value, though; it looks like everything should be doing the right
thing in the latest mainline kernel. What version of the kernel are
you using?

- Ted
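
The 0xFFFF can be read directly off the Blocks value in the stat output by splitting it into a 32-bit low part and a 16-bit high part. The following is a minimal sketch of that arithmetic, not e2fsprogs code: the helper names are made up for illustration, and it assumes that stat counts 512-byte units while du reports 1 KiB units (the GNU coreutils defaults).

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: low 32 bits of the block count (i_blocks_lo). */
uint32_t blocks_lo(uint64_t blocks)
{
    return (uint32_t)(blocks & 0xFFFFFFFFULL);
}

/* Hypothetical helper: next 16 bits of the block count (i_blocks_hi). */
uint16_t blocks_hi(uint64_t blocks)
{
    return (uint16_t)((blocks >> 32) & 0xFFFFULL);
}

/* Assumption: input is in 512-byte units, output is in 1 KiB units. */
uint64_t blocks_to_kib(uint64_t blocks)
{
    return blocks / 2;
}

/* Print the three derived values on one line. */
void describe_blocks(uint64_t blocks)
{
    printf("hi=0x%04X lo=%" PRIu32 " KiB=%" PRIu64 "\n",
           (unsigned)blocks_hi(blocks), blocks_lo(blocks),
           blocks_to_kib(blocks));
}
```

Calling describe_blocks(281470681743368ULL) prints "hi=0xFFFF lo=8 KiB=140735340871684". The low part, 8, is the directory's real size (one 4096-byte block), and 140735340871684 plus the 12K of regular files gives the 140735340871696 that du reported for this directory. That is roughly 128 PiB per affected inode, which with two affected inodes would account for the "256P" in the subject.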

2009-10-13 02:11:49

by Theodore Ts'o

Subject: Re: Weird ext4 bug: 256P used?

Here's a patch to e2fsprogs which will cause e2fsck to find and fix
the filesystem corruption. I'm not sure how i_blocks_hi was set to
the incorrect value in the first place, but this should fix the
filesystem for you (largely a cosmetic issue).

- Ted

commit 8a8f36540bbf5d4397cf476e216e9a720b5c1d8e
Author: Theodore Ts'o <[email protected]>
Date: Mon Oct 12 21:59:37 2009 -0400

e2fsck: Fix handling of non-zero i_blocks_high field

E2fsck was not properly printing the i_blocks field in filesystem
corruption messages, and it was not properly checking i_blocks_hi and
i_blocks_lo, either. This commit fixes this.

Thanks to Felipe Conteras for pointing this out.

Signed-off-by: "Theodore Ts'o" <[email protected]>

diff --git a/e2fsck/message.c b/e2fsck/message.c
index 5e28812..9aaedc5 100644
--- a/e2fsck/message.c
+++ b/e2fsck/message.c
@@ -258,7 +258,7 @@ static _INLINE_ void expand_at_expression(e2fsck_t ctx, char ch,
 /*
  * This function expands '%IX' expressions
  */
-static _INLINE_ void expand_inode_expression(char ch,
+static _INLINE_ void expand_inode_expression(ext2_filsys fs, char ch,
                                             struct problem_context *ctx)
 {
        struct ext2_inode       *inode;
@@ -292,7 +292,8 @@ static _INLINE_ void expand_inode_expression(char ch,
                printf("%u", large_inode->i_extra_isize);
                break;
        case 'b':
-               if (inode->i_flags & EXT4_HUGE_FILE_FL)
+               if (fs->super->s_feature_ro_compat &
+                   EXT4_FEATURE_RO_COMPAT_HUGE_FILE)
                        printf("%llu", inode->i_blocks +
                               (((long long) inode->osd2.linux2.l_i_blocks_hi)
                                << 32));
@@ -528,7 +529,7 @@ void print_e2fsck_message(e2fsck_t ctx, const char *msg,
                        expand_at_expression(ctx, *cp, pctx, &first, recurse);
                } else if (cp[0] == '%' && cp[1] == 'I') {
                        cp += 2;
-                       expand_inode_expression(*cp, pctx);
+                       expand_inode_expression(fs, *cp, pctx);
                } else if (cp[0] == '%' && cp[1] == 'D') {
                        cp += 2;
                        expand_dirent_expression(fs, *cp, pctx);
diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
index 9b12005..2531e57 100644
--- a/e2fsck/pass1.c
+++ b/e2fsck/pass1.c
@@ -1792,6 +1792,15 @@ static void check_blocks_extents(e2fsck_t ctx, struct problem_context *pctx,
        ext2fs_extent_free(ehandle);
 }
 
+static blk64_t ext2fs_inode_i_blocks(ext2_filsys fs,
+                                    struct ext2_inode *inode)
+{
+       return (inode->i_blocks |
+               (fs->super->s_feature_ro_compat &
+                EXT4_FEATURE_RO_COMPAT_HUGE_FILE ?
+                (__u64)inode->osd2.linux2.l_i_blocks_hi << 32 : 0));
+}
+
 /*
  * This subroutine is called on each inode to account for all of the
  * blocks used by that inode.
@@ -1972,7 +1981,7 @@ static void check_blocks(e2fsck_t ctx, struct problem_context *pctx,
        if (LINUX_S_ISREG(inode->i_mode) &&
            (inode->i_size_high || inode->i_size & 0x80000000UL))
                ctx->large_files++;
-       if ((pb.num_blocks != inode->i_blocks) ||
+       if ((pb.num_blocks != ext2fs_inode_i_blocks(fs, inode)) ||
            ((fs->super->s_feature_ro_compat &
              EXT4_FEATURE_RO_COMPAT_HUGE_FILE) &&
             (inode->i_flags & EXT4_HUGE_FILE_FL) &&
2009-10-13 11:10:15

by Felipe Contreras

Subject: Re: Weird ext4 bug: 256P used?

On Tue, Oct 13, 2009 at 4:05 AM, Theodore Tso <[email protected]> wrote:
> On Tue, Oct 13, 2009 at 02:27:48AM +0300, Felipe Contreras wrote:
>>
>> stat /var/lib/yum/yumdb/s/160f96bb8689bae7bed1f8801385845d47913ace-skype-2.0.0.72-fc5-i586
>>   File: `/var/lib/yum/yumdb/s/160f96bb8689bae7bed1f8801385845d47913ace-skype-2.0.0.72-fc5-i586'
>>   Size: 4096          Blocks: 281470681743368 IO Block: 4096   directory
>> Device: fe00h/65024d  Inode: 141257      Links: 2
>> Access: (0755/drwxr-xr-x)  Uid: (4294901760/ UNKNOWN)   Gid: (16711680/ UNKNOWN)
>> Access: 2009-10-12 16:36:22.326920328 +0300
>> Modify: 2009-07-27 20:52:23.000000000 +0300
>> Change: 2009-07-27 20:52:23.000000000 +0300
>
> OK, I see what's going on.   i_blocks_hi is getting set to 0xFFFF.
>
> So I definitely see the bug in e2fsck in not reporting and fixing the
> problem (and I'll fix that).  I'm not sure how i_blocks_hi got set to
> that value, though; it looks like everything should be doing the right
> thing in the latest mainline kernel.  What version of the kernel are
> you using?

I was using F11 kernels, and at some point I went back to compile my
own kernels. I think I started with 2.6.31 and now 2.6.31.1.

--
Felipe Contreras

2009-10-13 11:13:17

by Felipe Contreras

Subject: Re: Weird ext4 bug: 256P used?

On Tue, Oct 13, 2009 at 5:11 AM, Theodore Tso <[email protected]> wrote:
> Here's a patch to e2fsprogs which will cause e2fsck to find and fix
> the filesystem corruption.  I'm not sure how i_blocks_hi was set to
> the incorrect value in the first place, but this should fix the
> filesystem for you (largely a cosmetic issue).
>
>                                        - Ted
>
> commit 8a8f36540bbf5d4397cf476e216e9a720b5c1d8e
> Author: Theodore Ts'o <[email protected]>
> Date:   Mon Oct 12 21:59:37 2009 -0400
>
>    e2fsck: Fix handling of non-zero i_blocks_high field
>
>    E2fsck was not properly printing the i_blocks field in filesystem
>    corruption messages, and it was not properly checking i_blocks_hi and
>    i_blocks_lo, either.  This commit fixes this.
>
>    Thanks to Felipe Conteras for pointing this out.
>
>    Signed-off-by: "Theodore Ts'o" <[email protected]>

This patch fixes my problem.
Tested-by: Felipe Contreras <[email protected]>

However, I had one problem compiling:

> diff --git a/e2fsck/message.c b/e2fsck/message.c
> index 5e28812..9aaedc5 100644
> --- a/e2fsck/message.c
> +++ b/e2fsck/message.c
> @@ -258,7 +258,7 @@ static _INLINE_ void expand_at_expression(e2fsck_t ctx, char ch,
>  /*
>  * This function expands '%IX' expressions
>  */
> -static _INLINE_ void expand_inode_expression(char ch,
> +static _INLINE_ void expand_inode_expression(ext2_filsys fs, char ch,
>                                             struct problem_context *ctx)
>  {
>        struct ext2_inode       *inode;
> @@ -292,7 +292,8 @@ static _INLINE_ void expand_inode_expression(char ch,
>                printf("%u", large_inode->i_extra_isize);
>                break;
>        case 'b':
> -               if (inode->i_flags & EXT4_HUGE_FILE_FL)
> +               if (fs->super->s_feature_ro_compat &
> +                   EXT4_FEATURE_RO_COMPAT_HUGE_FILE)
>                        printf("%llu", inode->i_blocks +
>                               (((long long) inode->osd2.linux2.l_i_blocks_hi)
>                                << 32));
> @@ -528,7 +529,7 @@ void print_e2fsck_message(e2fsck_t ctx, const char *msg,
>                        expand_at_expression(ctx, *cp, pctx, &first, recurse);
>                } else if (cp[0] == '%' && cp[1] == 'I') {
>                        cp += 2;
> -                       expand_inode_expression(*cp, pctx);
> +                       expand_inode_expression(fs, *cp, pctx);
>                } else if (cp[0] == '%' && cp[1] == 'D') {
>                        cp += 2;
>                        expand_dirent_expression(fs, *cp, pctx);
> diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
> index 9b12005..2531e57 100644
> --- a/e2fsck/pass1.c
> +++ b/e2fsck/pass1.c
> @@ -1792,6 +1792,15 @@ static void check_blocks_extents(e2fsck_t ctx, struct problem_context *pctx,
>        ext2fs_extent_free(ehandle);
>  }
>
> +static blk64_t ext2fs_inode_i_blocks(ext2_filsys fs,
> +                                    struct ext2_inode *inode)

My compiler fails because this function is already defined at
'lib/ext2fs/ext2fs.h' as non static.

> +{
> +       return (inode->i_blocks |
> +               (fs->super->s_feature_ro_compat &
> +                EXT4_FEATURE_RO_COMPAT_HUGE_FILE ?
> +                (__u64)inode->osd2.linux2.l_i_blocks_hi << 32 : 0));
> +}
> +
>  /*
>  * This subroutine is called on each inode to account for all of the
>  * blocks used by that inode.
> @@ -1972,7 +1981,7 @@ static void check_blocks(e2fsck_t ctx, struct problem_context *pctx,
>        if (LINUX_S_ISREG(inode->i_mode) &&
>            (inode->i_size_high || inode->i_size & 0x80000000UL))
>                ctx->large_files++;
> -       if ((pb.num_blocks != inode->i_blocks) ||
> +       if ((pb.num_blocks != ext2fs_inode_i_blocks(fs, inode)) ||
>            ((fs->super->s_feature_ro_compat &
>              EXT4_FEATURE_RO_COMPAT_HUGE_FILE) &&
>             (inode->i_flags & EXT4_HUGE_FILE_FL) &&

--
Felipe Contreras

2009-10-13 12:14:37

by Theodore Ts'o

Subject: Re: Weird ext4 bug: 256P used?

On Tue, Oct 13, 2009 at 02:11:27PM +0300, Felipe Contreras wrote:
> My compiler fails because this function is already defined at
> 'lib/ext2fs/ext2fs.h' as non static.

Ah, you must be using a 64-bit 'pu' branch version of e2fsprogs.
Yeah, this patch was meant for the 'maint' branch. A slightly
different patch is needed for the 'pu' branch.

- Ted

2009-10-13 12:20:07

by Felipe Contreras

Subject: Re: Weird ext4 bug: 256P used?

On Tue, Oct 13, 2009 at 3:13 PM, Theodore Tso <[email protected]> wrote:
> On Tue, Oct 13, 2009 at 02:11:27PM +0300, Felipe Contreras wrote:
>> My compiler fails because this function is already defined at
>> 'lib/ext2fs/ext2fs.h' as non static.
>
> Ah, you must be using a 64-bit 'pu' branch version of e2fsprogs.
> Yeah, this patch was meant for the 'maint' branch.  A slightly
> different patch is needed for the 'pu' branch.

I just used the HEAD:
remotes/origin/HEAD -> origin/master

Anyway, it builds fine on the 'maint' branch :)

Thanks a lot.

--
Felipe Contreras