2007-04-10 20:51:13

by Jim Garlick

[permalink] [raw]
Subject: [PATCH] e2fsprogs - pass1c terminates early if hard links

Ted,

I think this is a small buglet in e2fsck: if a file has multiple hard
links, e2fsck pass1c search_dirent_proc() doesn't maintain its count
properly and may return DIRENT_ABORT before it has found containing
directories for all inodes sharing blocks.

Signed-off-by: Jim Garlick <[email protected]>

Index: e2fsprogs+chaos/e2fsck/pass1b.c
===================================================================
--- e2fsprogs+chaos.orig/e2fsck/pass1b.c
+++ e2fsprogs+chaos/e2fsck/pass1b.c
@@ -372,8 +372,10 @@ static int search_dirent_proc(ext2_ino_t
if (!n)
return 0;
p = (struct dup_inode *) dnode_get(n);
- p->dir = dir;
- sd->count--;
+ if (!p->dir) {
+ p->dir = dir;
+ sd->count--;
+ }

return(sd->count ? 0 : DIRENT_ABORT);
}


2007-04-10 22:40:12

by Jim Garlick

[permalink] [raw]
Subject: [PATCH] e2fsprogs - e2fsck pass1c does extra work if root dir has shared blocks

Ted,

Another small bug I think: if the root directory contains shared
blocks, e2fsck pass1c search_dirent_proc() will be looking for
one more containing directory than it will ever find, and thus
loses an opportunity to terminate early.

Signed-off-by: Jim Garlick <[email protected]>

Index: e2fsprogs+chaos/e2fsck/pass1b.c
===================================================================
--- e2fsprogs+chaos.orig/e2fsck/pass1b.c
+++ e2fsprogs+chaos/e2fsck/pass1b.c
@@ -96,6 +96,7 @@ static void pass1c(e2fsck_t ctx, char *b
static void pass1d(e2fsck_t ctx, char *block_buf);

static int dup_inode_count = 0;
+static int dup_inode_founddir = 0;

static dict_t blk_dict, ino_dict;

@@ -146,7 +147,12 @@ static void add_dupe(e2fsck_t ctx, ext2_
else {
di = (struct dup_inode *) e2fsck_allocate_memory(ctx,
sizeof(struct dup_inode), "duplicate inode header");
- di->dir = (ino == EXT2_ROOT_INO) ? EXT2_ROOT_INO : 0 ;
+ if (ino == EXT2_ROOT_INO) {
+ di->dir = EXT2_ROOT_INO;
+ dup_inode_founddir++;
+ } else
+ di->dir = 0;
+
di->num_dupblocks = 0;
di->block_list = 0;
di->inode = *inode;
@@ -396,7 +402,7 @@ static void pass1c(e2fsck_t ctx, char *b
* Search through all directories to translate inodes to names
* (by searching for the containing directory for that inode.)
*/
- sd.count = dup_inode_count;
+ sd.count = dup_inode_count - dup_inode_founddir;
sd.first_inode = EXT2_FIRST_INODE(fs->super);
sd.max_inode = fs->super->s_inodes_count;
ext2fs_dblist_dir_iterate(fs->dblist, 0, block_buf,

2007-04-11 03:40:29

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] e2fsprogs - pass1c terminates early if hard links

On Tue, Apr 10, 2007 at 01:51:12PM -0700, Jim Garlick wrote:
> I think this is a small buglet in e2fsck: if a file has multiple hard
> links, e2fsck pass1c search_dirent_proc() doesn't maintain its count
> properly and may return DIRENT_ABORT before it has found containing
> directories for all inodes sharing blocks.

Yep, indeed! Thanks for submitting the patch. I've checked it into
the e2fsprogs source code repository.

Enclosed please note what I've checked into Mercurial (also visible at
http://thunk.org/hg/e2fsprogs/?rev/8bb7637c6508). I'd really
appreciate it if folks who submit bug fixes or other
changes/enhancements to e2fsck could include test cases which
reproduce the bug and then demonstrate that the patches fix the
problem, such as what I've done below. Also note that providing a
script which generates the test filesystem is now preferred to
including an image.gz file; it avoids the need for binary patches, and
it makes it clearer how the test filesystem is constructed. (We
couldn't do that before since debugfs didn't have enough facilities to
some of the test cases; patches to replace image.gz files with a
generator scripts are definitely welcome.)

So for people who are looking to create test cases, please use the
script file in tests/f_dup4 in the Mercurial repository as a template
for new test cases when possible.

Thanks, regards,

- Ted


# HG changeset patch
# User [email protected]
# Date 1176260109 14400
# Node ID 8bb7637c6508a37cbf1556a083dcf4f7985e2926
# Parent 365efb1a2299b5f9f064f15186540d2807fab308
e2fsck: pass1c terminates early if hard links

I think this is a small buglet in e2fsck: if a file has multiple hard
links, e2fsck pass1c search_dirent_proc() doesn't maintain its count
properly and may return DIRENT_ABORT before it has found containing
directories for all inodes sharing blocks.

Signed-off-by: Jim Garlick <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>

diff -r 365efb1a2299 -r 8bb7637c6508 e2fsck/ChangeLog
--- a/e2fsck/ChangeLog Tue Apr 10 21:10:55 2007 -0400
+++ b/e2fsck/ChangeLog Tue Apr 10 22:55:09 2007 -0400
@@ -1,3 +1,11 @@ 2007-04-06 Theodore Tso <[email protected]
+2007-04-10 Jim Garlick <[email protected]>
+
+ * pass1b.c (search_dirent_proc): if a file has multiple hard
+ links, e2fsck pass1c search_dirent_proc() doesn't maintain
+ its count properly and may return DIRENT_ABORT before it
+ has found containing directories for all inodes sharing
+ blocks.
+
2007-04-06 Theodore Tso <[email protected]>

* e2fsck.conf.5.in: Update man page to document the scratch_files
diff -r 365efb1a2299 -r 8bb7637c6508 e2fsck/pass1b.c
--- a/e2fsck/pass1b.c Tue Apr 10 21:10:55 2007 -0400
+++ b/e2fsck/pass1b.c Tue Apr 10 22:55:09 2007 -0400
@@ -372,8 +372,10 @@ static int search_dirent_proc(ext2_ino_t
if (!n)
return 0;
p = (struct dup_inode *) dnode_get(n);
- p->dir = dir;
- sd->count--;
+ if (!p->dir) {
+ p->dir = dir;
+ sd->count--;
+ }

return(sd->count ? 0 : DIRENT_ABORT);
}
diff -r 365efb1a2299 -r 8bb7637c6508 tests/ChangeLog
--- a/tests/ChangeLog Tue Apr 10 21:10:55 2007 -0400
+++ b/tests/ChangeLog Tue Apr 10 22:55:09 2007 -0400
@@ -1,3 +1,11 @@ 2007-04-07 Theodore Tso <[email protected]
+2007-04-10 Theodore Tso <[email protected]>
+
+ * f_dup4: New test cases which tests a bugfix in e2fsck where if an
+ inode which has blocks claimed by other inodes has
+ multiple hard links, e2fsck could fail to collect the
+ directory information for all of the inodes with multiply
+ claimed blocks.
+
2007-04-07 Theodore Tso <[email protected]>

* test_script.in: Skip completely empty directories
diff -r 365efb1a2299 -r 8bb7637c6508 tests/f_dup4/expect.1
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tests/f_dup4/expect.1 Tue Apr 10 22:55:09 2007 -0400
@@ -0,0 +1,112 @@
+Pass 1: Checking inodes, blocks, and sizes
+
+Running additional passes to resolve blocks claimed by more than one inode...
+Pass 1B: Rescanning for multiply-claimed blocks
+Multiply-claimed block(s) in inode 16: 30
+Multiply-claimed block(s) in inode 17: 30
+Multiply-claimed block(s) in inode 18: 30
+Multiply-claimed block(s) in inode 19: 30
+Multiply-claimed block(s) in inode 20: 34
+Multiply-claimed block(s) in inode 21: 34
+Multiply-claimed block(s) in inode 22: 34
+Multiply-claimed block(s) in inode 23: 34
+Pass 1C: Scanning directories for inodes with multiply-claimed blocks
+Pass 1D: Reconciling multiply-claimed blocks
+(There are 8 inodes containing multiply-claimed blocks.)
+
+File /dir/foo (inode #16, mod time Tue Apr 10 21:00:00 2007)
+ has 1 multiply-claimed block(s), shared with 3 file(s):
+ /dir/quux1 (inode #19, mod time Tue Apr 10 21:00:00 2007)
+ /dir3/baz (inode #18, mod time Tue Apr 10 21:00:00 2007)
+ /dir2/bar (inode #17, mod time Tue Apr 10 21:00:00 2007)
+Clone multiply-claimed blocks? yes
+
+File /dir2/bar (inode #17, mod time Tue Apr 10 21:00:00 2007)
+ has 1 multiply-claimed block(s), shared with 3 file(s):
+ /dir/quux1 (inode #19, mod time Tue Apr 10 21:00:00 2007)
+ /dir3/baz (inode #18, mod time Tue Apr 10 21:00:00 2007)
+ /dir/foo (inode #16, mod time Tue Apr 10 21:00:00 2007)
+Clone multiply-claimed blocks? yes
+
+File /dir3/baz (inode #18, mod time Tue Apr 10 21:00:00 2007)
+ has 1 multiply-claimed block(s), shared with 3 file(s):
+ /dir/quux1 (inode #19, mod time Tue Apr 10 21:00:00 2007)
+ /dir2/bar (inode #17, mod time Tue Apr 10 21:00:00 2007)
+ /dir/foo (inode #16, mod time Tue Apr 10 21:00:00 2007)
+Clone multiply-claimed blocks? yes
+
+File /dir/quux1 (inode #19, mod time Tue Apr 10 21:00:00 2007)
+ has 1 multiply-claimed block(s), shared with 3 file(s):
+ /dir3/baz (inode #18, mod time Tue Apr 10 21:00:00 2007)
+ /dir2/bar (inode #17, mod time Tue Apr 10 21:00:00 2007)
+ /dir/foo (inode #16, mod time Tue Apr 10 21:00:00 2007)
+Multiply-claimed blocks already reassigned or cloned.
+
+File /dir/fee (inode #20, mod time Tue Apr 10 21:00:00 2007)
+ has 1 multiply-claimed block(s), shared with 3 file(s):
+ /dir4/fum (inode #23, mod time Tue Apr 10 21:00:00 2007)
+ /dir3/foe (inode #22, mod time Tue Apr 10 21:00:00 2007)
+ /dir2/fie (inode #21, mod time Tue Apr 10 21:00:00 2007)
+Clone multiply-claimed blocks? yes
+
+File /dir2/fie (inode #21, mod time Tue Apr 10 21:00:00 2007)
+ has 1 multiply-claimed block(s), shared with 3 file(s):
+ /dir4/fum (inode #23, mod time Tue Apr 10 21:00:00 2007)
+ /dir3/foe (inode #22, mod time Tue Apr 10 21:00:00 2007)
+ /dir/fee (inode #20, mod time Tue Apr 10 21:00:00 2007)
+Clone multiply-claimed blocks? yes
+
+File /dir3/foe (inode #22, mod time Tue Apr 10 21:00:00 2007)
+ has 1 multiply-claimed block(s), shared with 3 file(s):
+ /dir4/fum (inode #23, mod time Tue Apr 10 21:00:00 2007)
+ /dir2/fie (inode #21, mod time Tue Apr 10 21:00:00 2007)
+ /dir/fee (inode #20, mod time Tue Apr 10 21:00:00 2007)
+Clone multiply-claimed blocks? yes
+
+File /dir4/fum (inode #23, mod time Tue Apr 10 21:00:00 2007)
+ has 1 multiply-claimed block(s), shared with 3 file(s):
+ /dir3/foe (inode #22, mod time Tue Apr 10 21:00:00 2007)
+ /dir2/fie (inode #21, mod time Tue Apr 10 21:00:00 2007)
+ /dir/fee (inode #20, mod time Tue Apr 10 21:00:00 2007)
+Multiply-claimed blocks already reassigned or cloned.
+
+Pass 2: Checking directory structure
+Invalid inode number for '.' in directory inode 20.
+Fix? yes
+
+Invalid inode number for '.' in directory inode 21.
+Fix? yes
+
+Invalid inode number for '.' in directory inode 22.
+Fix? yes
+
+Pass 3: Checking directory connectivity
+'..' in /dir/fee (20) is /dir4 (15), should be /dir (12).
+Fix? yes
+
+'..' in /dir2/fie (21) is /dir4 (15), should be /dir2 (13).
+Fix? yes
+
+'..' in /dir3/foe (22) is /dir4 (15), should be /dir3 (14).
+Fix? yes
+
+Pass 4: Checking reference counts
+Inode 12 ref count is 4, should be 3. Fix? yes
+
+Inode 13 ref count is 4, should be 3. Fix? yes
+
+Inode 14 ref count is 4, should be 3. Fix? yes
+
+Inode 15 ref count is 0, should be 3. Fix? yes
+
+Inode 16 ref count is 1, should be 3. Fix? yes
+
+Inode 17 ref count is 1, should be 2. Fix? yes
+
+Inode 19 ref count is 1, should be 3. Fix? yes
+
+Pass 5: Checking group summary information
+
+test_filesys: ***** FILE SYSTEM WAS MODIFIED *****
+test_filesys: 23/32 files (0.0% non-contiguous), 35/100 blocks
+Exit status is 1
diff -r 365efb1a2299 -r 8bb7637c6508 tests/f_dup4/expect.2
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tests/f_dup4/expect.2 Tue Apr 10 22:55:09 2007 -0400
@@ -0,0 +1,7 @@
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+test_filesys: 23/32 files (0.0% non-contiguous), 35/100 blocks
+Exit status is 0
diff -r 365efb1a2299 -r 8bb7637c6508 tests/f_dup4/name
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tests/f_dup4/name Tue Apr 10 22:55:09 2007 -0400
@@ -0,0 +1,2 @@
+find all directdory pathnames
+
diff -r 365efb1a2299 -r 8bb7637c6508 tests/f_dup4/script
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tests/f_dup4/script Tue Apr 10 22:55:09 2007 -0400
@@ -0,0 +1,50 @@
+SKIP_GUNZIP="true"
+TEST_DATA="test.data"
+
+echo "/ Murphy Magic. The SeCrEt of the UnIvErSe is 43, NOT 42" > $TEST_DATA
+
+touch $TMPFILE
+$MKE2FS -N 32 -F -o Linux -b 1024 $TMPFILE 100 > /dev/null 2>&1
+$DEBUGFS -w $TMPFILE << EOF > /dev/null 2>&1
+set_current_time 200704102100
+set_super_value lastcheck 0
+set_super_value hash_seed null
+set_super_value mkfs_time 0
+mkdir dir
+mkdir dir2
+mkdir dir3
+mkdir dir4
+cd /dir
+write $TEST_DATA foo
+cd /dir2
+write $TEST_DATA bar
+cd /dir3
+write $TEST_DATA baz
+cd /dir4
+write $TEST_DATA quux
+mkdir /dir/fee
+mkdir /dir2/fie
+mkdir /dir3/foe
+mkdir /dir4/fum
+link /dir/foo /dir2/foo1
+link /dir/foo /dir3/foo2
+link /dir2/bar /dir3/bar1
+link /dir4/quux /dir/quux1
+link /dir4/quux /dir2/quux2
+set_inode_field /dir/foo block[0] 30
+set_inode_field /dir2/bar block[0] 30
+set_inode_field /dir3/baz block[0] 30
+set_inode_field /dir/fee block[0] 34
+set_inode_field /dir2/fie block[0] 34
+set_inode_field /dir3/foe block[0] 34
+q
+EOF
+
+E2FSCK_TIME=200704102100
+export E2FSCK_TIME
+
+. $cmd_dir/run_e2fsck
+
+rm -f $TEST_DATA
+
+unset E2FSCK_TIME TEST_DATA

2007-04-11 04:22:23

by Andreas Dilger

[permalink] [raw]
Subject: Re: [PATCH] e2fsprogs - pass1c terminates early if hard links

On Apr 10, 2007 23:40 -0400, Theodore Tso wrote:
> Also note that providing a
> script which generates the test filesystem is now preferred to
> including an image.gz file; it avoids the need for binary patches, and
> it makes it clearer how the test filesystem is constructed.

Agreed, though in some cases it is considerably easier to produce the
corruption via binary editing than debugfs commands unless there are
now facilities to copy blocks/bytes around within the filesystem?

> +find all directdory pathnames

Typo.

> diff -r 365efb1a2299 -r 8bb7637c6508 tests/f_dup4/script
> --- /dev/null Thu Jan 01 00:00:00 1970 +0000
> +++ b/tests/f_dup4/script Tue Apr 10 22:55:09 2007 -0400
> @@ -0,0 +1,50 @@
> +set_inode_field /dir/foo block[0] 30
> +set_inode_field /dir2/bar block[0] 30
> +set_inode_field /dir3/baz block[0] 30
> +set_inode_field /dir/fee block[0] 34
> +set_inode_field /dir2/fie block[0] 34
> +set_inode_field /dir3/foe block[0] 34

Also, items such as these presuppose that the directory has specific
blocks allocated to it, which need the test case to be constructed in
multiple passes (to extract these numbers) and could break at some time
in the future.


Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

2007-04-11 11:51:50

by Andreas Dilger

[permalink] [raw]
Subject: Re: [PATCH] e2fsprogs - pass1c terminates early if hard links

On Apr 11, 2007 06:42 -0400, Theodore Tso wrote:
> We don't currently have a way to modify indirect blocks
> directly (although I may create that soon or patches would be
> greatfully accepted :-), and of course there is a similar issue with
> extents and extended attributes blocks.

Note that as part of the i_extra_isize e2fsck support we are adding
functions to manage EAs to libext2fs that could be used as the basis
for this.

> At the moment I'm assuming that e2fsprogs block allocation algorithms
> (which are not as sophisticated as what's in the kernel) aren't going
> to be changing.

I thought as much myself.

> If they change, you're right, they could break the test.
> At the minimum I should document where the numbers are coming
> from in the test.

At least the test would break and any change that caused it could be
seen immediately.

> I could imagine a debugfs extension where we could do things like:
>
> set_var $shared_blk /dir4/quux block[0]
> set_inode_field /dir/foo block[0] $shared_blk
> set_inode_field /dir2/bar block[0] $shared_blk
> set_inode_field /dir3/baz block[0] $shared_blk
>
> Of course the temptation then would be to start adding conditions,
> functions, maybe an entire tcl interpreter.... :-)

Yes, I thought of this too, and also the "let's build a full interpreted
language into debugfs" conclusion. It wouldn't be fatal though - in
some cases it would be nice to have debugfs loop over inodes to fix some
unusual corruption.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

2007-04-11 10:42:35

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] e2fsprogs - pass1c terminates early if hard links

On Tue, Apr 10, 2007 at 10:22:20PM -0600, Andreas Dilger wrote:
> On Apr 10, 2007 23:40 -0400, Theodore Tso wrote:
> > Also note that providing a
> > script which generates the test filesystem is now preferred to
> > including an image.gz file; it avoids the need for binary patches, and
> > it makes it clearer how the test filesystem is constructed.
>
> Agreed, though in some cases it is considerably easier to produce the
> corruption via binary editing than debugfs commands unless there are
> now facilities to copy blocks/bytes around within the filesystem?

Yes, there are still some facilities that could be added to make life
more convenient. We don't currently have a way to modify indirect blocks
directly (although I may create that soon or patches would be
greatfully accepted :-), and of course there is a similar issue with
extents and extended attributes blocks.

A facility to set a specific range of bytes, or copy a range of bytes
from one block/offset to another (to simulate a disk controller
hiccup) is definitely not a bad idea. Maybe something like this?

write_data_bytes <block>/<offset> <hex string>
write_data_ascii <block>/<offset> <ascii string>
copy_data_bytes <block/offset> <block/offset> <count>

> > +set_inode_field /dir/foo block[0] 30
> > +set_inode_field /dir2/bar block[0] 30
> > +set_inode_field /dir3/baz block[0] 30
> > +set_inode_field /dir/fee block[0] 34
> > +set_inode_field /dir2/fie block[0] 34
> > +set_inode_field /dir3/foe block[0] 34
>
> Also, items such as these presuppose that the directory has specific
> blocks allocated to it, which need the test case to be constructed in
> multiple passes (to extract these numbers) and could break at some time
> in the future.

At the moment I'm assuming that e2fsprogs block allocation algorithms
(which are not as sophisticated as what's in the kernel) aren't going
to be changing. If they change, you're right, they could break the
test. At the minimum I should document where the numbers are coming
from in the test.

I could imagine a debugfs extension where we could do things like:

set_var $shared_blk /dir4/quux block[0]
set_inode_field /dir/foo block[0] $shared_blk
set_inode_field /dir2/bar block[0] $shared_blk
set_inode_field /dir3/baz block[0] $shared_blk

Of course the temptation then would be to start adding conditions,
functions, maybe an entire tcl interpreter.... :-)

- Ted

2007-04-11 12:57:08

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] e2fsprogs - pass1c terminates early if hard links

On Wed, Apr 11, 2007 at 05:51:47AM -0600, Andreas Dilger wrote:
> > Of course the temptation then would be to start adding conditions,
> > functions, maybe an entire tcl interpreter.... :-)
>
> Yes, I thought of this too, and also the "let's build a full interpreted
> language into debugfs" conclusion. It wouldn't be fatal though - in
> some cases it would be nice to have debugfs loop over inodes to fix some
> unusual corruption.

I have actually thought about the possibility of extending libss so
that it would optionally load the tcl shared library, just as today we
optionally load the readline library. The interfaces used by libss to
call individual command handlers and those used by tcl aren't actually
all that different.

If someone wanted to look into it, patches would certainly be
gratefully accepted. Having the ability to loop over inodes would
come in handy, and while you can do it to day by writing the script in
perl or bash and then using debugfs -R, (1) it's clumsy, and (2) it's
slow for big filesystems if you have to read the bitmaps each time.

Speaking of which, wasn't someone going to write patches that did lazy
bitmap loading for debugfs? I can't remember who said they would do
that. To whoever was going to do that --- one thought which I had ---
it would probably be a good idea to keep the current catastrophic
mode, and change it so that in catatrophic mode, bitmaps are not
loaded automatically, and an error message is returned right away.
There may be situations where you don't want debugfs to ever even try
loading the bitmaps (because they will fail after waiting a long
time), and so an immediate refusal to execute a particular command
requires the bitmaps to be loaded may be preferable.

- Ted

2007-04-20 12:15:05

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] e2fsprogs - e2fsck pass1c does extra work if root dir has shared blocks

On Tue, Apr 10, 2007 at 03:39:56PM -0700, Jim Garlick wrote:
>
> Another small bug I think: if the root directory contains shared
> blocks, e2fsck pass1c search_dirent_proc() will be looking for
> one more containing directory than it will ever find, and thus
> loses an opportunity to terminate early.
>
> Signed-off-by: Jim Garlick <[email protected]>

Thanks, applied.

- Ted