2011-11-18 02:52:48

by Robin Dong

[permalink] [raw]
Subject: [PATCH] e2fsck: fix "can't find dup_blk" error

From: Robin Dong <[email protected]>

After:
1. mke2fs -O ^has_journal,^resize_inode,^uninit_bg,extent,meta_bg,flex_bg,bigalloc /dev/sda
2. mount -t ext4 /dev/sda /test/
3. create file (8192K size, extent' e_len is 2) /test/1 ~ /test/10
4. use tool to change the /test/5 extent's e_len to 100, corrupt file
5. e2fsck -f /dev/sda

And it will report like:

Clone multiply-claimed blocks<y>? yes

clone_file_block: internal error: can't find dup_blk for 525345

clone_file_block: internal error: can't find dup_blk for 525346

clone_file_block: internal error: can't find dup_blk for 525347

clone_file_block: internal error: can't find dup_blk for 525348


In bigalloc filesystem, "blockcnt > 0" should change to "blockcnt >= cluster_ratio"

Signed-off-by: Robin Dong <[email protected]>
---
e2fsck/pass1b.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/e2fsck/pass1b.c b/e2fsck/pass1b.c
index d5585dd..a9cdd9e 100644
--- a/e2fsck/pass1b.c
+++ b/e2fsck/pass1b.c
@@ -703,7 +703,7 @@ static int clone_file_block(ext2_filsys fs,
if (check_if_fs_cluster(ctx, EXT2FS_B2C(fs, *block_nr)))
is_meta = 1;

- if (((blockcnt > 0) && c == cs->dup_cluster) ||
+ if (((blockcnt >= EXT2FS_CLUSTER_RATIO(fs)) && c == cs->dup_cluster) ||
ext2fs_test_block_bitmap2(ctx->block_dup_map, *block_nr)) {
n = dict_lookup(&clstr_dict,
INT_TO_VOIDPTR(EXT2FS_B2C(fs, *block_nr)));
--
1.7.3.2



2011-11-19 04:02:19

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] e2fsck: fix "can't find dup_blk" error

On Fri, Nov 18, 2011 at 10:52:38AM +0800, Robin Dong wrote:
> From: Robin Dong <[email protected]>
>
> After:
> 1. mke2fs -O ^has_journal,^resize_inode,^uninit_bg,extent,meta_bg,flex_bg,bigalloc /dev/sda
> 2. mount -t ext4 /dev/sda /test/
> 3. create file (8192K size, extent' e_len is 2) /test/1 ~ /test/10
> 4. use tool to change the /test/5 extent's e_len to 100, corrupt file
> 5. e2fsck -f /dev/sda

I can't replicate this. I presume it only shows up with your
modification to the bigalloc format?

- Ted

2011-11-19 08:32:02

by Robin Dong

[permalink] [raw]
Subject: Re: [PATCH] e2fsck: fix "can't find dup_blk" error

2011/11/19 Ted Ts'o <[email protected]>:
> On Fri, Nov 18, 2011 at 10:52:38AM +0800, Robin Dong wrote:
>> From: Robin Dong <[email protected]>
>>
>> After:
>> 1. mke2fs -O ^has_journal,^resize_inode,^uninit_bg,extent,meta_bg,flex_bg,bigalloc /dev/sda
>> 2. mount -t ext4 /dev/sda /test/
>> 3. create file (8192K size, extent' e_len is 2) /test/1 ~ /test/10
>> 4. use tool to change the /test/5 extent's e_len to 100, corrupt file
>> 5. e2fsck -f /dev/sda
>
> I can't replicate this. ?I presume it only shows up with your
> modification to the bigalloc format?
>
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?- Ted
>

Hi, Ted

I am not using my modification bigalloc format.
I use the upstream kernel (3.2-rc2) and uptodate e2fsprogs master
branch code, the error report will emerge after my steps above.
My /dev/sda is small (about 100G), so I guess:

# create /test/1 ~ /test/100 (e_len == 2)
# make /test/5 's e_len to 1024

may replicate this problem on larger disk.


--
--
Best Regard
Robin Dong

2011-11-28 17:02:30

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] e2fsck: fix "can't find dup_blk" error

Ok, I was able to replicate it, and created a small test case (see
attached), but it turns out your patch wasn't sufficient. It didn't
actually fix the problem correctly, and left the file system in a
corrupted state, which was seen when we ran e2fsck a second time and
the i_blocks field was not correctly.

Your patch also resulted in blocks within a logical cluster
incorrectly assigned to two physical clusters. (I still need to add
some code in e2fsck to notice this particular corruption, but it's
highly unlikely for this to happen due to hardware errors --- just
kernel/e2fsck bugs.)

I'll attach to this mail thread the patches I used to fix this, plus
the test case.

- Ted


Attachments:
(No filename) (719.00 B)
image.gz (4.56 kB)
Download all attachments

2011-11-28 17:15:46

by Theodore Ts'o

[permalink] [raw]
Subject: [PATCH 1/2] e2fsck: fix handling of duplicate blocks with bigalloc file systems

We need to do an accounting of duplicate clusters on a per-cluster
instead of a per-block basis so we know when we've correctly accounted
for all of the multiply claimed blocks in a particular inode.

Thanks to Robin Dong for reporting this bug.

Signed-off-by: "Theodore Ts'o" <[email protected]>
---
e2fsck/pass1b.c | 55 ++++++++++++++++++++++++-------------------------------
1 files changed, 24 insertions(+), 31 deletions(-)

diff --git a/e2fsck/pass1b.c b/e2fsck/pass1b.c
index d5585dd..f7ce8e4 100644
--- a/e2fsck/pass1b.c
+++ b/e2fsck/pass1b.c
@@ -259,6 +259,7 @@ struct process_block_struct {
e2fsck_t ctx;
ext2_ino_t ino;
int dup_blocks;
+ blk64_t cur_cluster;
struct ext2_inode *inode;
struct problem_context *pctx;
};
@@ -310,6 +311,7 @@ static void pass1b(e2fsck_t ctx, char *block_buf)
pb.ino = ino;
pb.dup_blocks = 0;
pb.inode = &inode;
+ pb.cur_cluster = ~0;

if (ext2fs_inode_has_valid_blocks2(fs, &inode) ||
(ino == EXT2_BAD_INO))
@@ -339,21 +341,23 @@ static void pass1b(e2fsck_t ctx, char *block_buf)

static int process_pass1b_block(ext2_filsys fs EXT2FS_ATTR((unused)),
blk64_t *block_nr,
- e2_blkcnt_t blockcnt EXT2FS_ATTR((unused)),
+ e2_blkcnt_t blockcnt,
blk64_t ref_blk EXT2FS_ATTR((unused)),
int ref_offset EXT2FS_ATTR((unused)),
void *priv_data)
{
struct process_block_struct *p;
e2fsck_t ctx;
+ blk64_t lc;

if (HOLE_BLKADDR(*block_nr))
return 0;
p = (struct process_block_struct *) priv_data;
ctx = p->ctx;
+ lc = EXT2FS_B2C(fs, blockcnt);

if (!ext2fs_test_block_bitmap2(ctx->block_dup_map, *block_nr))
- return 0;
+ goto finish;

/* OK, this is a duplicate block */
if (p->ino != EXT2_BAD_INO) {
@@ -363,8 +367,11 @@ static int process_pass1b_block(ext2_filsys fs EXT2FS_ATTR((unused)),
p->dup_blocks++;
ext2fs_mark_inode_bitmap2(inode_dup_map, p->ino);

- add_dupe(ctx, p->ino, EXT2FS_B2C(fs, *block_nr), p->inode);
+ if (lc != p->cur_cluster)
+ add_dupe(ctx, p->ino, EXT2FS_B2C(fs, *block_nr), p->inode);

+finish:
+ p->cur_cluster = lc;
return 0;
}

@@ -571,7 +578,7 @@ static void decrement_badcount(e2fsck_t ctx, blk64_t block,

static int delete_file_block(ext2_filsys fs,
blk64_t *block_nr,
- e2_blkcnt_t blockcnt EXT2FS_ATTR((unused)),
+ e2_blkcnt_t blockcnt,
blk64_t ref_block EXT2FS_ATTR((unused)),
int ref_offset EXT2FS_ATTR((unused)),
void *priv_data)
@@ -580,7 +587,7 @@ static int delete_file_block(ext2_filsys fs,
struct dup_cluster *p;
dnode_t *n;
e2fsck_t ctx;
- blk64_t c;
+ blk64_t c, lc;

pb = (struct process_block_struct *) priv_data;
ctx = pb->ctx;
@@ -589,11 +596,13 @@ static int delete_file_block(ext2_filsys fs,
return 0;

c = EXT2FS_B2C(fs, *block_nr);
+ lc = EXT2FS_B2C(fs, blockcnt);
if (ext2fs_test_block_bitmap2(ctx->block_dup_map, *block_nr)) {
n = dict_lookup(&clstr_dict, INT_TO_VOIDPTR(c));
if (n) {
p = (struct dup_cluster *) dnode_get(n);
- decrement_badcount(ctx, *block_nr, p);
+ if (lc != pb->cur_cluster)
+ decrement_badcount(ctx, *block_nr, p);
} else
com_err("delete_file_block", 0,
_("internal error: can't find dup_blk for %llu\n"),
@@ -603,6 +612,7 @@ static int delete_file_block(ext2_filsys fs,
ext2fs_block_alloc_stats2(fs, *block_nr, -1);
pb->dup_blocks++;
}
+ pb->cur_cluster = lc;

return 0;
}
@@ -621,6 +631,7 @@ static void delete_file(e2fsck_t ctx, ext2_ino_t ino,
pb.dup_blocks = 0;
pb.ctx = ctx;
pctx.str = "delete_file";
+ pb.cur_cluster = ~0;

e2fsck_read_inode(ctx, ino, &inode, "delete_file");
if (ext2fs_inode_has_valid_blocks2(fs, &inode))
@@ -703,8 +714,12 @@ static int clone_file_block(ext2_filsys fs,
if (check_if_fs_cluster(ctx, EXT2FS_B2C(fs, *block_nr)))
is_meta = 1;

- if (((blockcnt > 0) && c == cs->dup_cluster) ||
- ext2fs_test_block_bitmap2(ctx->block_dup_map, *block_nr)) {
+ if (c == cs->dup_cluster && cs->alloc_block) {
+ new_block = cs->alloc_block;
+ goto got_block;
+ }
+
+ if (ext2fs_test_block_bitmap2(ctx->block_dup_map, *block_nr)) {
n = dict_lookup(&clstr_dict,
INT_TO_VOIDPTR(EXT2FS_B2C(fs, *block_nr)));
if (!n) {
@@ -718,28 +733,6 @@ static int clone_file_block(ext2_filsys fs,
if (!is_meta)
decrement_badcount(ctx, *block_nr, p);

- if (p->num_bad == 0 && !is_meta) {
- /*
- * Normally num_bad never gets to zero; but in
- * the case of bigalloc file systems, we don't
- * how many blocks might be in use by a
- * particular inode. So we may end up
- * relocating the cluster even though this
- * inode is the last user of the cluster. In
- * that case, since we've already moved some
- * of the blocks of that cluster, we'll
- * complete the relocation and free the
- * original cluster here.
- */
- ext2fs_unmark_block_bitmap2(ctx->block_found_map,
- *block_nr);
- ext2fs_block_alloc_stats2(fs, *block_nr, -1);
- }

2011-11-28 17:15:51

by Theodore Ts'o

[permalink] [raw]
Subject: [PATCH 2/2] tests: add test case for multiply claimed blocks with bigalloc

Signed-off-by: "Theodore Ts'o" <[email protected]>
---
tests/f_dup_ba/expect.1 | 69 +++++++++++++++++++++++++++++++++++++++++++++++
tests/f_dup_ba/expect.2 | 7 +++++
tests/f_dup_ba/image.gz | Bin 0 -> 4668 bytes
tests/f_dup_ba/name | 1 +
4 files changed, 77 insertions(+), 0 deletions(-)
create mode 100644 tests/f_dup_ba/expect.1
create mode 100644 tests/f_dup_ba/expect.2
create mode 100644 tests/f_dup_ba/image.gz
create mode 100644 tests/f_dup_ba/name

diff --git a/tests/f_dup_ba/expect.1 b/tests/f_dup_ba/expect.1
new file mode 100644
index 0000000..f0ad457
--- /dev/null
+++ b/tests/f_dup_ba/expect.1
@@ -0,0 +1,69 @@
+Pass 1: Checking inodes, blocks, and sizes
+Inode 16, i_size is 8192, should be 409600. Fix? yes
+
+Inode 16, i_blocks is 128, should be 896. Fix? yes
+
+
+Running additional passes to resolve blocks claimed by more than one inode...
+Pass 1B: Rescanning for multiply-claimed blocks
+Multiply-claimed block(s) in inode 16: 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239
+Multiply-claimed block(s) in inode 17: 160 161
+Multiply-claimed block(s) in inode 18: 176 177
+Multiply-claimed block(s) in inode 19: 192 193
+Multiply-claimed block(s) in inode 20: 208 209
+Multiply-claimed block(s) in inode 21: 224 225
+Pass 1C: Scanning directories for inodes with multiply-claimed blocks
+Pass 1D: Reconciling multiply-claimed blocks
+(There are 6 inodes containing multiply-claimed blocks.)
+
+File /5 (inode #16, mod time Sun Nov 27 04:39:31 2011)
+ has 5 multiply-claimed block(s), shared with 5 file(s):
+ /10 (inode #21, mod time Sun Nov 27 04:39:36 2011)
+ /9 (inode #20, mod time Sun Nov 27 04:39:35 2011)
+ /8 (inode #19, mod time Sun Nov 27 04:39:34 2011)
+ /7 (inode #18, mod time Sun Nov 27 04:39:33 2011)
+ /6 (inode #17, mod time Sun Nov 27 04:39:32 2011)
+Clone multiply-claimed blocks? yes
+
+File /6 (inode #17, mod time Sun Nov 27 04:39:32 2011)
+ has 1 multiply-claimed block(s), shared with 1 file(s):
+ /5 (inode #16, mod time Sun Nov 27 04:39:31 2011)
+Multiply-claimed blocks already reassigned or cloned.
+
+File /7 (inode #18, mod time Sun Nov 27 04:39:33 2011)
+ has 1 multiply-claimed block(s), shared with 1 file(s):
+ /5 (inode #16, mod time Sun Nov 27 04:39:31 2011)
+Multiply-claimed blocks already reassigned or cloned.
+
+File /8 (inode #19, mod time Sun Nov 27 04:39:34 2011)
+ has 1 multiply-claimed block(s), shared with 1 file(s):
+ /5 (inode #16, mod time Sun Nov 27 04:39:31 2011)
+Multiply-claimed blocks already reassigned or cloned.
+
+File /9 (inode #20, mod time Sun Nov 27 04:39:35 2011)
+ has 1 multiply-claimed block(s), shared with 1 file(s):
+ /5 (inode #16, mod time Sun Nov 27 04:39:31 2011)
+Multiply-claimed blocks already reassigned or cloned.
+
+File /10 (inode #21, mod time Sun Nov 27 04:39:36 2011)
+ has 1 multiply-claimed block(s), shared with 1 file(s):
+ /5 (inode #16, mod time Sun Nov 27 04:39:31 2011)
+Multiply-claimed blocks already reassigned or cloned.
+
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+Block bitmap differences: +240
+Fix? yes
+
+Free blocks count wrong for group #0 (49, counted=43).
+Fix? yes
+
+Free blocks count wrong (784, counted=688).
+Fix? yes
+
+
+test_filesys: ***** FILE SYSTEM WAS MODIFIED *****
+test_filesys: 21/64 files (0.0% non-contiguous), 336/1024 blocks
+Exit status is 1
diff --git a/tests/f_dup_ba/expect.2 b/tests/f_dup_ba/expect.2
new file mode 100644
index 0000000..ec04450
--- /dev/null
+++ b/tests/f_dup_ba/expect.2
@@ -0,0 +1,7 @@
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+test_filesys: 21/64 files (4.8% non-contiguous), 336/1024 blocks
+Exit status is 0
diff --git a/tests/f_dup_ba/image.gz b/tests/f_dup_ba/image.gz
new file mode 100644
index 0000000000000000000000000000000000000000..4b318638b917583b3613f78bc6aac6b5e4e2f6ef
GIT binary patch
literal 4668
zc-oWi=3r=Jz3j)t{PxD#?2vE?_6NoOi+qiPjW?FAjaVz+xH2%>KJ24>N7D7C1Oe8-
zg`v}Z7uGuzCN5m9x9R1%)bLk+cV_lxo9nq8Z58AaJyxFfPITe|_diV^D<ABAe{yf}
zbMx;$7I&;Q8RxIw`7h08sqICpWX-czeivW4budF|>CuO;|LxyqbAQ)9oA{-1@0TAB
zU6pkI@{Q|z!b87@NzIwG&0V(FeAS|5{l5P#|K5N2*L(ZEnBw=R?5D^5mAM}uWASm}
z`mg5>FP-}B@Xq@B_{y@+58fS~{(JuRSo!LoR`-@?O|3P&`*KUYEdzsH<G<Q}zgC<L
z++6s5K709c1_p)$8ZZAwl>-?GDL+|R7{~?<LVNEW+r<oIEk32MK)yK)dwQ2&pZ0wA
zuNgc0`Hn6#J~MxZ-I|9JH*T?bA)INnZL<F=lYf^gL$Bz?OMl?Y`*MB$&a2n<|4I7m
zxq6NL<HMKl-Mo~)@utYt`0uk;h3`<?AEj!4{Gf8H<m~tVzdrE&8T!O5bXNTyga5ai
z{>bd~o%(FP{+ath6YRx6Y<;n3`F|OHatGFnJgX0%U@r$0jQfAT_2+G_PuE@Yf|vdb
z=l;B1{K@*>JIlO2=ZidhuW{nPQqgR!v-@Lb?BA@gf9LmE72SV!xBfI<f0CUXzq^I^
zWi8h#kMAjdcT9e*MEzv7&ohhd{~Hy5y3@Lu`)_jd^{~ws_pds>d-3|!$Ez3lUwypy
zqV?Aw?Jq8W{c-%o>(?Lmy@<d5_;2%`hd&&?@CGhlR%E*SbU6jF3IZPtw=BA~^3>fI
zmRYG<&-1SD0i~J~FJ~};<o@sj$%=nu__pDCw(Hj5sk<*0UpW<W?(VuuMxfZl%Z98#
uxd%Vx;f9ZbQ7{Td!6+C7qhJ(_f>AIEM!_f;1p@@W*s_Nl(F1m?9T)(+mS8Uc

literal 0
Hc-jL100001

diff --git a/tests/f_dup_ba/name b/tests/f_dup_ba/name
new file mode 100644
index 0000000..a1a8b80
--- /dev/null
+++ b/tests/f_dup_ba/name
@@ -0,0 +1 @@
+multiply claimed blocks with bigalloc
--
1.7.4.1.22.gec8e1.dirty