2008-02-02 07:59:47

by Andreas Dilger

Subject: [PATCH][0/28] Lustre e2fsprogs patch series

The following series of emails contains most of the e2fsprogs patch
series that is used for Lustre. It does not contain the regression
tests for the EXTENTS or DIR_NLINK features, as those are very large
and were previously submitted.

A full tarball that includes the patches, series, and regression tests
will be uploaded to ftp://ftp.lustre.org/pub/lustre/other/e2fsprogs/

Patch series:

e2fsprogs-specdotin.patch
e2fsprogs-eacheck.patch
e2fsprogs-extended_ops.patch
e2fsprogs-tests-f_unsorted_EAs.patch
e2fsprogs-tests-f_ea_checks.patch
e2fsprogs-nlinks.patch
e2fsprogs-extents.patch
e2fsprogs-config-before-cmdline.patch
e2fsprogs-SLES10--m-support.patch
e2fsprogs-uninit.patch
e2fsprogs-nlinks-flag.patch
e2fsprogs-expand-extra-isize.patch
e2fsprogs-tests-f_expisize.patch
e2fsprogs-tests-f_expisize_ea_del.patch
e2fsprogs-ibadness-counter.patch
e2fsprogs-tests-f_ibadness.patch
e2fsprogs-tests-f_ibadness_bad_extents.patch
e2fsprogs-tests-f_random_corruption.patch
e2fsprogs-stride_option.patch
e2fsprogs-mmp.patch
e2fsprogs-journal_chksum.patch
e2fsprogs-tests-f_jchksum_bblk.patch
e2fsprogs-tests-f_jchksum_blast_trans.patch
e2fsprogs-tests-f_jchksum_remount.patch
e2fsprogs-i_size-corruption.patch
e2fsprogs-fiemap.patch
e2fsprogs-debugfs-supported_features.patch
e2fsprogs-lts-make_rpms.patch

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


2008-02-02 08:14:16

by Andreas Dilger

Subject: [PATCH][1/28] e2fsprogs-specdotin.patch

Add the distro type to the RPM release number, so that it is
possible to release multiple distro packages without having conflicting
RPM package names.

Allow the RPM built from upstream to replace the split packages provided
by the distros. At some point in the future it may be desirable to also
split the RPM built by this spec file, but this is complicated by the
fact that SLES and RHEL have different splits.

Signed-off-by: Girish Shilamkar
Signed-off-by: Andreas Dilger

Index: e2fsprogs-1.40.5/e2fsprogs.spec.in
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsprogs.spec.in
+++ e2fsprogs-1.40.5/e2fsprogs.spec.in
@@ -6,13 +6,22 @@
Summary: Utilities for managing the second extended (ext2) filesystem.
Name: e2fsprogs
Version: @E2FSPROGS_PKGVER@
-Release: 0
+Release: 0%{_vendor}
License: GPLv2
Group: System Environment/Base
Source: ftp://download.sourceforge.net/pub/sourceforge/e2fsprogs/e2fsprogs-%{version}.tar.gz
Url: http://e2fsprogs.sourceforge.net/
Prereq: /sbin/ldconfig
BuildRoot: %{_tmppath}/%{name}-root
+%if "%{_vendor}" == "suse"
+Group: System/Filesystems
+Provides: e2fsbn ext2fs libcom_err = %{version}
+Obsoletes: ext2fs libcom_err < %{version}
+%else
+Group: System Environment/Base
+Obsoletes: e2fsprogs-libs < %{version}
+Provides: e2fsprogs-libs = %{version}
+%endif

%description
The e2fsprogs package contains a number of utilities for creating,

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2008-02-02 08:16:36

by Andreas Dilger

Subject: [PATCH] [2/28] e2fsprogs-eacheck.patch

Verify in-inode EA structure.
Allow in-inode EAs to have a checksum.
Connect zero-length inodes that have an EA to lost+found.
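
The in-inode hash rule can be sketched in isolation as follows. This is a minimal illustration, not the patch itself: `struct demo_ea_entry`, `demo_ea_hash()` and `demo_in_inode_hash_ok()` are hypothetical names, the struct is a simplified stand-in for `struct ext2_ext_attr_entry`, attribute names are assumed to be ASCII, and the value words are assumed to be in CPU byte order already (the patch converts them from little-endian first).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NAME_HASH_SHIFT  5
#define VALUE_HASH_SHIFT 16

/* Hypothetical, simplified stand-in for struct ext2_ext_attr_entry. */
struct demo_ea_entry {
	uint32_t    e_hash;        /* hash stored on disk (0 = not set) */
	size_t      e_name_len;    /* length of e_name in bytes */
	const char *e_name;        /* attribute name, assumed ASCII */
	size_t      e_value_words; /* padded value length in 32-bit words */
};

/* Rotate-and-xor hash over the name bytes, then over the value words.
 * The value words are assumed to be in CPU byte order already. */
static uint32_t demo_ea_hash(const struct demo_ea_entry *entry,
			     const uint32_t *value)
{
	uint32_t hash = 0;
	size_t n;

	assert(entry != NULL);
	for (n = 0; n < entry->e_name_len; n++)
		hash = (hash << NAME_HASH_SHIFT) ^
		       (hash >> (32 - NAME_HASH_SHIFT)) ^
		       (uint32_t)(unsigned char)entry->e_name[n];

	for (n = 0; n < entry->e_value_words; n++)
		hash = (hash << VALUE_HASH_SHIFT) ^
		       (hash >> (32 - VALUE_HASH_SHIFT)) ^
		       value[n];

	return hash;
}

/* In-inode rule: a stored hash of 0 is accepted (older inodes),
 * otherwise it must match the recomputed value. */
static int demo_in_inode_hash_ok(const struct demo_ea_entry *entry,
				 const uint32_t *value)
{
	return entry->e_hash == 0 ||
	       entry->e_hash == demo_ea_hash(entry, value);
}
```

A caller would fill in one entry per attribute and treat a zero return from `demo_in_inode_hash_ok()` as the condition under which e2fsck raises PR_1_ATTR_HASH. For EA blocks the patch is stricter and does not accept a stored hash of 0.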

Signed-off-by: Andreas Dilger

Index: e2fsprogs-1.40.5/e2fsck/e2fsck.h
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/e2fsck.h
+++ e2fsprogs-1.40.5/e2fsck/e2fsck.h
@@ -478,6 +478,9 @@ extern void init_resource_track(struct r
extern int inode_has_valid_blocks(struct ext2_inode *inode);
extern void e2fsck_read_inode(e2fsck_t ctx, unsigned long ino,
struct ext2_inode * inode, const char * proc);
+extern void e2fsck_read_inode_full(e2fsck_t ctx, unsigned long ino,
+ struct ext2_inode *inode,
+ const int bufsize, const char *proc);
extern void e2fsck_write_inode(e2fsck_t ctx, unsigned long ino,
struct ext2_inode * inode, const char * proc);
extern void e2fsck_write_inode_full(e2fsck_t ctx, unsigned long ino,
Index: e2fsprogs-1.40.5/e2fsck/pass1.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/pass1.c
+++ e2fsprogs-1.40.5/e2fsck/pass1.c
@@ -264,6 +264,7 @@ static void check_ea_in_inode(e2fsck_t c
remain = storage_size - sizeof(__u32);

while (!EXT2_EXT_IS_LAST_ENTRY(entry)) {
+ __u32 hash;

/* header eats this space */
remain -= sizeof(struct ext2_ext_attr_entry);
@@ -291,9 +292,12 @@ static void check_ea_in_inode(e2fsck_t c
problem = PR_1_ATTR_VALUE_BLOCK;
goto fix;
}
-
- /* e_hash must be 0 in inode's ea */
- if (entry->e_hash != 0) {
+
+ hash = ext2fs_ext_attr_hash_entry(entry,
+ start + entry->e_value_offs);
+
+ /* e_hash may be 0 in older inode's ea */
+ if (entry->e_hash != 0 && entry->e_hash != hash) {
pctx->num = entry->e_hash;
problem = PR_1_ATTR_HASH;
goto fix;
@@ -308,15 +312,12 @@ fix:
* it seems like a corruption. it's very unlikely we could repair
* EA(s) in automatic fashion -bzzz
*/
-#if 0
- problem = PR_1_ATTR_HASH;
-#endif
if (problem == 0 || !fix_problem(ctx, problem, pctx))
return;

- /* simple remove all possible EA(s) */
+ /* simply remove all remaining EA(s) */
*((__u32 *)start) = 0UL;
- e2fsck_write_inode_full(ctx, pctx->ino, pctx->inode,
+ e2fsck_write_inode_full(ctx, pctx->ino,(struct ext2_inode *)pctx->inode,
EXT2_INODE_SIZE(sb), "pass1");
}

@@ -1360,10 +1361,13 @@ static int check_ext_attr(e2fsck_t ctx,
entry = (struct ext2_ext_attr_entry *)(header+1);
end = block_buf + fs->blocksize;
while ((char *)entry < end && *(__u32 *)entry) {
+ __u32 hash;
+
if (region_allocate(region, (char *)entry - (char *)header,
EXT2_EXT_ATTR_LEN(entry->e_name_len))) {
if (fix_problem(ctx, PR_1_EA_ALLOC_COLLISION, pctx))
goto clear_extattr;
+ break;
}
if ((ctx->ext_attr_ver == 1 &&
(entry->e_name_len == 0 || entry->e_name_index != 0)) ||
@@ -1371,6 +1375,7 @@ static int check_ext_attr(e2fsck_t ctx,
entry->e_name_index == 0)) {
if (fix_problem(ctx, PR_1_EA_BAD_NAME, pctx))
goto clear_extattr;
+ break;
}
if (entry->e_value_block != 0) {
if (fix_problem(ctx, PR_1_EA_BAD_VALUE, pctx))
@@ -1387,6 +1392,17 @@ static int check_ext_attr(e2fsck_t ctx,
if (fix_problem(ctx, PR_1_EA_ALLOC_COLLISION, pctx))
goto clear_extattr;
}
+
+ hash = ext2fs_ext_attr_hash_entry(entry, block_buf +
+ entry->e_value_offs);
+
+ if (entry->e_hash != hash) {
+ pctx->num = entry->e_hash;
+ if (fix_problem(ctx, PR_1_ATTR_HASH, pctx))
+ goto clear_extattr;
+ entry->e_hash = hash;
+ }
+
entry = EXT2_EXT_ATTR_NEXT(entry);
}
if (region_allocate(region, (char *)entry - (char *)header, 4)) {
@@ -1508,8 +1524,11 @@ static void check_blocks(e2fsck_t ctx, s
}
}

- if (inode->i_file_acl && check_ext_attr(ctx, pctx, block_buf))
+ if (inode->i_file_acl && check_ext_attr(ctx, pctx, block_buf)) {
+ if (ctx->flags & E2F_FLAG_SIGNAL_MASK)
+ goto out;
pb.num_blocks++;
+ }

if (ext2fs_inode_has_valid_blocks(inode))
pctx->errcode = ext2fs_block_iterate2(fs, ino,
Index: e2fsprogs-1.40.5/e2fsck/pass4.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/pass4.c
+++ e2fsprogs-1.40.5/e2fsck/pass4.c
@@ -15,6 +15,7 @@

#include "e2fsck.h"
#include "problem.h"
+#include <ext2fs/ext2_ext_attr.h>

/*
* This routine is called when an inode is not connected to the
@@ -23,31 +24,41 @@
* This subroutine returns 1 then the caller shouldn't bother with the
* rest of the pass 4 tests.
*/
-static int disconnect_inode(e2fsck_t ctx, ext2_ino_t i)
+static int disconnect_inode(e2fsck_t ctx, ext2_ino_t i,
+ struct ext2_inode *inode)
{
ext2_filsys fs = ctx->fs;
- struct ext2_inode inode;
struct problem_context pctx;
+ __u32 eamagic = 0;
+ int extra_size = 0;

- e2fsck_read_inode(ctx, i, &inode, "pass4: disconnect_inode");
+ if (EXT2_INODE_SIZE(fs->super) > EXT2_GOOD_OLD_INODE_SIZE) {
+ e2fsck_read_inode_full(ctx, i, inode,EXT2_INODE_SIZE(fs->super),
+ "pass4: disconnect_inode");
+ extra_size = ((struct ext2_inode_large *)inode)->i_extra_isize;
+ } else {
+ e2fsck_read_inode(ctx, i, inode, "pass4: disconnect_inode");
+ }
clear_problem_context(&pctx);
pctx.ino = i;
- pctx.inode = &inode;
+ pctx.inode = inode;

+ if (EXT2_INODE_SIZE(fs->super) -EXT2_GOOD_OLD_INODE_SIZE -extra_size >0)
+ eamagic = *(__u32 *)(((char *)inode) +EXT2_GOOD_OLD_INODE_SIZE +
+ extra_size);
/*
* Offer to delete any zero-length files that does not have
* blocks. If there is an EA block, it might have useful
* information, so we won't prompt to delete it, but let it be
* reconnected to lost+found.
*/
- if (!inode.i_blocks && (LINUX_S_ISREG(inode.i_mode) ||
- LINUX_S_ISDIR(inode.i_mode))) {
+ if (!inode->i_blocks && eamagic != EXT2_EXT_ATTR_MAGIC &&
+ (LINUX_S_ISREG(inode->i_mode) || LINUX_S_ISDIR(inode->i_mode))) {
if (fix_problem(ctx, PR_4_ZERO_LEN_INODE, &pctx)) {
ext2fs_icount_store(ctx->inode_link_info, i, 0);
- inode.i_links_count = 0;
- inode.i_dtime = ctx->now;
- e2fsck_write_inode(ctx, i, &inode,
- "disconnect_inode");
+ inode->i_links_count = 0;
+ inode->i_dtime = ctx->now;
+ e2fsck_write_inode(ctx, i, inode, "disconnect_inode");
/*
* Fix up the bitmaps...
*/
@@ -55,7 +66,7 @@ static int disconnect_inode(e2fsck_t ctx
ext2fs_unmark_inode_bitmap(ctx->inode_used_map, i);
ext2fs_unmark_inode_bitmap(ctx->inode_dir_map, i);
ext2fs_inode_alloc_stats2(fs, i, -1,
- LINUX_S_ISDIR(inode.i_mode));
+ LINUX_S_ISDIR(inode->i_mode));
return 0;
}
}
@@ -83,7 +94,7 @@ void e2fsck_pass4(e2fsck_t ctx)
{
ext2_filsys fs = ctx->fs;
ext2_ino_t i;
- struct ext2_inode inode;
+ struct ext2_inode *inode;
#ifdef RESOURCE_TRACK
struct resource_track rtrack;
#endif
@@ -111,6 +122,9 @@ void e2fsck_pass4(e2fsck_t ctx)
if ((ctx->progress)(ctx, 4, 0, maxgroup))
return;

+ inode = e2fsck_allocate_memory(ctx, EXT2_INODE_SIZE(fs->super),
+ "scratch inode");
+
/* Protect loop from wrap-around if s_inodes_count maxed */
for (i=1; i <= fs->super->s_inodes_count && i > 0; i++) {
if (ctx->flags & E2F_FLAG_SIGNAL_MASK)
@@ -138,7 +152,7 @@ void e2fsck_pass4(e2fsck_t ctx)
fs->blocksize, "bad_inode buffer");
if (e2fsck_process_bad_inode(ctx, 0, i, buf))
continue;
- if (disconnect_inode(ctx, i))
+ if (disconnect_inode(ctx, i, inode))
continue;
ext2fs_icount_fetch(ctx->inode_link_info, i,
&link_count);
@@ -146,18 +160,18 @@ void e2fsck_pass4(e2fsck_t ctx)
&link_counted);
}
if (link_counted != link_count) {
- e2fsck_read_inode(ctx, i, &inode, "pass4");
+ e2fsck_read_inode(ctx, i, inode, "pass4");
pctx.ino = i;
- pctx.inode = &inode;
- if (link_count != inode.i_links_count) {
+ pctx.inode = inode;
+ if (link_count != inode->i_links_count) {
pctx.num = link_count;
fix_problem(ctx,
PR_4_INCONSISTENT_COUNT, &pctx);
}
pctx.num = link_counted;
if (fix_problem(ctx, PR_4_BAD_REF_COUNT, &pctx)) {
- inode.i_links_count = link_counted;
- e2fsck_write_inode(ctx, i, &inode, "pass4");
+ inode->i_links_count = link_counted;
+ e2fsck_write_inode(ctx, i, inode, "pass4");
}
}
}
@@ -170,6 +184,8 @@ void e2fsck_pass4(e2fsck_t ctx)
errout:
if (buf)
ext2fs_free_mem(&buf);
+
+ ext2fs_free_mem(&inode);
#ifdef RESOURCE_TRACK
if (ctx->options & E2F_OPT_TIME2) {
e2fsck_clear_progbar(ctx);
Index: e2fsprogs-1.40.5/e2fsck/problem.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/problem.c
+++ e2fsprogs-1.40.5/e2fsck/problem.c
@@ -776,7 +776,7 @@ static struct e2fsck_problem problem_tab

/* invalid ea entry->e_hash */
{ PR_1_ATTR_HASH,
- N_("@a in @i %i has a hash (%N) which is @n (must be 0)\n"),
+ N_("@a in @i %i has a hash (%N) which is @n\n"),
PROMPT_CLEAR, PR_PREEN_OK },

/* inode appears to be a directory */
Index: e2fsprogs-1.40.5/e2fsck/util.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/util.c
+++ e2fsprogs-1.40.5/e2fsck/util.c
@@ -393,6 +393,20 @@ void e2fsck_read_inode(e2fsck_t ctx, uns
}
}

+void e2fsck_read_inode_full(e2fsck_t ctx, unsigned long ino,
+ struct ext2_inode *inode, int bufsize,
+ const char *proc)
+{
+ int retval;
+
+ retval = ext2fs_read_inode_full(ctx->fs, ino, inode, bufsize);
+ if (retval) {
+ com_err("ext2fs_read_inode_full", retval,
+ _("while reading inode %ld in %s"), ino, proc);
+ fatal_error(ctx, 0);
+ }
+}
+
extern void e2fsck_write_inode_full(e2fsck_t ctx, unsigned long ino,
struct ext2_inode * inode, int bufsize,
const char *proc)
Index: e2fsprogs-1.40.5/lib/ext2fs/ext2_ext_attr.h
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/ext2_ext_attr.h
+++ e2fsprogs-1.40.5/lib/ext2fs/ext2_ext_attr.h
@@ -30,7 +30,7 @@ struct ext2_ext_attr_entry {
__u32 e_value_block; /* disk block attribute is stored on (n/i) */
__u32 e_value_size; /* size of attribute value */
__u32 e_hash; /* hash value of name and value */
-#if 0
+#if 1
char e_name[0]; /* attribute name */
#endif
};
Index: e2fsprogs-1.40.5/lib/ext2fs/ext2fs.h
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/ext2fs.h
+++ e2fsprogs-1.40.5/lib/ext2fs/ext2fs.h
@@ -75,10 +75,12 @@ typedef __u32 ext2_dirhash_t;
#include "com_err.h"
#include "ext2_io.h"
#include "ext2_err.h"
+#include "ext2_ext_attr.h"
#else
#include <et/com_err.h>
#include <ext2fs/ext2_io.h>
#include <ext2fs/ext2_err.h>
+#include <ext2fs/ext2_ext_attr.h>
#endif

/*
@@ -711,6 +713,8 @@ extern errcode_t ext2fs_dup_handle(ext2_
extern errcode_t ext2fs_expand_dir(ext2_filsys fs, ext2_ino_t dir);

/* ext_attr.c */
+extern __u32 ext2fs_ext_attr_hash_entry(struct ext2_ext_attr_entry *entry,
+ void *data);
extern errcode_t ext2fs_read_ext_attr(ext2_filsys fs, blk_t block, void *buf);
extern errcode_t ext2fs_write_ext_attr(ext2_filsys fs, blk_t block,
void *buf);
@@ -961,6 +965,10 @@ extern errcode_t ext2fs_create_resize_in
/* swapfs.c */
extern void ext2fs_swap_ext_attr(char *to, char *from, int bufsize,
int has_header);
+extern void ext2fs_swap_ext_attr_header(struct ext2_ext_attr_header *to_header,
+ struct ext2_ext_attr_header *from_hdr);
+extern void ext2fs_swap_ext_attr_entry(struct ext2_ext_attr_entry *to_entry,
+ struct ext2_ext_attr_entry *from_entry);
extern void ext2fs_swap_super(struct ext2_super_block * super);
extern void ext2fs_swap_group_desc(struct ext2_group_desc *gdp);
extern void ext2fs_swap_inode_full(ext2_filsys fs, struct ext2_inode_large *t,
Index: e2fsprogs-1.40.5/lib/ext2fs/ext_attr.c
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/ext_attr.c
+++ e2fsprogs-1.40.5/lib/ext2fs/ext_attr.c
@@ -23,6 +23,43 @@

#include "ext2fs.h"

+#define NAME_HASH_SHIFT 5
+#define VALUE_HASH_SHIFT 16
+
+/*
+ * ext2fs_ext_attr_hash_entry()
+ *
+ * Compute the hash of an extended attribute.
+ */
+__u32 ext2fs_ext_attr_hash_entry(struct ext2_ext_attr_entry *entry, void *data)
+{
+ __u32 hash = 0;
+ char *name = entry->e_name;
+ int n;
+
+ for (n = 0; n < entry->e_name_len; n++) {
+ hash = (hash << NAME_HASH_SHIFT) ^
+ (hash >> (8*sizeof(hash) - NAME_HASH_SHIFT)) ^
+ *name++;
+ }
+
+ /* The hash needs to be calculated on the data in little-endian. */
+ if (entry->e_value_block == 0 && entry->e_value_size != 0) {
+ __u32 *value = (__u32 *)data;
+ for (n = (entry->e_value_size + EXT2_EXT_ATTR_ROUND) >>
+ EXT2_EXT_ATTR_PAD_BITS; n; n--) {
+ hash = (hash << VALUE_HASH_SHIFT) ^
+ (hash >> (8*sizeof(hash) - VALUE_HASH_SHIFT)) ^
+ ext2fs_le32_to_cpu(*value++);
+ }
+ }
+
+ return hash;
+}
+
+#undef NAME_HASH_SHIFT
+#undef VALUE_HASH_SHIFT
+
errcode_t ext2fs_read_ext_attr(ext2_filsys fs, blk_t block, void *buf)
{
errcode_t retval;
Index: e2fsprogs-1.40.5/lib/ext2fs/swapfs.c
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/swapfs.c
+++ e2fsprogs-1.40.5/lib/ext2fs/swapfs.c
@@ -90,6 +90,29 @@ void ext2fs_swap_group_desc(struct ext2_
gdp->bg_checksum = ext2fs_swab16(gdp->bg_checksum);
}

+void ext2fs_swap_ext_attr_header(struct ext2_ext_attr_header *to_header,
+ struct ext2_ext_attr_header *from_header)
+{
+ int n;
+
+ to_header->h_magic = ext2fs_swab32(from_header->h_magic);
+ to_header->h_blocks = ext2fs_swab32(from_header->h_blocks);
+ to_header->h_refcount = ext2fs_swab32(from_header->h_refcount);
+ to_header->h_hash = ext2fs_swab32(from_header->h_hash);
+ for (n = 0; n < 4; n++)
+ to_header->h_reserved[n] =
+ ext2fs_swab32(from_header->h_reserved[n]);
+}
+
+void ext2fs_swap_ext_attr_entry(struct ext2_ext_attr_entry *to_entry,
+ struct ext2_ext_attr_entry *from_entry)
+{
+ to_entry->e_value_offs = ext2fs_swab16(from_entry->e_value_offs);
+ to_entry->e_value_block = ext2fs_swab32(from_entry->e_value_block);
+ to_entry->e_value_size = ext2fs_swab32(from_entry->e_value_size);
+ to_entry->e_hash = ext2fs_swab32(from_entry->e_hash);
+}
+
void ext2fs_swap_ext_attr(char *to, char *from, int bufsize, int has_header)
{
struct ext2_ext_attr_header *from_header =
@@ -98,32 +121,22 @@ void ext2fs_swap_ext_attr(char *to, char
(struct ext2_ext_attr_header *)to;
struct ext2_ext_attr_entry *from_entry, *to_entry;
char *from_end = (char *)from_header + bufsize;
- int n;

if (to_header != from_header)
memcpy(to_header, from_header, bufsize);

- from_entry = (struct ext2_ext_attr_entry *)from_header;
- to_entry = (struct ext2_ext_attr_entry *)to_header;
-
if (has_header) {
- to_header->h_magic = ext2fs_swab32(from_header->h_magic);
- to_header->h_blocks = ext2fs_swab32(from_header->h_blocks);
- to_header->h_refcount = ext2fs_swab32(from_header->h_refcount);
- for (n=0; n<4; n++)
- to_header->h_reserved[n] =
- ext2fs_swab32(from_header->h_reserved[n]);
+ ext2fs_swap_ext_attr_header(to_header, from_header);
+
from_entry = (struct ext2_ext_attr_entry *)(from_header+1);
to_entry = (struct ext2_ext_attr_entry *)(to_header+1);
+ } else {
+ from_entry = (struct ext2_ext_attr_entry *)from_header;
+ to_entry = (struct ext2_ext_attr_entry *)to_header;
}

while ((char *)from_entry < from_end && *(__u32 *)from_entry) {
- to_entry->e_value_offs =
- ext2fs_swab16(from_entry->e_value_offs);
- to_entry->e_value_block =
- ext2fs_swab32(from_entry->e_value_block);
- to_entry->e_value_size =
- ext2fs_swab32(from_entry->e_value_size);
+ ext2fs_swap_ext_attr_entry(to_entry, from_entry);
from_entry = EXT2_EXT_ATTR_NEXT(from_entry);
to_entry = EXT2_EXT_ATTR_NEXT(to_entry);
}

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2008-02-02 08:17:47

by Andreas Dilger

Subject: [PATCH][3/28] e2fsprogs-extended_ops.patch

Minor reformatting patch to make applying later patches easier.

Signed-off-by: Andreas Dilger

Index: e2fsprogs-cfs/e2fsck/unix.c
===================================================================
--- e2fsprogs-cfs.orig/e2fsck/unix.c
+++ e2fsprogs-cfs/e2fsck/unix.c
@@ -537,14 +537,13 @@ static void parse_extended_opts(e2fsck_t
continue;
}
ea_ver = strtoul(arg, &p, 0);
- if (*p ||
- ((ea_ver != 1) && (ea_ver != 2))) {
- fprintf(stderr,
- _("Invalid EA version.\n"));
+ if (*p == '\0' && (ea_ver == 1 || ea_ver == 2)) {
+ ctx->ext_attr_ver = ea_ver;
+ } else {
+ fprintf(stderr, _("Invalid EA version.\n"));
extended_usage++;
continue;
}
- ctx->ext_attr_ver = ea_ver;
} else {
fprintf(stderr, _("Unknown extended option: %s\n"),
token);
@@ -558,7 +557,8 @@ static void parse_extended_opts(e2fsck_t
"and may take an argument which\n"
"is set off by an equals ('=') sign. "
"Valid extended options are:\n"
- "\tea_ver=<ea_version (1 or 2)>\n\n"), stderr);
+ "\tea_ver=<ea_version (1 or 2)>\n"
+ "\n"), stderr);
exit(1);
}
}

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2008-02-02 08:20:52

by Andreas Dilger

Subject: Re: [PATCH][4/28] e2fsprogs-tests-f_unsorted_EAs.patch

Attached binary patch.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


Attachments:
(No filename) (125.00 B)
e2fsprogs-tests-f_unsorted_EAs.patch (16.85 kB)

2008-02-02 08:22:09

by Andreas Dilger

Subject: Re: [PATCH][5/28] e2fsprogs-tests-f_ea_checks.patch

Attached binary test patch.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


Attachments:
(No filename) (130.00 B)
e2fsprogs-tests-f_ea_checks.patch (4.82 kB)

2008-02-02 08:25:07

by Andreas Dilger

Subject: Re: [PATCH][6/28] e2fsprogs-nlinks.patch


Add support for the DIR_NLINK feature.

This patch includes the changes required to e2fsck to understand the
nlink count changes made in the kernel. In pass2, while counting the
links for a directory, if the link count exceeds 65000, it is permanently
set to EXT2_LINK_MAX + 10. In pass4, when the counted and actual nlink
counts are compared, e2fsck does not flag an error if the counted links =
EXT2_LINK_MAX + 10 and the existing link count is 1.

It also handles the case when a directory had more than 65000 subdirs
and they were later deleted. The nlink count of such a directory remains
1. In pass4 if counted links are 2 and if existing nlink count = 1,
e2fsck corrects the nlink count without displaying any errors.

The file hard link count is also increased to 65000, but this cannot be
exceeded.
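
The counting rule described above can be sketched on its own as follows. This is only an illustration under stated assumptions: `DEMO_LINK_MAX`, `demo_count_link()` and `demo_nlink_ok()` are hypothetical names, the limit is taken to be 65000 as in the description, and the sketch follows the patch in pinning the count once it reaches the limit.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_LINK_MAX 65000u	/* stands in for EXT2_LINK_MAX (assumed) */

/* Increment a counted link total. Once it reaches the limit it is
 * pinned at limit + 10, so later increments can never bring it back
 * into the valid range. */
static uint32_t demo_count_link(uint32_t counted, uint32_t limit)
{
	assert(limit <= UINT32_MAX - 10);	/* keep limit + 10 from wrapping */
	counted++;
	if (counted >= limit)
		counted = limit + 10;
	return counted;
}

/* Returns 1 when pass 4 should accept the on-disk count: either the
 * counts agree, or this is a directory whose on-disk count is 1 and
 * whose counted links overflowed the limit. */
static int demo_nlink_ok(int is_dir, uint16_t on_disk, uint32_t counted)
{
	if (counted == on_disk)
		return 1;
	return is_dir && on_disk == 1 && counted > DEMO_LINK_MAX;
}
```

Using a sentinel above the limit, rather than the limit itself, keeps an overflowed directory distinguishable from one that has exactly the maximum number of links.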

Signed-off-by: Andreas Dilger
Signed-off-by: Kalpak Shah

Index: e2fsprogs-1.40.1/e2fsck/pass2.c
===================================================================
--- e2fsprogs-1.40.1.orig/e2fsck/pass2.c
+++ e2fsprogs-1.40.1/e2fsck/pass2.c
@@ -717,7 +717,7 @@ static int check_dir_block(ext2_filsys f
blk_t block_nr = db->blk;
ext2_ino_t ino = db->ino;
ext2_ino_t subdir_parent;
- __u16 links;
+ __u32 links;
struct check_dir_struct *cd;
char *buf;
e2fsck_t ctx;
@@ -1024,9 +1024,11 @@ static int check_dir_block(ext2_filsys f
dups_found++;
} else
dict_alloc_insert(&de_dict, dirent, dirent);
-
- ext2fs_icount_increment(ctx->inode_count, dirent->inode,
- &links);
+
+ ext2fs_icount_inc32(ctx->inode_count, dirent->inode, &links,
+ ext2fs_test_inode_bitmap(ctx->inode_dir_map,
+ dirent->inode) ?
+ EXT2_LINK_MAX : (__u32)~0U);
if (links > 1)
ctx->fs_links_count++;
ctx->fs_total_count++;
Index: e2fsprogs-1.40.1/e2fsck/pass4.c
===================================================================
--- e2fsprogs-1.40.1.orig/e2fsck/pass4.c
+++ e2fsprogs-1.40.1/e2fsck/pass4.c
@@ -99,7 +99,8 @@ void e2fsck_pass4(e2fsck_t ctx)
struct resource_track rtrack;
#endif
struct problem_context pctx;
- __u16 link_count, link_counted;
+ __u16 link_count;
+ __u32 link_counted;
char *buf = 0;
int group, maxgroup;

@@ -145,7 +146,7 @@ void e2fsck_pass4(e2fsck_t ctx)
ext2fs_test_inode_bitmap(ctx->inode_bb_map, i)))
continue;
ext2fs_icount_fetch(ctx->inode_link_info, i, &link_count);
- ext2fs_icount_fetch(ctx->inode_count, i, &link_counted);
+ ext2fs_icount_fetch32(ctx->inode_count, i, &link_counted);
if (link_counted == 0) {
if (!buf)
buf = e2fsck_allocate_memory(ctx,
@@ -156,10 +157,12 @@ void e2fsck_pass4(e2fsck_t ctx)
continue;
ext2fs_icount_fetch(ctx->inode_link_info, i,
&link_count);
- ext2fs_icount_fetch(ctx->inode_count, i,
- &link_counted);
+ ext2fs_icount_fetch32(ctx->inode_count, i,
+ &link_counted);
}
- if (link_counted != link_count) {
+ if (link_counted != link_count &&
+ !(ext2fs_test_inode_bitmap(ctx->inode_dir_map, i) &&
+ link_count == 1 && link_counted > EXT2_LINK_MAX)) {
e2fsck_read_inode(ctx, i, inode, "pass4");
pctx.ino = i;
pctx.inode = inode;
@@ -169,7 +172,12 @@ void e2fsck_pass4(e2fsck_t ctx)
PR_4_INCONSISTENT_COUNT, &pctx);
}
pctx.num = link_counted;
- if (fix_problem(ctx, PR_4_BAD_REF_COUNT, &pctx)) {
+ /* i_link_count was previously exceeded, but no longer
+ * is, fix this but don't consider it an error */
+ if ((LINUX_S_ISDIR(inode->i_mode) && link_counted > 1 &&
+ (inode->i_flags & EXT2_INDEX_FL) &&
+ link_count == 1 && !(ctx->options & E2F_OPT_NO)) ||
+ (fix_problem(ctx, PR_4_BAD_REF_COUNT, &pctx))) {
inode->i_links_count = link_counted;
e2fsck_write_inode(ctx, i, inode, "pass4");
}
Index: e2fsprogs-1.40.1/lib/ext2fs/ext2_fs.h
===================================================================
--- e2fsprogs-1.40.1.orig/lib/ext2fs/ext2_fs.h
+++ e2fsprogs-1.40.1/lib/ext2fs/ext2_fs.h
@@ -646,6 +646,7 @@ struct ext2_super_block {
#define EXT2_FEATURE_INCOMPAT_SUPP (EXT2_FEATURE_INCOMPAT_FILETYPE)
#define EXT2_FEATURE_RO_COMPAT_SUPP (EXT2_FEATURE_RO_COMPAT_SPARSE_SUPER| \
EXT2_FEATURE_RO_COMPAT_LARGE_FILE| \
+ EXT4_FEATURE_RO_COMPAT_DIR_NLINK| \
EXT2_FEATURE_RO_COMPAT_BTREE_DIR)

/*
Index: e2fsprogs-1.40.1/lib/ext2fs/ext2fs.h
===================================================================
--- e2fsprogs-1.40.1.orig/lib/ext2fs/ext2fs.h
+++ e2fsprogs-1.40.1/lib/ext2fs/ext2fs.h
@@ -462,7 +462,8 @@ typedef struct ext2_icount *ext2_icount_
EXT3_FEATURE_INCOMPAT_RECOVER)
#endif
#define EXT2_LIB_FEATURE_RO_COMPAT_SUPP (EXT2_FEATURE_RO_COMPAT_SPARSE_SUPER|\
- EXT2_FEATURE_RO_COMPAT_LARGE_FILE)
+ EXT2_FEATURE_RO_COMPAT_LARGE_FILE|\
+ EXT4_FEATURE_RO_COMPAT_DIR_NLINK)

/*
* These features are only allowed if EXT2_FLAG_SOFTSUPP_FEATURES is passed
@@ -471,7 +472,6 @@ typedef struct ext2_icount *ext2_icount_
#define EXT2_LIB_SOFTSUPP_INCOMPAT (EXT3_FEATURE_INCOMPAT_EXTENTS)
#define EXT2_LIB_SOFTSUPP_RO_COMPAT (EXT4_FEATURE_RO_COMPAT_HUGE_FILE|\
EXT4_FEATURE_RO_COMPAT_GDT_CSUM|\
- EXT4_FEATURE_RO_COMPAT_DIR_NLINK|\
EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE)

/*
@@ -795,12 +795,20 @@ extern errcode_t ext2fs_create_icount2(e
extern errcode_t ext2fs_create_icount(ext2_filsys fs, int flags,
unsigned int size,
ext2_icount_t *ret);
+extern errcode_t ext2fs_icount_fetch32(ext2_icount_t icount, ext2_ino_t ino,
+ __u32 *ret);
extern errcode_t ext2fs_icount_fetch(ext2_icount_t icount, ext2_ino_t ino,
__u16 *ret);
+extern errcode_t ext2fs_icount_inc32(ext2_icount_t icount, ext2_ino_t ino,
+ __u32 *ret, __u32 overflow);
extern errcode_t ext2fs_icount_increment(ext2_icount_t icount, ext2_ino_t ino,
__u16 *ret);
+extern errcode_t ext2fs_icount_dec32(ext2_icount_t icount, ext2_ino_t ino,
+ __u32 *ret);
extern errcode_t ext2fs_icount_decrement(ext2_icount_t icount, ext2_ino_t ino,
__u16 *ret);
+extern errcode_t ext2fs_icount_store32(ext2_icount_t icount, ext2_ino_t ino,
+ __u32 count);
extern errcode_t ext2fs_icount_store(ext2_icount_t icount, ext2_ino_t ino,
__u16 count);
extern ext2_ino_t ext2fs_get_icount_size(ext2_icount_t icount);
Index: e2fsprogs-1.40.1/lib/ext2fs/icount.c
===================================================================
--- e2fsprogs-1.40.1.orig/lib/ext2fs/icount.c
+++ e2fsprogs-1.40.1/lib/ext2fs/icount.c
@@ -43,7 +43,7 @@

struct ext2_icount_el {
ext2_ino_t ino;
- __u16 count;
+ __u32 count;
};

struct ext2_icount {
@@ -397,16 +397,16 @@ static struct ext2_icount_el *get_icount
}

static errcode_t set_inode_count(ext2_icount_t icount, ext2_ino_t ino,
- __u16 count)
+ __u32 count)
{
- struct ext2_icount_el *el;
+ struct ext2_icount_el *el;
TDB_DATA key, data;

if (icount->tdb) {
key.dptr = (unsigned char *) &ino;
key.dsize = sizeof(ext2_ino_t);
data.dptr = (unsigned char *) &count;
- data.dsize = sizeof(__u16);
+ data.dsize = sizeof(__u32);
if (count) {
if (tdb_store(icount->tdb, key, data, TDB_REPLACE))
return tdb_error(icount->tdb) +
@@ -428,9 +428,9 @@ static errcode_t set_inode_count(ext2_ic
}

static errcode_t get_inode_count(ext2_icount_t icount, ext2_ino_t ino,
- __u16 *count)
+ __u32 *count)
{
- struct ext2_icount_el *el;
+ struct ext2_icount_el *el;
TDB_DATA key, data;

if (icount->tdb) {
@@ -443,7 +443,7 @@ static errcode_t get_inode_count(ext2_ic
return tdb_error(icount->tdb) + EXT2_ET_TDB_SUCCESS;
}

- *count = *((__u16 *) data.dptr);
+ *count = *((__u32 *) data.dptr);
free(data.dptr);
return 0;
}
@@ -480,7 +480,7 @@ errcode_t ext2fs_icount_validate(ext2_ic
return ret;
}

-errcode_t ext2fs_icount_fetch(ext2_icount_t icount, ext2_ino_t ino, __u16 *ret)
+errcode_t ext2fs_icount_fetch32(ext2_icount_t icount, ext2_ino_t ino, __u32 *ret)
{
EXT2_CHECK_MAGIC(icount, EXT2_ET_MAGIC_ICOUNT);

@@ -500,10 +500,21 @@ errcode_t ext2fs_icount_fetch(ext2_icoun
return 0;
}

-errcode_t ext2fs_icount_increment(ext2_icount_t icount, ext2_ino_t ino,
- __u16 *ret)
+errcode_t ext2fs_icount_fetch(ext2_icount_t icount, ext2_ino_t ino, __u16 *ret)
{
- __u16 curr_value;
+ __u32 ret32 = ret ? *ret : 0;
+ errcode_t err;
+
+ err = ext2fs_icount_fetch32(icount, ino, &ret32);
+ *ret = (__u16)ret32;
+
+ return err;
+}
+
+errcode_t ext2fs_icount_inc32(ext2_icount_t icount, ext2_ino_t ino,
+ __u32 *ret, __u32 overflow)
+{
+ __u32 curr_value;

EXT2_CHECK_MAGIC(icount, EXT2_ET_MAGIC_ICOUNT);

@@ -528,6 +539,8 @@ errcode_t ext2fs_icount_increment(ext2_i
if (ext2fs_test_inode_bitmap(icount->multiple, ino)) {
get_inode_count(icount, ino, &curr_value);
curr_value++;
+ if (curr_value >= overflow)
+ curr_value = overflow + 10;
if (set_inode_count(icount, ino, curr_value))
return EXT2_ET_NO_MEMORY;
} else {
@@ -547,6 +560,8 @@ errcode_t ext2fs_icount_increment(ext2_i
*/
get_inode_count(icount, ino, &curr_value);
curr_value++;
+ if (curr_value >= overflow)
+ curr_value = overflow + 10;
if (set_inode_count(icount, ino, curr_value))
return EXT2_ET_NO_MEMORY;
}
@@ -557,10 +572,23 @@ errcode_t ext2fs_icount_increment(ext2_i
return 0;
}

-errcode_t ext2fs_icount_decrement(ext2_icount_t icount, ext2_ino_t ino,
+errcode_t ext2fs_icount_increment(ext2_icount_t icount, ext2_ino_t ino,
__u16 *ret)
{
- __u16 curr_value;
+ __u32 ret32 = ret ? *ret : 0;
+ errcode_t err;
+
+ err = ext2fs_icount_inc32(icount, ino, &ret32, (__u16)~0U);
+ if (ret)
+ *ret = ret32;
+
+ return err;
+}
+
+errcode_t ext2fs_icount_dec32(ext2_icount_t icount, ext2_ino_t ino,
+ __u32 *ret)
+{
+ __u32 curr_value;

if (!ino || (ino > icount->num_inodes))
return EXT2_ET_INVALID_ARGUMENT;
@@ -600,8 +628,21 @@ errcode_t ext2fs_icount_decrement(ext2_i
return 0;
}

-errcode_t ext2fs_icount_store(ext2_icount_t icount, ext2_ino_t ino,
- __u16 count)
+errcode_t ext2fs_icount_decrement(ext2_icount_t icount, ext2_ino_t ino,
+ __u16 *ret)
+{
+ __u32 ret32 = ret ? *ret : 0;
+ errcode_t err;
+
+ err = ext2fs_icount_dec32(icount, ino, &ret32);
+ if (ret)
+ *ret = ret32;
+
+ return err;
+}
+
+errcode_t ext2fs_icount_store32(ext2_icount_t icount, ext2_ino_t ino,
+ __u32 count)
{
if (!ino || (ino > icount->num_inodes))
return EXT2_ET_INVALID_ARGUMENT;
@@ -635,6 +676,12 @@ errcode_t ext2fs_icount_store(ext2_icoun
return 0;
}

+errcode_t ext2fs_icount_store(ext2_icount_t icount, ext2_ino_t ino,
+ __u16 count)
+{
+ return ext2fs_icount_store32(icount, ino, count);
+}
+
ext2_ino_t ext2fs_get_icount_size(ext2_icount_t icount)
{
if (!icount || icount->magic != EXT2_ET_MAGIC_ICOUNT)
Index: e2fsprogs-1.40.1/e2fsck/pass3.c
===================================================================
--- e2fsprogs-1.40.1.orig/e2fsck/pass3.c
+++ e2fsprogs-1.40.1/e2fsck/pass3.c
@@ -581,19 +581,22 @@ errcode_t e2fsck_adjust_inode_count(e2fs
#endif

if (adj == 1) {
- ext2fs_icount_increment(ctx->inode_count, ino, 0);
+ ext2fs_icount_inc32(ctx->inode_count, ino, 0,
+ ext2fs_test_inode_bitmap(ctx->inode_dir_map,
+ ino) ?
+ EXT2_LINK_MAX : ~0U);
if (inode.i_links_count == (__u16) ~0)
return 0;
ext2fs_icount_increment(ctx->inode_link_info, ino, 0);
inode.i_links_count++;
} else if (adj == -1) {
- ext2fs_icount_decrement(ctx->inode_count, ino, 0);
+ ext2fs_icount_dec32(ctx->inode_count, ino, 0);
if (inode.i_links_count == 0)
return 0;
ext2fs_icount_decrement(ctx->inode_link_info, ino, 0);
inode.i_links_count--;
}
-
+
retval = ext2fs_write_inode(fs, ino, &inode);
if (retval)
return retval;

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2008-02-02 08:26:08

by Andreas Dilger

Subject: Re: [PATCH][7/28] e2fsprogs-extents.patch


Support for checking 32-bit extents format inodes and the INCOMPAT_EXTENTS
feature.

Clear the high 16 bits of extent and index entries, since the
extents patches did not do this explicitly. Some parts of this
code will need fixing in order to check filesystems with more than
32-bit block numbers (once INCOMPAT_64BIT support is added); these
are marked "FIXME: 48-bit support".
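
As a rough illustration of the high-bits cleanup described above, here is
a minimal sketch. The struct and function names (demo_extent,
demo_clear_high_bits) are hypothetical stand-ins, not the patch's API; only
the field names follow the patch, and high_bits_ok plays the role of the
s_blocks_count_hi test in e2fsck_ext_block_verify().

```c
#include <stdint.h>

/* Hypothetical stand-in for an extent entry; field names mirror the patch. */
struct demo_extent {
	uint32_t ee_block;	/* first logical block covered by the extent */
	uint16_t ee_len;	/* number of blocks covered */
	uint16_t ee_start_hi;	/* high 16 bits of the physical block */
	uint32_t ee_start;	/* low 32 bits of the physical block */
};

/*
 * Clear stray high bits on a filesystem that only uses 32-bit block
 * numbers. Returns 1 if the entry was modified, 0 if it was left alone.
 */
int demo_clear_high_bits(struct demo_extent *ex, int high_bits_ok)
{
	if (ex->ee_start_hi == 0 || high_bits_ok)
		return 0;

	ex->ee_start_hi = 0;
	return 1;
}
```

A caller would use the return value to decide whether the containing block
needs to be written back, much as the patch does with PR_1_EXTENT_CHANGED.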

Verify extent headers in blocks, and the logical ordering of both
extents and indexes.
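
The header check can be sketched as below. This is a simplified,
standalone approximation of ext2fs_extent_header_verify() from this patch,
using hypothetical demo_* names; it assumes that the header and both entry
types are 12 bytes each, and that the extent magic is 0xf30a.

```c
#include <stdint.h>

#define DEMO_EXT_MAGIC	0xf30a	/* assumed value of the extent magic */
#define DEMO_ENTRY_SIZE	12	/* assumed size of an extent or index entry */

/* Hypothetical stand-in for the extent header; 12 bytes. */
struct demo_extent_header {
	uint16_t eh_magic;
	uint16_t eh_entries;	/* number of valid entries */
	uint16_t eh_max;	/* capacity of this block in entries */
	uint16_t eh_depth;	/* 0 for a leaf, > 0 for an index block */
	uint32_t eh_generation;
};

/* Returns 1 if the header looks sane for a buffer of "size" bytes. */
int demo_header_ok(const struct demo_extent_header *eh, int size)
{
	int cap;

	if (eh->eh_magic != DEMO_EXT_MAGIC)
		return 0;
	if (eh->eh_entries > eh->eh_max)
		return 0;

	/* Largest number of entries that fits after the header. */
	cap = (size - (int)sizeof(*eh)) / DEMO_ENTRY_SIZE;

	/* Tolerate up to two entries of slack at the end of the block. */
	if (eh->eh_max > cap || eh->eh_max < cap - 2)
		return 0;

	return 1;
}
```

With these sizes, the 60-byte i_block area of an inode holds at most 4
entries and a 4096-byte block holds at most 340.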

Add explicit checking of {d,t,}indirect and index blocks to detect
corruption, instead of detecting it implicitly by checking the
referenced blocks one block at a time. This avoids incorrectly
invoking the very lengthy duplicate blocks pass for bad indirect/index
blocks. We may want to tune the "threshold" for how many errors make
a "bad" indirect/index block.
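
The idea behind the indirect block check can be sketched as follows. This
is a simplified illustration with hypothetical demo_* names; unlike
e2fsck_ind_block_verify() in the patch it always works on a private copy,
does not consult the block bitmap for already-claimed blocks, and leaves
the threshold decision to the caller.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Order block numbers ascending, with 0 (a hole) sorted to the end. */
static int demo_blk_cmp(const void *a, const void *b)
{
	/* Subtracting 1 wraps 0 around to UINT32_MAX, so holes sort last. */
	uint32_t x = *(const uint32_t *)a - 1;
	uint32_t y = *(const uint32_t *)b - 1;

	return (x > y) - (x < y);
}

/*
 * Count suspicious entries in an array of block pointers: numbers outside
 * [first_data_block, blocks_count) and blocks referenced more than once.
 * Returns the count, or -1 if memory could not be allocated.
 */
int demo_count_bad(const uint32_t *blks, size_t n,
		   uint32_t first_data_block, uint32_t blocks_count)
{
	uint32_t *copy;
	size_t i;
	int bad = 0;

	if (n == 0)
		return 0;

	/* Sort a copy so the caller's block is not reordered. */
	copy = malloc(n * sizeof(*copy));
	if (copy == NULL)
		return -1;
	memcpy(copy, blks, n * sizeof(*copy));
	qsort(copy, n, sizeof(*copy), demo_blk_cmp);

	/* Zeroes are at the end, so the first one ends the scan. */
	for (i = 0; i < n && copy[i] != 0; i++) {
		if (copy[i] < first_data_block || copy[i] >= blocks_count)
			bad++;
		if (i > 0 && copy[i] == copy[i - 1])
			bad++;
	}

	free(copy);
	return bad;
}
```

A caller would then compare the result against a tunable limit (the patch
uses more than 4 for the in-inode array and more than 8 for a full block)
before deciding to clear the block.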

Add ability to split or remove extents in order to allow extent
reallocation during the duplicate blocks pass.
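
Removing an entry from a packed array of extents might look roughly like
the sketch below. The names are hypothetical and this is not the patch's
ext2fs_extent_remove(), which operates on the extent header; it only
illustrates why the verify loops step back one slot ("i--; ex--;") after a
removal, since the following entries slide down into the freed position.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an extent entry; field names mirror the patch. */
struct demo_extent {
	uint32_t ee_block;
	uint16_t ee_len;
	uint16_t ee_start_hi;
	uint32_t ee_start;
};

/*
 * Remove entry "idx" from ents[0 .. *count) by shifting the tail down.
 * Returns 0 on success, -1 if idx is out of range.
 */
int demo_extent_remove(struct demo_extent *ents, uint16_t *count,
		       uint16_t idx)
{
	if (idx >= *count)
		return -1;

	memmove(&ents[idx], &ents[idx + 1],
		(size_t)(*count - idx - 1) * sizeof(*ents));
	(*count)--;

	return 0;
}
```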

Index: e2fsprogs-1.40.5/e2fsck/Makefile.in
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/Makefile.in
+++ e2fsprogs-1.40.5/e2fsck/Makefile.in
@@ -256,6 +256,7 @@ super.o: $(srcdir)/super.c $(top_srcdir)
pass1.o: $(srcdir)/pass1.c $(srcdir)/e2fsck.h \
$(top_srcdir)/lib/ext2fs/ext2_fs.h $(top_builddir)/lib/ext2fs/ext2_types.h \
$(top_srcdir)/lib/ext2fs/ext2fs.h $(top_srcdir)/lib/ext2fs/ext2_fs.h \
+ $(top_srcdir)/lib/ext2fs/ext3_extents.h \
$(top_srcdir)/lib/et/com_err.h $(top_srcdir)/lib/ext2fs/ext2_io.h \
$(top_builddir)/lib/ext2fs/ext2_err.h $(top_srcdir)/lib/ext2fs/bitops.h \
$(top_srcdir)/lib/blkid/blkid.h $(top_builddir)/lib/blkid/blkid_types.h \
Index: e2fsprogs-1.40.5/e2fsck/e2fsck.h
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/e2fsck.h
+++ e2fsprogs-1.40.5/e2fsck/e2fsck.h
@@ -328,6 +328,7 @@ struct e2fsck_struct {
__u32 large_files;
__u32 fs_ext_attr_inodes;
__u32 fs_ext_attr_blocks;
+ __u32 extent_files;

/* misc fields */
time_t now;
Index: e2fsprogs-1.40.5/e2fsck/pass1.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/pass1.c
+++ e2fsprogs-1.40.5/e2fsck/pass1.c
@@ -46,6 +46,7 @@

#include "e2fsck.h"
#include <ext2fs/ext2_ext_attr.h>
+#include <ext2fs/ext3_extents.h>

#include "problem.h"

@@ -79,16 +80,19 @@ static void adjust_extattr_refcount(e2fs
struct process_block_struct {
ext2_ino_t ino;
unsigned is_dir:1, is_reg:1, clear:1, suppress:1,
- fragmented:1, compressed:1, bbcheck:1;
+ fragmented:1, compressed:1, bbcheck:1, extent:1;
blk_t num_blocks;
blk_t max_blocks;
e2_blkcnt_t last_block;
int num_illegal_blocks;
+ int last_illegal_blocks;
blk_t previous_block;
struct ext2_inode *inode;
struct problem_context *pctx;
ext2fs_block_bitmap fs_meta_blocks;
e2fsck_t ctx;
+ struct ext3_extent_header *eh_prev;
+ void *block_buf;
};

struct process_inode_block {
@@ -137,7 +141,7 @@ int e2fsck_pass1_check_device_inode(ext2
* If the index flag is set, then this is a bogus
* device/fifo/socket
*/
- if (inode->i_flags & EXT2_INDEX_FL)
+ if (inode->i_flags & (EXT2_INDEX_FL | EXT4_EXTENTS_FL))
return 0;

/*
@@ -171,7 +175,7 @@ int e2fsck_pass1_check_symlink(ext2_fils
blk_t blocks;

if ((inode->i_size_high || inode->i_size == 0) ||
- (inode->i_flags & EXT2_INDEX_FL))
+ (inode->i_flags & (EXT2_INDEX_FL | EXT4_EXTENTS_FL)))
return 0;

blocks = ext2fs_inode_data_blocks(fs, inode);
@@ -484,7 +488,9 @@ void e2fsck_pass1(e2fsck_t ctx)
int imagic_fs;
int busted_fs_time = 0;
int inode_size;
-
+ struct ext3_extent_header *eh;
+ int extent_fs;
+
#ifdef RESOURCE_TRACK
init_resource_track(&rtrack, ctx->fs->io);
#endif
@@ -515,6 +521,7 @@ void e2fsck_pass1(e2fsck_t ctx)
#undef EXT2_BPP

imagic_fs = (sb->s_feature_compat & EXT2_FEATURE_COMPAT_IMAGIC_INODES);
+ extent_fs = (sb->s_feature_incompat & EXT3_FEATURE_INCOMPAT_EXTENTS);

/*
* Allocate bitmaps structures
@@ -891,8 +898,7 @@ void e2fsck_pass1(e2fsck_t ctx)
check_blocks(ctx, &pctx, block_buf);
continue;
}
- }
- else if (LINUX_S_ISFIFO (inode->i_mode) &&
+ } else if (LINUX_S_ISFIFO (inode->i_mode) &&
e2fsck_pass1_check_device_inode(fs, inode)) {
check_immutable(ctx, &pctx);
check_size(ctx, &pctx);
@@ -904,21 +910,75 @@ void e2fsck_pass1(e2fsck_t ctx)
ctx->fs_sockets_count++;
} else
mark_inode_bad(ctx, ino);
- if (inode->i_block[EXT2_IND_BLOCK])
- ctx->fs_ind_count++;
- if (inode->i_block[EXT2_DIND_BLOCK])
- ctx->fs_dind_count++;
- if (inode->i_block[EXT2_TIND_BLOCK])
- ctx->fs_tind_count++;
- if (inode->i_block[EXT2_IND_BLOCK] ||
- inode->i_block[EXT2_DIND_BLOCK] ||
- inode->i_block[EXT2_TIND_BLOCK] ||
- inode->i_file_acl) {
- inodes_to_process[process_inode_count].ino = ino;
- inodes_to_process[process_inode_count].inode = *inode;
- process_inode_count++;
- } else
- check_blocks(ctx, &pctx, block_buf);
+
+ eh = (struct ext3_extent_header *)inode->i_block;
+ if ((inode->i_flags & EXT4_EXTENTS_FL)) {
+ if ((LINUX_S_ISREG(inode->i_mode) ||
+ LINUX_S_ISDIR(inode->i_mode)) &&
+ ext2fs_extent_header_verify(eh, EXT2_N_BLOCKS *
+ sizeof(__u32)) == 0) {
+ if (!extent_fs &&
+ fix_problem(ctx,PR_1_EXTENT_FEATURE,&pctx)){
+ sb->s_feature_incompat |=
+ EXT3_FEATURE_INCOMPAT_EXTENTS;
+ ext2fs_mark_super_dirty(fs);
+ extent_fs = 1;
+ }
+ } else if (fix_problem(ctx, PR_1_SET_EXTENT_FL, &pctx)){
+ inode->i_flags &= ~EXT4_EXTENTS_FL;
+ e2fsck_write_inode(ctx, ino, inode, "pass1");
+ goto check_ind_inode;
+ }
+ } else if (extent_fs &&
+ (LINUX_S_ISREG(inode->i_mode) ||
+ LINUX_S_ISDIR(inode->i_mode)) &&
+ ext2fs_extent_header_verify(eh, EXT2_N_BLOCKS *
+ sizeof(__u32)) == 0 &&
+ fix_problem(ctx, PR_1_UNSET_EXTENT_FL, &pctx)) {
+ inode->i_flags |= EXT4_EXTENTS_FL;
+ e2fsck_write_inode(ctx, ino, inode, "pass1");
+ }
+ if (extent_fs && inode->i_flags & EXT4_EXTENTS_FL) {
+ ctx->extent_files++;
+ switch(eh->eh_depth) {
+ case 0:
+ break;
+ case 1:
+ ctx->fs_ind_count++;
+ break;
+ case 2:
+ ctx->fs_dind_count++;
+ break;
+ default:
+ ctx->fs_tind_count++;
+ break;
+ }
+ if (eh->eh_depth > 0) {
+ inodes_to_process[process_inode_count].ino = ino;
+ inodes_to_process[process_inode_count].inode = *inode;
+ process_inode_count++;
+ } else {
+ check_blocks(ctx, &pctx, block_buf);
+ }
+ } else {
+ check_ind_inode:
+ if (inode->i_block[EXT2_IND_BLOCK])
+ ctx->fs_ind_count++;
+ if (inode->i_block[EXT2_DIND_BLOCK])
+ ctx->fs_dind_count++;
+ if (inode->i_block[EXT2_TIND_BLOCK])
+ ctx->fs_tind_count++;
+ if (inode->i_block[EXT2_IND_BLOCK] ||
+ inode->i_block[EXT2_DIND_BLOCK] ||
+ inode->i_block[EXT2_TIND_BLOCK] ||
+ inode->i_file_acl) {
+ inodes_to_process[process_inode_count].ino = ino;
+ inodes_to_process[process_inode_count].inode = *inode;
+ process_inode_count++;
+ } else {
+ check_blocks(ctx, &pctx, block_buf);
+ }
+ }

if (ctx->flags & E2F_FLAG_SIGNAL_MASK)
return;
@@ -1426,10 +1486,23 @@ clear_extattr:
return 0;
}

+static int htree_blk_iter_cb(ext2_filsys fs EXT2FS_ATTR((unused)),
+ blk_t *blocknr,
+ e2_blkcnt_t blockcnt EXT2FS_ATTR((unused)),
+ blk_t ref_blk EXT2FS_ATTR((unused)),
+ int ref_offset EXT2FS_ATTR((unused)),
+ void *priv_data)
+{
+ blk_t *blk = priv_data;
+
+ *blk = *blocknr;
+
+ return BLOCK_ABORT;
+}
+
/* Returns 1 if bad htree, 0 if OK */
static int handle_htree(e2fsck_t ctx, struct problem_context *pctx,
- ext2_ino_t ino EXT2FS_ATTR((unused)),
- struct ext2_inode *inode,
+ ext2_ino_t ino, struct ext2_inode *inode,
char *block_buf)
{
struct ext2_dx_root_info *root;
@@ -1443,7 +1516,8 @@ static int handle_htree(e2fsck_t ctx, st
fix_problem(ctx, PR_1_HTREE_SET, pctx)))
return 1;

- blk = inode->i_block[0];
+ ext2fs_block_iterate2(fs, ino, BLOCK_FLAG_DATA_ONLY | BLOCK_FLAG_HOLE,
+ block_buf, htree_blk_iter_cb, &blk);
if (((blk == 0) ||
(blk < fs->super->s_first_data_block) ||
(blk >= fs->super->s_blocks_count)) &&
@@ -1480,6 +1554,135 @@ static int handle_htree(e2fsck_t ctx, st
return 0;
}

+/* sort 0 to the end of the list so we can exit early */
+static EXT2_QSORT_TYPE verify_ind_cmp(const void *a, const void *b)
+{
+ const __u32 blk_a = *(__u32 *)a - 1, blk_b = *(__u32 *)b - 1;
+
+ return blk_a < blk_b ? -1 : blk_a > blk_b;
+}
+
+/* Verify whether an indirect block is sane. If it has multiple references
+ * to the same block, or if it has a large number of bad or duplicate blocks
+ * chances are that it is corrupt and we should just clear it instead of
+ * trying to salvage it.
+ * NOTE: this needs to get a copy of the blocks, since it reorders them */
+static int e2fsck_ind_block_verify(struct process_block_struct *p,
+ void *block_buf, int buflen)
+{
+ __u32 blocks[EXT2_N_BLOCKS], *indir = block_buf;
+ int num_indir = buflen / sizeof(*indir);
+ int i, bad = 0;
+
+ if (num_indir == EXT2_N_BLOCKS) {
+ memcpy(blocks, block_buf, buflen);
+ indir = blocks;
+ }
+ qsort(indir, num_indir, sizeof(*indir), verify_ind_cmp);
+
+ for (i = 0; i < num_indir; i++) {
+ if (indir[i] == 0)
+ break;
+
+ /* bad block number, or duplicate block */
+ if (indir[i] < p->ctx->fs->super->s_first_data_block ||
+ indir[i] >= p->ctx->fs->super->s_blocks_count ||
+ ext2fs_fast_test_block_bitmap(p->ctx->block_found_map,
+ indir[i]))
+ bad++;
+
+ /* shouldn't reference the same block twice within a block */
+ if (i > 0 && indir[i] == indir[i - 1])
+ bad++;
+ }
+
+ if ((num_indir <= EXT2_N_BLOCKS && bad > 4) || bad > 8)
+ return PR_1_INDIRECT_BAD;
+
+#if DEBUG_E2FSCK
+ /* For debugging, clobber buffer to ensure it doesn't appear sane */
+ memset(indir, 0xca, buflen);
+#endif
+ return 0;
+}
+
+static int e2fsck_ext_block_verify(struct process_block_struct *p,
+ void *block_buf, int buflen)
+{
+ struct ext3_extent_header *eh = block_buf, *eh_sav;
+ e2fsck_t ctx = p->ctx;
+ struct problem_context *pctx = p->pctx;
+ int i, problem = 0, high_bits_ok = 0;
+
+ if (ext2fs_extent_header_verify(eh, buflen))
+ return PR_1_EXTENT_IDX_BAD;
+
+ if (p->eh_prev && p->eh_prev->eh_depth != eh->eh_depth + 1)
+ return PR_1_EXTENT_IDX_BAD;
+
+ if (ctx->fs->super->s_blocks_count_hi) /* FIXME: 48-bit support ??? */
+ high_bits_ok = 1;
+
+ eh_sav = p->eh_prev;
+ p->eh_prev = eh;
+
+ if (eh->eh_depth == 0) {
+ struct ext3_extent *ex = EXT_FIRST_EXTENT(eh), *ex_prev = NULL;
+
+ for (i = 0; i < eh->eh_entries; i++, ex++) {
+ if (ex->ee_start_hi && !high_bits_ok &&
+ fix_problem(ctx, PR_1_EXTENT_HI, pctx)) {
+ ex->ee_start_hi = 0;
+ problem = PR_1_EXTENT_CHANGED;
+ }
+
+ if (ext2fs_extent_verify(ctx->fs, ex, ex_prev, NULL,0)){
+ p->num_illegal_blocks++;
+ pctx->blkcount = ex->ee_start;
+ pctx->num = ex->ee_len;
+ pctx->blk = ex->ee_block;
+ if (fix_problem(ctx, PR_1_EXTENT_BAD, pctx)) {
+ ext2fs_extent_remove(eh, ex);
+ i--; ex--; /* check next (moved) item */
+ problem = PR_1_EXTENT_CHANGED;
+ continue;
+ }
+ }
+
+ ex_prev = ex;
+ }
+ } else {
+ struct ext3_extent_idx *ix =EXT_FIRST_INDEX(eh), *ix_prev =NULL;
+
+ for (i = 0; i < eh->eh_entries; i++, ix++) {
+ if (ix->ei_leaf_hi && !high_bits_ok &&
+ fix_problem(ctx, PR_1_EXTENT_HI, pctx)) {
+ ix->ei_leaf_hi = ix->ei_unused = 0;
+ problem = PR_1_EXTENT_CHANGED;
+ }
+
+ if (ext2fs_extent_index_verify(ctx->fs, ix, ix_prev)) {
+ p->num_illegal_blocks++;
+ pctx->blkcount = ix->ei_leaf;
+ pctx->num = i;
+ pctx->blk = ix->ei_block;
+ if (fix_problem(ctx, PR_1_EXTENT_IDX_BAD,pctx)){
+ ext2fs_extent_index_remove(eh, ix);
+ i--; ix--; /* check next (moved) item */
+ problem = PR_1_EXTENT_CHANGED;
+ continue;
+ }
+ }
+
+ ix_prev = ix;
+ }
+ }
+
+ p->eh_prev = eh_sav;
+
+ return problem;
+}
+
/*
* This subroutine is called on each inode to account for all of the
* blocks used by that inode.
@@ -1499,9 +1701,11 @@ static void check_blocks(e2fsck_t ctx, s
pb.num_blocks = 0;
pb.last_block = -1;
pb.num_illegal_blocks = 0;
+ pb.last_illegal_blocks = 0;
pb.suppress = 0; pb.clear = 0;
pb.fragmented = 0;
pb.compressed = 0;
+ pb.extent = !!(inode->i_flags & EXT4_EXTENTS_FL);
pb.previous_block = 0;
pb.is_dir = LINUX_S_ISDIR(inode->i_mode);
pb.is_reg = LINUX_S_ISREG(inode->i_mode);
@@ -1509,6 +1713,8 @@ static void check_blocks(e2fsck_t ctx, s
pb.inode = inode;
pb.pctx = pctx;
pb.ctx = ctx;
+ pb.eh_prev = NULL;
+ pb.block_buf = block_buf;
pctx->ino = ino;
pctx->errcode = 0;

@@ -1530,10 +1736,27 @@ static void check_blocks(e2fsck_t ctx, s
pb.num_blocks++;
}

- if (ext2fs_inode_has_valid_blocks(inode))
- pctx->errcode = ext2fs_block_iterate2(fs, ino,
- pb.is_dir ? BLOCK_FLAG_HOLE : 0,
- block_buf, process_block, &pb);
+ if (ext2fs_inode_has_valid_blocks(inode)) {
+ int problem = 0;
+
+ if (pb.extent)
+ problem = e2fsck_ext_block_verify(&pb, inode->i_block,
+ sizeof(inode->i_block));
+ else
+ problem = e2fsck_ind_block_verify(&pb, inode->i_block,
+ sizeof(inode->i_block));
+ if (problem == PR_1_EXTENT_CHANGED) {
+ dirty_inode++;
+ problem = 0;
+ }
+
+ if (problem && fix_problem(ctx, problem, pctx))
+ pb.clear = 1;
+ else
+ pctx->errcode = ext2fs_block_iterate2(fs, ino,
+ pb.is_dir ? BLOCK_FLAG_HOLE : 0,
+ block_buf, process_block, &pb);
+ }
end_problem_latch(ctx, PR_LATCH_BLOCK);
end_problem_latch(ctx, PR_LATCH_TOOBIG);
if (ctx->flags & E2F_FLAG_SIGNAL_MASK)
@@ -1697,6 +1920,9 @@ static char *describe_illegal_block(ext2
}
#endif

+#define IND_BLKCNT(_b) ((_b) == BLOCK_COUNT_IND || (_b) == BLOCK_COUNT_DIND ||\
+ (_b) == BLOCK_COUNT_TIND)
+
/*
* This is a helper function for check_blocks().
*/
@@ -1775,7 +2001,8 @@ static int process_block(ext2_filsys fs,
* file be contiguous. (Which can never be true for really
* big files that are greater than a block group.)
*/
- if (!HOLE_BLKADDR(p->previous_block)) {
+ if (!HOLE_BLKADDR(p->previous_block) &&
+ !(p->extent && IND_BLKCNT(blockcnt))) {
if (p->previous_block+1 != blk)
p->fragmented = 1;
}
@@ -1792,9 +2019,34 @@ static int process_block(ext2_filsys fs,
blk >= fs->super->s_blocks_count)
problem = PR_1_ILLEGAL_BLOCK_NUM;

+ if (!problem && IND_BLKCNT(blockcnt) && p->ino != EXT2_RESIZE_INO) {
+ if (p->extent) {
+ if (ext2fs_read_ext_block(ctx->fs, blk, p->block_buf))
+ problem = PR_1_BLOCK_ITERATE;
+ else
+ problem = e2fsck_ext_block_verify(p,
+ p->block_buf,
+ fs->blocksize);
+ if (problem == PR_1_EXTENT_CHANGED) {
+ if (ext2fs_write_ext_block(ctx->fs, blk,
+ p->block_buf))
+ problem = PR_1_BLOCK_ITERATE;
+ }
+
+ } else {
+ if (ext2fs_read_ind_block(ctx->fs, blk, p->block_buf))
+ problem = PR_1_BLOCK_ITERATE;
+ else
+ problem = e2fsck_ind_block_verify(p,
+ p->block_buf,
+ fs->blocksize);
+ }
+ }
+
if (problem) {
p->num_illegal_blocks++;
- if (!p->suppress && (p->num_illegal_blocks % 12) == 0) {
+ if (!p->suppress &&
+ p->num_illegal_blocks - p->last_illegal_blocks > 12) {
if (fix_problem(ctx, PR_1_TOO_MANY_BAD_BLOCKS, pctx)) {
p->clear = 1;
return BLOCK_ABORT;
@@ -1804,9 +2056,12 @@ static int process_block(ext2_filsys fs,
set_latch_flags(PR_LATCH_BLOCK,
PRL_SUPPRESS, 0);
}
+ p->last_illegal_blocks = p->num_illegal_blocks;
}
pctx->blk = blk;
pctx->blkcount = blockcnt;
+ if (problem == PR_1_EXTENT_CHANGED)
+ goto mark_used;
if (fix_problem(ctx, problem, pctx)) {
blk = *block_nr = 0;
ret_code = BLOCK_CHANGED;
@@ -1815,6 +2070,7 @@ static int process_block(ext2_filsys fs,
return 0;
}

+mark_used:
if (p->ino == EXT2_RESIZE_INO) {
/*
* The resize inode has already be sanity checked
Index: e2fsprogs-1.40.5/e2fsck/pass2.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/pass2.c
+++ e2fsprogs-1.40.5/e2fsck/pass2.c
@@ -285,7 +285,16 @@ void e2fsck_pass2(e2fsck_t ctx)
ext2fs_mark_super_dirty(fs);
}
}
-
+
+ if (!ctx->extent_files &&
+ (sb->s_feature_incompat & EXT3_FEATURE_INCOMPAT_EXTENTS)) {
+ if (fs->flags & EXT2_FLAG_RW) {
+ sb->s_feature_incompat &=
+ ~EXT3_FEATURE_INCOMPAT_EXTENTS;
+ ext2fs_mark_super_dirty(fs);
+ }
+ }
+
#ifdef RESOURCE_TRACK
if (ctx->options & E2F_OPT_TIME2) {
e2fsck_clear_progbar(ctx);
Index: e2fsprogs-1.40.5/e2fsck/problem.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/problem.c
+++ e2fsprogs-1.40.5/e2fsck/problem.c
@@ -784,6 +784,46 @@ static struct e2fsck_problem problem_tab
N_("@i %i is a %It but it looks like it is really a directory.\n"),
PROMPT_FIX, 0 },

+ /* indirect block corrupt */
+ { PR_1_INDIRECT_BAD,
+ N_("@i %i has corrupt indirect block\n"),
+ PROMPT_CLEAR, PR_PREEN_OK },
+
+ /* inode has extents, superblock missing INCOMPAT_EXTENTS feature */
+ { PR_1_EXTENT_FEATURE,
+ N_("@i %i is in extent format, but @S is missing EXTENTS feature\n"),
+ PROMPT_FIX, PR_PREEN_OK },
+
+ /* inode has EXTENTS_FL set, but is not an extent inode */
+ { PR_1_SET_EXTENT_FL,
+ N_("@i %i has EXTENT_FL set, but is not in extents format\n"),
+ PROMPT_FIX, PR_PREEN_OK },
+
+ /* inode missing EXTENTS_FL, but is an extent inode */
+ { PR_1_UNSET_EXTENT_FL,
+ N_("@i %i missing EXTENT_FL, but is in extents format\n"),
+ PROMPT_FIX, PR_PREEN_OK },
+
+ /* extent corrupt */
+ { PR_1_EXTENT_BAD,
+ N_("@i %i has corrupt extent at @b %b (logical %B) length %N\n"),
+ PROMPT_CLEAR, PR_PREEN_OK },
+
+ /* extent index corrupt */
+ { PR_1_EXTENT_IDX_BAD,
+ N_("@i %i has corrupt extent index at @b %b (logical %B) entry %N\n"),
+ PROMPT_CLEAR, PR_PREEN_OK },
+
+ /* extent has high 16 bits set */
+ { PR_1_EXTENT_HI,
+ N_("High 16 bits of extent/index @b set\n"),
+ PROMPT_CLEAR, PR_LATCH_EXTENT_HI|PR_PREEN_OK|PR_NO_OK|PR_PREEN_NOMSG},
+
+ /* extent has high 16 bits set header */
+ { PR_1_EXTENT_HI_LATCH,
+ N_("@i %i has high 16 bits of extent/index @b set\n"),
+ PROMPT_CLEAR, PR_PREEN_OK | PR_NO_OK | PR_PREEN_NOMSG },
+
/* Pass 1b errors */

/* Pass 1B: Rescan for duplicate/bad blocks */
@@ -1518,6 +1558,7 @@ static struct latch_descr pr_latch_info[
{ PR_LATCH_LOW_DTIME, PR_1_ORPHAN_LIST_REFUGEES, 0 },
{ PR_LATCH_TOOBIG, PR_1_INODE_TOOBIG, 0 },
{ PR_LATCH_OPTIMIZE_DIR, PR_3A_OPTIMIZE_DIR_HEADER, PR_3A_OPTIMIZE_DIR_END },
+ { PR_LATCH_EXTENT_HI, PR_1_EXTENT_HI_LATCH, 0 },
{ -1, 0, 0 },
};

Index: e2fsprogs-1.40.5/e2fsck/problem.h
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/problem.h
+++ e2fsprogs-1.40.5/e2fsck/problem.h
@@ -38,6 +38,7 @@ struct problem_context {
#define PR_LATCH_LOW_DTIME 0x0070 /* Latch for pass1 orphaned list refugees */
#define PR_LATCH_TOOBIG 0x0080 /* Latch for file to big errors */
#define PR_LATCH_OPTIMIZE_DIR 0x0090 /* Latch for optimize directories */
+#define PR_LATCH_EXTENT_HI 0x00A0 /* Latch for extent high bits set */

#define PR_LATCH(x) ((((x) & PR_LATCH_MASK) >> 4) - 1)

@@ -455,6 +456,33 @@ struct problem_context {
/* inode appears to be a directory */
#define PR_1_TREAT_AS_DIRECTORY 0x010055

+/* indirect block corrupt */
+#define PR_1_INDIRECT_BAD 0x010059
+
+/* wrong EXT3_FEATURE_INCOMPAT_EXTENTS flag */
+#define PR_1_EXTENT_FEATURE 0x010060
+
+/* EXT4_EXTENTS_FL flag set on non-extent file */
+#define PR_1_SET_EXTENT_FL 0x010061
+
+/* EXT4_EXTENTS_FL flag not set on extent file */
+#define PR_1_UNSET_EXTENT_FL 0x010062
+
+/* extent corrupt */
+#define PR_1_EXTENT_BAD 0x010063
+
+/* extent index corrupt */
+#define PR_1_EXTENT_IDX_BAD 0x010064
+
+/* extent/index has high 16 bits set */
+#define PR_1_EXTENT_HI 0x010065
+
+/* extent/index has high 16 bits set - header */
+#define PR_1_EXTENT_HI_LATCH 0x010066
+
+/* extent/index was modified & repaired - not really a problem */
+#define PR_1_EXTENT_CHANGED 0x010067
+
/*
* Pass 1b errors
*/
Index: e2fsprogs-1.40.5/lib/ext2fs/Makefile.in
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/Makefile.in
+++ e2fsprogs-1.40.5/lib/ext2fs/Makefile.in
@@ -35,6 +35,7 @@ OBJS= $(DEBUGFS_LIB_OBJS) $(RESIZE_LIB_O
dir_iterate.o \
expanddir.o \
ext_attr.o \
+ extents.o \
finddev.o \
flushb.o \
freefs.o \
@@ -90,6 +91,7 @@ SRCS= ext2_err.c \
$(srcdir)/dupfs.c \
$(srcdir)/expanddir.c \
$(srcdir)/ext_attr.c \
+ $(srcdir)/extents.c \
$(srcdir)/fileio.c \
$(srcdir)/finddev.c \
$(srcdir)/flushb.c \
@@ -127,6 +129,7 @@ SRCS= ext2_err.c \
$(srcdir)/tst_bitops.c \
$(srcdir)/tst_byteswap.c \
$(srcdir)/tst_getsize.c \
+ $(srcdir)/tst_types.c \
$(srcdir)/tst_iscan.c \
$(srcdir)/unix_io.c \
$(srcdir)/unlink.c \
@@ -394,6 +397,10 @@ ext_attr.o: $(srcdir)/ext_attr.c $(srcdi
$(srcdir)/ext2fs.h $(srcdir)/ext2_fs.h $(srcdir)/ext3_extents.h \
$(top_srcdir)/lib/et/com_err.h $(srcdir)/ext2_io.h \
$(top_builddir)/lib/ext2fs/ext2_err.h $(srcdir)/bitops.h
+extents.o: $(srcdir)/extents.c $(srcdir)/ext2_fs.h \
+ $(top_builddir)/lib/ext2fs/ext2_types.h $(srcdir)/ext3_extents.h \
+ $(srcdir)/ext2fs.h $(srcdir)/ext2_fs.h $(top_srcdir)/lib/et/com_err.h \
+ $(srcdir)/ext2_io.h $(top_builddir)/lib/ext2fs/ext2_err.h $(srcdir)/bitops.h
fileio.o: $(srcdir)/fileio.c $(srcdir)/ext2_fs.h \
$(top_builddir)/lib/ext2fs/ext2_types.h $(srcdir)/ext2fs.h \
$(srcdir)/ext2_fs.h $(srcdir)/ext3_extents.h $(top_srcdir)/lib/et/com_err.h \
Index: e2fsprogs-1.40.5/lib/ext2fs/block.c
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/block.c
+++ e2fsprogs-1.40.5/lib/ext2fs/block.c
@@ -17,24 +17,17 @@

#include "ext2_fs.h"
#include "ext2fs.h"
+#include "block.h"

-struct block_context {
- ext2_filsys fs;
- int (*func)(ext2_filsys fs,
- blk_t *blocknr,
- e2_blkcnt_t bcount,
- blk_t ref_blk,
- int ref_offset,
- void *priv_data);
- e2_blkcnt_t bcount;
- int bsize;
- int flags;
- errcode_t errcode;
- char *ind_buf;
- char *dind_buf;
- char *tind_buf;
- void *priv_data;
-};
+#ifdef EXT_DEBUG
+void ext_show_inode(struct ext2_inode *inode, ext2_ino_t ino)
+{
+ printf("inode: %u blocks: %u\n",
+ ino, inode->i_blocks);
+}
+#else
+#define ext_show_inode(inode, ino) do { } while (0)
+#endif

static int block_iterate_ind(blk_t *ind_block, blk_t ref_block,
int ref_offset, struct block_context *ctx)
@@ -276,7 +269,6 @@ errcode_t ext2fs_block_iterate2(ext2_fil
void *priv_data)
{
int i;
- int got_inode = 0;
int ret = 0;
blk_t blocks[EXT2_N_BLOCKS]; /* directory data blocks */
struct ext2_inode inode;
@@ -286,19 +278,20 @@ errcode_t ext2fs_block_iterate2(ext2_fil

EXT2_CHECK_MAGIC(fs, EXT2_ET_MAGIC_EXT2FS_FILSYS);

+ ctx.errcode = ext2fs_read_inode(fs, ino, &inode);
+ if (ctx.errcode)
+ return ctx.errcode;
+
/*
* Check to see if we need to limit large files
*/
if (flags & BLOCK_FLAG_NO_LARGE) {
- ctx.errcode = ext2fs_read_inode(fs, ino, &inode);
- if (ctx.errcode)
- return ctx.errcode;
- got_inode = 1;
if (!LINUX_S_ISDIR(inode.i_mode) &&
(inode.i_size_high != 0))
return EXT2_ET_FILE_TOO_BIG;
}

+ /* The in-memory inode may have been changed by e2fsck */
retval = ext2fs_get_blocks(fs, ino, blocks);
if (retval)
return retval;
@@ -325,10 +318,6 @@ errcode_t ext2fs_block_iterate2(ext2_fil
*/
if ((fs->super->s_creator_os == EXT2_OS_HURD) &&
!(flags & BLOCK_FLAG_DATA_ONLY)) {
- ctx.errcode = ext2fs_read_inode(fs, ino, &inode);
- if (ctx.errcode)
- goto abort_exit;
- got_inode = 1;
if (inode.osd1.hurd1.h_i_translator) {
ret |= (*ctx.func)(fs,
&inode.osd1.hurd1.h_i_translator,
@@ -338,7 +327,16 @@ errcode_t ext2fs_block_iterate2(ext2_fil
goto abort_exit;
}
}
-
+
+ /* Iterate over normal data blocks with extents.
+ * We can't do any fixing here because this gets called by other
+ * callers than e2fsck_pass1->check_blocks(). */
+ if (inode.i_flags & EXT4_EXTENTS_FL) {
+ ext_show_inode(&inode, ino);
+ ret |= block_iterate_extents(blocks, sizeof(blocks), 0, 0,&ctx);
+ goto abort_exit;
+ }
+
/*
* Iterate over normal data blocks
*/
@@ -373,11 +371,6 @@ errcode_t ext2fs_block_iterate2(ext2_fil

abort_exit:
if (ret & BLOCK_CHANGED) {
- if (!got_inode) {
- retval = ext2fs_read_inode(fs, ino, &inode);
- if (retval)
- return retval;
- }
for (i=0; i < EXT2_N_BLOCKS; i++)
inode.i_block[i] = blocks[i];
retval = ext2fs_write_inode(fs, ino, &inode);
Index: e2fsprogs-1.40.5/lib/ext2fs/block.h
===================================================================
--- /dev/null
+++ e2fsprogs-1.40.5/lib/ext2fs/block.h
@@ -0,0 +1,33 @@
+/*
+ * block.h --- header for block iteration in block.c, extents.c
+ *
+ * Copyright (C) 1993, 1994, 1995, 1996 Theodore Ts'o.
+ *
+ * %Begin-Header%
+ * This file may be redistributed under the terms of the GNU Public
+ * License.
+ * %End-Header%
+ */
+
+struct block_context {
+ ext2_filsys fs;
+ int (*func)(ext2_filsys fs,
+ blk_t *blocknr,
+ e2_blkcnt_t bcount,
+ blk_t ref_blk,
+ int ref_offset,
+ void *priv_data);
+ e2_blkcnt_t bcount;
+ int bsize;
+ int flags;
+ errcode_t errcode;
+ char *ind_buf;
+ char *dind_buf;
+ char *tind_buf;
+ void *priv_data;
+};
+
+/* libext2fs internal function, in extents.c */
+extern int block_iterate_extents(void *eh_buf, unsigned bufsize,blk_t ref_block,
+ int ref_offset EXT2FS_ATTR((unused)),
+ struct block_context *ctx);
Index: e2fsprogs-1.40.5/lib/ext2fs/bmap.c
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/bmap.c
+++ e2fsprogs-1.40.5/lib/ext2fs/bmap.c
@@ -17,6 +17,7 @@

#include "ext2_fs.h"
#include "ext2fs.h"
+#include "ext3_extents.h"

#if defined(__GNUC__) && !defined(NO_INLINE_FUNCS)
#define _BMAP_INLINE_ __inline__
@@ -31,6 +32,65 @@ extern errcode_t ext2fs_bmap(ext2_filsys

#define inode_bmap(inode, nr) ((inode)->i_block[(nr)])

+/* see also block_iterate_extents() */
+static errcode_t block_bmap_extents(void *eh_buf, unsigned bufsize,
+ ext2_filsys fs, blk_t block,blk_t *phys_blk)
+{
+ struct ext3_extent_header *eh = eh_buf;
+ struct ext3_extent *ex;
+ errcode_t ret = 0;
+ int i;
+
+ ret = ext2fs_extent_header_verify(eh, bufsize);
+ if (ret)
+ return ret;
+
+ if (eh->eh_depth == 0) {
+ ex = EXT_FIRST_EXTENT(eh);
+ for (i = 0; i < eh->eh_entries; i++, ex++) {
+ if (block < ex->ee_block)
+ continue;
+
+ if (block < ex->ee_block + ex->ee_len)
+ /* FIXME: 48-bit support */
+ *phys_blk = ex->ee_start + block - ex->ee_block;
+
+ /* only the first extent > block could hold the block
+ * otherwise the extents would overlap */
+ break;
+ }
+ } else {
+ struct ext3_extent_idx *ix;
+ char *block_buf;
+
+ ret = ext2fs_get_mem(fs->blocksize, &block_buf);
+ if (ret)
+ return ret;
+
+ ix = EXT_FIRST_INDEX(eh);
+ for (i = 0; i < eh->eh_entries; i++, ix++) {
+ if (block < ix->ei_block)
+ continue;
+
+ ret = io_channel_read_blk(fs->io, ix->ei_leaf, 1,
+ block_buf);
+ if (ret)
+ goto free_buf;
+
+ ret = block_bmap_extents(block_buf, fs->blocksize,
+ fs, block, phys_blk);
+
+ /* only the first extent > block could hold the block
+ * otherwise the extents would overlap */
+ break;
+ }
+
+ free_buf:
+ ext2fs_free_mem(&block_buf);
+ }
+ return ret;
+}
+
static _BMAP_INLINE_ errcode_t block_ind_bmap(ext2_filsys fs, int flags,
blk_t ind, char *block_buf,
int *blocks_alloc,
@@ -149,6 +209,16 @@ errcode_t ext2fs_bmap(ext2_filsys fs, ex
return retval;
inode = &inode_buf;
}
+
+ if (inode->i_flags & EXT4_EXTENTS_FL) {
+ if (bmap_flags) /* unsupported as yet */
+ return EXT2_ET_BLOCK_ALLOC_FAIL;
+ retval = block_bmap_extents(inode->i_block,
+ sizeof(inode->i_block),
+ fs, block, phys_blk);
+ goto done;
+ }
+
addr_per_block = (blk_t) fs->blocksize >> 2;

if (!block_buf) {
Index: e2fsprogs-1.40.5/lib/ext2fs/ext2_err.et.in
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/ext2_err.et.in
+++ e2fsprogs-1.40.5/lib/ext2fs/ext2_err.et.in
@@ -326,5 +326,17 @@ ec EXT2_ET_TDB_ERR_NOEXIST,
ec EXT2_ET_TDB_ERR_RDONLY,
"TDB: Write not permitted"

+ec EXT2_ET_EXTENT_HEADER_BAD,
+ "Corrupt extent header"
+
+ec EXT2_ET_EXTENT_INDEX_BAD,
+ "Corrupt extent index"
+
+ec EXT2_ET_EXTENT_LEAF_BAD,
+ "Corrupt extent"
+
+ec EXT2_ET_EXTENT_NO_SPACE,
+ "No free space in extent map"
+
end

Index: e2fsprogs-1.40.5/lib/ext2fs/ext2fs.h
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/ext2fs.h
+++ e2fsprogs-1.40.5/lib/ext2fs/ext2fs.h
@@ -436,12 +436,14 @@ typedef struct ext2_icount *ext2_icount_
EXT3_FEATURE_INCOMPAT_JOURNAL_DEV|\
EXT2_FEATURE_INCOMPAT_META_BG|\
EXT3_FEATURE_INCOMPAT_RECOVER|\
+ EXT3_FEATURE_INCOMPAT_EXTENTS|\
EXT4_FEATURE_INCOMPAT_FLEX_BG)
#else
#define EXT2_LIB_FEATURE_INCOMPAT_SUPP (EXT2_FEATURE_INCOMPAT_FILETYPE|\
EXT3_FEATURE_INCOMPAT_JOURNAL_DEV|\
EXT2_FEATURE_INCOMPAT_META_BG|\
EXT3_FEATURE_INCOMPAT_RECOVER|\
+ EXT3_FEATURE_INCOMPAT_EXTENTS|\
EXT4_FEATURE_INCOMPAT_FLEX_BG)
#endif
#define EXT2_LIB_FEATURE_RO_COMPAT_SUPP (EXT2_FEATURE_RO_COMPAT_SPARSE_SUPER|\
@@ -722,6 +724,21 @@ extern errcode_t ext2fs_adjust_ea_refcou
char *block_buf,
int adjust, __u32 *newcount);

+/* extents.c */
+errcode_t ext2fs_extent_header_verify(struct ext3_extent_header *eh, int size);
+errcode_t ext2fs_extent_verify(ext2_filsys fs, struct ext3_extent *ex,
+ struct ext3_extent *ex_prev,
+ struct ext3_extent_idx *ix, int ix_len);
+errcode_t ext2fs_extent_index_verify(ext2_filsys fs,
+ struct ext3_extent_idx *ix,
+ struct ext3_extent_idx *ix_prev);
+errcode_t ext2fs_extent_remove(struct ext3_extent_header *eh,
+ struct ext3_extent *ex);
+errcode_t ext2fs_extent_split(ext2_filsys fs, struct ext3_extent_header **eh,
+ struct ext3_extent **ex, int count, int *flag);
+errcode_t ext2fs_extent_index_remove(struct ext3_extent_header *eh,
+ struct ext3_extent_idx *ix);
+
/* fileio.c */
extern errcode_t ext2fs_file_open2(ext2_filsys fs, ext2_ino_t ino,
struct ext2_inode *inode,
@@ -810,6 +827,8 @@ extern errcode_t ext2fs_image_bitmap_rea
/* ind_block.c */
errcode_t ext2fs_read_ind_block(ext2_filsys fs, blk_t blk, void *buf);
errcode_t ext2fs_write_ind_block(ext2_filsys fs, blk_t blk, void *buf);
+errcode_t ext2fs_read_ext_block(ext2_filsys fs, blk_t blk, void *buf);
+errcode_t ext2fs_write_ext_block(ext2_filsys fs, blk_t blk, void *buf);

/* initialize.c */
extern errcode_t ext2fs_initialize(const char *name, int flags,
@@ -984,6 +1003,9 @@ extern void ext2fs_swap_inode_full(ext2_
int bufsize);
extern void ext2fs_swap_inode(ext2_filsys fs,struct ext2_inode *t,
struct ext2_inode *f, int hostorder);
+extern void ext2fs_swap_extent_header(struct ext3_extent_header *eh);
+extern void ext2fs_swap_extent_index(struct ext3_extent_idx *ix);
+extern void ext2fs_swap_extent(struct ext3_extent *ex);

/* valid_blk.c */
extern int ext2fs_inode_has_valid_blocks(struct ext2_inode *inode);
Index: e2fsprogs-1.40.5/lib/ext2fs/extents.c
===================================================================
--- /dev/null
+++ e2fsprogs-1.40.5/lib/ext2fs/extents.c
@@ -0,0 +1,475 @@
+/*
+ * extents.c --- iterate over all blocks in an extent-mapped inode
+ *
+ * Copyright (C) 2005 Alex Tomas <[email protected]>
+ * Copyright (C) 2006 Andreas Dilger <[email protected]>
+ *
+ * %Begin-Header%
+ * This file may be redistributed under the terms of the GNU Public
+ * License.
+ * %End-Header%
+ */
+
+#include <stdio.h>
+#include <string.h>
+#if HAVE_UNISTD_H
+#include <unistd.h>
+#endif
+
+#include "ext2_fs.h"
+#include "ext2fs.h"
+#include "block.h"
+
+#ifdef EXT_DEBUG
+void ext_show_header(struct ext3_extent_header *eh)
+{
+ printf("header: magic=%x entries=%u max=%u depth=%u generation=%u\n",
+ eh->eh_magic, eh->eh_entries, eh->eh_max, eh->eh_depth,
+ eh->eh_generation);
+}
+
+void ext_show_index(struct ext3_extent_idx *ix)
+{
+ printf("index: block=%u leaf=%u leaf_hi=%u unused=%u\n",
+ ix->ei_block, ix->ei_leaf, ix->ei_leaf_hi, ix->ei_unused);
+}
+
+void ext_show_extent(struct ext3_extent *ex)
+{
+ printf("extent: block=%u-%u len=%u start=%u start_hi=%u\n",
+ ex->ee_block, ex->ee_block + ex->ee_len - 1,
+ ex->ee_len, ex->ee_start, ex->ee_start_hi);
+}
+
+#define ext_printf(fmt, args...) printf(fmt, ## args)
+#else
+#define ext_show_header(eh) do { } while (0)
+#define ext_show_index(ix) do { } while (0)
+#define ext_show_extent(ex) do { } while (0)
+#define ext_printf(fmt, args...) do { } while (0)
+#endif
+
+errcode_t ext2fs_extent_header_verify(struct ext3_extent_header *eh, int size)
+{
+ int eh_max, entry_size;
+
+ ext_show_header(eh);
+ if (eh->eh_magic != EXT3_EXT_MAGIC)
+ return EXT2_ET_EXTENT_HEADER_BAD;
+ if (eh->eh_entries > eh->eh_max)
+ return EXT2_ET_EXTENT_HEADER_BAD;
+ if (eh->eh_depth == 0)
+ entry_size = sizeof(struct ext3_extent);
+ else
+ entry_size = sizeof(struct ext3_extent_idx);
+
+ eh_max = (size - sizeof(*eh)) / entry_size;
+ /* Allow two extent-sized items at the end of the block, for
+ * ext4_extent_tail with checksum in the future. */
+ if (eh->eh_max > eh_max || eh->eh_max < eh_max - 2)
+ return EXT2_ET_EXTENT_HEADER_BAD;
+
+ return 0;
+}
+
+/* Verify that a single extent @ex is valid. If @ex_prev is passed in,
+ * then this was the previous logical extent in this block and we can
+ * do additional sanity checking (though in case of error we don't know
+ * which of the two extents is bad). Similarly, if @ix is passed in
+ * we can check that this extent is logically part of the index that
+ * refers to it (though again we can't know which of the two is bad). */
+errcode_t ext2fs_extent_verify(ext2_filsys fs, struct ext3_extent *ex,
+ struct ext3_extent *ex_prev,
+ struct ext3_extent_idx *ix, int ix_len)
+{
+ ext_show_extent(ex);
+ /* FIXME: 48-bit support */
+ if (ex->ee_start > fs->super->s_blocks_count)
+ return EXT2_ET_EXTENT_LEAF_BAD;
+
+ if (ex->ee_len == 0)
+ return EXT2_ET_EXTENT_LEAF_BAD;
+
+ if (ex->ee_len >= fs->super->s_blocks_per_group)
+ return EXT2_ET_EXTENT_LEAF_BAD;
+
+ if (ex_prev) {
+ /* We can't have a zero logical block except for first index */
+ if (ex->ee_block == 0)
+ return EXT2_ET_EXTENT_LEAF_BAD;
+
+ /* FIXME: 48-bit support */
+ /* extents must be in logical offset order */
+ if (ex->ee_block < ex_prev->ee_block + ex_prev->ee_len)
+ return EXT2_ET_EXTENT_LEAF_BAD;
+
+ /* extents must not overlap physical blocks */
+ if ((ex->ee_start < ex_prev->ee_start + ex_prev->ee_len) &&
+ (ex->ee_start + ex->ee_len > ex_prev->ee_start))
+ return EXT2_ET_EXTENT_LEAF_BAD;
+ }
+
+ if (ix) {
+ /* FIXME: 48-bit support */
+ if (ex->ee_block < ix->ei_block)
+ return EXT2_ET_EXTENT_LEAF_BAD;
+
+ if (ix_len && ex->ee_block + ex->ee_len > ix->ei_block + ix_len)
+ return EXT2_ET_EXTENT_LEAF_BAD;
+ }
+
+ return 0;
+}
+
+errcode_t ext2fs_extent_index_verify(ext2_filsys fs, struct ext3_extent_idx *ix,
+ struct ext3_extent_idx *ix_prev)
+{
+ ext_show_index(ix);
+ /* FIXME: 48-bit support */
+ if (ix->ei_leaf > fs->super->s_blocks_count)
+ return EXT2_ET_EXTENT_INDEX_BAD;
+
+ if (ix_prev == NULL)
+ return 0;
+
+ /* We can't have a zero logical block except for first index */
+ if (ix->ei_block == 0)
+ return EXT2_ET_EXTENT_INDEX_BAD;
+
+ if (ix->ei_block <= ix_prev->ei_block)
+ return EXT2_ET_EXTENT_INDEX_BAD;
+
+ return 0;
+}
+
+errcode_t ext2fs_extent_remove(struct ext3_extent_header *eh,
+ struct ext3_extent *ex)
+{
+ int offs = ex - EXT_FIRST_EXTENT(eh);
+
+ if (offs < 0 || offs >= eh->eh_entries)
+ return EXT2_ET_EXTENT_LEAF_BAD;
+
+ ext_printf("remove extent: offset %u\n", offs);
+
+ memmove(ex, ex + 1, (eh->eh_entries - offs - 1) * sizeof(*ex));
+ --eh->eh_entries;
+
+ return 0;
+}
+
+static errcode_t ext2fs_extent_split_internal(struct ext3_extent_header *eh,
+ struct ext3_extent *ex, int offs)
+{
+ int entry = ex - EXT_FIRST_EXTENT(eh);
+ struct ext3_extent *ex_new = ex + 1;
+
+ ext_printf("split: ee_len: %u ee_block: %u ee_start: %u offset: %u\n",
+ ex->ee_len, ex->ee_block, ex->ee_start, offs);
+ memmove(ex_new, ex, (eh->eh_entries - entry) * sizeof(*ex));
+ ++eh->eh_entries;
+
+ ex->ee_len = offs;
+ /* FIXME: 48-bit support */
+ ex_new->ee_len -= offs;
+ ex_new->ee_block += offs;
+ ex_new->ee_start += offs;
+
+ return 0;
+}
+
+errcode_t ext2fs_extent_split(ext2_filsys fs,
+ struct ext3_extent_header **eh_orig,
+ struct ext3_extent **ex_orig, int offs, int *flag)
+{
+ struct ext3_extent_header *eh_parent = *eh_orig;
+ int retval, entry = *ex_orig - EXT_FIRST_EXTENT(eh_parent);
+ blk_t new_block;
+ char *buf;
+ struct ext3_extent_idx *ei = EXT_FIRST_INDEX(eh_parent);
+
+ if (entry < 0 || entry >= (*eh_orig)->eh_entries)
+ return EXT2_ET_EXTENT_LEAF_BAD;
+
+ if (offs > (*ex_orig)->ee_len)
+ return EXT2_ET_EXTENT_LEAF_BAD;
+
+ if (eh_parent->eh_entries >= eh_parent->eh_max) {
+ ext_printf("split: eh_entries: %u eh_max: %u\n",
+ eh_parent->eh_entries, eh_parent->eh_max);
+ if (eh_parent->eh_max == 4) {
+ struct ext3_extent_header *eh_child;
+ struct ext3_extent *ex_child;
+
+ retval = ext2fs_get_mem(fs->blocksize, &buf);
+
+ if (retval)
+ return EXT2_ET_EXTENT_NO_SPACE;
+
+ memset(buf, 0, fs->blocksize);
+ memcpy(buf, eh_parent, sizeof(*eh_parent) +
+ eh_parent->eh_entries * sizeof(*ex_child));
+ eh_child = (struct ext3_extent_header *)buf;
+
+ eh_child->eh_max = (fs->blocksize -
+ sizeof(struct ext3_extent_header)) /
+ sizeof(struct ext3_extent);
+ retval = ext2fs_new_block(fs, (*ex_orig)->ee_block, 0,
+ &new_block);
+ if (retval)
+ return EXT2_ET_EXTENT_NO_SPACE;
+
+ retval = io_channel_write_blk(fs->io, new_block, 1,buf);
+ if (retval)
+ return EXT2_ET_EXTENT_NO_SPACE;
+
+ eh_parent->eh_entries = 1;
+ eh_parent->eh_depth = 1;
+
+ ex_child = EXT_FIRST_EXTENT(eh_child);
+ ei->ei_block = ex_child->ee_block;
+ /* FIXME: 48-bit support*/
+ ei->ei_leaf = new_block;
+
+ *eh_orig = eh_child;
+ *ex_orig = EXT_FIRST_EXTENT(eh_child) + entry;
+
+ *flag = BLOCK_CHANGED;
+ } else {
+ return EXT2_ET_EXTENT_NO_SPACE;
+ }
+ }
+
+ return ext2fs_extent_split_internal(*eh_orig, *ex_orig, offs);
+}
+
+errcode_t ext2fs_extent_index_remove(struct ext3_extent_header *eh,
+ struct ext3_extent_idx *ix)
+{
+ struct ext3_extent_idx *first = EXT_FIRST_INDEX(eh);
+ int offs = ix - first;
+
+ ext_printf("remove index: offset %u\n", offs);
+
+ memmove(ix, ix + 1, (eh->eh_entries - offs - 1) * sizeof(*ix));
+ --eh->eh_entries;
+
+ return 0;
+}
+
+/* Internal function for ext2fs_block_iterate2() to recursively walk the
+ * extent tree, with a callback function for each block. We also call the
+ * callback function on index blocks unless BLOCK_FLAG_DATA_ONLY is given.
+ * We traverse the tree pre-order (internal nodes before their children)
+ * unless BLOCK_FLAG_DEPTH_TRAVERSE is given.
+ *
+ * See also block_bmap_extents(). */
+int block_iterate_extents(void *eh_buf, unsigned bufsize, blk_t ref_block,
+ int ref_offset EXT2FS_ATTR((unused)),
+ struct block_context *ctx)
+{
+ struct ext3_extent_header *orig_eh, *eh;
+ struct ext3_extent *ex, *ex_prev = NULL;
+ int ret = 0;
+ int item, offs, flags, split_flag = 0;
+ blk_t block_address;
+
+ orig_eh = eh = eh_buf;
+
+ if (ext2fs_extent_header_verify(eh, bufsize))
+ return BLOCK_ERROR;
+
+ if (eh->eh_depth == 0) {
+ ex = EXT_FIRST_EXTENT(eh);
+ for (item = 0; item < eh->eh_entries; item++, ex++) {
+ ext_show_extent(ex);
+ for (offs = 0; offs < ex->ee_len; offs++) {
+ block_address = ex->ee_start + offs;
+ flags = (*ctx->func)(ctx->fs, &block_address,
+ (ex->ee_block + offs),
+ ref_block, item,
+ ctx->priv_data);
+ if (flags & (BLOCK_ABORT | BLOCK_ERROR)) {
+ ret |= flags &(BLOCK_ABORT|BLOCK_ERROR);
+ return ret;
+ }
+ if (!(flags & BLOCK_CHANGED))
+ continue;
+
+ ext_printf("extent leaf changed: "
+ "block was %u+%u = %u, now %u\n",
+ ex->ee_start, offs,
+ ex->ee_start + offs, block_address);
+
+ /* FIXME: 48-bit support */
+ if (ex_prev &&
+ block_address ==
+ ex_prev->ee_start + ex_prev->ee_len &&
+ ex->ee_block + offs ==
+ ex_prev->ee_block + ex_prev->ee_len) {
+ /* can merge block with prev extent */
+ ex_prev->ee_len++;
+ ex->ee_len--;
+ ret |= BLOCK_CHANGED;
+
+ if (ex->ee_len == 0) {
+ /* no blocks left in this one */
+ ext2fs_extent_remove(eh, ex);
+ item--; ex--;
+ break;
+ } else {
+ /* FIXME: 48-bit support */
+ ex->ee_start++;
+ ex->ee_block++;
+ offs--;
+ }
+
+ } else if (offs > 0 && /* implies ee_len > 1 */
+ (ctx->errcode =
+ ext2fs_extent_split(ctx->fs, &eh,
+ &ex, offs,
+ &split_flag)
+ /* advance ex past newly split item,
+ * comparison is bogus to make sure
+ * increment doesn't change logic */
+ || (offs > 0 && ex++ == NULL))) {
+ /* split before new block failed */
+ ret |= BLOCK_ABORT | BLOCK_ERROR;
+ return ret;
+
+ } else if (ex->ee_len > 1 &&
+ (ctx->errcode =
+ ext2fs_extent_split(ctx->fs, &eh,
+ &ex, 1,
+ &split_flag))) {
+ /* split after new block failed */
+ ret |= BLOCK_ABORT | BLOCK_ERROR;
+ return ret;
+
+ } else {
+ if (ex->ee_len != 1) {
+ /* this is an internal error */
+ ctx->errcode =
+ EXT2_ET_EXTENT_INDEX_BAD;
+ ret |= BLOCK_ABORT |BLOCK_ERROR;
+ return ret;
+ }
+ /* FIXME: 48-bit support */
+ ex->ee_start = block_address;
+ ret |= BLOCK_CHANGED;
+ }
+ }
+ ex_prev = ex;
+ }
+ /* Multi level split at depth == 0.
+ * ex has been changed to point to newly allocated block
+ * buffer. And after returning in this scenario, only inode is
+ * updated with changed i_block. Hence explicitly write to the
+ * block is required. */
+ if (split_flag == BLOCK_CHANGED) {
+ struct ext3_extent_idx *ix = EXT_FIRST_INDEX(orig_eh);
+ ctx->errcode = ext2fs_write_ext_block(ctx->fs,
+ ix->ei_leaf, eh);
+ }
+ } else {
+ char *block_buf;
+ struct ext3_extent_idx *ix;
+
+ ret = ext2fs_get_mem(ctx->fs->blocksize, &block_buf);
+ if (ret)
+ return ret;
+
+ ext_show_header(eh);
+ ix = EXT_FIRST_INDEX(eh);
+ for (item = 0; item < eh->eh_entries; item++, ix++) {
+ ext_show_index(ix);
+ /* index is processed first in e2fsck case */
+ if (!(ctx->flags & BLOCK_FLAG_DEPTH_TRAVERSE) &&
+ !(ctx->flags & BLOCK_FLAG_DATA_ONLY)) {
+ block_address = ix->ei_leaf;
+ flags = (*ctx->func)(ctx->fs, &block_address,
+ BLOCK_COUNT_IND, ref_block,
+ item, ctx->priv_data);
+ if (flags & (BLOCK_ABORT | BLOCK_ERROR)) {
+ ret |= flags &(BLOCK_ABORT|BLOCK_ERROR);
+ goto free_buf;
+ }
+ if (flags & BLOCK_CHANGED) {
+ ret |= BLOCK_CHANGED;
+ /* index has no more block, remove it */
+ /* FIXME: 48-bit support */
+ ix->ei_leaf = block_address;
+ if (ix->ei_leaf == 0 &&
+ ix->ei_leaf_hi == 0) {
+ if(ext2fs_extent_index_remove(eh, ix)) {
+ ret |= BLOCK_ABORT |BLOCK_ERROR;
+ goto free_buf;
+ } else {
+ --item; --ix;
+ continue;
+ }
+ }
+ /* remapped? */
+ }
+ }
+ ctx->errcode = ext2fs_read_ext_block(ctx->fs,
+ ix->ei_leaf,
+ block_buf);
+ if (ctx->errcode) {
+ ret |= BLOCK_ERROR;
+ goto free_buf;
+ }
+ flags = block_iterate_extents(block_buf,
+ ctx->fs->blocksize,
+ ix->ei_leaf, item, ctx);
+ if (flags & BLOCK_CHANGED) {
+ struct ext3_extent_header *nh;
+ ctx->errcode =
+ ext2fs_write_ext_block(ctx->fs,
+ ix->ei_leaf,
+ block_buf);
+
+ nh = (struct ext3_extent_header *)block_buf;
+ if (nh->eh_entries == 0)
+ ix->ei_leaf = ix->ei_leaf_hi = 0;
+ }
+ if (flags & (BLOCK_ABORT | BLOCK_ERROR)) {
+ ret |= flags & (BLOCK_ABORT | BLOCK_ERROR);
+ goto free_buf;
+ }
+ if ((ctx->flags & BLOCK_FLAG_DEPTH_TRAVERSE) &&
+ !(ctx->flags & BLOCK_FLAG_DATA_ONLY)) {
+ flags = (*ctx->func)(ctx->fs, &block_address,
+ BLOCK_COUNT_IND, ref_block,
+ item, ctx->priv_data);
+ if (flags & (BLOCK_ABORT | BLOCK_ERROR)) {
+ ret |= flags &(BLOCK_ABORT|BLOCK_ERROR);
+ goto free_buf;
+ }
+ if (flags & BLOCK_CHANGED)
+ /* FIXME: 48-bit support */
+ ix->ei_leaf = block_address;
+ }
+
+ if (flags & BLOCK_CHANGED) {
+ /* index has no more block, remove it */
+ if (ix->ei_leaf == 0 && ix->ei_leaf_hi == 0 &&
+ ext2fs_extent_index_remove(eh, ix)) {
+ ret |= BLOCK_ABORT |BLOCK_ERROR;
+ goto free_buf;
+ }
+
+ ret |= BLOCK_CHANGED;
+ if (ref_block == 0) {
+ --item; --ix;
+ continue;
+ }
+ /* remapped? */
+ }
+ }
+
+ free_buf:
+ ext2fs_free_mem(&block_buf);
+ }
+ return ret;
+}
Index: e2fsprogs-1.40.5/lib/ext2fs/ind_block.c
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/ind_block.c
+++ e2fsprogs-1.40.5/lib/ext2fs/ind_block.c
@@ -22,9 +22,9 @@
errcode_t ext2fs_read_ind_block(ext2_filsys fs, blk_t blk, void *buf)
{
errcode_t retval;
- blk_t *block_nr;
- int i;
- int limit = fs->blocksize >> 2;
+ int limit = fs->blocksize >> 2;
+ blk_t *block_nr = (blk_t *)buf;
+ int i;

if ((fs->flags & EXT2_FLAG_IMAGE_FILE) &&
(fs->io != fs->image_io))
@@ -35,7 +35,6 @@ errcode_t ext2fs_read_ind_block(ext2_fil
return retval;
}
#ifdef WORDS_BIGENDIAN
- block_nr = (blk_t *) buf;
for (i = 0; i < limit; i++, block_nr++)
*block_nr = ext2fs_swab32(*block_nr);
#endif
@@ -60,3 +59,82 @@ errcode_t ext2fs_write_ind_block(ext2_fi
}


+errcode_t ext2fs_read_ext_block(ext2_filsys fs, blk_t blk, void *buf)
+{
+ errcode_t retval;
+
+ if ((fs->flags & EXT2_FLAG_IMAGE_FILE) &&
+ (fs->io != fs->image_io))
+ memset(buf, 0, fs->blocksize);
+ else {
+ retval = io_channel_read_blk(fs->io, blk, 1, buf);
+ if (retval)
+ return retval;
+ }
+#ifdef EXT2FS_ENABLE_SWAPFS
+ if (fs->flags & (EXT2_FLAG_SWAP_BYTES | EXT2_FLAG_SWAP_BYTES_READ)) {
+ struct ext3_extent_header *eh = buf;
+ int i, limit;
+
+ ext2fs_swap_extent_header(eh);
+
+ if (eh->eh_depth == 0) {
+ struct ext3_extent *ex = EXT_FIRST_EXTENT(eh);
+
+ limit = (fs->blocksize - sizeof(*eh)) / sizeof(*ex);
+ if (eh->eh_entries < limit)
+ limit = eh->eh_entries;
+
+ for (i = 0; i < limit; i++, ex++)
+ ext2fs_swap_extent(ex);
+ } else {
+ struct ext3_extent_idx *ix = EXT_FIRST_INDEX(eh);
+
+ limit = (fs->blocksize - sizeof(*eh)) / sizeof(*ix);
+ if (eh->eh_entries < limit)
+ limit = eh->eh_entries;
+
+ for (i = 0; i < limit; i++, ix++)
+ ext2fs_swap_extent_index(ix);
+ }
+ }
+#endif
+ return 0;
+}
+
+errcode_t ext2fs_write_ext_block(ext2_filsys fs, blk_t blk, void *buf)
+{
+ if (fs->flags & EXT2_FLAG_IMAGE_FILE)
+ return 0;
+
+#ifdef EXT2FS_ENABLE_SWAPFS
+ if (fs->flags & (EXT2_FLAG_SWAP_BYTES | EXT2_FLAG_SWAP_BYTES_WRITE)) {
+ struct ext3_extent_header *eh = buf;
+ int i, limit;
+
+ if (eh->eh_depth == 0) {
+ struct ext3_extent *ex = EXT_FIRST_EXTENT(eh);
+
+ limit = (fs->blocksize - sizeof(*eh)) / sizeof(*ex);
+ if (eh->eh_entries < limit)
+ limit = eh->eh_entries;
+
+ for (i = 0; i < limit; i++, ex++)
+ ext2fs_swap_extent(ex);
+ } else {
+ struct ext3_extent_idx *ix = EXT_FIRST_INDEX(eh);
+
+ limit = (fs->blocksize - sizeof(*eh)) / sizeof(*ix);
+ if (eh->eh_entries < limit)
+ limit = eh->eh_entries;
+
+ for (i = 0; i < limit; i++, ix++)
+ ext2fs_swap_extent_index(ix);
+ }
+
+ ext2fs_swap_extent_header(eh);
+ }
+#endif
+ return io_channel_write_blk(fs->io, blk, 1, buf);
+}
+
Index: e2fsprogs-1.40.5/lib/ext2fs/swapfs.c
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/swapfs.c
+++ e2fsprogs-1.40.5/lib/ext2fs/swapfs.c
@@ -142,11 +142,33 @@ void ext2fs_swap_ext_attr(char *to, char
}
}

+void ext2fs_swap_extent_header(struct ext3_extent_header *eh) {
+ eh->eh_magic = ext2fs_swab16(eh->eh_magic);
+ eh->eh_entries = ext2fs_swab16(eh->eh_entries);
+ eh->eh_max = ext2fs_swab16(eh->eh_max);
+ eh->eh_depth = ext2fs_swab16(eh->eh_depth);
+ eh->eh_generation = ext2fs_swab32(eh->eh_generation);
+}
+
+void ext2fs_swap_extent_index(struct ext3_extent_idx *ix) {
+ ix->ei_block = ext2fs_swab32(ix->ei_block);
+ ix->ei_leaf = ext2fs_swab32(ix->ei_leaf);
+ ix->ei_leaf_hi = ext2fs_swab16(ix->ei_leaf_hi);
+ ix->ei_unused = ext2fs_swab16(ix->ei_unused);
+}
+
+void ext2fs_swap_extent(struct ext3_extent *ex) {
+ ex->ee_block = ext2fs_swab32(ex->ee_block);
+ ex->ee_len = ext2fs_swab16(ex->ee_len);
+ ex->ee_start_hi =ext2fs_swab16(ex->ee_start_hi);
+ ex->ee_start = ext2fs_swab32(ex->ee_start);
+}
+
void ext2fs_swap_inode_full(ext2_filsys fs, struct ext2_inode_large *t,
struct ext2_inode_large *f, int hostorder,
int bufsize)
{
- unsigned i, has_data_blocks, extra_isize;
+ unsigned i, has_data_blocks, extra_isize, has_extents;
int islnk = 0;
__u32 *eaf, *eat;

@@ -164,18 +186,46 @@ void ext2fs_swap_inode_full(ext2_filsys
t->i_gid = ext2fs_swab16(f->i_gid);
t->i_links_count = ext2fs_swab16(f->i_links_count);
t->i_file_acl = ext2fs_swab32(f->i_file_acl);
- if (hostorder)
- has_data_blocks = ext2fs_inode_data_blocks(fs,
+ if (hostorder) {
+ has_data_blocks = ext2fs_inode_data_blocks(fs,
(struct ext2_inode *) f);
- t->i_blocks = ext2fs_swab32(f->i_blocks);
- if (!hostorder)
- has_data_blocks = ext2fs_inode_data_blocks(fs,
+ t->i_blocks = ext2fs_swab32(f->i_blocks);
+ has_extents = (f->i_flags & EXT4_EXTENTS_FL);
+ t->i_flags = ext2fs_swab32(f->i_flags);
+ } else {
+ t->i_blocks = ext2fs_swab32(f->i_blocks);
+ has_data_blocks = ext2fs_inode_data_blocks(fs,
(struct ext2_inode *) t);
+ t->i_flags = ext2fs_swab32(f->i_flags);
+ has_extents = (t->i_flags & EXT4_EXTENTS_FL);
+ }
t->i_flags = ext2fs_swab32(f->i_flags);
t->i_dir_acl = ext2fs_swab32(f->i_dir_acl);
- if (!islnk || has_data_blocks ) {
- for (i = 0; i < EXT2_N_BLOCKS; i++)
- t->i_block[i] = ext2fs_swab32(f->i_block[i]);
+ if (!islnk || has_data_blocks) {
+ if (has_extents) {
+ struct ext3_extent_header *eh;
+ int max = EXT2_N_BLOCKS * sizeof(__u32) - sizeof(*eh);
+
+ memcpy(t->i_block, f->i_block, sizeof(f->i_block));
+ eh = (struct ext3_extent_header *)t->i_block;
+ ext2fs_swap_extent_header(eh);
+
+ if (!eh->eh_depth) {
+ struct ext3_extent *ex = EXT_FIRST_EXTENT(eh);
+ max = max / sizeof(struct ext3_extent);
+ for (i = 0; i < max; i++, ex++)
+ ext2fs_swap_extent(ex);
+ } else {
+ struct ext3_extent_idx *ix =
+ EXT_FIRST_INDEX(eh);
+ max = max / sizeof(struct ext3_extent_idx);
+ for (i = 0; i < max; i++, ix++)
+ ext2fs_swap_extent_index(ix);
+ }
+ } else {
+ for (i = 0; i < EXT2_N_BLOCKS; i++)
+ t->i_block[i] = ext2fs_swab32(f->i_block[i]);
+ }
} else if (t != f) {
for (i = 0; i < EXT2_N_BLOCKS; i++)
t->i_block[i] = f->i_block[i];
@@ -218,11 +268,13 @@ void ext2fs_swap_inode_full(ext2_filsys
if (bufsize < (int) (sizeof(struct ext2_inode) + sizeof(__u16)))
return; /* no i_extra_isize field */

- if (hostorder)
+ if (hostorder) {
extra_isize = f->i_extra_isize;
- t->i_extra_isize = ext2fs_swab16(f->i_extra_isize);
- if (!hostorder)
+ t->i_extra_isize = ext2fs_swab16(f->i_extra_isize);
+ } else {
+ t->i_extra_isize = ext2fs_swab16(f->i_extra_isize);
extra_isize = t->i_extra_isize;
+ }
if (extra_isize > EXT2_INODE_SIZE(fs->super) -
sizeof(struct ext2_inode)) {
/* this is error case: i_extra_size is too large */
Index: e2fsprogs-1.40.5/lib/ext2fs/valid_blk.c
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/valid_blk.c
+++ e2fsprogs-1.40.5/lib/ext2fs/valid_blk.c
@@ -19,6 +19,7 @@

#include "ext2_fs.h"
#include "ext2fs.h"
+#include "ext3_extents.h"

/*
* This function returns 1 if the inode's block entries actually
@@ -41,12 +42,23 @@ int ext2fs_inode_has_valid_blocks(struct
if (LINUX_S_ISLNK (inode->i_mode)) {
if (inode->i_file_acl == 0) {
/* With no EA block, we can rely on i_blocks */
- if (inode->i_blocks == 0)
- return 0;
+ if (inode->i_flags & EXT4_EXTENTS_FL) {
+ struct ext3_extent_header *eh;
+ eh = (struct ext3_extent_header *)inode->i_block;
+ if (eh->eh_entries == 0)
+ return 0;
+ } else {
+ if (inode->i_blocks == 0)
+ return 0;
+ }
} else {
/* With an EA block, life gets more tricky */
if (inode->i_size >= EXT2_N_BLOCKS*4)
return 1; /* definitely using i_block[] */
+ /*
+ * we can't have EA + extents, so assume we aren't
+ * using extents
+ */
if (inode->i_size > 4 && inode->i_block[1] == 0)
return 1; /* definitely using i_block[] */
return 0; /* Probably a fast symlink */
Index: e2fsprogs-1.40.5/tests/f_bad_disconnected_inode/expect.1
===================================================================
--- e2fsprogs-1.40.5.orig/tests/f_bad_disconnected_inode/expect.1
+++ e2fsprogs-1.40.5/tests/f_bad_disconnected_inode/expect.1
@@ -1,4 +1,10 @@
Pass 1: Checking inodes, blocks, and sizes
+Inode 15 has EXTENT_FL set, but is not in extents format
+Fix? yes
+
+Inode 16 has EXTENT_FL set, but is not in extents format
+Fix? yes
+
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
/lost+found not found. Create? yes
Index: e2fsprogs-1.40.5/tests/f_bbfile/expect.1
===================================================================
--- e2fsprogs-1.40.5.orig/tests/f_bbfile/expect.1
+++ e2fsprogs-1.40.5/tests/f_bbfile/expect.1
@@ -3,46 +3,60 @@ Filesystem did not have a UUID; generati
Pass 1: Checking inodes, blocks, and sizes
Group 0's inode bitmap (4) is bad. Relocate? yes

+Inode 11 has corrupt indirect block
+Clear? yes
+
Relocating group 0's inode bitmap from 4 to 43...
+Restarting e2fsck from the beginning...
+Pass 1: Checking inodes, blocks, and sizes

Running additional passes to resolve blocks claimed by more than one inode...
Pass 1B: Rescanning for multiply-claimed blocks
Multiply-claimed block(s) in inode 2: 21
-Multiply-claimed block(s) in inode 11: 9 10 11 12 13 14 15 16 17 18 19 20
Multiply-claimed block(s) in inode 12: 25 26
Pass 1C: Scanning directories for inodes with multiply-claimed blocks
Pass 1D: Reconciling multiply-claimed blocks
-(There are 3 inodes containing multiply-claimed blocks.)
+(There are 2 inodes containing multiply-claimed blocks.)

File / (inode #2, mod time Sun Jan 2 08:29:13 1994)
has 1 multiply-claimed block(s), shared with 1 file(s):
<The bad blocks inode> (inode #1, mod time Sun Jul 17 00:47:58 1994)
Clone multiply-claimed blocks? yes

-File /lost+found (inode #11, mod time Sun Jan 2 08:28:40 1994)
- has 12 multiply-claimed block(s), shared with 1 file(s):
- <The bad blocks inode> (inode #1, mod time Sun Jul 17 00:47:58 1994)
-Clone multiply-claimed blocks? yes
-
File /termcap (inode #12, mod time Sun Jan 2 08:29:13 1994)
has 2 multiply-claimed block(s), shared with 1 file(s):
<The bad blocks inode> (inode #1, mod time Sun Jul 17 00:47:58 1994)
Clone multiply-claimed blocks? yes

Pass 2: Checking directory structure
+Entry 'lost+found' in / (2) has deleted/unused inode 11. Clear? yes
+
Pass 3: Checking directory connectivity
+/lost+found not found. Create? yes
+
Pass 4: Checking reference counts
+Inode 2 ref count is 4, should be 3. Fix? yes
+
Pass 5: Checking group summary information
Block bitmap differences: +43
Fix? yes

-Free blocks count wrong for group #0 (57, counted=41).
+Free blocks count wrong for group #0 (56, counted=52).
+Fix? yes
+
+Free blocks count wrong (56, counted=52).
+Fix? yes
+
+Free inodes count wrong for group #0 (19, counted=20).
+Fix? yes
+
+Directories count wrong for group #0 (3, counted=2).
Fix? yes

-Free blocks count wrong (57, counted=41).
+Free inodes count wrong (19, counted=20).
Fix? yes


test_filesys: ***** FILE SYSTEM WAS MODIFIED *****
-test_filesys: 12/32 files (0.0% non-contiguous), 59/100 blocks
+test_filesys: 12/32 files (0.0% non-contiguous), 48/100 blocks
Exit status is 1
Index: e2fsprogs-1.40.5/tests/f_bbfile/expect.2
===================================================================
--- e2fsprogs-1.40.5.orig/tests/f_bbfile/expect.2
+++ e2fsprogs-1.40.5/tests/f_bbfile/expect.2
@@ -3,5 +3,5 @@ Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
-test_filesys: 12/32 files (8.3% non-contiguous), 59/100 blocks
+test_filesys: 12/32 files (8.3% non-contiguous), 48/100 blocks
Exit status is 0
Index: e2fsprogs-1.40.5/tests/f_lotsbad/expect.1
===================================================================
--- e2fsprogs-1.40.5.orig/tests/f_lotsbad/expect.1
+++ e2fsprogs-1.40.5/tests/f_lotsbad/expect.1
@@ -8,54 +8,41 @@ Inode 13, i_size is 15360, should be 122

Inode 13, i_blocks is 32, should be 30. Fix? yes

-Inode 12 has illegal block(s). Clear? yes
+Inode 12 has corrupt indirect block
+Clear? yes

-Illegal block #12 (778398818) in inode 12. CLEARED.
-Illegal block #13 (1768444960) in inode 12. CLEARED.
-Illegal block #14 (1752375411) in inode 12. CLEARED.
-Illegal block #15 (1684829551) in inode 12. CLEARED.
-Illegal block #16 (1886349344) in inode 12. CLEARED.
-Illegal block #17 (1819633253) in inode 12. CLEARED.
-Illegal block #18 (1663072620) in inode 12. CLEARED.
-Illegal block #19 (1735287144) in inode 12. CLEARED.
-Illegal block #20 (1310731877) in inode 12. CLEARED.
-Illegal block #21 (560297071) in inode 12. CLEARED.
-Illegal block #22 (543512352) in inode 12. CLEARED.
-Too many illegal blocks in inode 12.
-Clear inode? yes
+Inode 12, i_blocks is 34, should be 24. Fix? yes

-Restarting e2fsck from the beginning...
-Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
-Entry 'termcap' in / (2) has deleted/unused inode 12. Clear? yes
+Directory inode 13 has an unallocated block #16580876. Allocate? yes

Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Inode 2 ref count is 5, should be 4. Fix? yes

Pass 5: Checking group summary information
-Block bitmap differences: -(27--41) -(44--45) -(74--90)
+Block bitmap differences: -(38--41) -(74--90)
Fix? yes

-Free blocks count wrong for group #0 (9, counted=43).
+Free blocks count wrong for group #0 (9, counted=30).
Fix? yes

-Free blocks count wrong (9, counted=43).
+Free blocks count wrong (9, counted=30).
Fix? yes

-Inode bitmap differences: -12 -14
+Inode bitmap differences: -14
Fix? yes

-Free inodes count wrong for group #0 (18, counted=20).
+Free inodes count wrong for group #0 (18, counted=19).
Fix? yes

Directories count wrong for group #0 (4, counted=3).
Fix? yes

-Free inodes count wrong (18, counted=20).
+Free inodes count wrong (18, counted=19).
Fix? yes


test_filesys: ***** FILE SYSTEM WAS MODIFIED *****
-test_filesys: 12/32 files (0.0% non-contiguous), 57/100 blocks
+test_filesys: 13/32 files (7.7% non-contiguous), 70/100 blocks
Exit status is 1
Index: e2fsprogs-1.40.5/tests/f_lotsbad/expect.2
===================================================================
--- e2fsprogs-1.40.5.orig/tests/f_lotsbad/expect.2
+++ e2fsprogs-1.40.5/tests/f_lotsbad/expect.2
@@ -1,7 +1,18 @@
Pass 1: Checking inodes, blocks, and sizes
+Inode 13 is too big. Truncate? yes
+
+Block #16580876 (37) causes directory to be too big. CLEARED.
+Inode 13, i_size is 4093916160, should be 12288. Fix? yes
+
+Inode 13, i_blocks is 32, should be 30. Fix? yes
+
Pass 2: Checking directory structure
+Directory inode 13 has an unallocated block #16580876. Allocate? yes
+
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
-test_filesys: 12/32 files (0.0% non-contiguous), 57/100 blocks
-Exit status is 0
+
+test_filesys: ***** FILE SYSTEM WAS MODIFIED *****
+test_filesys: 13/32 files (15.4% non-contiguous), 70/100 blocks
+Exit status is 1
Index: e2fsprogs-1.40.5/tests/f_messy_inode/expect.1
===================================================================
--- e2fsprogs-1.40.5.orig/tests/f_messy_inode/expect.1
+++ e2fsprogs-1.40.5/tests/f_messy_inode/expect.1
@@ -1,38 +1,36 @@
Filesystem did not have a UUID; generating one.

Pass 1: Checking inodes, blocks, and sizes
-Inode 14 has illegal block(s). Clear? yes
-
-Illegal block #2 (4294901760) in inode 14. CLEARED.
-Illegal block #3 (4294901760) in inode 14. CLEARED.
-Illegal block #4 (4294901760) in inode 14. CLEARED.
-Illegal block #5 (4294901760) in inode 14. CLEARED.
-Illegal block #6 (4294901760) in inode 14. CLEARED.
-Illegal block #7 (4294901760) in inode 14. CLEARED.
-Illegal block #8 (4294901760) in inode 14. CLEARED.
-Illegal block #9 (4294901760) in inode 14. CLEARED.
-Illegal block #10 (4294901760) in inode 14. CLEARED.
-Inode 14, i_size is 18446462598732849291, should be 2048. Fix? yes
-
-Inode 14, i_blocks is 18, should be 4. Fix? yes
+Inode 14 has corrupt indirect block
+Clear? yes

+Restarting e2fsck from the beginning...
+Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
-i_file_acl for inode 14 (/MAKEDEV) is 4294901760, should be zero.
-Clear? yes
+Entry 'MAKEDEV' in / (2) has deleted/unused inode 14. Clear? yes

Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
-Block bitmap differences: -(43--49)
+Block bitmap differences: -(41--49)
+Fix? yes
+
+Free blocks count wrong for group #0 (68, counted=77).
+Fix? yes
+
+Free blocks count wrong (68, counted=77).
+Fix? yes
+
+Inode bitmap differences: -14
Fix? yes

-Free blocks count wrong for group #0 (68, counted=75).
+Free inodes count wrong for group #0 (3, counted=4).
Fix? yes

-Free blocks count wrong (68, counted=75).
+Free inodes count wrong (3, counted=4).
Fix? yes


test_filesys: ***** FILE SYSTEM WAS MODIFIED *****
-test_filesys: 29/32 files (3.4% non-contiguous), 25/100 blocks
+test_filesys: 28/32 files (0.0% non-contiguous), 23/100 blocks
Exit status is 1
Index: e2fsprogs-1.40.5/tests/f_messy_inode/expect.2
===================================================================
--- e2fsprogs-1.40.5.orig/tests/f_messy_inode/expect.2
+++ e2fsprogs-1.40.5/tests/f_messy_inode/expect.2
@@ -3,5 +3,5 @@ Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
-test_filesys: 29/32 files (0.0% non-contiguous), 25/100 blocks
+test_filesys: 28/32 files (0.0% non-contiguous), 23/100 blocks
Exit status is 0

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2008-02-02 08:27:08

by Andreas Dilger

Subject: Re: [PATCH][7/28] e2fsprogs-extents.patch

Support for checking 32-bit extents format inodes and the INCOMPAT_EXTENTS
feature.

Clear the high 16 bits of extents and index entries, since the
extents patches did not do this explicitly. Some parts of this
code need fixing for checking > 32-bit block filesystems (when
INCOMPAT_64BIT support is added), marked "FIXME: 48-bit support".
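
As a rough illustration of the "clear the high 16 bits" step, here is
a minimal sketch. The struct sketch_extent type and the
sketch_clear_start_hi() helper are hypothetical stand-ins, not the
library's types or functions. The field order is assumed from the
byte-swap helper in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct ext3_extent; not the library type. */
struct sketch_extent {
	uint32_t ee_block;    /* first logical block covered */
	uint16_t ee_len;      /* number of blocks covered */
	uint16_t ee_start_hi; /* high 16 bits of the physical block */
	uint32_t ee_start;    /* low 32 bits of the physical block */
};

/* Clear the unused high bits of the physical start block.
 * Returns 1 if the entry was modified (caller must write it back),
 * 0 if it was already clean. */
static int sketch_clear_start_hi(struct sketch_extent *ex)
{
	assert(ex != NULL);
	if (ex->ee_start_hi == 0)
		return 0;
	ex->ee_start_hi = 0;
	return 1;
}
```

The return value lets a caller decide whether the containing block or
inode needs to be marked dirty.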

Verify extent headers in blocks, logical ordering of extents,
logical ordering of indexes.
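
The logical ordering rule for extents within one block can be sketched
as below. This is a simplified illustration, not the code from the
patch: sketch_extent and sketch_first_misordered() are hypothetical
names, and only the zero-length and ordering checks are shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced extent record used only for this sketch. */
struct sketch_extent {
	uint32_t ee_block; /* first logical block covered */
	uint16_t ee_len;   /* number of blocks covered */
	uint32_t ee_start; /* first physical block */
};

/* Return the index of the first entry that is empty or that starts
 * before the end of its predecessor, or -1 if all entries are in
 * strictly increasing, non-overlapping logical order. */
static int sketch_first_misordered(const struct sketch_extent *ex, size_t n)
{
	size_t i;

	assert(ex != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (ex[i].ee_len == 0)
			return (int)i;
		if (i > 0) {
			/* 64-bit sum so ee_block + ee_len cannot wrap */
			uint64_t prev_end = (uint64_t)ex[i - 1].ee_block +
					    ex[i - 1].ee_len;
			if (ex[i].ee_block < prev_end)
				return (int)i;
		}
	}
	return -1;
}
```

As with the real check, a failure at index i only says that entry i
and entry i - 1 disagree; it cannot tell which of the two is corrupt.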

Add explicit checking of {d,t,}indirect and index blocks to detect
corruption, instead of detecting it implicitly by validating the
referenced blocks one at a time. This avoids incorrectly invoking
the very lengthy duplicate blocks pass for bad indirect/index blocks.
We may want to tune the "threshold" for how many errors make a "bad"
indirect/index block.
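
One possible shape for such a threshold check is sketched below. It is
an assumption made for illustration only and is not taken from the
pass1.c changes: the helper name, its parameters, and the exact
counting rule are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decide whether an indirect block looks corrupt
 * as a whole. An entry counts as illegal if it is non-zero and falls
 * outside [first_data_block, blocks_count). Zero entries are holes
 * and are ignored. The block is "bad" once more than 'threshold'
 * illegal entries have been seen. */
static int sketch_indirect_block_is_bad(const uint32_t *blocks, size_t n,
					uint32_t first_data_block,
					uint32_t blocks_count,
					unsigned int threshold)
{
	unsigned int illegal = 0;
	size_t i;

	assert(blocks != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (blocks[i] == 0)
			continue;
		if (blocks[i] < first_data_block ||
		    blocks[i] >= blocks_count) {
			if (++illegal > threshold)
				return 1;
		}
	}
	return 0;
}
```

With a low threshold a block full of garbage is rejected after a few
entries, so its contents are never fed to the duplicate blocks pass; a
higher threshold tolerates isolated bad pointers in an otherwise
valid block.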

Add ability to split or remove extents in order to allow extent
reallocation during the duplicate blocks pass.
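
The core of an extent split can be sketched on a plain array as
follows. This mirrors the idea of the split helper in the patch, but
the names here are hypothetical and the sketch is stricter than the
patch: it rejects a split at offset 0 or at the full length, and it
does not grow the tree when the array is full.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced extent record used only for this sketch. */
struct sketch_extent {
	uint32_t ee_block; /* first logical block covered */
	uint16_t ee_len;   /* number of blocks covered */
	uint32_t ee_start; /* first physical block */
};

/* Split ex[idx] into two extents at logical offset 'offs'.
 * 'count' is the number of entries in use, 'cap' the array capacity.
 * Returns 0 on success, -1 if there is no room or offs is invalid. */
static int sketch_split(struct sketch_extent *ex, size_t *count, size_t cap,
			size_t idx, uint16_t offs)
{
	assert(ex != NULL && count != NULL);
	if (idx >= *count || *count >= cap)
		return -1;
	if (offs == 0 || offs >= ex[idx].ee_len)
		return -1;

	/* Shift ex[idx..count-1] up by one; ex[idx + 1] starts out as a
	 * copy of ex[idx]. */
	memmove(&ex[idx + 1], &ex[idx], (*count - idx) * sizeof(*ex));
	(*count)++;

	/* First half keeps the start and is shortened to offs blocks. */
	ex[idx].ee_len = offs;
	/* Second half covers the remainder, shifted by offs. */
	ex[idx + 1].ee_len -= offs;
	ex[idx + 1].ee_block += offs;
	ex[idx + 1].ee_start += offs;
	return 0;
}
```

Splitting once before and once after a given block isolates it in a
single-block extent, which can then be remapped without disturbing
its neighbours.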

Index: e2fsprogs-1.40.5/e2fsck/Makefile.in
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/Makefile.in
+++ e2fsprogs-1.40.5/e2fsck/Makefile.in
@@ -256,6 +256,7 @@ super.o: $(srcdir)/super.c $(top_srcdir)
pass1.o: $(srcdir)/pass1.c $(srcdir)/e2fsck.h \
$(top_srcdir)/lib/ext2fs/ext2_fs.h $(top_builddir)/lib/ext2fs/ext2_types.h \
$(top_srcdir)/lib/ext2fs/ext2fs.h $(top_srcdir)/lib/ext2fs/ext2_fs.h \
+ $(top_srcdir)/lib/ext2fs/ext3_extents.h \
$(top_srcdir)/lib/et/com_err.h $(top_srcdir)/lib/ext2fs/ext2_io.h \
$(top_builddir)/lib/ext2fs/ext2_err.h $(top_srcdir)/lib/ext2fs/bitops.h \
$(top_srcdir)/lib/blkid/blkid.h $(top_builddir)/lib/blkid/blkid_types.h \
Index: e2fsprogs-1.40.5/e2fsck/e2fsck.h
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/e2fsck.h
+++ e2fsprogs-1.40.5/e2fsck/e2fsck.h
@@ -328,6 +328,7 @@ struct e2fsck_struct {
__u32 large_files;
__u32 fs_ext_attr_inodes;
__u32 fs_ext_attr_blocks;
+ __u32 extent_files;

/* misc fields */
time_t now;
Index: e2fsprogs-1.40.5/e2fsck/pass1.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/pass1.c
+++ e2fsprogs-1.40.5/e2fsck/pass1.c
@@ -46,6 +46,7 @@

#include "e2fsck.h"
#include <ext2fs/ext2_ext_attr.h>
+#include <ext2fs/ext3_extents.h>

#include "problem.h"

@@ -79,16 +80,19 @@ static void adjust_extattr_refcount(e2fs
struct process_block_struct {
ext2_ino_t ino;
unsigned is_dir:1, is_reg:1, clear:1, suppress:1,
- fragmented:1, compressed:1, bbcheck:1;
+ fragmented:1, compressed:1, bbcheck:1, extent:1;
blk_t num_blocks;
blk_t max_blocks;
e2_blkcnt_t last_block;
int num_illegal_blocks;
+ int last_illegal_blocks;
blk_t previous_block;
struct ext2_inode *inode;
struct problem_context *pctx;
ext2fs_block_bitmap fs_meta_blocks;
e2fsck_t ctx;
+ struct ext3_extent_header *eh_prev;
+ void *block_buf;
};

struct process_inode_block {
@@ -137,7 +141,7 @@ int e2fsck_pass1_check_device_inode(ext2
* If the index flag is set, then this is a bogus
* device/fifo/socket
*/
- if (inode->i_flags & EXT2_INDEX_FL)
+ if (inode->i_flags & (EXT2_INDEX_FL | EXT4_EXTENTS_FL))
return 0;

/*
@@ -171,7 +175,7 @@ int e2fsck_pass1_check_symlink(ext2_fils
blk_t blocks;

if ((inode->i_size_high || inode->i_size == 0) ||
- (inode->i_flags & EXT2_INDEX_FL))
+ (inode->i_flags & (EXT2_INDEX_FL | EXT4_EXTENTS_FL)))
return 0;

blocks = ext2fs_inode_data_blocks(fs, inode);
@@ -484,7 +488,9 @@ void e2fsck_pass1(e2fsck_t ctx)
int imagic_fs;
int busted_fs_time = 0;
int inode_size;
-
+ struct ext3_extent_header *eh;
+ int extent_fs;
+
#ifdef RESOURCE_TRACK
init_resource_track(&rtrack, ctx->fs->io);
#endif
@@ -515,6 +521,7 @@ void e2fsck_pass1(e2fsck_t ctx)
#undef EXT2_BPP

imagic_fs = (sb->s_feature_compat & EXT2_FEATURE_COMPAT_IMAGIC_INODES);
+ extent_fs = (sb->s_feature_incompat & EXT3_FEATURE_INCOMPAT_EXTENTS);

/*
* Allocate bitmaps structures
@@ -891,8 +898,7 @@ void e2fsck_pass1(e2fsck_t ctx)
check_blocks(ctx, &pctx, block_buf);
continue;
}
- }
- else if (LINUX_S_ISFIFO (inode->i_mode) &&
+ } else if (LINUX_S_ISFIFO (inode->i_mode) &&
e2fsck_pass1_check_device_inode(fs, inode)) {
check_immutable(ctx, &pctx);
check_size(ctx, &pctx);
@@ -904,21 +910,75 @@ void e2fsck_pass1(e2fsck_t ctx)
ctx->fs_sockets_count++;
} else
mark_inode_bad(ctx, ino);
- if (inode->i_block[EXT2_IND_BLOCK])
- ctx->fs_ind_count++;
- if (inode->i_block[EXT2_DIND_BLOCK])
- ctx->fs_dind_count++;
- if (inode->i_block[EXT2_TIND_BLOCK])
- ctx->fs_tind_count++;
- if (inode->i_block[EXT2_IND_BLOCK] ||
- inode->i_block[EXT2_DIND_BLOCK] ||
- inode->i_block[EXT2_TIND_BLOCK] ||
- inode->i_file_acl) {
- inodes_to_process[process_inode_count].ino = ino;
- inodes_to_process[process_inode_count].inode = *inode;
- process_inode_count++;
- } else
- check_blocks(ctx, &pctx, block_buf);
+
+ eh = (struct ext3_extent_header *)inode->i_block;
+ if ((inode->i_flags & EXT4_EXTENTS_FL)) {
+ if ((LINUX_S_ISREG(inode->i_mode) ||
+ LINUX_S_ISDIR(inode->i_mode)) &&
+ ext2fs_extent_header_verify(eh, EXT2_N_BLOCKS *
+ sizeof(__u32)) == 0) {
+ if (!extent_fs &&
+ fix_problem(ctx,PR_1_EXTENT_FEATURE,&pctx)){
+ sb->s_feature_incompat |=
+ EXT3_FEATURE_INCOMPAT_EXTENTS;
+ ext2fs_mark_super_dirty(fs);
+ extent_fs = 1;
+ }
+ } else if (fix_problem(ctx, PR_1_SET_EXTENT_FL, &pctx)){
+ inode->i_flags &= ~EXT4_EXTENTS_FL;
+ e2fsck_write_inode(ctx, ino, inode, "pass1");
+ goto check_ind_inode;
+ }
+ } else if (extent_fs &&
+ (LINUX_S_ISREG(inode->i_mode) ||
+ LINUX_S_ISDIR(inode->i_mode)) &&
+ ext2fs_extent_header_verify(eh, EXT2_N_BLOCKS *
+ sizeof(__u32)) == 0 &&
+ fix_problem(ctx, PR_1_UNSET_EXTENT_FL, &pctx)) {
+ inode->i_flags |= EXT4_EXTENTS_FL;
+ e2fsck_write_inode(ctx, ino, inode, "pass1");
+ }
+ if (extent_fs && inode->i_flags & EXT4_EXTENTS_FL) {
+ ctx->extent_files++;
+ switch(eh->eh_depth) {
+ case 0:
+ break;
+ case 1:
+ ctx->fs_ind_count++;
+ break;
+ case 2:
+ ctx->fs_dind_count++;
+ break;
+ default:
+ ctx->fs_tind_count++;
+ break;
+ }
+ if (eh->eh_depth > 0) {
+ inodes_to_process[process_inode_count].ino = ino;
+ inodes_to_process[process_inode_count].inode = *inode;
+ process_inode_count++;
+ } else {
+ check_blocks(ctx, &pctx, block_buf);
+ }
+ } else {
+ check_ind_inode:
+ if (inode->i_block[EXT2_IND_BLOCK])
+ ctx->fs_ind_count++;
+ if (inode->i_block[EXT2_DIND_BLOCK])
+ ctx->fs_dind_count++;
+ if (inode->i_block[EXT2_TIND_BLOCK])
+ ctx->fs_tind_count++;
+ if (inode->i_block[EXT2_IND_BLOCK] ||
+ inode->i_block[EXT2_DIND_BLOCK] ||
+ inode->i_block[EXT2_TIND_BLOCK] ||
+ inode->i_file_acl) {
+ inodes_to_process[process_inode_count].ino = ino;
+ inodes_to_process[process_inode_count].inode = *inode;
+ process_inode_count++;
+ } else {
+ check_blocks(ctx, &pctx, block_buf);
+ }
+ }

if (ctx->flags & E2F_FLAG_SIGNAL_MASK)
return;
@@ -1426,10 +1486,23 @@ clear_extattr:
return 0;
}

+static int htree_blk_iter_cb(ext2_filsys fs EXT2FS_ATTR((unused)),
+ blk_t *blocknr,
+ e2_blkcnt_t blockcnt EXT2FS_ATTR((unused)),
+ blk_t ref_blk EXT2FS_ATTR((unused)),
+ int ref_offset EXT2FS_ATTR((unused)),
+ void *priv_data)
+{
+ blk_t *blk = priv_data;
+
+ *blk = *blocknr;
+
+ return BLOCK_ABORT;
+}
+
/* Returns 1 if bad htree, 0 if OK */
static int handle_htree(e2fsck_t ctx, struct problem_context *pctx,
- ext2_ino_t ino EXT2FS_ATTR((unused)),
- struct ext2_inode *inode,
+ ext2_ino_t ino, struct ext2_inode *inode,
char *block_buf)
{
struct ext2_dx_root_info *root;
@@ -1443,7 +1516,8 @@ static int handle_htree(e2fsck_t ctx, st
fix_problem(ctx, PR_1_HTREE_SET, pctx)))
return 1;

- blk = inode->i_block[0];
+ ext2fs_block_iterate2(fs, ino, BLOCK_FLAG_DATA_ONLY | BLOCK_FLAG_HOLE,
+ block_buf, htree_blk_iter_cb, &blk);
if (((blk == 0) ||
(blk < fs->super->s_first_data_block) ||
(blk >= fs->super->s_blocks_count)) &&
@@ -1480,6 +1554,135 @@ static int handle_htree(e2fsck_t ctx, st
return 0;
}

+/* sort 0 to the end of the list so we can exit early */
+static EXT2_QSORT_TYPE verify_ind_cmp(const void *a, const void *b)
+{
+ const __u32 blk_a = *(__u32 *)a - 1, blk_b = *(__u32 *)b - 1;
+
+ return blk_b > blk_a ? -1 : blk_a - blk_b;
+}
+
+/* Verify whether an indirect block is sane. If it has multiple references
+ * to the same block, or if it has a large number of bad or duplicate blocks
+ * chances are that it is corrupt and we should just clear it instead of
+ * trying to salvage it.
+ * NOTE: this needs to get a copy of the blocks, since it reorders them */
+static int e2fsck_ind_block_verify(struct process_block_struct *p,
+ void *block_buf, int buflen)
+{
+ __u32 blocks[EXT2_N_BLOCKS], *indir = block_buf;
+ int num_indir = buflen / sizeof(*indir);
+ int i, bad = 0;
+
+ if (num_indir == EXT2_N_BLOCKS) {
+ memcpy(blocks, block_buf, buflen);
+ indir = blocks;
+ }
+ qsort(indir, num_indir, sizeof(*indir), verify_ind_cmp);
+
+ for (i = 0; i < num_indir; i++) {
+ if (indir[i] == 0)
+ break;
+
+ /* bad block number, or duplicate block */
+ if (indir[i] < p->ctx->fs->super->s_first_data_block ||
+ indir[i] >= p->ctx->fs->super->s_blocks_count ||
+ ext2fs_fast_test_block_bitmap(p->ctx->block_found_map,
+ indir[i]))
+ bad++;
+
+ /* shouldn't reference the same block twice within a block */
+ if (i > 0 && indir[i] == indir[i - 1])
+ bad++;
+ }
+
+ if ((num_indir <= EXT2_N_BLOCKS && bad > 4) || bad > 8)
+ return PR_1_INDIRECT_BAD;
+
+#if DEBUG_E2FSCK
+ /* For debugging, clobber buffer to ensure it doesn't appear sane */
+ memset(indir, 0xca, buflen);
+#endif
+ return 0;
+}
+
+static int e2fsck_ext_block_verify(struct process_block_struct *p,
+ void *block_buf, int buflen)
+{
+ struct ext3_extent_header *eh = block_buf, *eh_sav;
+ e2fsck_t ctx = p->ctx;
+ struct problem_context *pctx = p->pctx;
+ int i, problem = 0, high_bits_ok = 0;
+
+ if (ext2fs_extent_header_verify(eh, buflen))
+ return PR_1_EXTENT_IDX_BAD;
+
+ if (p->eh_prev && p->eh_prev->eh_depth != eh->eh_depth + 1)
+ return PR_1_EXTENT_IDX_BAD;
+
+ if (ctx->fs->super->s_blocks_count_hi) /* FIXME: 48-bit support ??? */
+ high_bits_ok = 1;
+
+ eh_sav = p->eh_prev;
+ p->eh_prev = eh;
+
+ if (eh->eh_depth == 0) {
+ struct ext3_extent *ex = EXT_FIRST_EXTENT(eh), *ex_prev = NULL;
+
+ for (i = 0; i < eh->eh_entries; i++, ex++) {
+ if (ex->ee_start_hi && !high_bits_ok &&
+ fix_problem(ctx, PR_1_EXTENT_HI, pctx)) {
+ ex->ee_start_hi = 0;
+ problem = PR_1_EXTENT_CHANGED;
+ }
+
+ if (ext2fs_extent_verify(ctx->fs, ex, ex_prev, NULL,0)){
+ p->num_illegal_blocks++;
+ pctx->blkcount = ex->ee_start;
+ pctx->num = ex->ee_len;
+ pctx->blk = ex->ee_block;
+ if (fix_problem(ctx, PR_1_EXTENT_BAD, pctx)) {
+ ext2fs_extent_remove(eh, ex);
+ i--; ex--; /* check next (moved) item */
+ problem = PR_1_EXTENT_CHANGED;
+ continue;
+ }
+ }
+
+ ex_prev = ex;
+ }
+ } else {
+ struct ext3_extent_idx *ix =EXT_FIRST_INDEX(eh), *ix_prev =NULL;
+
+ for (i = 0; i < eh->eh_entries; i++, ix++) {
+ if (ix->ei_leaf_hi && !high_bits_ok &&
+ fix_problem(ctx, PR_1_EXTENT_HI, pctx)) {
+ ix->ei_leaf_hi = ix->ei_unused = 0;
+ problem = PR_1_EXTENT_CHANGED;
+ }
+
+ if (ext2fs_extent_index_verify(ctx->fs, ix, ix_prev)) {
+ p->num_illegal_blocks++;
+ pctx->blkcount = ix->ei_leaf;
+ pctx->num = i;
+ pctx->blk = ix->ei_block;
+ if (fix_problem(ctx, PR_1_EXTENT_IDX_BAD,pctx)){
+ ext2fs_extent_index_remove(eh, ix);
+ i--; ix--; /* check next (moved) item */
+ problem = PR_1_EXTENT_CHANGED;
+ continue;
+ }
+ }
+
+ ix_prev = ix;
+ }
+ }
+
+ p->eh_prev = eh_sav;
+
+ return problem;
+}
+
/*
* This subroutine is called on each inode to account for all of the
* blocks used by that inode.
@@ -1499,9 +1701,11 @@ static void check_blocks(e2fsck_t ctx, s
pb.num_blocks = 0;
pb.last_block = -1;
pb.num_illegal_blocks = 0;
+ pb.last_illegal_blocks = 0;
pb.suppress = 0; pb.clear = 0;
pb.fragmented = 0;
pb.compressed = 0;
+ pb.extent = !!(inode->i_flags & EXT4_EXTENTS_FL);
pb.previous_block = 0;
pb.is_dir = LINUX_S_ISDIR(inode->i_mode);
pb.is_reg = LINUX_S_ISREG(inode->i_mode);
@@ -1509,6 +1713,8 @@ static void check_blocks(e2fsck_t ctx, s
pb.inode = inode;
pb.pctx = pctx;
pb.ctx = ctx;
+ pb.eh_prev = NULL;
+ pb.block_buf = block_buf;
pctx->ino = ino;
pctx->errcode = 0;

@@ -1530,10 +1736,27 @@ static void check_blocks(e2fsck_t ctx, s
pb.num_blocks++;
}

- if (ext2fs_inode_has_valid_blocks(inode))
- pctx->errcode = ext2fs_block_iterate2(fs, ino,
- pb.is_dir ? BLOCK_FLAG_HOLE : 0,
- block_buf, process_block, &pb);
+ if (ext2fs_inode_has_valid_blocks(inode)) {
+ int problem = 0;
+
+ if (pb.extent)
+ problem = e2fsck_ext_block_verify(&pb, inode->i_block,
+ sizeof(inode->i_block));
+ else
+ problem = e2fsck_ind_block_verify(&pb, inode->i_block,
+ sizeof(inode->i_block));
+ if (problem == PR_1_EXTENT_CHANGED) {
+ dirty_inode++;
+ problem = 0;
+ }
+
+ if (problem && fix_problem(ctx, problem, pctx))
+ pb.clear = 1;
+ else
+ pctx->errcode = ext2fs_block_iterate2(fs, ino,
+ pb.is_dir ? BLOCK_FLAG_HOLE : 0,
+ block_buf, process_block, &pb);
+ }
end_problem_latch(ctx, PR_LATCH_BLOCK);
end_problem_latch(ctx, PR_LATCH_TOOBIG);
if (ctx->flags & E2F_FLAG_SIGNAL_MASK)
@@ -1697,6 +1920,9 @@ static char *describe_illegal_block(ext2
}
#endif

+#define IND_BLKCNT(_b) ((_b) == BLOCK_COUNT_IND || (_b) == BLOCK_COUNT_DIND ||\
+ (_b) == BLOCK_COUNT_TIND)
+
/*
* This is a helper function for check_blocks().
*/
@@ -1775,7 +2001,8 @@ static int process_block(ext2_filsys fs,
* file be contiguous. (Which can never be true for really
* big files that are greater than a block group.)
*/
- if (!HOLE_BLKADDR(p->previous_block)) {
+ if (!HOLE_BLKADDR(p->previous_block) &&
+ !(p->extent && IND_BLKCNT(blockcnt))) {
if (p->previous_block+1 != blk)
p->fragmented = 1;
}
@@ -1792,9 +2019,34 @@ static int process_block(ext2_filsys fs,
blk >= fs->super->s_blocks_count)
problem = PR_1_ILLEGAL_BLOCK_NUM;

+ if (!problem && IND_BLKCNT(blockcnt) && p->ino != EXT2_RESIZE_INO) {
+ if (p->extent) {
+ if (ext2fs_read_ext_block(ctx->fs, blk, p->block_buf))
+ problem = PR_1_BLOCK_ITERATE;
+ else
+ problem = e2fsck_ext_block_verify(p,
+ p->block_buf,
+ fs->blocksize);
+ if (problem == PR_1_EXTENT_CHANGED) {
+ if (ext2fs_write_ext_block(ctx->fs, blk,
+ p->block_buf))
+ problem = PR_1_BLOCK_ITERATE;
+ }
+
+ } else {
+ if (ext2fs_read_ind_block(ctx->fs, blk, p->block_buf))
+ problem = PR_1_BLOCK_ITERATE;
+ else
+ problem = e2fsck_ind_block_verify(p,
+ p->block_buf,
+ fs->blocksize);
+ }
+ }
+
if (problem) {
p->num_illegal_blocks++;
- if (!p->suppress && (p->num_illegal_blocks % 12) == 0) {
+ if (!p->suppress &&
+ p->num_illegal_blocks - p->last_illegal_blocks > 12) {
if (fix_problem(ctx, PR_1_TOO_MANY_BAD_BLOCKS, pctx)) {
p->clear = 1;
return BLOCK_ABORT;
@@ -1804,9 +2056,12 @@ static int process_block(ext2_filsys fs,
set_latch_flags(PR_LATCH_BLOCK,
PRL_SUPPRESS, 0);
}
+ p->last_illegal_blocks = p->num_illegal_blocks;
}
pctx->blk = blk;
pctx->blkcount = blockcnt;
+ if (problem == PR_1_EXTENT_CHANGED)
+ goto mark_used;
if (fix_problem(ctx, problem, pctx)) {
blk = *block_nr = 0;
ret_code = BLOCK_CHANGED;
@@ -1815,6 +2070,7 @@ static int process_block(ext2_filsys fs,
return 0;
}

+mark_used:
if (p->ino == EXT2_RESIZE_INO) {
/*
* The resize inode has already be sanity checked
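Note on verify_ind_cmp() in the pass1.c changes above: it sorts zero entries to the end of the list by subtracting 1 from each unsigned block number, so that 0 wraps around to the largest possible value. The following is a minimal standalone sketch of that idea, not the code from the patch. The names cmp_zero_last() and count_duplicates() are hypothetical, and the comparator uses explicit comparisons instead of returning a difference, so that the result cannot wrap when it is converted to int.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical comparator: ascending block numbers, with 0 sorted last.
 * Subtracting 1 from an unsigned value maps 0 to UINT32_MAX, so zero
 * entries compare greater than every real block number. */
static int cmp_zero_last(const void *a, const void *b)
{
	uint32_t key_a = *(const uint32_t *)a - 1;
	uint32_t key_b = *(const uint32_t *)b - 1;

	/* Yields -1, 0 or 1 without any risk of integer wraparound. */
	return (key_a > key_b) - (key_a < key_b);
}

/* Hypothetical helper: sort the array in place, then count non-zero
 * entries that repeat the entry just before them.  The scan stops at
 * the first 0, since all zeros have been sorted to the end. */
static int count_duplicates(uint32_t *blocks, size_t n)
{
	size_t i;
	int dup = 0;

	assert(blocks != NULL || n == 0);
	qsort(blocks, n, sizeof(*blocks), cmp_zero_last);
	for (i = 1; i < n && blocks[i] != 0; i++)
		if (blocks[i] == blocks[i - 1])
			dup++;
	return dup;
}
```

Because the array is reordered in place, a caller that still needs the original order has to pass a copy, which is the same caveat the NOTE above e2fsck_ind_block_verify() gives.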
Index: e2fsprogs-1.40.5/e2fsck/pass2.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/pass2.c
+++ e2fsprogs-1.40.5/e2fsck/pass2.c
@@ -285,7 +285,16 @@ void e2fsck_pass2(e2fsck_t ctx)
ext2fs_mark_super_dirty(fs);
}
}
-
+
+ if (!ctx->extent_files &&
+ (sb->s_feature_incompat & EXT3_FEATURE_INCOMPAT_EXTENTS)) {
+ if (fs->flags & EXT2_FLAG_RW) {
+ sb->s_feature_incompat &=
+ ~EXT3_FEATURE_INCOMPAT_EXTENTS;
+ ext2fs_mark_super_dirty(fs);
+ }
+ }
+
#ifdef RESOURCE_TRACK
if (ctx->options & E2F_OPT_TIME2) {
e2fsck_clear_progbar(ctx);
Index: e2fsprogs-1.40.5/e2fsck/problem.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/problem.c
+++ e2fsprogs-1.40.5/e2fsck/problem.c
@@ -784,6 +784,46 @@ static struct e2fsck_problem problem_tab
N_("@i %i is a %It but it looks like it is really a directory.\n"),
PROMPT_FIX, 0 },

+ /* indirect block corrupt */
+ { PR_1_INDIRECT_BAD,
+ N_("@i %i has corrupt indirect block\n"),
+ PROMPT_CLEAR, PR_PREEN_OK },
+
+ /* inode has extents, superblock missing INCOMPAT_EXTENTS feature */
+ { PR_1_EXTENT_FEATURE,
+ N_("@i %i is in extent format, but @S is missing EXTENTS feature\n"),
+ PROMPT_FIX, PR_PREEN_OK },
+
+ /* inode has EXTENTS_FL set, but is not an extent inode */
+ { PR_1_SET_EXTENT_FL,
+ N_("@i %i has EXTENT_FL set, but is not in extents format\n"),
+ PROMPT_FIX, PR_PREEN_OK },
+
+ /* inode missing EXTENTS_FL, but is an extent inode */
+ { PR_1_UNSET_EXTENT_FL,
+ N_("@i %i missing EXTENT_FL, but is in extents format\n"),
+ PROMPT_FIX, PR_PREEN_OK },
+
+ /* extent corrupt */
+ { PR_1_EXTENT_BAD,
+ N_("@i %i has corrupt extent at @b %b (logical %B) length %N\n"),
+ PROMPT_CLEAR, PR_PREEN_OK },
+
+ /* extent index corrupt */
+ { PR_1_EXTENT_IDX_BAD,
+ N_("@i %i has corrupt extent index at @b %b (logical %B) entry %N\n"),
+ PROMPT_CLEAR, PR_PREEN_OK },
+
+ /* extent has high 16 bits set */
+ { PR_1_EXTENT_HI,
+ N_("High 16 bits of extent/index @b set\n"),
+ PROMPT_CLEAR, PR_LATCH_EXTENT_HI|PR_PREEN_OK|PR_NO_OK|PR_PREEN_NOMSG},
+
+ /* extent has high 16 bits set header */
+ { PR_1_EXTENT_HI_LATCH,
+ N_("@i %i has high 16 bits of extent/index @b set\n"),
+ PROMPT_CLEAR, PR_PREEN_OK | PR_NO_OK | PR_PREEN_NOMSG },
+
/* Pass 1b errors */

/* Pass 1B: Rescan for duplicate/bad blocks */
@@ -1518,6 +1558,7 @@ static struct latch_descr pr_latch_info[
{ PR_LATCH_LOW_DTIME, PR_1_ORPHAN_LIST_REFUGEES, 0 },
{ PR_LATCH_TOOBIG, PR_1_INODE_TOOBIG, 0 },
{ PR_LATCH_OPTIMIZE_DIR, PR_3A_OPTIMIZE_DIR_HEADER, PR_3A_OPTIMIZE_DIR_END },
+ { PR_LATCH_EXTENT_HI, PR_1_EXTENT_HI_LATCH, 0 },
{ -1, 0, 0 },
};

Index: e2fsprogs-1.40.5/e2fsck/problem.h
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/problem.h
+++ e2fsprogs-1.40.5/e2fsck/problem.h
@@ -38,6 +38,7 @@ struct problem_context {
#define PR_LATCH_LOW_DTIME 0x0070 /* Latch for pass1 orphaned list refugees */
#define PR_LATCH_TOOBIG 0x0080 /* Latch for file to big errors */
#define PR_LATCH_OPTIMIZE_DIR 0x0090 /* Latch for optimize directories */
+#define PR_LATCH_EXTENT_HI 0x00A0 /* Latch for extent high bits set */

#define PR_LATCH(x) ((((x) & PR_LATCH_MASK) >> 4) - 1)

@@ -455,6 +456,33 @@ struct problem_context {
/* inode appears to be a directory */
#define PR_1_TREAT_AS_DIRECTORY 0x010055

+/* indirect block corrupt */
+#define PR_1_INDIRECT_BAD 0x010059
+
+/* wrong EXT3_FEATURE_INCOMPAT_EXTENTS flag */
+#define PR_1_EXTENT_FEATURE 0x010060
+
+/* EXT4_EXTENTS_FL flag set on non-extent file */
+#define PR_1_SET_EXTENT_FL 0x010061
+
+/* EXT4_EXTENTS_FL flag not set on extent file */
+#define PR_1_UNSET_EXTENT_FL 0x010062
+
+/* extent corrupt */
+#define PR_1_EXTENT_BAD 0x010063
+
+/* extent index corrupt */
+#define PR_1_EXTENT_IDX_BAD 0x010064
+
+/* extent/index has high 16 bits set - header */
+#define PR_1_EXTENT_HI 0x010065
+
+/* extent/index has high 16 bits set */
+#define PR_1_EXTENT_HI_LATCH 0x010066
+
+/* extent/index was modified & repaired - not really a problem */
+#define PR_1_EXTENT_CHANGED 0x010067
+
/*
* Pass 1b errors
*/
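Note on the new PR_LATCH_EXTENT_HI value in problem.h above: the PR_LATCH() macro shown in the context lines reduces a latch code to a small zero-based number. The sketch below only works through that arithmetic for the new code. PR_LATCH_MASK is not visible in this hunk, so the value 0x0ff0 used here is an assumption, and all SKETCH_* names are hypothetical.

```c
#include <assert.h>

/* ASSUMPTION: the real PR_LATCH_MASK is not shown in the hunk; 0x0ff0
 * is assumed for this sketch.  The two latch codes are copied from the
 * patch. */
#define SKETCH_LATCH_MASK		0x0ff0
#define SKETCH_LATCH_OPTIMIZE_DIR	0x0090
#define SKETCH_LATCH_EXTENT_HI		0x00A0

/* Same arithmetic as PR_LATCH(x): isolate the latch field, shift it
 * down by one hex digit, and subtract one to get a zero-based number. */
static int sketch_latch_number(int code)
{
	int field = (code & SKETCH_LATCH_MASK) >> 4;

	assert(field > 0);	/* 0 would mean "no latch" */
	return field - 1;
}
```

Under that assumption, SKETCH_LATCH_OPTIMIZE_DIR evaluates to 8 and SKETCH_LATCH_EXTENT_HI to 9, so the new latch takes the next free number after the existing ones.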
Index: e2fsprogs-1.40.5/lib/ext2fs/Makefile.in
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/Makefile.in
+++ e2fsprogs-1.40.5/lib/ext2fs/Makefile.in
@@ -35,6 +35,7 @@ OBJS= $(DEBUGFS_LIB_OBJS) $(RESIZE_LIB_O
dir_iterate.o \
expanddir.o \
ext_attr.o \
+ extents.o \
finddev.o \
flushb.o \
freefs.o \
@@ -90,6 +91,7 @@ SRCS= ext2_err.c \
$(srcdir)/dupfs.c \
$(srcdir)/expanddir.c \
$(srcdir)/ext_attr.c \
+ $(srcdir)/extents.c \
$(srcdir)/fileio.c \
$(srcdir)/finddev.c \
$(srcdir)/flushb.c \
@@ -127,6 +129,7 @@ SRCS= ext2_err.c \
$(srcdir)/tst_bitops.c \
$(srcdir)/tst_byteswap.c \
$(srcdir)/tst_getsize.c \
+ $(srcdir)/tst_types.c \
$(srcdir)/tst_iscan.c \
$(srcdir)/unix_io.c \
$(srcdir)/unlink.c \
@@ -394,6 +397,10 @@ ext_attr.o: $(srcdir)/ext_attr.c $(srcdi
$(srcdir)/ext2fs.h $(srcdir)/ext2_fs.h $(srcdir)/ext3_extents.h \
$(top_srcdir)/lib/et/com_err.h $(srcdir)/ext2_io.h \
$(top_builddir)/lib/ext2fs/ext2_err.h $(srcdir)/bitops.h
+extents.o: $(srcdir)/extents.c $(srcdir)/ext2_fs.h \
+ $(top_builddir)/lib/ext2fs/ext2_types.h $(srcdir)/ext3_extents.h \
+ $(srcdir)/ext2fs.h $(srcdir)/ext2_fs.h $(top_srcdir)/lib/et/com_err.h \
+ $(srcdir)/ext2_io.h $(top_builddir)/lib/ext2fs/ext2_err.h $(srcdir)/bitops.h
fileio.o: $(srcdir)/fileio.c $(srcdir)/ext2_fs.h \
$(top_builddir)/lib/ext2fs/ext2_types.h $(srcdir)/ext2fs.h \
$(srcdir)/ext2_fs.h $(srcdir)/ext3_extents.h $(top_srcdir)/lib/et/com_err.h \
Index: e2fsprogs-1.40.5/lib/ext2fs/block.c
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/block.c
+++ e2fsprogs-1.40.5/lib/ext2fs/block.c
@@ -17,24 +17,17 @@

#include "ext2_fs.h"
#include "ext2fs.h"
+#include "block.h"

-struct block_context {
- ext2_filsys fs;
- int (*func)(ext2_filsys fs,
- blk_t *blocknr,
- e2_blkcnt_t bcount,
- blk_t ref_blk,
- int ref_offset,
- void *priv_data);
- e2_blkcnt_t bcount;
- int bsize;
- int flags;
- errcode_t errcode;
- char *ind_buf;
- char *dind_buf;
- char *tind_buf;
- void *priv_data;
-};
+#ifdef EXT_DEBUG
+void ext_show_inode(struct ext2_inode *inode, ext2_ino_t ino)
+{
+ printf("inode: %u blocks: %u\n",
+ ino, inode->i_blocks);
+}
+#else
+#define ext_show_inode(inode, ino) do { } while (0)
+#endif

static int block_iterate_ind(blk_t *ind_block, blk_t ref_block,
int ref_offset, struct block_context *ctx)
@@ -276,7 +269,6 @@ errcode_t ext2fs_block_iterate2(ext2_fil
void *priv_data)
{
int i;
- int got_inode = 0;
int ret = 0;
blk_t blocks[EXT2_N_BLOCKS]; /* directory data blocks */
struct ext2_inode inode;
@@ -286,19 +278,20 @@ errcode_t ext2fs_block_iterate2(ext2_fil

EXT2_CHECK_MAGIC(fs, EXT2_ET_MAGIC_EXT2FS_FILSYS);

+ ctx.errcode = ext2fs_read_inode(fs, ino, &inode);
+ if (ctx.errcode)
+ return ctx.errcode;
+
/*
* Check to see if we need to limit large files
*/
if (flags & BLOCK_FLAG_NO_LARGE) {
- ctx.errcode = ext2fs_read_inode(fs, ino, &inode);
- if (ctx.errcode)
- return ctx.errcode;
- got_inode = 1;
if (!LINUX_S_ISDIR(inode.i_mode) &&
(inode.i_size_high != 0))
return EXT2_ET_FILE_TOO_BIG;
}

+ /* The in-memory inode may have been changed by e2fsck */
retval = ext2fs_get_blocks(fs, ino, blocks);
if (retval)
return retval;
@@ -325,10 +318,6 @@ errcode_t ext2fs_block_iterate2(ext2_fil
*/
if ((fs->super->s_creator_os == EXT2_OS_HURD) &&
!(flags & BLOCK_FLAG_DATA_ONLY)) {
- ctx.errcode = ext2fs_read_inode(fs, ino, &inode);
- if (ctx.errcode)
- goto abort_exit;
- got_inode = 1;
if (inode.osd1.hurd1.h_i_translator) {
ret |= (*ctx.func)(fs,
&inode.osd1.hurd1.h_i_translator,
@@ -338,7 +327,16 @@ errcode_t ext2fs_block_iterate2(ext2_fil
goto abort_exit;
}
}
-
+
+ /* Iterate over normal data blocks with extents.
+ * We can't do any fixing here because this is also called by
+ * callers other than e2fsck_pass1->check_blocks(). */
+ if (inode.i_flags & EXT4_EXTENTS_FL) {
+ ext_show_inode(&inode, ino);
+ ret |= block_iterate_extents(blocks, sizeof(blocks), 0, 0,&ctx);
+ goto abort_exit;
+ }
+
/*
* Iterate over normal data blocks
*/
@@ -373,11 +371,6 @@ errcode_t ext2fs_block_iterate2(ext2_fil

abort_exit:
if (ret & BLOCK_CHANGED) {
- if (!got_inode) {
- retval = ext2fs_read_inode(fs, ino, &inode);
- if (retval)
- return retval;
- }
for (i=0; i < EXT2_N_BLOCKS; i++)
inode.i_block[i] = blocks[i];
retval = ext2fs_write_inode(fs, ino, &inode);
Index: e2fsprogs-1.40.5/lib/ext2fs/block.h
===================================================================
--- /dev/null
+++ e2fsprogs-1.40.5/lib/ext2fs/block.h
@@ -0,0 +1,33 @@
+/*
+ * block.h --- header for block iteration in block.c, extents.c
+ *
+ * Copyright (C) 1993, 1994, 1995, 1996 Theodore Ts'o.
+ *
+ * %Begin-Header%
+ * This file may be redistributed under the terms of the GNU Public
+ * License.
+ * %End-Header%
+ */
+
+struct block_context {
+ ext2_filsys fs;
+ int (*func)(ext2_filsys fs,
+ blk_t *blocknr,
+ e2_blkcnt_t bcount,
+ blk_t ref_blk,
+ int ref_offset,
+ void *priv_data);
+ e2_blkcnt_t bcount;
+ int bsize;
+ int flags;
+ errcode_t errcode;
+ char *ind_buf;
+ char *dind_buf;
+ char *tind_buf;
+ void *priv_data;
+};
+
+/* libext2fs internal function, in extents.c */
+extern int block_iterate_extents(void *eh_buf, unsigned bufsize,blk_t ref_block,
+ int ref_offset EXT2FS_ATTR((unused)),
+ struct block_context *ctx);
Index: e2fsprogs-1.40.5/lib/ext2fs/bmap.c
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/bmap.c
+++ e2fsprogs-1.40.5/lib/ext2fs/bmap.c
@@ -17,6 +17,7 @@

#include "ext2_fs.h"
#include "ext2fs.h"
+#include "ext3_extents.h"

#if defined(__GNUC__) && !defined(NO_INLINE_FUNCS)
#define _BMAP_INLINE_ __inline__
@@ -31,6 +32,65 @@ extern errcode_t ext2fs_bmap(ext2_filsys

#define inode_bmap(inode, nr) ((inode)->i_block[(nr)])

+/* see also block_iterate_extents() */
+static errcode_t block_bmap_extents(void *eh_buf, unsigned bufsize,
+ ext2_filsys fs, blk_t block,blk_t *phys_blk)
+{
+ struct ext3_extent_header *eh = eh_buf;
+ struct ext3_extent *ex;
+ errcode_t ret = 0;
+ int i;
+
+ ret = ext2fs_extent_header_verify(eh, bufsize);
+ if (ret)
+ return ret;
+
+ if (eh->eh_depth == 0) {
+ ex = EXT_FIRST_EXTENT(eh);
+ for (i = 0; i < eh->eh_entries; i++, ex++) {
+ if (block < ex->ee_block)
+ continue;
+
+ if (block < ex->ee_block + ex->ee_len)
+ /* FIXME: 48-bit support */
+ *phys_blk = ex->ee_start + block - ex->ee_block;
+
+ /* only the first extent > block could hold the block
+ * otherwise the extents would overlap */
+ break;
+ }
+ } else {
+ struct ext3_extent_idx *ix;
+ char *block_buf;
+
+ ret = ext2fs_get_mem(fs->blocksize, &block_buf);
+ if (ret)
+ return ret;
+
+ ix = EXT_FIRST_INDEX(eh);
+ for (i = 0; i < eh->eh_entries; i++, ix++) {
+ if (block < ix->ei_block)
+ continue;
+
+ ret = io_channel_read_blk(fs->io, ix->ei_leaf, 1,
+ block_buf);
+ if (ret)
+ goto free_buf;
+
+ ret = block_bmap_extents(block_buf, fs->blocksize,
+ fs, block, phys_blk);
+
+ /* only the first extent > block could hold the block
+ * otherwise the extents would overlap */
+ break;
+ }
+
+ free_buf:
+ ext2fs_free_mem(&block_buf);
+ }
+ return ret;
+}
+
static _BMAP_INLINE_ errcode_t block_ind_bmap(ext2_filsys fs, int flags,
blk_t ind, char *block_buf,
int *blocks_alloc,
@@ -149,6 +209,16 @@ errcode_t ext2fs_bmap(ext2_filsys fs, ex
return retval;
inode = &inode_buf;
}
+
+ if (inode->i_flags & EXT4_EXTENTS_FL) {
+ if (bmap_flags) /* unsupported as yet */
+ return EXT2_ET_BLOCK_ALLOC_FAIL;
+ retval = block_bmap_extents(inode->i_block,
+ sizeof(inode->i_block),
+ fs, block, phys_blk);
+ goto done;
+ }
+
addr_per_block = (blk_t) fs->blocksize >> 2;

if (!block_buf) {
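Note on block_bmap_extents() in the bmap.c changes above: at depth 0 it has to translate a logical block number into a physical one by finding the extent that covers it. The following is a minimal sketch of such a lookup over a sorted, non-overlapping extent array. It is an illustration under simplifying assumptions, not the loop from the patch: struct sketch_extent and sketch_extent_lookup() are hypothetical, only 32-bit physical block numbers are handled, and the scan skips extents that end before the target and stops at the first extent that starts after it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified leaf extent (no 48-bit physical support). */
struct sketch_extent {
	uint32_t logical;	/* first logical block covered */
	uint16_t len;		/* number of blocks covered */
	uint32_t start;		/* first physical block */
};

/* Return the physical block that backs logical block @block, or 0 if
 * @block falls into a hole.  @ex must be sorted by logical block and
 * the extents must not overlap. */
static uint32_t sketch_extent_lookup(const struct sketch_extent *ex,
				     size_t n, uint32_t block)
{
	size_t i;

	assert(ex != NULL || n == 0);
	for (i = 0; i < n; i++) {
		/* Sorted input: every later extent starts even further on. */
		if (block < ex[i].logical)
			break;
		/* Target lies inside this extent: apply the same offset. */
		if (block - ex[i].logical < ex[i].len)
			return ex[i].start + (block - ex[i].logical);
		/* Otherwise this extent ends before the target: keep going. */
	}
	return 0;
}
```

Returning 0 for a hole matches the usual convention that physical block 0 is never a valid data block. A caller that needs to tell an error from a hole would want a separate status code, as ext2fs_bmap() does with errcode_t.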
Index: e2fsprogs-1.40.5/lib/ext2fs/ext2_err.et.in
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/ext2_err.et.in
+++ e2fsprogs-1.40.5/lib/ext2fs/ext2_err.et.in
@@ -326,5 +326,17 @@ ec EXT2_ET_TDB_ERR_NOEXIST,
ec EXT2_ET_TDB_ERR_RDONLY,
"TDB: Write not permitted"

+ec EXT2_ET_EXTENT_HEADER_BAD,
+ "Corrupt extent header"
+
+ec EXT2_ET_EXTENT_INDEX_BAD,
+ "Corrupt extent index"
+
+ec EXT2_ET_EXTENT_LEAF_BAD,
+ "Corrupt extent"
+
+ec EXT2_ET_EXTENT_NO_SPACE,
+ "No free space in extent map"
+
end

Index: e2fsprogs-1.40.5/lib/ext2fs/ext2fs.h
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/ext2fs.h
+++ e2fsprogs-1.40.5/lib/ext2fs/ext2fs.h
@@ -436,12 +436,14 @@ typedef struct ext2_icount *ext2_icount_
EXT3_FEATURE_INCOMPAT_JOURNAL_DEV|\
EXT2_FEATURE_INCOMPAT_META_BG|\
EXT3_FEATURE_INCOMPAT_RECOVER|\
+ EXT3_FEATURE_INCOMPAT_EXTENTS|\
EXT4_FEATURE_INCOMPAT_FLEX_BG)
#else
#define EXT2_LIB_FEATURE_INCOMPAT_SUPP (EXT2_FEATURE_INCOMPAT_FILETYPE|\
EXT3_FEATURE_INCOMPAT_JOURNAL_DEV|\
EXT2_FEATURE_INCOMPAT_META_BG|\
EXT3_FEATURE_INCOMPAT_RECOVER|\
+ EXT3_FEATURE_INCOMPAT_EXTENTS|\
EXT4_FEATURE_INCOMPAT_FLEX_BG)
#endif
#define EXT2_LIB_FEATURE_RO_COMPAT_SUPP (EXT2_FEATURE_RO_COMPAT_SPARSE_SUPER|\
@@ -722,6 +724,21 @@ extern errcode_t ext2fs_adjust_ea_refcou
char *block_buf,
int adjust, __u32 *newcount);

+/* extent.c */
+errcode_t ext2fs_extent_header_verify(struct ext3_extent_header *eh, int size);
+errcode_t ext2fs_extent_verify(ext2_filsys fs, struct ext3_extent *ex,
+ struct ext3_extent *ex_prev,
+ struct ext3_extent_idx *ix, int ix_len);
+errcode_t ext2fs_extent_index_verify(ext2_filsys fs,
+ struct ext3_extent_idx *ix,
+ struct ext3_extent_idx *ix_prev);
+errcode_t ext2fs_extent_remove(struct ext3_extent_header *eh,
+ struct ext3_extent *ex);
+errcode_t ext2fs_extent_split(ext2_filsys fs, struct ext3_extent_header **eh,
+ struct ext3_extent **ex, int count, int *flag);
+errcode_t ext2fs_extent_index_remove(struct ext3_extent_header *eh,
+ struct ext3_extent_idx *ix);
+
/* fileio.c */
extern errcode_t ext2fs_file_open2(ext2_filsys fs, ext2_ino_t ino,
struct ext2_inode *inode,
@@ -810,6 +827,8 @@ extern errcode_t ext2fs_image_bitmap_rea
/* ind_block.c */
errcode_t ext2fs_read_ind_block(ext2_filsys fs, blk_t blk, void *buf);
errcode_t ext2fs_write_ind_block(ext2_filsys fs, blk_t blk, void *buf);
+errcode_t ext2fs_read_ext_block(ext2_filsys fs, blk_t blk, void *buf);
+errcode_t ext2fs_write_ext_block(ext2_filsys fs, blk_t blk, void *buf);

/* initialize.c */
extern errcode_t ext2fs_initialize(const char *name, int flags,
@@ -984,6 +1003,9 @@ extern void ext2fs_swap_inode_full(ext2_
int bufsize);
extern void ext2fs_swap_inode(ext2_filsys fs,struct ext2_inode *t,
struct ext2_inode *f, int hostorder);
+extern void ext2fs_swap_extent_header(struct ext3_extent_header *eh);
+extern void ext2fs_swap_extent_index(struct ext3_extent_idx *ix);
+extern void ext2fs_swap_extent(struct ext3_extent *ex);

/* valid_blk.c */
extern int ext2fs_inode_has_valid_blocks(struct ext2_inode *inode);
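Note on the size argument of ext2fs_extent_header_verify() declared in ext2fs.h above: it is the size of the buffer that holds the header, and it bounds how many entries the header may claim. The sketch below works through that arithmetic for the in-inode case (60 bytes of i_block) and for full blocks. The struct names are hypothetical mirrors of the on-disk layout, assumed here to be 12 bytes each for the header and for an extent.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirrors of the on-disk layouts; each is 12 bytes. */
struct sketch_extent_header {
	uint16_t magic, entries, max, depth;
	uint32_t generation;
};

struct sketch_extent {
	uint32_t block;		/* first logical block */
	uint16_t len, start_hi;
	uint32_t start;		/* low 32 bits of physical block */
};

/* Largest number of entries that fits after the header in a buffer of
 * @bufsize bytes. */
static int sketch_extent_capacity(int bufsize)
{
	assert(bufsize >= (int)sizeof(struct sketch_extent_header));
	return (bufsize - (int)sizeof(struct sketch_extent_header)) /
	       (int)sizeof(struct sketch_extent);
}
```

With these sizes, the 60-byte i_block array (EXT2_N_BLOCKS * sizeof(__u32)) has room for 4 entries, a 1024-byte block for 84, and a 4096-byte block for 340. This is consistent with the check for eh_max == 4 in ext2fs_extent_split(), which appears to identify a header that lives in the inode itself.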
Index: e2fsprogs-1.40.5/lib/ext2fs/extents.c
===================================================================
--- /dev/null
+++ e2fsprogs-1.40.5/lib/ext2fs/extents.c
@@ -0,0 +1,475 @@
+/*
+ * extents.c --- iterate over all blocks in an extent-mapped inode
+ *
+ * Copyright (C) 2005 Alex Tomas <[email protected]>
+ * Copyright (C) 2006 Andreas Dilger <[email protected]>
+ *
+ * %Begin-Header%
+ * This file may be redistributed under the terms of the GNU Public
+ * License.
+ * %End-Header%
+ */
+
+#include <stdio.h>
+#include <string.h>
+#if HAVE_UNISTD_H
+#include <unistd.h>
+#endif
+
+#include "ext2_fs.h"
+#include "ext2fs.h"
+#include "block.h"
+
+#ifdef EXT_DEBUG
+void ext_show_header(struct ext3_extent_header *eh)
+{
+ printf("header: magic=%x entries=%u max=%u depth=%u generation=%u\n",
+ eh->eh_magic, eh->eh_entries, eh->eh_max, eh->eh_depth,
+ eh->eh_generation);
+}
+
+void ext_show_index(struct ext3_extent_idx *ix)
+{
+ printf("index: block=%u leaf=%u leaf_hi=%u unused=%u\n",
+ ix->ei_block, ix->ei_leaf, ix->ei_leaf_hi, ix->ei_unused);
+}
+
+void ext_show_extent(struct ext3_extent *ex)
+{
+ printf("extent: block=%u-%u len=%u start=%u start_hi=%u\n",
+ ex->ee_block, ex->ee_block + ex->ee_len - 1,
+ ex->ee_len, ex->ee_start, ex->ee_start_hi);
+}
+
+#define ext_printf(fmt, args...) printf(fmt, ## args)
+#else
+#define ext_show_header(eh) do { } while (0)
+#define ext_show_index(ix) do { } while (0)
+#define ext_show_extent(ex) do { } while (0)
+#define ext_printf(fmt, args...) do { } while (0)
+#endif
+
+errcode_t ext2fs_extent_header_verify(struct ext3_extent_header *eh, int size)
+{
+ int eh_max, entry_size;
+
+ ext_show_header(eh);
+ if (eh->eh_magic != EXT3_EXT_MAGIC)
+ return EXT2_ET_EXTENT_HEADER_BAD;
+ if (eh->eh_entries > eh->eh_max)
+ return EXT2_ET_EXTENT_HEADER_BAD;
+ if (eh->eh_depth == 0)
+ entry_size = sizeof(struct ext3_extent);
+ else
+ entry_size = sizeof(struct ext3_extent_idx);
+
+ eh_max = (size - sizeof(*eh)) / entry_size;
+ /* Allow two extent-sized items at the end of the block, for
+ * ext4_extent_tail with checksum in the future. */
+ if (eh->eh_max > eh_max || eh->eh_max < eh_max - 2)
+ return EXT2_ET_EXTENT_HEADER_BAD;
+
+ return 0;
+}
+
+/* Verify that a single extent @ex is valid. If @ex_prev is passed in,
+ * then this was the previous logical extent in this block and we can
+ * do additional sanity checking (though in case of error we don't know
+ * which of the two extents is bad). Similarly, if @ix is passed in
+ * we can check that this extent is logically part of the index that
+ * refers to it (though again we can't know which of the two is bad). */
+errcode_t ext2fs_extent_verify(ext2_filsys fs, struct ext3_extent *ex,
+ struct ext3_extent *ex_prev,
+ struct ext3_extent_idx *ix, int ix_len)
+{
+ ext_show_extent(ex);
+ /* FIXME: 48-bit support */
+ if (ex->ee_start > fs->super->s_blocks_count)
+ return EXT2_ET_EXTENT_LEAF_BAD;
+
+ if (ex->ee_len == 0)
+ return EXT2_ET_EXTENT_LEAF_BAD;
+
+ if (ex->ee_len >= fs->super->s_blocks_per_group)
+ return EXT2_ET_EXTENT_LEAF_BAD;
+
+ if (ex_prev) {
+ /* We can't have a zero logical block except for first extent */
+ if (ex->ee_block == 0)
+ return EXT2_ET_EXTENT_LEAF_BAD;
+
+ /* FIXME: 48-bit support */
+ /* extents must be in logical offset order */
+ if (ex->ee_block < ex_prev->ee_block + ex_prev->ee_len)
+ return EXT2_ET_EXTENT_LEAF_BAD;
+
+ /* extents must not overlap physical blocks */
+ if ((ex->ee_start < ex_prev->ee_start + ex_prev->ee_len) &&
+ (ex->ee_start + ex->ee_len > ex_prev->ee_start))
+ return EXT2_ET_EXTENT_LEAF_BAD;
+ }
+
+ if (ix) {
+ /* FIXME: 48-bit support */
+ if (ex->ee_block < ix->ei_block)
+ return EXT2_ET_EXTENT_LEAF_BAD;
+
+ if (ix_len && ex->ee_block + ex->ee_len > ix->ei_block + ix_len)
+ return EXT2_ET_EXTENT_LEAF_BAD;
+ }
+
+ return 0;
+}
+
+errcode_t ext2fs_extent_index_verify(ext2_filsys fs, struct ext3_extent_idx *ix,
+ struct ext3_extent_idx *ix_prev)
+{
+ ext_show_index(ix);
+ /* FIXME: 48-bit support */
+ if (ix->ei_leaf > fs->super->s_blocks_count)
+ return EXT2_ET_EXTENT_INDEX_BAD;
+
+ if (ix_prev == NULL)
+ return 0;
+
+ /* We can't have a zero logical block except for first index */
+ if (ix->ei_block == 0)
+ return EXT2_ET_EXTENT_INDEX_BAD;
+
+ if (ix->ei_block <= ix_prev->ei_block)
+ return EXT2_ET_EXTENT_INDEX_BAD;
+
+ return 0;
+}
+
+errcode_t ext2fs_extent_remove(struct ext3_extent_header *eh,
+ struct ext3_extent *ex)
+{
+ int offs = ex - EXT_FIRST_EXTENT(eh);
+
+ if (offs < 0 || offs >= eh->eh_entries)
+ return EXT2_ET_EXTENT_LEAF_BAD;
+
+ ext_printf("remove extent: offset %u\n", offs);
+
+ memmove(ex, ex + 1, (eh->eh_entries - offs - 1) * sizeof(*ex));
+ --eh->eh_entries;
+
+ return 0;
+}
+
+static errcode_t ext2fs_extent_split_internal(struct ext3_extent_header *eh,
+ struct ext3_extent *ex, int offs)
+{
+ int entry = ex - EXT_FIRST_EXTENT(eh);
+ struct ext3_extent *ex_new = ex + 1;
+
+ ext_printf("split: ee_len: %u ee_block: %u ee_start: %u offset: %u\n",
+ ex->ee_len, ex->ee_block, ex->ee_start, offs);
+ memmove(ex_new, ex, (eh->eh_entries - entry) * sizeof(*ex));
+ ++eh->eh_entries;
+
+ ex->ee_len = offs;
+ /* FIXME: 48-bit support */
+ ex_new->ee_len -= offs;
+ ex_new->ee_block += offs;
+ ex_new->ee_start += offs;
+
+ return 0;
+}
+
+errcode_t ext2fs_extent_split(ext2_filsys fs,
+ struct ext3_extent_header **eh_orig,
+ struct ext3_extent **ex_orig, int offs, int *flag)
+{
+ struct ext3_extent_header *eh_parent = *eh_orig;
+ int retval, entry = *ex_orig - EXT_FIRST_EXTENT(eh_parent);
+ blk_t new_block;
+ char *buf;
+ struct ext3_extent_idx *ei = EXT_FIRST_INDEX(eh_parent);
+
+ if (entry < 0 || entry > (*eh_orig)->eh_entries)
+ return EXT2_ET_EXTENT_LEAF_BAD;
+
+ if (offs > (*ex_orig)->ee_len)
+ return EXT2_ET_EXTENT_LEAF_BAD;
+
+ if (eh_parent->eh_entries >= eh_parent->eh_max) {
+ ext_printf("split: eh_entries: %u eh_max: %u\n",
+ eh_parent->eh_entries, eh_parent->eh_max);
+ if (eh_parent->eh_max == 4) {
+ struct ext3_extent_header *eh_child;
+ struct ext3_extent *ex_child;
+
+ retval = ext2fs_get_mem(fs->blocksize, &buf);
+
+ if (retval)
+ return EXT2_ET_EXTENT_NO_SPACE;
+
+ memset(buf, 0, fs->blocksize);
+ memcpy(buf, eh_parent, sizeof(*eh_parent) +
+ eh_parent->eh_entries * sizeof(*ex_child));
+ eh_child = (struct ext3_extent_header *)buf;
+
+ eh_child->eh_max = (fs->blocksize -
+ sizeof(struct ext3_extent_header)) /
+ sizeof(struct ext3_extent);
+ retval = ext2fs_new_block(fs, (*ex_orig)->ee_block, 0,
+ &new_block);
+ if (retval)
+ return EXT2_ET_EXTENT_NO_SPACE;
+
+ retval = io_channel_write_blk(fs->io, new_block, 1,buf);
+ if (retval)
+ return EXT2_ET_EXTENT_NO_SPACE;
+
+ eh_parent->eh_entries = 1;
+ eh_parent->eh_depth = 1;
+
+ ex_child = EXT_FIRST_EXTENT(eh_child);
+ ei->ei_block = ex_child->ee_block;
+ /* FIXME: 48-bit support*/
+ ei->ei_leaf = new_block;
+
+ *eh_orig = eh_child;
+ *ex_orig = EXT_FIRST_EXTENT(eh_child) + entry;
+
+ *flag = BLOCK_CHANGED;
+ } else {
+ return EXT2_ET_EXTENT_NO_SPACE;
+ }
+ }
+
+ return ext2fs_extent_split_internal(*eh_orig, *ex_orig, offs);
+}
+
+errcode_t ext2fs_extent_index_remove(struct ext3_extent_header *eh,
+ struct ext3_extent_idx *ix)
+{
+ struct ext3_extent_idx *first = EXT_FIRST_INDEX(eh);
+ int offs = ix - first;
+
+ ext_printf("remove index: offset %u\n", offs);
+
+ memmove(ix, ix + 1, (eh->eh_entries - offs - 1) * sizeof(*ix));
+ --eh->eh_entries;
+
+ return 0;
+}
+
+/* Internal function for ext2fs_block_iterate2() to recursively walk the
+ * extent tree, with a callback function for each block. We also call the
+ * callback function on index blocks unless BLOCK_FLAG_DATA_ONLY is given.
+ * We traverse the tree pre-order (internal nodes before their children)
+ * unless BLOCK_FLAG_DEPTH_TRAVERSE is given.
+ *
+ * See also block_bmap_extents(). */
+int block_iterate_extents(void *eh_buf, unsigned bufsize, blk_t ref_block,
+ int ref_offset EXT2FS_ATTR((unused)),
+ struct block_context *ctx)
+{
+ struct ext3_extent_header *orig_eh, *eh;
+ struct ext3_extent *ex, *ex_prev = NULL;
+ int ret = 0;
+ int item, offs, flags, split_flag = 0;
+ blk_t block_address;
+
+ orig_eh = eh = eh_buf;
+
+ if (ext2fs_extent_header_verify(eh, bufsize))
+ return BLOCK_ERROR;
+
+ if (eh->eh_depth == 0) {
+ ex = EXT_FIRST_EXTENT(eh);
+ for (item = 0; item < eh->eh_entries; item++, ex++) {
+ ext_show_extent(ex);
+ for (offs = 0; offs < ex->ee_len; offs++) {
+ block_address = ex->ee_start + offs;
+ flags = (*ctx->func)(ctx->fs, &block_address,
+ (ex->ee_block + offs),
+ ref_block, item,
+ ctx->priv_data);
+ if (flags & (BLOCK_ABORT | BLOCK_ERROR)) {
+ ret |= flags &(BLOCK_ABORT|BLOCK_ERROR);
+ return ret;
+ }
+ if (!(flags & BLOCK_CHANGED))
+ continue;
+
+ ext_printf("extent leaf changed: "
+ "block was %u+%u = %u, now %u\n",
+ ex->ee_start, offs,
+ ex->ee_start + offs, block_address);
+
+ /* FIXME: 48-bit support */
+ if (ex_prev &&
+ block_address ==
+ ex_prev->ee_start + ex_prev->ee_len &&
+ ex->ee_block + offs ==
+ ex_prev->ee_block + ex_prev->ee_len) {
+ /* can merge block with prev extent */
+ ex_prev->ee_len++;
+ ex->ee_len--;
+ ret |= BLOCK_CHANGED;
+
+ if (ex->ee_len == 0) {
+ /* no blocks left in this one */
+ ext2fs_extent_remove(eh, ex);
+ item--; ex--;
+ break;
+ } else {
+ /* FIXME: 48-bit support */
+ ex->ee_start++;
+ ex->ee_block++;
+ offs--;
+ }
+
+ } else if (offs > 0 && /* implies ee_len > 1 */
+ (ctx->errcode =
+ ext2fs_extent_split(ctx->fs, &eh,
+ &ex, offs,
+ &split_flag)
+ /* advance ex past newly split item,
+ * comparison is bogus to make sure
+ * increment doesn't change logic */
+ || (offs > 0 && ex++ == NULL))) {
+ /* split before new block failed */
+ ret |= BLOCK_ABORT | BLOCK_ERROR;
+ return ret;
+
+ } else if (ex->ee_len > 1 &&
+ (ctx->errcode =
+ ext2fs_extent_split(ctx->fs, &eh,
+ &ex, 1,
+ &split_flag))) {
+ /* split after new block failed */
+ ret |= BLOCK_ABORT | BLOCK_ERROR;
+ return ret;
+
+ } else {
+ if (ex->ee_len != 1) {
+ /* this is an internal error */
+ ctx->errcode =
+ EXT2_ET_EXTENT_INDEX_BAD;
+ ret |= BLOCK_ABORT |BLOCK_ERROR;
+ return ret;
+ }
+ /* FIXME: 48-bit support */
+ ex->ee_start = block_address;
+ ret |= BLOCK_CHANGED;
+ }
+ }
+ ex_prev = ex;
+ }
+ /* Multi-level split at depth == 0.
+ * ex now points into the newly allocated block buffer. After
+ * returning in this case, only the inode is updated with the
+ * changed i_block, so the new block must be written out
+ * explicitly here. */
+ if (split_flag == BLOCK_CHANGED) {
+ struct ext3_extent_idx *ix = EXT_FIRST_INDEX(orig_eh);
+ ctx->errcode = ext2fs_write_ext_block(ctx->fs,
+ ix->ei_leaf, eh);
+ }
+ } else {
+ char *block_buf;
+ struct ext3_extent_idx *ix;
+
+ ret = ext2fs_get_mem(ctx->fs->blocksize, &block_buf);
+ if (ret)
+ return ret;
+
+ ext_show_header(eh);
+ ix = EXT_FIRST_INDEX(eh);
+ for (item = 0; item < eh->eh_entries; item++, ix++) {
+ ext_show_index(ix);
+ /* index is processed first in e2fsck case */
+ if (!(ctx->flags & BLOCK_FLAG_DEPTH_TRAVERSE) &&
+ !(ctx->flags & BLOCK_FLAG_DATA_ONLY)) {
+ block_address = ix->ei_leaf;
+ flags = (*ctx->func)(ctx->fs, &block_address,
+ BLOCK_COUNT_IND, ref_block,
+ item, ctx->priv_data);
+ if (flags & (BLOCK_ABORT | BLOCK_ERROR)) {
+ ret |= flags &(BLOCK_ABORT|BLOCK_ERROR);
+ goto free_buf;
+ }
+ if (flags & BLOCK_CHANGED) {
+ ret |= BLOCK_CHANGED;
+ /* index has no more block, remove it */
+ /* FIXME: 48-bit support */
+ ix->ei_leaf = block_address;
+ if (ix->ei_leaf == 0 &&
+ ix->ei_leaf_hi == 0) {
+ if(ext2fs_extent_index_remove(eh, ix)) {
+ ret |= BLOCK_ABORT |BLOCK_ERROR;
+ goto free_buf;
+ } else {
+ --item; --ix;
+ continue;
+ }
+ }
+ /* remapped? */
+ }
+ }
+ ctx->errcode = ext2fs_read_ext_block(ctx->fs,
+ ix->ei_leaf,
+ block_buf);
+ if (ctx->errcode) {
+ ret |= BLOCK_ERROR;
+ goto free_buf;
+ }
+ flags = block_iterate_extents(block_buf,
+ ctx->fs->blocksize,
+ ix->ei_leaf, item, ctx);
+ if (flags & BLOCK_CHANGED) {
+ struct ext3_extent_header *nh;
+ ctx->errcode =
+ ext2fs_write_ext_block(ctx->fs,
+ ix->ei_leaf,
+ block_buf);
+
+ nh = (struct ext3_extent_header *)block_buf;
+ if (nh->eh_entries == 0)
+ ix->ei_leaf = ix->ei_leaf_hi = 0;
+ }
+ if (flags & (BLOCK_ABORT | BLOCK_ERROR)) {
+ ret |= flags & (BLOCK_ABORT | BLOCK_ERROR);
+ goto free_buf;
+ }
+ if ((ctx->flags & BLOCK_FLAG_DEPTH_TRAVERSE) &&
+ !(ctx->flags & BLOCK_FLAG_DATA_ONLY)) {
+ flags = (*ctx->func)(ctx->fs, &block_address,
+ BLOCK_COUNT_IND, ref_block,
+ item, ctx->priv_data);
+ if (flags & (BLOCK_ABORT | BLOCK_ERROR)) {
+ ret |= flags &(BLOCK_ABORT|BLOCK_ERROR);
+ goto free_buf;
+ }
+ if (flags & BLOCK_CHANGED)
+ /* FIXME: 48-bit support */
+ ix->ei_leaf = block_address;
+ }
+
+ if (flags & BLOCK_CHANGED) {
+ /* index has no more block, remove it */
+ if (ix->ei_leaf == 0 && ix->ei_leaf_hi == 0 &&
+ ext2fs_extent_index_remove(eh, ix)) {
+ ret |= BLOCK_ABORT |BLOCK_ERROR;
+ goto free_buf;
+ }
+
+ ret |= BLOCK_CHANGED;
+ if (ref_block == 0) {
+ --item; --ix;
+ continue;
+ }
+ /* remapped? */
+ }
+ }
+
+ free_buf:
+ ext2fs_free_mem(&block_buf);
+ }
+ return ret;
+}
Index: e2fsprogs-1.40.5/lib/ext2fs/ind_block.c
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/ind_block.c
+++ e2fsprogs-1.40.5/lib/ext2fs/ind_block.c
@@ -22,9 +22,9 @@
errcode_t ext2fs_read_ind_block(ext2_filsys fs, blk_t blk, void *buf)
{
errcode_t retval;
- blk_t *block_nr;
- int i;
- int limit = fs->blocksize >> 2;
+ int limit = fs->blocksize >> 2;
+ blk_t *block_nr = (blk_t *)buf;
+ int i;

if ((fs->flags & EXT2_FLAG_IMAGE_FILE) &&
(fs->io != fs->image_io))
@@ -35,7 +35,6 @@ errcode_t ext2fs_read_ind_block(ext2_fil
return retval;
}
#ifdef WORDS_BIGENDIAN
- block_nr = (blk_t *) buf;
for (i = 0; i < limit; i++, block_nr++)
*block_nr = ext2fs_swab32(*block_nr);
#endif
@@ -60,3 +59,82 @@ errcode_t ext2fs_write_ind_block(ext2_fi
}


+errcode_t ext2fs_read_ext_block(ext2_filsys fs, blk_t blk, void *buf)
+{
+ errcode_t retval;
+
+ if ((fs->flags & EXT2_FLAG_IMAGE_FILE) &&
+ (fs->io != fs->image_io))
+ memset(buf, 0, fs->blocksize);
+ else {
+ retval = io_channel_read_blk(fs->io, blk, 1, buf);
+ if (retval)
+ return retval;
+ }
+#ifdef EXT2FS_ENABLE_SWAPFS
+ if (fs->flags & (EXT2_FLAG_SWAP_BYTES | EXT2_FLAG_SWAP_BYTES_READ)) {
+ struct ext3_extent_header *eh = buf;
+ int i, limit;
+
+ ext2fs_swap_extent_header(eh);
+
+ if (eh->eh_depth == 0) {
+ struct ext3_extent *ex = EXT_FIRST_EXTENT(eh);
+
+ limit = (fs->blocksize - sizeof(*eh)) / sizeof(*ex);
+ if (eh->eh_entries < limit)
+ limit = eh->eh_entries;
+
+ for (i = 0; i < limit; i++, ex++)
+ ext2fs_swap_extent(ex);
+ } else {
+ struct ext3_extent_idx *ix = EXT_FIRST_INDEX(eh);
+
+ limit = (fs->blocksize - sizeof(*eh)) / sizeof(*ix);
+ if (eh->eh_entries < limit)
+ limit = eh->eh_entries;
+
+ for (i = 0; i < limit; i++, ix++)
+ ext2fs_swap_extent_index(ix);
+ }
+ }
+#endif
+ return 0;
+}
+
+errcode_t ext2fs_write_ext_block(ext2_filsys fs, blk_t blk, void *buf)
+{
+ if (fs->flags & EXT2_FLAG_IMAGE_FILE)
+ return 0;
+
+#ifdef EXT2FS_ENABLE_SWAPFS
+ if (fs->flags & (EXT2_FLAG_SWAP_BYTES | EXT2_FLAG_SWAP_BYTES_WRITE)) {
+ struct ext3_extent_header *eh = buf;
+ int i, limit;
+
+ if (eh->eh_depth == 0) {
+ struct ext3_extent *ex = EXT_FIRST_EXTENT(eh);
+
+ limit = (fs->blocksize - sizeof(*eh)) / sizeof(*ex);
+ if (eh->eh_entries < limit)
+ limit = eh->eh_entries;
+
+ for (i = 0; i < limit; i++, ex++)
+ ext2fs_swap_extent(ex);
+ } else {
+ struct ext3_extent_idx *ix = EXT_FIRST_INDEX(eh);
+
+ limit = (fs->blocksize - sizeof(*eh)) / sizeof(*ix);
+ if (eh->eh_entries < limit)
+ limit = eh->eh_entries;
+
+ for (i = 0; i < limit; i++, ix++)
+ ext2fs_swap_extent_index(ix);
+ }
+
+ ext2fs_swap_extent_header(eh);
+ }
+#endif
+ return io_channel_write_blk(fs->io, blk, 1, buf);
+}
+
Index: e2fsprogs-1.40.5/lib/ext2fs/swapfs.c
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/swapfs.c
+++ e2fsprogs-1.40.5/lib/ext2fs/swapfs.c
@@ -142,11 +142,33 @@ void ext2fs_swap_ext_attr(char *to, char
}
}

+void ext2fs_swap_extent_header(struct ext3_extent_header *eh) {
+ eh->eh_magic = ext2fs_swab16(eh->eh_magic);
+ eh->eh_entries = ext2fs_swab16(eh->eh_entries);
+ eh->eh_max = ext2fs_swab16(eh->eh_max);
+ eh->eh_depth = ext2fs_swab16(eh->eh_depth);
+ eh->eh_generation = ext2fs_swab32(eh->eh_generation);
+}
+
+void ext2fs_swap_extent_index(struct ext3_extent_idx *ix) {
+ ix->ei_block = ext2fs_swab32(ix->ei_block);
+ ix->ei_leaf = ext2fs_swab32(ix->ei_leaf);
+ ix->ei_leaf_hi = ext2fs_swab16(ix->ei_leaf_hi);
+ ix->ei_unused = ext2fs_swab16(ix->ei_unused);
+}
+
+void ext2fs_swap_extent(struct ext3_extent *ex) {
+ ex->ee_block = ext2fs_swab32(ex->ee_block);
+ ex->ee_len = ext2fs_swab16(ex->ee_len);
+ ex->ee_start_hi =ext2fs_swab16(ex->ee_start_hi);
+ ex->ee_start = ext2fs_swab32(ex->ee_start);
+}
+
void ext2fs_swap_inode_full(ext2_filsys fs, struct ext2_inode_large *t,
struct ext2_inode_large *f, int hostorder,
int bufsize)
{
- unsigned i, has_data_blocks, extra_isize;
+ unsigned i, has_data_blocks, extra_isize, has_extents;
int islnk = 0;
__u32 *eaf, *eat;

@@ -164,18 +186,46 @@ void ext2fs_swap_inode_full(ext2_filsys
t->i_gid = ext2fs_swab16(f->i_gid);
t->i_links_count = ext2fs_swab16(f->i_links_count);
t->i_file_acl = ext2fs_swab32(f->i_file_acl);
- if (hostorder)
- has_data_blocks = ext2fs_inode_data_blocks(fs,
+ if (hostorder) {
+ has_data_blocks = ext2fs_inode_data_blocks(fs,
(struct ext2_inode *) f);
- t->i_blocks = ext2fs_swab32(f->i_blocks);
- if (!hostorder)
- has_data_blocks = ext2fs_inode_data_blocks(fs,
+ t->i_blocks = ext2fs_swab32(f->i_blocks);
+ has_extents = (f->i_flags & EXT4_EXTENTS_FL);
+ t->i_flags = ext2fs_swab32(f->i_flags);
+ } else {
+ t->i_blocks = ext2fs_swab32(f->i_blocks);
+ has_data_blocks = ext2fs_inode_data_blocks(fs,
(struct ext2_inode *) t);
+ t->i_flags = ext2fs_swab32(f->i_flags);
+ has_extents = (t->i_flags & EXT4_EXTENTS_FL);
+ }
t->i_flags = ext2fs_swab32(f->i_flags);
t->i_dir_acl = ext2fs_swab32(f->i_dir_acl);
- if (!islnk || has_data_blocks ) {
- for (i = 0; i < EXT2_N_BLOCKS; i++)
- t->i_block[i] = ext2fs_swab32(f->i_block[i]);
+ if (!islnk || has_data_blocks) {
+ if (has_extents) {
+ struct ext3_extent_header *eh;
+ int max = EXT2_N_BLOCKS * sizeof(__u32) - sizeof(*eh);
+
+ memcpy(t->i_block, f->i_block, sizeof(f->i_block));
+ eh = (struct ext3_extent_header *)t->i_block;
+ ext2fs_swap_extent_header(eh);
+
+ if (!eh->eh_depth) {
+ struct ext3_extent *ex = EXT_FIRST_EXTENT(eh);
+ max = max / sizeof(struct ext3_extent);
+ for (i = 0; i < max; i++, ex++)
+ ext2fs_swap_extent(ex);
+ } else {
+ struct ext3_extent_idx *ix =
+ EXT_FIRST_INDEX(eh);
+ max = max / sizeof(struct ext3_extent_idx);
+ for (i = 0; i < max; i++, ix++)
+ ext2fs_swap_extent_index(ix);
+ }
+ } else {
+ for (i = 0; i < EXT2_N_BLOCKS; i++)
+ t->i_block[i] = ext2fs_swab32(f->i_block[i]);
+ }
} else if (t != f) {
for (i = 0; i < EXT2_N_BLOCKS; i++)
t->i_block[i] = f->i_block[i];
@@ -218,11 +268,13 @@ void ext2fs_swap_inode_full(ext2_filsys
if (bufsize < (int) (sizeof(struct ext2_inode) + sizeof(__u16)))
return; /* no i_extra_isize field */

- if (hostorder)
+ if (hostorder) {
extra_isize = f->i_extra_isize;
- t->i_extra_isize = ext2fs_swab16(f->i_extra_isize);
- if (!hostorder)
+ t->i_extra_isize = ext2fs_swab16(f->i_extra_isize);
+ } else {
+ t->i_extra_isize = ext2fs_swab16(f->i_extra_isize);
extra_isize = t->i_extra_isize;
+ }
if (extra_isize > EXT2_INODE_SIZE(fs->super) -
sizeof(struct ext2_inode)) {
/* this is error case: i_extra_size is too large */
Index: e2fsprogs-1.40.5/lib/ext2fs/valid_blk.c
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/valid_blk.c
+++ e2fsprogs-1.40.5/lib/ext2fs/valid_blk.c
@@ -19,6 +19,7 @@

#include "ext2_fs.h"
#include "ext2fs.h"
+#include "ext3_extents.h"

/*
* This function returns 1 if the inode's block entries actually
@@ -41,12 +42,23 @@ int ext2fs_inode_has_valid_blocks(struct
if (LINUX_S_ISLNK (inode->i_mode)) {
if (inode->i_file_acl == 0) {
/* With no EA block, we can rely on i_blocks */
- if (inode->i_blocks == 0)
- return 0;
+ if (inode->i_flags & EXT4_EXTENTS_FL) {
+ struct ext3_extent_header *eh;
+ eh = (struct ext3_extent_header *)inode->i_block;
+ if (eh->eh_entries == 0)
+ return 0;
+ } else {
+ if (inode->i_blocks == 0)
+ return 0;
+ }
} else {
/* With an EA block, life gets more tricky */
if (inode->i_size >= EXT2_N_BLOCKS*4)
return 1; /* definitely using i_block[] */
+ /*
+ * we can't have EA + extents, so assume we aren't
+ * using extents
+ */
if (inode->i_size > 4 && inode->i_block[1] == 0)
return 1; /* definitely using i_block[] */
return 0; /* Probably a fast symlink */
Index: e2fsprogs-1.40.5/tests/f_bad_disconnected_inode/expect.1
===================================================================
--- e2fsprogs-1.40.5.orig/tests/f_bad_disconnected_inode/expect.1
+++ e2fsprogs-1.40.5/tests/f_bad_disconnected_inode/expect.1
@@ -1,4 +1,10 @@
Pass 1: Checking inodes, blocks, and sizes
+Inode 15 has EXTENT_FL set, but is not in extents format
+Fix? yes
+
+Inode 16 has EXTENT_FL set, but is not in extents format
+Fix? yes
+
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
/lost+found not found. Create? yes
Index: e2fsprogs-1.40.5/tests/f_bbfile/expect.1
===================================================================
--- e2fsprogs-1.40.5.orig/tests/f_bbfile/expect.1
+++ e2fsprogs-1.40.5/tests/f_bbfile/expect.1
@@ -3,46 +3,60 @@ Filesystem did not have a UUID; generati
Pass 1: Checking inodes, blocks, and sizes
Group 0's inode bitmap (4) is bad. Relocate? yes

+Inode 11 has corrupt indirect block
+Clear? yes
+
Relocating group 0's inode bitmap from 4 to 43...
+Restarting e2fsck from the beginning...
+Pass 1: Checking inodes, blocks, and sizes

Running additional passes to resolve blocks claimed by more than one inode...
Pass 1B: Rescanning for multiply-claimed blocks
Multiply-claimed block(s) in inode 2: 21
-Multiply-claimed block(s) in inode 11: 9 10 11 12 13 14 15 16 17 18 19 20
Multiply-claimed block(s) in inode 12: 25 26
Pass 1C: Scanning directories for inodes with multiply-claimed blocks
Pass 1D: Reconciling multiply-claimed blocks
-(There are 3 inodes containing multiply-claimed blocks.)
+(There are 2 inodes containing multiply-claimed blocks.)

File / (inode #2, mod time Sun Jan 2 08:29:13 1994)
has 1 multiply-claimed block(s), shared with 1 file(s):
<The bad blocks inode> (inode #1, mod time Sun Jul 17 00:47:58 1994)
Clone multiply-claimed blocks? yes

-File /lost+found (inode #11, mod time Sun Jan 2 08:28:40 1994)
- has 12 multiply-claimed block(s), shared with 1 file(s):
- <The bad blocks inode> (inode #1, mod time Sun Jul 17 00:47:58 1994)
-Clone multiply-claimed blocks? yes
-
File /termcap (inode #12, mod time Sun Jan 2 08:29:13 1994)
has 2 multiply-claimed block(s), shared with 1 file(s):
<The bad blocks inode> (inode #1, mod time Sun Jul 17 00:47:58 1994)
Clone multiply-claimed blocks? yes

Pass 2: Checking directory structure
+Entry 'lost+found' in / (2) has deleted/unused inode 11. Clear? yes
+
Pass 3: Checking directory connectivity
+/lost+found not found. Create? yes
+
Pass 4: Checking reference counts
+Inode 2 ref count is 4, should be 3. Fix? yes
+
Pass 5: Checking group summary information
Block bitmap differences: +43
Fix? yes

-Free blocks count wrong for group #0 (57, counted=41).
+Free blocks count wrong for group #0 (56, counted=52).
+Fix? yes
+
+Free blocks count wrong (56, counted=52).
+Fix? yes
+
+Free inodes count wrong for group #0 (19, counted=20).
+Fix? yes
+
+Directories count wrong for group #0 (3, counted=2).
Fix? yes

-Free blocks count wrong (57, counted=41).
+Free inodes count wrong (19, counted=20).
Fix? yes


test_filesys: ***** FILE SYSTEM WAS MODIFIED *****
-test_filesys: 12/32 files (0.0% non-contiguous), 59/100 blocks
+test_filesys: 12/32 files (0.0% non-contiguous), 48/100 blocks
Exit status is 1
Index: e2fsprogs-1.40.5/tests/f_bbfile/expect.2
===================================================================
--- e2fsprogs-1.40.5.orig/tests/f_bbfile/expect.2
+++ e2fsprogs-1.40.5/tests/f_bbfile/expect.2
@@ -3,5 +3,5 @@ Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
-test_filesys: 12/32 files (8.3% non-contiguous), 59/100 blocks
+test_filesys: 12/32 files (8.3% non-contiguous), 48/100 blocks
Exit status is 0
Index: e2fsprogs-1.40.5/tests/f_lotsbad/expect.1
===================================================================
--- e2fsprogs-1.40.5.orig/tests/f_lotsbad/expect.1
+++ e2fsprogs-1.40.5/tests/f_lotsbad/expect.1
@@ -8,54 +8,41 @@ Inode 13, i_size is 15360, should be 122

Inode 13, i_blocks is 32, should be 30. Fix? yes

-Inode 12 has illegal block(s). Clear? yes
+Inode 12 has corrupt indirect block
+Clear? yes

-Illegal block #12 (778398818) in inode 12. CLEARED.
-Illegal block #13 (1768444960) in inode 12. CLEARED.
-Illegal block #14 (1752375411) in inode 12. CLEARED.
-Illegal block #15 (1684829551) in inode 12. CLEARED.
-Illegal block #16 (1886349344) in inode 12. CLEARED.
-Illegal block #17 (1819633253) in inode 12. CLEARED.
-Illegal block #18 (1663072620) in inode 12. CLEARED.
-Illegal block #19 (1735287144) in inode 12. CLEARED.
-Illegal block #20 (1310731877) in inode 12. CLEARED.
-Illegal block #21 (560297071) in inode 12. CLEARED.
-Illegal block #22 (543512352) in inode 12. CLEARED.
-Too many illegal blocks in inode 12.
-Clear inode? yes
+Inode 12, i_blocks is 34, should be 24. Fix? yes

-Restarting e2fsck from the beginning...
-Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
-Entry 'termcap' in / (2) has deleted/unused inode 12. Clear? yes
+Directory inode 13 has an unallocated block #16580876. Allocate? yes

Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Inode 2 ref count is 5, should be 4. Fix? yes

Pass 5: Checking group summary information
-Block bitmap differences: -(27--41) -(44--45) -(74--90)
+Block bitmap differences: -(38--41) -(74--90)
Fix? yes

-Free blocks count wrong for group #0 (9, counted=43).
+Free blocks count wrong for group #0 (9, counted=30).
Fix? yes

-Free blocks count wrong (9, counted=43).
+Free blocks count wrong (9, counted=30).
Fix? yes

-Inode bitmap differences: -12 -14
+Inode bitmap differences: -14
Fix? yes

-Free inodes count wrong for group #0 (18, counted=20).
+Free inodes count wrong for group #0 (18, counted=19).
Fix? yes

Directories count wrong for group #0 (4, counted=3).
Fix? yes

-Free inodes count wrong (18, counted=20).
+Free inodes count wrong (18, counted=19).
Fix? yes


test_filesys: ***** FILE SYSTEM WAS MODIFIED *****
-test_filesys: 12/32 files (0.0% non-contiguous), 57/100 blocks
+test_filesys: 13/32 files (7.7% non-contiguous), 70/100 blocks
Exit status is 1
Index: e2fsprogs-1.40.5/tests/f_lotsbad/expect.2
===================================================================
--- e2fsprogs-1.40.5.orig/tests/f_lotsbad/expect.2
+++ e2fsprogs-1.40.5/tests/f_lotsbad/expect.2
@@ -1,7 +1,18 @@
Pass 1: Checking inodes, blocks, and sizes
+Inode 13 is too big. Truncate? yes
+
+Block #16580876 (37) causes directory to be too big. CLEARED.
+Inode 13, i_size is 4093916160, should be 12288. Fix? yes
+
+Inode 13, i_blocks is 32, should be 30. Fix? yes
+
Pass 2: Checking directory structure
+Directory inode 13 has an unallocated block #16580876. Allocate? yes
+
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
-test_filesys: 12/32 files (0.0% non-contiguous), 57/100 blocks
-Exit status is 0
+
+test_filesys: ***** FILE SYSTEM WAS MODIFIED *****
+test_filesys: 13/32 files (15.4% non-contiguous), 70/100 blocks
+Exit status is 1
Index: e2fsprogs-1.40.5/tests/f_messy_inode/expect.1
===================================================================
--- e2fsprogs-1.40.5.orig/tests/f_messy_inode/expect.1
+++ e2fsprogs-1.40.5/tests/f_messy_inode/expect.1
@@ -1,38 +1,36 @@
Filesystem did not have a UUID; generating one.

Pass 1: Checking inodes, blocks, and sizes
-Inode 14 has illegal block(s). Clear? yes
-
-Illegal block #2 (4294901760) in inode 14. CLEARED.
-Illegal block #3 (4294901760) in inode 14. CLEARED.
-Illegal block #4 (4294901760) in inode 14. CLEARED.
-Illegal block #5 (4294901760) in inode 14. CLEARED.
-Illegal block #6 (4294901760) in inode 14. CLEARED.
-Illegal block #7 (4294901760) in inode 14. CLEARED.
-Illegal block #8 (4294901760) in inode 14. CLEARED.
-Illegal block #9 (4294901760) in inode 14. CLEARED.
-Illegal block #10 (4294901760) in inode 14. CLEARED.
-Inode 14, i_size is 18446462598732849291, should be 2048. Fix? yes
-
-Inode 14, i_blocks is 18, should be 4. Fix? yes
+Inode 14 has corrupt indirect block
+Clear? yes

+Restarting e2fsck from the beginning...
+Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
-i_file_acl for inode 14 (/MAKEDEV) is 4294901760, should be zero.
-Clear? yes
+Entry 'MAKEDEV' in / (2) has deleted/unused inode 14. Clear? yes

Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
-Block bitmap differences: -(43--49)
+Block bitmap differences: -(41--49)
+Fix? yes
+
+Free blocks count wrong for group #0 (68, counted=77).
+Fix? yes
+
+Free blocks count wrong (68, counted=77).
+Fix? yes
+
+Inode bitmap differences: -14
Fix? yes

-Free blocks count wrong for group #0 (68, counted=75).
+Free inodes count wrong for group #0 (3, counted=4).
Fix? yes

-Free blocks count wrong (68, counted=75).
+Free inodes count wrong (3, counted=4).
Fix? yes


test_filesys: ***** FILE SYSTEM WAS MODIFIED *****
-test_filesys: 29/32 files (3.4% non-contiguous), 25/100 blocks
+test_filesys: 28/32 files (0.0% non-contiguous), 23/100 blocks
Exit status is 1
Index: e2fsprogs-1.40.5/tests/f_messy_inode/expect.2
===================================================================
--- e2fsprogs-1.40.5.orig/tests/f_messy_inode/expect.2
+++ e2fsprogs-1.40.5/tests/f_messy_inode/expect.2
@@ -3,5 +3,5 @@ Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
-test_filesys: 29/32 files (0.0% non-contiguous), 25/100 blocks
+test_filesys: 28/32 files (0.0% non-contiguous), 23/100 blocks
Exit status is 0
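
For readers skimming the extents patch above, the array manipulation
done by ext2fs_extent_split_internal() can be sketched in isolation.
This is only an illustration: struct demo_extent and
demo_extent_split() are hypothetical names, the fields are simplified
stand-ins for struct ext3_extent, and none of the 48-bit or on-disk
handling is modelled.

```c
#include <string.h>

/* Simplified, hypothetical stand-in for struct ext3_extent. */
struct demo_extent {
	unsigned block;	/* first logical block covered */
	unsigned start;	/* first physical block */
	unsigned len;	/* number of blocks */
};

/*
 * Split ext[idx] into two extents after `offs` blocks.
 * Returns the new entry count, or -1 if the split is not possible.
 */
int demo_extent_split(struct demo_extent *ext, int count, int max,
		      int idx, unsigned offs)
{
	if (idx < 0 || idx >= count || count >= max)
		return -1;
	if (offs == 0 || offs >= ext[idx].len)
		return -1;

	/* Shift ext[idx..count-1] up by one slot; ext[idx] is duplicated. */
	memmove(&ext[idx + 1], &ext[idx], (count - idx) * sizeof(*ext));

	/* First half keeps the head, second half covers the tail. */
	ext[idx].len = offs;
	ext[idx + 1].len -= offs;
	ext[idx + 1].block += offs;
	ext[idx + 1].start += offs;

	return count + 1;
}
```

Unlike the patch, this sketch rejects a split at offset 0 or at the
full length, since either would leave an empty extent behind.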

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2008-02-02 08:29:28

by Andreas Dilger

Subject: Re: [PATCH][8/28] e2fsprogs-config-before-cmdline.patch


This patch changes the order in which the config file and the command
line are parsed, so that the command line takes precedence. It also
allows multiple -E options to be specified on the command line.

Signed-off-by: Jim Garlick <[email protected]>
Signed-off-by: Andreas Dilger <[email protected]>

Index: e2fsprogs-1.40.4/e2fsck/unix.c
===================================================================
--- e2fsprogs-1.40.4.orig/e2fsck/unix.c
+++ e2fsprogs-1.40.4/e2fsck/unix.c
@@ -588,7 +588,6 @@ static errcode_t PRS(int argc, char *arg
#ifdef HAVE_SIGNAL_H
struct sigaction sa;
#endif
- char *extended_opts = 0;
char *cp;
int res; /* result of sscanf */
#ifdef CONFIG_JBD_DEBUG
@@ -619,6 +618,12 @@ static errcode_t PRS(int argc, char *arg
ctx->program_name = *argv;
else
ctx->program_name = "e2fsck";
+
+ if ((cp = getenv("E2FSCK_CONFIG")) != NULL)
+ config_fn[0] = cp;
+ profile_set_syntax_err_cb(syntax_err_report);
+ profile_init(config_fn, &ctx->profile);
+
while ((c = getopt (argc, argv, "panyrcC:B:dE:fvtFVM:b:I:j:P:l:L:N:SsDk")) != EOF)
switch (c) {
case 'C':
@@ -645,7 +650,7 @@ static errcode_t PRS(int argc, char *arg
ctx->options |= E2F_OPT_COMPRESS_DIRS;
break;
case 'E':
- extended_opts = optarg;
+ parse_extended_opts(ctx, optarg);
break;
case 'p':
case 'a':
@@ -771,13 +776,6 @@ static errcode_t PRS(int argc, char *arg
argv[optind]);
fatal_error(ctx, 0);
}
- if (extended_opts)
- parse_extended_opts(ctx, extended_opts);
-
- if ((cp = getenv("E2FSCK_CONFIG")) != NULL)
- config_fn[0] = cp;
- profile_set_syntax_err_cb(syntax_err_report);
- profile_init(config_fn, &ctx->profile);

if (flush) {
fd = open(ctx->filesystem_name, O_RDONLY, 0);

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2008-02-02 08:30:20

by Andreas Dilger

Subject: Re: [PATCH][9/28] e2fsprogs-SLES10--m-support.patch


SLES9 patch to add "fsck -m" option to skip checking mounted filesystems.
This isn't in their upstream e2fsprogs, since SLES uses the fsck in
util-linux, but is needed for compatibility.
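
The mounted check in the patch below compares the device number of the
mount point directory with that of its parent directory. A standalone
sketch of that idea follows; demo_is_mountpoint() is a hypothetical
helper, not part of fsck, and unlike the patch it checks the malloc()
result and takes a plain path instead of a struct fs_info.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Return 1 if `dir` looks like a mount point, 0 otherwise.
 * A directory whose st_dev differs from that of "dir/.." sits on a
 * different device than its parent, so something is mounted on it.
 */
int demo_is_mountpoint(const char *dir)
{
	struct stat st_dir, st_parent;
	size_t len = strlen(dir);
	char *parent;
	int mounted = 0;

	/* The root directory is always treated as mounted. */
	if (strcmp(dir, "/") == 0)
		return 1;

	if (stat(dir, &st_dir) < 0 || !S_ISDIR(st_dir.st_mode))
		return 0;

	/* Room for dir + "/.." + terminating NUL. */
	parent = malloc(len + 4);
	if (!parent)
		return 0;
	snprintf(parent, len + 4, "%s%s", dir,
		 dir[len - 1] == '/' ? ".." : "/..");

	if (stat(parent, &st_parent) == 0 &&
	    st_parent.st_dev != st_dir.st_dev)
		mounted = 1;

	free(parent);
	return mounted;
}
```

Note that this heuristic cannot detect a bind mount of a directory from
the same filesystem, since both sides then share one device number.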

Index: e2fsprogs-1.40.1/misc/fsck.8.in
===================================================================
--- e2fsprogs-1.40.1.orig/misc/fsck.8.in
+++ e2fsprogs-1.40.1/misc/fsck.8.in
@@ -180,6 +180,10 @@ option,
will use the specified filesystem type. If this type is not
available, then the default file system type (currently ext2) is used.
.TP
+.B \-m
+Do not check mounted filesystems and return an exit code of 0
+for mounted filesystems.
+.TP
.B \-A
Walk through the
.I /etc/fstab
Index: e2fsprogs-1.40.1/misc/fsck.c
===================================================================
--- e2fsprogs-1.40.1.orig/misc/fsck.c
+++ e2fsprogs-1.40.1/misc/fsck.c
@@ -103,6 +103,7 @@ int noexecute = 0;
int serialize = 0;
int skip_root = 0;
int like_mount = 0;
+int ignore_mounted = 0;
int notitle = 0;
int parallel_root = 0;
int progress = 0;
@@ -793,7 +794,7 @@ static void compile_fs_type(char *fs_typ
#if 0
printf("Adding %s to list (type %d).\n", s, cmp->type[num]);
#endif
- cmp->list[num++] = string_copy(s);
+ cmp->list[num++] = string_copy(s);
s = strtok(NULL, ",");
}
free(list);
@@ -819,7 +820,7 @@ static int opt_in_list(char *opt, char *
}
s = strtok(NULL, ",");
}
- free(list);
+ free(list);
return 0;
}

@@ -855,6 +856,56 @@ static int fs_match(struct fs_info *fs,
return (cmp->negate ? !ret : ret);
}

+/* Check to see whether a filesystem is already mounted */
+static int is_mounted(struct fs_info *fs)
+{
+ struct stat st_buf;
+ dev_t fs_rdev;
+ char *testdir;
+ int retval = 0;
+
+ if (!fs->mountpt) {
+ /*
+ * We have already read /proc/mounts
+ * so any device without a mountpoint
+ * is indeed not mounted.
+ */
+ return 0;
+ }
+
+ if (!strcmp(fs->mountpt,"/")) {
+ /* Root should be always mounted */
+ return 1;
+ }
+
+ if (stat(fs->mountpt, &st_buf) < 0)
+ return 0;
+
+ if (!S_ISDIR(st_buf.st_mode)) {
+ /* This is not a directory, cannot be mounted */
+ return 0;
+ }
+
+ fs_rdev = st_buf.st_dev;
+
+ /* Compare with the upper directory */
+ testdir = malloc(strlen(fs->mountpt) + 4);
+ strcpy(testdir,fs->mountpt);
+ if (fs->mountpt[strlen(fs->mountpt) - 1] == '/')
+ strcat(testdir,"..");
+ else
+ strcat(testdir,"/..");
+
+ if (stat(testdir, &st_buf) == 0) {
+ if (st_buf.st_dev != fs_rdev) {
+ retval = 1;
+ }
+ }
+ free(testdir);
+
+ return retval;
+}
+
/* Check if we should ignore this filesystem. */
static int ignore(struct fs_info *fs)
{
@@ -1002,6 +1053,15 @@ static int check_all(NOARGS)
not_done_yet++;
continue;
}
+ if (ignore_mounted) {
+ /*
+ * Ignore mounted devices.
+ */
+ if (is_mounted(fs)) {
+ fs->flags |= FLAG_DONE;
+ continue;
+ }
+ }
/*
* If a filesystem on a particular device has
* already been spawned, then we need to defer
@@ -1179,6 +1239,9 @@ static void PRS(int argc, char *argv[])
case 'P':
parallel_root++;
break;
+ case 'm':
+ ignore_mounted++;
+ break;
case 's':
serialize++;
break;
@@ -1254,6 +1317,10 @@ int main(int argc, char *argv[])
fstab = _PATH_MNTTAB;
load_fs_info(fstab);

+ /* Load info from /proc/mounts, too */
+ if (ignore_mounted)
+ load_fs_info("/proc/mounts");
+
/* Update our search path to include uncommon directories. */
if (oldpath) {
fsck_path = malloc (strlen (fsck_prefix_path) + 1 +
@@ -1296,6 +1363,14 @@ int main(int argc, char *argv[])
if (!fs)
continue;
}
+ if (ignore_mounted) {
+ /*
+ * Ignore mounted devices.
+ */
+ if (is_mounted(fs)) {
+ continue;
+ }
+ }
fsck_device(fs, interactive);
if (serialize ||
(max_running && (num_running >= max_running))) {

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2008-02-02 08:34:52

by Andreas Dilger

Subject: Re: [PATCH][10/28] e2fsprogs-uninit.patch


Add support for the RO_COMPAT_GDT_CSUM (uninit_groups) feature.

This allows e2fsck to skip checking uninitialized inode and block
bitmaps, and to skip the unused parts of the inode table. It can
dramatically speed up e2fsck on large filesystems where the inode
table is mostly unused.
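
As a rough illustration of where the speedup comes from, the number of
inodes that need scanning in a group can be derived from the group
descriptor. The sketch below is not the e2fsck implementation:
struct demo_group_desc, DEMO_BG_INODE_UNINIT and demo_inodes_to_scan()
are hypothetical, and the fallback for an implausible unused count is
an assumption made for the example.

```c
/* Hypothetical flag value and descriptor; illustrative names only. */
#define DEMO_BG_INODE_UNINIT	0x0001

struct demo_group_desc {
	unsigned short bg_flags;
	unsigned short bg_itable_unused; /* unused inodes at table end */
};

/*
 * Return how many inodes at the start of the group's inode table
 * need to be scanned.
 */
unsigned demo_inodes_to_scan(const struct demo_group_desc *gd,
			     unsigned inodes_per_group, int has_gdt_csum)
{
	/* Without the feature the hints cannot be trusted. */
	if (!has_gdt_csum)
		return inodes_per_group;

	/* Inode table never initialized: nothing to scan. */
	if (gd->bg_flags & DEMO_BG_INODE_UNINIT)
		return 0;

	/* Implausible count: fall back to scanning the whole table. */
	if (gd->bg_itable_unused > inodes_per_group)
		return inodes_per_group;

	return inodes_per_group - gd->bg_itable_unused;
}
```

For a group with 128 inodes of which 100 are marked unused, only the
first 28 would be scanned in this model.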

Signed-off-by: Andreas Dilger <[email protected]>
Signed-off-by: Girish Shilamkar <[email protected]>
Signed-off-by: Kalpak Shah <[email protected]>

Index: e2fsprogs-1.40.5/debugfs/debugfs.c
===================================================================
--- e2fsprogs-1.40.5.orig/debugfs/debugfs.c
+++ e2fsprogs-1.40.5/debugfs/debugfs.c
@@ -286,7 +286,10 @@ void do_show_super_stats(int argc, char
FILE *out;
struct ext2_group_desc *gdp;
int c, header_only = 0;
- int numdirs = 0, first;
+ int numdirs = 0, first, gdt_csum;
+
+ gdt_csum = EXT2_HAS_RO_COMPAT_FEATURE(current_fs->super,
+ EXT4_FEATURE_RO_COMPAT_GDT_CSUM);

reset_getopt();
while ((c = getopt (argc, argv, "h")) != EOF) {
@@ -322,7 +325,7 @@ void do_show_super_stats(int argc, char
"inode table at %u\n"
" %d free %s, "
"%d free %s, "
- "%d used %s\n",
+ "%d used %s%s",
i, gdp->bg_block_bitmap,
gdp->bg_inode_bitmap, gdp->bg_inode_table,
gdp->bg_free_blocks_count,
@@ -331,12 +334,21 @@ void do_show_super_stats(int argc, char
gdp->bg_free_inodes_count != 1 ? "inodes" : "inode",
gdp->bg_used_dirs_count,
gdp->bg_used_dirs_count != 1 ? "directories"
- : "directory");
+ : "directory", gdt_csum ? ", " : "\n");
+ if (gdt_csum)
+ fprintf(out, "%d unused %s\n",
+ gdp->bg_itable_unused,
+ gdp->bg_itable_unused != 1 ? "inodes":"inode");
first = 1;
print_bg_opts(gdp, EXT2_BG_INODE_UNINIT, "Inode not init",
&first, out);
print_bg_opts(gdp, EXT2_BG_BLOCK_UNINIT, "Block not init",
&first, out);
+ if (gdt_csum) {
+ fprintf(out, "%sChecksum 0x%04x",
+ first ? " [":", ", gdp->bg_checksum);
+ first = 0;
+ }
if (!first)
fputs("]\n", out);
}
Index: e2fsprogs-1.40.5/debugfs/set_fields.c
===================================================================
--- e2fsprogs-1.40.5.orig/debugfs/set_fields.c
+++ e2fsprogs-1.40.5/debugfs/set_fields.c
@@ -36,6 +36,7 @@
static struct ext2_super_block set_sb;
static struct ext2_inode set_inode;
static struct ext2_group_desc set_gd;
+static dgrp_t set_bg;
static ext2_ino_t set_ino;
static int array_idx;

@@ -57,6 +58,7 @@ static errcode_t parse_uuid(struct field
static errcode_t parse_hashalg(struct field_set_info *info, char *arg);
static errcode_t parse_time(struct field_set_info *info, char *arg);
static errcode_t parse_bmap(struct field_set_info *info, char *arg);
+static errcode_t parse_gd_csum(struct field_set_info *info, char *arg);

static struct field_set_info super_fields[] = {
{ "inodes_count", &set_sb.s_inodes_count, 4, parse_uint },
@@ -163,7 +165,7 @@ static struct field_set_info ext2_bg_fie
{ "flags", &set_gd.bg_flags, 2, parse_uint },
{ "reserved", &set_gd.bg_reserved, 2, parse_uint, FLAG_ARRAY, 2 },
{ "itable_unused", &set_gd.bg_itable_unused, 2, parse_uint },
- { "checksum", &set_gd.bg_checksum, 2, parse_uint },
+ { "checksum", &set_gd.bg_checksum, 2, parse_gd_csum },
{ 0, 0, 0, 0 }
};

@@ -376,6 +378,17 @@ static errcode_t parse_bmap(struct field
return retval;
}

+static errcode_t parse_gd_csum(struct field_set_info *info, char *arg)
+{
+ __u16 *val = info->ptr;
+
+ if (strcmp(arg, "calc") == 0) {
+ *val = ext2fs_group_desc_csum(&set_sb, set_bg, &set_gd);
+ return 0;
+ }
+
+ return parse_uint(info, arg);
+}

static void print_possible_fields(struct field_set_info *fields)
{
@@ -495,7 +508,6 @@ void do_set_block_group_descriptor(int a
"\t\"set_block_group_descriptor -l\" will list the names of "
"the fields in a block group descriptor\n\twhich can be set.";
struct field_set_info *ss;
- dgrp_t set_bg;
char *end;

if ((argc == 2) && !strcmp(argv[1], "-l")) {
@@ -525,6 +537,7 @@ void do_set_block_group_descriptor(int a
}

set_gd = current_fs->group_desc[set_bg];
+ set_sb = *current_fs->super;

if (ss->func(ss, argv[3]) == 0) {
current_fs->group_desc[set_bg] = set_gd;
Index: e2fsprogs-1.40.5/e2fsck/journal.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/journal.c
+++ e2fsprogs-1.40.5/e2fsck/journal.c
@@ -988,6 +988,8 @@ void e2fsck_move_ext3_journal(e2fsck_t c
ext2fs_unmark_inode_bitmap(fs->inode_map, ino);
ext2fs_mark_ib_dirty(fs);
fs->group_desc[group].bg_free_inodes_count++;
+ fs->group_desc[group].bg_checksum =
+ ext2fs_group_desc_csum(fs->super, group,&fs->group_desc[group]);
fs->super->s_free_inodes_count++;
return;

Index: e2fsprogs-1.40.5/e2fsck/pass5.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/pass5.c
+++ e2fsprogs-1.40.5/e2fsck/pass5.c
@@ -121,7 +121,7 @@ static void check_block_bitmaps(e2fsck_t
struct problem_context pctx;
int problem, save_problem, fixit, had_problem;
errcode_t retval;
- int lazy_bg = 0;
+ int lazy_flag, csum_flag;
int skip_group = 0;

clear_problem_context(&pctx);
@@ -158,15 +158,16 @@ static void check_block_bitmaps(e2fsck_t
goto errout;
}

- if (EXT2_HAS_COMPAT_FEATURE(fs->super, EXT2_FEATURE_COMPAT_LAZY_BG))
- lazy_bg++;
-
+ lazy_flag = EXT2_HAS_COMPAT_FEATURE(fs->super,
+ EXT2_FEATURE_COMPAT_LAZY_BG);
+ csum_flag = EXT2_HAS_RO_COMPAT_FEATURE(fs->super,
+ EXT4_FEATURE_RO_COMPAT_GDT_CSUM);
redo_counts:
had_problem = 0;
save_problem = 0;
pctx.blk = pctx.blk2 = NO_BLK;
- if (lazy_bg && (fs->group_desc[group].bg_flags &
- EXT2_BG_BLOCK_UNINIT))
+ if ((lazy_flag || csum_flag) &&
+ (fs->group_desc[group].bg_flags & EXT2_BG_BLOCK_UNINIT))
skip_group++;
super = fs->super->s_first_data_block;
for (i = fs->super->s_first_data_block;
@@ -206,6 +207,17 @@ redo_counts:
* Block used, but not marked in use in the bitmap.
*/
problem = PR_5_BLOCK_USED;
+
+ if (skip_group) {
+ struct problem_context pctx2;
+ pctx2.blk = i;
+ pctx2.group = group;
+ if (fix_problem(ctx, PR_5_BLOCK_UNINIT,&pctx2)){
+ fs->group_desc[group].bg_flags &=
+ ~EXT2_BG_BLOCK_UNINIT;
+ skip_group = 0;
+ }
+ }
}
if (pctx.blk == NO_BLK) {
pctx.blk = pctx.blk2 = i;
@@ -224,7 +236,7 @@ redo_counts:
had_problem++;

do_counts:
- if (!bitmap && !skip_group) {
+ if (!bitmap && (!skip_group || csum_flag)) {
group_free++;
free_blocks++;
}
@@ -241,7 +253,7 @@ redo_counts:
if ((ctx->progress)(ctx, 5, group,
fs->group_desc_count*2))
goto errout;
- if (lazy_bg &&
+ if ((lazy_flag || csum_flag) &&
(i != fs->super->s_blocks_count-1) &&
(fs->group_desc[group].bg_flags &
EXT2_BG_BLOCK_UNINIT))
@@ -321,7 +333,7 @@ static void check_inode_bitmaps(e2fsck_t
errcode_t retval;
struct problem_context pctx;
int problem, save_problem, fixit, had_problem;
- int lazy_bg = 0;
+ int lazy_flag, csum_flag;
int skip_group = 0;

clear_problem_context(&pctx);
@@ -358,16 +370,16 @@ static void check_inode_bitmaps(e2fsck_t
goto errout;
}

- if (EXT2_HAS_COMPAT_FEATURE(fs->super,
- EXT2_FEATURE_COMPAT_LAZY_BG))
- lazy_bg++;
-
+ lazy_flag = EXT2_HAS_COMPAT_FEATURE(fs->super,
+ EXT2_FEATURE_COMPAT_LAZY_BG);
+ csum_flag = EXT2_HAS_RO_COMPAT_FEATURE(fs->super,
+ EXT4_FEATURE_RO_COMPAT_GDT_CSUM);
redo_counts:
had_problem = 0;
save_problem = 0;
pctx.ino = pctx.ino2 = 0;
- if (lazy_bg && (fs->group_desc[group].bg_flags &
- EXT2_BG_INODE_UNINIT))
+ if ((lazy_flag || csum_flag) &&
+ (fs->group_desc[group].bg_flags & EXT2_BG_INODE_UNINIT))
skip_group++;

/* Protect loop from wrap-around if inodes_count is maxed */
@@ -390,6 +402,21 @@ redo_counts:
* Inode used, but not in bitmap
*/
problem = PR_5_INODE_USED;
+
+ /* We should never hit this, because it means that
+ * inodes were marked in use that weren't noticed
+ * in pass1 or pass 2. It is easier to fix the problem
+ * than to kill e2fsck and leave the user stuck. */
+ if (skip_group) {
+ struct problem_context pctx2;
+ pctx2.blk = i;
+ pctx2.group = group;
+ if (fix_problem(ctx, PR_5_INODE_UNINIT,&pctx2)){
+ fs->group_desc[group].bg_flags &=
+ ~EXT2_BG_INODE_UNINIT;
+ skip_group = 0;
+ }
+ }
}
if (pctx.ino == 0) {
pctx.ino = pctx.ino2 = i;
@@ -411,7 +438,7 @@ do_counts:
if (bitmap) {
if (ext2fs_test_inode_bitmap(ctx->inode_dir_map, i))
dirs_count++;
- } else if (!skip_group) {
+ } else if (!skip_group || csum_flag) {
group_free++;
free_inodes++;
}
@@ -430,7 +457,7 @@ do_counts:
group + fs->group_desc_count,
fs->group_desc_count*2))
goto errout;
- if (lazy_bg &&
+ if ((lazy_flag || csum_flag) &&
(i != fs->super->s_inodes_count) &&
(fs->group_desc[group].bg_flags &
EXT2_BG_INODE_UNINIT))
Index: e2fsprogs-1.40.5/e2fsck/problem.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/problem.c
+++ e2fsprogs-1.40.5/e2fsck/problem.c
@@ -351,8 +351,33 @@ static struct e2fsck_problem problem_tab
N_("Adding dirhash hint to @f.\n\n"),
PROMPT_NONE, 0 },

+ /* Group descriptor N checksum is invalid. */
+ { PR_0_GDT_CSUM,
+ N_("@g descriptor %g checksum is invalid. "),
+ PROMPT_FIX, PR_PREEN_OK },
+
+ /* Group descriptor N marked uninitialized without feature set. */
+ { PR_0_GDT_UNINIT,
+ N_("@g descriptor %g marked uninitialized without feature set.\n"),
+ PROMPT_FIX, PR_PREEN_OK },
+
+ /* Group N block bitmap uninitialized but inode bitmap in use. */
+ { PR_0_BB_UNINIT_IB_INIT,
+ N_("@g %g @b @B uninitialized but @i @B in use.\n"),
+ PROMPT_FIX, PR_PREEN_OK },
+
+ /* Group descriptor N has invalid unused inodes count. */
+ { PR_0_GDT_ITABLE_UNUSED,
+ N_("@g descriptor %g has invalid unused inodes count %b. "),
+ PROMPT_FIX, PR_PREEN_OK },
+
+ /* Last group block bitmap uninitialized. */
+ { PR_0_BB_UNINIT_LAST,
+ N_("last @g @b @B uninitialized. "),
+ PROMPT_FIX, PR_PREEN_OK },
+
/* Pass 1 errors */
-
+
/* Pass 1: Checking inodes, blocks, and sizes */
{ PR_1_PASS_HEADER,
N_("Pass 1: Checking @is, @bs, and sizes\n"),
@@ -1236,6 +1261,16 @@ static struct e2fsck_problem problem_tab
N_("i_blocks_hi @F %N, @s zero.\n"),
PROMPT_CLEAR, 0 },

+ /* Inode found in group where _INODE_UNINIT is set */
+ { PR_2_INOREF_BG_INO_UNINIT,
+ N_("@i %i found in @g %g where _INODE_UNINIT is set. "),
+ PROMPT_FIX, PR_PREEN_OK },
+
+ /* Inode found in group unused inodes area */
+ { PR_2_INOREF_IN_UNUSED,
+ N_("@i %i found in @g %g unused inodes area. "),
+ PROMPT_FIX, PR_PREEN_OK },
+
/* Pass 3 errors */

/* Pass 3: Checking directory connectivity */
@@ -1542,6 +1577,16 @@ static struct e2fsck_problem problem_tab
" +(%i--%j)",
PROMPT_NONE, PR_LATCH_IBITMAP | PR_PREEN_OK | PR_PREEN_NOMSG },

+ /* Group N block(s) in use but group is marked BLOCK_UNINIT */
+ { PR_5_BLOCK_UNINIT,
+ N_("@g %g @b(s) in use but @g is marked BLOCK_UNINIT\n"),
+ PROMPT_FIX, PR_PREEN_OK },
+
+ /* Group N inode(s) in use but group is marked INODE_UNINIT */
+ { PR_5_INODE_UNINIT,
+ N_("@g %g @i(s) in use but @g is marked INODE_UNINIT\n"),
+ PROMPT_FIX, PR_PREEN_OK },
+
/* Recreate journal if E2F_FLAG_JOURNAL_INODE flag is set */
{ PR_6_RECREATE_JOURNAL,
N_("Recreate journal to make the filesystem ext3 again?\n"),
Index: e2fsprogs-1.40.5/e2fsck/problem.h
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/problem.h
+++ e2fsprogs-1.40.5/e2fsck/problem.h
@@ -197,6 +197,21 @@ struct problem_context {
/* Superblock hint for external journal incorrect */
#define PR_0_DIRHASH_HINT 0x000034

+/* Group descriptor N checksum is invalid */
+#define PR_0_GDT_CSUM 0x000035
+
+/* Group descriptor N marked uninitialized without feature set. */
+#define PR_0_GDT_UNINIT 0x000036
+
+/* Block bitmap is uninitialised but Inode bitmap in use. */
+#define PR_0_BB_UNINIT_IB_INIT 0x000037
+
+/* Group descriptor N has invalid unused inodes count. */
+#define PR_0_GDT_ITABLE_UNUSED 0x000038
+
+/* Last group block bitmap is uninitialized. */
+#define PR_0_BB_UNINIT_LAST 0x000039
+
/*
* Pass 1 errors
*/
@@ -742,6 +757,12 @@ struct problem_context {
/* i_blocks_hi should be zero */
#define PR_2_BLOCKS_HI_ZERO 0x020044

+/* Inode found in group where _INODE_UNINIT is set */
+#define PR_2_INOREF_BG_INO_UNINIT 0x020045
+
+/* Inode found in group unused inodes area */
+#define PR_2_INOREF_IN_UNUSED 0x020046
+
/*
* Pass 3 errors
*/
@@ -930,10 +951,16 @@ struct problem_context {

/* Inode range not used, but marked in bitmap */
#define PR_5_INODE_RANGE_UNUSED 0x050016
-
+
/* Inode rangeused, but not marked used in bitmap */
#define PR_5_INODE_RANGE_USED 0x050017

+/* Block in use but group is marked BLOCK_UNINIT */
+#define PR_5_BLOCK_UNINIT 0x050018
+
+/* Inode in use but group is marked INODE_UNINIT */
+#define PR_5_INODE_UNINIT 0x050019
+
/*
* Post-Pass 5 errors
*/
Index: e2fsprogs-1.40.5/e2fsck/super.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/super.c
+++ e2fsprogs-1.40.5/e2fsck/super.c
@@ -469,6 +469,7 @@ void check_super_block(e2fsck_t ctx)
struct problem_context pctx;
blk_t free_blocks = 0;
ino_t free_inodes = 0;
+ int lazy_flag, csum_flag;

inodes_per_block = EXT2_INODES_PER_BLOCK(fs->super);
ipg_max = inodes_per_block * (blocks_per_group - 4);
@@ -578,6 +579,10 @@ void check_super_block(e2fsck_t ctx)
first_block = sb->s_first_data_block;
last_block = sb->s_blocks_count-1;

+ lazy_flag = EXT2_HAS_COMPAT_FEATURE(fs->super,
+ EXT2_FEATURE_COMPAT_LAZY_BG);
+ csum_flag = EXT2_HAS_RO_COMPAT_FEATURE(fs->super,
+ EXT4_FEATURE_RO_COMPAT_GDT_CSUM);
for (i = 0, gd=fs->group_desc; i < fs->group_desc_count; i++, gd++) {
pctx.group = i;

@@ -626,6 +631,50 @@ void check_super_block(e2fsck_t ctx)
(gd->bg_used_dirs_count > sb->s_inodes_per_group))
ext2fs_unmark_valid(fs);

+ if (!ext2fs_group_desc_csum_verify(sb, i, gd)) {
+ if (fix_problem(ctx, PR_0_GDT_CSUM, &pctx)) {
+ gd->bg_flags &= ~(EXT2_BG_BLOCK_UNINIT |
+ EXT2_BG_INODE_UNINIT);
+ gd->bg_itable_unused = 0;
+ }
+ ext2fs_unmark_valid(fs);
+ }
+
+ if (!lazy_flag && !csum_flag &&
+ (gd->bg_flags &(EXT2_BG_BLOCK_UNINIT|EXT2_BG_INODE_UNINIT)||
+ gd->bg_itable_unused != 0)){
+ if (fix_problem(ctx, PR_0_GDT_UNINIT, &pctx)) {
+ gd->bg_flags &= ~(EXT2_BG_BLOCK_UNINIT |
+ EXT2_BG_INODE_UNINIT);
+ gd->bg_itable_unused = 0;
+ }
+ ext2fs_unmark_valid(fs);
+ }
+
+ if (i == fs->group_desc_count - 1 &&
+ gd->bg_flags & EXT2_BG_BLOCK_UNINIT) {
+ if (fix_problem(ctx, PR_0_BB_UNINIT_LAST, &pctx))
+ gd->bg_flags &= ~EXT2_BG_BLOCK_UNINIT;
+ ext2fs_unmark_valid(fs);
+ }
+
+ if (gd->bg_flags & EXT2_BG_BLOCK_UNINIT &&
+ !(gd->bg_flags & EXT2_BG_INODE_UNINIT)) {
+ if (fix_problem(ctx, PR_0_BB_UNINIT_IB_INIT, &pctx))
+ gd->bg_flags &= ~EXT2_BG_BLOCK_UNINIT;
+ ext2fs_unmark_valid(fs);
+ }
+
+ if (csum_flag &&
+ (gd->bg_itable_unused > gd->bg_free_inodes_count ||
+ gd->bg_itable_unused > sb->s_inodes_per_group)) {
+ pctx.blk = gd->bg_itable_unused;
+ if (fix_problem(ctx, PR_0_GDT_ITABLE_UNUSED, &pctx))
+ gd->bg_itable_unused = 0;
+ ext2fs_unmark_valid(fs);
+ }
+
+ gd->bg_checksum = ext2fs_group_desc_csum(fs->super, i, gd);
}

/*
Index: e2fsprogs-1.40.5/lib/ext2fs/Makefile.in
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/Makefile.in
+++ e2fsprogs-1.40.5/lib/ext2fs/Makefile.in
@@ -67,7 +67,9 @@ OBJS= $(DEBUGFS_LIB_OBJS) $(RESIZE_LIB_O
unix_io.o \
unlink.o \
valid_blk.o \
- version.o
+ version.o \
+ crc16.o \
+ csum.o

SRCS= ext2_err.c \
$(srcdir)/alloc.c \
@@ -131,6 +133,9 @@ SRCS= ext2_err.c \
$(srcdir)/tst_getsize.c \
$(srcdir)/tst_types.c \
$(srcdir)/tst_iscan.c \
+ $(srcdir)/tst_csum.c \
+ $(srcdir)/crc16.c \
+ $(srcdir)/csum.c \
$(srcdir)/unix_io.c \
$(srcdir)/unlink.c \
$(srcdir)/valid_blk.c \
@@ -189,11 +194,13 @@ ext2fs.pc: $(srcdir)/ext2fs.pc.in $(top_
@cd $(top_builddir); CONFIG_FILES=lib/ext2fs/ext2fs.pc ./config.status

tst_badblocks: tst_badblocks.o freefs.o bitmaps.o rw_bitmaps.o \
- read_bb_file.o write_bb_file.o badblocks.o
+ read_bb_file.o write_bb_file.o badblocks.o csum.o \
+ crc16.o $(STATIC_LIBEXT2FS)
@echo " LD $@"
@$(CC) -o tst_badblocks tst_badblocks.o freefs.o \
read_bb_file.o write_bb_file.o badblocks.o rw_bitmaps.o \
- inline.o bitops.o gen_bitmap.o bitmaps.o $(LIBCOM_ERR)
+ inline.o bitops.o gen_bitmap.o bitmaps.o csum.o crc16.o \
+ $(STATIC_LIBEXT2FS) $(LIBCOM_ERR)

tst_icount: icount.c initialize.o $(STATIC_LIBEXT2FS)
@echo " LD $@"
@@ -225,6 +232,11 @@ tst_bitops: tst_bitops.o inline.o $(STAT
@$(CC) -o tst_bitops tst_bitops.o inline.o $(ALL_CFLAGS) \
$(STATIC_LIBEXT2FS) $(LIBCOM_ERR)

+tst_csum: tst_csum.o csum.o crc16.o $(STATIC_LIBEXT2FS) $(STATIC_LIBUUID)
+ @echo " LD $@"
+ @$(CC) -o tst_csum csum.o tst_csum.o crc16.o $(STATIC_LIBEXT2FS) \
+ $(STATIC_LIBUUID) $(LIBCOM_ERR)
+
tst_getsectsize: tst_getsectsize.o getsectsize.o $(STATIC_LIBEXT2FS)
@echo " LD $@"
@$(CC) -o tst_sectgetsize tst_getsectsize.o getsectsize.o \
@@ -246,13 +258,15 @@ mkjournal: mkjournal.c $(STATIC_LIBEXT2F
@echo " LD $@"
@$(CC) -o mkjournal $(srcdir)/mkjournal.c -DDEBUG $(STATIC_LIBEXT2FS) $(LIBCOM_ERR) $(ALL_CFLAGS)

-check:: tst_bitops tst_badblocks tst_iscan tst_types tst_icount tst_super_size
+check:: tst_bitops tst_badblocks tst_iscan tst_types tst_icount tst_super_size \
+ tst_csum
LD_LIBRARY_PATH=$(LIB) DYLD_LIBRARY_PATH=$(LIB) ./tst_bitops
LD_LIBRARY_PATH=$(LIB) DYLD_LIBRARY_PATH=$(LIB) ./tst_badblocks
LD_LIBRARY_PATH=$(LIB) DYLD_LIBRARY_PATH=$(LIB) ./tst_iscan
LD_LIBRARY_PATH=$(LIB) DYLD_LIBRARY_PATH=$(LIB) ./tst_types
LD_LIBRARY_PATH=$(LIB) DYLD_LIBRARY_PATH=$(LIB) ./tst_icount
LD_LIBRARY_PATH=$(LIB) DYLD_LIBRARY_PATH=$(LIB) ./tst_super_size
+ LD_LIBRARY_PATH=$(LIB) DYLD_LIBRARY_PATH=$(LIB) ./tst_csum

installdirs::
@echo " MKINSTALLDIRS $(libdir) $(includedir)/ext2fs"
@@ -360,6 +374,10 @@ closefs.o: $(srcdir)/closefs.c $(srcdir)
$(srcdir)/ext2fs.h $(srcdir)/ext2_fs.h $(srcdir)/ext3_extents.h \
$(top_srcdir)/lib/et/com_err.h $(srcdir)/ext2_io.h \
$(top_builddir)/lib/ext2fs/ext2_err.h $(srcdir)/bitops.h
+crc16.o: $(srcdir)/crc16.c $(srcdir)/ext2_fs.h $(srcdir)/crc16.h \
+ $(top_builddir)/lib/ext2fs/ext2_types.h $(srcdir)/ext2fs.h $(srcdir)/ext2_fs.h
+csum.o: $(srcdir)/csum.c $(srcdir)/ext2_fs.h \
+ $(top_builddir)/lib/ext2fs/ext2_types.h $(srcdir)/ext2fs.h $(srcdir)/ext2_fs.h
dblist.o: $(srcdir)/dblist.c $(srcdir)/ext2_fs.h \
$(top_builddir)/lib/ext2fs/ext2_types.h $(srcdir)/ext2fsP.h \
$(srcdir)/ext2fs.h $(srcdir)/ext2_fs.h $(srcdir)/ext3_extents.h \
@@ -570,6 +588,8 @@ tst_byteswap.o: $(srcdir)/tst_byteswap.c
$(top_builddir)/lib/ext2fs/ext2_types.h $(srcdir)/ext2fs.h \
$(srcdir)/ext2_fs.h $(srcdir)/ext3_extents.h $(top_srcdir)/lib/et/com_err.h \
$(srcdir)/ext2_io.h $(top_builddir)/lib/ext2fs/ext2_err.h $(srcdir)/bitops.h
+tst_csum.o: $(srcdir)/tst_csum.c $(srcdir)/ext2_fs.h \
+ $(top_builddir)/lib/ext2fs/ext2_types.h $(srcdir)/ext2fs.h
tst_getsize.o: $(srcdir)/tst_getsize.c $(srcdir)/ext2_fs.h \
$(top_builddir)/lib/ext2fs/ext2_types.h $(srcdir)/ext2fs.h \
$(srcdir)/ext2_fs.h $(srcdir)/ext3_extents.h $(top_srcdir)/lib/et/com_err.h \
Index: e2fsprogs-1.40.5/lib/ext2fs/alloc_stats.c
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/alloc_stats.c
+++ e2fsprogs-1.40.5/lib/ext2fs/alloc_stats.c
@@ -27,6 +27,27 @@ void ext2fs_inode_alloc_stats2(ext2_fils
fs->group_desc[group].bg_free_inodes_count -= inuse;
if (isdir)
fs->group_desc[group].bg_used_dirs_count += inuse;
+
+ /* We don't strictly need to be clearing these if inuse < 0
+ * (i.e. freeing inodes) but it also means something is bad. */
+ fs->group_desc[group].bg_flags &= ~(EXT2_BG_INODE_UNINIT |
+ EXT2_BG_BLOCK_UNINIT);
+ if (EXT2_HAS_RO_COMPAT_FEATURE(fs->super,
+ EXT4_FEATURE_RO_COMPAT_GDT_CSUM)) {
+ ext2_ino_t first_unused_inode = fs->super->s_inodes_per_group -
+ fs->group_desc[group].bg_itable_unused +
+ group * fs->super->s_inodes_per_group + 1;
+
+ if (ino >= first_unused_inode)
+ fs->group_desc[group].bg_itable_unused =
+ group * fs->super->s_inodes_per_group +
+ fs->super->s_inodes_per_group - ino;
+
+ fs->group_desc[group].bg_checksum =
+ ext2fs_group_desc_csum(fs->super, group,
+ &fs->group_desc[group]);
+ }
+
fs->super->s_free_inodes_count -= inuse;
ext2fs_mark_super_dirty(fs);
ext2fs_mark_ib_dirty(fs);
@@ -46,6 +67,10 @@ void ext2fs_block_alloc_stats(ext2_filsy
else
ext2fs_unmark_block_bitmap(fs->block_map, blk);
fs->group_desc[group].bg_free_blocks_count -= inuse;
+ fs->group_desc[group].bg_flags &= ~EXT2_BG_BLOCK_UNINIT;
+ fs->group_desc[group].bg_checksum =
+ ext2fs_group_desc_csum(fs->super, group,&fs->group_desc[group]);
+
fs->super->s_free_blocks_count -= inuse;
ext2fs_mark_super_dirty(fs);
ext2fs_mark_bb_dirty(fs);
Index: e2fsprogs-1.40.5/lib/ext2fs/alloc_tables.c
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/alloc_tables.c
+++ e2fsprogs-1.40.5/lib/ext2fs/alloc_tables.c
@@ -95,13 +95,12 @@ errcode_t ext2fs_allocate_group_table(ex
ext2fs_mark_block_bitmap(bmap, blk);
fs->group_desc[group].bg_inode_table = new_blk;
}
+ fs->group_desc[group].bg_checksum =
+ ext2fs_group_desc_csum(fs->super, group,&fs->group_desc[group]);

-
return 0;
}

-
-
errcode_t ext2fs_allocate_tables(ext2_filsys fs)
{
errcode_t retval;
Index: e2fsprogs-1.40.5/lib/ext2fs/ext2_fs.h
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/ext2_fs.h
+++ e2fsprogs-1.40.5/lib/ext2fs/ext2_fs.h
@@ -173,6 +173,7 @@ struct ext4_group_desc

#define EXT2_BG_INODE_UNINIT 0x0001 /* Inode table/bitmap not initialized */
#define EXT2_BG_BLOCK_UNINIT 0x0002 /* Block bitmap not initialized */
+#define EXT2_BG_INODE_ZEROED 0x0004 /* On-disk itable initialized to zero */

/*
* Data structures used by the directory indexing feature
Index: e2fsprogs-1.40.5/lib/ext2fs/ext2fs.h
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/ext2fs.h
+++ e2fsprogs-1.40.5/lib/ext2fs/ext2fs.h
@@ -324,6 +324,7 @@ typedef struct ext2_struct_inode_scan *e
#define EXT2_SF_BAD_EXTRA_BYTES 0x0004
#define EXT2_SF_SKIP_MISSING_ITABLE 0x0008
#define EXT2_SF_DO_LAZY 0x0010
+#define EXT2_SF_DO_CSUM 0x0020

/*
* ext2fs_check_if_mounted flags
@@ -448,7 +449,8 @@ typedef struct ext2_icount *ext2_icount_
#endif
#define EXT2_LIB_FEATURE_RO_COMPAT_SUPP (EXT2_FEATURE_RO_COMPAT_SPARSE_SUPER|\
EXT2_FEATURE_RO_COMPAT_LARGE_FILE|\
- EXT4_FEATURE_RO_COMPAT_DIR_NLINK)
+ EXT4_FEATURE_RO_COMPAT_DIR_NLINK|\
+ EXT4_FEATURE_RO_COMPAT_GDT_CSUM)

/*
* These features are only allowed if EXT2_FLAG_SOFTSUPP_FEATURES is passed
@@ -456,7 +458,6 @@ typedef struct ext2_icount *ext2_icount_
*/
#define EXT2_LIB_SOFTSUPP_INCOMPAT (EXT3_FEATURE_INCOMPAT_EXTENTS)
#define EXT2_LIB_SOFTSUPP_RO_COMPAT (EXT4_FEATURE_RO_COMPAT_HUGE_FILE|\
- EXT4_FEATURE_RO_COMPAT_GDT_CSUM|\
EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE)

/*
@@ -634,6 +635,13 @@ extern int ext2fs_super_and_bgd_loc(ext2
int *ret_meta_bg);
extern void ext2fs_update_dynamic_rev(ext2_filsys fs);

+/* csum.c */
+extern __u16 ext2fs_group_desc_csum(struct ext2_super_block *sb, __u32 group,
+ struct ext2_group_desc *desc);
+extern int ext2fs_group_desc_csum_verify(struct ext2_super_block *sb,
+ __u32 group, struct ext2_group_desc *desc);
+extern void ext2fs_set_gdt_csum(ext2_filsys fs);
+
/* dblist.c */

extern errcode_t ext2fs_get_num_dirs(ext2_filsys fs, ext2_ino_t *ret_num_dirs);
Index: e2fsprogs-1.40.5/lib/ext2fs/inode.c
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/inode.c
+++ e2fsprogs-1.40.5/lib/ext2fs/inode.c
@@ -167,6 +167,9 @@ errcode_t ext2fs_open_inode_scan(ext2_fi
if (EXT2_HAS_COMPAT_FEATURE(fs->super,
EXT2_FEATURE_COMPAT_LAZY_BG))
scan->scan_flags |= EXT2_SF_DO_LAZY;
+ if (EXT2_HAS_RO_COMPAT_FEATURE(fs->super,
+ EXT4_FEATURE_RO_COMPAT_GDT_CSUM))
+ scan->scan_flags |= EXT2_SF_DO_LAZY | EXT2_SF_DO_CSUM;
*ret_scan = scan;
return 0;
}
@@ -218,18 +221,29 @@ int ext2fs_inode_scan_flags(ext2_inode_s
*/
static errcode_t get_next_blockgroup(ext2_inode_scan scan)
{
+ ext2_filsys fs = scan->fs;
+
scan->current_group++;
scan->groups_left--;
-
- scan->current_block = scan->fs->
- group_desc[scan->current_group].bg_inode_table;
+
+ scan->current_block =fs->group_desc[scan->current_group].bg_inode_table;

scan->current_inode = scan->current_group *
- EXT2_INODES_PER_GROUP(scan->fs->super);
+ EXT2_INODES_PER_GROUP(fs->super);

scan->bytes_left = 0;
- scan->inodes_left = EXT2_INODES_PER_GROUP(scan->fs->super);
- scan->blocks_left = scan->fs->inode_blocks_per_group;
+ scan->inodes_left = EXT2_INODES_PER_GROUP(fs->super);
+ scan->blocks_left = fs->inode_blocks_per_group;
+ if (EXT2_HAS_RO_COMPAT_FEATURE(fs->super,
+ EXT4_FEATURE_RO_COMPAT_GDT_CSUM)) {
+ scan->inodes_left -=
+ fs->group_desc[scan->current_group].bg_itable_unused;
+ scan->blocks_left =
+ (scan->inodes_left +
+ (fs->blocksize / scan->inode_size - 1)) *
+ scan->inode_size / fs->blocksize;
+ }
+
return 0;
}

@@ -417,6 +431,8 @@ errcode_t ext2fs_get_next_inode_full(ext
(scan->fs->group_desc[scan->current_group].bg_flags &
EXT2_BG_INODE_UNINIT))
goto force_new_group;
+ if (scan->inodes_left == 0)
+ goto force_new_group;
if (scan->current_block == 0) {
if (scan->scan_flags & EXT2_SF_SKIP_MISSING_ITABLE) {
goto force_new_group;
Index: e2fsprogs-1.40.5/lib/ext2fs/rw_bitmaps.c
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/rw_bitmaps.c
+++ e2fsprogs-1.40.5/lib/ext2fs/rw_bitmaps.c
@@ -152,8 +152,10 @@ static errcode_t read_bitmaps(ext2_filsy

fs->write_bitmaps = ext2fs_write_bitmaps;

- if (EXT2_HAS_COMPAT_FEATURE(fs->super,
- EXT2_FEATURE_COMPAT_LAZY_BG))
+ if (EXT2_HAS_COMPAT_FEATURE(fs->super,
+ EXT2_FEATURE_COMPAT_LAZY_BG) ||
+ EXT2_HAS_RO_COMPAT_FEATURE(fs->super,
+ EXT4_FEATURE_RO_COMPAT_GDT_CSUM))
lazy_flag = 1;

retval = ext2fs_get_mem(strlen(fs->device_name) + 80, &buf);
@@ -233,7 +235,9 @@ static errcode_t read_bitmaps(ext2_filsy
if (block_bitmap) {
blk = fs->group_desc[i].bg_block_bitmap;
if (lazy_flag && fs->group_desc[i].bg_flags &
- EXT2_BG_BLOCK_UNINIT)
+ EXT2_BG_BLOCK_UNINIT &&
+ ext2fs_group_desc_csum_verify(fs->super, i,
+ &fs->group_desc[i]))
blk = 0;
if (blk) {
retval = io_channel_read_blk(fs->io, blk,
@@ -254,7 +258,9 @@ static errcode_t read_bitmaps(ext2_filsy
if (inode_bitmap) {
blk = fs->group_desc[i].bg_inode_bitmap;
if (lazy_flag && fs->group_desc[i].bg_flags &
- EXT2_BG_INODE_UNINIT)
+ EXT2_BG_INODE_UNINIT &&
+ ext2fs_group_desc_csum_verify(fs->super, i,
+ &fs->group_desc[i]))
blk = 0;
if (blk) {
retval = io_channel_read_blk(fs->io, blk,
Index: e2fsprogs-1.40.5/misc/dumpe2fs.c
===================================================================
--- e2fsprogs-1.40.5.orig/misc/dumpe2fs.c
+++ e2fsprogs-1.40.5/misc/dumpe2fs.c
@@ -112,7 +112,8 @@ static void print_bg_opts(ext2_filsys fs
{
int first = 1, bg_flags;

- if (fs->super->s_feature_compat & EXT2_FEATURE_COMPAT_LAZY_BG)
+ if (fs->super->s_feature_compat & EXT2_FEATURE_COMPAT_LAZY_BG ||
+ fs->super->s_feature_ro_compat & EXT4_FEATURE_RO_COMPAT_GDT_CSUM)
bg_flags = fs->group_desc[i].bg_flags;
else
bg_flags = 0;
@@ -210,11 +211,15 @@ static void list_desc (ext2_filsys fs)
diff = fs->group_desc[i].bg_inode_table - first_block;
if (diff > 0)
printf(" (+%ld)", diff);
- printf (_("\n %d free blocks, %d free inodes, "
- "%d directories\n"),
+ printf (_("\n %u free blocks, %u free inodes, "
+ "%u directories%s"),
fs->group_desc[i].bg_free_blocks_count,
fs->group_desc[i].bg_free_inodes_count,
- fs->group_desc[i].bg_used_dirs_count);
+ fs->group_desc[i].bg_used_dirs_count,
+ fs->group_desc[i].bg_itable_unused ? "" : "\n");
+ if (fs->group_desc[i].bg_itable_unused)
+ printf (_(", %u unused inodes\n"),
+ fs->group_desc[i].bg_itable_unused);
if (block_bitmap) {
fputs(_(" Free blocks: "), stdout);
ext2fs_get_block_bitmap_range(fs->block_map,
Index: e2fsprogs-1.40.5/misc/mke2fs.c
===================================================================
--- e2fsprogs-1.40.5.orig/misc/mke2fs.c
+++ e2fsprogs-1.40.5/misc/mke2fs.c
@@ -432,6 +432,8 @@ static void write_inode_tables(ext2_fils
num, blk, error_message(retval));
exit(1);
}
+ /* The kernel doesn't need to zero the itable blocks */
+ fs->group_desc[i].bg_flags |= EXT2_BG_INODE_ZEROED;
}
if (sync_kludge) {
if (sync_kludge == 1)
@@ -447,34 +449,49 @@ static void write_inode_tables(ext2_fils
static void setup_lazy_bg(ext2_filsys fs)
{
dgrp_t i;
- int blks;
+ int blks, csum_flag;
struct ext2_super_block *sb = fs->super;
struct ext2_group_desc *bg = fs->group_desc;

- if (EXT2_HAS_COMPAT_FEATURE(fs->super,
- EXT2_FEATURE_COMPAT_LAZY_BG)) {
+ csum_flag = EXT2_HAS_RO_COMPAT_FEATURE(fs->super,
+ EXT4_FEATURE_RO_COMPAT_GDT_CSUM);
+ if (EXT2_HAS_COMPAT_FEATURE(fs->super, EXT2_FEATURE_COMPAT_LAZY_BG) ||
+ csum_flag) {
for (i = 0; i < fs->group_desc_count; i++, bg++) {
if ((i == 0) ||
- (i == fs->group_desc_count-1))
+ (i == fs->group_desc_count - 1 && !csum_flag))
continue;
if (bg->bg_free_inodes_count ==
sb->s_inodes_per_group) {
- bg->bg_free_inodes_count = 0;
bg->bg_flags |= EXT2_BG_INODE_UNINIT;
- sb->s_free_inodes_count -=
- sb->s_inodes_per_group;
+ if (!csum_flag) {
+ bg->bg_free_inodes_count = 0;
+ sb->s_free_inodes_count -=
+ sb->s_inodes_per_group;
+ }
}
+
+ /* Skip groups with GDT backups because the resize
+ * inode has blocks allocated in them, and the last
+ * group because it may need block bitmap padding. */
+ if ((ext2fs_bg_has_super(fs, i) &&
+ sb->s_reserved_gdt_blocks) ||
+ i == fs->group_desc_count - 1)
+ continue;
+
blks = ext2fs_super_and_bgd_loc(fs, i, 0, 0, 0, 0);
- if (bg->bg_free_blocks_count == blks) {
- bg->bg_free_blocks_count = 0;
+ if (bg->bg_free_blocks_count == blks &&
+ bg->bg_flags & EXT2_BG_INODE_UNINIT) {
bg->bg_flags |= EXT2_BG_BLOCK_UNINIT;
- sb->s_free_blocks_count -= blks;
+ if (!csum_flag) {
+ bg->bg_free_blocks_count = 0;
+ sb->s_free_blocks_count -= blks;
+ }
}
}
}
}

-
static void create_root_dir(ext2_filsys fs)
{
errcode_t retval;
@@ -879,7 +896,8 @@ static __u32 ok_features[3] = {
EXT3_FEATURE_INCOMPAT_JOURNAL_DEV|
EXT2_FEATURE_INCOMPAT_META_BG|
EXT4_FEATURE_INCOMPAT_FLEX_BG,
- EXT2_FEATURE_RO_COMPAT_SPARSE_SUPER /* R/O compat */
+ EXT2_FEATURE_RO_COMPAT_SPARSE_SUPER| /* R/O compat */
+ EXT4_FEATURE_RO_COMPAT_GDT_CSUM
};


@@ -1758,6 +1776,8 @@ int main (int argc, char *argv[])
}
no_journal:

+ if (!super_only)
+ ext2fs_set_gdt_csum(fs);
if (!quiet)
printf(_("Writing superblocks and "
"filesystem accounting information: "));
Index: e2fsprogs-1.40.5/misc/tune2fs.c
===================================================================
--- e2fsprogs-1.40.5.orig/misc/tune2fs.c
+++ e2fsprogs-1.40.5/misc/tune2fs.c
@@ -100,7 +100,8 @@ static __u32 ok_features[3] = {
EXT2_FEATURE_COMPAT_DIR_INDEX, /* Compat */
EXT2_FEATURE_INCOMPAT_FILETYPE| /* Incompat */
EXT4_FEATURE_INCOMPAT_FLEX_BG,
- EXT2_FEATURE_RO_COMPAT_SPARSE_SUPER /* R/O compat */
+ EXT2_FEATURE_RO_COMPAT_SPARSE_SUPER | /* R/O compat */
+ EXT4_FEATURE_RO_COMPAT_GDT_CSUM
};

/*
@@ -215,6 +216,8 @@ static int release_blocks_proc(ext2_fils
ext2fs_unmark_block_bitmap(fs->block_map,block);
group = ext2fs_group_of_blk(fs, block);
fs->group_desc[group].bg_free_blocks_count++;
+ fs->group_desc[group].bg_checksum =
+ ext2fs_group_desc_csum(fs->super, group,&fs->group_desc[group]);
fs->super->s_free_blocks_count++;
return 0;
}
@@ -284,7 +287,7 @@ static void update_mntopts(ext2_filsys f
static void update_feature_set(ext2_filsys fs, char *features)
{
int sparse, old_sparse, filetype, old_filetype;
- int journal, old_journal, dxdir, old_dxdir;
+ int journal, old_journal, dxdir, old_dxdir, uninit, old_uninit;
int flex_bg, old_flex_bg;
struct ext2_super_block *sb= fs->super;
__u32 old_compat, old_incompat, old_ro_compat;
@@ -303,6 +306,8 @@ static void update_feature_set(ext2_fils
EXT3_FEATURE_COMPAT_HAS_JOURNAL;
old_dxdir = sb->s_feature_compat &
EXT2_FEATURE_COMPAT_DIR_INDEX;
+ old_uninit = sb->s_feature_ro_compat &
+ EXT4_FEATURE_RO_COMPAT_GDT_CSUM;
if (e2p_edit_feature(features, &sb->s_feature_compat,
ok_features)) {
fprintf(stderr, _("Invalid filesystem option set: %s\n"),
@@ -319,6 +324,8 @@ static void update_feature_set(ext2_fils
EXT3_FEATURE_COMPAT_HAS_JOURNAL;
dxdir = sb->s_feature_compat &
EXT2_FEATURE_COMPAT_DIR_INDEX;
+ uninit = sb->s_feature_ro_compat &
+ EXT4_FEATURE_RO_COMPAT_GDT_CSUM;
if (old_journal && !journal) {
if ((mount_flags & EXT2_MF_MOUNTED) &&
!(mount_flags & EXT2_MF_READONLY)) {
@@ -373,6 +380,7 @@ static void update_feature_set(ext2_fils
sb->s_feature_incompat))
ext2fs_update_dynamic_rev(fs);
if ((sparse != old_sparse) ||
+ (uninit != old_uninit) ||
(filetype != old_filetype)) {
sb->s_state &= ~EXT2_VALID_FS;
printf("\n%s\n", _(please_fsck));
Index: e2fsprogs-1.40.5/misc/tune2fs.8.in
===================================================================
--- e2fsprogs-1.40.5.orig/misc/tune2fs.8.in
+++ e2fsprogs-1.40.5/misc/tune2fs.8.in
@@ -411,10 +411,18 @@ option.
.TP
.B sparse_super
Limit the number of backup superblocks to save space on large filesystems.
+.TP
+.B uninit_groups
+Allow the kernel to initialize bitmaps and inode tables and keep a high
+watermark for the unused inodes in a filesystem, to reduce
+.BR e2fsck (8)
+time. The first e2fsck run after enabling this feature will take the
+full time, but subsequent e2fsck runs will take only a fraction of the
+original time, depending on how full the file system is.
.RE
.IP
After setting or clearing
-.B sparse_super
+.BR sparse_super , " uninit_groups" ,
and
.B filetype
filesystem features,
@@ -433,7 +441,9 @@ can be run to convert existing directori
Linux kernels before 2.0.39 and many 2.1 series kernels do not support
the filesystems that use any of these features.
Enabling certain filesystem features may prevent the filesystem from
-being mounted by kernels which do not support those features.
+being mounted by kernels which do not support those features. The
+.B uninit_groups
+feature is not yet supported by any officially released kernel.
.TP
.BI \-r " reserved-blocks-count"
Set the number of reserved filesystem blocks.
Index: e2fsprogs-1.40.5/resize/resize2fs.c
===================================================================
--- e2fsprogs-1.40.5.orig/resize/resize2fs.c
+++ e2fsprogs-1.40.5/resize/resize2fs.c
@@ -339,7 +339,9 @@ retry:
numblocks = fs->super->s_blocks_per_group;
i = old_fs->group_desc_count - 1;
fs->group_desc[i].bg_free_blocks_count += (numblocks-old_numblocks);
-
+ fs->group_desc[i].bg_checksum =
+ ext2fs_group_desc_csum(fs->super, i, &fs->group_desc[i]);
+
/*
* If the number of block groups is staying the same, we're
* done and can exit now. (If the number block groups is
@@ -414,6 +416,8 @@ retry:
fs->group_desc[i].bg_free_inodes_count =
fs->super->s_inodes_per_group;
fs->group_desc[i].bg_used_dirs_count = 0;
+ fs->group_desc[i].bg_checksum =
+ ext2fs_group_desc_csum(fs->super, i,&fs->group_desc[i]);

retval = ext2fs_allocate_group_table(fs, i, 0);
if (retval) goto errout;
@@ -1222,9 +1226,13 @@ static errcode_t inode_scan_and_fix(ext2
if (retval) goto errout;

group = (new_inode-1) / EXT2_INODES_PER_GROUP(rfs->new_fs->super);
- if (LINUX_S_ISDIR(inode.i_mode))
+ if (LINUX_S_ISDIR(inode.i_mode)) {
rfs->new_fs->group_desc[group].bg_used_dirs_count++;
-
+ rfs->new_fs->group_desc[group].bg_checksum =
+ ext2fs_group_desc_csum(rfs->new_fs->super,group,
+ &rfs->new_fs->group_desc[group]);
+ }
+
#ifdef RESIZE2FS_DEBUG
if (rfs->flags & RESIZE_DEBUG_INODEMAP)
printf("Inode moved %u->%u\n", ino, new_inode);
@@ -1477,6 +1485,9 @@ static errcode_t move_itables(ext2_resiz
ext2fs_unmark_block_bitmap(fs->block_map, blk);

rfs->old_fs->group_desc[i].bg_inode_table = new_blk;
+ rfs->old_fs->group_desc[i].bg_checksum =
+ ext2fs_group_desc_csum(rfs->old_fs->super, i,
+ &rfs->old_fs->group_desc[i]);
ext2fs_mark_super_dirty(rfs->old_fs);
ext2fs_flush(rfs->old_fs);

@@ -1574,8 +1585,12 @@ static errcode_t ext2fs_calculate_summar
count++;
if ((count == fs->super->s_blocks_per_group) ||
(blk == fs->super->s_blocks_count-1)) {
- fs->group_desc[group++].bg_free_blocks_count =
+ fs->group_desc[group].bg_free_blocks_count =
group_free;
+ fs->group_desc[group].bg_checksum =
+ ext2fs_group_desc_csum(fs->super, group,
+ &fs->group_desc[group]);
+ group++;
count = 0;
group_free = 0;
}
@@ -1599,8 +1614,12 @@ static errcode_t ext2fs_calculate_summar
count++;
if ((count == fs->super->s_inodes_per_group) ||
(ino == fs->super->s_inodes_count)) {
- fs->group_desc[group++].bg_free_inodes_count =
+ fs->group_desc[group].bg_free_inodes_count =
group_free;
+ fs->group_desc[group].bg_checksum =
+ ext2fs_group_desc_csum(fs->super, group,
+ &fs->group_desc[group]);
+ group++;
count = 0;
group_free = 0;
}
Index: e2fsprogs-1.40.5/lib/ext2fs/initialize.c
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/initialize.c
+++ e2fsprogs-1.40.5/lib/ext2fs/initialize.c
@@ -375,6 +375,8 @@ ipg_retry:
fs->group_desc[i].bg_free_inodes_count =
fs->super->s_inodes_per_group;
fs->group_desc[i].bg_used_dirs_count = 0;
+ fs->group_desc[i].bg_checksum =
+ ext2fs_group_desc_csum(fs->super, i,&fs->group_desc[i]);
}

c = (char) 255;
Index: e2fsprogs-1.40.5/e2fsck/pass2.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/pass2.c
+++ e2fsprogs-1.40.5/e2fsck/pass2.c
@@ -151,7 +151,7 @@ void e2fsck_pass2(e2fsck_t ctx)

cd.pctx.errcode = ext2fs_dblist_iterate(fs->dblist, check_dir_block,
&cd);
- if (ctx->flags & E2F_FLAG_SIGNAL_MASK)
+ if (ctx->flags & E2F_FLAG_SIGNAL_MASK || ctx->flags & E2F_FLAG_RESTART)
return;
if (cd.pctx.errcode) {
fix_problem(ctx, PR_2_DBLIST_ITERATE, &cd.pctx);
@@ -745,7 +745,7 @@ static int check_dir_block(ext2_filsys f
buf = cd->buf;
ctx = cd->ctx;

- if (ctx->flags & E2F_FLAG_SIGNAL_MASK)
+ if (ctx->flags & E2F_FLAG_SIGNAL_MASK || ctx->flags & E2F_FLAG_RESTART)
return DIRENT_ABORT;

if (ctx->progress && (ctx->progress)(ctx, 2, cd->count++, cd->max))
@@ -842,6 +842,9 @@ static int check_dir_block(ext2_filsys f
dict_init(&de_dict, DICTCOUNT_T_MAX, dict_de_cmp);
prev = 0;
do {
+ int group;
+ ext2_ino_t first_unused_inode;
+
problem = 0;
dirent = (struct ext2_dir_entry *) (buf + offset);
cd->pctx.dirent = dirent;
@@ -891,12 +894,6 @@ static int check_dir_block(ext2_filsys f
(dirent->inode < EXT2_FIRST_INODE(fs->super))) ||
(dirent->inode > fs->super->s_inodes_count)) {
problem = PR_2_BAD_INO;
- } else if (!(ext2fs_test_inode_bitmap(ctx->inode_used_map,
- dirent->inode))) {
- /*
- * If the inode is unused, offer to clear it.
- */
- problem = PR_2_UNUSED_INODE;
} else if (ctx->inode_bb_map &&
(ext2fs_test_inode_bitmap(ctx->inode_bb_map,
dirent->inode))) {
@@ -973,6 +970,67 @@ static int check_dir_block(ext2_filsys f
return DIRENT_ABORT;
}

+ group = ext2fs_group_of_ino(fs, dirent->inode);
+ first_unused_inode = group * fs->super->s_inodes_per_group +
+ 1 + fs->super->s_inodes_per_group -
+ fs->group_desc[group].bg_itable_unused;
+ cd->pctx.group = group;
+
+ /*
+ * Check if the inode was missed out because _INODE_UNINIT
+ * flag was set or bg_itable_unused was incorrect.
+ * If that is the case restart e2fsck.
+ * XXX Optimisations TODO:
+ * 1. only restart e2fsck once
+ * 2. only exposed inodes are checked again.
+ */
+ if (fs->group_desc[group].bg_flags & EXT2_BG_INODE_UNINIT) {
+ if (fix_problem(ctx, PR_2_INOREF_BG_INO_UNINIT,
+ &cd->pctx)){
+ fs->group_desc[group].bg_flags &=
+ ~EXT2_BG_INODE_UNINIT;
+ ctx->flags |= E2F_FLAG_RESTART |
+ E2F_FLAG_SIGNAL_MASK;
+ } else {
+ ext2fs_unmark_valid(fs);
+ if (problem == PR_2_BAD_INO)
+ goto next;
+ }
+ } else if (dirent->inode >= first_unused_inode) {
+ if (fix_problem(ctx, PR_2_INOREF_IN_UNUSED, &cd->pctx)){
+ fs->group_desc[group].bg_itable_unused = 0;
+ fs->group_desc[group].bg_flags &=
+ ~EXT2_BG_INODE_UNINIT;
+ ext2fs_mark_super_dirty(fs);
+ ctx->flags |= E2F_FLAG_RESTART;
+ goto restart_fsck;
+ } else {
+ ext2fs_unmark_valid(fs);
+ if (problem == PR_2_BAD_INO)
+ goto next;
+ }
+ }
+
+ if (!(ext2fs_test_inode_bitmap(ctx->inode_used_map,
+ dirent->inode))) {
+ /*
+ * If the inode is unused, offer to clear it.
+ */
+ problem = PR_2_UNUSED_INODE;
+ }
+
+ if (problem) {
+ if (fix_problem(ctx, problem, &cd->pctx)) {
+ dirent->inode = 0;
+ dir_modified++;
+ goto next;
+ } else {
+ ext2fs_unmark_valid(fs);
+ if (problem == PR_2_BAD_INO)
+ goto next;
+ }
+ }
+
if (check_name(ctx, dirent, ino, &cd->pctx))
dir_modified++;

@@ -1084,8 +1142,9 @@ static int check_dir_block(ext2_filsys f
dict_free_nodes(&de_dict);
return 0;
abort_free_dict:
- dict_free_nodes(&de_dict);
ctx->flags |= E2F_FLAG_ABORT;
+restart_fsck:
+ dict_free_nodes(&de_dict);
return DIRENT_ABORT;
}

Index: e2fsprogs-1.40.5/lib/ext2fs/openfs.c
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/openfs.c
+++ e2fsprogs-1.40.5/lib/ext2fs/openfs.c
@@ -303,6 +303,22 @@ errcode_t ext2fs_open2(const char *name,

fs->stride = fs->super->s_raid_stride;

+ /*
+ * If recovery is from backup superblock, clear _UNINIT flags &
+ * reset bg_itable_unused to zero
+ */
+ if (superblock > 1 && EXT2_HAS_RO_COMPAT_FEATURE(fs->super,
+ EXT4_FEATURE_RO_COMPAT_GDT_CSUM)) {
+ struct ext2_group_desc *gd;
+ for (i = 0, gd = fs->group_desc; i < fs->group_desc_count;
+ i++, gd++) {
+ gd->bg_flags &= ~EXT2_BG_BLOCK_UNINIT;
+ gd->bg_flags &= ~EXT2_BG_INODE_UNINIT;
+ gd->bg_itable_unused = 0;
+ }
+ ext2fs_mark_super_dirty(fs);
+ }
+
*ret_fs = fs;
return 0;
cleanup:
Index: e2fsprogs-1.40.5/e2fsck/unix.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/unix.c
+++ e2fsprogs-1.40.5/e2fsck/unix.c
@@ -1380,6 +1380,10 @@ no_journal:
}
}

+ if (sb->s_feature_ro_compat & EXT4_FEATURE_RO_COMPAT_GDT_CSUM &&
+ !(ctx->options & E2F_OPT_READONLY))
+ ext2fs_set_gdt_csum(ctx->fs);
+
e2fsck_write_bitmaps(ctx);
#ifdef RESOURCE_TRACK
io_channel_flush(ctx->fs->io);
Index: e2fsprogs-1.40.5/e2fsck/e2fsck.h
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/e2fsck.h
+++ e2fsprogs-1.40.5/e2fsck/e2fsck.h
@@ -482,6 +482,8 @@ extern void e2fsck_read_bitmaps(e2fsck_t
extern void e2fsck_write_bitmaps(e2fsck_t ctx);
extern void preenhalt(e2fsck_t ctx);
extern char *string_copy(e2fsck_t ctx, const char *str, int len);
+extern errcode_t e2fsck_zero_blocks(ext2_filsys fs, blk_t blk, int num,
+ blk_t *ret_blk, int *ret_count);
#ifdef RESOURCE_TRACK
extern void print_resource_track(const char *desc,
struct resource_track *track,
Index: e2fsprogs-1.40.5/lib/ext2fs/csum.c
===================================================================
--- /dev/null
+++ e2fsprogs-1.40.5/lib/ext2fs/csum.c
@@ -0,0 +1,150 @@
+/*
+ * csum.c --- checksumming of ext3 structures
+ *
+ * Copyright (C) 2006 Cluster File Systems, Inc.
+ *
+ * %Begin-Header%
+ * This file may be redistributed under the terms of the GNU Public
+ * License.
+ * %End-Header%
+ */
+
+#include "ext2_fs.h"
+#include "ext2fs.h"
+#include "crc16.h"
+#include <assert.h>
+
+#ifndef offsetof
+#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)
+#endif
+
+__u16 ext2fs_group_desc_csum(struct ext2_super_block *sb, __u32 group,
+ struct ext2_group_desc *desc)
+{
+ crc16_t crc = 0;
+
+ if (sb->s_feature_ro_compat & EXT4_FEATURE_RO_COMPAT_GDT_CSUM) {
+ int offset = offsetof(struct ext2_group_desc, bg_checksum);
+
+#ifdef WORDS_BIGENDIAN
+ struct ext2_group_desc swabdesc = *desc;
+
+ /* Have to swab back to little-endian to do the checksum */
+ ext2fs_swap_group_desc(&swabdesc);
+ desc = &swabdesc;
+
+ group = ext2fs_swab32(group);
+#endif
+ crc = crc16(0xffff, sb->s_uuid, sizeof(sb->s_uuid));
+ crc = crc16(crc, &group, sizeof(group));
+ crc = crc16(crc, desc, offset);
+ offset += sizeof(desc->bg_checksum); /* skip checksum */
+ if (sb->s_feature_incompat & EXT4_FEATURE_INCOMPAT_64BIT &&
+ sb->s_desc_size != 0 && offset < sb->s_desc_size)
+ crc = crc16(crc, (char *)desc + offset,
+ sb->s_desc_size - offset);
+ }
+
+ return crc;
+}
+
+int ext2fs_group_desc_csum_verify(struct ext2_super_block *sb, __u32 group,
+ struct ext2_group_desc *desc)
+{
+ if (desc->bg_checksum != ext2fs_group_desc_csum(sb, group, desc))
+ return 0;
+
+ return 1;
+}
+
+static __u32 find_last_inode_ingrp(ext2fs_inode_bitmap bitmap,
+ __u32 inodes_per_grp, dgrp_t grp_no)
+{
+ ext2_ino_t i, start_ino, end_ino;
+
+ start_ino = grp_no * inodes_per_grp + 1;
+ end_ino = start_ino + inodes_per_grp - 1;
+
+ for (i = end_ino; i >= start_ino; i--) {
+ if (ext2fs_fast_test_inode_bitmap(bitmap, i))
+ return i - start_ino + 1;
+ }
+ return inodes_per_grp;
+}
+
+/* update the bitmap flags, set the itable high watermark, and calculate
+ * checksums for the group descriptors */
+void ext2fs_set_gdt_csum(ext2_filsys fs)
+{
+ struct ext2_super_block *sb = fs->super;
+ struct ext2_group_desc *bg = fs->group_desc;
+ int blks, csum_flag, dirty = 0;
+ int inodes_per_group = sb->s_inodes_per_group;
+ dgrp_t i;
+
+ csum_flag = EXT2_HAS_RO_COMPAT_FEATURE(fs->super,
+ EXT4_FEATURE_RO_COMPAT_GDT_CSUM);
+ if (!EXT2_HAS_COMPAT_FEATURE(fs->super,
+ EXT2_FEATURE_COMPAT_LAZY_BG) && !csum_flag)
+ return;
+
+ for (i = 0; i < fs->group_desc_count; i++, bg++) {
+ int old_csum = bg->bg_checksum;
+ int old_unused = bg->bg_itable_unused;
+ int old_flags = bg->bg_flags;
+
+ /* Even if it wasn't zeroed, by the time this function is
+ * called by e2fsck we have already scanned and corrected
+ * the whole inode table so we may as well not overwrite it.
+ * This is just a hint to the kernel that it could do lazy
+ * zeroing of the inode table if mke2fs didn't do it, to help
+ * out if we need to do a full itable scan sometime later. */
+ if (!(bg->bg_flags & (EXT2_BG_INODE_UNINIT |
+ EXT2_BG_INODE_ZEROED))) {
+ bg->bg_flags |= EXT2_BG_INODE_ZEROED;
+ dirty = 1;
+ }
+
+ if (bg->bg_free_inodes_count == inodes_per_group &&
+ i > 0 && (i < fs->group_desc_count - 1 || csum_flag)) {
+ if (!(bg->bg_flags & EXT2_BG_INODE_UNINIT))
+ bg->bg_flags |= EXT2_BG_INODE_UNINIT;
+
+ if (csum_flag)
+ bg->bg_itable_unused = inodes_per_group;
+
+ } else if (csum_flag) {
+ if (fs->inode_map)
+ bg->bg_itable_unused = inodes_per_group -
+ find_last_inode_ingrp(fs->inode_map,
+ inodes_per_group,
+ i);
+ else if (bg->bg_flags & EXT2_BG_INODE_UNINIT)
+ bg->bg_itable_unused = 0;
+
+ bg->bg_flags &= ~EXT2_BG_INODE_UNINIT;
+ }
+
+ /* skip first and last groups, or groups with GDT backups
+ * because the resize inode has blocks allocated in them. */
+ if (i == 0 || i == fs->group_desc_count - 1 ||
+ (ext2fs_bg_has_super(fs, i) && sb->s_reserved_gdt_blocks))
+ goto checksum;
+
+ blks = ext2fs_super_and_bgd_loc(fs, i, 0, 0, 0, 0);
+ if (bg->bg_free_blocks_count == blks &&
+ bg->bg_flags & EXT2_BG_INODE_UNINIT &&
+ !(bg->bg_flags & EXT2_BG_BLOCK_UNINIT))
+ bg->bg_flags |= EXT2_BG_BLOCK_UNINIT;
+checksum:
+ bg->bg_checksum = ext2fs_group_desc_csum(fs->super, i, bg);
+ if (old_flags != bg->bg_flags)
+ dirty = 1;
+ if (old_unused != bg->bg_itable_unused)
+ dirty = 1;
+ if (old_csum != bg->bg_checksum)
+ dirty = 1;
+ }
+ if (dirty)
+ ext2fs_mark_super_dirty(fs);
+}
Index: e2fsprogs-1.40.5/lib/ext2fs/crc16.c
===================================================================
--- /dev/null
+++ e2fsprogs-1.40.5/lib/ext2fs/crc16.c
@@ -0,0 +1,61 @@
+/*
+ * crc16.c
+ *
+ * This source code is licensed under the GNU General Public License,
+ * Version 2. See the file COPYING for more details.
+ */
+
+#include <linux/types.h>
+#include "crc16.h"
+
+/** CRC table for the CRC-16. The poly is 0x8005 (x^16 + x^15 + x^2 + 1) */
+__u16 const crc16_table[256] = {
+ 0x0000, 0xC0C1, 0xC181, 0x0140, 0xC301, 0x03C0, 0x0280, 0xC241,
+ 0xC601, 0x06C0, 0x0780, 0xC741, 0x0500, 0xC5C1, 0xC481, 0x0440,
+ 0xCC01, 0x0CC0, 0x0D80, 0xCD41, 0x0F00, 0xCFC1, 0xCE81, 0x0E40,
+ 0x0A00, 0xCAC1, 0xCB81, 0x0B40, 0xC901, 0x09C0, 0x0880, 0xC841,
+ 0xD801, 0x18C0, 0x1980, 0xD941, 0x1B00, 0xDBC1, 0xDA81, 0x1A40,
+ 0x1E00, 0xDEC1, 0xDF81, 0x1F40, 0xDD01, 0x1DC0, 0x1C80, 0xDC41,
+ 0x1400, 0xD4C1, 0xD581, 0x1540, 0xD701, 0x17C0, 0x1680, 0xD641,
+ 0xD201, 0x12C0, 0x1380, 0xD341, 0x1100, 0xD1C1, 0xD081, 0x1040,
+ 0xF001, 0x30C0, 0x3180, 0xF141, 0x3300, 0xF3C1, 0xF281, 0x3240,
+ 0x3600, 0xF6C1, 0xF781, 0x3740, 0xF501, 0x35C0, 0x3480, 0xF441,
+ 0x3C00, 0xFCC1, 0xFD81, 0x3D40, 0xFF01, 0x3FC0, 0x3E80, 0xFE41,
+ 0xFA01, 0x3AC0, 0x3B80, 0xFB41, 0x3900, 0xF9C1, 0xF881, 0x3840,
+ 0x2800, 0xE8C1, 0xE981, 0x2940, 0xEB01, 0x2BC0, 0x2A80, 0xEA41,
+ 0xEE01, 0x2EC0, 0x2F80, 0xEF41, 0x2D00, 0xEDC1, 0xEC81, 0x2C40,
+ 0xE401, 0x24C0, 0x2580, 0xE541, 0x2700, 0xE7C1, 0xE681, 0x2640,
+ 0x2200, 0xE2C1, 0xE381, 0x2340, 0xE101, 0x21C0, 0x2080, 0xE041,
+ 0xA001, 0x60C0, 0x6180, 0xA141, 0x6300, 0xA3C1, 0xA281, 0x6240,
+ 0x6600, 0xA6C1, 0xA781, 0x6740, 0xA501, 0x65C0, 0x6480, 0xA441,
+ 0x6C00, 0xACC1, 0xAD81, 0x6D40, 0xAF01, 0x6FC0, 0x6E80, 0xAE41,
+ 0xAA01, 0x6AC0, 0x6B80, 0xAB41, 0x6900, 0xA9C1, 0xA881, 0x6840,
+ 0x7800, 0xB8C1, 0xB981, 0x7940, 0xBB01, 0x7BC0, 0x7A80, 0xBA41,
+ 0xBE01, 0x7EC0, 0x7F80, 0xBF41, 0x7D00, 0xBDC1, 0xBC81, 0x7C40,
+ 0xB401, 0x74C0, 0x7580, 0xB541, 0x7700, 0xB7C1, 0xB681, 0x7640,
+ 0x7200, 0xB2C1, 0xB381, 0x7340, 0xB101, 0x71C0, 0x7080, 0xB041,
+ 0x5000, 0x90C1, 0x9181, 0x5140, 0x9301, 0x53C0, 0x5280, 0x9241,
+ 0x9601, 0x56C0, 0x5780, 0x9741, 0x5500, 0x95C1, 0x9481, 0x5440,
+ 0x9C01, 0x5CC0, 0x5D80, 0x9D41, 0x5F00, 0x9FC1, 0x9E81, 0x5E40,
+ 0x5A00, 0x9AC1, 0x9B81, 0x5B40, 0x9901, 0x59C0, 0x5880, 0x9841,
+ 0x8801, 0x48C0, 0x4980, 0x8941, 0x4B00, 0x8BC1, 0x8A81, 0x4A40,
+ 0x4E00, 0x8EC1, 0x8F81, 0x4F40, 0x8D01, 0x4DC0, 0x4C80, 0x8C41,
+ 0x4400, 0x84C1, 0x8581, 0x4540, 0x8701, 0x47C0, 0x4680, 0x8641,
+ 0x8201, 0x42C0, 0x4380, 0x8341, 0x4100, 0x81C1, 0x8081, 0x4040
+};
+
+/**
+ * Compute the CRC-16 for the data buffer
+ *
+ * @param crc previous CRC value
+ * @param buffer data pointer
+ * @param len number of bytes in the buffer
+ * @return the updated CRC value
+ */
+crc16_t crc16(crc16_t crc, void const *buffer, size_t len)
+{
+ const unsigned char *ptr = buffer;
+ while (len--)
+ crc = crc16_byte(crc, *ptr++);
+ return crc;
+}
Index: e2fsprogs-1.40.5/lib/ext2fs/crc16.h
===================================================================
--- /dev/null
+++ e2fsprogs-1.40.5/lib/ext2fs/crc16.h
@@ -0,0 +1,47 @@
+/*
+ * crc16.h - CRC-16 routine
+ *
+ * Implements the standard CRC-16:
+ * Width 16
+ * Poly 0x8005 (x^16 + x^15 + x^2 + 1)
+ * Init 0
+ *
+ * Copyright (c) 2005 Ben Gardner <[email protected]>
+ *
+ * This source code is licensed under the GNU General Public License,
+ * Version 2. See the file COPYING for more details.
+ */
+
+#ifndef __CRC16_H
+#define __CRC16_H
+
+#include <linux/types.h>
+
+extern __u16 const crc16_table[256];
+
+
+#ifdef WORDS_BIGENDIAN
+/* for an unknown reason, PPC treats __u16 as signed and keeps doing sign
+ * extension on the value. Instead, use only the low 16 bits of an
+ * unsigned int for holding the CRC value to avoid this.
+ */
+typedef unsigned crc16_t;
+
+static inline crc16_t crc16_byte(crc16_t crc, const unsigned char data)
+{
+ return (((crc >> 8) & 0xffU) ^ crc16_table[(crc ^ data) & 0xffU]) &
+ 0x0000ffffU;
+}
+#else
+typedef __u16 crc16_t;
+
+static inline crc16_t crc16_byte(crc16_t crc, const unsigned char data)
+{
+ return (crc >> 8) ^ crc16_table[(crc ^ data) & 0xff];
+}
+#endif
+
+extern crc16_t crc16(crc16_t crc, void const *buffer, size_t len);
+
+
+#endif /* __CRC16_H */
Index: e2fsprogs-1.40.5/lib/ext2fs/tst_csum.c
===================================================================
--- /dev/null
+++ e2fsprogs-1.40.5/lib/ext2fs/tst_csum.c
@@ -0,0 +1,140 @@
+/*
+ * This testing program verifies checksumming operations
+ *
+ * Copyright (C) 2006, 2007 by Andreas Dilger <[email protected]>
+ *
+ * %Begin-Header%
+ * This file may be redistributed under the terms of the GNU Public
+ * License.
+ * %End-Header%
+ */
+
+#include "ext2fs/ext2_fs.h"
+#include "ext2fs/ext2fs.h"
+#include "ext2fs/crc16.h"
+#include "uuid/uuid.h"
+
+#ifndef offsetof
+#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)
+#endif
+
+void print_csum(const char *msg, struct ext2_super_block *sb,
+ __u32 group, struct ext2_group_desc *desc)
+{
+ __u16 crc1, crc2, crc3;
+ __u32 swabgroup;
+ char uuid[40];
+
+#ifdef WORDS_BIGENDIAN
+ struct ext2_group_desc swabdesc = *desc;
+
+ /* Have to swab back to little-endian to do the checksum */
+ ext2fs_swap_group_desc(&swabdesc);
+ desc = &swabdesc;
+
+ swabgroup = ext2fs_swab32(group);
+#else
+ swabgroup = group;
+#endif
+
+ crc1 = crc16(0xffff, sb->s_uuid, sizeof(sb->s_uuid));
+ crc2 = crc16(crc1, &swabgroup, sizeof(swabgroup));
+ crc3 = crc16(crc2, desc, offsetof(struct ext2_group_desc, bg_checksum));
+ uuid_unparse(sb->s_uuid, uuid);
+ printf("%s: UUID %s=%04x, grp %u=%04x: %04x=%04x\n",
+ msg, uuid, crc1, group, crc2, crc3,
+ ext2fs_group_desc_csum(sb, group,desc));
+}
+
+int main(int argc, char **argv)
+{
+ struct ext2_group_desc desc = { .bg_block_bitmap = 124,
+ .bg_inode_bitmap = 125,
+ .bg_inode_table = 126,
+ .bg_free_blocks_count = 31119,
+ .bg_free_inodes_count = 15701,
+ .bg_used_dirs_count = 2,
+ .bg_flags = 0,
+ };
+ struct ext2_super_block sb = { .s_feature_ro_compat =
+ EXT4_FEATURE_RO_COMPAT_GDT_CSUM,
+ .s_uuid = { 0x4f, 0x25, 0xe8, 0xcf,
+ 0xe7, 0x97, 0x48, 0x23,
+ 0xbe, 0xfa, 0xa7, 0x88,
+ 0x4b, 0xae, 0xec, 0xdb } };
+ __u16 csum1, csum2, csum_known = 0xd3a4;
+ char data[8] = { 0x10, 0x20, 0x30, 0x40, 0xf1, 0xb2, 0xc3, 0xd4 };
+ __u16 data_crc[8] = { 0xcc01, 0x180c, 0x1118, 0xfa10,
+ 0x483a, 0x6648, 0x6726, 0x85e6 };
+ __u16 data_crc0[8] = { 0x8cbe, 0xa80d, 0xd169, 0xde10,
+ 0x481e, 0x7d48, 0x673d, 0x8ea6 };
+ int i;
+
+ for (i = 0; i < sizeof(data); i++) {
+ csum1 = crc16(0, data, i + 1);
+ printf("crc16(0): data[%d]: %04x=%04x\n", i, csum1,data_crc[i]);
+ if (csum1 != data_crc[i]) {
+ printf("error: crc16(0) for data[%d] should be %04x\n",
+ i, data_crc[i]);
+ exit(1);
+ }
+ }
+
+ for (i = 0; i < sizeof(data); i++) {
+ csum1 = crc16(~0, data, i + 1);
+ printf("crc16(~0): data[%d]: %04x=%04x\n",i,csum1,data_crc0[i]);
+ if (csum1 != data_crc0[i]) {
+ printf("error: crc16(~0) for data[%d] should be %04x\n",
+ i, data_crc0[i]);
+ exit(1);
+ }
+ }
+
+ csum1 = ext2fs_group_desc_csum(&sb, 0, &desc);
+ print_csum("csum0000", &sb, 0, &desc);
+
+ if (csum1 != csum_known) {
+ printf("checksum for group 0 should be %04x\n", csum_known);
+ exit(1);
+ }
+ csum2 = ext2fs_group_desc_csum(&sb, 1, &desc);
+ print_csum("csum0001", &sb, 1, &desc);
+ if (csum1 == csum2) {
+ printf("checksums for different groups shouldn't match\n");
+ exit(1);
+ }
+ csum2 = ext2fs_group_desc_csum(&sb, 0xffff, &desc);
+ print_csum("csumffff", &sb, 0xffff, &desc);
+ if (csum1 == csum2) {
+ printf("checksums for different groups shouldn't match\n");
+ exit(1);
+ }
+ desc.bg_checksum = csum1;
+ csum2 = ext2fs_group_desc_csum(&sb, 0, &desc);
+ print_csum("csum_set", &sb, 0, &desc);
+ if (csum1 != csum2) {
+ printf("checksums should not depend on checksum field\n");
+ exit(1);
+ }
+ if (!ext2fs_group_desc_csum_verify(&sb, 0, &desc)) {
+ printf("checksums should verify against bg_checksum\n");
+ exit(1);
+ }
+ memset(sb.s_uuid, 0x30, sizeof(sb.s_uuid));
+ print_csum("new_uuid", &sb, 0, &desc);
+ if (ext2fs_group_desc_csum_verify(&sb, 0, &desc) != 0) {
+ printf("checksums for different filesystems shouldn't match\n");
+ exit(1);
+ }
+ csum1 = desc.bg_checksum = ext2fs_group_desc_csum(&sb, 0, &desc);
+ print_csum("csum_new", &sb, 0, &desc);
+ desc.bg_free_blocks_count = 1;
+ csum2 = ext2fs_group_desc_csum(&sb, 0, &desc);
+ print_csum("csum_blk", &sb, 0, &desc);
+ if (csum1 == csum2) {
+ printf("checksums for different data shouldn't match\n");
+ exit(1);
+ }
+
+ return 0;
+}
Index: e2fsprogs-1.40.5/lib/e2p/feature.c
===================================================================
--- e2fsprogs-1.40.5.orig/lib/e2p/feature.c
+++ e2fsprogs-1.40.5/lib/e2p/feature.c
@@ -45,7 +45,7 @@ static struct feature feature_list[] = {
{ E2P_FEATURE_RO_INCOMPAT, EXT4_FEATURE_RO_COMPAT_HUGE_FILE,
"huge_file" },
{ E2P_FEATURE_RO_INCOMPAT, EXT4_FEATURE_RO_COMPAT_GDT_CSUM,
- "gdt_checksum" },
+ "uninit_groups" },
{ E2P_FEATURE_RO_INCOMPAT, EXT4_FEATURE_RO_COMPAT_DIR_NLINK,
"dir_nlink" },
{ E2P_FEATURE_RO_INCOMPAT, EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE,
Index: e2fsprogs-1.40.5/misc/mke2fs.8.in
===================================================================
--- e2fsprogs-1.40.5.orig/misc/mke2fs.8.in
+++ e2fsprogs-1.40.5/misc/mke2fs.8.in
@@ -214,7 +214,7 @@ for the filesystem. (For administrators
filesystems on RAID arrays, it is preferable to use the
.I stride
RAID parameter as part of the
-.B \-R
+.B \-E
option rather than manipulating the number of blocks per group.)
This option is generally used by developers who
are developing test cases.
@@ -412,6 +412,12 @@ Store file type information in directory
Create an ext3 journal (as if using the
.B \-j
option).
+.TP
+.B uninit_groups
+Create a filesystem without initializing all of the groups. This can reduce
+.BR e2fsck (8)
+time dramatically. This feature causes the filesystem to be read-only in
+older kernels and is not supported in most Linux kernels; use with caution.
@JDEV@.TP
@JDEV@.B journal_dev
@JDEV@Create an external ext3 journal on the given device
Index: e2fsprogs-1.40.5/e2fsck/util.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/util.c
+++ e2fsprogs-1.40.5/e2fsck/util.c
@@ -29,6 +29,10 @@
#include <malloc.h>
#endif

+#ifdef HAVE_ERRNO_H
+#include <errno.h>
+#endif
+
#include "e2fsck.h"

extern e2fsck_t e2fsck_global_ctx; /* Try your very best not to use this! */
@@ -546,3 +550,60 @@ int ext2_file_type(unsigned int mode)

return 0;
}
+
+#define STRIDE_LENGTH 8
+/*
+ * Helper function which zeros out _num_ blocks starting at _blk_. In
+ * case of an error, the details of the error are returned via _ret_blk_
+ * and _ret_count_ if they are non-NULL pointers. Returns 0 on
+ * success, and an error code on an error.
+ *
+ * As a special case, if the first argument is NULL, then it will
+ * attempt to free the static zeroizing buffer. (This is to keep
+ * programs that check for memory leaks happy.)
+ */
+errcode_t e2fsck_zero_blocks(ext2_filsys fs, blk_t blk, int num,
+ blk_t *ret_blk, int *ret_count)
+{
+ int j, count, next_update, next_update_incr;
+ static char *buf;
+ errcode_t retval;
+
+ /* If fs is null, clean up the static buffer and return */
+ if (!fs) {
+ if (buf) {
+ free(buf);
+ buf = 0;
+ }
+ return 0;
+ }
+ /* Allocate the zeroizing buffer if necessary */
+ if (!buf) {
+ buf = malloc(fs->blocksize * STRIDE_LENGTH);
+ if (!buf) {
+ com_err("malloc", ENOMEM,
+ _("while allocating zeroizing buffer"));
+ exit(1);
+ }
+ memset(buf, 0, fs->blocksize * STRIDE_LENGTH);
+ }
+ /* OK, do the write loop */
+ next_update = 0;
+ next_update_incr = num / 100;
+ if (next_update_incr < 1)
+ next_update_incr = 1;
+ for (j = 0; j < num; j += STRIDE_LENGTH, blk += STRIDE_LENGTH) {
+ count = num - j;
+ if (count > STRIDE_LENGTH)
+ count = STRIDE_LENGTH;
+ retval = io_channel_write_blk(fs->io, blk, count, buf);
+ if (retval) {
+ if (ret_count)
+ *ret_count = count;
+ if (ret_blk)
+ *ret_blk = blk;
+ return retval;
+ }
+ }
+ return 0;
+}
Index: e2fsprogs-1.40.5/resize/main.c
===================================================================
--- e2fsprogs-1.40.5.orig/resize/main.c
+++ e2fsprogs-1.40.5/resize/main.c
@@ -298,6 +298,13 @@ int main (int argc, char ** argv)
printf (_("Couldn't find valid filesystem superblock.\n"));
exit (1);
}
+
+ if (fs->super->s_feature_ro_compat & EXT4_FEATURE_RO_COMPAT_GDT_CSUM) {
+ com_err(program_name, EXT2_ET_RO_UNSUPP_FEATURE,
+ ":- uninit_groups");
+ exit(1);
+ }
+
/*
* Check for compatibility with the feature sets. We need to
* be more stringent than ext2fs_open().
Index: e2fsprogs-1.40.5/tests/f_dupfsblks/expect.1
===================================================================
--- e2fsprogs-1.40.5.orig/tests/f_dupfsblks/expect.1
+++ e2fsprogs-1.40.5/tests/f_dupfsblks/expect.1
@@ -44,7 +44,8 @@ Salvage? yes
Directory inode 12, block 3, offset 0: directory corrupted
Salvage? yes

-Entry '' in ??? (12) has deleted/unused inode 32. Clear? yes
+Entry '' in ??? (12) has a zero-length name.
+Clear? yes

Directory inode 12, block 4, offset 100: directory corrupted
Salvage? yes
Index: e2fsprogs-1.40.5/tests/f_uninit_last_uninit/expect.1
===================================================================
--- /dev/null
+++ e2fsprogs-1.40.5/tests/f_uninit_last_uninit/expect.1
@@ -0,0 +1,9 @@
+last group block bitmap uninitialized. Fix? yes
+
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+test_filesys: 11/32 files (9.1% non-contiguous), 105/10000 blocks
+Exit status is 0
Index: e2fsprogs-1.40.5/tests/f_uninit_last_uninit/expect.2
===================================================================
--- /dev/null
+++ e2fsprogs-1.40.5/tests/f_uninit_last_uninit/expect.2
@@ -0,0 +1,7 @@
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+test_filesys: 11/32 files (9.1% non-contiguous), 105/10000 blocks
+Exit status is 0
Index: e2fsprogs-1.40.5/tests/f_uninit_last_uninit/name
===================================================================
--- /dev/null
+++ e2fsprogs-1.40.5/tests/f_uninit_last_uninit/name
@@ -0,0 +1,2 @@
+last group has BLOCK_UNINIT set
+
Index: e2fsprogs-1.40.5/tests/f_uninit_last_uninit/script
===================================================================
--- /dev/null
+++ e2fsprogs-1.40.5/tests/f_uninit_last_uninit/script
@@ -0,0 +1,20 @@
+SKIP_GUNZIP="true"
+
+touch $TMPFILE
+$MKE2FS -N 32 -F -o Linux -O uninit_groups -b 1024 $TMPFILE 10000 > /dev/null 2>&1
+$DEBUGFS -w $TMPFILE << EOF > /dev/null 2>&1
+set_current_time 200704102100
+set_super_value lastcheck 0
+set_super_value hash_seed null
+set_super_value mkfs_time 0
+set_bg 1 flags 0x7
+set_bg 1 checksum calc
+q
+EOF
+
+E2FSCK_TIME=200704102100
+export E2FSCK_TIME
+
+. $cmd_dir/run_e2fsck
+
+unset E2FSCK_TIME
Index: e2fsprogs-1.40.5/tests/m_lazy/expect.1
===================================================================
--- /dev/null
+++ e2fsprogs-1.40.5/tests/m_lazy/expect.1
@@ -0,0 +1,158 @@
+Filesystem label=
+OS type: Linux
+Block size=1024 (log=0)
+Fragment size=1024 (log=0)
+32768 inodes, 131072 blocks
+6553 blocks (5.00%) reserved for the super user
+First data block=1
+16 block groups
+8192 blocks per group, 8192 fragments per group
+2048 inodes per group
+Superblock backups stored on blocks:
+ 8193, 24577, 40961, 57345, 73729
+
+Writing inode tables: done
+Writing superblocks and filesystem accounting information: done
+
+Filesystem features: ext_attr dir_index lazy_bg filetype sparse_super
+
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+test_filesys: 28683/32768 files (0.0% non-contiguous), 115220/131072 blocks
+Exit status is 0
+
+Filesystem volume name: <none>
+Last mounted on: <not available>
+Filesystem magic number: 0xEF53
+Filesystem revision #: 1 (dynamic)
+Filesystem features: ext_attr dir_index lazy_bg filetype sparse_super
+Default mount options: (none)
+Filesystem state: clean
+Errors behavior: Continue
+Filesystem OS type: Linux
+Inode count: 32768
+Block count: 131072
+Reserved block count: 6553
+Free blocks: 15852
+Free inodes: 4085
+First block: 1
+Block size: 1024
+Fragment size: 1024
+Blocks per group: 8192
+Fragments per group: 8192
+Inodes per group: 2048
+Inode blocks per group: 256
+Mount count: 0
+Check interval: 15552000 (6 months)
+Reserved blocks uid: 0
+Reserved blocks gid: 0
+First inode: 11
+Inode size: 128
+Default directory hash: tea
+
+
+Group 0: (Blocks 1-8192)
+ Primary superblock at 1, Group descriptors at 2-2
+ Block bitmap at 3 (+2), Inode bitmap at 4 (+3)
+ Inode table at 5-260 (+4)
+ 7919 free blocks, 2037 free inodes, 2 directories
+ Free blocks: 274-8192
+ Free inodes: 12-2048
+Group 1: (Blocks 8193-16384) [Inode not init, Block not init]
+ Backup superblock at 8193, Group descriptors at 8194-8194
+ Block bitmap at 8195 (+2), Inode bitmap at 8196 (+3)
+ Inode table at 8197-8452 (+4)
+ 0 free blocks, 0 free inodes, 0 directories
+ Free blocks:
+ Free inodes:
+Group 2: (Blocks 16385-24576) [Inode not init, Block not init]
+ Block bitmap at 16385 (+0), Inode bitmap at 16386 (+1)
+ Inode table at 16387-16642 (+2)
+ 0 free blocks, 0 free inodes, 0 directories
+ Free blocks:
+ Free inodes:
+Group 3: (Blocks 24577-32768) [Inode not init, Block not init]
+ Backup superblock at 24577, Group descriptors at 24578-24578
+ Block bitmap at 24579 (+2), Inode bitmap at 24580 (+3)
+ Inode table at 24581-24836 (+4)
+ 0 free blocks, 0 free inodes, 0 directories
+ Free blocks:
+ Free inodes:
+Group 4: (Blocks 32769-40960) [Inode not init, Block not init]
+ Block bitmap at 32769 (+0), Inode bitmap at 32770 (+1)
+ Inode table at 32771-33026 (+2)
+ 0 free blocks, 0 free inodes, 0 directories
+ Free blocks:
+ Free inodes:
+Group 5: (Blocks 40961-49152) [Inode not init, Block not init]
+ Backup superblock at 40961, Group descriptors at 40962-40962
+ Block bitmap at 40963 (+2), Inode bitmap at 40964 (+3)
+ Inode table at 40965-41220 (+4)
+ 0 free blocks, 0 free inodes, 0 directories
+ Free blocks:
+ Free inodes:
+Group 6: (Blocks 49153-57344) [Inode not init, Block not init]
+ Block bitmap at 49153 (+0), Inode bitmap at 49154 (+1)
+ Inode table at 49155-49410 (+2)
+ 0 free blocks, 0 free inodes, 0 directories
+ Free blocks:
+ Free inodes:
+Group 7: (Blocks 57345-65536) [Inode not init, Block not init]
+ Backup superblock at 57345, Group descriptors at 57346-57346
+ Block bitmap at 57347 (+2), Inode bitmap at 57348 (+3)
+ Inode table at 57349-57604 (+4)
+ 0 free blocks, 0 free inodes, 0 directories
+ Free blocks:
+ Free inodes:
+Group 8: (Blocks 65537-73728) [Inode not init, Block not init]
+ Block bitmap at 65537 (+0), Inode bitmap at 65538 (+1)
+ Inode table at 65539-65794 (+2)
+ 0 free blocks, 0 free inodes, 0 directories
+ Free blocks:
+ Free inodes:
+Group 9: (Blocks 73729-81920) [Inode not init, Block not init]
+ Backup superblock at 73729, Group descriptors at 73730-73730
+ Block bitmap at 73731 (+2), Inode bitmap at 73732 (+3)
+ Inode table at 73733-73988 (+4)
+ 0 free blocks, 0 free inodes, 0 directories
+ Free blocks:
+ Free inodes:
+Group 10: (Blocks 81921-90112) [Inode not init, Block not init]
+ Block bitmap at 81921 (+0), Inode bitmap at 81922 (+1)
+ Inode table at 81923-82178 (+2)
+ 0 free blocks, 0 free inodes, 0 directories
+ Free blocks:
+ Free inodes:
+Group 11: (Blocks 90113-98304) [Inode not init, Block not init]
+ Block bitmap at 90113 (+0), Inode bitmap at 90114 (+1)
+ Inode table at 90115-90370 (+2)
+ 0 free blocks, 0 free inodes, 0 directories
+ Free blocks:
+ Free inodes:
+Group 12: (Blocks 98305-106496) [Inode not init, Block not init]
+ Block bitmap at 98305 (+0), Inode bitmap at 98306 (+1)
+ Inode table at 98307-98562 (+2)
+ 0 free blocks, 0 free inodes, 0 directories
+ Free blocks:
+ Free inodes:
+Group 13: (Blocks 106497-114688) [Inode not init, Block not init]
+ Block bitmap at 106497 (+0), Inode bitmap at 106498 (+1)
+ Inode table at 106499-106754 (+2)
+ 0 free blocks, 0 free inodes, 0 directories
+ Free blocks:
+ Free inodes:
+Group 14: (Blocks 114689-122880) [Inode not init, Block not init]
+ Block bitmap at 114689 (+0), Inode bitmap at 114690 (+1)
+ Inode table at 114691-114946 (+2)
+ 0 free blocks, 0 free inodes, 0 directories
+ Free blocks:
+ Free inodes:
+Group 15: (Blocks 122881-131071)
+ Block bitmap at 122881 (+0), Inode bitmap at 122882 (+1)
+ Inode table at 122883-123138 (+2)
+ 7933 free blocks, 2048 free inodes, 0 directories
+ Free blocks: 123139-131071
+ Free inodes: 30721-32768
Index: e2fsprogs-1.40.5/tests/m_lazy/script
===================================================================
--- /dev/null
+++ e2fsprogs-1.40.5/tests/m_lazy/script
@@ -0,0 +1,4 @@
+DESCRIPTION="lazy group feature"
+FS_SIZE=131072
+MKE2FS_OPTS="-O ^resize_inode,lazy_bg"
+. $cmd_dir/run_mke2fs
Index: e2fsprogs-1.40.5/tests/m_lazy_resize/expect.1
===================================================================
--- /dev/null
+++ e2fsprogs-1.40.5/tests/m_lazy_resize/expect.1
@@ -0,0 +1,166 @@
+Filesystem label=
+OS type: Linux
+Block size=1024 (log=0)
+Fragment size=1024 (log=0)
+32768 inodes, 131072 blocks
+6553 blocks (5.00%) reserved for the super user
+First data block=1
+Maximum filesystem blocks=67371008
+16 block groups
+8192 blocks per group, 8192 fragments per group
+2048 inodes per group
+Superblock backups stored on blocks:
+ 8193, 24577, 40961, 57345, 73729
+
+Writing inode tables: done
+Writing superblocks and filesystem accounting information: done
+
+Filesystem features: ext_attr resize_inode dir_index lazy_bg filetype sparse_super
+
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+test_filesys: 28683/32768 files (0.0% non-contiguous), 77097/131072 blocks
+Exit status is 0
+
+Filesystem volume name: <none>
+Last mounted on: <not available>
+Filesystem magic number: 0xEF53
+Filesystem revision #: 1 (dynamic)
+Filesystem features: ext_attr resize_inode dir_index lazy_bg filetype sparse_super
+Default mount options: (none)
+Filesystem state: clean
+Errors behavior: Continue
+Filesystem OS type: Linux
+Inode count: 32768
+Block count: 131072
+Reserved block count: 6553
+Free blocks: 53975
+Free inodes: 4085
+First block: 1
+Block size: 1024
+Fragment size: 1024
+Reserved GDT blocks: 256
+Blocks per group: 8192
+Fragments per group: 8192
+Inodes per group: 2048
+Inode blocks per group: 256
+Mount count: 0
+Check interval: 15552000 (6 months)
+Reserved blocks uid: 0
+Reserved blocks gid: 0
+First inode: 11
+Inode size: 128
+Default directory hash: tea
+
+
+Group 0: (Blocks 1-8192)
+ Primary superblock at 1, Group descriptors at 2-2
+ Reserved GDT blocks at 3-258
+ Block bitmap at 259 (+258), Inode bitmap at 260 (+259)
+ Inode table at 261-516 (+260)
+ 7662 free blocks, 2037 free inodes, 2 directories
+ Free blocks: 531-8192
+ Free inodes: 12-2048
+Group 1: (Blocks 8193-16384) [Inode not init]
+ Backup superblock at 8193, Group descriptors at 8194-8194
+ Reserved GDT blocks at 8195-8450
+ Block bitmap at 8451 (+258), Inode bitmap at 8452 (+259)
+ Inode table at 8453-8708 (+260)
+ 7676 free blocks, 0 free inodes, 0 directories
+ Free blocks: 8709-16384
+ Free inodes:
+Group 2: (Blocks 16385-24576) [Inode not init, Block not init]
+ Block bitmap at 16385 (+0), Inode bitmap at 16386 (+1)
+ Inode table at 16387-16642 (+2)
+ 0 free blocks, 0 free inodes, 0 directories
+ Free blocks:
+ Free inodes:
+Group 3: (Blocks 24577-32768) [Inode not init]
+ Backup superblock at 24577, Group descriptors at 24578-24578
+ Reserved GDT blocks at 24579-24834
+ Block bitmap at 24835 (+258), Inode bitmap at 24836 (+259)
+ Inode table at 24837-25092 (+260)
+ 7676 free blocks, 0 free inodes, 0 directories
+ Free blocks: 25093-32768
+ Free inodes:
+Group 4: (Blocks 32769-40960) [Inode not init, Block not init]
+ Block bitmap at 32769 (+0), Inode bitmap at 32770 (+1)
+ Inode table at 32771-33026 (+2)
+ 0 free blocks, 0 free inodes, 0 directories
+ Free blocks:
+ Free inodes:
+Group 5: (Blocks 40961-49152) [Inode not init]
+ Backup superblock at 40961, Group descriptors at 40962-40962
+ Reserved GDT blocks at 40963-41218
+ Block bitmap at 41219 (+258), Inode bitmap at 41220 (+259)
+ Inode table at 41221-41476 (+260)
+ 7676 free blocks, 0 free inodes, 0 directories
+ Free blocks: 41477-49152
+ Free inodes:
+Group 6: (Blocks 49153-57344) [Inode not init, Block not init]
+ Block bitmap at 49153 (+0), Inode bitmap at 49154 (+1)
+ Inode table at 49155-49410 (+2)
+ 0 free blocks, 0 free inodes, 0 directories
+ Free blocks:
+ Free inodes:
+Group 7: (Blocks 57345-65536) [Inode not init]
+ Backup superblock at 57345, Group descriptors at 57346-57346
+ Reserved GDT blocks at 57347-57602
+ Block bitmap at 57603 (+258), Inode bitmap at 57604 (+259)
+ Inode table at 57605-57860 (+260)
+ 7676 free blocks, 0 free inodes, 0 directories
+ Free blocks: 57861-65536
+ Free inodes:
+Group 8: (Blocks 65537-73728) [Inode not init, Block not init]
+ Block bitmap at 65537 (+0), Inode bitmap at 65538 (+1)
+ Inode table at 65539-65794 (+2)
+ 0 free blocks, 0 free inodes, 0 directories
+ Free blocks:
+ Free inodes:
+Group 9: (Blocks 73729-81920) [Inode not init]
+ Backup superblock at 73729, Group descriptors at 73730-73730
+ Reserved GDT blocks at 73731-73986
+ Block bitmap at 73987 (+258), Inode bitmap at 73988 (+259)
+ Inode table at 73989-74244 (+260)
+ 7676 free blocks, 0 free inodes, 0 directories
+ Free blocks: 74245-81920
+ Free inodes:
+Group 10: (Blocks 81921-90112) [Inode not init, Block not init]
+ Block bitmap at 81921 (+0), Inode bitmap at 81922 (+1)
+ Inode table at 81923-82178 (+2)
+ 0 free blocks, 0 free inodes, 0 directories
+ Free blocks:
+ Free inodes:
+Group 11: (Blocks 90113-98304) [Inode not init, Block not init]
+ Block bitmap at 90113 (+0), Inode bitmap at 90114 (+1)
+ Inode table at 90115-90370 (+2)
+ 0 free blocks, 0 free inodes, 0 directories
+ Free blocks:
+ Free inodes:
+Group 12: (Blocks 98305-106496) [Inode not init, Block not init]
+ Block bitmap at 98305 (+0), Inode bitmap at 98306 (+1)
+ Inode table at 98307-98562 (+2)
+ 0 free blocks, 0 free inodes, 0 directories
+ Free blocks:
+ Free inodes:
+Group 13: (Blocks 106497-114688) [Inode not init, Block not init]
+ Block bitmap at 106497 (+0), Inode bitmap at 106498 (+1)
+ Inode table at 106499-106754 (+2)
+ 0 free blocks, 0 free inodes, 0 directories
+ Free blocks:
+ Free inodes:
+Group 14: (Blocks 114689-122880) [Inode not init, Block not init]
+ Block bitmap at 114689 (+0), Inode bitmap at 114690 (+1)
+ Inode table at 114691-114946 (+2)
+ 0 free blocks, 0 free inodes, 0 directories
+ Free blocks:
+ Free inodes:
+Group 15: (Blocks 122881-131071)
+ Block bitmap at 122881 (+0), Inode bitmap at 122882 (+1)
+ Inode table at 122883-123138 (+2)
+ 7933 free blocks, 2048 free inodes, 0 directories
+ Free blocks: 123139-131071
+ Free inodes: 30721-32768
Index: e2fsprogs-1.40.5/tests/m_lazy_resize/script
===================================================================
--- /dev/null
+++ e2fsprogs-1.40.5/tests/m_lazy_resize/script
@@ -0,0 +1,4 @@
+DESCRIPTION="lazy group feature with resize_inode"
+FS_SIZE=131072
+MKE2FS_OPTS="-O resize_inode,lazy_bg"
+. $cmd_dir/run_mke2fs
Index: e2fsprogs-1.40.5/tests/m_raid_opt/expect.1
===================================================================
--- e2fsprogs-1.40.5.orig/tests/m_raid_opt/expect.1
+++ e2fsprogs-1.40.5/tests/m_raid_opt/expect.1
@@ -46,57 +46,68 @@ Setting filetype for entry '..' in ??? (
Directory inode 11, block 1, offset 0: directory corrupted
Salvage? yes

-Entry '' in ??? (11) has deleted/unused inode 1063. Clear? yes
+Entry '' in ??? (11) has a zero-length name.
+Clear? yes

Directory inode 11, block 2, offset 0: directory corrupted
Salvage? yes

-Entry '' in ??? (11) has deleted/unused inode 1064. Clear? yes
+Entry '' in ??? (11) has a zero-length name.
+Clear? yes

Directory inode 11, block 3, offset 0: directory corrupted
Salvage? yes

-Entry '' in ??? (11) has deleted/unused inode 1065. Clear? yes
+Entry '' in ??? (11) has a zero-length name.
+Clear? yes

Directory inode 11, block 4, offset 0: directory corrupted
Salvage? yes

-Entry '' in ??? (11) has deleted/unused inode 1066. Clear? yes
+Entry '' in ??? (11) has a zero-length name.
+Clear? yes

Directory inode 11, block 5, offset 0: directory corrupted
Salvage? yes

-Entry '' in ??? (11) has deleted/unused inode 1067. Clear? yes
+Entry '' in ??? (11) has a zero-length name.
+Clear? yes

Directory inode 11, block 6, offset 0: directory corrupted
Salvage? yes

-Entry '' in ??? (11) has deleted/unused inode 1068. Clear? yes
+Entry '' in ??? (11) has a zero-length name.
+Clear? yes

Directory inode 11, block 7, offset 0: directory corrupted
Salvage? yes

-Entry '' in ??? (11) has deleted/unused inode 1069. Clear? yes
+Entry '' in ??? (11) has a zero-length name.
+Clear? yes

Directory inode 11, block 8, offset 0: directory corrupted
Salvage? yes

-Entry '' in ??? (11) has deleted/unused inode 1070. Clear? yes
+Entry '' in ??? (11) has a zero-length name.
+Clear? yes

Directory inode 11, block 9, offset 0: directory corrupted
Salvage? yes

-Entry '' in ??? (11) has deleted/unused inode 1071. Clear? yes
+Entry '' in ??? (11) has a zero-length name.
+Clear? yes

Directory inode 11, block 10, offset 0: directory corrupted
Salvage? yes

-Entry '' in ??? (11) has deleted/unused inode 1072. Clear? yes
+Entry '' in ??? (11) has a zero-length name.
+Clear? yes

Directory inode 11, block 11, offset 0: directory corrupted
Salvage? yes

-Entry '' in ??? (11) has deleted/unused inode 1073. Clear? yes
+Entry '' in ??? (11) has a zero-length name.
+Clear? yes

Pass 3: Checking directory connectivity
'..' in / (2) is <The NULL inode> (0), should be / (2).
Index: e2fsprogs-1.40.5/tests/m_uninit/expect.1
===================================================================
--- /dev/null
+++ e2fsprogs-1.40.5/tests/m_uninit/expect.1
@@ -0,0 +1,166 @@
+Filesystem label=
+OS type: Linux
+Block size=1024 (log=0)
+Fragment size=1024 (log=0)
+32768 inodes, 131072 blocks
+6553 blocks (5.00%) reserved for the super user
+First data block=1
+Maximum filesystem blocks=67371008
+16 block groups
+8192 blocks per group, 8192 fragments per group
+2048 inodes per group
+Superblock backups stored on blocks:
+ 8193, 24577, 40961, 57345, 73729
+
+Writing inode tables: done
+Writing superblocks and filesystem accounting information: done
+
+Filesystem features: ext_attr resize_inode dir_index filetype sparse_super uninit_groups
+
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+test_filesys: 11/32768 files (9.1% non-contiguous), 5691/131072 blocks
+Exit status is 0
+
+Filesystem volume name: <none>
+Last mounted on: <not available>
+Filesystem magic number: 0xEF53
+Filesystem revision #: 1 (dynamic)
+Filesystem features: ext_attr resize_inode dir_index filetype sparse_super uninit_groups
+Default mount options: (none)
+Filesystem state: clean
+Errors behavior: Continue
+Filesystem OS type: Linux
+Inode count: 32768
+Block count: 131072
+Reserved block count: 6553
+Free blocks: 125381
+Free inodes: 32757
+First block: 1
+Block size: 1024
+Fragment size: 1024
+Reserved GDT blocks: 256
+Blocks per group: 8192
+Fragments per group: 8192
+Inodes per group: 2048
+Inode blocks per group: 256
+Mount count: 0
+Check interval: 15552000 (6 months)
+Reserved blocks uid: 0
+Reserved blocks gid: 0
+First inode: 11
+Inode size: 128
+Default directory hash: tea
+
+
+Group 0: (Blocks 1-8192)
+ Primary superblock at 1, Group descriptors at 2-2
+ Reserved GDT blocks at 3-258
+ Block bitmap at 259 (+258), Inode bitmap at 260 (+259)
+ Inode table at 261-516 (+260)
+ 7662 free blocks, 2037 free inodes, 2 directories, 2037 unused inodes
+ Free blocks: 531-8192
+ Free inodes: 12-2048
+Group 1: (Blocks 8193-16384) [Inode not init]
+ Backup superblock at 8193, Group descriptors at 8194-8194
+ Reserved GDT blocks at 8195-8450
+ Block bitmap at 8451 (+258), Inode bitmap at 8452 (+259)
+ Inode table at 8453-8708 (+260)
+ 7676 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
+ Free blocks: 8709-16384
+ Free inodes:
+Group 2: (Blocks 16385-24576) [Inode not init, Block not init]
+ Block bitmap at 16385 (+0), Inode bitmap at 16386 (+1)
+ Inode table at 16387-16642 (+2)
+ 7934 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
+ Free blocks:
+ Free inodes:
+Group 3: (Blocks 24577-32768) [Inode not init]
+ Backup superblock at 24577, Group descriptors at 24578-24578
+ Reserved GDT blocks at 24579-24834
+ Block bitmap at 24835 (+258), Inode bitmap at 24836 (+259)
+ Inode table at 24837-25092 (+260)
+ 7676 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
+ Free blocks: 25093-32768
+ Free inodes:
+Group 4: (Blocks 32769-40960) [Inode not init, Block not init]
+ Block bitmap at 32769 (+0), Inode bitmap at 32770 (+1)
+ Inode table at 32771-33026 (+2)
+ 7934 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
+ Free blocks:
+ Free inodes:
+Group 5: (Blocks 40961-49152) [Inode not init]
+ Backup superblock at 40961, Group descriptors at 40962-40962
+ Reserved GDT blocks at 40963-41218
+ Block bitmap at 41219 (+258), Inode bitmap at 41220 (+259)
+ Inode table at 41221-41476 (+260)
+ 7676 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
+ Free blocks: 41477-49152
+ Free inodes:
+Group 6: (Blocks 49153-57344) [Inode not init, Block not init]
+ Block bitmap at 49153 (+0), Inode bitmap at 49154 (+1)
+ Inode table at 49155-49410 (+2)
+ 7934 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
+ Free blocks:
+ Free inodes:
+Group 7: (Blocks 57345-65536) [Inode not init]
+ Backup superblock at 57345, Group descriptors at 57346-57346
+ Reserved GDT blocks at 57347-57602
+ Block bitmap at 57603 (+258), Inode bitmap at 57604 (+259)
+ Inode table at 57605-57860 (+260)
+ 7676 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
+ Free blocks: 57861-65536
+ Free inodes:
+Group 8: (Blocks 65537-73728) [Inode not init, Block not init]
+ Block bitmap at 65537 (+0), Inode bitmap at 65538 (+1)
+ Inode table at 65539-65794 (+2)
+ 7934 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
+ Free blocks:
+ Free inodes:
+Group 9: (Blocks 73729-81920) [Inode not init]
+ Backup superblock at 73729, Group descriptors at 73730-73730
+ Reserved GDT blocks at 73731-73986
+ Block bitmap at 73987 (+258), Inode bitmap at 73988 (+259)
+ Inode table at 73989-74244 (+260)
+ 7676 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
+ Free blocks: 74245-81920
+ Free inodes:
+Group 10: (Blocks 81921-90112) [Inode not init, Block not init]
+ Block bitmap at 81921 (+0), Inode bitmap at 81922 (+1)
+ Inode table at 81923-82178 (+2)
+ 7934 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
+ Free blocks:
+ Free inodes:
+Group 11: (Blocks 90113-98304) [Inode not init, Block not init]
+ Block bitmap at 90113 (+0), Inode bitmap at 90114 (+1)
+ Inode table at 90115-90370 (+2)
+ 7934 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
+ Free blocks:
+ Free inodes:
+Group 12: (Blocks 98305-106496) [Inode not init, Block not init]
+ Block bitmap at 98305 (+0), Inode bitmap at 98306 (+1)
+ Inode table at 98307-98562 (+2)
+ 7934 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
+ Free blocks:
+ Free inodes:
+Group 13: (Blocks 106497-114688) [Inode not init, Block not init]
+ Block bitmap at 106497 (+0), Inode bitmap at 106498 (+1)
+ Inode table at 106499-106754 (+2)
+ 7934 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
+ Free blocks:
+ Free inodes:
+Group 14: (Blocks 114689-122880) [Inode not init, Block not init]
+ Block bitmap at 114689 (+0), Inode bitmap at 114690 (+1)
+ Inode table at 114691-114946 (+2)
+ 7934 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
+ Free blocks:
+ Free inodes:
+Group 15: (Blocks 122881-131071) [Inode not init]
+ Block bitmap at 122881 (+0), Inode bitmap at 122882 (+1)
+ Inode table at 122883-123138 (+2)
+ 7933 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
+ Free blocks: 123139-131071
+ Free inodes:
Index: e2fsprogs-1.40.5/tests/m_uninit/script
===================================================================
--- /dev/null
+++ e2fsprogs-1.40.5/tests/m_uninit/script
@@ -0,0 +1,4 @@
+DESCRIPTION="uninitialized group feature"
+FS_SIZE=131072
+MKE2FS_OPTS="-O uninit_groups"
+. $cmd_dir/run_mke2fs
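
A note on the expected outputs above: the "Superblock backups stored on blocks" line and the per-group block ranges follow from simple arithmetic. The sketch below is only an illustration, not code from this patch. It assumes the usual sparse_super rule (a superblock copy in groups 0 and 1 and in groups that are powers of 3, 5 or 7); the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if n is a positive integer power of base (base^0 == 1 counts). */
static int is_power_of(uint32_t n, uint32_t base)
{
    assert(base > 1);
    if (n == 0)
        return 0;
    while (n % base == 0)
        n /= base;
    return n == 1;
}

/* Assumed sparse_super rule: groups 0 and 1, then powers of 3, 5 and 7. */
static int group_has_superblock(uint32_t group)
{
    if (group <= 1)
        return 1;
    return is_power_of(group, 3) || is_power_of(group, 5) ||
           is_power_of(group, 7);
}

/* First block of a group, as printed in "Group N: (Blocks first-last)". */
static uint32_t group_first_block(uint32_t group, uint32_t first_data_block,
                                  uint32_t blocks_per_group)
{
    return first_data_block + group * blocks_per_group;
}
```

With first_data_block = 1 and 8192 blocks per group, as in these tests, the 16-group filesystem has superblock copies in groups 0, 1, 3, 5, 7 and 9, and group 9 starts at block 73729, which is the last backup listed.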

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2008-02-02 08:36:14

by Andreas Dilger

Subject: Re: [PATCH][11/28] e2fsprogs-nlinks-flag.patch


If any directory has more than 65000 subdirectories, enable the
DIR_NLINK feature in the superblock. If a directory formerly had
more than 65000 subdirectories (i_links_count == 1) but no longer
does, do not report this to the user as an error; silently set the
link count to the value actually counted.

If DIR_NLINK is already set but no many-subdir directories are found, the
feature is left enabled, so that the kernel is not required to enable it
on-the-fly. The admin should set it with tune2fs instead.
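
The rule described above can be sketched in isolation. This is a minimal illustration, not the e2fsck code itself: LINK_MAX stands in for EXT2_LINK_MAX (assumed to be 65000, matching the description), RO_COMPAT_DIR_NLINK stands in for EXT4_FEATURE_RO_COMPAT_DIR_NLINK (value assumed to be 0x0020), and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINK_MAX            65000u  /* assumed stand-in for EXT2_LINK_MAX */
#define RO_COMPAT_DIR_NLINK 0x0020u /* assumed stand-in for the feature bit */

/*
 * A stored link count of 1 on a directory means "too many subdirectories
 * to count"; it only describes a real many-subdir directory if the
 * counted links actually exceed the limit.
 */
static int is_many_subdirs(uint16_t link_count, uint32_t link_counted)
{
    return link_count == 1 && link_counted > LINK_MAX;
}

/*
 * Walk (stored, counted) pairs and return the updated ro_compat mask.
 * The bit is set if any directory qualifies and is never cleared here.
 */
static uint32_t update_ro_compat(uint32_t ro_compat,
                                 const uint16_t *stored,
                                 const uint32_t *counted, size_t n)
{
    uint32_t many_subdirs = 0;
    size_t i;

    assert(n == 0 || (stored != NULL && counted != NULL));
    for (i = 0; i < n; i++)
        if (is_many_subdirs(stored[i], counted[i]))
            many_subdirs++;

    if (many_subdirs)
        ro_compat |= RO_COMPAT_DIR_NLINK;
    return ro_compat;
}
```

Unlike the real patch, this sketch sets the bit unconditionally; in e2fsck the change goes through fix_problem() so the user can decline it.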

Index: e2fsprogs-1.40.5/e2fsck/pass4.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/pass4.c
+++ e2fsprogs-1.40.5/e2fsck/pass4.c
@@ -101,6 +101,7 @@ void e2fsck_pass4(e2fsck_t ctx)
struct problem_context pctx;
__u16 link_count;
__u32 link_counted;
+ __u32 many_subdirs = 0;
char *buf = 0;
int group, maxgroup;

@@ -182,7 +183,20 @@ void e2fsck_pass4(e2fsck_t ctx)
e2fsck_write_inode(ctx, i, inode, "pass4");
}
}
+ if (link_count == 1 && link_counted > EXT2_LINK_MAX)
+ many_subdirs++;
}
+
+ if (many_subdirs) {
+ if (!(fs->super->s_feature_ro_compat &
+ EXT4_FEATURE_RO_COMPAT_DIR_NLINK) &&
+ fix_problem(ctx, PR_4_FEATURE_DIR_NLINK, &pctx)) {
+ fs->super->s_feature_ro_compat |=
+ EXT4_FEATURE_RO_COMPAT_DIR_NLINK;
+ ext2fs_mark_super_dirty(fs);
+ }
+ }
+
ext2fs_free_icount(ctx->inode_link_info); ctx->inode_link_info = 0;
ext2fs_free_icount(ctx->inode_count); ctx->inode_count = 0;
ext2fs_free_inode_bitmap(ctx->inode_bb_map);
Index: e2fsprogs-1.40.5/e2fsck/problem.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/problem.c
+++ e2fsprogs-1.40.5/e2fsck/problem.c
@@ -1371,6 +1371,11 @@ static struct e2fsck_problem problem_tab
"They @s the same!\n"),
PROMPT_NONE, 0 },

+ /* DIR_NLINK flag not set but dirs with > 65000 subdirs found */
+ { PR_4_FEATURE_DIR_NLINK,
+ N_("@f has @d with > 65000 subdirs, but no DIR_NLINK flag in @S.\n"),
+ PROMPT_FIX, 0 },
+
/* Pass 5 errors */

/* Pass 5: Checking group summary information */
Index: e2fsprogs-1.40.5/e2fsck/problem.h
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/problem.h
+++ e2fsprogs-1.40.5/e2fsck/problem.h
@@ -824,6 +824,10 @@ struct problem_context {
/* Inconsistent inode count information cached */
#define PR_4_INCONSISTENT_COUNT 0x040004

+/* Directory with > EXT2_LINK_MAX subdirs found but
+ * EXT4_FEATURE_RO_COMPAT_DIR_NLINK flag is not set */
+#define PR_4_FEATURE_DIR_NLINK 0x040005
+
/*
* Pass 5 errors
*/
Index: e2fsprogs-1.40.5/misc/tune2fs.8.in
===================================================================
--- e2fsprogs-1.40.5.orig/misc/tune2fs.8.in
+++ e2fsprogs-1.40.5/misc/tune2fs.8.in
@@ -400,6 +400,10 @@ The following filesystem features can be
.B dir_index
Use hashed b-trees to speed up lookups in large directories.
.TP
+.B dir_nlink
+Allow directories to have more than 65000 subdirectories (read-only
+compatible).
+.TP
.B filetype
Store file type information in directory entries.
.TP
Index: e2fsprogs-1.40.5/misc/tune2fs.c
===================================================================
--- e2fsprogs-1.40.5.orig/misc/tune2fs.c
+++ e2fsprogs-1.40.5/misc/tune2fs.c
@@ -100,7 +100,8 @@ static __u32 ok_features[3] = {
EXT2_FEATURE_INCOMPAT_FILETYPE| /* Incompat */
EXT4_FEATURE_INCOMPAT_FLEX_BG,
EXT2_FEATURE_RO_COMPAT_SPARSE_SUPER | /* R/O compat */
- EXT4_FEATURE_RO_COMPAT_GDT_CSUM
+ EXT4_FEATURE_RO_COMPAT_GDT_CSUM |
+ EXT4_FEATURE_RO_COMPAT_DIR_NLINK,
};

/*
@@ -286,6 +287,7 @@ static void update_feature_set(ext2_fils
int sparse, old_sparse, filetype, old_filetype;
int journal, old_journal, dxdir, old_dxdir;
int flex_bg, old_flex_bg;
+ int dir_nlink, old_dir_nlink;
struct ext2_super_block *sb= fs->super;
__u32 old_compat, old_incompat, old_ro_compat;

@@ -295,6 +297,8 @@ static void update_feature_set(ext2_fils

old_sparse = sb->s_feature_ro_compat &
EXT2_FEATURE_RO_COMPAT_SPARSE_SUPER;
+ old_dir_nlink = sb->s_feature_ro_compat &
+ EXT4_FEATURE_RO_COMPAT_DIR_NLINK;
old_filetype = sb->s_feature_incompat &
EXT2_FEATURE_INCOMPAT_FILETYPE;
old_flex_bg = sb->s_feature_incompat &
@@ -311,6 +315,8 @@ static void update_feature_set(ext2_fils
}
sparse = sb->s_feature_ro_compat &
EXT2_FEATURE_RO_COMPAT_SPARSE_SUPER;
+ dir_nlink = sb->s_feature_ro_compat &
+ EXT4_FEATURE_RO_COMPAT_DIR_NLINK;
filetype = sb->s_feature_incompat &
EXT2_FEATURE_INCOMPAT_FILETYPE;
flex_bg = sb->s_feature_incompat &
@@ -359,6 +365,14 @@ static void update_feature_set(ext2_fils
if (uuid_is_null((unsigned char *) sb->s_hash_seed))
uuid_generate((unsigned char *) sb->s_hash_seed);
}
+
+ if (old_dir_nlink && !dir_nlink) {
+ fputs(_("The dir_nlink flag was cleared. "
+ "Please run e2fsck before using the filesystem\n"
+ "to verify no many-linked directories exist or "
+ "data loss may result.\n"), stderr);
+ }
+
if (!flex_bg && old_flex_bg) {
if (ext2fs_check_desc(fs)) {
fputs(_("Clearing the flex_bg flag would "

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2008-02-02 08:36:59

by Andreas Dilger

Subject: Re: [PATCH][12/28] e2fsprogs-expand-extra-isize.patch


This patch adds a "-E expand_extra_isize" option that ensures _every_
used inode has i_extra_isize >= s_min_extra_isize when s_min_extra_isize
is set. Otherwise, it ensures that i_extra_isize of every inode equals
sizeof(ext2_inode_large) - 128.

This is useful for the case where nanosecond timestamps or 64-bit inode
version fields are required for all inodes in the filesystem.
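
As a rough illustration of the size rule above, the target value could be computed as below. This is a sketch under stated assumptions, not the patch's implementation: the helper names are hypothetical, the struct size is passed in rather than taken from sizeof(struct ext2_inode_large), and both cases are treated as a minimum for simplicity.

```c
#include <assert.h>
#include <stdint.h>

#define GOOD_OLD_INODE_SIZE 128u /* mirrors EXT2_GOOD_OLD_INODE_SIZE */

/*
 * Hypothetical helper: the i_extra_isize the option aims for.
 * If the superblock minimum is set, use it; otherwise use the part of
 * the large inode structure that lies beyond the original 128 bytes.
 */
static uint16_t wanted_extra_isize(uint16_t s_min_extra_isize,
                                   uint16_t large_inode_struct_size)
{
    assert(large_inode_struct_size >= GOOD_OLD_INODE_SIZE);
    if (s_min_extra_isize)
        return s_min_extra_isize;
    return (uint16_t)(large_inode_struct_size - GOOD_OLD_INODE_SIZE);
}

/* Hypothetical helper: does this inode fall short of the target? */
static int inode_needs_expansion(uint16_t i_extra_isize, uint16_t wanted)
{
    return i_extra_isize < wanted;
}
```

For example, with an illustrative structure size of 156 bytes and no s_min_extra_isize, the target is 28, so an inode with i_extra_isize of 4 would need expanding while one already at 28 would not.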

Signed-off-by: Andreas Dilger <[email protected]>
Signed-off-by: Kalpak Shah <[email protected]>

Index: e2fsprogs-1.40.5/lib/ext2fs/ext_attr.c
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/ext_attr.c
+++ e2fsprogs-1.40.5/lib/ext2fs/ext_attr.c
@@ -17,6 +17,7 @@
#endif
#include <string.h>
#include <time.h>
+#include <errno.h>

#include "ext2_fs.h"
#include "ext2_ext_attr.h"
@@ -60,11 +61,39 @@ __u32 ext2fs_ext_attr_hash_entry(struct
#undef NAME_HASH_SHIFT
#undef VALUE_HASH_SHIFT

+#define BLOCK_HASH_SHIFT 16
+/*
+ * Re-compute the extended attribute hash value after an entry has changed.
+ */
+static void ext2fs_attr_rehash(struct ext2_ext_attr_header *header,
+ struct ext2_ext_attr_entry *entry)
+{
+ struct ext2_ext_attr_entry *here;
+ __u32 hash = 0;
+
+ entry->e_hash = ext2fs_ext_attr_hash_entry(entry, (char *) header +
+ entry->e_value_offs);
+
+ here = ENTRY(header+1);
+ while (!EXT2_EXT_IS_LAST_ENTRY(here)) {
+ if (!here->e_hash) {
+ /* Block is not shared if an entry's hash value == 0 */
+ hash = 0;
+ break;
+ }
+ hash = (hash << BLOCK_HASH_SHIFT) ^
+ (hash >> (8*sizeof(hash) - BLOCK_HASH_SHIFT)) ^
+ here->e_hash;
+ here = EXT2_EXT_ATTR_NEXT(here);
+ }
+ header->h_hash = hash;
+}
+
errcode_t ext2fs_read_ext_attr(ext2_filsys fs, blk_t block, void *buf)
{
errcode_t retval;

- retval = io_channel_read_blk(fs->io, block, 1, buf);
+ retval = io_channel_read_blk(fs->io, block, 1, buf);
if (retval)
return retval;
#ifdef WORDS_BIGENDIAN
@@ -88,7 +117,7 @@ errcode_t ext2fs_write_ext_attr(ext2_fil
#else
write_buf = (char *) inbuf;
#endif
- retval = io_channel_write_blk(fs->io, block, 1, write_buf);
+ retval = io_channel_write_blk(fs->io, block, 1, write_buf);
if (buf)
ext2fs_free_mem(&buf);
if (!retval)
@@ -122,7 +151,10 @@ errcode_t ext2fs_adjust_ea_refcount(ext2
if (retval)
goto errout;

- header = (struct ext2_ext_attr_header *) block_buf;
+ header = BHDR(block_buf);
+ if (header->h_magic != EXT2_EXT_ATTR_MAGIC)
+ return EXT2_ET_EA_BAD_MAGIC;
+
header->h_refcount += adjust;
if (newcount)
*newcount = header->h_refcount;
@@ -136,3 +168,881 @@ errout:
ext2fs_free_mem(&buf);
return retval;
}
+
+struct ext2_attr_info {
+ int name_index;
+ const char *name;
+ const char *value;
+ int value_len;
+};
+
+struct ext2_attr_search {
+ struct ext2_ext_attr_entry *first;
+ char *base;
+ char *end;
+ struct ext2_ext_attr_entry *here;
+ int not_found;
+};
+
+struct ext2_attr_ibody_find {
+ ext2_ino_t ino;
+ struct ext2_attr_search s;
+};
+
+struct ext2_attr_block_find {
+ struct ext2_attr_search s;
+ char *block;
+};
+
+void ext2fs_attr_shift_entries(struct ext2_ext_attr_entry *entry,
+ int value_offs_shift, char *to,
+ char *from, int n)
+{
+ struct ext2_ext_attr_entry *last = entry;
+
+ /* Adjust the value offsets of the entries */
+ for (; !EXT2_EXT_IS_LAST_ENTRY(last); last = EXT2_EXT_ATTR_NEXT(last)) {
+ if (!last->e_value_block && last->e_value_size) {
+ last->e_value_offs = last->e_value_offs +
+ value_offs_shift;
+ }
+ }
+ /* Shift the entries by n bytes */
+ memmove(to, from, n);
+}
+
+/*
+ * This function returns the free space present in the inode or the EA block.
+ * total is number of bytes taken up by the EA entries and is used to shift the
+ * EAs in ext2fs_expand_extra_isize().
+ */
+int ext2fs_attr_free_space(struct ext2_ext_attr_entry *last,
+ int *min_offs, char *base, int *total)
+{
+ for (; !EXT2_EXT_IS_LAST_ENTRY(last); last = EXT2_EXT_ATTR_NEXT(last)) {
+ *total += EXT2_EXT_ATTR_LEN(last->e_name_len);
+ if (!last->e_value_block && last->e_value_size) {
+ int offs = last->e_value_offs;
+ if (offs < *min_offs)
+ *min_offs = offs;
+ }
+ }
+
+ return (*min_offs - ((char *)last - base) - sizeof(__u32));
+}
+
+static errcode_t ext2fs_attr_check_names(struct ext2_ext_attr_entry *entry,
+ char *end)
+{
+ while (!EXT2_EXT_IS_LAST_ENTRY(entry)) {
+ struct ext2_ext_attr_entry *next = EXT2_EXT_ATTR_NEXT(entry);
+ if ((char *)next >= end)
+ return EXT2_ET_EA_BAD_ENTRIES;
+ entry = next;
+ }
+ return 0;
+}
+
+static errcode_t ext2fs_attr_find_entry(struct ext2_ext_attr_entry **pentry,
+ int name_index, const char *name,
+ int size, int sorted)
+{
+ struct ext2_ext_attr_entry *entry;
+ int name_len;
+ int cmp = 1;
+
+ if (name == NULL)
+ return EXT2_ET_EA_BAD_NAME;
+
+ name_len = strlen(name);
+ entry = *pentry;
+ for (; !EXT2_EXT_IS_LAST_ENTRY(entry);
+ entry = EXT2_EXT_ATTR_NEXT(entry)) {
+ cmp = name_index - entry->e_name_index;
+ if (!cmp)
+ cmp = name_len - entry->e_name_len;
+ if (!cmp)
+ cmp = memcmp(name, entry->e_name, name_len);
+ if (cmp <= 0 && (sorted || cmp == 0))
+ break;
+ }
+ *pentry = entry;
+
+ return cmp ? EXT2_ET_EA_NAME_NOT_FOUND : 0;
+}
+
+static errcode_t ext2fs_attr_block_find(ext2_filsys fs,struct ext2_inode *inode,
+ struct ext2_attr_info *i,
+ struct ext2_attr_block_find *bs)
+{
+ struct ext2_ext_attr_header *header;
+ errcode_t error;
+
+ if (inode->i_file_acl) {
+ /* The inode already has an extended attribute block. */
+ error = ext2fs_get_mem(fs->blocksize, &bs->block);
+ if (error)
+ return error;
+ error = ext2fs_read_ext_attr(fs, inode->i_file_acl, bs->block);
+ if (error)
+ goto cleanup;
+
+ header = BHDR(bs->block);
+ if (header->h_magic != EXT2_EXT_ATTR_MAGIC) {
+ error = EXT2_ET_EA_BAD_MAGIC;
+ goto cleanup;
+ }
+
+ /* Find the named attribute. */
+ bs->s.base = bs->block;
+ bs->s.first = (struct ext2_ext_attr_entry *)(header + 1);
+ bs->s.end = bs->block + fs->blocksize;
+ bs->s.here = bs->s.first;
+ error = ext2fs_attr_find_entry(&bs->s.here, i->name_index,
+ i->name, fs->blocksize, 1);
+ if (error && error != EXT2_ET_EA_NAME_NOT_FOUND)
+ goto cleanup;
+ bs->s.not_found = error;
+ }
+ error = 0;
+
+cleanup:
+ if (error && bs->block)
+ ext2fs_free_mem(&bs->block);
+ return error;
+}
+
+static errcode_t ext2fs_attr_ibody_find(ext2_filsys fs,
+ struct ext2_inode_large *inode,
+ struct ext2_attr_info *i,
+ struct ext2_attr_ibody_find *is)
+{
+ __u32 *eamagic;
+ char *start;
+ errcode_t error;
+
+ if (EXT2_INODE_SIZE(fs->super) == EXT2_GOOD_OLD_INODE_SIZE)
+ return 0;
+
+ if (inode->i_extra_isize == 0)
+ return 0;
+ eamagic = IHDR(inode);
+
+ start = (char *) inode + EXT2_GOOD_OLD_INODE_SIZE +
+ inode->i_extra_isize + sizeof(__u32);
+ is->s.first = (struct ext2_ext_attr_entry *) start;
+ is->s.base = start;
+ is->s.here = is->s.first;
+ is->s.end = (char *) inode + EXT2_INODE_SIZE(fs->super);
+ if (*eamagic == EXT2_EXT_ATTR_MAGIC) {
+ error = ext2fs_attr_check_names((struct ext2_ext_attr_entry *)
+ start, is->s.end);
+ if (error)
+ return error;
+ /* Find the named attribute. */
+ error = ext2fs_attr_find_entry(&is->s.here, i->name_index,
+ i->name, is->s.end -
+ (char *)is->s.base, 0);
+ if (error && error != EXT2_ET_EA_NAME_NOT_FOUND)
+ return error;
+ is->s.not_found = error;
+ }
+
+ return 0;
+}
+
+static errcode_t ext2fs_attr_set_entry(ext2_filsys fs, struct ext2_attr_info *i,
+ struct ext2_attr_search *s)
+{
+ struct ext2_ext_attr_entry *last;
+ int free, min_offs = s->end - s->base, name_len = strlen(i->name);
+
+ /* Compute min_offs and last. */
+ for (last = s->first; !EXT2_EXT_IS_LAST_ENTRY(last);
+ last = EXT2_EXT_ATTR_NEXT(last)) {
+ if (!last->e_value_block && last->e_value_size) {
+ int offs = last->e_value_offs;
+
+ if (offs < min_offs)
+ min_offs = offs;
+ }
+ }
+ free = min_offs - ((char *)last - s->base) - sizeof(__u32);
+
+ if (!s->not_found) {
+ if (!s->here->e_value_block && s->here->e_value_size) {
+ int size = s->here->e_value_size;
+ free += EXT2_EXT_ATTR_SIZE(size);
+ }
+ free += EXT2_EXT_ATTR_LEN(name_len);
+ }
+ if (i->value) {
+ if (free < EXT2_EXT_ATTR_LEN(name_len) +
+ EXT2_EXT_ATTR_SIZE(i->value_len))
+ return EXT2_ET_EA_NO_SPACE;
+ }
+
+ if (i->value && s->not_found) {
+ /* Insert the new name. */
+ int size = EXT2_EXT_ATTR_LEN(name_len);
+ int rest = (char *)last - (char *)s->here + sizeof(__u32);
+
+ memmove((char *)s->here + size, s->here, rest);
+ memset(s->here, 0, size);
+ s->here->e_name_index = i->name_index;
+ s->here->e_name_len = name_len;
+ memcpy(s->here->e_name, i->name, name_len);
+ } else {
+ if (!s->here->e_value_block && s->here->e_value_size) {
+ char *first_val = s->base + min_offs;
+ int offs = s->here->e_value_offs;
+ char *val = s->base + offs;
+ int size = EXT2_EXT_ATTR_SIZE(s->here->e_value_size);
+
+ if (i->value &&
+ size == EXT2_EXT_ATTR_SIZE(i->value_len)) {
+ /* The old and the new value have the same
+ size. Just replace. */
+ s->here->e_value_size = i->value_len;
+ memset(val + size - EXT2_EXT_ATTR_PAD, 0,
+ EXT2_EXT_ATTR_PAD); /* Clear pad bytes */
+ memcpy(val, i->value, i->value_len);
+ return 0;
+ }
+
+ /* Remove the old value. */
+ memmove(first_val + size, first_val, val - first_val);
+ memset(first_val, 0, size);
+ s->here->e_value_size = 0;
+ s->here->e_value_offs = 0;
+ min_offs += size;
+
+ /* Adjust all value offsets. */
+ last = s->first;
+ while (!EXT2_EXT_IS_LAST_ENTRY(last)) {
+ int o = last->e_value_offs;
+
+ if (!last->e_value_block &&
+ last->e_value_size && o < offs)
+ last->e_value_offs = o + size;
+ last = EXT2_EXT_ATTR_NEXT(last);
+ }
+ }
+ if (!i->value) {
+ /* Remove the old name. */
+ int size = EXT2_EXT_ATTR_LEN(name_len);
+
+ last = ENTRY((char *)last - size);
+ memmove((char *)s->here, (char *)s->here + size,
+ (char *)last - (char *)s->here + sizeof(__u32));
+ memset(last, 0, size);
+ }
+ }
+
+ if (i->value) {
+ /* Insert the new value. */
+ s->here->e_value_size = i->value_len;
+ if (i->value_len) {
+ int size = EXT2_EXT_ATTR_SIZE(i->value_len);
+ char *val = s->base + min_offs - size;
+
+ s->here->e_value_offs = min_offs - size;
+ memset(val + size - EXT2_EXT_ATTR_PAD, 0,
+ EXT2_EXT_ATTR_PAD); /* Clear the pad bytes. */
+ memcpy(val, i->value, i->value_len);
+ }
+ }
+
+ return 0;
+}
+
+static errcode_t ext2fs_attr_block_set(ext2_filsys fs, struct ext2_inode *inode,
+ struct ext2_attr_info *i,
+ struct ext2_attr_block_find *bs)
+{
+ struct ext2_attr_search *s = &bs->s;
+ char *new_buf = NULL, *old_block = NULL;
+ blk_t blk;
+ int clear_flag = 0;
+ errcode_t error;
+
+ if (i->value && i->value_len > fs->blocksize)
+ return EXT2_ET_EA_NO_SPACE;
+
+ if (s->base) {
+ if (BHDR(s->base)->h_refcount != 1) {
+ int offset = (char *)s->here - bs->block;
+
+ /* Decrement the refcount of the shared block */
+ old_block = s->base;
+ BHDR(s->base)->h_refcount -= 1;
+
+ error = ext2fs_get_mem(fs->blocksize, &s->base);
+ if (error)
+ goto cleanup;
+ clear_flag = 1;
+ memcpy(s->base, bs->block, fs->blocksize);
+ s->first = ENTRY(BHDR(s->base)+1);
+ BHDR(s->base)->h_refcount = 1;
+ s->here = ENTRY(s->base + offset);
+ s->end = s->base + fs->blocksize;
+ }
+ } else {
+ error = ext2fs_get_mem(fs->blocksize, &s->base);
+ if (error)
+ goto cleanup;
+ clear_flag = 1;
+ memset(s->base, 0, fs->blocksize);
+ BHDR(s->base)->h_magic = EXT2_EXT_ATTR_MAGIC;
+ BHDR(s->base)->h_blocks = 1;
+ BHDR(s->base)->h_refcount = 1;
+ s->first = ENTRY(BHDR(s->base)+1);
+ s->here = ENTRY(BHDR(s->base)+1);
+ s->end = s->base + fs->blocksize;
+ }
+
+ error = ext2fs_attr_set_entry(fs, i, s);
+ if (error)
+ goto cleanup;
+
+ if (!EXT2_EXT_IS_LAST_ENTRY(s->first))
+ ext2fs_attr_rehash(BHDR(s->base), s->here);
+
+ if (!EXT2_EXT_IS_LAST_ENTRY(s->first)) {
+ if (bs->block && bs->block == s->base) {
+ /* We are modifying this block in-place */
+ new_buf = bs->block;
+ blk = inode->i_file_acl;
+ error = ext2fs_write_ext_attr(fs, blk, s->base);
+ if (error)
+ goto cleanup;
+ } else {
+ /* We need to allocate a new block */
+ error = ext2fs_new_block(fs, 0, 0, &blk);
+ if (error)
+ goto cleanup;
+ ext2fs_block_alloc_stats(fs, blk, +1);
+ error = ext2fs_write_ext_attr(fs, blk, s->base);
+ if (error)
+ goto cleanup;
+ new_buf = s->base;
+ if (old_block) {
+ /* Write back the shared block's decremented refcount */
+ error = ext2fs_write_ext_attr(fs,
+ inode->i_file_acl,
+ old_block);
+ if (error)
+ goto cleanup;
+ }
+ }
+ }
+
+ /* Update the i_blocks if we added a new EA block */
+ if (!inode->i_file_acl && new_buf)
+ inode->i_blocks += fs->blocksize / 512;
+ /* Update the inode. */
+ inode->i_file_acl = new_buf ? blk : 0;
+
+cleanup:
+ if (clear_flag)
+ ext2fs_free_mem(&s->base);
+ return error;
+}
+
+static errcode_t ext2fs_attr_ibody_set(ext2_filsys fs,
+ struct ext2_inode_large *inode,
+ struct ext2_attr_info *i,
+ struct ext2_attr_ibody_find *is)
+{
+ __u32 *eamagic;
+ struct ext2_attr_search *s = &is->s;
+ errcode_t error;
+
+ if (EXT2_INODE_SIZE(fs->super) == EXT2_GOOD_OLD_INODE_SIZE)
+ return EXT2_ET_EA_NO_SPACE;
+
+ error = ext2fs_attr_set_entry(fs, i, s);
+ if (error)
+ return error;
+
+ eamagic = IHDR(inode);
+ if (!EXT2_EXT_IS_LAST_ENTRY(s->first))
+ *eamagic = EXT2_EXT_ATTR_MAGIC;
+ else
+ *eamagic = 0;
+
+ return ext2fs_write_inode_full(fs, is->ino, (struct ext2_inode *)inode,
+ EXT2_INODE_SIZE(fs->super));
+}
+
+
+errcode_t ext2fs_attr_set(ext2_filsys fs, ext2_ino_t ino,
+ struct ext2_inode *inode,
+ int name_index, const char *name, const char *value,
+ int value_len, int flags)
+{
+ struct ext2_inode_large *inode_large = NULL;
+ struct ext2_attr_info i = {
+ .name_index = name_index,
+ .name = name,
+ .value = value,
+ .value_len = value_len,
+ };
+ struct ext2_attr_ibody_find is = {
+ .ino = ino,
+ .s = { .not_found = -ENODATA, },
+ };
+ struct ext2_attr_block_find bs = {
+ .s = { .not_found = -ENODATA, },
+ };
+ errcode_t error;
+
+ if (!name)
+ return EXT2_ET_EA_BAD_NAME;
+ if (strlen(name) > 255)
+ return EXT2_ET_EA_NAME_TOO_BIG;
+
+ if (EXT2_INODE_SIZE(fs->super) > EXT2_GOOD_OLD_INODE_SIZE) {
+ inode_large = (struct ext2_inode_large *)inode;
+
+ error = ext2fs_attr_ibody_find(fs, inode_large, &i, &is);
+ if (error)
+ goto cleanup;
+ }
+ if (is.s.not_found) {
+ error = ext2fs_attr_block_find(fs, inode, &i, &bs);
+ if (error)
+ goto cleanup;
+ }
+
+ if (is.s.not_found && bs.s.not_found) {
+ error = EXT2_ET_EA_NAME_NOT_FOUND;
+ if (flags & XATTR_REPLACE)
+ goto cleanup;
+ error = 0;
+ if (!value)
+ goto cleanup;
+ } else {
+ error = EXT2_ET_EA_NAME_EXISTS;
+ if (flags & XATTR_CREATE)
+ goto cleanup;
+ }
+
+ if (!value) {
+ if (!is.s.not_found &&
+ (EXT2_INODE_SIZE(fs->super) > EXT2_GOOD_OLD_INODE_SIZE))
+ error = ext2fs_attr_ibody_set(fs, inode_large, &i, &is);
+ else if (!bs.s.not_found)
+ error = ext2fs_attr_block_set(fs, inode, &i, &bs);
+ } else {
+ if (EXT2_INODE_SIZE(fs->super) > EXT2_GOOD_OLD_INODE_SIZE)
+ error = ext2fs_attr_ibody_set(fs, inode_large, &i, &is);
+ if (!error && !bs.s.not_found) {
+ i.value = NULL;
+ error = ext2fs_attr_block_set(fs, inode, &i, &bs);
+ } else if (error == EXT2_ET_EA_NO_SPACE) {
+ error = ext2fs_attr_block_set(fs, inode, &i, &bs);
+ if (error)
+ goto cleanup;
+ if (!is.s.not_found) {
+ i.value = NULL;
+ if (EXT2_INODE_SIZE(fs->super) >
+ EXT2_GOOD_OLD_INODE_SIZE)
+ error = ext2fs_attr_ibody_set(fs,
+ inode_large, &i, &is);
+ }
+ }
+ }
+
+cleanup:
+ return error;
+}
+
+static errcode_t ext2fs_attr_check_block(ext2_filsys fs, char *buffer)
+{
+ if (BHDR(buffer)->h_magic != (EXT2_EXT_ATTR_MAGIC) ||
+ BHDR(buffer)->h_blocks != 1)
+ return EXT2_ET_EA_BAD_MAGIC;
+
+ return ext2fs_attr_check_names((struct ext2_ext_attr_entry *)
+ (BHDR(buffer) + 1),
+ buffer + fs->blocksize);
+}
+
+static errcode_t ext2fs_attr_block_get(ext2_filsys fs, struct ext2_inode *inode,
+ int name_index, const char *name,
+ void *buffer, size_t buffer_size,
+ int *easize)
+{
+ struct ext2_ext_attr_header *header = NULL;
+ struct ext2_ext_attr_entry *entry;
+ char *block_buf = NULL;
+ errcode_t error;
+
+ error = EXT2_ET_EA_NAME_NOT_FOUND;
+ if (!inode->i_file_acl)
+ goto cleanup;
+
+ error = ext2fs_get_mem(fs->blocksize, &block_buf);
+ if (error)
+ return error;
+ error = ext2fs_read_ext_attr(fs, inode->i_file_acl, block_buf);
+ if (error)
+ goto cleanup;
+
+ error = ext2fs_attr_check_block(fs, block_buf);
+ if (error)
+ goto cleanup;
+
+ header = BHDR(block_buf);
+ entry = (struct ext2_ext_attr_entry *)(header+1);
+ error = ext2fs_attr_find_entry(&entry, name_index, name,
+ fs->blocksize, 1);
+ if (error)
+ goto cleanup;
+ if (easize)
+ *easize = entry->e_value_size;
+ if (buffer) {
+ error = EXT2_ET_EA_TOO_BIG;
+ if (entry->e_value_size > buffer_size)
+ goto cleanup;
+ memcpy(buffer, block_buf + entry->e_value_offs,
+ entry->e_value_size);
+ }
+
+cleanup:
+ if (block_buf)
+ ext2fs_free_mem (&block_buf);
+ return error;
+}
+
+static errcode_t ext2fs_attr_ibody_get(ext2_filsys fs,
+ struct ext2_inode_large *inode,
+ int name_index, const char *name,
+ void *buffer, size_t buffer_size,
+ int *easize)
+{
+ struct ext2_ext_attr_entry *entry;
+ int error;
+ char *end, *start;
+ __u32 *eamagic;
+
+ if (EXT2_INODE_SIZE(fs->super) == EXT2_GOOD_OLD_INODE_SIZE)
+ return EXT2_ET_EA_NAME_NOT_FOUND;
+
+ eamagic = IHDR(inode);
+ /* No in-inode EA magic means there are no in-inode attributes. */
+ if (*eamagic != EXT2_EXT_ATTR_MAGIC)
+ return EXT2_ET_EA_NAME_NOT_FOUND;
+
+ start = (char *)inode + EXT2_GOOD_OLD_INODE_SIZE +
+ inode->i_extra_isize + sizeof(__u32);
+ entry = (struct ext2_ext_attr_entry *)start;
+ end = (char *)inode + EXT2_INODE_SIZE(fs->super);
+ error = ext2fs_attr_check_names(entry, end);
+ if (error)
+ goto cleanup;
+ error = ext2fs_attr_find_entry(&entry, name_index, name,
+ end - (char *)entry, 0);
+ if (error)
+ goto cleanup;
+ if (easize)
+ *easize = entry->e_value_size;
+ if (buffer) {
+ error = EXT2_ET_EA_TOO_BIG;
+ if (entry->e_value_size > buffer_size)
+ goto cleanup;
+ memcpy(buffer, start + entry->e_value_offs,entry->e_value_size);
+ }
+
+cleanup:
+ return error;
+}
+
+
+errcode_t ext2fs_attr_get(ext2_filsys fs, struct ext2_inode *inode,
+ int name_index, const char *name, char *buffer,
+ size_t buffer_size, int *easize)
+{
+ errcode_t error;
+
+ error = ext2fs_attr_ibody_get(fs, (struct ext2_inode_large *)inode,
+ name_index, name, buffer, buffer_size,
+ easize);
+ if (error == EXT2_ET_EA_NAME_NOT_FOUND)
+ error = ext2fs_attr_block_get(fs, inode, name_index, name,
+ buffer, buffer_size, easize);
+
+ return error;
+}
+
+char *ext2_attr_index_prefix[] = {
+ [EXT2_ATTR_INDEX_USER] = EXT2_ATTR_INDEX_USER_PREFIX,
+ [EXT2_ATTR_INDEX_POSIX_ACL_ACCESS] = EXT2_ATTR_INDEX_POSIX_ACL_ACCESS_PREFIX,
+ [EXT2_ATTR_INDEX_POSIX_ACL_DEFAULT] = EXT2_ATTR_INDEX_POSIX_ACL_DEFAULT_PREFIX,
+ [EXT2_ATTR_INDEX_TRUSTED] = EXT2_ATTR_INDEX_TRUSTED_PREFIX,
+ [EXT2_ATTR_INDEX_LUSTRE] = EXT2_ATTR_INDEX_LUSTRE_PREFIX,
+ [EXT2_ATTR_INDEX_SECURITY] = EXT2_ATTR_INDEX_SECURITY_PREFIX,
+ NULL
+};
+
+int ext2fs_attr_get_next_attr(struct ext2_ext_attr_entry *entry, int name_index,
+ char *buffer, int buffer_size, int start)
+{
+ const int prefix_len = strlen(ext2_attr_index_prefix[name_index]);
+ int total_len;
+
+ if (!start && !EXT2_EXT_IS_LAST_ENTRY(entry))
+ entry = EXT2_EXT_ATTR_NEXT(entry);
+
+ for (; !EXT2_EXT_IS_LAST_ENTRY(entry);
+ entry = EXT2_EXT_ATTR_NEXT(entry)) {
+ if (!name_index)
+ break;
+ if (name_index == entry->e_name_index)
+ break;
+ }
+ if (EXT2_EXT_IS_LAST_ENTRY(entry))
+ return 0;
+
+ total_len = prefix_len + entry->e_name_len + 1;
+ if (buffer && total_len <= buffer_size) {
+ memcpy(buffer, ext2_attr_index_prefix[name_index], prefix_len);
+ memcpy(buffer + prefix_len, entry->e_name, entry->e_name_len);
+ buffer[prefix_len + entry->e_name_len] = '\0';
+ }
+
+ return total_len;
+}
+
+errcode_t ext2fs_expand_extra_isize(ext2_filsys fs, ext2_ino_t ino,
+ struct ext2_inode_large *inode,
+ int new_extra_isize, int *ret,
+ int *needed_size)
+{
+ struct ext2_inode *inode_buf = NULL;
+ __u32 *eamagic = NULL;
+ struct ext2_ext_attr_header *header = NULL;
+ struct ext2_ext_attr_entry *entry = NULL, *last = NULL;
+ struct ext2_attr_ibody_find is = {
+ .ino = ino,
+ .s = { .not_found = EXT2_ET_EA_NO_SPACE, },
+ };
+ struct ext2_attr_block_find bs = {
+ .s = { .not_found = EXT2_ET_EA_NO_SPACE, },
+ };
+ char *start, *end, *block_buf = NULL, *buffer =NULL, *b_entry_name=NULL;
+ int total_ino = 0, total_blk, free, offs, tried_min_extra_isize = 0;
+ int s_min_extra_isize = fs->super->s_min_extra_isize;
+ errcode_t error = 0;
+
+ if (needed_size)
+ *needed_size = new_extra_isize;
+ error = ext2fs_get_mem(fs->blocksize, &block_buf);
+ if (error)
+ return error;
+
+ if (inode == NULL) {
+ error = ext2fs_get_mem(EXT2_INODE_SIZE(fs->super), &inode_buf);
+ if (error)
+ goto cleanup;
+
+ error = ext2fs_read_inode_full(fs, ino, inode_buf,
+ EXT2_INODE_SIZE(fs->super));
+ if (error)
+ goto cleanup;
+
+ inode = (struct ext2_inode_large *)inode_buf;
+ }
+
+retry:
+ if (inode->i_extra_isize >= new_extra_isize)
+ goto cleanup;
+
+ eamagic = IHDR(inode);
+ /* No extended attributes present */
+ if (*eamagic != EXT2_EXT_ATTR_MAGIC) {
+ memset((char *)inode + EXT2_GOOD_OLD_INODE_SIZE +
+ inode->i_extra_isize, 0,
+ EXT2_INODE_SIZE(fs->super) - EXT2_GOOD_OLD_INODE_SIZE -
+ inode->i_extra_isize);
+ inode->i_extra_isize = new_extra_isize;
+ if (needed_size)
+ *needed_size = 0;
+ goto write_inode;
+ }
+
+ start = (char *) inode + EXT2_GOOD_OLD_INODE_SIZE +
+ inode->i_extra_isize + sizeof(__u32);
+ end = (char *) inode + EXT2_INODE_SIZE(fs->super);
+ last = entry = (struct ext2_ext_attr_entry *) start;
+ offs = end - start;
+ /* Account for the space taken up by the magic number */
+ total_ino = sizeof(__u32);
+ free = ext2fs_attr_free_space(last, &offs, start, &total_ino);
+
+ /* Enough free space available in the inode for expansion */
+ if (free >= new_extra_isize) {
+ ext2fs_attr_shift_entries(entry, inode->i_extra_isize -
+ new_extra_isize, (char *)inode +
+ EXT2_GOOD_OLD_INODE_SIZE + new_extra_isize,
+ (char *)start - sizeof(__u32), total_ino);
+ inode->i_extra_isize = new_extra_isize;
+ if (needed_size)
+ *needed_size = 0;
+ goto write_inode;
+ }
+
+ if (inode->i_file_acl) {
+ error = ext2fs_read_ext_attr(fs, inode->i_file_acl, block_buf);
+ if (error)
+ goto cleanup;
+
+ header = BHDR(block_buf);
+ if (header->h_magic != EXT2_EXT_ATTR_MAGIC) {
+ error = EXT2_ET_EA_BAD_MAGIC;
+ goto cleanup;
+ }
+ end = block_buf + fs->blocksize;
+ last = entry = (struct ext2_ext_attr_entry *)(header+1);
+ start = (char *) entry;
+ offs = end - start;
+ free = ext2fs_attr_free_space(last, &offs, start, &total_blk);
+ if (free < new_extra_isize) {
+ if (!tried_min_extra_isize && s_min_extra_isize) {
+ tried_min_extra_isize++;
+ new_extra_isize = s_min_extra_isize;
+ goto retry;
+ }
+ if (ret)
+ *ret = EXT2_EXPAND_EISIZE_NOSPC;
+ error = EXT2_ET_EA_NO_SPACE;
+ goto cleanup;
+ }
+ } else {
+ if (ret && *ret == EXT2_EXPAND_EISIZE_UNSAFE) {
+ *ret = EXT2_EXPAND_EISIZE_NEW_BLOCK;
+ error = 0;
+ goto cleanup;
+ }
+ free = fs->blocksize;
+ }
+
+ while (new_extra_isize > 0) {
+ int offs, size, entry_size;
+ struct ext2_ext_attr_entry *small_entry = NULL;
+ struct ext2_attr_info i = {
+ .value = NULL,
+ .value_len = 0,
+ };
+ unsigned int total_size, shift_bytes, temp = ~0U, extra_isize=0;
+
+ start = (char *) inode + EXT2_GOOD_OLD_INODE_SIZE +
+ inode->i_extra_isize + sizeof(__u32);
+ end = (char *) inode + EXT2_INODE_SIZE(fs->super);
+ last = (struct ext2_ext_attr_entry *) start;
+
+ /* Find the entry best suited to be pushed into EA block */
+ entry = NULL;
+ for (; !EXT2_EXT_IS_LAST_ENTRY(last);
+ last = EXT2_EXT_ATTR_NEXT(last)) {
+ total_size = EXT2_EXT_ATTR_SIZE(last->e_value_size) +
+ EXT2_EXT_ATTR_LEN(last->e_name_len);
+ if (total_size <= free && total_size < temp) {
+ if (total_size < new_extra_isize) {
+ small_entry = last;
+ } else {
+ entry = last;
+ temp = total_size;
+ }
+ }
+ }
+
+ if (entry == NULL) {
+ if (small_entry) {
+ entry = small_entry;
+ } else {
+ if (!tried_min_extra_isize &&
+ s_min_extra_isize) {
+ tried_min_extra_isize++;
+ new_extra_isize = s_min_extra_isize;
+ goto retry;
+ }
+ if (ret)
+ *ret = EXT2_EXPAND_EISIZE_NOSPC;
+ error = EXT2_ET_EA_NO_SPACE;
+ goto cleanup;
+ }
+ }
+ offs = entry->e_value_offs;
+ size = entry->e_value_size;
+ entry_size = EXT2_EXT_ATTR_LEN(entry->e_name_len);
+ i.name_index = entry->e_name_index;
+ error = ext2fs_get_mem(EXT2_EXT_ATTR_SIZE(size), &buffer);
+ if (error)
+ goto cleanup;
+ error = ext2fs_get_mem(entry->e_name_len + 1, &b_entry_name);
+ if (error)
+ goto cleanup;
+ /* Save the entry name and the entry value */
+ memcpy((char *)buffer, (char *) start + offs,
+ EXT2_EXT_ATTR_SIZE(size));
+ memcpy((char *)b_entry_name, (char *)entry->e_name,
+ entry->e_name_len);
+ b_entry_name[entry->e_name_len] = '\0';
+ i.name = b_entry_name;
+
+ error = ext2fs_attr_ibody_find(fs, inode, &i, &is);
+ if (error)
+ goto cleanup;
+
+ error = ext2fs_attr_set_entry(fs, &i, &is.s);
+ if (error)
+ goto cleanup;
+
+ entry = (struct ext2_ext_attr_entry *) start;
+ if (entry_size + EXT2_EXT_ATTR_SIZE(size) >= new_extra_isize)
+ shift_bytes = new_extra_isize;
+ else
+ shift_bytes = entry_size + EXT2_EXT_ATTR_SIZE(size);
+ ext2fs_attr_shift_entries(entry, inode->i_extra_isize -
+ shift_bytes, (char *)inode +
+ EXT2_GOOD_OLD_INODE_SIZE + extra_isize + shift_bytes,
+ (char *)start - sizeof(__u32), total_ino - entry_size);
+
+ extra_isize += shift_bytes;
+ new_extra_isize -= shift_bytes;
+ if (needed_size)
+ *needed_size = new_extra_isize;
+ inode->i_extra_isize = extra_isize;
+
+ i.name = b_entry_name;
+ i.value = buffer;
+ i.value_len = size;
+ error = ext2fs_attr_block_find(fs, (struct ext2_inode *) inode,
+ &i, &bs);
+ if (error)
+ goto cleanup;
+
+ /* Add entry which was removed from the inode into the block */
+ error = ext2fs_attr_block_set(fs, (struct ext2_inode *) inode,
+ &i, &bs);
+ if (error)
+ goto cleanup;
+ }
+
+write_inode:
+ error = ext2fs_write_inode_full(fs, ino, (struct ext2_inode *) inode,
+ EXT2_INODE_SIZE(fs->super));
+cleanup:
+ if (inode_buf)
+ ext2fs_free_mem(&inode_buf);
+ if (block_buf)
+ ext2fs_free_mem(&block_buf);
+ if (buffer)
+ ext2fs_free_mem(&buffer);
+ if (b_entry_name)
+ ext2fs_free_mem(&b_entry_name);
+
+ return error;
+}
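
The entry-selection loop in ext2fs_expand_extra_isize() above decides which in-inode attribute gets pushed out to the EA block. Below is a minimal standalone sketch of that choice, not the library code itself. The struct ea_desc type and the pick_entry_to_move() helper are hypothetical names introduced for illustration, and the EA_LEN/EA_SIZE macros are local stand-ins for EXT2_EXT_ATTR_LEN/EXT2_EXT_ATTR_SIZE that assume a 16-byte entry header.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the rounding macros in ext2_ext_attr.h.
 * ENTRY_HDR_SIZE assumes sizeof(struct ext2_ext_attr_entry) == 16. */
#define EA_PAD		4u
#define EA_ROUND	(EA_PAD - 1)
#define ENTRY_HDR_SIZE	16u
#define EA_LEN(name_len) (((name_len) + EA_ROUND + ENTRY_HDR_SIZE) & ~EA_ROUND)
#define EA_SIZE(size)	 (((size) + EA_ROUND) & ~EA_ROUND)

/* Hypothetical flattened view of one in-inode attribute. */
struct ea_desc {
	unsigned name_len;
	unsigned value_size;
};

/*
 * Return the index of the attribute to move into the EA block, or -1.
 * Preference: the smallest entry that fits in the block and frees at
 * least `needed` bytes; otherwise fall back to the last entry seen that
 * fits but frees less than `needed`.
 */
int pick_entry_to_move(const struct ea_desc *ea, size_t count,
		       unsigned free_in_block, unsigned needed)
{
	unsigned best = ~0u;
	int entry = -1, small_entry = -1;
	size_t n;

	assert(ea != NULL || count == 0);
	for (n = 0; n < count; n++) {
		unsigned total = EA_SIZE(ea[n].value_size) +
				 EA_LEN(ea[n].name_len);

		if (total > free_in_block || total >= best)
			continue;
		if (total < needed) {
			small_entry = (int)n;
		} else {
			entry = (int)n;
			best = total;
		}
	}
	return entry >= 0 ? entry : small_entry;
}
```

With three attributes occupying 28, 120 and 68 bytes and 32 bytes needed, the sketch picks the 68-byte one: it is the smallest entry that still frees enough space in a single move.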
Index: e2fsprogs-1.40.5/e2fsck/unix.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/unix.c
+++ e2fsprogs-1.40.5/e2fsck/unix.c
@@ -618,6 +618,12 @@ static void parse_extended_opts(e2fsck_t
extended_usage++;
continue;
}
+ } else if (strcmp(token, "expand_extra_isize") == 0) {
+ ctx->flags |= E2F_FLAG_EXPAND_EISIZE;
+ if (arg) {
+ extended_usage++;
+ continue;
+ }
} else {
fprintf(stderr, _("Unknown extended option: %s\n"),
token);
@@ -639,6 +646,7 @@ static void parse_extended_opts(e2fsck_t
"\tshared=<preserve|lost+found|delete>\n"
"\tclone=<dup|zero>\n"
"\tea_ver=<ea_version (1 or 2)>\n"
+ "\texpand_extra_isize\n"
"\n"), stderr);
exit(1);
}
@@ -1246,6 +1249,54 @@ restart:
if (ctx->flags & E2F_FLAG_SIGNAL_MASK)
fatal_error(ctx, 0);
check_if_skip(ctx);
+
+ if (EXT2_GOOD_OLD_INODE_SIZE + sb->s_want_extra_isize >
+ EXT2_INODE_SIZE(sb)) {
+ if (fix_problem(ctx, PR_0_WANT_EXTRA_ISIZE_INVALID, &pctx))
+ sb->s_want_extra_isize = sizeof(struct ext2_inode_large) -
+ EXT2_GOOD_OLD_INODE_SIZE;
+ }
+ if (EXT2_GOOD_OLD_INODE_SIZE + sb->s_min_extra_isize >
+ EXT2_INODE_SIZE(sb)) {
+ if (fix_problem(ctx, PR_0_MIN_EXTRA_ISIZE_INVALID, &pctx))
+ sb->s_min_extra_isize = 0;
+ }
+ if (EXT2_INODE_SIZE(sb) > EXT2_GOOD_OLD_INODE_SIZE) {
+ ctx->want_extra_isize = sizeof(struct ext2_inode_large) -
+ EXT2_GOOD_OLD_INODE_SIZE;
+ ctx->min_extra_isize = ~0L;
+ if (EXT2_HAS_RO_COMPAT_FEATURE(sb,
+ EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE)) {
+ if (ctx->want_extra_isize < sb->s_want_extra_isize)
+ ctx->want_extra_isize = sb->s_want_extra_isize;
+ if (ctx->want_extra_isize < sb->s_min_extra_isize)
+ ctx->want_extra_isize = sb->s_min_extra_isize;
+ }
+ }
+ else {
+ if (sb->s_feature_ro_compat &
+ EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE) {
+ fix_problem(ctx, PR_0_CLEAR_EXTRA_ISIZE, &pctx);
+ sb->s_feature_ro_compat &=
+ ~EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE;
+ }
+ sb->s_want_extra_isize = 0;
+ sb->s_min_extra_isize = 0;
+ ctx->flags &= ~E2F_FLAG_EXPAND_EISIZE;
+ }
+
+ if (ctx->options & E2F_OPT_READONLY) {
+ if (ctx->flags & (E2F_FLAG_EXPAND_EISIZE)) {
+ fprintf(stderr, _("Cannot enable EXTRA_ISIZE feature "
+ "on read-only filesystem\n"));
+ exit(1);
+ }
+ } else {
+ if (sb->s_want_extra_isize > sb->s_min_extra_isize &&
+ (sb->s_feature_ro_compat & EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE))
+ ctx->flags |= E2F_FLAG_EXPAND_EISIZE;
+ }
+
if (bad_blocks_file)
read_bad_blocks_file(ctx, bad_blocks_file, replace_bad_blocks);
else if (cflag)
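
The superblock handling in the unix.c hunk above boils down to choosing a target i_extra_isize for the run. The following is a rough sketch of that decision as a pure function. The function name and its parameters are hypothetical; known_extra stands for sizeof(struct ext2_inode_large) - EXT2_GOOD_OLD_INODE_SIZE, and returning 0 for 128-byte inodes is an assumption about the default value of ctx->want_extra_isize. The sanity checks on s_want_extra_isize and s_min_extra_isize are left out.

```c
#include <assert.h>

#define GOOD_OLD_INODE_SIZE 128	/* value of EXT2_GOOD_OLD_INODE_SIZE */

/*
 * Compute the i_extra_isize that e2fsck would aim for.
 * known_extra: bytes of extra fields the tools know about (hypothetical
 * parameter, standing in for the sizeof() expression in the patch).
 */
int effective_want_extra_isize(int inode_size, int known_extra,
			       int has_extra_isize_feature,
			       int s_want, int s_min)
{
	int want;

	assert(inode_size >= GOOD_OLD_INODE_SIZE);
	if (inode_size == GOOD_OLD_INODE_SIZE)
		return 0;	/* no room for any extra fields */

	want = known_extra;
	if (has_extra_isize_feature) {
		/* The superblock may ask for more than the tools know. */
		if (want < s_want)
			want = s_want;
		if (want < s_min)
			want = s_min;
	}
	return want;
}
```

Keeping the computation free of e2fsck context makes it easy to see that the superblock values only ever raise the target, and only when the EXTRA_ISIZE feature flag is set.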
Index: e2fsprogs-1.40.5/lib/ext2fs/ext2_ext_attr.h
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/ext2_ext_attr.h
+++ e2fsprogs-1.40.5/lib/ext2fs/ext2_ext_attr.h
@@ -15,6 +15,9 @@
/* Maximum number of references to one attribute block */
#define EXT2_EXT_ATTR_REFCOUNT_MAX 1024

+#define XATTR_CREATE 0x1 /* set value, fail if attr already exists */
+#define XATTR_REPLACE 0x2 /* set value, fail if attr does not exist */
+
struct ext2_ext_attr_header {
__u32 h_magic; /* magic number for identification */
__u32 h_refcount; /* reference count */
@@ -35,6 +38,32 @@ struct ext2_ext_attr_entry {
#endif
};

+#define BHDR(block) ((struct ext2_ext_attr_header *) block)
+#define IHDR(inode) \
+ ((__u32 *) ((char *)inode + \
+ EXT2_GOOD_OLD_INODE_SIZE + \
+ (inode)->i_extra_isize))
+#define ENTRY(ptr) ((struct ext2_ext_attr_entry *)(ptr))
+
+/* Name indexes */
+#define EXT2_ATTR_INDEX_USER 1
+#define EXT2_ATTR_INDEX_POSIX_ACL_ACCESS 2
+#define EXT2_ATTR_INDEX_POSIX_ACL_DEFAULT 3
+#define EXT2_ATTR_INDEX_TRUSTED 4
+#define EXT2_ATTR_INDEX_LUSTRE 5
+#define EXT2_ATTR_INDEX_SECURITY 6
+#define EXT2_ATTR_INDEX_MAX 7
+
+#define EXT2_ATTR_INDEX_USER_PREFIX "user."
+#define EXT2_ATTR_INDEX_POSIX_ACL_ACCESS_PREFIX "system.posix_acl_access"
+#define EXT2_ATTR_INDEX_POSIX_ACL_DEFAULT_PREFIX "system.posix_acl_default"
+#define EXT2_ATTR_INDEX_TRUSTED_PREFIX "trusted."
+#define EXT2_ATTR_INDEX_LUSTRE_PREFIX "lustre."
+#define EXT2_ATTR_INDEX_SECURITY_PREFIX "security."
+
+#define EXT2_ATTR_PREFIX(index) (index ## _PREFIX)
+#define EXT2_ATTR_PREFIX_LEN(index) (index ## _PRE_LEN)
+
#define EXT2_EXT_ATTR_PAD_BITS 2
#define EXT2_EXT_ATTR_PAD ((unsigned) 1<<EXT2_EXT_ATTR_PAD_BITS)
#define EXT2_EXT_ATTR_ROUND (EXT2_EXT_ATTR_PAD-1)
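
The IHDR() macro added above locates the 32-bit magic that follows the extra inode fields. Here is a small standalone sketch of the same lookup on a raw inode buffer. The function name is hypothetical, the buffer is assumed to be in host byte order (as after libext2fs has read and swapped the inode), and unlike the macro the sketch also bounds-checks i_extra_isize against the inode size.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define GOOD_OLD_INODE_SIZE 128		/* value of EXT2_GOOD_OLD_INODE_SIZE */
#define EA_MAGIC 0xEA020000u		/* value of EXT2_EXT_ATTR_MAGIC */

/*
 * Return 1 if the raw inode carries the in-inode EA magic, else 0.
 * i_extra_isize is the 16-bit field at offset 128; the magic sits at
 * offset 128 + i_extra_isize.
 */
int inode_has_ibody_ea(const unsigned char *raw, size_t inode_size)
{
	uint16_t extra;
	uint32_t magic;
	size_t off;

	assert(raw != NULL);
	if (inode_size <= GOOD_OLD_INODE_SIZE)
		return 0;	/* 128-byte inodes have no EA area */

	memcpy(&extra, raw + GOOD_OLD_INODE_SIZE, sizeof(extra));
	off = GOOD_OLD_INODE_SIZE + (size_t)extra;
	if (off + sizeof(magic) > inode_size)
		return 0;	/* corrupt i_extra_isize */

	memcpy(&magic, raw + off, sizeof(magic));
	return magic == EA_MAGIC;
}
```

Using memcpy() rather than a pointer cast avoids alignment assumptions about the caller's buffer.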
Index: e2fsprogs-1.40.5/e2fsck/e2fsck.h
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/e2fsck.h
+++ e2fsprogs-1.40.5/e2fsck/e2fsck.h
@@ -173,6 +173,7 @@ struct resource_track {
#define E2F_FLAG_RESTARTED 0x0200 /* E2fsck has been restarted */
#define E2F_FLAG_RESIZE_INODE 0x0400 /* Request to recreate resize inode */
#define E2F_FLAG_GOT_DEVSIZE 0x0800 /* Device size has been fetched */
+#define E2F_FLAG_EXPAND_EISIZE 0x2000 /* Expand the inodes (i_extra_isize) */

/*
* Defines for indicating the e2fsck pass number
@@ -350,6 +351,15 @@ struct e2fsck_struct {
profile_t profile;
int blocks_per_page;

+ /* Expand large inodes to at least this many bytes */
+ int want_extra_isize;
+ /* Minimum i_extra_isize found in used inodes. Should not be less
+ * than s_min_extra_isize.
+ */
+ __u32 min_extra_isize;
+ int fs_unexpanded_inodes;
+ ext2fs_inode_bitmap expand_eisize_map;
+
/*
* For the use of callers of the e2fsck functions; not used by
* e2fsck functions themselves.
Index: e2fsprogs-1.40.5/lib/ext2fs/ext2_fs.h
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/ext2_fs.h
+++ e2fsprogs-1.40.5/lib/ext2fs/ext2_fs.h
@@ -404,10 +404,17 @@ struct ext2_inode_large {
__u32 i_atime_extra; /* extra Access time (nsec << 2 | epoch) */
__u32 i_crtime; /* File creation time */
__u32 i_crtime_extra; /* extra File creation time (nsec << 2 | epoch)*/
+ __u32 i_version_hi; /* high 32 bits for 64-bit version */
};

#define i_size_high i_dir_acl

+#define EXT2_FITS_IN_INODE(inode, field) \
+ ((offsetof(struct ext2_inode_large, field) + \
+ sizeof((inode)->field)) <= \
+ (EXT2_GOOD_OLD_INODE_SIZE + \
+ (inode)->i_extra_isize))
+
#if defined(__KERNEL__) || defined(__linux__)
#define i_reserved1 osd1.linux1.l_i_reserved1
#define i_frag osd2.linux2.l_i_frag
@@ -634,6 +641,7 @@ struct ext2_super_block {
#define EXT2_FEATURE_RO_COMPAT_SUPP (EXT2_FEATURE_RO_COMPAT_SPARSE_SUPER| \
EXT2_FEATURE_RO_COMPAT_LARGE_FILE| \
EXT4_FEATURE_RO_COMPAT_DIR_NLINK| \
+ EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE| \
EXT2_FEATURE_RO_COMPAT_BTREE_DIR)

/*
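
The EXT2_FITS_IN_INODE() macro added above answers whether a given extra field lies inside the space covered by i_extra_isize. The sketch below shows the same check on a mock structure. struct mock_inode_large, FITS_IN_INODE and can_use_version_hi() are illustrative names only; the layout mirrors the extra fields shown in the hunk, with the first 128 bytes collapsed into an opaque array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BASE_SIZE 128	/* value of EXT2_GOOD_OLD_INODE_SIZE */

/* Mock of struct ext2_inode_large: only the extra fields are spelled out. */
struct mock_inode_large {
	unsigned char base[BASE_SIZE];
	uint16_t i_extra_isize;
	uint16_t i_pad1;
	uint32_t i_ctime_extra;
	uint32_t i_mtime_extra;
	uint32_t i_atime_extra;
	uint32_t i_crtime;
	uint32_t i_crtime_extra;
	uint32_t i_version_hi;
};

/* Same shape as EXT2_FITS_IN_INODE, applied to the mock struct. */
#define FITS_IN_INODE(inode, field) \
	((offsetof(struct mock_inode_large, field) + \
	  sizeof((inode)->field)) <= \
	 (size_t)(BASE_SIZE + (inode)->i_extra_isize))

/* Example caller: only touch i_version_hi when the inode has room for it. */
int can_use_version_hi(const struct mock_inode_large *ino)
{
	assert(ino != NULL);
	return FITS_IN_INODE(ino, i_version_hi);
}
```

In this mock layout i_version_hi ends at byte 156, so it only fits once i_extra_isize reaches 28; an inode with i_extra_isize of 24 can hold i_crtime_extra but not i_version_hi.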
Index: e2fsprogs-1.40.5/e2fsck/pass1.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/pass1.c
+++ e2fsprogs-1.40.5/e2fsck/pass1.c
@@ -23,6 +23,7 @@
* - A bitmap of which inodes have bad fields. (inode_bad_map)
* - A bitmap of which inodes are in bad blocks. (inode_bb_map)
* - A bitmap of which inodes are imagic inodes. (inode_imagic_map)
+ * - A bitmap of which inodes need to be expanded (expand_eisize_map)
* - A bitmap of which blocks are in use. (block_found_map)
* - A bitmap of which blocks are in use by two inodes (block_dup_map)
* - The data blocks of the directory inodes. (dir_map)
@@ -353,16 +354,27 @@ static void check_inode_extra_space(e2fs
(inode->i_extra_isize < min || inode->i_extra_isize > max)) {
if (!fix_problem(ctx, PR_1_EXTRA_ISIZE, pctx))
return;
- inode->i_extra_isize = min;
+ inode->i_extra_isize = ctx->want_extra_isize;
e2fsck_write_inode_full(ctx, pctx->ino, pctx->inode,
EXT2_INODE_SIZE(sb), "pass1");
return;
}

- eamagic = (__u32 *) (((char *) inode) + EXT2_GOOD_OLD_INODE_SIZE +
- inode->i_extra_isize);
- if (*eamagic == EXT2_EXT_ATTR_MAGIC) {
- /* it seems inode has an extended attribute(s) in body */
+ eamagic = IHDR(inode);
+ if (*eamagic != EXT2_EXT_ATTR_MAGIC &&
+ (ctx->flags & E2F_FLAG_EXPAND_EISIZE) &&
+ (inode->i_extra_isize < ctx->want_extra_isize)) {
+ fix_problem(ctx, PR_1_EXPAND_EISIZE, pctx);
+ memset((char *)inode + EXT2_GOOD_OLD_INODE_SIZE, 0,
+ EXT2_INODE_SIZE(sb) - EXT2_GOOD_OLD_INODE_SIZE);
+ inode->i_extra_isize = ctx->want_extra_isize;
+ e2fsck_write_inode_full(ctx, pctx->ino,
+ (struct ext2_inode *) inode,
+ EXT2_INODE_SIZE(sb),
+ "check_inode_extra_space");
+ if (inode->i_extra_isize < ctx->min_extra_isize)
+ ctx->min_extra_isize = inode->i_extra_isize;
+ } else {
check_ea_in_inode(ctx, pctx);
}
}
@@ -468,6 +480,156 @@ extern void e2fsck_setup_tdb_icount(e2fs
*ret = 0;
}

+extern char *ext2_attr_index_prefix[];
+
+int e2fsck_pass1_delete_attr(e2fsck_t ctx, struct ext2_inode_large *inode,
+ struct problem_context *pctx, int needed_size)
+{
+ struct ext2_ext_attr_header *header;
+ struct ext2_ext_attr_entry *entry_ino, *entry_blk = NULL, *entry;
+ char *start, name[4096], block_buf[4096];
+ int len, index = EXT2_ATTR_INDEX_USER, entry_size, ea_size;
+ int in_inode = 1, error;
+ unsigned int freed_bytes = inode->i_extra_isize;
+
+ start = (char *) inode + EXT2_GOOD_OLD_INODE_SIZE +
+ inode->i_extra_isize + sizeof(__u32);
+ entry_ino = (struct ext2_ext_attr_entry *) start;
+
+ if (inode->i_file_acl) {
+ error = ext2fs_read_ext_attr(ctx->fs, inode->i_file_acl,
+ block_buf);
+ /* We have already checked this block, shouldn't happen */
+ if (error) {
+ fix_problem(ctx, PR_1_EXTATTR_READ_ABORT, pctx);
+ return 0;
+ }
+ header = BHDR(block_buf);
+ if (header->h_magic != EXT2_EXT_ATTR_MAGIC) {
+ fix_problem(ctx, PR_1_EXTATTR_READ_ABORT, pctx);
+ return 0;
+ }
+
+ entry_blk = (struct ext2_ext_attr_entry *)(header+1);
+ }
+ entry = entry_ino;
+ len = sizeof(entry->e_name);
+ entry_size = ext2fs_attr_get_next_attr(entry, index, name, len, 1);
+
+ while (freed_bytes < needed_size) {
+ if (entry_size && name[0] != '\0') {
+ pctx->str = name;
+ if (fix_problem(ctx, PR_1_EISIZE_DELETE_EA, pctx)) {
+ int i;
+
+ ea_size = EXT2_EXT_ATTR_LEN(entry->e_name_len) +
+ EXT2_EXT_ATTR_SIZE(entry->e_value_size);
+ i = strlen(ext2_attr_index_prefix[entry->e_name_index]);
+ error = ext2fs_attr_set(ctx->fs, pctx->ino,
+ (struct ext2_inode *)inode,
+ index, &name[i], 0,0,0);
+ if (!error)
+ freed_bytes += ea_size;
+ }
+ }
+ len = sizeof(entry->e_name);
+ entry_size = ext2fs_attr_get_next_attr(entry, index,name,len,0);
+ entry = EXT2_EXT_ATTR_NEXT(entry);
+ if (EXT2_EXT_IS_LAST_ENTRY(entry)) {
+ if (in_inode) {
+ entry = entry_blk;
+ len = sizeof(entry->e_name);
+ entry_size = ext2fs_attr_get_next_attr(entry,
+ index, name, len, 1);
+ in_inode = 0;
+ } else {
+ index += 1;
+ in_inode = 1;
+ if (!entry && index < EXT2_ATTR_INDEX_MAX)
+ entry = (struct ext2_ext_attr_entry *)start;
+ else
+ return freed_bytes;
+ }
+ }
+ }
+
+ return freed_bytes;
+}
+
+int e2fsck_pass1_expand_eisize(e2fsck_t ctx, struct ext2_inode_large *inode,
+ struct problem_context *pctx)
+{
+ int needed_size = 0, retval, ret = EXT2_EXPAND_EISIZE_UNSAFE;
+ static int message;
+
+retry:
+ retval = ext2fs_expand_extra_isize(ctx->fs, pctx->ino, inode,
+ ctx->want_extra_isize, &ret,
+ &needed_size);
+ if (ret & EXT2_EXPAND_EISIZE_NEW_BLOCK)
+ goto mark_expand_eisize_map;
+ if (!retval) {
+ e2fsck_write_inode_full(ctx, pctx->ino,
+ (struct ext2_inode *)inode,
+ EXT2_INODE_SIZE(ctx->fs->super),
+ "pass1");
+ return 0;
+ }
+
+ if (ret & EXT2_EXPAND_EISIZE_NOSPC) {
+ if (ctx->options & (E2F_OPT_PREEN | E2F_OPT_YES)) {
+ fix_problem(ctx, PR_1_EA_BLK_NOSPC, pctx);
+ ctx->flags |= E2F_FLAG_ABORT;
+ return -1;
+ }
+
+ if (!message) {
+ pctx->num = ctx->fs->super->s_min_extra_isize;
+ fix_problem(ctx, PR_1_EXPAND_EISIZE_WARNING, pctx);
+ message = 1;
+ }
+delete_EA:
+ retval = e2fsck_pass1_delete_attr(ctx, inode, pctx,
+ needed_size);
+ if (retval >= ctx->want_extra_isize)
+ goto retry;
+
+ needed_size -= retval;
+
+ /*
+ * We loop here until either the user deletes EA(s) or the
+ * EXTRA_ISIZE feature is disabled.
+ */
+ if (fix_problem(ctx, PR_1_CLEAR_EXTRA_ISIZE, pctx)) {
+ ctx->fs->super->s_feature_ro_compat &=
+ ~EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE;
+ ext2fs_mark_super_dirty(ctx->fs);
+ } else {
+ goto delete_EA;
+ }
+ ctx->fs_unexpanded_inodes++;
+
+ /* No EA was deleted, inode cannot be expanded */
+ return -1;
+ }
+
+mark_expand_eisize_map:
+ if (!ctx->expand_eisize_map) {
+ pctx->errcode = ext2fs_allocate_inode_bitmap(ctx->fs,
+ _("expand extra isize map"),
+ &ctx->expand_eisize_map);
+ if (pctx->errcode) {
+ fix_problem(ctx, PR_1_ALLOCATE_IBITMAP_ERROR,
+ pctx);
+ exit(1);
+ }
+ }
+
+ /* Add this inode to the expand_eisize_map */
+ ext2fs_mark_inode_bitmap(ctx->expand_eisize_map, pctx->ino);
+ return 0;
+}
+
void e2fsck_pass1(e2fsck_t ctx)
{
int i;
@@ -490,6 +652,7 @@ void e2fsck_pass1(e2fsck_t ctx)
int inode_size;
struct ext3_extent_header *eh;
int extent_fs;
+ int inode_exp = 0;

#ifdef RESOURCE_TRACK
init_resource_track(&rtrack, ctx->fs->io);
@@ -980,6 +1143,22 @@ void e2fsck_pass1(e2fsck_t ctx)
}
}

+ if (ctx->flags & E2F_FLAG_EXPAND_EISIZE) {
+ struct ext2_inode_large *inode_l;
+
+ inode_l = (struct ext2_inode_large *) inode;
+
+ if (inode_l->i_extra_isize < ctx->want_extra_isize) {
+ fix_problem(ctx, PR_1_EXPAND_EISIZE, &pctx);
+ inode_exp = e2fsck_pass1_expand_eisize(ctx,
+ inode_l,
+ &pctx);
+ }
+ if ((inode_l->i_extra_isize < ctx->min_extra_isize) &&
+ inode_exp == 0)
+ ctx->min_extra_isize = inode_l->i_extra_isize;
+ }
+
if (ctx->flags & E2F_FLAG_SIGNAL_MASK)
return;

@@ -1282,11 +1461,17 @@ static void adjust_extattr_refcount(e2fs
break;
pctx.blk = blk;
pctx.errcode = ext2fs_read_ext_attr(fs, blk, block_buf);
+ /* We already checked this block, shouldn't happen */
if (pctx.errcode) {
fix_problem(ctx, PR_1_EXTATTR_READ_ABORT, &pctx);
return;
}
- header = (struct ext2_ext_attr_header *) block_buf;
+ header = BHDR(block_buf);
+ if (header->h_magic != EXT2_EXT_ATTR_MAGIC) {
+ fix_problem(ctx, PR_1_EXTATTR_READ_ABORT, &pctx);
+ return;
+ }
+
pctx.blkcount = header->h_refcount;
should_be = header->h_refcount + adjust_sign * count;
pctx.num = should_be;
@@ -1392,7 +1577,7 @@ static int check_ext_attr(e2fsck_t ctx,
pctx->errcode = ext2fs_read_ext_attr(fs, blk, block_buf);
if (pctx->errcode && fix_problem(ctx, PR_1_READ_EA_BLOCK, pctx))
goto clear_extattr;
- header = (struct ext2_ext_attr_header *) block_buf;
+ header = BHDR(block_buf);
pctx->blk = inode->i_file_acl;
if (((ctx->ext_attr_ver == 1) &&
(header->h_magic != EXT2_EXT_ATTR_MAGIC_v1)) ||
Index: e2fsprogs-1.40.5/lib/ext2fs/ext2fs.h
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/ext2fs.h
+++ e2fsprogs-1.40.5/lib/ext2fs/ext2fs.h
@@ -406,6 +406,12 @@ typedef struct ext2_icount *ext2_icount_
#define EXT2_CHECK_MAGIC(struct, code) \
if ((struct)->magic != (code)) return (code)

+/*
+ * Flags for returning status of ext2fs_expand_extra_isize()
+ */
+#define EXT2_EXPAND_EISIZE_UNSAFE 0x0001
+#define EXT2_EXPAND_EISIZE_NEW_BLOCK 0x0002
+#define EXT2_EXPAND_EISIZE_NOSPC 0x0004

/*
* For ext2 compression support
@@ -450,6 +456,7 @@ typedef struct ext2_icount *ext2_icount_
#define EXT2_LIB_FEATURE_RO_COMPAT_SUPP (EXT2_FEATURE_RO_COMPAT_SPARSE_SUPER|\
EXT2_FEATURE_RO_COMPAT_LARGE_FILE|\
EXT4_FEATURE_RO_COMPAT_DIR_NLINK|\
+ EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE| \
EXT4_FEATURE_RO_COMPAT_GDT_CSUM)

/*
@@ -725,6 +732,16 @@ extern errcode_t ext2fs_expand_dir(ext2_
/* ext_attr.c */
extern __u32 ext2fs_ext_attr_hash_entry(struct ext2_ext_attr_entry *entry,
void *data);
+int ext2fs_attr_get_next_attr(struct ext2_ext_attr_entry *entry, int name_index,
+ char *buffer, int buffer_size, int start);
+errcode_t ext2fs_attr_set(ext2_filsys fs, ext2_ino_t ino,
+ struct ext2_inode *inode,
+ int name_index, const char *name, const char *value,
+ int value_len, int flags);
+extern errcode_t ext2fs_expand_extra_isize(ext2_filsys fs, ext2_ino_t ino,
+ struct ext2_inode_large *inode,
+ int new_extra_isize, int *ret,
+ int *needed_size);
extern errcode_t ext2fs_read_ext_attr(ext2_filsys fs, blk_t block, void *buf);
extern errcode_t ext2fs_write_ext_attr(ext2_filsys fs, blk_t block,
void *buf);
Index: e2fsprogs-1.40.5/e2fsck/problem.h
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/problem.h
+++ e2fsprogs-1.40.5/e2fsck/problem.h
@@ -212,6 +212,16 @@ struct problem_context {
/* Last group block bitmap is uninitialized. */
#define PR_0_BB_UNINIT_LAST 0x000039

+/* Invalid s_min_extra_isize */
+#define PR_0_MIN_EXTRA_ISIZE_INVALID 0x00003A
+
+/* Invalid s_want_extra_isize */
+#define PR_0_WANT_EXTRA_ISIZE_INVALID 0x00003B
+
+/* Clear EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE flag */
+#define PR_0_CLEAR_EXTRA_ISIZE 0x00003C
+
+
/*
* Pass 1 errors
*/
@@ -498,6 +508,25 @@ struct problem_context {
/* extent/index was modified & repaired - not really a problem */
#define PR_1_EXTENT_CHANGED 0x010067

+/* Warning for user that all inodes need to be expanded at least by
+ * s_min_extra_isize
+ */
+#define PR_1_EXPAND_EISIZE_WARNING 0x010068
+
+/* Expand the inode */
+#define PR_1_EXPAND_EISIZE 0x010069
+
+/* Delete an EA so that EXTRA_ISIZE may be enabled */
+#define PR_1_EISIZE_DELETE_EA 0x01006A
+
+/* An EA needs to be deleted but e2fsck is being run with -p or -y */
+#define PR_1_EA_BLK_NOSPC 0x01006B
+
+/* Disable EXTRA_ISIZE feature as inode cannot be expanded
+ * without deletion of an EA
+ */
+#define PR_1_CLEAR_EXTRA_ISIZE 0x01006C
+
/*
* Pass 1b errors
*/
@@ -961,6 +990,9 @@ struct problem_context {
/* Inode in use but group is marked INODE_UNINIT */
#define PR_5_INODE_UNINIT 0x050019

+/* Expand the inodes which need a new EA block */
+#define PR_5_EXPAND_EISIZE 0x05001a
+
/*
* Post-Pass 5 errors
*/
Index: e2fsprogs-1.40.5/e2fsck/problem.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/problem.c
+++ e2fsprogs-1.40.5/e2fsck/problem.c
@@ -376,6 +376,19 @@ static struct e2fsck_problem problem_tab
N_("last @g @b @B uninitialized. "),
PROMPT_FIX, PR_PREEN_OK },

+ { PR_0_MIN_EXTRA_ISIZE_INVALID,
+ N_("@S has invalid s_min_extra_isize. "),
+ PROMPT_FIX, PR_PREEN_OK },
+
+ { PR_0_WANT_EXTRA_ISIZE_INVALID,
+ N_("@S has invalid s_want_extra_isize. "),
+ PROMPT_FIX, PR_PREEN_OK },
+
+ { PR_0_CLEAR_EXTRA_ISIZE,
+ N_("Disable extra_isize feature since @f has 128 byte inodes. "),
+ PROMPT_NONE, 0 },
+
+
/* Pass 1 errors */

/* Pass 1: Checking inodes, blocks, and sizes */
@@ -849,6 +862,38 @@ static struct e2fsck_problem problem_tab
N_("@i %i has high 16 bits of extent/index @b set\n"),
PROMPT_CLEAR, PR_PREEN_OK | PR_NO_OK | PR_PREEN_NOMSG },

+ /* expand inode */
+ { PR_1_EXPAND_EISIZE_WARNING,
+ N_("\ne2fsck is being run with \"expand_extra_isize\" option or\n"
+ "s_min_extra_isize of %d bytes has been set in the superblock.\n"
+ "Inode %i does not have enough free space. Either some EAs\n"
+ "need to be deleted from this inode or the RO_COMPAT_EXTRA_ISIZE\n"
+ "flag must be cleared.\n\n"), PROMPT_NONE, PR_PREEN_OK | PR_NO_OK |
+ PR_PREEN_NOMSG },
+
+ /* expand inode */
+ { PR_1_EXPAND_EISIZE,
+ N_("Expanding @i %i.\n"),
+ PROMPT_NONE, PR_PREEN_OK | PR_NO_OK | PR_PREEN_NOMSG },
+
+ /* delete an EA so that EXTRA_ISIZE feature may be enabled */
+ { PR_1_EISIZE_DELETE_EA,
+ N_("Delete EA %s of @i %i so that EXTRA_ISIZE feature may be "
+ "enabled?\n"), PROMPT_FIX, PR_NO_OK | PR_PREEN_NO },
+
+ /* an EA needs to be deleted but e2fsck is being run with -p or -y */
+ { PR_1_EA_BLK_NOSPC,
+ N_("An EA needs to be deleted for @i %i but e2fsck is being run\n"
+ "with -p or -y mode.\n"),
+ PROMPT_ABORT, 0 },
+
+ /* disable EXTRA_ISIZE feature since inode cannot be expanded */
+ { PR_1_CLEAR_EXTRA_ISIZE,
+ N_("Disable EXTRA_ISIZE feature since @i %i cannot be expanded\n"
+ "without deletion of an EA.\n"),
+ PROMPT_FIX, 0 },
+
+
/* Pass 1b errors */

/* Pass 1B: Rescan for duplicate/bad blocks */
@@ -1587,6 +1632,11 @@ static struct e2fsck_problem problem_tab
N_("@g %g @i(s) in use but @g is marked INODE_UNINIT\n"),
PROMPT_FIX, PR_PREEN_OK },

+ /* Expand inode */
+ { PR_5_EXPAND_EISIZE,
+ N_("Expanding @i %i.\n"),
+ PROMPT_NONE, PR_PREEN_OK | PR_NO_OK | PR_PREEN_NOMSG },
+
/* Recreate journal if E2F_FLAG_JOURNAL_INODE flag is set */
{ PR_6_RECREATE_JOURNAL,
N_("Recreate journal to make the filesystem ext3 again?\n"),
Index: e2fsprogs-1.40.5/e2fsck/e2fsck.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/e2fsck.c
+++ e2fsprogs-1.40.5/e2fsck/e2fsck.c
@@ -151,6 +151,7 @@ errcode_t e2fsck_reset_context(e2fsck_t
ctx->fs_tind_count = 0;
ctx->fs_fragmented = 0;
ctx->large_files = 0;
+ ctx->fs_unexpanded_inodes = 0;

/* Reset the superblock to the user's requested value */
ctx->superblock = ctx->use_superblock;
Index: e2fsprogs-1.40.5/e2fsck/pass5.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/pass5.c
+++ e2fsprogs-1.40.5/e2fsck/pass5.c
@@ -64,6 +64,42 @@ void e2fsck_pass5(e2fsck_t ctx)
ext2fs_free_block_bitmap(ctx->block_found_map);
ctx->block_found_map = 0;

+ if (ctx->flags & E2F_FLAG_EXPAND_EISIZE) {
+ int min_extra_isize;
+
+ if (!ctx->expand_eisize_map)
+ goto set_min_extra_isize;
+
+ for (pctx.ino = 1; pctx.ino < ctx->fs->super->s_inodes_count;
+ pctx.ino++) {
+ if (ext2fs_test_inode_bitmap(ctx->expand_eisize_map,
+ pctx.ino)) {
+ fix_problem(ctx, PR_5_EXPAND_EISIZE, &pctx);
+ ext2fs_expand_extra_isize(ctx->fs, pctx.ino, 0,
+ ctx->want_extra_isize,
+ NULL, NULL);
+ }
+ }
+ ext2fs_free_inode_bitmap(ctx->expand_eisize_map);
+
+set_min_extra_isize:
+ if (ctx->fs->super->s_min_extra_isize)
+ min_extra_isize = ctx->fs->super->s_min_extra_isize;
+ else
+ min_extra_isize = ctx->want_extra_isize;
+ if (ctx->min_extra_isize >= min_extra_isize &&
+ !ctx->fs_unexpanded_inodes) {
+ ctx->fs->super->s_min_extra_isize =ctx->min_extra_isize;
+ ctx->fs->super->s_feature_ro_compat |=
+ EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE;
+ } else {
+ ctx->fs->super->s_min_extra_isize = 0;
+ ctx->fs->super->s_feature_ro_compat &=
+ ~EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE;
+ }
+ ext2fs_mark_super_dirty(ctx->fs);
+ }
+
#ifdef RESOURCE_TRACK
if (ctx->options & E2F_OPT_TIME2) {
e2fsck_clear_progbar(ctx);
Index: e2fsprogs-1.40.5/lib/ext2fs/ext2_err.et.in
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/ext2_err.et.in
+++ e2fsprogs-1.40.5/lib/ext2fs/ext2_err.et.in
@@ -338,5 +338,29 @@ ec EXT2_ET_EXTENT_LEAF_BAD,
ec EXT2_ET_EXTENT_NO_SPACE,
"No free space in extent map"

+ec EXT2_ET_EA_BAD_MAGIC,
+ "Extended attribute block has bad magic value"
+
+ec EXT2_ET_EA_BAD_ENTRIES,
+ "Extended attribute block has bad entries"
+
+ec EXT2_ET_EA_NO_SPACE,
+ "No free space for extended attribute"
+
+ec EXT2_ET_EA_TOO_BIG,
+ "Extended attribute too big for buffer"
+
+ec EXT2_ET_EA_NAME_TOO_BIG,
+ "Extended attribute name too big for header"
+
+ec EXT2_ET_EA_BAD_NAME,
+ "Extended attribute name is bad"
+
+ec EXT2_ET_EA_NAME_NOT_FOUND,
+ "Extended attribute name not found"
+
+ec EXT2_ET_EA_NAME_EXISTS,
+ "Extended attribute name already exists"
+
end
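
The pass 5 hunk in this patch decides whether the filesystem can keep the
EXTRA_ISIZE feature once all expansions have been attempted. A minimal
sketch of that decision, written as a standalone function, might look like
the following. The struct and function names are hypothetical and exist only
for illustration; the branch logic follows the set_min_extra_isize label in
the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type: what pass 5 would write to the superblock. */
struct eisize_result {
    unsigned min_extra_isize;   /* value for s_min_extra_isize */
    int feature_enabled;        /* 1 if RO_COMPAT_EXTRA_ISIZE stays set */
};

/*
 * sb_min:       s_min_extra_isize currently in the superblock (0 if unset)
 * want:         requested extra isize (ctx->want_extra_isize in the patch)
 * observed_min: smallest i_extra_isize seen among unexpanded inodes
 * unexpanded:   number of inodes that could not be expanded
 */
static void decide_extra_isize(unsigned sb_min, unsigned want,
                               unsigned observed_min, unsigned unexpanded,
                               struct eisize_result *out)
{
    unsigned required;

    assert(out != NULL);

    /* Prefer the superblock minimum; fall back to the wanted size. */
    required = sb_min ? sb_min : want;

    if (observed_min >= required && unexpanded == 0) {
        out->min_extra_isize = observed_min;
        out->feature_enabled = 1;
    } else {
        /* At least one inode is too small: drop the guarantee. */
        out->min_extra_isize = 0;
        out->feature_enabled = 0;
    }
}
```

In this sketch a single inode that cannot be expanded is enough to clear the
feature, which matches the all-or-nothing behaviour of the hunk.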


Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2008-02-02 08:40:07

by Andreas Dilger

Subject: Re: [PATCH][14/28] e2fsprogs-tests-f_expisize_ea_del.patch

Test case for expanding inode size where there is not enough room
for the requested new inode size. Prompt user to delete one or more
EAs (default is to abort).
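
As a rough illustration of the condition this test exercises, the sketch
below checks whether an inode has enough free bytes to grow i_extra_isize
without deleting an EA. The function name, its parameters, and the
simplified layout (a 128-byte fixed part, then i_extra_isize bytes, then the
in-inode EA area) are assumptions made for illustration and are not code
from the patch.

```c
#include <assert.h>

/* Size of the original ext2 inode, the fixed part of a large inode. */
#define GOOD_OLD_INODE_SIZE 128u

/*
 * Return 1 if i_extra_isize can grow from cur_extra to new_extra in place,
 * 0 if some in-inode EA space would have to be freed first.
 * ea_bytes_used is the number of bytes occupied by the in-inode EA area
 * (hypothetical input; a real implementation would walk the EA entries).
 */
static int can_expand_in_place(unsigned inode_size, unsigned cur_extra,
                               unsigned new_extra, unsigned ea_bytes_used)
{
    unsigned used, free_bytes;

    /* The caller must describe a layout that fits in the inode. */
    used = GOOD_OLD_INODE_SIZE + cur_extra + ea_bytes_used;
    assert(used <= inode_size);

    if (new_extra <= cur_extra)
        return 1;               /* nothing to grow */

    free_bytes = inode_size - used;
    return free_bytes >= new_extra - cur_extra;
}
```

When this returns 0, the test expects e2fsck to prompt for deleting one or
more EAs rather than expanding silently.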

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


Attachments:
e2fsprogs-tests-f_expisize_ea_del.patch (13.92 kB)

2008-02-02 08:41:04

by Andreas Dilger

Subject: Re: [PATCH][15/28] e2fsprogs-ibadness-counter.patch


The present e2fsck code checks the inode on a per-field basis. It doesn't
take into consideration the overall sanity of the inode. This may cause
e2fsck to turn a garbage inode into an apparently sane inode ("It is a
vessel of fertilizer, and none may abide its strength.").

The following patch adds a heuristic to detect the degree of badness of
an inode. The icount mechanism is used to keep track of the badness of every
inode. The badness is increased as various fields in the inode are found to
be corrupt. Badness above a certain threshold value results in deletion
of the inode. The default threshold value is 7; it can be specified to
e2fsck using "-E inode_badness_threshold=<value>".

This can avoid lengthy pass1b shared block processing, where a corrupt
chunk of the inode table has resulted in a bunch of garbage inodes
suddenly having shared blocks with a lot of good inodes (or each other).
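
The accumulation and threshold test described above can be sketched in
isolation as follows. The BADNESS_* weights are the ones defined in the
patch; the struct, the helper names, and the saturating add are assumptions
made for this illustration (the patch itself stores the counter through the
icount interface).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BADNESS_NORMAL   1
#define BADNESS_HIGH     2
#define BADNESS_BAD_MODE 100

/* Hypothetical stand-in for one inode's entry in the badness icount. */
struct inode_badness {
    uint16_t score;
};

/* Add count to the running score, saturating at UINT16_MAX. */
static void badness_add(struct inode_badness *b, unsigned count)
{
    uint32_t sum;

    assert(b != NULL);
    if (count > UINT16_MAX)
        count = UINT16_MAX;
    sum = (uint32_t)b->score + count;
    b->score = (uint16_t)(sum > UINT16_MAX ? UINT16_MAX : sum);
}

/*
 * Return 1 if the inode should be offered for clearing.
 * BADNESS_BAD_MODE only flags a corrupt mode for check_filetype(), so it is
 * subtracted before the comparison, as in e2fsck_process_bad_inode().
 */
static int badness_over_threshold(const struct inode_badness *b,
                                  unsigned threshold)
{
    unsigned score;

    assert(b != NULL);
    score = b->score;
    if (score >= BADNESS_BAD_MODE)
        score -= BADNESS_BAD_MODE;
    return score >= threshold;
}
```

Passing the threshold as a parameter keeps the sketch independent of
whichever default is in effect.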

Signed-off-by: Andreas Dilger <[email protected]>
Signed-off-by: Girish Shilamkar <[email protected]>

Index: e2fsprogs-1.40.4/e2fsck/e2fsck.h
===================================================================
--- e2fsprogs-1.40.4.orig/e2fsck/e2fsck.h
+++ e2fsprogs-1.40.4/e2fsck/e2fsck.h
@@ -11,6 +11,7 @@

#include <stdio.h>
#include <string.h>
+#include <stddef.h>
#ifdef HAVE_UNISTD_H
#include <unistd.h>
#endif
@@ -195,6 +196,18 @@ typedef enum {
E2F_CLONE_ZERO
} clone_opt_t;

+#define EXT4_FITS_IN_INODE(ext4_inode, einode, field) \
+ ((offsetof(typeof(*ext4_inode), field) + \
+ sizeof(ext4_inode->field)) \
+ <= (EXT2_GOOD_OLD_INODE_SIZE + \
+ (einode)->i_extra_isize)) \
+
+#define BADNESS_NORMAL 1
+#define BADNESS_HIGH 2
+#define BADNESS_THRESHOLD 8
+#define BADNESS_BAD_MODE 100
+#define BADNESS_LARGE_FILE 2199023255552ULL
+
/*
* Define the extended attribute refcount structure
*/
@@ -229,7 +242,6 @@ struct e2fsck_struct {
unsigned long max);

ext2fs_inode_bitmap inode_used_map; /* Inodes which are in use */
- ext2fs_inode_bitmap inode_bad_map; /* Inodes which are bad somehow */
ext2fs_inode_bitmap inode_dir_map; /* Inodes which are directories */
ext2fs_inode_bitmap inode_bb_map; /* Inodes which are in bad blocks */
ext2fs_inode_bitmap inode_imagic_map; /* AFS inodes */
@@ -244,6 +256,8 @@ struct e2fsck_struct {
*/
ext2_icount_t inode_count;
ext2_icount_t inode_link_info;
+ ext2_icount_t inode_badness;
+ int inode_badness_threshold;

ext2_refcount_t refcount;
ext2_refcount_t refcount_extra;
@@ -344,6 +358,7 @@ struct e2fsck_struct {
/* misc fields */
time_t now;
time_t time_fudge; /* For working around buggy init scripts */
+ time_t now_tolerance_val;
int ext_attr_ver;
shared_opt_t shared;
clone_opt_t clone;
@@ -454,6 +469,8 @@ extern int e2fsck_pass1_check_device_ino
struct ext2_inode *inode);
extern int e2fsck_pass1_check_symlink(ext2_filsys fs,
struct ext2_inode *inode, char *buf);
+extern void e2fsck_mark_inode_bad(e2fsck_t ctx, ino_t ino, int count);
+extern int is_inode_bad(e2fsck_t ctx, ino_t ino);

/* pass2.c */
extern int e2fsck_process_bad_inode(e2fsck_t ctx, ext2_ino_t dir,
Index: e2fsprogs-1.40.4/e2fsck/pass1.c
===================================================================
--- e2fsprogs-1.40.4.orig/e2fsck/pass1.c
+++ e2fsprogs-1.40.4/e2fsck/pass1.c
@@ -20,7 +20,8 @@
* - A bitmap of which inodes are in use. (inode_used_map)
* - A bitmap of which inodes are directories. (inode_dir_map)
* - A bitmap of which inodes are regular files. (inode_reg_map)
- * - A bitmap of which inodes have bad fields. (inode_bad_map)
+ * - An icount mechanism is used to keep track of
+ * inodes with bad fields and their badness (ctx->inode_badness)
* - A bitmap of which inodes are in bad blocks. (inode_bb_map)
* - A bitmap of which inodes are imagic inodes. (inode_imagic_map)
* - A bitmap of which inodes need to be expanded (expand_eisize_map)
@@ -68,7 +69,6 @@ static void check_blocks(e2fsck_t ctx, s
static void mark_table_blocks(e2fsck_t ctx);
static void alloc_bb_map(e2fsck_t ctx);
static void alloc_imagic_map(e2fsck_t ctx);
-static void mark_inode_bad(e2fsck_t ctx, ino_t ino);
static void handle_fs_bad_blocks(e2fsck_t ctx);
static void process_inodes(e2fsck_t ctx, char *block_buf);
static EXT2_QSORT_TYPE process_inode_cmp(const void *a, const void *b);
@@ -220,6 +220,7 @@ static void check_immutable(e2fsck_t ctx
if (!(pctx->inode->i_flags & BAD_SPECIAL_FLAGS))
return;

+ e2fsck_mark_inode_bad(ctx, pctx->ino, BADNESS_NORMAL);
if (!fix_problem(ctx, PR_1_SET_IMMUTABLE, pctx))
return;

@@ -238,6 +239,7 @@ static void check_size(e2fsck_t ctx, str
if ((inode->i_size == 0) && (inode->i_size_high == 0))
return;

+ e2fsck_mark_inode_bad(ctx, pctx->ino, BADNESS_NORMAL);
if (!fix_problem(ctx, PR_1_SET_NONZSIZE, pctx))
return;

@@ -352,6 +354,7 @@ static void check_inode_extra_space(e2fs
*/
if (inode->i_extra_isize &&
(inode->i_extra_isize < min || inode->i_extra_isize > max)) {
+ e2fsck_mark_inode_bad(ctx, pctx->ino, BADNESS_NORMAL);
if (!fix_problem(ctx, PR_1_EXTRA_ISIZE, pctx))
return;
inode->i_extra_isize = ctx->want_extra_isize;
@@ -441,6 +444,7 @@ static void check_is_really_dir(e2fsck_t
(dirent->rec_len % 4))
return;

+ e2fsck_mark_inode_bad(ctx, pctx->ino, BADNESS_NORMAL);
if (fix_problem(ctx, PR_1_TREAT_AS_DIRECTORY, pctx)) {
inode->i_mode = (inode->i_mode & 07777) | LINUX_S_IFDIR;
e2fsck_write_inode_full(ctx, pctx->ino, inode,
@@ -637,6 +641,7 @@ void e2fsck_pass1(e2fsck_t ctx)
ext2_filsys fs = ctx->fs;
ext2_ino_t ino;
struct ext2_inode *inode;
+ struct ext2_inode_large *inode_large;
ext2_inode_scan scan;
char *block_buf;
#ifdef RESOURCE_TRACK
@@ -873,8 +878,10 @@ void e2fsck_pass1(e2fsck_t ctx)
ino, 0);
e2fsck_write_inode(ctx, ino, inode,
"pass1");
+ } else {
+ e2fsck_mark_inode_bad(ctx, ino,
+ BADNESS_NORMAL);
}
-
}
/*
* If dtime is set, offer to clear it. mke2fs
@@ -891,6 +898,7 @@ void e2fsck_pass1(e2fsck_t ctx)
e2fsck_write_inode(ctx, ino, inode,
"pass1");
}
+ e2fsck_mark_inode_bad(ctx, ino, BADNESS_NORMAL);
}
} else if (ino == EXT2_JOURNAL_INO) {
ext2fs_mark_inode_bitmap(ctx->inode_used_map, ino);
@@ -997,6 +1005,7 @@ void e2fsck_pass1(e2fsck_t ctx)
inode->i_dtime = 0;
e2fsck_write_inode(ctx, ino, inode, "pass1");
}
+ e2fsck_mark_inode_bad(ctx, ino, BADNESS_NORMAL);
}

ext2fs_mark_inode_bitmap(ctx->inode_used_map, ino);
@@ -1013,14 +1022,16 @@ void e2fsck_pass1(e2fsck_t ctx)
frag = fsize = 0;
}

+ /* Fixed in pass2, e2fsck_process_bad_inode(). */
if (inode->i_faddr || frag || fsize ||
(LINUX_S_ISDIR(inode->i_mode) && inode->i_dir_acl))
- mark_inode_bad(ctx, ino);
+ e2fsck_mark_inode_bad(ctx, ino, BADNESS_NORMAL);
+ /* Fixed in pass2, e2fsck_process_bad_inode(). */
if ((fs->super->s_creator_os == EXT2_OS_LINUX) &&
!(fs->super->s_feature_ro_compat &
EXT4_FEATURE_RO_COMPAT_HUGE_FILE) &&
(inode->osd2.linux2.l_i_blocks_hi != 0))
- mark_inode_bad(ctx, ino);
+ e2fsck_mark_inode_bad(ctx, ino, BADNESS_NORMAL);
if (inode->i_flags & EXT2_IMAGIC_FL) {
if (imagic_fs) {
if (!ctx->inode_imagic_map)
@@ -1033,6 +1044,7 @@ void e2fsck_pass1(e2fsck_t ctx)
e2fsck_write_inode(ctx, ino,
inode, "pass1");
}
+ e2fsck_mark_inode_bad(ctx, ino, BADNESS_NORMAL);
}
}

@@ -1075,8 +1087,41 @@ void e2fsck_pass1(e2fsck_t ctx)
check_immutable(ctx, &pctx);
check_size(ctx, &pctx);
ctx->fs_sockets_count++;
- } else
- mark_inode_bad(ctx, ino);
+ } else {
+ e2fsck_mark_inode_bad(ctx, ino, BADNESS_NORMAL);
+ }
+
+ if (inode->i_atime > ctx->now + ctx->now_tolerance_val ||
+ inode->i_mtime > ctx->now + ctx->now_tolerance_val)
+ e2fsck_mark_inode_bad(ctx, ino, BADNESS_NORMAL);
+
+ if (inode->i_ctime < sb->s_mkfs_time ||
+ inode->i_ctime > ctx->now + ctx->now_tolerance_val)
+ e2fsck_mark_inode_bad(ctx, ino, BADNESS_HIGH);
+
+ if (EXT4_FITS_IN_INODE(inode_large,
+ (struct ext2_inode_large *)inode, i_crtime)) {
+ if (((struct ext2_inode_large *)inode)->i_crtime <
+ sb->s_mkfs_time ||
+ ((struct ext2_inode_large *)inode)->i_crtime >
+ ctx->now + ctx->now_tolerance_val) {
+ e2fsck_mark_inode_bad(ctx, ino, BADNESS_HIGH);
+ }
+ }
+
+ /* Is it a regular file */
+ if ((LINUX_S_ISREG(inode->i_mode)) &&
+ /* File size > 2TB */
+ ((((long long)inode->i_size_high << 32) +
+ inode->i_size) > BADNESS_LARGE_FILE) &&
+ /* fs does not have huge file feature */
+ ((fs->super->s_creator_os == EXT2_OS_LINUX) &&
+ !(fs->super->s_feature_ro_compat &
+ EXT4_FEATURE_RO_COMPAT_HUGE_FILE) &&
+ /* inode does not have enough blocks for size */
+ (inode->osd2.linux2.l_i_blocks_hi != 0))) {
+ e2fsck_mark_inode_bad(ctx, ino, BADNESS_NORMAL);
+ }

eh = (struct ext3_extent_header *)inode->i_block;
if ((inode->i_flags & EXT4_EXTENTS_FL)) {
@@ -1091,19 +1136,28 @@ void e2fsck_pass1(e2fsck_t ctx)
ext2fs_mark_super_dirty(fs);
extent_fs = 1;
}
- } else if (fix_problem(ctx, PR_1_SET_EXTENT_FL, &pctx)){
- inode->i_flags &= ~EXT4_EXTENTS_FL;
- e2fsck_write_inode(ctx, ino, inode, "pass1");
- goto check_ind_inode;
+ } else {
+ e2fsck_mark_inode_bad(ctx, ino, BADNESS_NORMAL);
+ if (fix_problem(ctx, PR_1_SET_EXTENT_FL,
+ &pctx)) {
+ inode->i_flags &= ~EXT4_EXTENTS_FL;
+ e2fsck_write_inode(ctx, ino,
+ inode,"pass1");
+ goto check_ind_inode;
+ }
}
} else if (extent_fs &&
(LINUX_S_ISREG(inode->i_mode) ||
LINUX_S_ISDIR(inode->i_mode)) &&
ext2fs_extent_header_verify(eh, EXT2_N_BLOCKS *
- sizeof(__u32)) == 0 &&
- fix_problem(ctx, PR_1_UNSET_EXTENT_FL, &pctx)) {
- inode->i_flags |= EXT4_EXTENTS_FL;
- e2fsck_write_inode(ctx, ino, inode, "pass1");
+ sizeof(__u32)) == 0) {
+ if (fix_problem(ctx, PR_1_UNSET_EXTENT_FL,
+ &pctx)) {
+ inode->i_flags |= EXT4_EXTENTS_FL;
+ e2fsck_write_inode(ctx, ino, inode,
+ "pass1");
+ }
+ e2fsck_mark_inode_bad(ctx, ino, BADNESS_NORMAL);
}
if (extent_fs && inode->i_flags & EXT4_EXTENTS_FL) {
ctx->extent_files++;
@@ -1344,29 +1398,27 @@ static EXT2_QSORT_TYPE process_inode_cmp
}

/*
- * Mark an inode as being bad in some what
+ * Mark an inode as being bad and increment its badness counter.
*/
-static void mark_inode_bad(e2fsck_t ctx, ino_t ino)
+void e2fsck_mark_inode_bad(e2fsck_t ctx, ino_t ino, int count)
{
- struct problem_context pctx;
-
- if (!ctx->inode_bad_map) {
- clear_problem_context(&pctx);
+ struct problem_context pctx;
+ __u16 result;

- pctx.errcode = ext2fs_allocate_inode_bitmap(ctx->fs,
- _("bad inode map"), &ctx->inode_bad_map);
+ if (!ctx->inode_badness) {
+ clear_problem_context(&pctx);
+ pctx.errcode = ext2fs_create_icount2(ctx->fs, 0, 0, NULL,
+ &ctx->inode_badness);
if (pctx.errcode) {
- pctx.num = 3;
- fix_problem(ctx, PR_1_ALLOCATE_IBITMAP_ERROR, &pctx);
- /* Should never get here */
+ fix_problem(ctx, PR_1_ALLOCATE_ICOUNT, &pctx);
ctx->flags |= E2F_FLAG_ABORT;
return;
}
}
- ext2fs_mark_inode_bitmap(ctx->inode_bad_map, ino);
+ ext2fs_icount_fetch(ctx->inode_badness, ino, &result);
+ ext2fs_icount_store(ctx->inode_badness, ino, count + result);
}

-
/*
* This procedure will allocate the inode "bb" (badblock) map table
*/
@@ -1521,7 +1573,8 @@ static int check_ext_attr(e2fsck_t ctx,
if (!(fs->super->s_feature_compat & EXT2_FEATURE_COMPAT_EXT_ATTR) ||
(blk < fs->super->s_first_data_block) ||
(blk >= fs->super->s_blocks_count)) {
- mark_inode_bad(ctx, ino);
+ /* Fixed in pass2, e2fsck_process_bad_inode(). */
+ e2fsck_mark_inode_bad(ctx, ino, BADNESS_NORMAL);
return 0;
}

@@ -1701,21 +1754,28 @@ static int handle_htree(e2fsck_t ctx, st

if ((!LINUX_S_ISDIR(inode->i_mode) &&
fix_problem(ctx, PR_1_HTREE_NODIR, pctx)) ||
- (!(fs->super->s_feature_compat & EXT2_FEATURE_COMPAT_DIR_INDEX) &&
- fix_problem(ctx, PR_1_HTREE_SET, pctx)))
- return 1;
+ (!(fs->super->s_feature_compat & EXT2_FEATURE_COMPAT_DIR_INDEX))) {
+ e2fsck_mark_inode_bad(ctx, ino, BADNESS_NORMAL);
+ if (fix_problem(ctx, PR_1_HTREE_SET, pctx))
+ return 1;
+ }

ext2fs_block_iterate2(fs, ino, BLOCK_FLAG_DATA_ONLY | BLOCK_FLAG_HOLE,
block_buf, htree_blk_iter_cb, &blk);
if (((blk == 0) ||
(blk < fs->super->s_first_data_block) ||
- (blk >= fs->super->s_blocks_count)) &&
- fix_problem(ctx, PR_1_HTREE_BADROOT, pctx))
- return 1;
+ (blk >= fs->super->s_blocks_count))) {
+ e2fsck_mark_inode_bad(ctx, ino, BADNESS_NORMAL);
+ if (fix_problem(ctx, PR_1_HTREE_BADROOT, pctx))
+ return 1;
+ }

retval = io_channel_read_blk(fs->io, blk, 1, block_buf);
- if (retval && fix_problem(ctx, PR_1_HTREE_BADROOT, pctx))
- return 1;
+ if (retval) {
+ e2fsck_mark_inode_bad(ctx, ino, BADNESS_NORMAL);
+ if (fix_problem(ctx, PR_1_HTREE_BADROOT, pctx))
+ return 1;
+ }

/* XXX should check that beginning matches a directory */
root = (struct ext2_dx_root_info *) (block_buf + 24);
@@ -1785,6 +1845,9 @@ static int e2fsck_ind_block_verify(struc
bad++;
}

+ if (num_indir <= EXT2_N_BLOCKS)
+ e2fsck_mark_inode_bad(p->ctx, p->ino, bad);
+
if ((num_indir <= EXT2_N_BLOCKS && bad > 4) || bad > 8)
return PR_1_INDIRECT_BAD;

@@ -1828,6 +1891,10 @@ static int e2fsck_ext_block_verify(struc
pctx->blkcount = ex->ee_start;
pctx->num = ex->ee_len;
pctx->blk = ex->ee_block;
+ /* To ensure that extent is in inode */
+ if (eh->eh_max == 4)
+ e2fsck_mark_inode_bad(p->ctx, p->ino,
+ BADNESS_HIGH);
if (fix_problem(ctx, PR_1_EXTENT_BAD, pctx)) {
ext2fs_extent_remove(eh, ex);
i--; ex--; /* check next (moved) item */
@@ -1854,6 +1921,10 @@ static int e2fsck_ext_block_verify(struc
pctx->blkcount = ix->ei_leaf;;
pctx->num = i;
pctx->blk = ix->ei_block;
+ /* To ensure that extent_idx is in inode */
+ if (eh->eh_max == 4)
+ e2fsck_mark_inode_bad(p->ctx, p->ino,
+ BADNESS_HIGH);
if (fix_problem(ctx, PR_1_EXTENT_IDX_BAD,pctx)){
ext2fs_extent_index_remove(eh, ix);
i--; ix--; /* check next (moved) item */
@@ -1861,7 +1932,6 @@ static int e2fsck_ext_block_verify(struc
continue;
}
}
-
ix_prev = ix;
}
}
@@ -1916,6 +1986,7 @@ static void check_blocks(e2fsck_t ctx, s
inode->i_flags &= ~EXT2_COMPRBLK_FL;
dirty_inode++;
}
+ e2fsck_mark_inode_bad(ctx, ino, BADNESS_NORMAL);
}
}

@@ -1961,6 +2032,7 @@ static void check_blocks(e2fsck_t ctx, s
ext2fs_icount_store(ctx->inode_link_info, ino, 0);
inode->i_dtime = ctx->now;
dirty_inode++;
+ ext2fs_icount_store(ctx->inode_badness, ino, 0);
ext2fs_unmark_inode_bitmap(ctx->inode_dir_map, ino);
ext2fs_unmark_inode_bitmap(ctx->inode_reg_map, ino);
ext2fs_unmark_inode_bitmap(ctx->inode_used_map, ino);
@@ -2000,6 +2072,11 @@ static void check_blocks(e2fsck_t ctx, s
ctx->fs_directory_count--;
goto out;
}
+ /*
+ * The mode might be incorrect. Increasing the badness by
+ * a small amount won't hurt much.
+ */
+ e2fsck_mark_inode_bad(ctx, ino, BADNESS_NORMAL);
}

pb.num_blocks *= (fs->blocksize / 512);
@@ -2039,6 +2116,7 @@ static void check_blocks(e2fsck_t ctx, s
inode->i_size_high = pctx->num >> 32;
dirty_inode++;
}
+ e2fsck_mark_inode_bad(ctx, ino, BADNESS_NORMAL);
pctx->num = 0;
}
if (LINUX_S_ISREG(inode->i_mode) &&
@@ -2050,6 +2128,7 @@ static void check_blocks(e2fsck_t ctx, s
inode->i_blocks = pb.num_blocks;
dirty_inode++;
}
+ e2fsck_mark_inode_bad(ctx, ino, BADNESS_NORMAL);
pctx->num = 0;
}
out:
Index: e2fsprogs-1.40.4/e2fsck/pass4.c
===================================================================
--- e2fsprogs-1.40.4.orig/e2fsck/pass4.c
+++ e2fsprogs-1.40.4/e2fsck/pass4.c
@@ -185,6 +185,7 @@ void e2fsck_pass4(e2fsck_t ctx)
}
ext2fs_free_icount(ctx->inode_link_info); ctx->inode_link_info = 0;
ext2fs_free_icount(ctx->inode_count); ctx->inode_count = 0;
+ ext2fs_free_icount(ctx->inode_badness); ctx->inode_badness = 0;
ext2fs_free_inode_bitmap(ctx->inode_bb_map);
ctx->inode_bb_map = 0;
ext2fs_free_inode_bitmap(ctx->inode_imagic_map);
Index: e2fsprogs-1.40.4/e2fsck/pass2.c
===================================================================
--- e2fsprogs-1.40.4.orig/e2fsck/pass2.c
+++ e2fsprogs-1.40.4/e2fsck/pass2.c
@@ -251,10 +251,6 @@ void e2fsck_pass2(e2fsck_t ctx)
ext2fs_free_mem(&buf);
ext2fs_free_dblist(fs->dblist);

- if (ctx->inode_bad_map) {
- ext2fs_free_inode_bitmap(ctx->inode_bad_map);
- ctx->inode_bad_map = 0;
- }
if (ctx->inode_reg_map) {
ext2fs_free_inode_bitmap(ctx->inode_reg_map);
ctx->inode_reg_map = 0;
@@ -501,6 +497,7 @@ static _INLINE_ int check_filetype(e2fsc
{
int filetype = dirent->name_len >> 8;
int should_be = EXT2_FT_UNKNOWN;
+ __u32 result;
struct ext2_inode inode;

if (!(ctx->fs->super->s_feature_incompat &
@@ -512,16 +509,18 @@ static _INLINE_ int check_filetype(e2fsc
return 1;
}

+ if (ctx->inode_badness)
+ ext2fs_icount_fetch32(ctx->inode_badness, dirent->inode,
+ &result);
+
if (ext2fs_test_inode_bitmap(ctx->inode_dir_map, dirent->inode)) {
should_be = EXT2_FT_DIR;
} else if (ext2fs_test_inode_bitmap(ctx->inode_reg_map,
dirent->inode)) {
should_be = EXT2_FT_REG_FILE;
- } else if (ctx->inode_bad_map &&
- ext2fs_test_inode_bitmap(ctx->inode_bad_map,
- dirent->inode))
+ } else if (ctx->inode_badness && result >= BADNESS_BAD_MODE) {
should_be = 0;
- else {
+ } else {
e2fsck_read_inode(ctx, dirent->inode, &inode,
"check_filetype");
should_be = ext2_file_type(inode.i_mode);
@@ -956,12 +955,10 @@ static int check_dir_block(ext2_filsys f
* (We wait until now so that we can display the
* pathname to the user.)
*/
- if (ctx->inode_bad_map &&
- ext2fs_test_inode_bitmap(ctx->inode_bad_map,
- dirent->inode)) {
- if (e2fsck_process_bad_inode(ctx, ino,
- dirent->inode,
- buf + fs->blocksize)) {
+ if ((ctx->inode_badness) &&
+ ext2fs_icount_is_set(ctx->inode_badness, dirent->inode)) {
+ if (e2fsck_process_bad_inode(ctx, ino, dirent->inode,
+ buf + fs->blocksize)) {
dirent->inode = 0;
dir_modified++;
goto next;
@@ -1195,8 +1192,8 @@ static void deallocate_inode(e2fsck_t ct
e2fsck_read_bitmaps(ctx);
ext2fs_unmark_inode_bitmap(ctx->inode_used_map, ino);
ext2fs_unmark_inode_bitmap(ctx->inode_dir_map, ino);
- if (ctx->inode_bad_map)
- ext2fs_unmark_inode_bitmap(ctx->inode_bad_map, ino);
+ if (ctx->inode_badness)
+ ext2fs_icount_store(ctx->inode_badness, ino, 0);
ext2fs_inode_alloc_stats2(fs, ino, -1, LINUX_S_ISDIR(inode.i_mode));

if (inode.i_file_acl &&
@@ -1261,8 +1258,10 @@ extern int e2fsck_process_bad_inode(e2fs
int not_fixed = 0;
unsigned char *frag, *fsize;
struct problem_context pctx;
- int problem = 0;
+ int problem = 0;
+ __u16 badness;

+ ext2fs_icount_fetch(ctx->inode_badness, ino, &badness);
e2fsck_read_inode(ctx, ino, &inode, "process_bad_inode");

clear_problem_context(&pctx);
@@ -1277,6 +1276,7 @@ extern int e2fsck_process_bad_inode(e2fs
inode_modified++;
} else
not_fixed++;
+ badness += BADNESS_NORMAL;
}

if (!LINUX_S_ISDIR(inode.i_mode) && !LINUX_S_ISREG(inode.i_mode) &&
@@ -1310,6 +1310,11 @@ extern int e2fsck_process_bad_inode(e2fs
} else
not_fixed++;
problem = 0;
+ /*
+ * A high value is associated with bad mode in order to detect
+ * that mode was corrupt in check_filetype()
+ */
+ badness += BADNESS_BAD_MODE;
}

if (inode.i_faddr) {
@@ -1318,6 +1323,7 @@ extern int e2fsck_process_bad_inode(e2fs
inode_modified++;
} else
not_fixed++;
+ badness += BADNESS_NORMAL;
}

switch (fs->super->s_creator_os) {
@@ -1339,6 +1345,7 @@ extern int e2fsck_process_bad_inode(e2fs
inode_modified++;
} else
not_fixed++;
+ badness += BADNESS_NORMAL;
pctx.num = 0;
}
if (fsize && *fsize) {
@@ -1348,11 +1355,28 @@ extern int e2fsck_process_bad_inode(e2fs
inode_modified++;
} else
not_fixed++;
+ badness += BADNESS_NORMAL;
pctx.num = 0;
}

+ /* In pass1 these conditions were used to mark inode bad so that
+ * it calls e2fsck_process_bad_inode and make an extensive check
+ * plus prompt for action to be taken. To compensate for badness
+ * incremented in pass1 by this condition, decrease it.
+ */
+ if ((inode.i_faddr || frag || fsize ||
+ (LINUX_S_ISDIR(inode.i_mode) && inode.i_dir_acl)) ||
+ (inode.i_file_acl &&
+ (!(fs->super->s_feature_compat & EXT2_FEATURE_COMPAT_EXT_ATTR) ||
+ (inode.i_file_acl < fs->super->s_first_data_block) ||
+ (inode.i_file_acl >= fs->super->s_blocks_count)))) {
+ /* badness can be 0 if called from pass4. */
+ if (badness)
+ badness -= BADNESS_NORMAL;
+ }
+
if ((fs->super->s_creator_os == EXT2_OS_LINUX) &&
- !(fs->super->s_feature_ro_compat &
+ !(fs->super->s_feature_ro_compat &
EXT4_FEATURE_RO_COMPAT_HUGE_FILE) &&
(inode.osd2.linux2.l_i_blocks_hi != 0)) {
pctx.num = inode.osd2.linux2.l_i_blocks_hi;
@@ -1360,6 +1384,8 @@ extern int e2fsck_process_bad_inode(e2fs
inode.osd2.linux2.l_i_blocks_hi = 0;
inode_modified++;
}
+ /* Badness was increased in pass1 for this condition */
+ /* badness += BADNESS_NORMAL; */
}

if (inode.i_file_acl &&
@@ -1370,6 +1396,7 @@ extern int e2fsck_process_bad_inode(e2fs
inode_modified++;
} else
not_fixed++;
+ badness += BADNESS_NORMAL;
}
if (inode.i_dir_acl &&
LINUX_S_ISDIR(inode.i_mode)) {
@@ -1378,12 +1405,28 @@ extern int e2fsck_process_bad_inode(e2fs
inode_modified++;
} else
not_fixed++;
+ badness += BADNESS_NORMAL;
+ }
+
+ /*
+ * The high value due to BADNESS_BAD_MODE should not delete the inode.
+ */
+ if ((badness - ((badness >= BADNESS_BAD_MODE) ? BADNESS_BAD_MODE : 0))>=
+ ctx->inode_badness_threshold) {
+ pctx.num = badness;
+ if (fix_problem(ctx, PR_2_INODE_TOOBAD, &pctx)) {
+ deallocate_inode(ctx, ino, 0);
+ if (ctx->flags & E2F_FLAG_SIGNAL_MASK)
+ return 0;
+ return 1;
+ }
+ not_fixed++;
}

if (inode_modified)
e2fsck_write_inode(ctx, ino, &inode, "process_bad_inode");
- if (!not_fixed && ctx->inode_bad_map)
- ext2fs_unmark_inode_bitmap(ctx->inode_bad_map, ino);
+ if (ctx->inode_badness)
+ ext2fs_icount_store(ctx->inode_badness, ino, 0);
return 0;
}

Index: e2fsprogs-1.40.4/e2fsck/problem.c
===================================================================
--- e2fsprogs-1.40.4.orig/e2fsck/problem.c
+++ e2fsprogs-1.40.4/e2fsck/problem.c
@@ -1316,6 +1316,11 @@ static struct e2fsck_problem problem_tab
N_("@i %i found in @g %g unused inodes area. "),
PROMPT_FIX, PR_PREEN_OK },

+ /* Inode too bad */
+ { PR_2_INODE_TOOBAD,
+ N_("@i %i is badly corrupt (badness value = %N). "),
+ PROMPT_CLEAR, PR_PREEN_OK },
+
/* Pass 3 errors */

/* Pass 3: Checking directory connectivity */
Index: e2fsprogs-1.40.4/e2fsck/problem.h
===================================================================
--- e2fsprogs-1.40.4.orig/e2fsck/problem.h
+++ e2fsprogs-1.40.4/e2fsck/problem.h
@@ -792,6 +792,9 @@ struct problem_context {
/* Inode found in group unused inodes area */
#define PR_2_INOREF_IN_UNUSED 0x020046

+/* Inode completely corrupt */
+#define PR_2_INODE_TOOBAD 0x020047
+
/*
* Pass 3 errors
*/
Index: e2fsprogs-1.40.4/lib/ext2fs/icount.c
===================================================================
--- e2fsprogs-1.40.4.orig/lib/ext2fs/icount.c
+++ e2fsprogs-1.40.4/lib/ext2fs/icount.c
@@ -462,6 +462,23 @@ static errcode_t get_inode_count(ext2_ic
return 0;
}

+int ext2fs_icount_is_set(ext2_icount_t icount, ext2_ino_t ino)
+{
+ __u16 result;
+
+ if (ext2fs_test_inode_bitmap(icount->single, ino))
+ return 1;
+ else if (icount->multiple) {
+ if (ext2fs_test_inode_bitmap(icount->multiple, ino))
+ return 1;
+ return 0;
+ }
+ ext2fs_icount_fetch(icount, ino, &result);
+ if (result)
+ return 1;
+ return 0;
+}
+
errcode_t ext2fs_icount_validate(ext2_icount_t icount, FILE *out)
{
errcode_t ret = 0;
@@ -501,6 +518,7 @@ errcode_t ext2fs_icount_fetch32(ext2_ico
*ret = 0;
return 0;
}
+
get_inode_count(icount, ino, ret);
return 0;
}
Index: e2fsprogs-1.40.4/e2fsck/pass1b.c
===================================================================
--- e2fsprogs-1.40.4.orig/e2fsck/pass1b.c
+++ e2fsprogs-1.40.4/e2fsck/pass1b.c
@@ -613,8 +613,8 @@ static void delete_file(e2fsck_t ctx, ex
fix_problem(ctx, PR_1B_BLOCK_ITERATE, &pctx);
ext2fs_unmark_inode_bitmap(ctx->inode_used_map, ino);
ext2fs_unmark_inode_bitmap(ctx->inode_dir_map, ino);
- if (ctx->inode_bad_map)
- ext2fs_unmark_inode_bitmap(ctx->inode_bad_map, ino);
+ if (ctx->inode_badness)
+ e2fsck_mark_inode_bad(ctx, ino, 0);
ext2fs_inode_alloc_stats2(fs, ino, -1, LINUX_S_ISDIR(inode.i_mode));

/* Inode may have changed by block_iterate, so reread it */
Index: e2fsprogs-1.40.4/e2fsck/unix.c
===================================================================
--- e2fsprogs-1.40.4.orig/e2fsck/unix.c
+++ e2fsprogs-1.40.4/e2fsck/unix.c
@@ -624,6 +624,18 @@ static void parse_extended_opts(e2fsck_t
extended_usage++;
continue;
}
+ /* -E inode_badness_threshold=<value> */
+ } else if (strcmp(token, "inode_badness_threshold") == 0) {
+ if (!arg) {
+ extended_usage++;
+ continue;
+ }
+ ctx->inode_badness_threshold = strtoul(arg, &p, 0);
+ if (*p != '\0' || (ctx->inode_badness_threshold > 200)){
+ fprintf(stderr, _("Invalid badness value.\n"));
+ extended_usage++;
+ continue;
+ }
} else {
fprintf(stderr, _("Unknown extended option: %s\n"),
token);
@@ -639,6 +646,7 @@ static void parse_extended_opts(e2fsck_t
"\tshared=<preserve|lost+found|delete>\n"
"\tclone=<dup|zero>\n"
"\tea_ver=<ea_version (1 or 2)>\n"
+ "\tinode_badness_threshold=<value>\n"
"\texpand_extra_isize\n"
"\n"), stderr);
exit(1);
@@ -703,6 +711,9 @@ static errcode_t PRS(int argc, char *arg
profile_init(config_fn, &ctx->profile);
initialize_profile_options(ctx);

+ ctx->inode_badness_threshold = BADNESS_THRESHOLD;
+ ctx->now_tolerance_val = 172800; /* Two days */
+
while ((c = getopt (argc, argv, "panyrcC:B:dE:fvtFVM:b:I:j:P:l:L:N:SsDk")) != EOF)
switch (c) {
case 'C':
Index: e2fsprogs-1.40.4/e2fsck/e2fsck.c
===================================================================
--- e2fsprogs-1.40.4.orig/e2fsck/e2fsck.c
+++ e2fsprogs-1.40.4/e2fsck/e2fsck.c
@@ -105,10 +105,6 @@ errcode_t e2fsck_reset_context(e2fsck_t
ext2fs_free_inode_bitmap(ctx->inode_bb_map);
ctx->inode_bb_map = 0;
}
- if (ctx->inode_bad_map) {
- ext2fs_free_inode_bitmap(ctx->inode_bad_map);
- ctx->inode_bad_map = 0;
- }
if (ctx->inode_imagic_map) {
ext2fs_free_inode_bitmap(ctx->inode_imagic_map);
ctx->inode_imagic_map = 0;
Index: e2fsprogs-1.40.4/e2fsck/e2fsck.8.in
===================================================================
--- e2fsprogs-1.40.4.orig/e2fsck/e2fsck.8.in
+++ e2fsprogs-1.40.4/e2fsck/e2fsck.8.in
@@ -191,6 +191,13 @@ in place (preserve);
Assume the format of the extended attribute blocks in the filesystem is
the specified version number. The version number may be 1 or 2. The
default extended attribute version format is 2.
+.TP
+.BI inode_badness_threshold= threshold_value
+A badness counter is associated with every inode, which determines the degree
+of inode corruption. Each error found in the inode will increase the badness by
+1 or 2, and inodes with a badness at or above
+.I threshold_value will be prompted for deletion. The default
+.I threshold_value is 7.
.RE
.TP
.B \-f
Index: e2fsprogs-1.40.4/tests/f_bad_disconnected_inode/expect.1
===================================================================
--- e2fsprogs-1.40.4.orig/tests/f_bad_disconnected_inode/expect.1
+++ e2fsprogs-1.40.4/tests/f_bad_disconnected_inode/expect.1
@@ -39,10 +39,7 @@ Clear? yes
i_blocks_hi for inode 16 (...) is 62762, should be zero.
Clear? yes

-Unattached inode 16
-Connect to /lost+found? yes
-
-Inode 16 ref count is 5925, should be 1. Fix? yes
+Inode 16 is badly corrupt (badness value = 10). Clear? yes

Pass 5: Checking group summary information
Block bitmap differences: -(9--19)
@@ -54,19 +51,16 @@ Fix? yes
Free blocks count wrong (79, counted=91).
Fix? yes

-Inode bitmap differences: +16
-Fix? yes
-
-Free inodes count wrong for group #0 (7, counted=4).
+Free inodes count wrong for group #0 (8, counted=5).
Fix? yes

Directories count wrong for group #0 (3, counted=2).
Fix? yes

-Free inodes count wrong (7, counted=4).
+Free inodes count wrong (8, counted=5).
Fix? yes


test_filesys: ***** FILE SYSTEM WAS MODIFIED *****
-test_filesys: 12/16 files (0.0% non-contiguous), 9/100 blocks
+test_filesys: 11/16 files (0.0% non-contiguous), 9/100 blocks
Exit status is 1
Index: e2fsprogs-1.40.4/tests/f_bad_disconnected_inode/expect.2
===================================================================
--- e2fsprogs-1.40.4.orig/tests/f_bad_disconnected_inode/expect.2
+++ e2fsprogs-1.40.4/tests/f_bad_disconnected_inode/expect.2
@@ -3,5 +3,5 @@ Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
-test_filesys: 12/16 files (0.0% non-contiguous), 9/100 blocks
+test_filesys: 11/16 files (0.0% non-contiguous), 9/100 blocks
Exit status is 0
Index: e2fsprogs-1.40.4/lib/ext2fs/ext2fs.h
===================================================================
--- e2fsprogs-1.40.4.orig/lib/ext2fs/ext2fs.h
+++ e2fsprogs-1.40.4/lib/ext2fs/ext2fs.h
@@ -831,6 +831,7 @@ extern errcode_t ext2fs_initialize(const

/* icount.c */
extern void ext2fs_free_icount(ext2_icount_t icount);
+extern int ext2fs_icount_is_set(ext2_icount_t icount, ext2_ino_t ino);
extern errcode_t ext2fs_create_icount_tdb(ext2_filsys fs, char *tdb_dir,
int flags, ext2_icount_t *ret);
extern errcode_t ext2fs_create_icount2(ext2_filsys fs, int flags,

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2008-02-02 08:43:46

by Andreas Dilger

Subject: Re: [PATCH][16/28] e2fsprogs-tests-f_ibadness.patch

Test case for bad inode detection, and fixes for old test cases.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


Attachments:
(No filename) (167.00 B)
e2fsprogs-tests-f_ibadness.patch (42.33 kB)

2008-02-02 08:46:31

by Andreas Dilger

Subject: Re: [PATCH][18/28] e2fsprogs-tests-f_random_corruption.patch

The f_random_corruption test enables a random subset of filesystem features,
picks a valid filesystem block size and inode size at random, chooses a random
device size, and creates a new filesystem with those parameters.

It is possible to disable the running of the test by setting the environment
variable F_RANDOM_CORRUPTION=skip. By default the test script is run only
one time, but setting the LOOP_COUNT variable allows the test to run multiple
times.
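
For illustration, the two knobs described above can be sketched in a few
lines of POSIX shell. The variable names are the ones the test script reads;
the should_skip and loop_count helpers are made up for this example and do
not exist in the script.

```shell
#!/bin/sh
# Sketch only: mirrors how the test reads its two environment knobs.
# should_skip and loop_count are hypothetical helpers, not part of the patch.

# Succeeds (returns 0) when the user asked to skip the test.
should_skip()
{
	[ "$F_RANDOM_CORRUPTION" = "skip" ]
}

# Prints the number of iterations to run, defaulting to 1.
loop_count()
{
	echo "${LOOP_COUNT:-1}"
}

if should_skip; then
	echo "skipped"
else
	echo "would run $(loop_count) iteration(s)"
fi
```

Assuming the suite is started with "make check" from the tests directory,
something like "LOOP_COUNT=10 make check" should repeat the test ten times.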

If the script is running as root the filesystem is mounted and populated with
file data to allow a more useful test filesystem to be generated. In some
cases the kernel may not support the requested filesystem features and the
filesystem cannot be mounted. This is not considered a test failure.

The resulting filesystem is corrupted both by writing random data and by
shifting data from one part of the device to another, and is then repaired by
e2fsck. In some rare cases the random corruption is severe enough that the
filesystem is not recoverable (e.g. a small filesystem with no backup
superblock whose primary superblock is badly corrupted), but in most cases
"e2fsck -fy" should be able to fix all errors in some way.
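
The test judges each e2fsck run by its exit status, which is a bitmask. As a
rough sketch of how that status can be decoded (bit meanings as documented in
the e2fsck(8) man page; describe_fsck_status is a hypothetical helper name
used only here):

```shell
#!/bin/sh
# Sketch: decode an e2fsck exit status into one line per set bit.
# describe_fsck_status is a hypothetical helper, not part of the patch.
describe_fsck_status()
{
	ret=$1
	if [ "$ret" -eq 0 ]; then
		echo "no errors"
		return 0
	fi
	[ $((ret & 1)) -ne 0 ] && echo "errors corrected"
	[ $((ret & 2)) -ne 0 ] && echo "reboot requested"
	[ $((ret & 4)) -ne 0 ] && echo "errors left uncorrected"
	[ $((ret & 8)) -ne 0 ] && echo "operational error"
	[ $((ret & 16)) -ne 0 ] && echo "usage error"
	[ $((ret & 32)) -ne 0 ] && echo "cancelled by user"
	[ $((ret & 128)) -ne 0 ] && echo "shared library error"
	return 0
}

# Example: status 5 has bits 1 and 4 set.
describe_fsck_status 5
```

In the script, bit 1 on the first pass is only logged, while any set bit on
the second pass is treated as a failure.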

After e2fsck has repaired the filesystem, it is optionally mounted (if the
environment variable MOUNT_AFTER_CORRUPTION=yes is set) and the test files
created in the filesystem are deleted. This verifies that the fixes in the
filesystem are sufficient for the kernel to use the filesystem without error.
Since there is some possibility of the kernel oopsing if there is a filesystem
bug, this part of the test is not enabled by default.

Signed-off-by: Andreas Dilger <[email protected]>
Signed-off-by: Kalpak Shah <[email protected]>

Index: e2fsprogs-1.40.4/tests/f_random_corruption/script
===================================================================
--- /dev/null
+++ e2fsprogs-1.40.4/tests/f_random_corruption/script
@@ -0,0 +1,277 @@
+# This is to make sure that if this test fails other tests can still be run
+# instead of doing an exit. We break before the end of the loop.
+export LOOP_COUNT=${LOOP_COUNT:-1}
+export COUNT=0
+
+while [ $COUNT -lt $LOOP_COUNT ]; do
+[ "$F_RANDOM_CORRUPTION" = "skip" ] && echo "skipped" && break
+
+# choose block and inode sizes randomly
+BLK_SIZES=(1024 2048 4096)
+INODE_SIZES=(128 256 512 1024)
+
+SEED=$(head -1 /dev/urandom | od -N 1 | awk '{ print $2 }')
+RANDOM=$SEED
+
+IMAGE=${IMAGE:-$TMPFILE}
+DATE=`date '+%Y%m%d%H%M%S'`
+ARCHIVE=$IMAGE.$DATE
+SIZE=${SIZE:-$(( 192000 + RANDOM + RANDOM )) }
+FS_TYPE=${FS_TYPE:-ext3}
+BLK_SIZE=${BLK_SIZES[(( $RANDOM % ${#BLK_SIZES[*]} ))]}
+INODE_SIZE=${INODE_SIZES[(( $RANDOM % ${#INODE_SIZES[*]} ))]}
+DEF_FEATURES="sparse_super,filetype,dir_index"
+FEATURES=${FEATURES:-$DEF_FEATURES}
+MOUNT_OPTS="-o loop"
+MNTPT=$test_dir/temp
+OUT=$test_name.log
+FAILED=$test_name.failed
+OKFILE=$test_name.ok
+
+# Do you want to try and mount the filesystem?
+MOUNT_AFTER_CORRUPTION=${MOUNT_AFTER_CORRUPTION:-"no"}
+# Do you want to remove the files from the mounted filesystem?
+# Ideally use it only in test environment.
+REMOVE_FILES=${REMOVE_FILES:-"no"}
+
+# In KB
+CORRUPTION_SIZE=${CORRUPTION_SIZE:-64}
+CORRUPTION_ITERATIONS=${CORRUPTION_ITERATIONS:-5}
+
+MKFS=../misc/mke2fs
+E2FSCK=../e2fsck/e2fsck
+FIRST_FSCK_OPTS="-fyv"
+SECOND_FSCK_OPTS="-fyv"
+
+# Lets check if the image can fit in the current filesystem.
+BASE_DIR=`dirname $IMAGE`
+BASE_AVAIL_BLOCKS=`df -P -k $BASE_DIR | awk '/%/ { print $4 }'`
+
+if (( BASE_AVAIL_BLOCKS < SIZE )); then
+ echo "$BASE_DIR does not have enough space to accommodate test image."
+ echo "Skipping test...."
+ break;
+fi
+
+# Lets have a journal more times than not.
+HAVE_JOURNAL=$((RANDOM % 12 ))
+if (( HAVE_JOURNAL == 0 )); then
+ FS_TYPE="ext2"
+ HAVE_JOURNAL=""
+else
+ HAVE_JOURNAL="-j"
+fi
+
+# Experimental features should not be used too often.
+LAZY_BG=$(( $RANDOM % 12 ))
+if (( LAZY_BG == 0 )); then
+ FEATURES=$FEATURES,lazy_bg
+fi
+
+# meta_bg and resize_inode features should not be enabled simultaneously
+META_BG=$(( $RANDOM % 12 ))
+if (( META_BG == 0 )); then
+ FEATURES=$FEATURES,meta_bg
+else
+ FEATURES=$FEATURES,resize_inode
+fi
+
+modprobe ext4 2> /dev/null
+modprobe ext4dev 2> /dev/null
+
+# If ext4 is present in the kernel then we can play with ext4 options
+EXT4=`grep ext4 /proc/filesystems`
+if [ -n "$EXT4" ]; then
+ USE_EXT4=$((RANDOM % 2 ))
+ if (( USE_EXT4 == 1 )); then
+ FS_TYPE="ext4dev"
+ fi
+fi
+
+if [ "$FS_TYPE" = "ext4dev" ]; then
+ UNINIT_GROUPS=$((RANDOM % 12 ))
+ if (( UNINIT_GROUPS == 0 )); then
+ FEATURES=$FEATURES,uninit_groups
+ fi
+ EXPAND_EISIZE=$((RANDOM % 12 ))
+ if (( EXPAND_EISIZE == 0 )); then
+ FIRST_FSCK_OPTS="$FIRST_FSCK_OPTS -E expand_extra_isize"
+ fi
+fi
+
+MKFS_OPTS="$HAVE_JOURNAL -b $BLK_SIZE -I $INODE_SIZE -O $FEATURES"
+
+NUM_BLKS=$(((SIZE * 1024) / BLK_SIZE))
+
+log()
+{
+ [ "$VERBOSE" ] && echo "$*"
+ echo "$*" >> $OUT
+}
+
+error()
+{
+ log "$*"
+ echo "$*" >> $FAILED
+}
+
+unset_vars()
+{
+ unset IMAGE DATE ARCHIVE FS_TYPE SIZE BLK_SIZE MKFS_OPTS MOUNT_OPTS
+ unset E2FSCK FIRST_FSCK_OPTS SECOND_FSCK_OPTS OUT FAILED OKFILE
+}
+
+cleanup()
+{
+ [ "$1" ] && error "$*" || error "Error occurred..."
+ umount -f $MNTPT > /dev/null 2>&1 | tee -a $OUT
+ cp $OUT $OUT.$DATE
+ echo " failed"
+ echo "*** This appears to be a bug in e2fsprogs ***"
+ echo "Please contact [email protected] for further assistance."
+ echo "Include $OUT.$DATE, and save $ARCHIVE locally for reference."
+ unset_vars
+ break;
+}
+
+echo -n "Random corruption test for e2fsck:"
+# Truncate the output log file
+rm -f $FAILED $OKFILE
+> $OUT
+
+get_random_location()
+{
+ total=$1
+
+ tmp=$(((RANDOM * 32768) % total))
+
+ # Try and have more corruption in metadata at the start of the
+ # filesystem.
+ if ((tmp % 3 == 0 || tmp % 5 == 0 || tmp % 7 == 0)); then
+ tmp=$((tmp % 32768))
+ fi
+
+ echo $tmp
+}
+
+make_fs_dirty()
+{
+ from=`get_random_location $NUM_BLKS`
+
+ # Number of blocks to write garbage into should be within fs and should
+ # not be too many.
+ num_blks_to_dirty=$((RANDOM % $1))
+
+ # write garbage into the selected blocks
+ [ ! -c /dev/urandom ] && return
+ log "writing ${num_blks_to_dirty}kB random garbage at offset ${from}kB"
+ dd if=/dev/urandom of=$IMAGE bs=1kB seek=$from conv=notrunc \
+ count=$num_blks_to_dirty >> $OUT 2>&1
+}
+
+
+touch $IMAGE
+log "Format the filesystem image..."
+log
+# Write some garbage blocks into the filesystem to make sure e2fsck has to do
+# a more difficult job than checking blocks of zeroes.
+log "Copy some random data into filesystem image...."
+make_fs_dirty 32768
+log "$MKFS $MKFS_OPTS -F $IMAGE $NUM_BLKS >> $OUT"
+$MKFS $MKFS_OPTS -F $IMAGE $NUM_BLKS >> $OUT 2>&1
+if [ $? -ne 0 ]; then
+ zero_size=`grep -c "Device size reported to be zero" $OUT`
+ short_write=`grep -c "Attempt to write block from filesystem resulted in short write" $OUT`
+
+ if (( zero_size != 0 || short_write != 0 )); then
+ echo "mkfs failed due to device size of 0 or a short write. This is harmless and need not be reported."
+ else
+ cleanup "mkfs failed - internal error during operation. Aborting random regression test..."
+ fi
+fi
+
+if [ `id -u` = 0 ]; then
+ mkdir -p $MNTPT
+ if [ $? -ne 0 ]; then
+ log "Failed to create or find mountpoint...."
+ else
+ mount -t $FS_TYPE $MOUNT_OPTS $IMAGE $MNTPT 2>&1 | tee -a $OUT
+ if [ $? -ne 0 ]; then
+ log "Unable to mount file system - skipped"
+ else
+ df -h $MNTPT >> $OUT
+ df -i $MNTPT >> $OUT
+ log "Copying data into the test filesystem..."
+
+ cp -r ../ $MNTPT >> $OUT 2>&1
+ sync
+ umount -f $MNTPT > /dev/null 2>&1 | tee -a $OUT
+ fi
+ fi
+else
+ log "skipping mount test for non-root user"
+fi
+
+log "Corrupt the image by moving around blocks of data..."
+log
+for (( i = 0; i < $CORRUPTION_ITERATIONS; i++ )); do
+ from=`get_random_location $NUM_BLKS`
+ to=`get_random_location $NUM_BLKS`
+
+ log "Moving ${CORRUPTION_SIZE}kB from block ${from}kB to ${to}kB"
+ dd if=$IMAGE of=$IMAGE bs=1k count=$CORRUPTION_SIZE conv=notrunc skip=$from seek=$to >> $OUT 2>&1
+
+ # more corruption by overwriting blocks from within the filesystem.
+ make_fs_dirty $CORRUPTION_SIZE
+done
+
+# Copy the image for reproducing the bug.
+cp --sparse=always $IMAGE $ARCHIVE >> $OUT 2>&1
+
+log "First pass of fsck..."
+$E2FSCK $FIRST_FSCK_OPTS $IMAGE >> $OUT 2>&1
+RET=$?
+
+# Run e2fsck for the second time and check if the problem gets solved.
+# After we can report error with pass1.
+[ $((RET & 1)) == 0 ] || log "The first fsck corrected errors"
+[ $((RET & 2)) == 0 ] || error "The first fsck wants a reboot"
+[ $((RET & 4)) == 0 ] || error "The first fsck left uncorrected errors"
+[ $((RET & 8)) == 0 ] || error "The first fsck reports an operational error"
+[ $((RET & 16)) == 0 ] || error "The first fsck reports there was a usage error"
+[ $((RET & 32)) == 0 ] || error "The first fsck reports it was cancelled"
+[ $((RET & 128)) == 0 ] || error "The first fsck reports a library error"
+
+log "---------------------------------------------------------"
+
+log "Second pass of fsck..."
+$E2FSCK $SECOND_FSCK_OPTS $IMAGE >> $OUT 2>&1
+RET=$?
+[ $((RET & 1)) == 0 ] || cleanup "The second fsck corrected errors!"
+[ $((RET & 2)) == 0 ] || cleanup "The second fsck wants a reboot"
+[ $((RET & 4)) == 0 ] || cleanup "The second fsck left uncorrected errors"
+[ $((RET & 8)) == 0 ] || cleanup "The second fsck reports an operational error"
+[ $((RET & 16)) == 0 ] || cleanup "The second fsck reports a usage error"
+[ $((RET & 32)) == 0 ] || cleanup "The second fsck reports it was cancelled"
+[ $((RET & 128)) == 0 ] || cleanup "The second fsck reports a library error"
+
+[ -f $FAILED ] && cleanup
+
+if [ "$MOUNT_AFTER_CORRUPTION" = "yes" ]; then
+ mount -t $FS_TYPE $MOUNT_OPTS $IMAGE $MNTPT 2>&1 | tee -a $OUT
+ [ $? -ne 0 ] && log "Unable to mount file system - skipped"
+
+ [ "$REMOVE_FILES" = "yes" ] && rm -rf $MNTPT/* >> $OUT
+ umount -f $MNTPT > /dev/null 2>&1 | tee -a $OUT
+fi
+
+rm -f $ARCHIVE
+
+# Report success
+echo " ok"
+echo "Succeeded..." > $OKFILE
+
+unset_vars
+
+COUNT=$((COUNT + 1))
+done
Index: e2fsprogs-1.40.4/tests/Makefile.in
===================================================================
--- e2fsprogs-1.40.4.orig/tests/Makefile.in
+++ e2fsprogs-1.40.4/tests/Makefile.in
@@ -24,6 +24,8 @@ test_script: test_script.in Makefile
@chmod +x test_script

check:: test_script
+ @echo "Removing remnants of earlier tests..."
+ $(RM) -f *~ *.log *.new *.failed *.ok test.img2*
@echo "Running e2fsprogs test suite..."
@echo " "
@./test_script
@@ -63,7 +65,7 @@ testend: test_script ${TDIR}/image
@echo "If all is well, edit ${TDIR}/name and rename ${TDIR}."

clean::
- $(RM) -f *~ *.log *.new *.failed *.ok test.img test_script
+ $(RM) -f *~ *.log *.new *.failed *.ok test.img* test_script

distclean:: clean
$(RM) -f Makefile

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2008-02-02 08:47:09

by Andreas Dilger

Subject: Re: [PATCH][19/28] e2fsprogs-stride_option.patch


Add support for setting the s_raid_stride and s_raid_stripe_width
fields in the superblock via mke2fs and tune2fs. This is useful
for mballoc to align block allocation on the RAID stripe boundaries.
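
As a worked example of the two values, here is a small sketch that derives
them from a RAID geometry. The helper names raid_stride and raid_stripe_width
are invented for this example, and the geometry (64 KiB chunks, 4 KiB blocks,
4 data disks) is only an assumed sample array.

```shell
#!/bin/sh
# Sketch: derive stride and stripe-width (both in filesystem blocks) from
# the RAID geometry. raid_stride and raid_stripe_width are hypothetical
# helper names; the numbers below are an example, not a recommendation.

# stride = per-disk chunk size / filesystem block size (both in KiB)
raid_stride()
{
	chunk_kb=$1
	block_kb=$2
	echo $((chunk_kb / block_kb))
}

# stripe-width = stride * number of data (non-parity) disks
raid_stripe_width()
{
	stride=$1
	data_disks=$2
	echo $((stride * data_disks))
}

# Example: 64 KiB chunks, 4 KiB blocks, RAID 5 over 5 disks (4 data + 1 parity)
STRIDE=$(raid_stride 64 4)
STRIPE_WIDTH=$(raid_stripe_width "$STRIDE" 4)
echo "mke2fs -E stride=$STRIDE,stripe-width=$STRIPE_WIDTH <device>"
```

With these inputs the stride is 16 blocks and the stripe width is 64 blocks,
so the stripe width is an even multiple of the stride and the warning added
to mke2fs would not trigger.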

Fix up the debugfs "ssv" command to set a number of new superblock fields.

Signed-off-by: Rupesh Thakare <[email protected]>
Signed-off-by: Andreas Dilger <[email protected]>

Index: e2fsprogs-1.40.5/lib/ext2fs/initialize.c
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/initialize.c
+++ e2fsprogs-1.40.5/lib/ext2fs/initialize.c
@@ -156,6 +156,8 @@ errcode_t ext2fs_initialize(const char *
set_field(s_feature_incompat, 0);
set_field(s_feature_ro_compat, 0);
set_field(s_first_meta_bg, 0);
+ set_field(s_raid_stride, 0); /* default stride size: 0 */
+ set_field(s_raid_stripe_width, 0); /* default stripe width: 0 */
set_field(s_flags, 0);
if (super->s_feature_incompat & ~EXT2_LIB_FEATURE_INCOMPAT_SUPP) {
retval = EXT2_ET_UNSUPP_FEATURE;
Index: e2fsprogs-1.40.5/misc/mke2fs.c
===================================================================
--- e2fsprogs-1.40.5.orig/misc/mke2fs.c
+++ e2fsprogs-1.40.5/misc/mke2fs.c
@@ -773,7 +773,7 @@ static int set_os(struct ext2_super_bloc
static void parse_extended_opts(struct ext2_super_block *param,
const char *opts)
{
- char *buf, *token, *next, *p, *arg;
+ char *buf, *token, *next, *p, *arg, *badopt = "";
int len;
int r_usage = 0;

@@ -800,16 +800,32 @@ static void parse_extended_opts(struct e
if (strcmp(token, "stride") == 0) {
if (!arg) {
r_usage++;
+ badopt = token;
continue;
}
- fs_stride = strtoul(arg, &p, 0);
- if (*p || (fs_stride == 0)) {
+ param->s_raid_stride = strtoul(arg, &p, 0);
+ if (*p || (param->s_raid_stride == 0)) {
fprintf(stderr,
_("Invalid stride parameter: %s\n"),
arg);
r_usage++;
continue;
}
+ } else if (strcmp(token, "stripe-width") == 0 ||
+ strcmp(token, "stripe_width") == 0) {
+ if (!arg) {
+ r_usage++;
+ badopt = token;
+ continue;
+ }
+ param->s_raid_stripe_width = strtoul(arg, &p, 0);
+ if (*p || (param->s_raid_stripe_width == 0)) {
+ fprintf(stderr,
+ _("Invalid stripe-width parameter: %s\n"),
+ arg);
+ r_usage++;
+ continue;
+ }
} else if (!strcmp(token, "resize")) {
unsigned long resize, bpg, rsv_groups;
unsigned long group_desc_count, desc_blocks;
@@ -818,6 +834,7 @@ static void parse_extended_opts(struct e

if (!arg) {
r_usage++;
+ badopt = token;
continue;
}

@@ -868,21 +885,31 @@ static void parse_extended_opts(struct e
}
} else if (!strcmp(token, "test_fs")) {
param->s_flags |= EXT2_FLAGS_TEST_FILESYS;
- } else
+ } else {
r_usage++;
+ badopt = token;
+ }
}
if (r_usage) {
- fprintf(stderr, _("\nBad options specified.\n\n"
+ fprintf(stderr, _("\nBad option(s) specified: %s\n\n"
"Extended options are separated by commas, "
"and may take an argument which\n"
"\tis set off by an equals ('=') sign.\n\n"
"Valid extended options are:\n"
- "\tstride=<stride length in blocks>\n"
- "\tresize=<resize maximum size in blocks>\n"
- "\ttest_fs\n"));
+ "\tstride=<RAID per-disk data chunk in blocks>\n"
+ "\tstripe-width=<RAID stride * data disks in blocks>\n"
+ "\tresize=<resize maximum size in blocks>\n\n"
+ "\ttest_fs\n"),
+ badopt);
free(buf);
exit(1);
}
+ if (param->s_raid_stride &&
+ (param->s_raid_stripe_width % param->s_raid_stride) != 0)
+ fprintf(stderr, _("\nWarning: RAID stripe-width %u not an even "
+ "multiple of stride %u.\n\n"),
+ param->s_raid_stripe_width, param->s_raid_stride);
+
free(buf);
}

@@ -1662,7 +1689,7 @@ int main (int argc, char *argv[])
test_disk(fs, &bb_list);

handle_bad_blocks(fs, bb_list);
- fs->stride = fs->super->s_raid_stride = fs_stride;
+ fs->stride = fs_stride = fs->super->s_raid_stride;
retval = ext2fs_allocate_tables(fs);
if (retval) {
com_err(program_name, retval,
Index: e2fsprogs-1.40.5/misc/tune2fs.c
===================================================================
--- e2fsprogs-1.40.5.orig/misc/tune2fs.c
+++ e2fsprogs-1.40.5/misc/tune2fs.c
@@ -71,6 +71,8 @@ static unsigned short errors;
static int open_flag;
static char *features_cmd;
static char *mntopts_cmd;
+static int stride, stripe_width;
+static int stride_set, stripe_width_set;
static char *extended_cmd;

int journal_size, journal_flags;
@@ -800,7 +802,36 @@ static void parse_extended_opts(ext2_fil
fs->super->s_flags &= ~EXT2_FLAGS_TEST_FILESYS;
printf("Clearing test filesystem flag\n");
ext2fs_mark_super_dirty(fs);
- } else
+ } else if (strcmp(token, "stride") == 0) {
+ if (!arg) {
+ r_usage++;
+ continue;
+ }
+ stride = strtoul(arg, &p, 0);
+ if (*p || (stride == 0)) {
+ fprintf(stderr,
+ _("Invalid RAID stride: %s\n"),
+ arg);
+ r_usage++;
+ continue;
+ }
+ stride_set = 1;
+ } else if (strcmp(token, "stripe-width") == 0 ||
+ strcmp(token, "stripe_width") == 0) {
+ if (!arg) {
+ r_usage++;
+ continue;
+ }
+ stripe_width = strtoul(arg, &p, 0);
+ if (*p || (stripe_width == 0)) {
+ fprintf(stderr,
+ _("Invalid RAID stripe-width: %s\n"),
+ arg);
+ r_usage++;
+ continue;
+ }
+ stripe_width_set = 1;
+ } else
r_usage++;
}
if (r_usage) {
@@ -809,6 +840,8 @@ static void parse_extended_opts(ext2_fil
"and may take an argument which\n"
"\tis set off by an equals ('=') sign.\n\n"
"Valid extended options are:\n"
+ "\tstride=<RAID per-disk chunk size in blocks>\n"
+ "\tstripe-width=<RAID stride*data disks in blocks>\n"
"\ttest_fs\n"
"\t^test_fs\n"));
free(buf);
@@ -1002,6 +1035,16 @@ int main (int argc, char ** argv)

if (l_flag)
list_super (sb);
+ if (stride_set) {
+ sb->s_raid_stride = stride;
+ ext2fs_mark_super_dirty(fs);
+ printf(_("Setting stride size to %d\n"), stride);
+ }
+ if (stripe_width_set) {
+ sb->s_raid_stripe_width = stripe_width;
+ ext2fs_mark_super_dirty(fs);
+ printf(_("Setting stripe width to %d\n"), stripe_width);
+ }
remove_error_table(&et_ext2_error_table);
return (ext2fs_close (fs) ? 1 : 0);
}
Index: e2fsprogs-1.40.5/misc/mke2fs.8.in
===================================================================
--- e2fsprogs-1.40.5.orig/misc/mke2fs.8.in
+++ e2fsprogs-1.40.5/misc/mke2fs.8.in
@@ -179,10 +179,23 @@ option is still accepted for backwards c
following extended options are supported:
.RS 1.2i
.TP
-.BI stride= stripe-size
+.BI stride= stride-size
Configure the filesystem for a RAID array with
-.I stripe-size
-filesystem blocks per stripe.
+.I stride-size
+filesystem blocks. This is the number of blocks read or written to disk
+before moving to the next disk. This mostly affects placement of filesystem
+metadata like bitmaps at
+.BR mke2fs (8)
+time to avoid placing them on a single disk, which can hurt performance.
+It may also be used by the block allocator.
+.TP
+.BI stripe-width= stripe-width
+Configure the filesystem for a RAID array with
+.I stripe-width
+filesystem blocks per stripe. This is typically stride-size * N, where
+N is the number of data disks in the RAID (e.g. RAID 5 N+1, RAID 6 N+2).
+This allows the block allocator to prevent read-modify-write of the
+parity in a RAID stripe if possible when the data is written.
.TP
.BI resize= max-online-resize
Reserve enough space so that the block group descriptor table can grow
Index: e2fsprogs-1.40.5/misc/tune2fs.8.in
===================================================================
--- e2fsprogs-1.40.5.orig/misc/tune2fs.8.in
+++ e2fsprogs-1.40.5/misc/tune2fs.8.in
@@ -65,6 +65,10 @@ tune2fs \- adjust tunable filesystem par
.I extended-options
]
[
+.B \-E
+.I extended-options
+]
+[
.B \-L
.I volume-name
]
@@ -163,6 +167,31 @@ Clear the test_fs flag, indicating the f
using production-level filesystem code.
.RE
.TP
+.BI \-E " extended-options"
+Set extended options for the filesystem. Extended options are comma
+separated, and may take an argument using the equals ('=') sign.
+The following extended options are supported:
+.RS 1.2i
+.TP
+.BI stride= stride-size
+Configure the filesystem for a RAID array with
+.I stride-size
+filesystem blocks. This is the number of blocks read or written to disk
+before moving to the next disk. This mostly affects placement of filesystem
+metadata like bitmaps at
+.BR mke2fs (8)
+time to avoid placing them on a single disk, which can hurt performance.
+It may also be used by the block allocator.
+.TP
+.BI stripe-width= stripe-width
+Configure the filesystem for a RAID array with
+.I stripe-width
+filesystem blocks per stripe. This is typically stride-size * N, where
+N is the number of data disks in the RAID (e.g. RAID 5 N+1, RAID 6 N+2).
+This allows the block allocator to prevent read-modify-write of the
+parity in a RAID stripe if possible when the data is written.
+.RE
+.TP
.B \-f
Force the tune2fs operation to complete even in the face of errors. This
option is useful when removing the
Index: e2fsprogs-1.40.5/debugfs/set_fields.c
===================================================================
--- e2fsprogs-1.40.5.orig/debugfs/set_fields.c
+++ e2fsprogs-1.40.5/debugfs/set_fields.c
@@ -9,12 +9,18 @@
* %End-Header%
*/

-#define _XOPEN_SOURCE 500 /* for inclusion of strptime() */
+#define _XOPEN_SOURCE 600 /* for inclusion of strptime() and strtoull */
+
+#ifdef HAVE_STRTOULL
+#define STRTOULL strtoull
+#else
+#define STRTOULL strtoul
+#endif

#include <stdio.h>
#include <unistd.h>
-#include <stdlib.h>
#include <ctype.h>
+#include <stdlib.h>
#include <string.h>
#include <strings.h>
#include <time.h>
@@ -103,7 +109,6 @@ static struct field_set_info super_field
parse_uint },
{ "reserved_gdt_blocks", &set_sb.s_reserved_gdt_blocks, 2,
parse_uint },
- /* s_padding1 */
{ "journal_uuid", &set_sb.s_journal_uuid, 16, parse_uuid },
{ "journal_inum", &set_sb.s_journal_inum, 4, parse_uint },
{ "journal_dev", &set_sb.s_journal_dev, 4, parse_uint },
@@ -111,13 +116,22 @@ static struct field_set_info super_field
{ "hash_seed", &set_sb.s_hash_seed, 16, parse_uuid },
{ "def_hash_version", &set_sb.s_def_hash_version, 1, parse_hashalg },
{ "jnl_backup_type", &set_sb.s_jnl_backup_type, 1, parse_uint },
- /* s_reserved_word_pad */
+ { "desc_size", &set_sb.s_desc_size, 2, parse_uint },
{ "default_mount_opts", &set_sb.s_default_mount_opts, 4, parse_uint },
{ "first_meta_bg", &set_sb.s_first_meta_bg, 4, parse_uint },
{ "mkfs_time", &set_sb.s_mkfs_time, 4, parse_time },
{ "jnl_blocks", &set_sb.s_jnl_blocks[0], 4, parse_uint, FLAG_ARRAY,
17 },
+ { "blocks_count_hi", &set_sb.s_blocks_count_hi, 4, parse_uint },
+ { "r_blocks_count_hi", &set_sb.s_r_blocks_count_hi, 4, parse_uint },
+ { "min_extra_isize", &set_sb.s_min_extra_isize, 2, parse_uint },
+ { "want_extra_isize", &set_sb.s_want_extra_isize, 2, parse_uint },
{ "flags", &set_sb.s_flags, 4, parse_uint },
+ { "raid_stride", &set_sb.s_raid_stride, 2, parse_uint },
+ { "min_extra_isize", &set_sb.s_min_extra_isize, 4, parse_uint },
+ { "mmp_interval", &set_sb.s_mmp_interval, 2, parse_uint },
+ { "mmp_block", &set_sb.s_mmp_block, 8, parse_uint },
+ { "raid_stripe_width", &set_sb.s_raid_stripe_width, 4, parse_uint },
{ 0, 0, 0, 0 }
};

@@ -144,6 +158,7 @@ static struct field_set_info inode_field
{ "generation", &set_inode.i_generation, 4, parse_uint },
{ "file_acl", &set_inode.i_file_acl, 4, parse_uint },
{ "dir_acl", &set_inode.i_dir_acl, 4, parse_uint },
+ { "size_high", &set_inode.i_size_high, 4, parse_uint },
{ "faddr", &set_inode.i_faddr, 4, parse_uint },
{ "blocks_hi", &set_inode.osd2.linux2.l_i_blocks_hi, 2, parse_uint },
{ "frag", &set_inode.osd2.hurd2.h_i_frag, 1, parse_uint },
@@ -229,9 +244,10 @@ static struct field_set_info *find_field

static errcode_t parse_uint(struct field_set_info *info, char *arg)
{
- unsigned long num;
+ unsigned long long num, limit;
char *tmp;
union {
+ __u64 *ptr64;
__u32 *ptr32;
__u16 *ptr16;
__u8 *ptr8;
@@ -241,13 +257,23 @@ static errcode_t parse_uint(struct field
if (info->flags & FLAG_ARRAY)
u.ptr8 += array_idx * info->size;

- num = strtoul(arg, &tmp, 0);
- if (*tmp) {
+ errno = 0;
+ num = STRTOULL(arg, &tmp, 0);
+ if (*tmp || errno) {
fprintf(stderr, "Couldn't parse '%s' for field %s.\n",
arg, info->name);
return EINVAL;
}
+ limit = ~0ULL >> ((8 - info->size) * 8);
+ if (num > limit) {
+ fprintf(stderr, "Value '%s' exceeds field %s maximum %llu.\n",
+ arg, info->name, limit);
+ return EINVAL;
+ }
switch (info->size) {
+ case 8:
+ *u.ptr64 = num;
+ break;
case 4:
*u.ptr32 = num;
break;
Index: e2fsprogs-1.40.5/lib/blkid/read.c
===================================================================
--- e2fsprogs-1.40.5.orig/lib/blkid/read.c
+++ e2fsprogs-1.40.5/lib/blkid/read.c
@@ -10,6 +10,8 @@
* %End-Header%
*/

+#define _XOPEN_SOURCE 600 /* for inclusion of strtoull */
+
#include <stdio.h>
#include <ctype.h>
#include <string.h>
@@ -26,7 +28,6 @@
#include "uuid/uuid.h"

#ifdef HAVE_STRTOULL
-#define __USE_ISOC9X
#define STRTOULL strtoull /* defined in stdlib.h if you try hard enough */
#else
/* FIXME: need to support real strtoull here */
@@ -319,8 +320,7 @@ static int parse_tag(blkid_cache cache,
else if (!strcmp(name, "PRI"))
dev->bid_pri = strtol(value, 0, 0);
else if (!strcmp(name, "TIME"))
- /* FIXME: need to parse a long long eventually */
- dev->bid_time = strtol(value, 0, 0);
+ dev->bid_time = STRTOULL(value, 0, 0);
else
ret = blkid_set_tag(dev, name, value, strlen(value));


Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2008-02-02 08:48:37

by Andreas Dilger

Subject: Re: [PATCH][20/28] e2fsprogs-mmp.patch


Add multi-mount protection support to libext2fs (INCOMPAT_MMP feature).

This allows mke2fs, e2fsck, and others to detect if the filesystem is
mounted on a remote node (on SAN disks) and avoid corrupting the
filesystem. For e2fsprogs this only means that it checks the MMP block
to see if the filesystem is in use, and marks the filesystem busy while
e2fsck is running on the system.

There is no requirement that e2fsck updates the MMP block at any regular
interval, but e2fsck does this occasionally to provide additional
information to the sysadmin in case of conflict.
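
To make the intended check a bit more concrete, here is a rough sketch of how
a tool might classify the sequence value found in the MMP block, following
the header comment in this patch. The CLEAN and FSCK values are those of the
EXT2_MMP_CLEAN and EXT2_MMP_FSCK_ON constants in the lines this patch removes;
that the renamed constants keep these values, and the value used for SEQ_MAX,
are assumptions. mmp_classify is a made-up helper, not a libext2fs function.

```shell
#!/bin/sh
# Sketch: classify an MMP sequence value before touching the filesystem.
# Assumes the shell does 64-bit arithmetic (true for common Linux shells).
MMP_SEQ_CLEAN=$((0xFF4D4D50))
MMP_SEQ_FSCK=$((0xE24D4D50))
MMP_SEQ_MAX=$((0xE24D4D4F))	# ASSUMED value, not taken from the patch text

# Prints "clean", "unsafe", or "recheck" for the given sequence value.
# mmp_classify is a hypothetical helper name.
mmp_classify()
{
	seq=$(($1))
	if [ "$seq" -eq "$MMP_SEQ_CLEAN" ]; then
		# cleanly unmounted, nobody is using the filesystem
		echo "clean"
	elif [ "$seq" -gt "$MMP_SEQ_MAX" ]; then
		# SEQ_FSCK or any unknown code: not safe, whatever the timestamp
		echo "unsafe"
	else
		# ordinary sequence number: the caller still has to find out
		# whether another node is actively updating the block
		echo "recheck"
	fi
}

mmp_classify "$MMP_SEQ_FSCK"
```

The point of ordering the tests this way is that an unknown code is refused
by default, so a newer tool can add codes without older tools ignoring them.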

Signed-off-by: Kalpak Shah <[email protected]>
Signed-off-by: Andreas Dilger <[email protected]>

Index: e2fsprogs-1.40.5/lib/e2p/feature.c
===================================================================
--- e2fsprogs-1.40.5.orig/lib/e2p/feature.c
+++ e2fsprogs-1.40.5/lib/e2p/feature.c
@@ -67,6 +67,8 @@ static struct feature feature_list[] = {
"extent" },
{ E2P_FEATURE_INCOMPAT, EXT4_FEATURE_INCOMPAT_64BIT,
"64bit" },
+ { E2P_FEATURE_INCOMPAT, EXT4_FEATURE_INCOMPAT_MMP,
+ "mmp" },
{ E2P_FEATURE_INCOMPAT, EXT4_FEATURE_INCOMPAT_FLEX_BG,
"flex_bg"},
{ 0, 0, 0 },
Index: e2fsprogs-1.40.5/lib/ext2fs/ext2_fs.h
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/ext2_fs.h
+++ e2fsprogs-1.40.5/lib/ext2fs/ext2_fs.h
@@ -566,11 +566,11 @@ struct ext2_super_block {
__u16 s_min_extra_isize; /* All inodes have at least # bytes */
__u16 s_want_extra_isize; /* New inodes should reserve # bytes */
__u32 s_flags; /* Miscellaneous flags */
- __u16 s_raid_stride; /* RAID stride */
- __u16 s_mmp_interval; /* # seconds to wait in MMP checking */
- __u64 s_mmp_block; /* Block for multi-mount protection */
- __u32 s_raid_stripe_width; /* blocks on all data disks (N*stride)*/
- __u32 s_reserved[163]; /* Padding to the end of the block */
+ __u16 s_raid_stride; /* RAID stride */
+ __u16 s_mmp_update_interval; /* # seconds to wait in MMP checking */
+ __u64 s_mmp_block; /* Block for multi-mount protection */
+ __u32 s_raid_stripe_width; /* blocks on all data disks (N*stride)*/
+ __u32 s_reserved[163]; /* Padding to the end of the block */
};

/*
@@ -637,7 +637,8 @@ struct ext2_super_block {


#define EXT2_FEATURE_COMPAT_SUPP 0
-#define EXT2_FEATURE_INCOMPAT_SUPP (EXT2_FEATURE_INCOMPAT_FILETYPE)
+#define EXT2_FEATURE_INCOMPAT_SUPP (EXT2_FEATURE_INCOMPAT_FILETYPE| \
+ EXT4_FEATURE_INCOMPAT_MMP)
#define EXT2_FEATURE_RO_COMPAT_SUPP (EXT2_FEATURE_RO_COMPAT_SPARSE_SUPER| \
EXT2_FEATURE_RO_COMPAT_LARGE_FILE| \
EXT4_FEATURE_RO_COMPAT_DIR_NLINK| \
@@ -717,26 +718,34 @@ struct ext2_dir_entry_2 {
/*
* This structure will be used for multiple mount protection. It will be
* written into the block number saved in the s_mmp_block field in the
- * superblock.
- */
-#define EXT2_MMP_MAGIC 0x004D4D50 /* ASCII for MMP */
-#define EXT2_MMP_CLEAN 0xFF4D4D50 /* Value of mmp_seq for clean unmount */
-#define EXT2_MMP_FSCK_ON 0xE24D4D50 /* Value of mmp_seq when being fscked */
+ * superblock. Programs that check MMP should assume that if
+ * SEQ_FSCK (or any unknown code above SEQ_MAX) is present then it is NOT safe
+ * to use the filesystem, regardless of how old the timestamp is.
+ */
+#define EXT2_MMP_MAGIC 0x004D4D50U /* ASCII for MMP */
+#define EXT2_MMP_SEQ_CLEAN 0xFF4D4D50U /* mmp_seq value for clean unmount */
+#define EXT2_MMP_SEQ_FSCK 0xE24D4D50U /* mmp_seq value when being fscked */
+#define EXT2_MMP_SEQ_MAX 0xE24D4D4FU /* maximum valid mmp_seq value */

struct mmp_struct {
- __u32 mmp_magic;
- __u32 mmp_seq;
- __u64 mmp_time;
- char mmp_nodename[64];
- char mmp_bdevname[32];
- __u16 mmp_interval;
+ __u32 mmp_magic; /* Magic number for MMP */
+ __u32 mmp_seq; /* Sequence no. updated periodically */
+ __u64 mmp_time; /* Time last updated */
+ char mmp_nodename[64]; /* Node which last updated MMP block */
+ char mmp_bdevname[32]; /* Bdev which last updated MMP block */
+ __u16 mmp_check_interval; /* Changed mmp_check_interval */
__u16 mmp_pad1;
- __u32 mmp_pad2;
+ __u32 mmp_pad2[227];
};

/*
- * Interval in number of seconds to update the MMP sequence number.
+ * Default interval in seconds to update the MMP sequence number.
+ */
+#define EXT2_MMP_UPDATE_INTERVAL 1
+
+/*
+ * Minimum interval for MMP checking in seconds.
*/
-#define EXT2_MMP_DEF_INTERVAL 5
+#define EXT2_MMP_MIN_CHECK_INTERVAL 5

#endif /* _LINUX_EXT2_FS_H */
Index: e2fsprogs-1.40.5/lib/ext2fs/ext2fs.h
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/ext2fs.h
+++ e2fsprogs-1.40.5/lib/ext2fs/ext2fs.h
@@ -171,6 +171,7 @@ typedef struct ext2_file *ext2_file_t;
#define EXT2_FLAG_IMAGE_FILE 0x2000
#define EXT2_FLAG_EXCLUSIVE 0x4000
#define EXT2_FLAG_SOFTSUPP_FEATURES 0x8000
+#define EXT2_FLAG_SKIP_MMP 0x10000

/*
* Special flag in the ext2 inode i_flag field that means that this is
@@ -185,6 +186,15 @@ typedef struct ext2_file *ext2_file_t;
*/
#define EXT2_MKJOURNAL_V1_SUPER 0x0000001

+/*
+ * The timestamp in the MMP structure will be updated by e2fsck at some
+ * arbitrary intervals (start of passes, after every EXT2_MMP_INODE_INTERVAL
+ * inodes in pass1 and pass1b). There is no guarantee that e2fsck is updating
+ * the MMP block in a timely manner, and the updates it does are purely for
+ * the convenience of the sysadmin and not for automatic validation.
+ */
+#define EXT2_MMP_INODE_INTERVAL 20000
+
struct struct_ext2_filsys {
errcode_t magic;
io_channel io;
@@ -228,6 +238,15 @@ struct struct_ext2_filsys {
*/
struct ext2_inode_cache *icache;
io_channel image_io;
+
+ /*
+ * Buffer for Multiple mount protection(MMP) block.
+ */
+ char *mmp_buf;
+ /*
+ * Time at which e2fsck last updated the MMP block.
+ */
+ long mmp_last_written;
};

#if EXT2_FLAT_INCLUDES
@@ -444,14 +463,16 @@ typedef struct ext2_icount *ext2_icount_
EXT2_FEATURE_INCOMPAT_META_BG|\
EXT3_FEATURE_INCOMPAT_RECOVER|\
EXT3_FEATURE_INCOMPAT_EXTENTS|\
- EXT4_FEATURE_INCOMPAT_FLEX_BG)
+ EXT4_FEATURE_INCOMPAT_FLEX_BG|\
+ EXT4_FEATURE_INCOMPAT_MMP)
#else
#define EXT2_LIB_FEATURE_INCOMPAT_SUPP (EXT2_FEATURE_INCOMPAT_FILETYPE|\
EXT3_FEATURE_INCOMPAT_JOURNAL_DEV|\
EXT2_FEATURE_INCOMPAT_META_BG|\
EXT3_FEATURE_INCOMPAT_RECOVER|\
EXT3_FEATURE_INCOMPAT_EXTENTS|\
- EXT4_FEATURE_INCOMPAT_FLEX_BG)
+ EXT4_FEATURE_INCOMPAT_FLEX_BG|\
+ EXT4_FEATURE_INCOMPAT_MMP)
#endif
#define EXT2_LIB_FEATURE_RO_COMPAT_SUPP (EXT2_FEATURE_RO_COMPAT_SPARSE_SUPER|\
EXT2_FEATURE_RO_COMPAT_LARGE_FILE|\
@@ -995,6 +1016,12 @@ errcode_t ext2fs_link(ext2_filsys fs, ex
errcode_t ext2fs_unlink(ext2_filsys fs, ext2_ino_t dir, const char *name,
ext2_ino_t ino, int flags);

+/* mmp.c */
+errcode_t ext2fs_read_mmp(ext2_filsys fs, blk_t mmp_blk, char *buf);
+errcode_t ext2fs_write_mmp(ext2_filsys fs, blk_t mmp_blk, char *buf);
+errcode_t ext2fs_enable_mmp(ext2_filsys fs);
+long int ext2fs_mmp_new_seq();
+
/* read_bb.c */
extern errcode_t ext2fs_read_bb_inode(ext2_filsys fs,
ext2_badblocks_list *bb_list);
@@ -1032,6 +1059,7 @@ extern void ext2fs_swap_inode(ext2_filsy
extern void ext2fs_swap_extent_header(struct ext3_extent_header *eh);
extern void ext2fs_swap_extent_index(struct ext3_extent_idx *ix);
extern void ext2fs_swap_extent(struct ext3_extent *ex);
+extern void ext2fs_swap_mmp(struct mmp_struct *mmp);

/* valid_blk.c */
extern int ext2fs_inode_has_valid_blocks(struct ext2_inode *inode);
Index: e2fsprogs-1.40.5/misc/tune2fs.c
===================================================================
--- e2fsprogs-1.40.5.orig/misc/tune2fs.c
+++ e2fsprogs-1.40.5/misc/tune2fs.c
@@ -60,7 +60,7 @@ char * device_name;
char * new_label, *new_last_mounted, *new_UUID;
char * io_options;
static int c_flag, C_flag, e_flag, f_flag, g_flag, i_flag, l_flag, L_flag;
-static int m_flag, M_flag, r_flag, s_flag = -1, u_flag, U_flag, T_flag;
+static int m_flag, M_flag, r_flag, s_flag = -1, u_flag, U_flag, T_flag, p_flag;
static time_t last_check_time;
static int print_label;
static int max_mount_count, mount_count, mount_flags;
@@ -69,6 +69,7 @@ static double reserved_ratio;
static unsigned long resgid, resuid;
static unsigned short errors;
static int open_flag;
+static unsigned int mmp_update_interval;
static char *features_cmd;
static char *mntopts_cmd;
static int stride, stripe_width;
@@ -89,7 +90,8 @@ static void usage(void)
"[-g group]\n"
"\t[-i interval[d|m|w]] [-j] [-J journal_options]\n"
"\t[-l] [-s sparse_flag] [-m reserved_blocks_percent]\n"
- "\t[-o [^]mount_options[,...]] [-r reserved_blocks_count]\n"
+ "\t[-o [^]mount_options[,...]] [-p mmp_update_interval]"
+ "[-r reserved_blocks_count]\n"
"\t[-u user] [-C mount_count] [-L volume_label]\n"
"\t[-M last_mounted_dir] [-O [^]feature[,...]]\n"
"\t[-E extended-option[,...]] [-T last_check_time] "
@@ -101,7 +103,8 @@ static __u32 ok_features[3] = {
EXT3_FEATURE_COMPAT_HAS_JOURNAL |
EXT2_FEATURE_COMPAT_DIR_INDEX, /* Compat */
EXT2_FEATURE_INCOMPAT_FILETYPE| /* Incompat */
- EXT4_FEATURE_INCOMPAT_FLEX_BG,
+ EXT4_FEATURE_INCOMPAT_FLEX_BG |
+ EXT4_FEATURE_INCOMPAT_MMP,
EXT2_FEATURE_RO_COMPAT_SPARSE_SUPER | /* R/O compat */
EXT4_FEATURE_RO_COMPAT_GDT_CSUM
};
@@ -290,6 +293,7 @@ static void update_feature_set(ext2_fils
{
int sparse, old_sparse, filetype, old_filetype;
int journal, old_journal, dxdir, old_dxdir, uninit, old_uninit;
+ int mmp, old_mmp;
int flex_bg, old_flex_bg;
struct ext2_super_block *sb= fs->super;
__u32 old_compat, old_incompat, old_ro_compat;
@@ -293,6 +296,7 @@ static void update_feature_set(ext2_fils
int flex_bg, old_flex_bg;
struct ext2_super_block *sb= fs->super;
__u32 old_compat, old_incompat, old_ro_compat;
+ int error;

old_compat = sb->s_feature_compat;
old_ro_compat = sb->s_feature_ro_compat;
@@ -310,6 +315,8 @@ static void update_feature_set(ext2_fils
EXT2_FEATURE_COMPAT_DIR_INDEX;
old_uninit = sb->s_feature_ro_compat &
EXT4_FEATURE_RO_COMPAT_GDT_CSUM;
+ old_mmp = sb->s_feature_incompat &
+ EXT4_FEATURE_INCOMPAT_MMP;
if (e2p_edit_feature(features, &sb->s_feature_compat,
ok_features)) {
fprintf(stderr, _("Invalid filesystem option set: %s\n"),
@@ -328,6 +335,8 @@ static void update_feature_set(ext2_fils
EXT2_FEATURE_COMPAT_DIR_INDEX;
uninit = sb->s_feature_ro_compat &
EXT4_FEATURE_RO_COMPAT_GDT_CSUM;
+ mmp = sb->s_feature_incompat &
+ EXT4_FEATURE_INCOMPAT_MMP;
if (old_journal && !journal) {
if ((mount_flags & EXT2_MF_MOUNTED) &&
!(mount_flags & EXT2_MF_READONLY)) {
@@ -376,6 +385,75 @@ static void update_feature_set(ext2_fils
exit(1);
}
}
+ if (!old_mmp && mmp) {
+ if ((mount_flags & EXT2_MF_MOUNTED) ||
+ (mount_flags & EXT2_MF_READONLY)) {
+ fputs(_("The multiple mount protection feature can't \n"
+ "be set if the filesystem is mounted or \n"
+ "read-only.\n"), stderr);
+ exit(1);
+ }
+
+ error = ext2fs_enable_mmp(fs);
+ if (error) {
+ fputs(_("\nError while enabling multiple mount "
+ "protection feature."), stderr);
+ exit(1);
+ }
+
+ printf(_("Multiple mount protection has been enabled. The MMP "
+ "update interval has been set to %d seconds.\n"),
+ sb->s_mmp_update_interval);
+ }
+
+ if (old_mmp && !mmp) {
+ blk_t mmp_block;
+ struct mmp_struct *mmp_s;
+ char *buf;
+
+ if (mount_flags & EXT2_MF_READONLY) {
+ fputs(_("The multiple mount protection feature cannot\n"
+ "be disabled if the filesystem is readonly.\n"),
+ stderr);
+ exit(1);
+ }
+
+ error = ext2fs_read_bitmaps(fs);
+ if (error) {
+ fputs(_("Error while reading bitmaps\n"), stderr);
+ exit(1);
+ }
+
+ mmp_block = sb->s_mmp_block;
+
+ error = ext2fs_get_mem(fs->blocksize, &buf);
+ if (error) {
+ fputs(_("Error allocating memory.\n"), stderr);
+ exit(1);
+ }
+
+ mmp_s = (struct mmp_struct *) buf;
+ error = ext2fs_read_mmp(fs, mmp_block, buf);
+ if (error) {
+ if (error == EXT2_ET_MMP_MAGIC_INVALID)
+ printf(_("Magic number in MMP block does not "
+ "match. expected: %x, actual: %x\n"),
+ EXT2_MMP_MAGIC, mmp_s->mmp_magic);
+ else
+ com_err (program_name, error,
+ _("while reading MMP block."));
+ goto mmp_error;
+ }
+
+ ext2fs_unmark_block_bitmap(fs->block_map, mmp_block);
+ ext2fs_mark_bb_dirty(fs);
+
+mmp_error:
+ sb->s_mmp_block = 0;
+ sb->s_mmp_update_interval = 0;
+ if (buf)
+ ext2fs_free_mem(&buf);
+ }

if (sb->s_rev_level == EXT2_GOOD_OLD_REV &&
(sb->s_feature_compat || sb->s_feature_ro_compat ||
@@ -530,7 +608,7 @@ static void parse_tune2fs_options(int ar
struct passwd * pw;

printf("tune2fs %s (%s)\n", E2FSPROGS_VERSION, E2FSPROGS_DATE);
- while ((c = getopt(argc, argv, "c:e:fg:i:jlm:o:r:s:u:C:E:J:L:M:O:T:U:")) != EOF)
+ while ((c = getopt(argc, argv, "c:e:fg:i:jlm:o:p:r:s:u:C:E:J:L:M:O:T:U:")) != EOF)
switch (c)
{
case 'c':
@@ -685,6 +763,26 @@ static void parse_tune2fs_options(int ar
features_cmd = optarg;
open_flag = EXT2_FLAG_RW;
break;
+ case 'p':
+ mmp_update_interval = strtol (optarg, &tmp, 0);
+ if (*tmp && mmp_update_interval < 0) {
+ com_err (program_name, 0, _("invalid "
+ "mmp update interval"));
+ usage();
+ }
+ if (mmp_update_interval == 0)
+ mmp_update_interval = EXT2_MMP_UPDATE_INTERVAL;
+ if (mmp_update_interval > EXT2_MMP_UPDATE_INTERVAL) {
+ com_err (program_name, 0,
+ _("MMP update interval of %s "
+ "seconds may be dangerous "
+ "under high load. Consider "
+ "decreasing it."),
+ optarg);
+ }
+ p_flag = 1;
+ open_flag = EXT2_FLAG_RW;
+ break;
case 'r':
reserved_blocks = strtoul (optarg, &tmp, 0);
if (*tmp) {
@@ -883,6 +981,9 @@ int main (int argc, char ** argv)
#else
io_ptr = unix_io_manager;
#endif
+ if (open_flag == EXT2_FLAG_RW && f_flag)
+ open_flag |= EXT2_FLAG_SKIP_MMP;
+
retval = ext2fs_open2(device_name, io_options, open_flag,
0, 0, io_ptr, &fs);
if (retval) {
@@ -944,6 +1045,12 @@ int main (int argc, char ** argv)
printf (_("Setting reserved blocks percentage to %g%% (%u blocks)\n"),
reserved_ratio, sb->s_r_blocks_count);
}
+ if (p_flag) {
+ sb->s_mmp_update_interval = mmp_update_interval;
+ ext2fs_mark_super_dirty(fs);
+ printf (_("Setting multiple mount protection update interval to "
+ "%lu seconds\n"), mmp_update_interval);
+ }
if (r_flag) {
if (reserved_blocks >= sb->s_blocks_count/2) {
com_err (program_name, 0,
Index: e2fsprogs-1.40.5/e2fsck/pass1.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/pass1.c
+++ e2fsprogs-1.40.5/e2fsck/pass1.c
@@ -795,7 +795,20 @@ void e2fsck_pass1(e2fsck_t ctx)
(fs->super->s_mtime < fs->super->s_inodes_count))
busted_fs_time = 1;

+ if ((fs->super->s_feature_incompat & EXT4_FEATURE_INCOMPAT_MMP) &&
+ !(fs->super->s_mmp_block < fs->super->s_first_data_block ||
+ fs->super->s_mmp_block >= fs->super->s_blocks_count))
+ ext2fs_mark_block_bitmap(ctx->block_found_map,
+ fs->super->s_mmp_block);
+
while (1) {
+ if (ino % EXT2_MMP_INODE_INTERVAL == 0) {
+ errcode_t error;
+
+ error = e2fsck_mmp_update(fs);
+ if (error)
+ fatal_error(ctx, 0);
+ }
old_op = ehandler_operation(_("getting next inode from scan"));
pctx.errcode = ext2fs_get_next_inode_full(scan, &ino,
inode, inode_size);
Index: e2fsprogs-1.40.5/e2fsck/unix.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/unix.c
+++ e2fsprogs-1.40.5/e2fsck/unix.c
@@ -1076,6 +1076,23 @@ restart:
"to do a read-only\n"
"check of the device.\n"));
#endif
+ else if (retval == EXT2_ET_MMP_BAD_BLOCK) {
+ if (fix_problem(ctx, PR_0_MMP_INVALID_BLK, &pctx)) {
+ fs->super->s_mmp_block = 0;
+ ext2fs_mark_super_dirty(fs);
+ }
+ }
+ else if (retval == EXT2_ET_MMP_FAILED) {
+ dump_mmp_msg((struct mmp_struct *)fs->mmp_buf,
+ _("Device is already active on another node."));
+ }
+ else if (retval == EXT2_ET_MMP_FSCK_ON) {
+ dump_mmp_msg((struct mmp_struct *)fs->mmp_buf,
+ _("It seems as if e2fsck is running on the "
+ "filesystem.\nIf you are sure that e2fsck is "
+ "not running then use \"tune2fs -O ^mmp {device}\" "
+ "followed by \"tune2fs -O mmp {device}\""));
+ }
else
fix_problem(ctx, PR_0_SB_CORRUPT, &pctx);
fatal_error(ctx, 0);
Index: e2fsprogs-1.40.5/e2fsck/problem.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/problem.c
+++ e2fsprogs-1.40.5/e2fsck/problem.c
@@ -389,6 +389,11 @@ static struct e2fsck_problem problem_tab
PROMPT_NONE, 0 },


+ /* Superblock has invalid MMP block. */
+ { PR_0_MMP_INVALID_BLK,
+ N_("@S has invalid MMP block. "),
+ PROMPT_CLEAR, PR_PREEN_OK },
+
/* Pass 1 errors */

/* Pass 1: Checking inodes, blocks, and sizes */
Index: e2fsprogs-1.40.5/e2fsck/problem.h
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/problem.h
+++ e2fsprogs-1.40.5/e2fsck/problem.h
@@ -222,6 +222,9 @@ struct problem_context {
#define PR_0_CLEAR_EXTRA_ISIZE 0x00003C


+/* Superblock has invalid MMP block. */
+#define PR_0_MMP_INVALID_BLK 0x00003A
+
/*
* Pass 1 errors
*/
Index: e2fsprogs-1.40.5/lib/ext2fs/swapfs.c
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/swapfs.c
+++ e2fsprogs-1.40.5/lib/ext2fs/swapfs.c
@@ -70,6 +70,8 @@ void ext2fs_swap_super(struct ext2_super
sb->s_min_extra_isize = ext2fs_swab16(sb->s_min_extra_isize);
sb->s_want_extra_isize = ext2fs_swab16(sb->s_want_extra_isize);
sb->s_flags = ext2fs_swab32(sb->s_flags);
+ sb->s_mmp_update_interval = ext2fs_swab16(sb->s_mmp_update_interval);
+ sb->s_mmp_block = ext2fs_swab64(sb->s_mmp_block);
for (i=0; i < 4; i++)
sb->s_hash_seed[i] = ext2fs_swab32(sb->s_hash_seed[i]);
for (i=0; i < 17; i++)
@@ -310,4 +312,12 @@ void ext2fs_swap_inode(ext2_filsys fs, s
sizeof(struct ext2_inode));
}

+void ext2fs_swap_mmp(struct mmp_struct *mmp)
+{
+ mmp->mmp_magic = ext2fs_swab32(mmp->mmp_magic);
+ mmp->mmp_seq = ext2fs_swab32(mmp->mmp_seq);
+ mmp->mmp_time = ext2fs_swab64(mmp->mmp_time);
+ mmp->mmp_check_interval = ext2fs_swab16(mmp->mmp_check_interval);
+}
+
#endif
Index: e2fsprogs-1.40.5/lib/ext2fs/openfs.c
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/openfs.c
+++ e2fsprogs-1.40.5/lib/ext2fs/openfs.c
@@ -22,6 +22,9 @@
#if HAVE_SYS_TYPES_H
#include <sys/types.h>
#endif
+#ifdef HAVE_ERRNO_H
+#include <errno.h>
+#endif

#include "ext2_fs.h"

@@ -67,6 +70,97 @@ errcode_t ext2fs_open(const char *name,
}

/*
+ * Make sure that the fs is not mounted or being fsck'ed while opening the fs.
+ */
+int ext2fs_multiple_mount_protect(ext2_filsys fs)
+{
+ blk_t mmp_blk = fs->super->s_mmp_block;
+ char *buf;
+ struct mmp_struct *mmp_s;
+ unsigned seq;
+ unsigned int mmp_check_interval;
+ errcode_t retval = 0;
+
+ retval = ext2fs_get_mem(fs->blocksize, &fs->mmp_buf);
+ if (retval)
+ goto mmp_error;
+ buf = fs->mmp_buf;
+
+ retval = ext2fs_read_mmp(fs, mmp_blk, buf);
+ if (retval)
+ goto mmp_error;
+
+ mmp_s = (struct mmp_struct *) buf;
+
+ mmp_check_interval = fs->super->s_mmp_update_interval;
+ if (mmp_check_interval < EXT2_MMP_MIN_CHECK_INTERVAL)
+ mmp_check_interval = EXT2_MMP_MIN_CHECK_INTERVAL;
+
+ /*
+ * If check_interval in MMP block is larger, use that instead of
+ * check_interval from the superblock.
+ */
+ if (mmp_s->mmp_check_interval > mmp_check_interval)
+ mmp_check_interval = mmp_s->mmp_check_interval;
+
+
+ seq = mmp_s->mmp_seq;
+ if (seq == EXT2_MMP_SEQ_CLEAN)
+ goto clean_seq;
+ if (seq == EXT2_MMP_SEQ_FSCK) {
+ retval = EXT2_ET_MMP_FSCK_ON;
+ goto mmp_error;
+ }
+
+ if (seq > EXT2_MMP_SEQ_FSCK) {
+ retval = EXT2_ET_MMP_UNKNOWN_SEQ;
+ goto mmp_error;
+ }
+
+ sleep(2 * mmp_check_interval + 1);
+
+ retval = ext2fs_read_mmp(fs, mmp_blk, buf);
+ if (retval)
+ goto mmp_error;
+
+ if (seq != mmp_s->mmp_seq) {
+ retval = EXT2_ET_MMP_FAILED;
+ goto mmp_error;
+ }
+
+clean_seq:
+ mmp_s->mmp_seq = seq = ext2fs_mmp_new_seq();
+
+ retval = ext2fs_write_mmp(fs, mmp_blk, buf);
+ if (retval)
+ goto mmp_error;
+
+ sleep(2 * mmp_check_interval + 1);
+
+ retval = ext2fs_read_mmp(fs, mmp_blk, buf);
+ if (retval)
+ goto mmp_error;
+
+ if (seq != mmp_s->mmp_seq) {
+ retval = EXT2_ET_MMP_FAILED;
+ goto mmp_error;
+ }
+
+ mmp_s->mmp_seq = EXT2_MMP_SEQ_FSCK;
+ retval = ext2fs_write_mmp(fs, mmp_blk, buf);
+ if (retval)
+ goto mmp_error;
+
+ return 0;
+
+mmp_error:
+ if (buf)
+ ext2fs_free_mem(&buf);
+
+ return retval;
+}
+
+/*
* Note: if superblock is non-zero, block-size must also be non-zero.
* Superblock and block_size can be zero to use the default size.
*
@@ -76,6 +170,7 @@ errcode_t ext2fs_open(const char *name,
* EXT2_FLAG_FORCE - Open the filesystem even if some of the
* features aren't supported.
* EXT2_FLAG_JOURNAL_DEV_OK - Open an ext3 journal device
+ * EXT2_FLAG_SKIP_MMP - Open without multi-mount protection check.
*/
errcode_t ext2fs_open2(const char *name, const char *io_options,
int flags, int superblock,
@@ -320,6 +415,15 @@ errcode_t ext2fs_open2(const char *name,
}

*ret_fs = fs;
+
+ fs->mmp_buf = NULL;
+ if ((fs->super->s_feature_incompat & EXT4_FEATURE_INCOMPAT_MMP) &&
+ (flags & EXT2_FLAG_RW) && !(flags & EXT2_FLAG_SKIP_MMP)) {
+ retval = ext2fs_multiple_mount_protect(fs);
+ if (retval)
+ goto cleanup;
+ }
+
return 0;
cleanup:
ext2fs_free(fs);
Index: e2fsprogs-1.40.5/lib/ext2fs/ext2_err.et.in
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/ext2_err.et.in
+++ e2fsprogs-1.40.5/lib/ext2fs/ext2_err.et.in
@@ -362,5 +362,25 @@ ec EXT2_ET_EA_NAME_NOT_FOUND,
ec EXT2_ET_EA_NAME_EXISTS,
"Extended attribute name already exists"

+ec EXT2_ET_MMP_MAGIC_INVALID,
+ "MMP: Invalid magic number in MMP block"
+
+ec EXT2_ET_MMP_FAILED,
+ "MMP: Device already active on another node"
+
+ec EXT2_ET_MMP_FSCK_ON,
+ "MMP: Seems as if fsck is already being run on the filesystem."
+
+ec EXT2_ET_MMP_BAD_BLOCK,
+ "MMP: MMP block number beyond filesystem range."
+
+ec EXT2_ET_MMP_UNKNOWN_SEQ,
+ "MMP: MMP sequence is beyond EXT2_MMP_SEQ_MAX. This filesystem "
+ "seems to be undergoing an unknown operation."
+
+ec EXT2_ET_MMP_FSCK_ABORT,
+ "MMP: Expected sequence not found. Filesystem may be mounted while "
+ "fsck was running"
+
end

Index: e2fsprogs-1.40.5/lib/ext2fs/closefs.c
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/closefs.c
+++ e2fsprogs-1.40.5/lib/ext2fs/closefs.c
@@ -351,12 +351,63 @@ errout:
return retval;
}

+errcode_t write_mmp_clean(ext2_filsys fs)
+{
+ blk_t mmp_blk = fs->super->s_mmp_block;
+ char *buf = fs->mmp_buf, *buf_cmp;
+ struct mmp_struct *mmp, *mmp_cmp;
+ errcode_t retval;
+
+ retval = ext2fs_get_mem(fs->blocksize, &buf_cmp);
+ if (retval)
+ goto mmp_error;
+
+ retval = ext2fs_read_mmp(fs, mmp_blk, buf_cmp);
+ if (retval)
+ goto mmp_error;
+ mmp_cmp = (struct mmp_struct *) buf_cmp;
+
+ /*
+ * This is important since we may come here just after when MMP feature
+ * is set and fs->mmp_buf is NULL
+ */
+ if (!buf)
+ goto check_skipped;
+
+ /*
+ * Make sure that the MMP block is not changed.
+ */
+ mmp = (struct mmp_struct *) buf;
+ if (memcmp(mmp, mmp_cmp, sizeof(struct mmp_struct)))
+ return EXT2_ET_MMP_FSCK_ABORT;
+
+check_skipped:
+ mmp_cmp->mmp_seq = EXT2_MMP_SEQ_CLEAN;
+ retval = ext2fs_write_mmp(fs, mmp_blk, buf_cmp);
+
+mmp_error:
+ if (buf)
+ ext2fs_free_mem(&buf);
+ if (buf_cmp)
+ ext2fs_free_mem(&buf_cmp);
+
+ return retval;
+}
+
+
errcode_t ext2fs_close(ext2_filsys fs)
{
errcode_t retval;

EXT2_CHECK_MAGIC(fs, EXT2_ET_MAGIC_EXT2FS_FILSYS);

+ if ((fs->super->s_feature_incompat & EXT4_FEATURE_INCOMPAT_MMP) &&
+ (fs->flags & EXT2_FLAG_RW) && !(fs->flags & EXT2_FLAG_SKIP_MMP)) {
+ retval = write_mmp_clean(fs);
+ if (retval)
+ return retval;
+ }
+
if (fs->flags & EXT2_FLAG_DIRTY) {
retval = ext2fs_flush(fs);
if (retval)
Index: e2fsprogs-1.40.5/e2fsck/e2fsck.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/e2fsck.c
+++ e2fsprogs-1.40.5/e2fsck/e2fsck.c
@@ -186,6 +186,7 @@ int e2fsck_run(e2fsck_t ctx)
{
int i;
pass_t e2fsck_pass;
+ int error;

#ifdef HAVE_SETJMP_H
if (setjmp(ctx->abort_loc)) {
@@ -198,6 +199,9 @@ int e2fsck_run(e2fsck_t ctx)
for (i=0; (e2fsck_pass = e2fsck_passes[i]); i++) {
if (ctx->flags & E2F_FLAG_RUN_RETURN)
break;
+ error = e2fsck_mmp_update(ctx->fs);
+ if (error)
+ fatal_error(ctx, 0);
e2fsck_pass(ctx);
if (ctx->progress)
(void) (ctx->progress)(ctx, 0, 0, 0);
Index: e2fsprogs-1.40.5/e2fsck/e2fsck.h
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/e2fsck.h
+++ e2fsprogs-1.40.5/e2fsck/e2fsck.h
@@ -535,6 +535,8 @@ extern void mtrace_print(char *mesg);
extern blk_t get_backup_sb(e2fsck_t ctx, ext2_filsys fs,
const char *name, io_manager manager);
extern int ext2_file_type(unsigned int mode);
+errcode_t e2fsck_mmp_update(ext2_filsys fs);
+void dump_mmp_msg(struct mmp_struct *mmp, const char *msg);

/* unix.c */
extern void e2fsck_clear_progbar(e2fsck_t ctx);
Index: e2fsprogs-1.40.5/lib/ext2fs/Makefile.in
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/Makefile.in
+++ e2fsprogs-1.40.5/lib/ext2fs/Makefile.in
@@ -55,6 +55,7 @@ OBJS= $(DEBUGFS_LIB_OBJS) $(RESIZE_LIB_O
lookup.o \
mkdir.o \
mkjournal.o \
+ mmp.o \
native.o \
newdir.o \
openfs.o \
@@ -116,6 +117,7 @@ SRCS= ext2_err.c \
$(srcdir)/lookup.c \
$(srcdir)/mkdir.c \
$(srcdir)/mkjournal.c \
+ $(srcdir)/mmp.c \
$(srcdir)/namei.c \
$(srcdir)/native.c \
$(srcdir)/newdir.c \
@@ -502,6 +504,7 @@ mkdir.o: $(srcdir)/mkdir.c $(srcdir)/ext
$(top_builddir)/lib/ext2fs/ext2_types.h $(srcdir)/ext2fs.h \
$(srcdir)/ext2_fs.h $(srcdir)/ext3_extents.h $(top_srcdir)/lib/et/com_err.h \
$(srcdir)/ext2_io.h $(top_builddir)/lib/ext2fs/ext2_err.h $(srcdir)/bitops.h
+mmp.o: $(srcdir)/ext2_fs.h $(srcdir)/ext2fs.h
mkjournal.o: $(srcdir)/mkjournal.c $(srcdir)/ext2_fs.h \
$(top_builddir)/lib/ext2fs/ext2_types.h $(top_srcdir)/lib/e2p/e2p.h \
$(srcdir)/ext2_fs.h $(srcdir)/ext2fs.h $(srcdir)/ext3_extents.h \
Index: e2fsprogs-1.40.5/lib/ext2fs/mmp.c
===================================================================
--- /dev/null
+++ e2fsprogs-1.40.5/lib/ext2fs/mmp.c
@@ -0,0 +1,139 @@
+/*
+ * Helper functions for multiple mount protection(MMP).
+ *
+ * Copyright (C) 2006, 2007 by Kalpak Shah <[email protected]>
+ *
+ * %Begin-Header%
+ * This file may be redistributed under the terms of the GNU Public
+ * License.
+ * %End-Header%
+ */
+
+#if HAVE_UNISTD_H
+#include <unistd.h>
+#endif
+#include <sys/time.h>
+
+#include "ext2fs/ext2_fs.h"
+#include "ext2fs/ext2fs.h"
+
+errcode_t ext2fs_read_mmp(ext2_filsys fs, blk_t mmp_blk, char *buf)
+{
+ struct mmp_struct *mmp_s;
+ errcode_t retval;
+
+ if ((mmp_blk < fs->super->s_first_data_block) ||
+ (mmp_blk >= fs->super->s_blocks_count))
+ return EXT2_ET_MMP_BAD_BLOCK;
+
+ /*
+ * Make sure that we read direct from disk by reading only
+ * sizeof(struct mmp_struct) bytes.
+ */
+ retval = io_channel_read_blk(fs->io, mmp_blk,
+ -(int)sizeof(struct mmp_struct), buf);
+ if (retval)
+ return retval;
+
+ mmp_s = (struct mmp_struct *) buf;
+
+#ifdef EXT2FS_ENABLE_SWAPFS
+ if (fs->flags & EXT2_FLAG_SWAP_BYTES)
+ ext2fs_swap_mmp(mmp_s);
+#endif
+
+ if (mmp_s->mmp_magic != EXT2_MMP_MAGIC)
+ return EXT2_ET_MMP_MAGIC_INVALID;
+
+ return 0;
+}
+
+errcode_t ext2fs_write_mmp(ext2_filsys fs, blk_t mmp_blk, char *buf)
+{
+ struct mmp_struct *mmp_s = (struct mmp_struct *) buf;
+ struct timeval tv;
+ int retval;
+
+ gethostname(mmp_s->mmp_nodename, sizeof(mmp_s->mmp_nodename));
+ gettimeofday(&tv, 0);
+ mmp_s->mmp_time = tv.tv_sec;
+
+#ifdef EXT2FS_ENABLE_SWAPFS
+ if (fs->super->s_magic == ext2fs_swab16(EXT2_SUPER_MAGIC))
+ ext2fs_swap_mmp(mmp_s);
+#endif
+
+ retval = io_channel_write_blk(fs->io, mmp_blk,
+ -(int)sizeof(struct mmp_struct), buf);
+
+#ifdef EXT2FS_ENABLE_SWAPFS
+ if (fs->super->s_magic == ext2fs_swab16(EXT2_SUPER_MAGIC))
+ ext2fs_swap_mmp(mmp_s);
+#endif
+
+ /*
+ * Make sure the block gets to disk quickly.
+ */
+ io_channel_flush(fs->io);
+ return retval;
+}
+
+long int ext2fs_mmp_new_seq()
+{
+ long int new_seq;
+
+ do {
+ new_seq = random();
+ } while (new_seq > EXT2_MMP_SEQ_MAX);
+
+ return new_seq;
+}
+
+errcode_t ext2fs_enable_mmp(ext2_filsys fs)
+{
+ struct ext2_super_block *sb = fs->super;
+ struct mmp_struct *mmp_s = NULL;
+ blk_t mmp_block;
+ char *buf;
+ int error;
+
+ error = ext2fs_read_bitmaps(fs);
+ if (error)
+ goto out;
+
+ error = ext2fs_new_block(fs, 0, 0, &mmp_block);
+ if (error)
+ goto out;
+
+ ext2fs_block_alloc_stats(fs, mmp_block, +1);
+ sb->s_mmp_block = mmp_block;
+
+ error = ext2fs_get_mem(fs->blocksize, &buf);
+ if (error)
+ goto out;
+
+ mmp_s = (struct mmp_struct *) buf;
+ memset(mmp_s, 0, sizeof(struct mmp_struct));
+
+ mmp_s->mmp_magic = EXT2_MMP_MAGIC;
+ mmp_s->mmp_seq = EXT2_MMP_SEQ_CLEAN;
+ mmp_s->mmp_time = 0;
+ mmp_s->mmp_nodename[0] = '\0';
+ mmp_s->mmp_bdevname[0] = '\0';
+ mmp_s->mmp_check_interval = EXT2_MMP_MIN_CHECK_INTERVAL;
+
+ error = ext2fs_write_mmp(fs, mmp_block, buf);
+ if (error) {
+ if (buf)
+ ext2fs_free_mem(&buf);
+ goto out;
+ }
+
+ if (buf)
+ ext2fs_free_mem(&buf);
+
+ sb->s_mmp_update_interval = EXT2_MMP_UPDATE_INTERVAL;
+
+out:
+ return error;
+}
Index: e2fsprogs-1.40.5/e2fsck/pass1b.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/pass1b.c
+++ e2fsprogs-1.40.5/e2fsck/pass1b.c
@@ -272,6 +272,13 @@ static void pass1b(e2fsck_t ctx, char *b
pb.pctx = &pctx;
pctx.str = "pass1b";
while (1) {
+ if (ino % EXT2_MMP_INODE_INTERVAL == 0) {
+ errcode_t error;
+
+ error = e2fsck_mmp_update(fs);
+ if (error)
+ fatal_error(ctx, 0);
+ }
pctx.errcode = ext2fs_get_next_inode(scan, &ino, &inode);
if (pctx.errcode == EXT2_ET_BAD_BLOCK_IN_INODE_TABLE)
continue;
Index: e2fsprogs-1.40.5/misc/tune2fs.8.in
===================================================================
--- e2fsprogs-1.40.5.orig/misc/tune2fs.8.in
+++ e2fsprogs-1.40.5/misc/tune2fs.8.in
@@ -438,6 +438,11 @@ Setting the filesystem feature is equiva
.B \-j
option.
.TP
+.B mmp
+Enable or disable the multiple mount protection (MMP) feature. MMP helps to
+protect the filesystem from being mounted on more than one node at a time
+and is useful in shared storage environments.
+.TP
.B sparse_super
Limit the number of backup superblocks to save space on large filesystems.
.TP
@@ -474,6 +479,9 @@ being mounted by kernels which do not su
.B uninit_groups
feature is not yet supported by any officially released kernel.
.TP
+.BI \-p " mmp_check_interval"
+Set the desired MMP check interval in seconds. It is 5 seconds by default.
+.TP
.BI \-r " reserved-blocks-count"
Set the number of reserved filesystem blocks.
.TP
Index: e2fsprogs-1.40.5/misc/mke2fs.c
===================================================================
--- e2fsprogs-1.40.5.orig/misc/mke2fs.c
+++ e2fsprogs-1.40.5/misc/mke2fs.c
@@ -922,7 +922,8 @@ static __u32 ok_features[3] = {
EXT2_FEATURE_INCOMPAT_FILETYPE| /* Incompat */
EXT3_FEATURE_INCOMPAT_JOURNAL_DEV|
EXT2_FEATURE_INCOMPAT_META_BG|
- EXT4_FEATURE_INCOMPAT_FLEX_BG,
+ EXT4_FEATURE_INCOMPAT_FLEX_BG|
+ EXT4_FEATURE_INCOMPAT_MMP,
EXT2_FEATURE_RO_COMPAT_SPARSE_SUPER| /* R/O compat */
EXT4_FEATURE_RO_COMPAT_GDT_CSUM
};
@@ -1803,8 +1804,21 @@ int main (int argc, char *argv[])
}
no_journal:

- if (!super_only)
+ if (!super_only) {
ext2fs_set_gdt_csum(fs);
+ if (fs->super->s_feature_incompat & EXT4_FEATURE_INCOMPAT_MMP) {
+ retval = ext2fs_enable_mmp(fs);
+ if (retval) {
+ fprintf(stderr, _("\nError while enabling "
+ "multiple mount protection feature."));
+ exit(1);
+ }
+ printf(_("Multiple mount protection has been enabled. "
+ "The MMP update interval has been set to "
+ "%d seconds.\n"),
+ fs->super->s_mmp_update_interval);
+ }
+ }
if (!quiet)
printf(_("Writing superblocks and "
"filesystem accounting information: "));
Index: e2fsprogs-1.40.5/e2fsck/util.c
===================================================================
--- e2fsprogs-1.40.5.orig/e2fsck/util.c
+++ e2fsprogs-1.40.5/e2fsck/util.c
@@ -607,3 +607,61 @@ errcode_t e2fsck_zero_blocks(ext2_filsys
}
return 0;
}
+
+void dump_mmp_msg(struct mmp_struct *mmp, const char *msg)
+{
+ printf("MMP check failed: %s\n", msg);
+ printf("MMP failure info: last update time: %llu, "
+ "last update node: %s, last update device: %s\n",
+ (long long)mmp->mmp_time, mmp->mmp_nodename, mmp->mmp_bdevname);
+}
+
+#define EXT2_MIN_MMP_UPDATE_INTERVAL 60
+
+errcode_t e2fsck_mmp_update(ext2_filsys fs)
+{
+ blk_t mmp_blk = fs->super->s_mmp_block;
+ char *buf = fs->mmp_buf, *buf_cmp;
+ struct mmp_struct *mmp, *mmp_cmp;
+ struct timeval tv;
+ errcode_t retval;
+
+ if (!(fs->super->s_feature_incompat & EXT4_FEATURE_INCOMPAT_MMP) ||
+ !(fs->flags & EXT2_FLAG_RW) || (fs->flags & EXT2_FLAG_SKIP_MMP))
+ return 0;
+
+ gettimeofday(&tv, 0);
+ if (tv.tv_sec - fs->mmp_last_written < EXT2_MIN_MMP_UPDATE_INTERVAL)
+ return 0;
+
+ retval = ext2fs_get_mem(fs->blocksize, &buf_cmp);
+ if (retval)
+ goto mmp_error;
+
+ retval = ext2fs_read_mmp(fs, mmp_blk, buf_cmp);
+ if (retval)
+ goto mmp_error;
+
+ mmp = (struct mmp_struct *) buf;
+ mmp_cmp = (struct mmp_struct *) buf_cmp;
+
+ if (memcmp(mmp, mmp_cmp, sizeof(struct mmp_struct))) {
+ dump_mmp_msg(mmp_cmp, _("\n UNEXPECTED INCONSISTENCY: "
+ "Unexpected MMP structure read from disk.\n"
+ "It seems the filesystem is being modified while "
+ "fsck is running.\n"));
+ retval = EXT2_ET_MMP_FSCK_ABORT;
+ goto mmp_error;
+ }
+
+ mmp->mmp_time = tv.tv_sec;
+ fs->mmp_last_written = tv.tv_sec;
+ mmp->mmp_seq = EXT2_MMP_SEQ_FSCK;
+ retval = ext2fs_write_mmp(fs, mmp_blk, buf);
+
+mmp_error:
+ if (buf_cmp)
+ ext2fs_free_mem(&buf_cmp);
+
+ return retval;
+}
Index: e2fsprogs-1.40.5/lib/ext2fs/bitops.h
===================================================================
--- e2fsprogs-1.40.5.orig/lib/ext2fs/bitops.h
+++ e2fsprogs-1.40.5/lib/ext2fs/bitops.h
@@ -329,6 +329,12 @@ _INLINE_ __u32 ext2fs_swab32(__u32 val)
((val<<8)&0xFF0000) | (val<<24));
}

+_INLINE_ __u64 ext2fs_swab64(__u64 val)
+{
+ return (ext2fs_swab32(val >> 32) |
+ (((__u64)ext2fs_swab32(val & 0xFFFFFFFFUL)) << 32));
+}
+
#endif /* !_EXT2_HAVE_ASM_SWAB */

#if !defined(_EXT2_HAVE_ASM_FINDBIT_)
Index: e2fsprogs-1.40.5/debugfs/set_fields.c
===================================================================
--- e2fsprogs-1.40.5.orig/debugfs/set_fields.c
+++ e2fsprogs-1.40.5/debugfs/set_fields.c
@@ -129,7 +129,7 @@ static struct field_set_info super_field
{ "flags", &set_sb.s_flags, 4, parse_uint },
{ "raid_stride", &set_sb.s_raid_stride, 2, parse_uint },
{ "min_extra_isize", &set_sb.s_min_extra_isize, 4, parse_uint },
- { "mmp_interval", &set_sb.s_mmp_interval, 2, parse_uint },
+ { "mmp_update_interval", &set_sb.s_mmp_update_interval, 2, parse_uint },
{ "mmp_block", &set_sb.s_mmp_block, 8, parse_uint },
{ "raid_stripe_width", &set_sb.s_raid_stripe_width, 4, parse_uint },
{ 0, 0, 0, 0 }

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2008-02-02 08:49:20

by Andreas Dilger

Subject: Re: [PATCH][21/28] e2fsprogs-journal_chksum.patch


E2fsprogs part of Journal Checksum feature.

This adds support for journals with the INCOMPAT_ASYNC_COMMIT and
COMPAT_CHECKSUM features.

If CHECKSUM is set, each transaction has a checksum of the full
transaction and it is verified before the transaction is replayed.
If any interior block is missing or corrupted, or if the transaction is
incomplete, it will not be replayed.
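
The replay rule described above can be sketched in a few lines of C. This is
only an illustration, not the code in this patch: the helper names
txn_checksum() and txn_may_replay(), the ~0 seed, and the flat in-memory layout
of the transaction blocks are all assumptions made for the example. The CRC
loop itself is the bit-at-a-time big-endian form with the same polynomial value
that crc32defs.h defines as CRCPOLY_BE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CRCPOLY_BE 0x04c11db7u	/* same value as CRCPOLY_BE in crc32defs.h */

/* Bit-at-a-time big-endian CRC32 (no table, no final inversion). */
static uint32_t crc32_be_bitwise(uint32_t crc, const unsigned char *p,
				 size_t len)
{
	while (len--) {
		crc ^= (uint32_t)*p++ << 24;
		for (int i = 0; i < 8; i++)
			crc = (crc << 1) ^
			      ((crc & 0x80000000u) ? CRCPOLY_BE : 0);
	}
	return crc;
}

/*
 * Hypothetical helper: fold every block of one transaction into a single
 * running CRC. The seed of ~0 is an assumption for this sketch.
 */
static uint32_t txn_checksum(const unsigned char *blocks, size_t nblocks,
			     size_t blocksize)
{
	uint32_t crc = ~0u;

	for (size_t i = 0; i < nblocks; i++)
		crc = crc32_be_bitwise(crc, blocks + i * blocksize, blocksize);
	return crc;
}

/*
 * Hypothetical helper: returns 1 if the recomputed checksum matches the
 * value stored in the commit block, 0 if the transaction must be skipped.
 */
static int txn_may_replay(const unsigned char *blocks, size_t nblocks,
			  size_t blocksize, uint32_t stored)
{
	return txn_checksum(blocks, nblocks, blocksize) == stored;
}
```

Because the checksum covers every block of the transaction, a single flipped
bit in any interior block changes the recomputed value, so txn_may_replay()
returns 0 and the transaction is not replayed.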

The ASYNC_COMMIT feature allows the kernel to avoid waiting for the
transaction (meta)data to commit before writing the journal commit block.

Signed-off-by: Andreas Dilger <[email protected]>
Signed-off-by: Girish Shilamkar <[email protected]>

Index: e2fsprogs-1.40.2/lib/ext2fs/kernel-jbd.h
===================================================================
--- e2fsprogs-1.40.2.orig/lib/ext2fs/kernel-jbd.h
+++ e2fsprogs-1.40.2/lib/ext2fs/kernel-jbd.h
@@ -108,7 +108,29 @@ typedef struct journal_header_s
__u32 h_sequence;
} journal_header_t;

+/*
+ * Checksum types.
+ */
+#define JFS_CRC32_CHKSUM 1
+#define JFS_MD5_CHKSUM 2
+#define JFS_SHA1_CHKSUM 3
+
+#define JFS_CRC32_CHKSUM_SIZE 4

+#define JFS_CHECKSUM_BYTES (32 / sizeof(__u32))
+/*
+ * Commit block header for storing transactional checksums:
+ */
+struct commit_header
+{
+ __u32 h_magic;
+ __u32 h_blocktype;
+ __u32 h_sequence;
+ unsigned char h_chksum_type;
+ unsigned char h_chksum_size;
+ unsigned char h_padding[2];
+ __u32 h_chksum[JFS_CHECKSUM_BYTES];
+};
/*
* The block tag: used to describe a single buffer in the journal
*/
@@ -194,12 +216,17 @@ typedef struct journal_superblock_s
((j)->j_format_version >= 2 && \
((j)->j_superblock->s_feature_incompat & cpu_to_be32((mask))))

-#define JFS_FEATURE_INCOMPAT_REVOKE 0x00000001
+#define JFS_FEATURE_COMPAT_CHECKSUM 0x00000001
+
+#define JFS_FEATURE_INCOMPAT_REVOKE 0x00000001
+/*#define JFS_FEATURE_INCOMPAT_64BIT 0x00000002*/
+#define JFS_FEATURE_INCOMPAT_ASYNC_COMMIT 0x00000004

/* Features known to this kernel version: */
-#define JFS_KNOWN_COMPAT_FEATURES 0
+#define JFS_KNOWN_COMPAT_FEATURES JFS_FEATURE_COMPAT_CHECKSUM
#define JFS_KNOWN_ROCOMPAT_FEATURES 0
-#define JFS_KNOWN_INCOMPAT_FEATURES JFS_FEATURE_INCOMPAT_REVOKE
+#define JFS_KNOWN_INCOMPAT_FEATURES (JFS_FEATURE_INCOMPAT_REVOKE| \
+ JFS_FEATURE_INCOMPAT_ASYNC_COMMIT)

#ifdef __KERNEL__

Index: e2fsprogs-1.40.2/lib/ext2fs/crc32.h
===================================================================
--- /dev/null
+++ e2fsprogs-1.40.2/lib/ext2fs/crc32.h
@@ -0,0 +1,28 @@
+/*
+ * crc32.h
+ * See crc32.c for license and changes
+ */
+#ifndef _LINUX_CRC32_H
+#define _LINUX_CRC32_H
+
+typedef unsigned int u32;
+typedef unsigned char u8;
+
+extern u32 crc32_le(u32 crc, unsigned char const *p, size_t len);
+extern u32 crc32_be(u32 crc, unsigned char const *p, size_t len);
+extern u32 bitreverse(u32 in);
+
+#define crc32(seed, data, length) crc32_le(seed, (unsigned char const *)data, length)
+
+/*
+ * Helpers for hash table generation of ethernet nics:
+ *
+ * Ethernet sends the least significant bit of a byte first, thus crc32_le
+ * is used. The output of crc32_le is bit reversed [most significant bit
+ * is in bit nr 0], thus it must be reversed before use. Except for
+ * nics that bit swap the result internally...
+ */
+#define ether_crc(length, data) bitreverse(crc32_le(~0, data, length))
+#define ether_crc_le(length, data) crc32_le(~0, data, length)
+
+#endif /* _LINUX_CRC32_H */
Index: e2fsprogs-1.40.2/lib/ext2fs/crc32.c
===================================================================
--- /dev/null
+++ e2fsprogs-1.40.2/lib/ext2fs/crc32.c
@@ -0,0 +1,519 @@
+/*
+ * Oct 15, 2000 Matt Domsch <[email protected]>
+ * Nicer crc32 functions/docs submitted by [email protected]. Thanks!
+ * Code was from the public domain, copyright abandoned. Code was
+ * subsequently included in the kernel, thus was re-licensed under the
+ * GNU GPL v2.
+ *
+ * Oct 12, 2000 Matt Domsch <[email protected]>
+ * Same crc32 function was used in 5 other places in the kernel.
+ * I made one version, and deleted the others.
+ * There are various incantations of crc32(). Some use a seed of 0 or ~0.
+ * Some xor at the end with ~0. The generic crc32() function takes
+ * seed as an argument, and doesn't xor at the end. Then individual
+ * users can do whatever they need.
+ * drivers/net/smc9194.c uses seed ~0, doesn't xor with ~0.
+ * fs/jffs2 uses seed 0, doesn't xor with ~0.
+ * fs/partitions/efi.c uses seed ~0, xor's with ~0.
+ *
+ * This source code is licensed under the GNU General Public License,
+ * Version 2. See the file COPYING for more details.
+ */
+
+#include <stdlib.h>
+#include "crc32_user.h"
+#include "crc32.h"
+#include "crc32defs.h"
+#if CRC_LE_BITS == 8
+#define tole(x) __constant_cpu_to_le32(x)
+#define tobe(x) __constant_cpu_to_be32(x)
+#else
+#define tole(x) (x)
+#define tobe(x) (x)
+#endif
+#include "crc32table.h"
+
+
+#if CRC_LE_BITS == 1
+/*
+ * In fact, the table-based code will work in this case, but it can be
+ * simplified by inlining the table in ?: form.
+ */
+
+/**
+ * crc32_le() - Calculate bitwise little-endian Ethernet AUTODIN II CRC32
+ * @crc - seed value for computation. ~0 for Ethernet, sometimes 0 for
+ * other uses, or the previous crc32 value if computing incrementally.
+ * @p - pointer to buffer over which CRC is run
+ * @len - length of buffer @p
+ *
+ */
+u32 crc32_le(u32 crc, unsigned char const *p, size_t len)
+{
+ int i;
+ while (len--) {
+ crc ^= *p++;
+ for (i = 0; i < 8; i++)
+ crc = (crc >> 1) ^ ((crc & 1) ? CRCPOLY_LE : 0);
+ }
+ return crc;
+}
+#else /* Table-based approach */
+
+/**
+ * crc32_le() - Calculate bitwise little-endian Ethernet AUTODIN II CRC32
+ * @crc - seed value for computation. ~0 for Ethernet, sometimes 0 for
+ * other uses, or the previous crc32 value if computing incrementally.
+ * @p - pointer to buffer over which CRC is run
+ * @len - length of buffer @p
+ *
+ */
+u32 crc32_le(u32 crc, unsigned char const *p, size_t len)
+{
+# if CRC_LE_BITS == 8
+ const u32 *b =(u32 *)p;
+ const u32 *tab = crc32table_le;
+
+# if __BYTE_ORDER == __LITTLE_ENDIAN
+# define DO_CRC(x) crc = tab[ (crc ^ (x)) & 255 ] ^ (crc>>8)
+# else
+# define DO_CRC(x) crc = tab[ ((crc >> 24) ^ (x)) & 255] ^ (crc<<8)
+# endif
+
+ crc = __cpu_to_le32(crc);
+ /* Align it */
+ if(unlikely(((long)b)&3 && len)){
+ do {
+ u8 *p = (u8 *)b;
+ DO_CRC(*p++);
+ b = (void *)p;
+ } while ((--len) && ((long)b)&3 );
+ }
+ if(likely(len >= 4)){
+ /* load data 32 bits wide, xor data 32 bits wide. */
+ size_t save_len = len & 3;
+ len = len >> 2;
+ --b; /* use pre increment below(*++b) for speed */
+ do {
+ crc ^= *++b;
+ DO_CRC(0);
+ DO_CRC(0);
+ DO_CRC(0);
+ DO_CRC(0);
+ } while (--len);
+ b++; /* point to next byte(s) */
+ len = save_len;
+ }
+ /* And the last few bytes */
+ if(len){
+ do {
+ u8 *p = (u8 *)b;
+ DO_CRC(*p++);
+ b = (void *)p;
+ } while (--len);
+ }
+
+ return __le32_to_cpu(crc);
+#undef ENDIAN_SHIFT
+#undef DO_CRC
+
+# elif CRC_LE_BITS == 4
+ while (len--) {
+ crc ^= *p++;
+ crc = (crc >> 4) ^ crc32table_le[crc & 15];
+ crc = (crc >> 4) ^ crc32table_le[crc & 15];
+ }
+ return crc;
+# elif CRC_LE_BITS == 2
+ while (len--) {
+ crc ^= *p++;
+ crc = (crc >> 2) ^ crc32table_le[crc & 3];
+ crc = (crc >> 2) ^ crc32table_le[crc & 3];
+ crc = (crc >> 2) ^ crc32table_le[crc & 3];
+ crc = (crc >> 2) ^ crc32table_le[crc & 3];
+ }
+ return crc;
+# endif
+}
+#endif
+
+#if CRC_BE_BITS == 1
+/*
+ * In fact, the table-based code will work in this case, but it can be
+ * simplified by inlining the table in ?: form.
+ */
+
+/**
+ * crc32_be() - Calculate bitwise big-endian Ethernet AUTODIN II CRC32
+ * @crc - seed value for computation. ~0 for Ethernet, sometimes 0 for
+ * other uses, or the previous crc32 value if computing incrementally.
+ * @p - pointer to buffer over which CRC is run
+ * @len - length of buffer @p
+ *
+ */
+u32 crc32_be(u32 crc, unsigned char const *p, size_t len)
+{
+ int i;
+
+ while (len--) {
+ crc ^= *p++ << 24;
+ for (i = 0; i < 8; i++)
+ crc =
+ (crc << 1) ^ ((crc & 0x80000000) ? CRCPOLY_BE :
+ 0);
+ }
+ return crc;
+}
+
+#else /* Table-based approach */
+/**
+ * crc32_be() - Calculate bitwise big-endian Ethernet AUTODIN II CRC32
+ * @crc - seed value for computation. ~0 for Ethernet, sometimes 0 for
+ * other uses, or the previous crc32 value if computing incrementally.
+ * @p - pointer to buffer over which CRC is run
+ * @len - length of buffer @p
+ *
+ */
+u32 crc32_be(u32 crc, unsigned char const *p, size_t len)
+{
+# if CRC_BE_BITS == 8
+ const u32 *b =(u32 *)p;
+ const u32 *tab = crc32table_be;
+
+# if __BYTE_ORDER == __LITTLE_ENDIAN
+# define DO_CRC(x) crc = tab[ (crc ^ (x)) & 255 ] ^ (crc>>8)
+# else
+# define DO_CRC(x) crc = tab[ ((crc >> 24) ^ (x)) & 255] ^ (crc<<8)
+# endif
+ crc = __cpu_to_be32(crc);
+ /* Align it */
+ if(unlikely(((long)b)&3 && len)){
+ do {
+ u8 *p = (u8 *)b;
+ DO_CRC(*p++);
+ b = (u32 *)p;
+ } while ((--len) && ((long)b)&3 );
+ }
+
+ if(likely(len >= 4)){
+ /* load data 32 bits wide, xor data 32 bits wide. */
+ size_t save_len = len & 3;
+ len = len >> 2;
+ --b; /* use pre increment below(*++b) for speed */
+ do {
+ crc ^= *++b;
+ DO_CRC(0);
+ DO_CRC(0);
+ DO_CRC(0);
+ DO_CRC(0);
+ } while (--len);
+ b++; /* point to next byte(s) */
+ len = save_len;
+ }
+ /* And the last few bytes */
+ if(len){
+ do {
+ u8 *p = (u8 *)b;
+ DO_CRC(*p++);
+ b = (void *)p;
+ } while (--len);
+ }
+ return __be32_to_cpu(crc);
+#undef ENDIAN_SHIFT
+#undef DO_CRC
+
+# elif CRC_BE_BITS == 4
+ while (len--) {
+ crc ^= *p++ << 24;
+ crc = (crc << 4) ^ crc32table_be[crc >> 28];
+ crc = (crc << 4) ^ crc32table_be[crc >> 28];
+ }
+ return crc;
+# elif CRC_BE_BITS == 2
+ while (len--) {
+ crc ^= *p++ << 24;
+ crc = (crc << 2) ^ crc32table_be[crc >> 30];
+ crc = (crc << 2) ^ crc32table_be[crc >> 30];
+ crc = (crc << 2) ^ crc32table_be[crc >> 30];
+ crc = (crc << 2) ^ crc32table_be[crc >> 30];
+ }
+ return crc;
+# endif
+}
+#endif
+
+u32 bitreverse(u32 x)
+{
+ x = (x >> 16) | (x << 16);
+ x = (x >> 8 & 0x00ff00ff) | (x << 8 & 0xff00ff00);
+ x = (x >> 4 & 0x0f0f0f0f) | (x << 4 & 0xf0f0f0f0);
+ x = (x >> 2 & 0x33333333) | (x << 2 & 0xcccccccc);
+ x = (x >> 1 & 0x55555555) | (x << 1 & 0xaaaaaaaa);
+ return x;
+}
+
+
+/*
+ * A brief CRC tutorial.
+ *
+ * A CRC is a long-division remainder. You add the CRC to the message,
+ * and the whole thing (message+CRC) is a multiple of the given
+ * CRC polynomial. To check the CRC, you can either check that the
+ * CRC matches the recomputed value, *or* you can check that the
+ * remainder computed on the message+CRC is 0. This latter approach
+ * is used by a lot of hardware implementations, and is why so many
+ * protocols put the end-of-frame flag after the CRC.
+ *
+ * It's actually the same long division you learned in school, except that
+ * - We're working in binary, so the digits are only 0 and 1, and
+ * - When dividing polynomials, there are no carries. Rather than add and
+ * subtract, we just xor. Thus, we tend to get a bit sloppy about
+ * the difference between adding and subtracting.
+ *
+ * A 32-bit CRC polynomial is actually 33 bits long. But since it's
+ * 33 bits long, bit 32 is always going to be set, so usually the CRC
+ * is written in hex with the most significant bit omitted. (If you're
+ * familiar with the IEEE 754 floating-point format, it's the same idea.)
+ *
+ * Note that a CRC is computed over a string of *bits*, so you have
+ * to decide on the endianness of the bits within each byte. To get
+ * the best error-detecting properties, this should correspond to the
+ * order they're actually sent. For example, standard RS-232 serial is
+ * little-endian; the most significant bit (sometimes used for parity)
+ * is sent last. And when appending a CRC word to a message, you should
+ * do it in the right order, matching the endianness.
+ *
+ * Just like with ordinary division, the remainder is always smaller than
+ * the divisor (the CRC polynomial) you're dividing by. Each step of the
+ * division, you take one more digit (bit) of the dividend and append it
+ * to the current remainder. Then you figure out the appropriate multiple
+ * of the divisor to subtract to bring the remainder back into range.
+ * In binary, it's easy - it has to be either 0 or 1, and to make the
+ * XOR cancel, it's just a copy of bit 32 of the remainder.
+ *
+ * When computing a CRC, we don't care about the quotient, so we can
+ * throw the quotient bit away, but subtract the appropriate multiple of
+ * the polynomial from the remainder and we're back to where we started,
+ * ready to process the next bit.
+ *
+ * A big-endian CRC written this way would be coded like:
+ * for (i = 0; i < input_bits; i++) {
+ * multiple = remainder & 0x80000000 ? CRCPOLY : 0;
+ * remainder = (remainder << 1 | next_input_bit()) ^ multiple;
+ * }
+ * Notice how, to get at bit 32 of the shifted remainder, we look
+ * at bit 31 of the remainder *before* shifting it.
+ *
+ * But also notice how the next_input_bit() bits we're shifting into
+ * the remainder don't actually affect any decision-making until
+ * 32 bits later. Thus, the first 32 cycles of this are pretty boring.
+ * Also, to add the CRC to a message, we need a 32-bit-long hole for it at
+ * the end, so we have to add 32 extra cycles shifting in zeros at the
+ * end of every message.
+ *
+ * So the standard trick is to delay merging in the next_input_bit()
+ * until the moment it's needed. Then the first 32 cycles can be precomputed,
+ * and merging in the final 32 zero bits to make room for the CRC can be
+ * skipped entirely.
+ * This changes the code to:
+ * for (i = 0; i < input_bits; i++) {
+ * remainder ^= next_input_bit() << 31;
+ * multiple = (remainder & 0x80000000) ? CRCPOLY : 0;
+ * remainder = (remainder << 1) ^ multiple;
+ * }
+ * With this optimization, the little-endian code is simpler:
+ * for (i = 0; i < input_bits; i++) {
+ * remainder ^= next_input_bit();
+ * multiple = (remainder & 1) ? CRCPOLY : 0;
+ * remainder = (remainder >> 1) ^ multiple;
+ * }
+ *
+ * Note that the other details of endianness have been hidden in CRCPOLY
+ * (which must be bit-reversed) and next_input_bit().
+ *
+ * However, as long as next_input_bit is returning the bits in a sensible
+ * order, we can actually do the merging 8 or more bits at a time rather
+ * than one bit at a time:
+ * for (i = 0; i < input_bytes; i++) {
+ * remainder ^= next_input_byte() << 24;
+ * for (j = 0; j < 8; j++) {
+ * multiple = (remainder & 0x80000000) ? CRCPOLY : 0;
+ * remainder = (remainder << 1) ^ multiple;
+ * }
+ * }
+ * Or in little-endian:
+ * for (i = 0; i < input_bytes; i++) {
+ * remainder ^= next_input_byte();
+ * for (j = 0; j < 8; j++) {
+ * multiple = (remainder & 1) ? CRCPOLY : 0;
+ * remainder = (remainder >> 1) ^ multiple;
+ * }
+ * }
+ * If the input is a multiple of 32 bits, you can even XOR in a 32-bit
+ * word at a time and increase the inner loop count to 32.
+ *
+ * You can also mix and match the two loop styles, for example doing the
+ * bulk of a message byte-at-a-time and adding bit-at-a-time processing
+ * for any fractional bytes at the end.
+ *
+ * The only remaining optimization is the byte-at-a-time table method.
+ * Here, rather than just shifting one bit of the remainder to decide
+ * on the correct multiple to subtract, we can shift a byte at a time.
+ * This produces a 40-bit (rather than a 33-bit) intermediate remainder,
+ * but again the multiple of the polynomial to subtract depends only on
+ * the high bits, the high 8 bits in this case.
+ *
+ * The multiple we need in that case is the low 32 bits of a 40-bit
+ * value whose high 8 bits are given, and which is a multiple of the
+ * generator polynomial. This is simply the CRC-32 of the given
+ * one-byte message.
+ *
+ * Two more details: normally, appending zero bits to a message which
+ * is already a multiple of a polynomial produces a larger multiple of that
+ * polynomial. To enable a CRC to detect this condition, it's common to
+ * invert the CRC before appending it. This makes the remainder of the
+ * message+crc come out not as zero, but some fixed non-zero value.
+ *
+ * The same problem applies to zero bits prepended to the message, and
+ * a similar solution is used. Instead of starting with a remainder of
+ * 0, an initial remainder of all ones is used. As long as you start
+ * the same way on decoding, it doesn't make a difference.
+ */
+
+#ifdef UNITTEST
+
+#include <stdlib.h>
+#include <stdio.h>
+
+#if 0 /*Not used at present */
+static void
+buf_dump(char const *prefix, unsigned char const *buf, size_t len)
+{
+ fputs(prefix, stdout);
+ while (len--)
+ printf(" %02x", *buf++);
+ putchar('\n');
+
+}
+#endif
+
+static void bytereverse(unsigned char *buf, size_t len)
+{
+ while (len--) {
+ unsigned char x = *buf;
+ x = (x >> 4) | (x << 4);
+ x = (x >> 2 & 0x33) | (x << 2 & 0xcc);
+ x = (x >> 1 & 0x55) | (x << 1 & 0xaa);
+ *buf++ = x;
+ }
+}
+
+static void random_garbage(unsigned char *buf, size_t len)
+{
+ while (len--)
+ *buf++ = (unsigned char) random();
+}
+
+#if 0 /* Not used at present */
+static void store_le(u32 x, unsigned char *buf)
+{
+ buf[0] = (unsigned char) x;
+ buf[1] = (unsigned char) (x >> 8);
+ buf[2] = (unsigned char) (x >> 16);
+ buf[3] = (unsigned char) (x >> 24);
+}
+#endif
+
+static void store_be(u32 x, unsigned char *buf)
+{
+ buf[0] = (unsigned char) (x >> 24);
+ buf[1] = (unsigned char) (x >> 16);
+ buf[2] = (unsigned char) (x >> 8);
+ buf[3] = (unsigned char) x;
+}
+
+/*
+ * This checks that CRC(buf + CRC(buf)) = 0, and that
+ * CRC commutes with bit-reversal. This has the side effect
+ * of bytewise bit-reversing the input buffer, and returns
+ * the CRC of the reversed buffer.
+ */
+static u32 test_step(u32 init, unsigned char *buf, size_t len)
+{
+ u32 crc1, crc2;
+ size_t i;
+
+ crc1 = crc32_be(init, buf, len);
+ store_be(crc1, buf + len);
+ crc2 = crc32_be(init, buf, len + 4);
+ if (crc2)
+ printf("\nCRC cancellation fail: 0x%08x should be 0\n",
+ crc2);
+
+ for (i = 0; i <= len + 4; i++) {
+ crc2 = crc32_be(init, buf, i);
+ crc2 = crc32_be(crc2, buf + i, len + 4 - i);
+ if (crc2)
+ printf("\nCRC split fail: 0x%08x\n", crc2);
+ }
+
+ /* Now swap it around for the other test */
+
+ bytereverse(buf, len + 4);
+ init = bitreverse(init);
+ crc2 = bitreverse(crc1);
+ if (crc1 != bitreverse(crc2))
+ printf("\nBit reversal fail: 0x%08x -> 0x%08x -> 0x%08x\n",
+ crc1, crc2, bitreverse(crc2));
+ crc1 = crc32_le(init, buf, len);
+ if (crc1 != crc2)
+ printf("\nCRC endianness fail: 0x%08x != 0x%08x\n", crc1,
+ crc2);
+ crc2 = crc32_le(init, buf, len + 4);
+ if (crc2)
+ printf("\nCRC cancellation fail: 0x%08x should be 0\n",
+ crc2);
+
+ for (i = 0; i <= len + 4; i++) {
+ crc2 = crc32_le(init, buf, i);
+ crc2 = crc32_le(crc2, buf + i, len + 4 - i);
+ if (crc2)
+ printf("\nCRC split fail: 0x%08x\n", crc2);
+ }
+
+ return crc1;
+}
+
+#define SIZE 64
+#define INIT1 0
+#define INIT2 0
+
+int main(void)
+{
+ unsigned char buf1[SIZE + 4];
+ unsigned char buf2[SIZE + 4];
+ unsigned char buf3[SIZE + 4];
+ int i, j;
+ u32 crc1, crc2, crc3;
+
+ for (i = 0; i <= SIZE; i++) {
+ printf("\rTesting length %d...", i);
+ fflush(stdout);
+ random_garbage(buf1, i);
+ random_garbage(buf2, i);
+ for (j = 0; j < i; j++)
+ buf3[j] = buf1[j] ^ buf2[j];
+
+ crc1 = test_step(INIT1, buf1, i);
+ crc2 = test_step(INIT2, buf2, i);
+ /* Now check that CRC(buf1 ^ buf2) = CRC(buf1) ^ CRC(buf2) */
+ crc3 = test_step(INIT1 ^ INIT2, buf3, i);
+ if (crc3 != (crc1 ^ crc2))
+ printf("CRC XOR fail: 0x%08x != 0x%08x ^ 0x%08x\n",
+ crc3, crc1, crc2);
+ }
+ printf("\nAll tests complete. No failures expected.\n");
+ return 0;
+}
+
+#endif /* UNITTEST */
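
The tutorial comment in crc32.c above can be tried out with a small standalone
sketch. It restates the two bit-at-a-time loops (big-endian shifting left,
little-endian shifting right) and the "message + CRC leaves a zero remainder"
property that test_step() checks. The function names crc_be_bits(),
crc_le_bits() and residue_after_append() are made up for this example and are
not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CRCPOLY_BE 0x04c11db7u
#define CRCPOLY_LE 0xedb88320u	/* CRCPOLY_BE with its bits reversed */

/* Big-endian: merge each input byte at the top, then shift left 8 times. */
static uint32_t crc_be_bits(uint32_t rem, const unsigned char *p, size_t len)
{
	while (len--) {
		rem ^= (uint32_t)*p++ << 24;
		for (int j = 0; j < 8; j++)
			rem = (rem << 1) ^
			      ((rem & 0x80000000u) ? CRCPOLY_BE : 0);
	}
	return rem;
}

/* Little-endian: merge each input byte at the bottom, then shift right. */
static uint32_t crc_le_bits(uint32_t rem, const unsigned char *p, size_t len)
{
	while (len--) {
		rem ^= *p++;
		for (int j = 0; j < 8; j++)
			rem = (rem >> 1) ^ ((rem & 1) ? CRCPOLY_LE : 0);
	}
	return rem;
}

/*
 * Append the big-endian CRC to the message (most significant byte first)
 * and recompute over message + CRC. buf must have room for len + 4 bytes.
 * The returned remainder is expected to be 0 for any seed.
 */
static uint32_t residue_after_append(uint32_t seed, unsigned char *buf,
				     size_t len)
{
	uint32_t crc = crc_be_bits(seed, buf, len);

	buf[len]     = (unsigned char)(crc >> 24);
	buf[len + 1] = (unsigned char)(crc >> 16);
	buf[len + 2] = (unsigned char)(crc >> 8);
	buf[len + 3] = (unsigned char)crc;
	return crc_be_bits(seed, buf, len + 4);
}
```

With a seed of ~0 and a final inversion, crc_le_bits() over the ASCII string
"123456789" gives the widely published CRC-32 check value 0xcbf43926. This is
a quick way to confirm the little-endian loop shifts in the right direction.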
Index: e2fsprogs-1.40.2/lib/ext2fs/crc32defs.h
===================================================================
--- /dev/null
+++ e2fsprogs-1.40.2/lib/ext2fs/crc32defs.h
@@ -0,0 +1,32 @@
+/*
+ * There are multiple 16-bit CRC polynomials in common use, but this is
+ * *the* standard CRC-32 polynomial, first popularized by Ethernet.
+ * x^32+x^26+x^23+x^22+x^16+x^12+x^11+x^10+x^8+x^7+x^5+x^4+x^2+x^1+x^0
+ */
+#define CRCPOLY_LE 0xedb88320
+#define CRCPOLY_BE 0x04c11db7
+
+/* How many bits at a time to use. Requires a table of 4<<CRC_xx_BITS bytes. */
+/* For less performance-sensitive uses, use 4 */
+#ifndef CRC_LE_BITS
+# define CRC_LE_BITS 8
+#endif
+#ifndef CRC_BE_BITS
+# define CRC_BE_BITS 8
+#endif
+
+/*
+ * Little-endian CRC computation. Used with serial bit streams sent
+ * lsbit-first. Be sure to use cpu_to_le32() to append the computed CRC.
+ */
+#if CRC_LE_BITS > 8 || CRC_LE_BITS < 1 || CRC_LE_BITS & CRC_LE_BITS-1
+# error CRC_LE_BITS must be a power of 2 between 1 and 8
+#endif
+
+/*
+ * Big-endian CRC computation. Used with serial bit streams sent
+ * msbit-first. Be sure to use cpu_to_be32() to append the computed CRC.
+ */
+#if CRC_BE_BITS > 8 || CRC_BE_BITS < 1 || CRC_BE_BITS & CRC_BE_BITS-1
+# error CRC_BE_BITS must be a power of 2 between 1 and 8
+#endif
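
As a cross-check of the two constants in crc32defs.h above, the following
sketch rebuilds CRCPOLY_BE from the exponent list in the comment and obtains
CRCPOLY_LE by reversing its bits. The helper names poly_from_exponents() and
bitreverse32() are illustrative only. bitreverse32() uses the same swap
sequence as bitreverse() in crc32.c, with parentheses added for clarity.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Build the 32-bit polynomial constant from a list of exponents.
 * The x^32 term is the implicit 33rd bit and is therefore skipped.
 */
static uint32_t poly_from_exponents(const int *exp, int n)
{
	uint32_t poly = 0;

	for (int i = 0; i < n; i++)
		if (exp[i] < 32)
			poly |= (uint32_t)1 << exp[i];
	return poly;
}

/* Reverse the order of all 32 bits by swapping progressively smaller groups. */
static uint32_t bitreverse32(uint32_t x)
{
	x = (x >> 16) | (x << 16);
	x = ((x >> 8) & 0x00ff00ffu) | ((x << 8) & 0xff00ff00u);
	x = ((x >> 4) & 0x0f0f0f0fu) | ((x << 4) & 0xf0f0f0f0u);
	x = ((x >> 2) & 0x33333333u) | ((x << 2) & 0xccccccccu);
	x = ((x >> 1) & 0x55555555u) | ((x << 1) & 0xaaaaaaaau);
	return x;
}
```

Feeding in the exponents 32, 26, 23, 22, 16, 12, 11, 10, 8, 7, 5, 4, 2, 1, 0
yields 0x04c11db7, and reversing that value yields 0xedb88320.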
Index: e2fsprogs-1.40.2/lib/ext2fs/crc32table.h
===================================================================
--- /dev/null
+++ e2fsprogs-1.40.2/lib/ext2fs/crc32table.h
@@ -0,0 +1,135 @@
+/* this file is generated - do not edit */
+
+static const u32 crc32table_le[] = {
+tole(0x00000000L), tole(0x77073096L), tole(0xee0e612cL), tole(0x990951baL),
+tole(0x076dc419L), tole(0x706af48fL), tole(0xe963a535L), tole(0x9e6495a3L),
+tole(0x0edb8832L), tole(0x79dcb8a4L), tole(0xe0d5e91eL), tole(0x97d2d988L),
+tole(0x09b64c2bL), tole(0x7eb17cbdL), tole(0xe7b82d07L), tole(0x90bf1d91L),
+tole(0x1db71064L), tole(0x6ab020f2L), tole(0xf3b97148L), tole(0x84be41deL),
+tole(0x1adad47dL), tole(0x6ddde4ebL), tole(0xf4d4b551L), tole(0x83d385c7L),
+tole(0x136c9856L), tole(0x646ba8c0L), tole(0xfd62f97aL), tole(0x8a65c9ecL),
+tole(0x14015c4fL), tole(0x63066cd9L), tole(0xfa0f3d63L), tole(0x8d080df5L),
+tole(0x3b6e20c8L), tole(0x4c69105eL), tole(0xd56041e4L), tole(0xa2677172L),
+tole(0x3c03e4d1L), tole(0x4b04d447L), tole(0xd20d85fdL), tole(0xa50ab56bL),
+tole(0x35b5a8faL), tole(0x42b2986cL), tole(0xdbbbc9d6L), tole(0xacbcf940L),
+tole(0x32d86ce3L), tole(0x45df5c75L), tole(0xdcd60dcfL), tole(0xabd13d59L),
+tole(0x26d930acL), tole(0x51de003aL), tole(0xc8d75180L), tole(0xbfd06116L),
+tole(0x21b4f4b5L), tole(0x56b3c423L), tole(0xcfba9599L), tole(0xb8bda50fL),
+tole(0x2802b89eL), tole(0x5f058808L), tole(0xc60cd9b2L), tole(0xb10be924L),
+tole(0x2f6f7c87L), tole(0x58684c11L), tole(0xc1611dabL), tole(0xb6662d3dL),
+tole(0x76dc4190L), tole(0x01db7106L), tole(0x98d220bcL), tole(0xefd5102aL),
+tole(0x71b18589L), tole(0x06b6b51fL), tole(0x9fbfe4a5L), tole(0xe8b8d433L),
+tole(0x7807c9a2L), tole(0x0f00f934L), tole(0x9609a88eL), tole(0xe10e9818L),
+tole(0x7f6a0dbbL), tole(0x086d3d2dL), tole(0x91646c97L), tole(0xe6635c01L),
+tole(0x6b6b51f4L), tole(0x1c6c6162L), tole(0x856530d8L), tole(0xf262004eL),
+tole(0x6c0695edL), tole(0x1b01a57bL), tole(0x8208f4c1L), tole(0xf50fc457L),
+tole(0x65b0d9c6L), tole(0x12b7e950L), tole(0x8bbeb8eaL), tole(0xfcb9887cL),
+tole(0x62dd1ddfL), tole(0x15da2d49L), tole(0x8cd37cf3L), tole(0xfbd44c65L),
+tole(0x4db26158L), tole(0x3ab551ceL), tole(0xa3bc0074L), tole(0xd4bb30e2L),
+tole(0x4adfa541L), tole(0x3dd895d7L), tole(0xa4d1c46dL), tole(0xd3d6f4fbL),
+tole(0x4369e96aL), tole(0x346ed9fcL), tole(0xad678846L), tole(0xda60b8d0L),
+tole(0x44042d73L), tole(0x33031de5L), tole(0xaa0a4c5fL), tole(0xdd0d7cc9L),
+tole(0x5005713cL), tole(0x270241aaL), tole(0xbe0b1010L), tole(0xc90c2086L),
+tole(0x5768b525L), tole(0x206f85b3L), tole(0xb966d409L), tole(0xce61e49fL),
+tole(0x5edef90eL), tole(0x29d9c998L), tole(0xb0d09822L), tole(0xc7d7a8b4L),
+tole(0x59b33d17L), tole(0x2eb40d81L), tole(0xb7bd5c3bL), tole(0xc0ba6cadL),
+tole(0xedb88320L), tole(0x9abfb3b6L), tole(0x03b6e20cL), tole(0x74b1d29aL),
+tole(0xead54739L), tole(0x9dd277afL), tole(0x04db2615L), tole(0x73dc1683L),
+tole(0xe3630b12L), tole(0x94643b84L), tole(0x0d6d6a3eL), tole(0x7a6a5aa8L),
+tole(0xe40ecf0bL), tole(0x9309ff9dL), tole(0x0a00ae27L), tole(0x7d079eb1L),
+tole(0xf00f9344L), tole(0x8708a3d2L), tole(0x1e01f268L), tole(0x6906c2feL),
+tole(0xf762575dL), tole(0x806567cbL), tole(0x196c3671L), tole(0x6e6b06e7L),
+tole(0xfed41b76L), tole(0x89d32be0L), tole(0x10da7a5aL), tole(0x67dd4accL),
+tole(0xf9b9df6fL), tole(0x8ebeeff9L), tole(0x17b7be43L), tole(0x60b08ed5L),
+tole(0xd6d6a3e8L), tole(0xa1d1937eL), tole(0x38d8c2c4L), tole(0x4fdff252L),
+tole(0xd1bb67f1L), tole(0xa6bc5767L), tole(0x3fb506ddL), tole(0x48b2364bL),
+tole(0xd80d2bdaL), tole(0xaf0a1b4cL), tole(0x36034af6L), tole(0x41047a60L),
+tole(0xdf60efc3L), tole(0xa867df55L), tole(0x316e8eefL), tole(0x4669be79L),
+tole(0xcb61b38cL), tole(0xbc66831aL), tole(0x256fd2a0L), tole(0x5268e236L),
+tole(0xcc0c7795L), tole(0xbb0b4703L), tole(0x220216b9L), tole(0x5505262fL),
+tole(0xc5ba3bbeL), tole(0xb2bd0b28L), tole(0x2bb45a92L), tole(0x5cb36a04L),
+tole(0xc2d7ffa7L), tole(0xb5d0cf31L), tole(0x2cd99e8bL), tole(0x5bdeae1dL),
+tole(0x9b64c2b0L), tole(0xec63f226L), tole(0x756aa39cL), tole(0x026d930aL),
+tole(0x9c0906a9L), tole(0xeb0e363fL), tole(0x72076785L), tole(0x05005713L),
+tole(0x95bf4a82L), tole(0xe2b87a14L), tole(0x7bb12baeL), tole(0x0cb61b38L),
+tole(0x92d28e9bL), tole(0xe5d5be0dL), tole(0x7cdcefb7L), tole(0x0bdbdf21L),
+tole(0x86d3d2d4L), tole(0xf1d4e242L), tole(0x68ddb3f8L), tole(0x1fda836eL),
+tole(0x81be16cdL), tole(0xf6b9265bL), tole(0x6fb077e1L), tole(0x18b74777L),
+tole(0x88085ae6L), tole(0xff0f6a70L), tole(0x66063bcaL), tole(0x11010b5cL),
+tole(0x8f659effL), tole(0xf862ae69L), tole(0x616bffd3L), tole(0x166ccf45L),
+tole(0xa00ae278L), tole(0xd70dd2eeL), tole(0x4e048354L), tole(0x3903b3c2L),
+tole(0xa7672661L), tole(0xd06016f7L), tole(0x4969474dL), tole(0x3e6e77dbL),
+tole(0xaed16a4aL), tole(0xd9d65adcL), tole(0x40df0b66L), tole(0x37d83bf0L),
+tole(0xa9bcae53L), tole(0xdebb9ec5L), tole(0x47b2cf7fL), tole(0x30b5ffe9L),
+tole(0xbdbdf21cL), tole(0xcabac28aL), tole(0x53b39330L), tole(0x24b4a3a6L),
+tole(0xbad03605L), tole(0xcdd70693L), tole(0x54de5729L), tole(0x23d967bfL),
+tole(0xb3667a2eL), tole(0xc4614ab8L), tole(0x5d681b02L), tole(0x2a6f2b94L),
+tole(0xb40bbe37L), tole(0xc30c8ea1L), tole(0x5a05df1bL), tole(0x2d02ef8dL)
+};
+
+static const u32 crc32table_be[] = {
+tobe(0x00000000L), tobe(0x04c11db7L), tobe(0x09823b6eL), tobe(0x0d4326d9L),
+tobe(0x130476dcL), tobe(0x17c56b6bL), tobe(0x1a864db2L), tobe(0x1e475005L),
+tobe(0x2608edb8L), tobe(0x22c9f00fL), tobe(0x2f8ad6d6L), tobe(0x2b4bcb61L),
+tobe(0x350c9b64L), tobe(0x31cd86d3L), tobe(0x3c8ea00aL), tobe(0x384fbdbdL),
+tobe(0x4c11db70L), tobe(0x48d0c6c7L), tobe(0x4593e01eL), tobe(0x4152fda9L),
+tobe(0x5f15adacL), tobe(0x5bd4b01bL), tobe(0x569796c2L), tobe(0x52568b75L),
+tobe(0x6a1936c8L), tobe(0x6ed82b7fL), tobe(0x639b0da6L), tobe(0x675a1011L),
+tobe(0x791d4014L), tobe(0x7ddc5da3L), tobe(0x709f7b7aL), tobe(0x745e66cdL),
+tobe(0x9823b6e0L), tobe(0x9ce2ab57L), tobe(0x91a18d8eL), tobe(0x95609039L),
+tobe(0x8b27c03cL), tobe(0x8fe6dd8bL), tobe(0x82a5fb52L), tobe(0x8664e6e5L),
+tobe(0xbe2b5b58L), tobe(0xbaea46efL), tobe(0xb7a96036L), tobe(0xb3687d81L),
+tobe(0xad2f2d84L), tobe(0xa9ee3033L), tobe(0xa4ad16eaL), tobe(0xa06c0b5dL),
+tobe(0xd4326d90L), tobe(0xd0f37027L), tobe(0xddb056feL), tobe(0xd9714b49L),
+tobe(0xc7361b4cL), tobe(0xc3f706fbL), tobe(0xceb42022L), tobe(0xca753d95L),
+tobe(0xf23a8028L), tobe(0xf6fb9d9fL), tobe(0xfbb8bb46L), tobe(0xff79a6f1L),
+tobe(0xe13ef6f4L), tobe(0xe5ffeb43L), tobe(0xe8bccd9aL), tobe(0xec7dd02dL),
+tobe(0x34867077L), tobe(0x30476dc0L), tobe(0x3d044b19L), tobe(0x39c556aeL),
+tobe(0x278206abL), tobe(0x23431b1cL), tobe(0x2e003dc5L), tobe(0x2ac12072L),
+tobe(0x128e9dcfL), tobe(0x164f8078L), tobe(0x1b0ca6a1L), tobe(0x1fcdbb16L),
+tobe(0x018aeb13L), tobe(0x054bf6a4L), tobe(0x0808d07dL), tobe(0x0cc9cdcaL),
+tobe(0x7897ab07L), tobe(0x7c56b6b0L), tobe(0x71159069L), tobe(0x75d48ddeL),
+tobe(0x6b93dddbL), tobe(0x6f52c06cL), tobe(0x6211e6b5L), tobe(0x66d0fb02L),
+tobe(0x5e9f46bfL), tobe(0x5a5e5b08L), tobe(0x571d7dd1L), tobe(0x53dc6066L),
+tobe(0x4d9b3063L), tobe(0x495a2dd4L), tobe(0x44190b0dL), tobe(0x40d816baL),
+tobe(0xaca5c697L), tobe(0xa864db20L), tobe(0xa527fdf9L), tobe(0xa1e6e04eL),
+tobe(0xbfa1b04bL), tobe(0xbb60adfcL), tobe(0xb6238b25L), tobe(0xb2e29692L),
+tobe(0x8aad2b2fL), tobe(0x8e6c3698L), tobe(0x832f1041L), tobe(0x87ee0df6L),
+tobe(0x99a95df3L), tobe(0x9d684044L), tobe(0x902b669dL), tobe(0x94ea7b2aL),
+tobe(0xe0b41de7L), tobe(0xe4750050L), tobe(0xe9362689L), tobe(0xedf73b3eL),
+tobe(0xf3b06b3bL), tobe(0xf771768cL), tobe(0xfa325055L), tobe(0xfef34de2L),
+tobe(0xc6bcf05fL), tobe(0xc27dede8L), tobe(0xcf3ecb31L), tobe(0xcbffd686L),
+tobe(0xd5b88683L), tobe(0xd1799b34L), tobe(0xdc3abdedL), tobe(0xd8fba05aL),
+tobe(0x690ce0eeL), tobe(0x6dcdfd59L), tobe(0x608edb80L), tobe(0x644fc637L),
+tobe(0x7a089632L), tobe(0x7ec98b85L), tobe(0x738aad5cL), tobe(0x774bb0ebL),
+tobe(0x4f040d56L), tobe(0x4bc510e1L), tobe(0x46863638L), tobe(0x42472b8fL),
+tobe(0x5c007b8aL), tobe(0x58c1663dL), tobe(0x558240e4L), tobe(0x51435d53L),
+tobe(0x251d3b9eL), tobe(0x21dc2629L), tobe(0x2c9f00f0L), tobe(0x285e1d47L),
+tobe(0x36194d42L), tobe(0x32d850f5L), tobe(0x3f9b762cL), tobe(0x3b5a6b9bL),
+tobe(0x0315d626L), tobe(0x07d4cb91L), tobe(0x0a97ed48L), tobe(0x0e56f0ffL),
+tobe(0x1011a0faL), tobe(0x14d0bd4dL), tobe(0x19939b94L), tobe(0x1d528623L),
+tobe(0xf12f560eL), tobe(0xf5ee4bb9L), tobe(0xf8ad6d60L), tobe(0xfc6c70d7L),
+tobe(0xe22b20d2L), tobe(0xe6ea3d65L), tobe(0xeba91bbcL), tobe(0xef68060bL),
+tobe(0xd727bbb6L), tobe(0xd3e6a601L), tobe(0xdea580d8L), tobe(0xda649d6fL),
+tobe(0xc423cd6aL), tobe(0xc0e2d0ddL), tobe(0xcda1f604L), tobe(0xc960ebb3L),
+tobe(0xbd3e8d7eL), tobe(0xb9ff90c9L), tobe(0xb4bcb610L), tobe(0xb07daba7L),
+tobe(0xae3afba2L), tobe(0xaafbe615L), tobe(0xa7b8c0ccL), tobe(0xa379dd7bL),
+tobe(0x9b3660c6L), tobe(0x9ff77d71L), tobe(0x92b45ba8L), tobe(0x9675461fL),
+tobe(0x8832161aL), tobe(0x8cf30badL), tobe(0x81b02d74L), tobe(0x857130c3L),
+tobe(0x5d8a9099L), tobe(0x594b8d2eL), tobe(0x5408abf7L), tobe(0x50c9b640L),
+tobe(0x4e8ee645L), tobe(0x4a4ffbf2L), tobe(0x470cdd2bL), tobe(0x43cdc09cL),
+tobe(0x7b827d21L), tobe(0x7f436096L), tobe(0x7200464fL), tobe(0x76c15bf8L),
+tobe(0x68860bfdL), tobe(0x6c47164aL), tobe(0x61043093L), tobe(0x65c52d24L),
+tobe(0x119b4be9L), tobe(0x155a565eL), tobe(0x18197087L), tobe(0x1cd86d30L),
+tobe(0x029f3d35L), tobe(0x065e2082L), tobe(0x0b1d065bL), tobe(0x0fdc1becL),
+tobe(0x3793a651L), tobe(0x3352bbe6L), tobe(0x3e119d3fL), tobe(0x3ad08088L),
+tobe(0x2497d08dL), tobe(0x2056cd3aL), tobe(0x2d15ebe3L), tobe(0x29d4f654L),
+tobe(0xc5a92679L), tobe(0xc1683bceL), tobe(0xcc2b1d17L), tobe(0xc8ea00a0L),
+tobe(0xd6ad50a5L), tobe(0xd26c4d12L), tobe(0xdf2f6bcbL), tobe(0xdbee767cL),
+tobe(0xe3a1cbc1L), tobe(0xe760d676L), tobe(0xea23f0afL), tobe(0xeee2ed18L),
+tobe(0xf0a5bd1dL), tobe(0xf464a0aaL), tobe(0xf9278673L), tobe(0xfde69bc4L),
+tobe(0x89b8fd09L), tobe(0x8d79e0beL), tobe(0x803ac667L), tobe(0x84fbdbd0L),
+tobe(0x9abc8bd5L), tobe(0x9e7d9662L), tobe(0x933eb0bbL), tobe(0x97ffad0cL),
+tobe(0xafb010b1L), tobe(0xab710d06L), tobe(0xa6322bdfL), tobe(0xa2f33668L),
+tobe(0xbcb4666dL), tobe(0xb8757bdaL), tobe(0xb5365d03L), tobe(0xb1f740b4L)
+};
Index: e2fsprogs-1.40.2/lib/ext2fs/crc32_user.h
===================================================================
--- /dev/null
+++ e2fsprogs-1.40.2/lib/ext2fs/crc32_user.h
@@ -0,0 +1,45 @@
+/*
+ * Defines macros and types required by crc32 code undefined in user space.
+ */
+#ifndef _LINUX_CRC32_USER_H
+#define _LINUX_CRC32_USER_H
+#include <linux/types.h>
+
+#define likely(x) __builtin_expect(!!(x), 1)
+#define unlikely(x) __builtin_expect(!!(x), 0)
+
+#define __swab32(x) \
+({ \
+ __u32 __x = (x); \
+ ((__u32)( \
+ (((__u32)(__x) & (__u32)0x000000ffUL) << 24) | \
+ (((__u32)(__x) & (__u32)0x0000ff00UL) << 8) | \
+ (((__u32)(__x) & (__u32)0x00ff0000UL) >> 8) | \
+ (((__u32)(__x) & (__u32)0xff000000UL) >> 24) )); \
+})
+
+#define ___constant_swab32(x) \
+ ((__u32)( \
+ (((__u32)(x) & (__u32)0x000000ffUL) << 24) | \
+ (((__u32)(x) & (__u32)0x0000ff00UL) << 8) | \
+ (((__u32)(x) & (__u32)0x0000ff00UL) << 8) | \
+ (((__u32)(x) & (__u32)0x00ff0000UL) >> 8) | \
+ (((__u32)(x) & (__u32)0xff000000UL) >> 24) ))
+
+#if __BYTE_ORDER == __LITTLE_ENDIAN
+#define __le32_to_cpu(x) ((__u32)(x))
+#define __cpu_to_le32(x) ((__u32)(x))
+#define __be32_to_cpu(x) __swab32((x))
+#define __cpu_to_be32(x) __swab32((x))
+#define __constant_cpu_to_le32(x) ((__u32)(x))
+#define __constant_cpu_to_be32(x) (( __u32)___constant_swab32((x)))
+#else
+#define __le32_to_cpu(x) __swab32((x))
+#define __cpu_to_le32(x) __swab32((x))
+#define __be32_to_cpu(x) ((__u32)(x))
+#define __cpu_to_be32(x) ((__u32)(x))
+#define __constant_cpu_to_le32(x) ___constant_swab32((x))
+#define __constant_cpu_to_be32(x) ((__u32)(x))
+#endif
+
+#endif /* _LINUX_CRC32_USER_H */
Index: e2fsprogs-1.40.2/lib/ext2fs/Makefile.in
===================================================================
--- e2fsprogs-1.40.2.orig/lib/ext2fs/Makefile.in
+++ e2fsprogs-1.40.2/lib/ext2fs/Makefile.in
@@ -69,6 +69,7 @@ OBJS= $(DEBUGFS_LIB_OBJS) $(RESIZE_LIB_O
unlink.o \
valid_blk.o \
version.o \
+ crc32.o \
crc16.o \
csum.o

@@ -143,6 +144,7 @@ SRCS= ext2_err.c \
$(srcdir)/tst_types.c \
$(srcdir)/tst_iscan.c \
$(srcdir)/tst_csum.c \
+ $(srcdir)/crc32.c \
$(srcdir)/crc16.c \
$(srcdir)/csum.c

@@ -234,10 +236,11 @@ tst_bitops: tst_bitops.o inline.o $(STAT
@$(CC) -o tst_bitops tst_bitops.o inline.o $(ALL_CFLAGS) \
$(STATIC_LIBEXT2FS) $(LIBCOM_ERR)

-tst_csum: tst_csum.o csum.o crc16.o $(STATIC_LIBEXT2FS) $(STATIC_LIBUUID)
+tst_csum: tst_csum.o csum.o crc16.o crc32.o $(STATIC_LIBEXT2FS) \
+ $(STATIC_LIBUUID)
@echo " LD $@"
- @$(CC) -o tst_csum csum.o tst_csum.o crc16.o $(STATIC_LIBEXT2FS) \
- $(STATIC_LIBUUID) $(LIBCOM_ERR)
+ @$(CC) -o tst_csum csum.o tst_csum.o crc16.o crc32.o \
+ $(STATIC_LIBEXT2FS) $(STATIC_LIBUUID) $(LIBCOM_ERR)

tst_getsectsize: tst_getsectsize.o getsectsize.o $(STATIC_LIBEXT2FS)
@echo " LD $@"
@@ -384,6 +387,8 @@ cmp_bitmaps.o: $(srcdir)/cmp_bitmaps.c $
$(srcdir)/ext2_io.h $(top_builddir)/lib/ext2fs/ext2_err.h $(srcdir)/bitops.h
crc16.o: $(srcdir)/crc16.c $(srcdir)/ext2_fs.h $(srcdir)/crc16.h \
$(top_builddir)/lib/ext2fs/ext2_types.h $(srcdir)/ext2fs.h $(srcdir)/ext2_fs.h
+crc32.o: $(srcdir)/crc32.c $(srcdir)/ext2_fs.h $(srcdir)/crc16.h \
+ $(top_builddir)/lib/ext2fs/ext2_types.h $(srcdir)/ext2fs.h $(srcdir)/ext2_fs.h
csum.o: $(srcdir)/csum.c $(srcdir)/ext2_fs.h \
$(top_builddir)/lib/ext2fs/ext2_types.h $(srcdir)/ext2fs.h $(srcdir)/ext2_fs.h
dblist.o: $(srcdir)/dblist.c $(srcdir)/ext2_fs.h \
Index: e2fsprogs-1.40.2/e2fsck/e2fsck.h
===================================================================
--- e2fsprogs-1.40.2.orig/e2fsck/e2fsck.h
+++ e2fsprogs-1.40.2/e2fsck/e2fsck.h
@@ -30,10 +30,12 @@
#if EXT2_FLAT_INCLUDES
#include "ext2_fs.h"
#include "ext2fs.h"
+#include "crc32.h"
#include "blkid.h"
#else
#include "ext2fs/ext2_fs.h"
#include "ext2fs/ext2fs.h"
+#include "ext2fs/crc32.h"
#include "blkid/blkid.h"
#endif

Index: e2fsprogs-1.40.2/e2fsck/recovery.c
===================================================================
--- e2fsprogs-1.40.2.orig/e2fsck/recovery.c
+++ e2fsprogs-1.40.2/e2fsck/recovery.c
@@ -21,6 +21,7 @@
#include <linux/jbd.h>
#include <linux/errno.h>
#include <linux/slab.h>
+#include <linux/crc32.h>
#endif

/*
@@ -304,6 +305,36 @@ int journal_skip_recovery(journal_t *jou
return err;
}

+/* calc_chksums calculates the checksums for the blocks described in the
+ * descriptor block.
+ */
+static int calc_chksums(journal_t *journal, struct buffer_head *bh,
+ unsigned long *next_log_block, __u32 *crc32_sum)
+{
+ int i, num_blks, err;
+ unsigned io_block;
+ struct buffer_head *obh;
+
+ num_blks = count_tags(bh, journal->j_blocksize);
+ /* Calculate checksum of the descriptor block. */
+ *crc32_sum = crc32_be(*crc32_sum, (void *)bh->b_data, bh->b_size);
+ for (i = 0; i < num_blks; i++) {
+ io_block = (*next_log_block)++;
+ wrap(journal, *next_log_block);
+
+ err = jread(&obh, journal, io_block);
+ if (err) {
+ printk (KERN_ERR "JBD: IO error %d recovering block "
+ "%u in log\n", err, io_block);
+ return 1;
+ } else {
+ *crc32_sum = crc32_be(*crc32_sum, (void *)obh->b_data,
+ obh->b_size);
+ }
+ }
+ return 0;
+}
+
static int do_one_pass(journal_t *journal,
struct recovery_info *info, enum passtype pass)
{
@@ -315,6 +346,7 @@ static int do_one_pass(journal_t *journa
struct buffer_head * bh;
unsigned int sequence;
int blocktype;
+ __u32 crc32_sum = ~0; /* Transactional Checksums */

/* Precompute the maximum metadata descriptors in a descriptor block */
int MAX_BLOCKS_PER_DESC;
@@ -404,9 +436,24 @@ static int do_one_pass(journal_t *journa
switch(blocktype) {
case JFS_DESCRIPTOR_BLOCK:
/* If it is a valid descriptor block, replay it
- * in pass REPLAY; otherwise, just skip over the
- * blocks it describes. */
+ * in pass REPLAY; if journal_checksums enabled, then
+ * calculate checksums in PASS_SCAN, otherwise,
+ * just skip over the blocks it describes. */
if (pass != PASS_REPLAY) {
+ if (pass == PASS_SCAN &&
+ JFS_HAS_COMPAT_FEATURE(journal,
+ JFS_FEATURE_COMPAT_CHECKSUM) &&
+ !info->end_transaction) {
+ if (calc_chksums(journal, bh,
+ &next_log_block,
+ &crc32_sum)) {
+ brelse(bh);
+ break;
+ }
+ brelse(bh);
+ continue;
+ }
+
next_log_block +=
count_tags(bh, journal->j_blocksize);
wrap(journal, next_log_block);
@@ -501,9 +548,96 @@ static int do_one_pass(journal_t *journa
continue;

case JFS_COMMIT_BLOCK:
- /* Found an expected commit block: not much to
- * do other than move on to the next sequence
+ /* How to differentiate between interrupted commit
+ * and journal corruption ?
+ *
+ * {nth transaction}
+ * Checksum Verification Failed
+ * |
+ * ____________________
+ * | |
+ * async_commit sync_commit
+ * | |
+ * | GO TO NEXT "Journal Corruption"
+ * | TRANSACTION
+ * |
+ * {(n+1)th transaction}
+ * |
+ * _______|______________
+ * | |
+ * Commit block found Commit block not found
+ * | |
+ * "Journal Corruption" |
+ * _____________|__________
+ * | |
+ * nth trans corrupt OR nth trans
+ * and (n+1)th interrupted interrupted
+ * before commit block
+ * could reach the disk.
+ * (Cannot find the difference in above
+ * mentioned conditions. Hence assume
+ * "Interrupted Commit".)
+ */
+
+ /* Found an expected commit block: if checksums
+ * are present verify them in PASS_SCAN; else not
+ * much to do other than move on to the next sequence
* number. */
+ if (pass == PASS_SCAN &&
+ JFS_HAS_COMPAT_FEATURE(journal,
+ JFS_FEATURE_COMPAT_CHECKSUM)) {
+ int chksum_err, chksum_seen;
+ struct commit_header *cbh =
+ (struct commit_header *)bh->b_data;
+ __u32 found_chksum = ntohl(cbh->h_chksum[0]);
+
+ chksum_err = chksum_seen = 0;
+
+ if (info->end_transaction) {
+ printk(KERN_ERR "JBD: Transaction %u "
+ "found to be corrupt.\n",
+ next_commit_ID - 1);
+ brelse(bh);
+ break;
+ }
+
+ if (crc32_sum == found_chksum &&
+ cbh->h_chksum_type == JFS_CRC32_CHKSUM &&
+ cbh->h_chksum_size ==
+ JFS_CRC32_CHKSUM_SIZE) {
+ chksum_seen = 1;
+ } else if (!(cbh->h_chksum_type == 0 &&
+ cbh->h_chksum_size == 0 &&
+ found_chksum == 0 &&
+ !chksum_seen)) {
+ /*
+ * If fs is mounted using an old kernel and then
+ * kernel with journal_chksum is used then we
+ * get a situation where the journal flag has
+ * checksum flag set but checksums are not
+ * present i.e chksum = 0, in the individual
+ * commit blocks.
+ * Hence to avoid checksum failures, in this
+ * situation, this extra check is added.
+ */
+ chksum_err = 1;
+ }
+
+ if (chksum_err) {
+ info->end_transaction = next_commit_ID;
+
+ if (!JFS_HAS_INCOMPAT_FEATURE(journal,
+ JFS_FEATURE_INCOMPAT_ASYNC_COMMIT)){
+ printk(KERN_ERR
+ "JBD: Transaction %u "
+ "found to be corrupt.\n",
+ next_commit_ID);
+ brelse(bh);
+ break;
+ }
+ }
+ crc32_sum = ~0;
+ }
brelse(bh);
next_commit_ID++;
continue;
@@ -539,9 +673,10 @@ static int do_one_pass(journal_t *journa
* transaction marks the end of the valid log.
*/

- if (pass == PASS_SCAN)
- info->end_transaction = next_commit_ID;
- else {
+ if (pass == PASS_SCAN) {
+ if (!info->end_transaction)
+ info->end_transaction = next_commit_ID;
+ } else {
/* It's really bad news if different passes end up at
* different places (but possible due to IO errors). */
if (info->end_transaction != next_commit_ID) {
Index: e2fsprogs-1.40.2/lib/ext2fs/tst_csum.c
===================================================================
--- e2fsprogs-1.40.2.orig/lib/ext2fs/tst_csum.c
+++ e2fsprogs-1.40.2/lib/ext2fs/tst_csum.c
@@ -1,16 +1,17 @@
/*
- * This testing program verifies checksumming operations
- *
- * Copyright (C) 2006, 2007 by Andreas Dilger <[email protected]>
- *
- * %Begin-Header%
- * This file may be redistributed under the terms of the GNU Public
- * License.
- * %End-Header%
- */
+* This testing program verifies checksumming operations
+*
+* Copyright (C) 2006, 2007 by Andreas Dilger <[email protected]>
+*
+* %Begin-Header%
+* This file may be redistributed under the terms of the GNU Public
+* License.
+* %End-Header%
+*/

#include "ext2fs/ext2_fs.h"
#include "ext2fs/ext2fs.h"
+#include "ext2fs/crc32.h"
#include "ext2fs/crc16.h"
#include "uuid/uuid.h"

@@ -64,28 +65,61 @@ int main(int argc, char **argv)
0x4b, 0xae, 0xec, 0xdb } };
__u16 csum1, csum2, csum_known = 0xd3a4;
char data[8] = { 0x10, 0x20, 0x30, 0x40, 0xf1, 0xb2, 0xc3, 0xd4 };
- __u16 data_crc[8] = { 0xcc01, 0x180c, 0x1118, 0xfa10,
- 0x483a, 0x6648, 0x6726, 0x85e6 };
- __u16 data_crc0[8] = { 0x8cbe, 0xa80d, 0xd169, 0xde10,
- 0x481e, 0x7d48, 0x673d, 0x8ea6 };
+ __u16 data_crc16[8] = { 0xcc01, 0x180c, 0x1118, 0xfa10,
+ 0x483a, 0x6648, 0x6726, 0x85e6 };
+ __u16 data_crc16_0[8] = { 0x8cbe, 0xa80d, 0xd169, 0xde10,
+ 0x481e, 0x7d48, 0x673d, 0x8ea6 };
+ __u32 data_crc32[8] = {
+ 0x4c11db70, 0x88722df3, 0xe91b93c6, 0xc8756001,
+ 0x839b9c9f, 0x4b6fef27, 0x20eb2a56, 0x7196ddd5
+ };
+ __u32 data_crc32_0[8] = {
+ 0x21964c4, 0x88c5498e, 0x5e7feec6, 0xf71bd7a,
+ 0xc48b2703, 0x71155355, 0xa1efe310, 0x1892668c
+ };
int i;

for (i = 0; i < sizeof(data); i++) {
csum1 = crc16(0, data, i + 1);
- printf("crc16(0): data[%d]: %04x=%04x\n", i, csum1,data_crc[i]);
- if (csum1 != data_crc[i]) {
+ printf("crc16(0): data[%d]: %04x=%04x\n", i, csum1,
+ data_crc16[i]);
+ if (csum1 != data_crc16[i]) {
printf("error: crc16(0) for data[%d] should be %04x\n",
- i, data_crc[i]);
+ i, data_crc16[i]);
exit(1);
}
}

for (i = 0; i < sizeof(data); i++) {
csum1 = crc16(~0, data, i + 1);
- printf("crc16(~0): data[%d]: %04x=%04x\n",i,csum1,data_crc0[i]);
- if (csum1 != data_crc0[i]) {
+ printf("crc16(~0): data[%d]: %04x=%04x\n", i, csum1,
+ data_crc16_0[i]);
+ if (csum1 != data_crc16_0[i]) {
printf("error: crc16(~0) for data[%d] should be %04x\n",
- i, data_crc0[i]);
+ i, data_crc16_0[i]);
+ exit(1);
+ }
+ }
+ for (i = 0; i < sizeof(data); i++) {
+ __u32 csum32;
+ csum32 = crc32_be(0, data, i + 1);
+ printf("crc32(0): data[%d]: %04x=%04x\n", i, csum32,
+ data_crc32[i]);
+ if (csum32 != data_crc32[i]) {
+ printf("error: crc32(0) for data[%d] should be %04x\n",
+ i, data_crc32[i]);
+ exit(1);
+ }
+ }
+
+ for (i = 0; i < sizeof(data); i++) {
+ __u32 csum32;
+ csum32 = crc32_be(~0U, data, i + 1);
+ printf("crc32(~0): data[%d]: %04x=%04x\n", i, csum32,
+ data_crc32_0[i]);
+ if (csum32 != data_crc32_0[i]) {
+ printf("error: crc32(~0) for data[%d] should be %04x\n",
+ i, data_crc32_0[i]);
exit(1);
}
}

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2008-02-02 08:54:13

by Andreas Dilger

Subject: Re: [PATCH][25/28] e2fsprogs-i_size-corruption.patch


Fix handling of block preallocation support in cases where the kernel
PAGE_SIZE is larger than the filesystem blocksize.

Signed-off-by: Kalpak Shah <[email protected]>
Signed-off-by: Andreas Dilger <[email protected]>

Index: e2fsprogs-1.40.2/e2fsck/pass1.c
===================================================================
--- e2fsprogs-1.40.2.orig/e2fsck/pass1.c
+++ e2fsprogs-1.40.2/e2fsck/pass1.c
@@ -2103,7 +2103,7 @@ static void check_blocks(e2fsck_t ctx, s
if ((pb.last_block >= 0) &&
/* allow allocated blocks to end of PAGE_SIZE */
(size < (__u64)pb.last_block * fs->blocksize) &&
- (pb.last_block / blkpg * blkpg != pb.last_block ||
+ ((pb.last_block+1) / blkpg * blkpg != (pb.last_block+1) ||
size < (__u64)(pb.last_block & ~(blkpg-1)) *fs->blocksize))
bad_size = 3;
else if (size > ext2_max_sizes[fs->super->s_log_block_size])


Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2008-02-02 08:54:44

by Andreas Dilger

Subject: Re: [PATCH][26/28] e2fsprogs-fiemap.patch


Add support for ioctl(FIEMAP) to filefrag. If the kernel supports FIEMAP
the filefrag program prefers this more efficient mechanism to get extent
information instead of repeated FIBMAP calls.

Signed-off-by: Kalpak Shah <[email protected]>
Signed-off-by: Andreas Dilger <[email protected]>

Index: e2fsprogs-1.40.2/misc/filefrag.c
===================================================================
--- e2fsprogs-1.40.2.orig/misc/filefrag.c
+++ e2fsprogs-1.40.2/misc/filefrag.c
@@ -38,11 +38,47 @@ extern int optind;
#include <sys/vfs.h>
#include <sys/ioctl.h>
#include <linux/fd.h>
+#include <ext2fs/ext2_types.h>

int verbose = 0;

-#define FIBMAP _IO(0x00,1) /* bmap access */
-#define FIGETBSZ _IO(0x00,2) /* get the block size used for bmap */
+struct fiemap_extent {
+ __u64 fe_offset; /* offset in bytes for the start of the extent */
+ __u64 fe_length; /* length in bytes for the extent */
+ __u32 fe_flags; /* returned FIEMAP_EXTENT_* flags for the extent */
+ __u32 fe_lun; /* logical device number for extent (starting at 0) */
+};
+
+struct fiemap {
+ __u64 fm_start; /* logical starting byte offset (in/out) */
+ __u64 fm_length; /* logical length of map (in/out) */
+ __u32 fm_flags; /* FIEMAP_FLAG_* flags for request (in/out) */
+ __u32 fm_extent_count; /* number of extents in fm_extents (in/out) */
+ __u64 fm_unused;
+ struct fiemap_extent fm_extents[0];
+};
+
+#define FIEMAP_FLAG_SYNC 0x00000001 /* sync file data before map */
+#define FIEMAP_FLAG_HSM_READ 0x00000002 /* get data from HSM before map */
+#define FIEMAP_FLAG_NUM_EXTENTS 0x00000004 /* return only number of extents */
+#define FIEMAP_FLAG_INCOMPAT 0xff000000 /* error for unknown flags in here */
+
+#define FIEMAP_EXTENT_HOLE 0x00000001 /* has no data or space allocation */
+#define FIEMAP_EXTENT_UNWRITTEN 0x00000002 /* space allocated, but no data */
+#define FIEMAP_EXTENT_UNMAPPED 0x00000004 /* has data but no space allocation*/
+#define FIEMAP_EXTENT_ERROR 0x00000008 /* mapping error, errno in fe_start*/
+#define FIEMAP_EXTENT_NO_DIRECT 0x00000010 /* cannot access data directly */
+#define FIEMAP_EXTENT_LAST 0x00000020 /* last extent in the file */
+#define FIEMAP_EXTENT_DELALLOC 0x00000040 /* has data but not yet written,
+ must have EXTENT_UNKNOWN set */
+#define FIEMAP_EXTENT_SECONDARY 0x00000080 /* data (also) in secondary storage,
+ not in primary if EXTENT_UNKNOWN*/
+#define FIEMAP_EXTENT_EOF 0x00000100 /* if fm_start+fm_len is beyond EOF*/
+
+
+#define FIBMAP _IO(0x00, 1) /* bmap access */
+#define FIGETBSZ _IO(0x00, 2) /* get the block size used for bmap */
+#define EXT4_IOC_FIEMAP _IOWR('f', 10, struct fiemap) /* get file extent info*/

#define EXT4_EXTENTS_FL 0x00080000 /* Inode uses extents */
#define EXT3_IOC_GETFLAGS _IOR('f', 1, long)
@@ -71,6 +107,62 @@ static unsigned long get_bmap(int fd, un
return b;
}

+int filefrag_fiemap(int fd, int bs, int *num_extents)
+{
+ char buf[4096] = "";
+ struct fiemap *fiemap = (struct fiemap *)buf;
+ int count = (sizeof(buf) - sizeof(*fiemap)) /
+ sizeof(struct fiemap_extent);
+ __u64 logical_blk = 0, last_blk = 0;
+ unsigned long flags = 0;
+ int tot_extents = 0;
+ int eof = 0;
+ int i;
+ int rc;
+
+ memset(fiemap, 0, sizeof(struct fiemap));
+ fiemap->fm_extent_count = count;
+ fiemap->fm_length = ~0ULL;
+ if (!verbose)
+ flags |= FIEMAP_FLAG_NUM_EXTENTS;
+
+ do {
+ fiemap->fm_length = ~0ULL;
+ fiemap->fm_flags = flags;
+ fiemap->fm_extent_count = count;
+ rc = ioctl (fd, EXT4_IOC_FIEMAP, (unsigned long) fiemap);
+ if (rc)
+ return rc;
+
+ if (!verbose) {
+ *num_extents = fiemap->fm_extent_count;
+ goto out;
+ }
+
+ for (i = 0; i < fiemap->fm_extent_count; i++) {
+ __u64 phy_blk;
+ unsigned long ext_len;
+
+ phy_blk = fiemap->fm_extents[i].fe_offset / bs;
+ ext_len = fiemap->fm_extents[i].fe_length / bs;
+ if (logical_blk && (phy_blk != last_blk+1))
+ printf("Discontinuity: Block %llu is at %llu "
+ "(was %llu)\n", logical_blk, phy_blk,
+ last_blk);
+ logical_blk += ext_len;
+ last_blk = phy_blk + ext_len - 1;
+ if (fiemap->fm_extents[i].fe_flags & FIEMAP_EXTENT_EOF)
+ eof = 1;
+ }
+ fiemap->fm_start += fiemap->fm_length;
+ tot_extents += fiemap->fm_extent_count;
+ } while (0);
+
+ *num_extents = tot_extents;
+out:
+ return 0;
+}
+
#define EXT2_DIRECT 12

static void frag_report(const char *filename)
@@ -86,7 +178,7 @@ static void frag_report(const char *file
unsigned long block, last_block = 0, numblocks, i;
long bpib; /* Blocks per indirect block */
long cylgroups;
- int discont = 0, expected;
+ int num_extents = 0, expected;
int is_ext2 = 0;
unsigned int flags;

@@ -135,7 +227,8 @@ static void frag_report(const char *file
if (ioctl(fd, EXT3_IOC_GETFLAGS, &flags) < 0)
flags = 0;
if (flags & EXT4_EXTENTS_FL) {
- printf("File is stored in extents format\n");
+ if (verbose)
+ printf("File is stored in extents format\n");
is_ext2 = 0;
}
if (verbose)
@@ -148,32 +241,36 @@ static void frag_report(const char *file
printf("First block: %lu\nLast block: %lu\n",
get_bmap(fd, 0), get_bmap(fd, numblocks - 1));
}
- for (i=0; i < numblocks; i++) {
- if (is_ext2 && last_block) {
- if (((i-EXT2_DIRECT) % bpib) == 0)
- last_block++;
- if (((i-EXT2_DIRECT-bpib) % (bpib*bpib)) == 0)
- last_block++;
- if (((i-EXT2_DIRECT-bpib-bpib*bpib) % (bpib*bpib*bpib)) == 0)
- last_block++;
- }
- block = get_bmap(fd, i);
- if (block == 0)
- continue;
- if (last_block && (block != last_block +1) ) {
- if (verbose)
- printf("Discontinuity: Block %ld is at %lu (was %lu)\n",
- i, block, last_block);
- discont++;
+ if (is_ext2 || (filefrag_fiemap(fd, bs, &num_extents) != 0)) {
+ for (i = 0; i < numblocks; i++) {
+ if (is_ext2 && last_block) {
+ if (((i-EXT2_DIRECT) % bpib) == 0)
+ last_block++;
+ if (((i-EXT2_DIRECT-bpib) % (bpib*bpib)) == 0)
+ last_block++;
+ if (((i-EXT2_DIRECT-bpib-bpib*bpib) %
+ (bpib*bpib*bpib)) == 0)
+ last_block++;
+ }
+ block = get_bmap(fd, i);
+ if (block == 0)
+ continue;
+ if (last_block && (block != last_block+1) ) {
+ if (verbose)
+ printf("Discontinuity: Block %ld is at "
+ "%lu (was %lu)\n",
+ i, block, last_block+1);
+ num_extents++;
+ }
+ last_block = block;
}
- last_block = block;
}
- if (discont==0)
+ if (num_extents == 1)
printf("%s: 1 extent found", filename);
else
- printf("%s: %d extents found", filename, discont+1);
+ printf("%s: %d extents found", filename, num_extents);
expected = (numblocks/((bs*8)-(fsinfo.f_files/8/cylgroups)-3))+1;
- if (is_ext2 && expected != discont+1)
+ if (is_ext2 && expected != num_extents)
printf(", perfection would be %d extent%s\n", expected,
(expected>1) ? "s" : "");
else

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2008-02-02 08:56:10

by Andreas Dilger

Subject: Re: [PATCH][27/28] e2fsprogs-debugfs-supported_features.patch


Print out the currently supported features of e2fsprogs/libext2fs
via a new "debugfs supported_features" command.

Signed-off-by: Kalpak Shah <[email protected]>
Signed-off-by: Andreas Dilger <[email protected]>

Index: e2fsprogs-1.40.2/debugfs/debug_cmds.ct
===================================================================
--- e2fsprogs-1.40.2.orig/debugfs/debug_cmds.ct
+++ e2fsprogs-1.40.2/debugfs/debug_cmds.ct
@@ -154,5 +154,8 @@ request do_dump_unused, "Dump unused blo
request do_set_current_time, "Set current time to use when setting filesystme fields",
set_current_time;

+request do_supported_features, "Print features supported by this version of e2fsprogs",
+ supported_features;
+
end;

Index: e2fsprogs-1.40.2/debugfs/debugfs.c
===================================================================
--- e2fsprogs-1.40.2.orig/debugfs/debugfs.c
+++ e2fsprogs-1.40.2/debugfs/debugfs.c
@@ -1772,6 +1772,44 @@ void do_set_current_time(int argc, char
}
}

+void do_supported_features(int argc, char *argv[])
+{
+ FILE *out = stdout;
+ int i, j, ret;
+ __u32 supp[3] = { EXT2_LIB_FEATURE_COMPAT_SUPP,
+ EXT2_LIB_FEATURE_INCOMPAT_SUPP,
+ EXT2_LIB_FEATURE_RO_COMPAT_SUPP };
+ __u32 m;
+ int compat;
+ unsigned int feature_flag;
+
+ if (argc > 1) {
+ ret = e2p_string2feature(argv[1], &compat, &feature_flag);
+ if (ret)
+ goto err;
+
+ if (!(supp[compat] & feature_flag))
+ goto err;
+
+ fprintf(out, "Supported feature: %s\n", argv[1]);
+ } else {
+ fprintf(out, "Supported features:");
+ for (i = 0; i < 3; i++) {
+ for (j = 0, m = 1; j < 32; j++, m <<= 1) {
+ if (supp[i] & m)
+ fprintf(out, " %s",
+ e2p_feature2string(i, m));
+ }
+ }
+ fprintf(out, "\n");
+ }
+
+ return;
+
+err:
+ com_err(argv[0], 0, "Unknown feature: %s\n", argv[1]);
+}
+
static int source_file(const char *cmd_file, int sci_idx)
{
FILE *f;
Index: e2fsprogs-1.40.2/lib/ext2fs/ext2_fs.h
===================================================================
--- e2fsprogs-1.40.2.orig/lib/ext2fs/ext2_fs.h
+++ e2fsprogs-1.40.2/lib/ext2fs/ext2_fs.h
@@ -656,8 +656,7 @@ struct ext2_super_block {
#define EXT2_FEATURE_RO_COMPAT_SUPP (EXT2_FEATURE_RO_COMPAT_SPARSE_SUPER| \
EXT2_FEATURE_RO_COMPAT_LARGE_FILE| \
EXT4_FEATURE_RO_COMPAT_DIR_NLINK| \
- EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE| \
- EXT2_FEATURE_RO_COMPAT_BTREE_DIR)
+ EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE)

/*
* Default values for user and/or group using reserved blocks

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2008-02-02 08:57:37

by Andreas Dilger

Subject: Re: [PATCH][28/28] e2fsprogs-lts-make_rpms.patch


Allow "make rpm" to take some extra configure options from the build
environment without having to patch the code.

Build the tarball in a temporary directory instead of the e2fsprogs
source directory.

Signed-off-by: Michael MacDonald <[email protected]>
Signed-off-by: Andreas Dilger <[email protected]>

diff -Naur e2fsprogs-1.40.2/contrib/build-rpm e2fsprogs-1.40.2.new/contrib/build-rpm
--- e2fsprogs-1.40.2/contrib/build-rpm 2007-06-30 08:58:34.000000000 -0400
+++ e2fsprogs-1.40.2.new/contrib/build-rpm 2007-12-21 12:49:43.000000000 -0500
@@ -1,5 +1,10 @@
#!/bin/sh

+# enable xtrace output if requested
+if [ -n "${ENABLE_XTRACE:-}" ]; then
+ set -x
+fi
+
# Build an e2fsprogs RPM from cvs

pwd=`pwd`
@@ -8,8 +13,11 @@
pkgvers=`grep Version: e2fsprogs.spec | awk '{print $2;}'`
builddir=${pkgname}-${pkgvers}

+# ensure that $TMP is set to something
+TMP=${TMP:-'/tmp'}
+
cd ..
-tmpdir=`mktemp -d rpmtmp.XXXXXX`
+tmpdir=`mktemp -d ${RPM_TMPDIR:-$TMP}/rpmtmp.XXXXXX`

# We need to build a tarball for the SRPM using $builddir as the
# directory name (since that's what RPM will expect it to unpack
@@ -25,10 +33,13 @@
(cd $tmpdir && tar czfh ${builddir}.tar.gz $EXCLUDE $builddir)

[ "`rpmbuild --version 2> /dev/null`" ] && RPM=rpmbuild || RPM=rpm
-$RPM --define "_sourcedir `pwd`/$tmpdir" -ba $currdir/e2fsprogs.spec
-
-ret=$?
-rm -rf $tmpdir
-exit $?

+$RPM --define "_sourcedir $tmpdir" \
+ --define "_topdir ${RPM_TOPDIR:-$(rpm -E %_topdir)}" \
+ --define "_tmpdir ${RPM_TMPDIR:-$TMP}" \
+ --define "extra_config_flags ${EXTRA_CONFIG_FLAGS:-''}" \
+ -ba $currdir/e2fsprogs.spec

+rpm_exit=$?
+rm -rf $tmpdir
+exit $rpm_exit
diff -Naur e2fsprogs-1.40.2/e2fsprogs.spec.in e2fsprogs-1.40.2.new/e2fsprogs.spec.in
--- e2fsprogs-1.40.2/e2fsprogs.spec.in 2007-12-12 22:48:29.000000000 -0500
+++ e2fsprogs-1.40.2.new/e2fsprogs.spec.in 2007-12-21 12:39:11.000000000 -0500
@@ -50,7 +50,8 @@
%setup

%build
-%configure --enable-elf-shlibs --enable-nls
+%configure --enable-elf-shlibs --enable-nls \
+ %{?extra_config_flags:%extra_config_flags}
make

%install

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2008-02-11 04:19:15

by Theodore Ts'o

Subject: Re: [PATCH][0/28] Lustre e2fsprogs patch series

On Sat, Feb 02, 2008 at 12:59:43AM -0700, Andreas Dilger wrote:
> The following series of emails will contain the large part of the
> e2fsprogs patch series that is used for Lustre. It will not contain
> the regression tests for EXTENTS nor the DIR_NLINK features, as those
> are very large and were previously submitted.
>
> A full tarball that includes the patches, series, and regression tests
> will be uploaded to ftp://ftp.lustre.org/pub/lustre/other/e2fsprogs/

Hey Andreas,

I've applied these patches to the tip of "maint", and exported it as
"e2fsprogs-interim" on the e2fsprogs git repository. There were quite a
few patch conflicts, mostly due to some changes that had happened on
the tip of maint, but also apparently because your patchset was
missing the flex bg changes. I haven't applied them yet, but I'll
probably tack them at the end.

If you could sanity check to make sure they are sane, I would
appreciate it.

- Ted

2008-02-11 10:22:44

by Aneesh Kumar K.V

Subject: Re: [PATCH][0/28] Lustre e2fsprogs patch series

On Sun, Feb 10, 2008 at 11:19:12PM -0500, Theodore Tso wrote:
> On Sat, Feb 02, 2008 at 12:59:43AM -0700, Andreas Dilger wrote:
> > The following series of emails will contain the large part of the
> > e2fsprogs patch series that is used for Lustre. It will not contain
> > the regression tests for EXTENTS nor the DIR_NLINK features, as those
> > are very large and were previously submitted.
> >
> > A full tarball that includes the patches, series, and regression tests
> > will be uploaded to ftp://ftp.lustre.org/pub/lustre/other/e2fsprogs/
>
> Hey Andreas,
>
> I've applied these patches to the tip of "maint", and exported it as
> "e2fsprogs-interim" on the e2fsprogs git repository. There were quite a
> few patch conflicts, mostly due to some changes that had happened on
> the tip of maint, but also apparently because your patchset was
> missing the flex bg changes. I haven't applied them yet, but I'll
> probably tack them at the end.
>
> If you could sanity check to make sure they are sane, I would
> appreciate it.
>

I needed this diff to get it to build:

from ../../../lib/ext2fs/crc16.h:18,
from ../../../lib/ext2fs/crc16.c:9:
/usr/include/sys/types.h:46: error: conflicting types for 'loff_t'
/usr/include/linux/types.h:30: error: previous declaration of 'loff_t' was here
/usr/include/sys/types.h:62: error: conflicting types for 'dev_t'
/usr/include/linux/types.h:13: error: previous declaration of 'dev_t' was here
In file included from /usr/include/sys/types.h:133,
from /usr/include/stdlib.h:438,
from ../../../lib/ext2fs/crc16.h:18,
from ../../../lib/ext2fs/crc16.c:9


diff --git a/lib/ext2fs/crc16.c b/lib/ext2fs/crc16.c
index 5d87e10..246813f 100644
--- a/lib/ext2fs/crc16.c
+++ b/lib/ext2fs/crc16.c
@@ -5,7 +5,6 @@
* Version 2. See the file COPYING for more details.
*/

-#include <linux/types.h>
#include "crc16.h"

/** CRC table for the CRC-16. The poly is 0x8005 (x^16 + x^15 + x^2 + 1) */

2008-02-11 20:09:31

by Andreas Dilger

Subject: Re: [PATCH][0/28] Lustre e2fsprogs patch series

On Feb 10, 2008 23:19 -0500, Theodore Ts'o wrote:
> On Sat, Feb 02, 2008 at 12:59:43AM -0700, Andreas Dilger wrote:
> > The following series of emails will contain the large part of the
> > e2fsprogs patch series that is used for Lustre. It will not contain
> > the regression tests for EXTENTS nor the DIR_NLINK features, as those
> > are very large and were previously submitted.
>
> I've applied these patches to the tip of "maint", and exported it as
> "e2fsprogs-interim" on the e2fsprogs git repository. There were quite a
> few patch conflicts, mostly due to some changes that had happened on
> the tip of maint, but also apparently because your patchset was
> missing the flex bg changes. I haven't applied them yet, but I'll
> probably tack them at the end.

The patch was based on e2fsprogs-1.40.2. Also note that the majority
of patches are intended for upstream inclusion, with the exception of
the extents patches, which you are reworking.

> > A full tarball that includes the patches, series, and regression tests
> > will be uploaded to ftp://ftp.lustre.org/pub/lustre/other/e2fsprogs/

This didn't work out, as our FTP site was subsumed into a Sun-hosted
download site that week and I'm no longer able to make public-access
uploads. I have to find some other location to host that tarball.

> If you could sanity check to make sure they are sane, I would
> appreciate it.

I need to catch up from travelling last week, hopefully I'll get to
it by the end of this week.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2008-02-18 17:57:16

by Eric Sandeen

Subject: Re: [PATCH][7/28] e2fsprogs-extents.patch

Andreas Dilger wrote:
> Support for checking 32-bit extents format inodes and the INCOMPAT_EXTENTS
> feature.
>
> Clear the high 16 bits of extents and index entries, since the
> extents patches did not do this explicitly. Some parts of this
> code need fixing for checking > 32-bit block filesystems (when
> INCOMPAT_64BIT support is added), marked "FIXME: 48-bit support".
>
> Verify extent headers in blocks, logical ordering of extents,
> logical ordering of indexes.
>
> Add explicit checking of {d,t,}indirect and index blocks to detect
> corruption instead of implicitly doing this by checking the referred
> blocks and only block-at-a-time correctness. This avoids incorrectly
> invoking the very lengthy duplicate blocks pass for bad indirect/index
> blocks. We may want to tune the "threshold" for how many errors make
> a "bad" indirect/index block.
>
> Add ability to split or remove extents in order to allow extent
> reallocation during the duplicate blocks pass.
>
...


> @@ -904,21 +910,75 @@ void e2fsck_pass1(e2fsck_t ctx)
> ctx->fs_sockets_count++;
> } else
> mark_inode_bad(ctx, ino);
> - if (inode->i_block[EXT2_IND_BLOCK])
> - ctx->fs_ind_count++;
> - if (inode->i_block[EXT2_DIND_BLOCK])
> - ctx->fs_dind_count++;
> - if (inode->i_block[EXT2_TIND_BLOCK])
> - ctx->fs_tind_count++;
> - if (inode->i_block[EXT2_IND_BLOCK] ||
> - inode->i_block[EXT2_DIND_BLOCK] ||
> - inode->i_block[EXT2_TIND_BLOCK] ||
> - inode->i_file_acl) {
> - inodes_to_process[process_inode_count].ino = ino;
> - inodes_to_process[process_inode_count].inode = *inode;
> - process_inode_count++;
> - } else
> - check_blocks(ctx, &pctx, block_buf);
> +
> + eh = (struct ext3_extent_header *)inode->i_block;
> + if ((inode->i_flags & EXT4_EXTENTS_FL)) {
> + if ((LINUX_S_ISREG(inode->i_mode) ||
> + LINUX_S_ISDIR(inode->i_mode)) &&

So this trips up on things like sockets, fifos, and block & char nodes.

Also this is unhappy:

> @@ -137,7 +141,7 @@ int e2fsck_pass1_check_device_inode(ext2
> * If the index flag is set, then this is a bogus
> * device/fifo/socket
> */
> - if (inode->i_flags & EXT2_INDEX_FL)
> + if (inode->i_flags & (EXT2_INDEX_FL | EXT4_EXTENTS_FL))
> return 0;

Do we really care if these have the extents flag set? IOW should we
make sure the kernel doesn't set the flag, or should we make e2fsck not
care...

There are enough checks in e2fsck to show the intent was that these
files should not have the extents flag set, but I'm not sure why it
matters enough that the kernel needs to run around being sure to clear
it....

Or... (rambling on now) it seems odd to me that zero-length files have
the extents flag set at all; should we only set extents when we actually
get a block allocated to the file? That would also take care of this
from the kernel side I think.

-Eric

2008-02-18 18:13:09

by Eric Sandeen

Subject: Re: [PATCH][7/28] e2fsprogs-extents.patch

Eric Sandeen wrote:
> So this trips up on things like sockets, fifos, and block & char nodes.

Just to demonstrate; doing this on ext4:

mknod mnt/block b 1 1
mknod mnt/char c 1 1
mknod mnt/fifo p
mksock mnt/sock

mkdir -p mnt/verylongdir12345678901234567890/verylongdir12345678901234567890/verylongdir12345678901234567890
ln -s mnt/verylongdir12345678901234567890/verylongdir12345678901234567890/verylongdir12345678901234567890 mnt/longlink

yields an unhappy fsck w/ e2fsprogs-interim:

e2fsck 1.40.6 (09-Feb-2008)
Pass 1: Checking inodes, blocks, and sizes
Inode 12 has EXTENT_FL set, but is not in extents format
Fix? no

Inode 13 has EXTENT_FL set, but is not in extents format
Fix? no

Inode 14 has EXTENT_FL set, but is not in extents format
Fix? no

Inode 15 has EXTENT_FL set, but is not in extents format
Fix? no

Inode 17 has EXTENT_FL set, but is not in extents format
Fix? no

Pass 2: Checking directory structure
Inode 12 (/block) is an illegal block device.
Clear? no

Inode 13 (/char) is an illegal character device.
Clear? no

Inode 14 (/fifo) is an illegal FIFO.
Clear? no

Inode 15 (/sock) is an illegal socket.
Clear? no

Symlink /longlink (inode #17) is invalid.
Clear? no

Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

-Eric

2008-02-18 19:53:44

by Theodore Ts'o

Subject: Re: [PATCH][7/28] e2fsprogs-extents.patch

On Mon, Feb 18, 2008 at 11:56:53AM -0600, Eric Sandeen wrote:
> So this trips up on things like sockets, fifos, and block & char nodes.
>
> Also this is unhappy:
>
> > @@ -137,7 +141,7 @@ int e2fsck_pass1_check_device_inode(ext2
> > * If the index flag is set, then this is a bogus
> > * device/fifo/socket
> > */
> > - if (inode->i_flags & EXT2_INDEX_FL)
> > + if (inode->i_flags & (EXT2_INDEX_FL | EXT4_EXTENTS_FL))
> > return 0;
>
> Do we really care if these have the extents flag set? IOW should we
> make sure the kernel doesn't set the flag, or should we make e2fsck not
> care...

<Sigh>

I think we need to get kernel patches into mainline ASAP not to set
the EXTENTS_FL --- be conservative in what you send --- and at least
for now, e2fsck needs to accept (and not complain or core dump) if
EXTENTS_FL is set for files where ext2fs_inode_has_valid_blocks()
returns false --- be liberal in what you accept.

Eventually, after the kernel patches hit mainline, we could change
e2fsck to automatically fix all of these in preen mode, just for
cleanliness' sake.

- Ted

2008-02-18 20:48:41

by Eric Sandeen

Subject: Re: [PATCH][7/28] e2fsprogs-extents.patch

Theodore Tso wrote:
> On Mon, Feb 18, 2008 at 11:56:53AM -0600, Eric Sandeen wrote:
>> So this trips up on things like sockets, fifos, and block & char nodes.
>>
>> Also this is unhappy:
>>
>>> @@ -137,7 +141,7 @@ int e2fsck_pass1_check_device_inode(ext2
>>> * If the index flag is set, then this is a bogus
>>> * device/fifo/socket
>>> */
>>> - if (inode->i_flags & EXT2_INDEX_FL)
>>> + if (inode->i_flags & (EXT2_INDEX_FL | EXT4_EXTENTS_FL))
>>> return 0;
>> Do we really care if these have the extents flag set? IOW should we
>> make sure the kernel doesn't set the flag, or should we make e2fsck not
>> care...
>
> <Sigh>
>
> I think we need to get kernel patches into mainline ASAP not to set
> the EXTENTS_FL

You mean on devices/fifos/sockets ? Ok.

But today, with 2.6.25-rc1 and e2fsprogs-interim, long (non-fast)
symlinks get clobbered by e2fsck, because:

Pass 1: Checking inodes, blocks, and sizes
Inode 12 has EXTENT_FL set, but is not in extents format
Fix? yes

Inode 12 has illegal block(s). Clear? yes

Illegal block #0 (127754) in inode 12. CLEARED.
Inode 12 is too big. Truncate? yes

Block #1 (4) causes symlink to be too big. CLEARED.
Block #4 (1) causes symlink to be too big. CLEARED.
Block #5 (4772) causes symlink to be too big. CLEARED.
Inode 12, i_blocks is 2, should be 0. Fix? yes

Pass 2: Checking directory structure
Symlink /longlink (inode #12) is invalid.
Clear? yes

Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: -4772
Fix? yes

Free blocks count wrong for group #0 (3420, counted=3421).
Fix? yes

Free blocks count wrong (26192, counted=26193).
Fix? yes

and *poof* it's gone. That one concerns me more... This *should* be in
extents format, right, even though it's limited to one block...

> and at least
> for now, e2fsck needs to accept (and not complain or core dump) if
> EXTENTS_FL is set for files where ext2fs_inode_has_valid_blocks()
> returns false

well, if any filetypes are not supposed to have the extents flag set,
and they're zero-length, I'd say go ahead & clear it, and even complain
if you like - it's the design intent after all - I wouldn't worry about
the noise at this stage. FWIW, I haven't seen a core dump. :)

-Eric

2008-02-18 22:09:44

by Theodore Ts'o

Subject: Re: [PATCH][7/28] e2fsprogs-extents.patch

On Mon, Feb 18, 2008 at 02:48:18PM -0600, Eric Sandeen wrote:
> > I think we need to get kernel patches into mainline ASAP not to set
> > the EXTENTS_FL
>
> You mean on devices/fifos/sockets ? Ok.

Yes, sorry for not being explicit.

>
> But today, with 2.6.25-rc1 and e2fsprogs-interim, long (non-fast)
> symlinks get clobbered by e2fsck, because:
>
> Pass 1: Checking inodes, blocks, and sizes
> Inode 12 has EXTENT_FL set, but is not in extents format
> Fix? yes

Yeah, my current development branch of e2fsprogs does the right thing,
but e2fsprogs-interim doesn't. We need to add a test to make sure
ext2fs_inode_has_valid_blocks(inode) before marking the inode bad and
asking the user if the inode should be cleared.
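
As a rough sketch of the ordering described above (not the actual e2fsck
code), the complaint would only be raised when the inode really owns
blocks. Everything prefixed sk_/SK_ is a hypothetical stand-in: the struct
is a cut-down inode, sk_has_valid_blocks() is a simplified imitation of
ext2fs_inode_has_valid_blocks() that ignores i_file_acl, and the fields
are assumed to be already in host byte order.

```c
#include <assert.h>
#include <stdint.h>

#define SK_EXTENTS_FL 0x00080000u /* same value as EXT4_EXTENTS_FL */
#define SK_EXT_MAGIC  0xF30Au     /* extent header magic */
#define SK_IFMT       0170000u
#define SK_IFREG      0100000u
#define SK_IFDIR      0040000u
#define SK_IFLNK      0120000u

/* Hypothetical, cut-down inode: only the fields this sketch needs. */
struct sk_inode {
	uint16_t i_mode;
	uint32_t i_flags;
	uint32_t i_blocks;
	uint32_t i_block[15];
};

/* Simplified stand-in for ext2fs_inode_has_valid_blocks(). */
int sk_has_valid_blocks(const struct sk_inode *inode)
{
	unsigned type;

	assert(inode);
	type = inode->i_mode & SK_IFMT;
	/* devices, fifos and sockets never own data blocks */
	if (type != SK_IFREG && type != SK_IFDIR && type != SK_IFLNK)
		return 0;
	/* fast symlink: the target lives in i_block[] itself */
	if (type == SK_IFLNK && inode->i_blocks == 0)
		return 0;
	return 1;
}

/* Return 1 only when it makes sense to ask about a bogus EXTENTS_FL. */
int sk_extent_flag_is_bogus(const struct sk_inode *inode)
{
	assert(inode);
	if (!(inode->i_flags & SK_EXTENTS_FL))
		return 0;
	/* eh_magic is the low 16 bits of the first i_block word */
	if ((inode->i_block[0] & 0xFFFFu) == SK_EXT_MAGIC)
		return 0;
	/* no valid blocks: tolerate the stray flag, do not mark bad */
	return sk_has_valid_blocks(inode);
}
```

With this ordering a block device that inherited the flag is left alone,
while a regular file with the flag set and no extent header is still
reported.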

> and *poof* it's gone. That one concerns me more... This *should* be in
> extents format, right, even though it's limited to one block...

Well, for symlinks, they are only one block, so there is no reason for
it to be using the extent format. So storing it as a single block
number makes a lot more sense. It should just not be setting the
EXTENTS_FL flag.

> well, if any filetypes are not supposed to have the extents flag set,
> and they're zero-length, I'd say go ahead & clear it, and even complain
> if you like - it's the design intent after all - I wouldn't worry about
> the noise at this stage. FWIW, I haven't seen a core dump. :)

The current pu branch core dumps. My development branch has at least
that problem fixed. :-)

- Ted

2008-02-19 04:35:49

by Andreas Dilger

Subject: Re: [PATCH][7/28] e2fsprogs-extents.patch

On Feb 18, 2008 11:56 -0600, Eric Sandeen wrote:
> > @@ -904,21 +910,75 @@ void e2fsck_pass1(e2fsck_t ctx)
> > + eh = (struct ext3_extent_header *)inode->i_block;
> > + if ((inode->i_flags & EXT4_EXTENTS_FL)) {
> > + if ((LINUX_S_ISREG(inode->i_mode) ||
> > + LINUX_S_ISDIR(inode->i_mode)) &&
>
> So this trips up on things like sockets, fifos, and block & char nodes.

Hrm, not impossible, since Lustre only uses extent-based filesystems for
regular file storage.

> Also this is unhappy:
>
> > @@ -137,7 +141,7 @@ int e2fsck_pass1_check_device_inode(ext2
> > * If the index flag is set, then this is a bogus
> > * device/fifo/socket
> > */
> > - if (inode->i_flags & EXT2_INDEX_FL)
> > + if (inode->i_flags & (EXT2_INDEX_FL | EXT4_EXTENTS_FL))
> > return 0;
>
> Do we really care if these have the extents flag set? IOW should we
> make sure the kernel doesn't set the flag, or should we make e2fsck not
> care...

The Lustre extents patches clear the EXT4_EXTENTS_FL always (i.e. they
are never set on directories) so we've never seen these problems.

> There are enough checks in e2fsck to show the intent was that these
> files should not have the extents flag set, but I'm not sure why it
> matters enough that the kernel needs to run around being sure to clear
> it....
>
> Or... (rambling on now) it seems odd to me that zero-length files have
> the extents flag set at all; should we only set extents when we actually
> get a block allocated to the file? That would also take care of this
> from the kernel side I think.

Yes, I'd be for e2fsck clearing this flag, but as I mentioned in the
concall, I think it is better to have the kernel just stop inheriting
all flags from the parent directory, or possibly just have a fixed
range of flags that are being propagated to child inodes.
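
A minimal sketch of the "fixed range of flags" idea, not taken from any
kernel patch: the child gets only the parent flags that fall inside an
explicit mask. The flag values match the ext2/ext4 on-disk definitions,
but the SK_ names, the choice of which flags go into SK_INHERIT_MASK, and
sk_child_flags() are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

#define SK_SYNC_FL    0x00000008u
#define SK_NODUMP_FL  0x00000040u
#define SK_NOATIME_FL 0x00000080u
#define SK_INDEX_FL   0x00001000u
#define SK_EXTENTS_FL 0x00080000u

/* Hypothetical whitelist of flags a child may inherit. */
#define SK_INHERIT_MASK (SK_SYNC_FL | SK_NODUMP_FL | SK_NOATIME_FL)

/* Compute the initial flags of a new inode from its parent directory. */
uint32_t sk_child_flags(uint32_t parent_flags)
{
	/* format-describing flags must never be in the whitelist */
	assert((SK_INHERIT_MASK & (SK_EXTENTS_FL | SK_INDEX_FL)) == 0);
	return parent_flags & SK_INHERIT_MASK;
}
```

With a mask like this, a device node or symlink created in an
extent-mapped directory would not pick up the extents flag, and the
kernel would then set it only where it creates an extent tree.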

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2008-03-15 19:41:37

by Theodore Ts'o

Subject: Re: [PATCH][10/28] e2fsprogs-uninit.patch

What is the intended use of the SF_DO_CSUM flag? I see where it is
defined, and where it gets set, but as far as I can tell nothing
actually tests for it or uses it.

- Ted


On Sat, Feb 02, 2008 at 01:34:44AM -0700, Andreas Dilger wrote:
> Index: e2fsprogs-1.40.5/lib/ext2fs/ext2fs.h
>
> +#define EXT2_SF_DO_CSUM 0x0020
>

> Index: e2fsprogs-1.40.5/lib/ext2fs/inode.c
> ===================================================================
> --- e2fsprogs-1.40.5.orig/lib/ext2fs/inode.c
> +++ e2fsprogs-1.40.5/lib/ext2fs/inode.c
> @@ -167,6 +167,9 @@ errcode_t ext2fs_open_inode_scan(ext2_fi
> if (EXT2_HAS_COMPAT_FEATURE(fs->super,
> EXT2_FEATURE_COMPAT_LAZY_BG))
> scan->scan_flags |= EXT2_SF_DO_LAZY;
> + if (EXT2_HAS_RO_COMPAT_FEATURE(fs->super,
> + EXT4_FEATURE_RO_COMPAT_GDT_CSUM))
> + scan->scan_flags |= EXT2_SF_DO_LAZY | EXT2_SF_DO_CSUM;


2008-03-16 00:35:25

by Andreas Dilger

Subject: Re: [PATCH][10/28] e2fsprogs-uninit.patch

On Mar 15, 2008 15:41 -0400, Theodore Ts'o wrote:
> What is the intended use of the SF_DO_CSUM flag? I see where it is
> defined, and where it gets set, but as far as I can tell nothing
> actually tests for it or uses it.

Probably a hold-over from a previous version of the code. It sets the
SF_DO_LAZY flag if GDT_CSUM is set, causing ext2fs_get_next_inode_full()
to skip the BG_INODE_UNINIT groups. It's up to you if it would be better
to keep SF_DO_CSUM and check for it explicitly (possibly using it for
something else later), or to overload SF_DO_LAZY as we do currently.
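
To make the two options concrete, here is a small stand-alone sketch; it
is not the libext2fs code. The SK_ names and helper functions are
hypothetical, 0x0020 is the SF_DO_CSUM value from the patch, and the
other values are assumed to mirror ext2fs.h. The first skip test relies
on SF_DO_LAZY being set whenever GDT_CSUM is enabled (the overloading
done today), the second tests the CSUM flag explicitly.

```c
#include <assert.h>

#define SK_SF_DO_LAZY      0x0010
#define SK_SF_DO_CSUM      0x0020 /* value from the patch */
#define SK_BG_INODE_UNINIT 0x0001

/* Mirror of the flag setup in the quoted ext2fs_open_inode_scan() hunk. */
int sk_scan_flags(int has_lazy_bg, int has_gdt_csum)
{
	int flags = 0;

	if (has_lazy_bg)
		flags |= SK_SF_DO_LAZY;
	if (has_gdt_csum)
		flags |= SK_SF_DO_LAZY | SK_SF_DO_CSUM;
	return flags;
}

/* Option 1: overload DO_LAZY; DO_CSUM is never tested. */
int sk_skip_group_overloaded(int scan_flags, unsigned bg_flags)
{
	assert(scan_flags >= 0);
	return (scan_flags & SK_SF_DO_LAZY) &&
	       (bg_flags & SK_BG_INODE_UNINIT);
}

/* Option 2: keep DO_CSUM and check for it explicitly. */
int sk_skip_group_explicit(int scan_flags, unsigned bg_flags)
{
	assert(scan_flags >= 0);
	return (scan_flags & (SK_SF_DO_LAZY | SK_SF_DO_CSUM)) &&
	       (bg_flags & SK_BG_INODE_UNINIT);
}
```

As long as GDT_CSUM always sets both flags, the two tests give the same
answer; the explicit form only matters if SF_DO_CSUM is later set on its
own or given a different meaning.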

> On Sat, Feb 02, 2008 at 01:34:44AM -0700, Andreas Dilger wrote:
> > Index: e2fsprogs-1.40.5/lib/ext2fs/ext2fs.h
> >
> > +#define EXT2_SF_DO_CSUM 0x0020
> >
>
> > Index: e2fsprogs-1.40.5/lib/ext2fs/inode.c
> > ===================================================================
> > --- e2fsprogs-1.40.5.orig/lib/ext2fs/inode.c
> > +++ e2fsprogs-1.40.5/lib/ext2fs/inode.c
> > @@ -167,6 +167,9 @@ errcode_t ext2fs_open_inode_scan(ext2_fi
> > if (EXT2_HAS_COMPAT_FEATURE(fs->super,
> > EXT2_FEATURE_COMPAT_LAZY_BG))
> > scan->scan_flags |= EXT2_SF_DO_LAZY;
> > + if (EXT2_HAS_RO_COMPAT_FEATURE(fs->super,
> > + EXT4_FEATURE_RO_COMPAT_GDT_CSUM))
> > + scan->scan_flags |= EXT2_SF_DO_LAZY | EXT2_SF_DO_CSUM;

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


2008-03-17 12:33:51

by Theodore Ts'o

Subject: Re: [PATCH][10/28] e2fsprogs-uninit.patch

On Sat, Feb 02, 2008 at 01:34:44AM -0700, Andreas Dilger wrote:
> Index: e2fsprogs-1.40.5/e2fsck/super.c
> ===================================================================
> @@ -626,6 +631,50 @@ void check_super_block(e2fsck_t ctx)
...
> + if (!ext2fs_group_desc_csum_verify(sb, i, gd)) {
> + if (fix_problem(ctx, PR_0_GDT_CSUM, &pctx)) {
> + gd->bg_flags &= ~(EXT2_BG_BLOCK_UNINIT |
> + EXT2_BG_INODE_UNINIT);
> + gd->bg_itable_unused = 0;
> + }
> + ext2fs_unmark_valid(fs);
> + }
> +
...
> +
> + gd->bg_checksum = ext2fs_group_desc_csum(fs->super, i, gd);

This last looks horribly wrong. check_super_block() is merely
supposed to check to see if the superblock and block group descriptors
look OK, and to mark the filesystem as invalid if anything looks
insane. It should *not* be modifying the block group descriptor, and it
certainly should not be doing so without first checking to see if the
filesystem has been opened read-only or calling
ext2fs_mark_super_dirty(fs) after making a change (that it shouldn't
do).

In fact there already is a check to see if the checksum has verified
correctly (see the first part of the patch which I quoted), so I think
the best thing to do is to remove that last bit. I'm not sure why
it's there at all, in fact....
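
Roughly, a verify-only pass could look like the sketch below. It is an
illustration under loud assumptions, not e2fsck code: the sk_ struct and
functions are hypothetical, and sk_toy_csum() is a placeholder that is
NOT the real group descriptor checksum. The point is only that the
descriptors are taken as const, so the check can count mismatches but
cannot rewrite bg_checksum.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, cut-down group descriptor. */
struct sk_group_desc {
	uint16_t bg_flags;
	uint16_t bg_itable_unused;
	uint16_t bg_checksum;
};

/* Placeholder checksum for the sketch; NOT the real algorithm. */
uint16_t sk_toy_csum(uint32_t group, const struct sk_group_desc *gd)
{
	assert(gd);
	return (uint16_t)(group * 31u + gd->bg_flags * 7u +
			  gd->bg_itable_unused);
}

/*
 * Count descriptors whose stored checksum does not verify.  Nothing is
 * written back; the caller decides whether to mark the fs invalid.
 */
uint32_t sk_count_bad_groups(const struct sk_group_desc *gds,
			     uint32_t ngroups)
{
	uint32_t i, bad = 0;

	for (i = 0; i < ngroups; i++)
		if (gds[i].bg_checksum != sk_toy_csum(i, &gds[i]))
			bad++;
	return bad;
}
```

A non-zero return would correspond to the ext2fs_unmark_valid(fs) path in
the quoted hunk; any repair would belong behind fix_problem() and a
read-only check, not in the verification loop.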

- Ted