Yikes. This patch series has sprawled quite a lot since October.

The first 15 patches are miscellaneous fixes. Half of these are the
same patches from October, but there are a few new ones: wording
tweaks, a cap on the number of logical blocks in a file to limits that
the library can handle, and a patch to disable changing the UUID on a
mounted checksumming filesystem because tune2fs and the kernel can
race to write out new (non-identical) GDT checksums, leading to fs
corruption. There's also a patch to prevent mapping blocks into files
at too high addresses, a fix to make big symlinks safe for extents and
64bit block numbers, fixes for resize2fs moving inodes on checksumming
filesystems, and a bunch of other 64-bit safety fixes.

Patches 16-30 fix some problems that the Coverity scanner found. Most
of these are resource handling errors -- file descriptors that don't
get closed, memory that doesn't get freed in error paths (or is
incorrectly freed), etc. It's true that in many cases the resource
leaks happen on the way towards exit()/abort(), but this isn't always
true, and a little defensive programming never hurt anyone.

Patches 31-35 fix some library problems that running xfstests atop
fuse2fs exposed. There are still problems with the fileio calls
returning stale data, and an extent tree corruption bug hiding
somewhere in libext2fs, but I decided to work on those after pushing
out this patch set.

Patches 36-37 fix some problems I encountered when creating 1k-block
bigalloc filesystems. The kernel still complains of block group
corruption, but I will address that separately.

Patches 38-41 address the BLOCK_UNINIT discussion that I've been
having with Akira Fujita and Ted. They shift responsibility for
calculating the block bitmap for a BLOCK_UNINIT group out of the block
allocation routines into the bitmap loading routines. With this in
place, the testb command in debugfs (and therefore the
ext2fs_test_block_bitmap() function) returns correct results.

Patches 42-48 fix some more bigalloc bugs. These are the same patches
from October, plus an additional fix to e2fsck and some resize2fs
cluster handling fixes.

Patch 49 enables block_validity for new filesystems. See patch 77 for
a performance microbenchmark.

Patches 50-51 enhance ext2fs_bmap2() to allow the creation of
uninitialized extents. The functionality is already there; really the
patch just adds a flag to indicate uninitialized. There's also a patch
to the fileio routines to handle uninitialized extents. fuse2fs will
use this for fallocate.

Patches 52-54 add to resize2fs the ability to convert a filesystem to
and from 64bit mode. These patches are mostly unchanged from October.

Patches 55-60 implement a new API to iterate and edit extended
attributes. The inline_data and fuse2fs patchsets both use this
feature. For those who have been watching the inline_data patchset, I
rolled the minor bugfix patches into the first patch.

Patch 61 extends ext2fs_open() to support 64bit superblock numbers.
Ted wondered (back in October) if anyone else wanted to add anything
to that call. I can't think of anything, and nobody else has spoken
up.

Patches 62-65 implement fuse2fs, a FUSE server based on libext2fs.
Primarily I've been using it to shake out bugs in the library via
xfstests and the metadata checksumming test program. It can also be
used to mount ext4 on any OS supporting FUSE, and it can mount
64k-block filesystems on x86, though I'd be wary of using rw mode.
fuse2fs depends on these new APIs: xattr editing, uninit extent
handling, and the 64-bit openfs call.

Patches 66-74 provide the metadata checksumming test script. Its
primary advantage over 'make check' is that it allows one to specify a
variety of different mkfs and mount options. It's also growing more
tests as a result of exercising fuse2fs.

I've tested these e2fsprogs changes against the -next branch as of
12/9. These days, I use an 8GB ramdisk and a 20T "disk" I constructed
out of dm-snapshot to test in an x64 VM. The make check tests should
pass.

Comments and questions are, as always, welcome.

--D
On a FS with a rather large blocksize (> 4K), the old block map
structure can construct a fat enough "tree" (or whatever we call that
lopsided thing) that (at least in theory) one could create mappings
for logical blocks higher than 32 bits. In practice this doesn't
happen, but the 'max' and 'iter' variables that the punch helpers use
will overflow because the BLOCK_SIZE_BITS shifts are too large to fit
in a 32-bit variable. The current variable declarations also cause
punch to fail on TIND-mapped blocks even if the file is < 16T. So
enlarge the fields to fit.
Yes, this is an obscure corner case, but it seems a little silly if we
can't punch a file's block 300,000,000 on a 64k-block filesystem.
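To make the arithmetic concrete, here is a minimal standalone sketch (not
the library code) of the shift that ind_punch() computes. The function
names blocks_per_entry() and map_level_for_block() are made up for this
illustration, and the sketch assumes 12 direct blocks and 4-byte block
pointers, as in the classic block map.

```c
#include <stdint.h>

/*
 * Logical blocks covered by one pointer at a given indirection level,
 * computed in 64 bits.  block_size_bits is log2(blocksize); level is
 * 0 (direct) through 3 (TIND).  Hypothetical helper for illustration.
 */
uint64_t blocks_per_entry(unsigned block_size_bits, unsigned level)
{
	return 1ULL << ((block_size_bits - 2) * level);
}

/*
 * Which mapping level holds logical block lblk: 0 for direct blocks,
 * 1..3 for IND/DIND/TIND, -1 if the block map cannot reach it.
 */
int map_level_for_block(unsigned block_size_bits, uint64_t lblk)
{
	uint64_t covered = 12;	/* number of direct block pointers */

	for (int level = 0; level < 4; level++) {
		if (lblk < covered)
			return level;
		lblk -= covered;
		covered = blocks_per_entry(block_size_bits, level + 1);
	}
	return -1;
}
```

With 64k blocks (block_size_bits = 16) a single TIND entry spans 2^42
blocks, which no 32-bit variable can hold, and block 300,000,000 lands
at level 3, which is the case the commit message complains about.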
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ext2fs/punch.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/lib/ext2fs/punch.c b/lib/ext2fs/punch.c
index 4471f46..790a0ad8 100644
--- a/lib/ext2fs/punch.c
+++ b/lib/ext2fs/punch.c
@@ -50,15 +50,16 @@ static errcode_t ind_punch(ext2_filsys fs, struct ext2_inode *inode,
blk_t start, blk_t count, int max)
{
errcode_t retval;
- blk_t b, offset;
- int i, incr;
+ blk_t b;
+ int i;
+ blk64_t offset, incr;
int freed = 0;
#ifdef PUNCH_DEBUG
printf("Entering ind_punch, level %d, start %u, count %u, "
"max %d\n", level, start, count, max);
#endif
- incr = 1 << ((EXT2_BLOCK_SIZE_BITS(fs->super)-2)*level);
+ incr = 1ULL << ((EXT2_BLOCK_SIZE_BITS(fs->super)-2)*level);
for (i=0, offset=0; i < max; i++, p++, offset += incr) {
if (offset >= start + count)
break;
@@ -87,7 +88,7 @@ static errcode_t ind_punch(ext2_filsys fs, struct ext2_inode *inode,
continue;
}
#ifdef PUNCH_DEBUG
- printf("Freeing block %u (offset %d)\n", b, offset);
+ printf("Freeing block %u (offset %llu)\n", b, offset);
#endif
ext2fs_block_alloc_stats(fs, b, -1);
*p = 0;
@@ -108,7 +109,7 @@ static errcode_t ext2fs_punch_ind(ext2_filsys fs, struct ext2_inode *inode,
int num = EXT2_NDIR_BLOCKS;
blk_t *bp = inode->i_block;
blk_t addr_per_block;
- blk_t max = EXT2_NDIR_BLOCKS;
+ blk64_t max = EXT2_NDIR_BLOCKS;
if (!block_buf) {
retval = ext2fs_get_array(3, fs->blocksize, &buf);
@@ -119,10 +120,10 @@ static errcode_t ext2fs_punch_ind(ext2_filsys fs, struct ext2_inode *inode,
addr_per_block = (blk_t) fs->blocksize >> 2;
- for (level=0; level < 4; level++, max *= addr_per_block) {
+ for (level = 0; level < 4; level++, max *= (blk64_t)addr_per_block) {
#ifdef PUNCH_DEBUG
printf("Main loop level %d, start %u count %u "
- "max %d num %d\n", level, start, count, max, num);
+ "max %llu num %d\n", level, start, count, max, num);
#endif
if (start < max) {
retval = ind_punch(fs, inode, block_buf, bp, level,
For each site where we test for a large file (> 2GB) and set the
LARGE_FILE feature, use a helper function to make the size test
consistent with the test that's in e2fsck. This fixes the fsck
complaints when we try to create a 2GB journal (not so hard with 64k
block size) and fixes the incorrect test in fileio.c.
v2: Fix another site in e2fsck/pass2.c that Zheng Liu pointed out.
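For illustration only, the three size tests that the diff below touches
can be compared side by side in a standalone sketch. The function names
here are invented for the comparison; only the constants are taken from
the patch.

```c
#include <stdint.h>

/* Same comparison as the new ext2fs_needs_large_file_feature() helper. */
int needs_large_file(uint64_t file_size)
{
	return file_size >= 0x80000000ULL;
}

/* The test removed from fileio.c: note the constant has seven F's. */
int old_fileio_test(uint64_t file_size)
{
	return file_size > 0x7FFFFFFULL;
}

/* The test removed from mkjournal.c: only looks at the high 32 bits. */
int old_mkjournal_test(uint64_t file_size)
{
	return (file_size >> 32) != 0;
}
```

The old fileio.c test fires at 128MB, and the old mkjournal.c test stays
quiet for a journal of exactly 2GB, which is the mismatch with e2fsck
described above.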
Reviewed-by: Zheng Liu <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>
---
e2fsck/pass1.c | 3 ++-
e2fsck/pass2.c | 3 ++-
lib/ext2fs/ext2fs.h | 6 ++++++
lib/ext2fs/fileio.c | 2 +-
lib/ext2fs/mkjournal.c | 2 +-
5 files changed, 12 insertions(+), 4 deletions(-)
diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
index 9a5dac7..821239e 100644
--- a/e2fsck/pass1.c
+++ b/e2fsck/pass1.c
@@ -2281,7 +2281,8 @@ static void check_blocks(e2fsck_t ctx, struct problem_context *pctx,
}
pctx->num = 0;
}
- if (LINUX_S_ISREG(inode->i_mode) && EXT2_I_SIZE(inode) >= 0x80000000UL)
+ if (LINUX_S_ISREG(inode->i_mode) &&
+ ext2fs_needs_large_file_feature(EXT2_I_SIZE(inode)))
ctx->large_files++;
if ((pb.num_blocks != ext2fs_inode_i_blocks(fs, inode)) ||
((fs->super->s_feature_ro_compat &
diff --git a/e2fsck/pass2.c b/e2fsck/pass2.c
index 81a0f4b..804770b 100644
--- a/e2fsck/pass2.c
+++ b/e2fsck/pass2.c
@@ -1318,7 +1318,8 @@ static void deallocate_inode(e2fsck_t ctx, ext2_ino_t ino, char* block_buf)
if (!ext2fs_inode_has_valid_blocks2(fs, &inode))
goto clear_inode;
- if (LINUX_S_ISREG(inode.i_mode) && EXT2_I_SIZE(&inode) >= 0x80000000UL)
+ if (LINUX_S_ISREG(inode.i_mode) &&
+ ext2fs_needs_large_file_feature(EXT2_I_SIZE(&inode)))
ctx->large_files--;
del_block.ctx = ctx;
diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index 654247a..64e498f 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -646,6 +646,12 @@ static inline int ext2fs_has_group_desc_csum(ext2_filsys fs)
EXT4_FEATURE_RO_COMPAT_METADATA_CSUM);
}
+/* The LARGE_FILE feature should be set if we have stored files 2GB+ in size */
+static inline int ext2fs_needs_large_file_feature(unsigned long long file_size)
+{
+ return file_size >= 0x80000000ULL;
+}
+
/* alloc.c */
extern errcode_t ext2fs_new_inode(ext2_filsys fs, ext2_ino_t dir, int mode,
ext2fs_inode_bitmap map, ext2_ino_t *ret);
diff --git a/lib/ext2fs/fileio.c b/lib/ext2fs/fileio.c
index 02e6263..6b213b5 100644
--- a/lib/ext2fs/fileio.c
+++ b/lib/ext2fs/fileio.c
@@ -400,7 +400,7 @@ errcode_t ext2fs_file_set_size2(ext2_file_t file, ext2_off64_t size)
/* If we're writing a large file, set the large_file flag */
if (LINUX_S_ISREG(file->inode.i_mode) &&
- EXT2_I_SIZE(&file->inode) > 0x7FFFFFFULL &&
+ ext2fs_needs_large_file_feature(EXT2_I_SIZE(&file->inode)) &&
(!EXT2_HAS_RO_COMPAT_FEATURE(file->fs->super,
EXT2_FEATURE_RO_COMPAT_LARGE_FILE) ||
file->fs->super->s_rev_level == EXT2_GOOD_OLD_REV)) {
diff --git a/lib/ext2fs/mkjournal.c b/lib/ext2fs/mkjournal.c
index c636a97..2afd3b7 100644
--- a/lib/ext2fs/mkjournal.c
+++ b/lib/ext2fs/mkjournal.c
@@ -378,7 +378,7 @@ static errcode_t write_journal_inode(ext2_filsys fs, ext2_ino_t journal_ino,
inode_size = (unsigned long long)fs->blocksize * num_blocks;
inode.i_size = inode_size & 0xFFFFFFFF;
inode.i_size_high = (inode_size >> 32) & 0xFFFFFFFF;
- if (inode.i_size_high)
+ if (ext2fs_needs_large_file_feature(inode_size))
fs->super->s_feature_ro_compat |=
EXT2_FEATURE_RO_COMPAT_LARGE_FILE;
ext2fs_iblk_add_blocks(fs, &inode, es.newblocks);
mke2fs has a series of checks to ensure that we don't create a
filesystem too big for its blocksize -- if auto-64bit is on, then it
turns on 64bit; otherwise it complains. Unfortunately, it performs
these checks before looking in mke2fs.conf for a blocksize, which
means that the checks are incorrect if the user specifies a non-4096
blocksize in the config file and says nothing on the command line.
The bug also has the effect of mandating a 4k block size on any block
device larger than 4T in that situation. Therefore, read the block
size from the config file before performing the 64bit checks.
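A rough sketch of the unit change, assuming the block count starts out in
1k units and that MAX_32_NUM is 2^32 - 1 (redefined locally here). The
helper names rescale_blocks() and exceeds_32bit() are hypothetical and
do not exist in mke2fs.

```c
#include <stdint.h>

#define MAX_32_NUM ((1ULL << 32) - 1)	/* assumed value, local to this sketch */

/* Convert a block count expressed in 1k units to units of blocksize. */
uint64_t rescale_blocks(uint64_t count_1k, unsigned blocksize)
{
	return count_1k / (blocksize / 1024);
}

/* Would this device need 64bit block numbers at the given blocksize? */
int exceeds_32bit(uint64_t count_1k, unsigned blocksize)
{
	return rescale_blocks(count_1k, blocksize) > MAX_32_NUM;
}
```

A 5T device is 5368709120 blocks in 1k units. That overflows 32 bits at a
1k blocksize but not at 4k, so the answer to the 64bit question depends
on knowing the final blocksize before the check runs.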
Reviewed-by: Zheng Liu <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>
---
misc/mke2fs.c | 134 +++++++++++++++++++++++++++++++--------------------------
1 file changed, 72 insertions(+), 62 deletions(-)
diff --git a/misc/mke2fs.c b/misc/mke2fs.c
index 67c9225..19b6e85 100644
--- a/misc/mke2fs.c
+++ b/misc/mke2fs.c
@@ -1298,6 +1298,21 @@ static void PRS(int argc, char *argv[])
char * fs_type = 0;
char * usage_types = 0;
blk64_t dev_size;
+ /*
+ * NOTE: A few words about fs_blocks_count and blocksize:
+ *
+ * Initially, blocksize is set to zero, which implies 1024.
+ * If -b is specified, blocksize is updated to the user's value.
+ *
+ * Next, the device size or the user's "blocks" command line argument
+ * is used to set fs_blocks_count; the units are blocksize.
+ *
+ * Later, if blocksize hasn't been set and the profile specifies a
+ * blocksize, then blocksize is updated and fs_blocks_count is scaled
+ * appropriately. Note the change in units!
+ *
+ * Finally, we complain about fs_blocks_count > 2^32 on a non-64bit fs.
+ */
blk64_t fs_blocks_count = 0;
#ifdef __linux__
struct utsname ut;
@@ -1780,15 +1795,67 @@ profile_error:
}
}
+ /* Get the hardware sector sizes, if available */
+ retval = ext2fs_get_device_sectsize(device_name, &lsector_size);
+ if (retval) {
+ com_err(program_name, retval,
+ _("while trying to determine hardware sector size"));
+ exit(1);
+ }
+ retval = ext2fs_get_device_phys_sectsize(device_name, &psector_size);
+ if (retval) {
+ com_err(program_name, retval,
+ _("while trying to determine physical sector size"));
+ exit(1);
+ }
+
+ tmp = getenv("MKE2FS_DEVICE_SECTSIZE");
+ if (tmp != NULL)
+ lsector_size = atoi(tmp);
+ tmp = getenv("MKE2FS_DEVICE_PHYS_SECTSIZE");
+ if (tmp != NULL)
+ psector_size = atoi(tmp);
+
+ /* Older kernels may not have physical/logical distinction */
+ if (!psector_size)
+ psector_size = lsector_size;
+
+ if (blocksize <= 0) {
+ use_bsize = get_int_from_profile(fs_types, "blocksize", 4096);
+
+ if (use_bsize == -1) {
+ use_bsize = sys_page_size;
+ if ((linux_version_code < (2*65536 + 6*256)) &&
+ (use_bsize > 4096))
+ use_bsize = 4096;
+ }
+ if (lsector_size && use_bsize < lsector_size)
+ use_bsize = lsector_size;
+ if ((blocksize < 0) && (use_bsize < (-blocksize)))
+ use_bsize = -blocksize;
+ blocksize = use_bsize;
+ fs_blocks_count /= (blocksize / 1024);
+ } else {
+ if (blocksize < lsector_size) { /* Impossible */
+ com_err(program_name, EINVAL,
+ _("while setting blocksize; too small "
+ "for device\n"));
+ exit(1);
+ } else if ((blocksize < psector_size) &&
+ (psector_size <= sys_page_size)) { /* Suboptimal */
+ fprintf(stderr, _("Warning: specified blocksize %d is "
+ "less than device physical sectorsize %d\n"),
+ blocksize, psector_size);
+ }
+ }
+
+ fs_param.s_log_block_size =
+ int_log2(blocksize >> EXT2_MIN_BLOCK_LOG_SIZE);
+
/*
* We now need to do a sanity check of fs_blocks_count for
* 32-bit vs 64-bit block number support.
*/
- if ((fs_blocks_count > MAX_32_NUM) && (blocksize == 0)) {
- fs_blocks_count /= 4; /* Try using a 4k blocksize */
- blocksize = 4096;
- fs_param.s_log_block_size = 2;
- }
if ((fs_blocks_count > MAX_32_NUM) &&
!(fs_param.s_feature_incompat & EXT4_FEATURE_INCOMPAT_64BIT) &&
get_bool_from_profile(fs_types, "auto_64-bit_support", 0)) {
@@ -1880,63 +1947,6 @@ profile_error:
if ((fs_param.s_feature_incompat & EXT2_FEATURE_INCOMPAT_META_BG) &&
((tmp = getenv("MKE2FS_FIRST_META_BG"))))
fs_param.s_first_meta_bg = atoi(tmp);
-
- /* Get the hardware sector sizes, if available */
- retval = ext2fs_get_device_sectsize(device_name, &lsector_size);
- if (retval) {
- com_err(program_name, retval,
- _("while trying to determine hardware sector size"));
- exit(1);
- }
- retval = ext2fs_get_device_phys_sectsize(device_name, &psector_size);
- if (retval) {
- com_err(program_name, retval,
- _("while trying to determine physical sector size"));
- exit(1);
- }
-
- if ((tmp = getenv("MKE2FS_DEVICE_SECTSIZE")) != NULL)
- lsector_size = atoi(tmp);
- if ((tmp = getenv("MKE2FS_DEVICE_PHYS_SECTSIZE")) != NULL)
- psector_size = atoi(tmp);
-
- /* Older kernels may not have physical/logical distinction */
- if (!psector_size)
- psector_size = lsector_size;
-
- if (blocksize <= 0) {
- use_bsize = get_int_from_profile(fs_types, "blocksize", 4096);
-
- if (use_bsize == -1) {
- use_bsize = sys_page_size;
- if ((linux_version_code < (2*65536 + 6*256)) &&
- (use_bsize > 4096))
- use_bsize = 4096;
- }
- if (lsector_size && use_bsize < lsector_size)
- use_bsize = lsector_size;
- if ((blocksize < 0) && (use_bsize < (-blocksize)))
- use_bsize = -blocksize;
- blocksize = use_bsize;
- ext2fs_blocks_count_set(&fs_param,
- ext2fs_blocks_count(&fs_param) /
- (blocksize / 1024));
- } else {
- if (blocksize < lsector_size) { /* Impossible */
- com_err(program_name, EINVAL,
- _("while setting blocksize; too small "
- "for device\n"));
- exit(1);
- } else if ((blocksize < psector_size) &&
- (psector_size <= sys_page_size)) { /* Suboptimal */
- fprintf(stderr, _("Warning: specified blocksize %d is "
- "less than device physical sectorsize %d\n"),
- blocksize, psector_size);
- }
- }
Use the new ext2fs_punch() call to truncate the quota file. This also
eliminates the need to fix the old block-release code to work with
bigalloc.
Reviewed-by: Zheng Liu <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/quota/quotaio.c | 19 +++----------------
1 file changed, 3 insertions(+), 16 deletions(-)
diff --git a/lib/quota/quotaio.c b/lib/quota/quotaio.c
index 8ddb92a..1bdcba6 100644
--- a/lib/quota/quotaio.c
+++ b/lib/quota/quotaio.c
@@ -98,19 +98,6 @@ void update_grace_times(struct dquot *q)
}
}
-static int release_blocks_proc(ext2_filsys fs, blk64_t *blocknr,
- e2_blkcnt_t blockcnt EXT2FS_ATTR((unused)),
- blk64_t ref_block EXT2FS_ATTR((unused)),
- int ref_offset EXT2FS_ATTR((unused)),
- void *private EXT2FS_ATTR((unused)))
-{
- blk64_t block;
-
- block = *blocknr;
- ext2fs_block_alloc_stats2(fs, block, -1);
- return 0;
-}
-
static int compute_num_blocks_proc(ext2_filsys fs, blk64_t *blocknr,
e2_blkcnt_t blockcnt EXT2FS_ATTR((unused)),
blk64_t ref_block EXT2FS_ATTR((unused)),
@@ -135,9 +122,9 @@ errcode_t quota_inode_truncate(ext2_filsys fs, ext2_ino_t ino)
inode.i_dtime = fs->now ? fs->now : time(0);
if (!ext2fs_inode_has_valid_blocks2(fs, &inode))
return 0;
The help text for debugfs' init_filesys command is incorrect; the
second parameter is the size of the filesystem in blocks, not the size
of an individual filesystem block. There is in fact no way to set
the block size from this command.
Reported-by: Zheng Liu <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>
---
debugfs/debugfs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/debugfs/debugfs.c b/debugfs/debugfs.c
index 4ecf474..ea0f2c4 100644
--- a/debugfs/debugfs.c
+++ b/debugfs/debugfs.c
@@ -280,7 +280,7 @@ void do_init_filesys(int argc, char **argv)
int err;
if (common_args_process(argc, argv, 3, 3, "initialize",
- "<device> <blocksize>", CHECK_FS_NOTOPEN))
+ "<device> <blocks>", CHECK_FS_NOTOPEN))
return;
memset(&param, 0, sizeof(struct ext2_super_block));
The old uninit_bg checksums depend on the UUID, so prohibit changes to
the UUID if a checksumming filesystem is mounted, because this
introduces a nasty race where the kernel and tune2fs are both trying
to rewrite group descriptors at the same time, with different ideas
about what the UUID is.
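As a toy illustration of why the UUID matters here: if a group
descriptor checksum is a CRC seeded over the UUID, the group number and
the descriptor bytes, then two writers holding different UUIDs compute
different checksums for the same descriptor. The sketch below uses a
generic bitwise CRC-16 and a made-up toy_group_csum() function; it is
not the exact libext2fs checksum routine or on-disk layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Generic bitwise CRC-16 (reflected polynomial 0xA001). */
uint16_t crc16_update(uint16_t crc, const uint8_t *buf, size_t len)
{
	while (len--) {
		crc ^= *buf++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc & 1) ? (uint16_t)((crc >> 1) ^ 0xA001)
					: (uint16_t)(crc >> 1);
	}
	return crc;
}

/* Toy checksum: CRC over UUID, little-endian group number, descriptor. */
uint16_t toy_group_csum(const uint8_t uuid[16], uint32_t group,
			const uint8_t *desc, size_t desc_len)
{
	uint8_t grp[4] = {
		(uint8_t)(group & 0xFF), (uint8_t)((group >> 8) & 0xFF),
		(uint8_t)((group >> 16) & 0xFF), (uint8_t)((group >> 24) & 0xFF)
	};
	uint16_t crc = crc16_update(0xFFFF, uuid, 16);

	crc = crc16_update(crc, grp, sizeof(grp));
	return crc16_update(crc, desc, desc_len);
}
```

If tune2fs writes descriptors checksummed with the new UUID while the
kernel is still writing them with the old one, whichever copy reaches
disk last may not match the UUID in the superblock.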
Signed-off-by: Darrick J. Wong <[email protected]>
---
misc/tune2fs.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/misc/tune2fs.c b/misc/tune2fs.c
index 1ae0ee6..111a43d 100644
--- a/misc/tune2fs.c
+++ b/misc/tune2fs.c
@@ -2653,8 +2653,7 @@ retry_open:
int set_csum = 0;
dgrp_t i;
- if (EXT2_HAS_RO_COMPAT_FEATURE(fs->super,
- EXT4_FEATURE_RO_COMPAT_METADATA_CSUM)) {
+ if (ext2fs_has_group_desc_csum(fs)) {
/*
* Changing the UUID requires rewriting all metadata,
* which can race with a mounted fs. Don't allow that.
Tweak the wording to be a little less ambiguous, since 'block' can be
a noun or a verb.
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ext2fs/ext2_err.et.in | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/ext2fs/ext2_err.et.in b/lib/ext2fs/ext2_err.et.in
index 9cc1bd1..2486321 100644
--- a/lib/ext2fs/ext2_err.et.in
+++ b/lib/ext2fs/ext2_err.et.in
@@ -480,6 +480,6 @@ ec EXT2_ET_BLOCK_BITMAP_CSUM_INVALID,
"Block bitmap checksum does not match bitmap"
ec EXT2_ET_INLINE_DATA_CANT_ITERATE,
- "Cannot block iterate on an inode containing inline data"
+ "Cannot iterate data blocks of an inode containing inline data"
end
Forbid clients from trying to map logical block numbers that are
larger than the lblk->pblk data structures are capable of handling.
While we're at it, don't let clients set the file size to a number
that's beyond what can be mapped.
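For a feel of the numbers, here is a standalone restatement of the limit
check that the diff below adds. It takes the blocksize and an extents
flag directly instead of an ext2_filsys and an inode, and the function
names are invented for this sketch.

```c
#include <stdint.h>

/* Blocks reachable via 12 direct pointers plus IND, DIND and TIND. */
uint64_t max_blockmap_blocks(unsigned blocksize)
{
	uint64_t addr_per_block = blocksize >> 2;	/* 4-byte pointers */

	return 12 + addr_per_block +
	       addr_per_block * addr_per_block +
	       addr_per_block * addr_per_block * addr_per_block;
}

/* Nonzero if logical block 'offset' cannot be mapped. */
int offset_too_big(unsigned blocksize, int uses_extents, uint64_t offset)
{
	/* Same 2^32 - 1 cutoff that the patch applies to every file. */
	if (offset >= (1ULL << 32) - 1)
		return 1;
	if (uses_extents)
		return 0;
	return offset >= max_blockmap_blocks(blocksize);
}
```

On a 1k-block filesystem a block-mapped file tops out at 16843020
blocks, so the block map is the binding limit; with 64k blocks the TIND
level alone spans 2^42 blocks and the 2^32 - 1 cutoff applies first.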
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ext2fs/bmap.c | 24 ++++++++++++++++++++++++
lib/ext2fs/ext2fsP.h | 4 ++++
lib/ext2fs/fileio.c | 3 +++
3 files changed, 31 insertions(+)
diff --git a/lib/ext2fs/bmap.c b/lib/ext2fs/bmap.c
index 5074587..32788f6 100644
--- a/lib/ext2fs/bmap.c
+++ b/lib/ext2fs/bmap.c
@@ -238,6 +238,27 @@ got_block:
return 0;
}
+int ext2fs_file_block_offset_too_big(ext2_filsys fs,
+ struct ext2_inode *inode,
+ blk64_t offset)
+{
+ blk64_t addr_per_block, max_map_block;
+
+ /* Kernel seems to cut us off at 4294967294 blocks */
+ if (offset >= (1ULL << 32) - 1)
+ return 1;
+
+ if (inode->i_flags & EXT4_EXTENTS_FL)
+ return 0;
+
+ addr_per_block = fs->blocksize >> 2;
+ max_map_block = addr_per_block;
+ max_map_block += addr_per_block * addr_per_block;
+ max_map_block += addr_per_block * addr_per_block * addr_per_block;
+ max_map_block += 12;
+
+ return offset >= max_map_block;
+}
errcode_t ext2fs_bmap2(ext2_filsys fs, ext2_ino_t ino, struct ext2_inode *inode,
char *block_buf, int bmap_flags, blk64_t block,
@@ -266,6 +287,9 @@ errcode_t ext2fs_bmap2(ext2_filsys fs, ext2_ino_t ino, struct ext2_inode *inode,
}
addr_per_block = (blk_t) fs->blocksize >> 2;
+ if (ext2fs_file_block_offset_too_big(fs, inode, block))
+ return EXT2_ET_FILE_TOO_BIG;
+
if (!block_buf) {
retval = ext2fs_get_array(2, fs->blocksize, &buf);
if (retval)
diff --git a/lib/ext2fs/ext2fsP.h b/lib/ext2fs/ext2fsP.h
index 80d2d0a..8c7983b 100644
--- a/lib/ext2fs/ext2fsP.h
+++ b/lib/ext2fs/ext2fsP.h
@@ -158,3 +158,7 @@ extern errcode_t ext2fs_get_generic_bmap_range(ext2fs_generic_bitmap bitmap,
extern void ext2fs_warn_bitmap32(ext2fs_generic_bitmap bitmap,const char *func);
extern int ext2fs_mem_is_zero(const char *mem, size_t len);
+
+int ext2fs_file_block_offset_too_big(ext2_filsys fs,
+ struct ext2_inode *inode,
+ blk64_t offset);
diff --git a/lib/ext2fs/fileio.c b/lib/ext2fs/fileio.c
index 6b213b5..a6bcbe7 100644
--- a/lib/ext2fs/fileio.c
+++ b/lib/ext2fs/fileio.c
@@ -392,6 +392,9 @@ errcode_t ext2fs_file_set_size2(ext2_file_t file, ext2_off64_t size)
EXT2_CHECK_MAGIC(file, EXT2_ET_MAGIC_EXT2_FILE);
+ if (size && ext2fs_file_block_offset_too_big(file->fs, &file->inode,
+ (size - 1) / file->fs->blocksize))
+ return EXT2_ET_FILE_TOO_BIG;
truncate_block = ((size + file->fs->blocksize - 1) >>
EXT2_BLOCK_SIZE_BITS(file->fs->super));
old_size = EXT2_I_SIZE(&file->inode);
'an block' should be 'a block'. Missed the read case in the first patch.
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ext2fs/ext2_err.et.in | 2 +-
po/ca.po | 2 +-
po/cs.po | 2 +-
po/de.po | 2 +-
po/e2fsprogs.pot | 2 +-
po/es.po | 2 +-
po/fr.po | 2 +-
po/id.po | 2 +-
po/it.po | 2 +-
po/nl.po | 2 +-
po/pl.po | 2 +-
po/sv.po | 2 +-
po/tr.po | 2 +-
po/vi.po | 2 +-
po/zh_CN.po | 2 +-
15 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/lib/ext2fs/ext2_err.et.in b/lib/ext2fs/ext2_err.et.in
index 2486321..93a1106 100644
--- a/lib/ext2fs/ext2_err.et.in
+++ b/lib/ext2fs/ext2_err.et.in
@@ -99,7 +99,7 @@ ec EXT2_ET_BLOCK_BITMAP_WRITE,
"Can't write a block bitmap"
ec EXT2_ET_BLOCK_BITMAP_READ,
- "Can't read an block bitmap"
+ "Can't read a block bitmap"
ec EXT2_ET_INODE_TABLE_WRITE,
"Can't write an inode table"
diff --git a/po/ca.po b/po/ca.po
index 0d8f36a..10e87ea 100644
--- a/po/ca.po
+++ b/po/ca.po
@@ -6015,7 +6015,7 @@ msgstr "en escriure el mapa de bits dels blocs"
#: lib/ext2fs/ext2_err.c:41
#, fuzzy
-msgid "Can't read an block bitmap"
+msgid "Can't read a block bitmap"
msgstr "en escriure el mapa de bits dels blocs"
#: lib/ext2fs/ext2_err.c:42
diff --git a/po/cs.po b/po/cs.po
index fe34399..2ea495e 100644
--- a/po/cs.po
+++ b/po/cs.po
@@ -5976,7 +5976,7 @@ msgid "Can't write a block bitmap"
msgstr "Bitmapu bloků nelze zapsat"
#: lib/ext2fs/ext2_err.c:41
-msgid "Can't read an block bitmap"
+msgid "Can't read a block bitmap"
msgstr "Bitmapu bloků nelze přečíst"
#: lib/ext2fs/ext2_err.c:42
diff --git a/po/de.po b/po/de.po
index b7f71c2..d1ed005 100644
--- a/po/de.po
+++ b/po/de.po
@@ -5980,7 +5980,7 @@ msgid "Can't write a block bitmap"
msgstr "Die Block-Bitmap kann nicht geschrieben werden"
#: lib/ext2fs/ext2_err.c:41
-msgid "Can't read an block bitmap"
+msgid "Can't read a block bitmap"
msgstr "Die Block-Bitmap kann nicht gelesen werden"
#: lib/ext2fs/ext2_err.c:42
diff --git a/po/e2fsprogs.pot b/po/e2fsprogs.pot
index 0e63d47..3ce9e14 100644
--- a/po/e2fsprogs.pot
+++ b/po/e2fsprogs.pot
@@ -5644,7 +5644,7 @@ msgid "Can't write a block bitmap"
msgstr ""
#: lib/ext2fs/ext2_err.c:41
-msgid "Can't read an block bitmap"
+msgid "Can't read a block bitmap"
msgstr ""
#: lib/ext2fs/ext2_err.c:42
diff --git a/po/es.po b/po/es.po
index f47480b..b9be136 100644
--- a/po/es.po
+++ b/po/es.po
@@ -6186,7 +6186,7 @@ msgstr "leyendo los mapas de bits del nodo-i y del bloque"
#: lib/ext2fs/ext2_err.c:41
#, fuzzy
-msgid "Can't read an block bitmap"
+msgid "Can't read a block bitmap"
msgstr "leyendo los mapas de bits del nodo-i y del bloque"
#: lib/ext2fs/ext2_err.c:42
diff --git a/po/fr.po b/po/fr.po
index 602441e..69c13d4 100644
--- a/po/fr.po
+++ b/po/fr.po
@@ -5991,7 +5991,7 @@ msgid "Can't write a block bitmap"
msgstr "Ne peut écrire un bitmap de blocs"
#: lib/ext2fs/ext2_err.c:41
-msgid "Can't read an block bitmap"
+msgid "Can't read a block bitmap"
msgstr "Ne peut lire un bitmap de blocs"
#: lib/ext2fs/ext2_err.c:42
diff --git a/po/id.po b/po/id.po
index fb4e8fb..61ccea4 100644
--- a/po/id.po
+++ b/po/id.po
@@ -6032,7 +6032,7 @@ msgstr "membaca inode dan blok bitmap"
#: lib/ext2fs/ext2_err.c:41
#, fuzzy
-msgid "Can't read an block bitmap"
+msgid "Can't read a block bitmap"
msgstr "membaca inode dan blok bitmap"
#: lib/ext2fs/ext2_err.c:42
diff --git a/po/it.po b/po/it.po
index ba8ba61..7ba7d71 100644
--- a/po/it.po
+++ b/po/it.po
@@ -6093,7 +6093,7 @@ msgstr "lettura delle mappe di bit inode e blocco"
#: lib/ext2fs/ext2_err.c:41
#, fuzzy
-msgid "Can't read an block bitmap"
+msgid "Can't read a block bitmap"
msgstr "lettura delle mappe di bit inode e blocco"
#: lib/ext2fs/ext2_err.c:42
diff --git a/po/nl.po b/po/nl.po
index b98cb32..74e337f 100644
--- a/po/nl.po
+++ b/po/nl.po
@@ -5986,7 +5986,7 @@ msgid "Can't write a block bitmap"
msgstr "Kan een blok-bitkaart niet schrijven"
#: lib/ext2fs/ext2_err.c:41
-msgid "Can't read an block bitmap"
+msgid "Can't read a block bitmap"
msgstr "Kan een blok-bitkaart niet lezen"
#: lib/ext2fs/ext2_err.c:42
diff --git a/po/pl.po b/po/pl.po
index 62fac2e..d7d2498 100644
--- a/po/pl.po
+++ b/po/pl.po
@@ -5950,7 +5950,7 @@ msgid "Can't write a block bitmap"
msgstr "Nie można zapisać bitmapy bloków"
#: lib/ext2fs/ext2_err.c:41
-msgid "Can't read an block bitmap"
+msgid "Can't read a block bitmap"
msgstr "Nie mo?na odczyta? bitmapy blok?w"
#: lib/ext2fs/ext2_err.c:42
diff --git a/po/sv.po b/po/sv.po
index d89370e..53e9da6 100644
--- a/po/sv.po
+++ b/po/sv.po
@@ -5956,7 +5956,7 @@ msgid "Can't write a block bitmap"
msgstr "Kan inte skriva en blockbitkarta"
#: lib/ext2fs/ext2_err.c:41
-msgid "Can't read an block bitmap"
+msgid "Can't read a block bitmap"
msgstr "Kan inte läsa en blockbitkarta"
#: lib/ext2fs/ext2_err.c:42
diff --git a/po/tr.po b/po/tr.po
index c677675..69d6787 100644
--- a/po/tr.po
+++ b/po/tr.po
@@ -6357,7 +6357,7 @@ msgstr "düğüm ve blok biteşlemleri okunuyor"
#: lib/ext2fs/ext2_err.c:41
#, fuzzy
-msgid "Can't read an block bitmap"
+msgid "Can't read a block bitmap"
msgstr "düğüm ve blok biteşlemleri okunuyor"
#: lib/ext2fs/ext2_err.c:42
diff --git a/po/vi.po b/po/vi.po
index 736542e..a56afa5 100644
--- a/po/vi.po
+++ b/po/vi.po
@@ -5933,7 +5933,7 @@ msgid "Can't write a block bitmap"
msgstr "Không thể ghi mảng ảnh khối"
#: lib/ext2fs/ext2_err.c:41
-msgid "Can't read an block bitmap"
+msgid "Can't read a block bitmap"
msgstr "Không thể đọc mảng ảnh khối"
#: lib/ext2fs/ext2_err.c:42
diff --git a/po/zh_CN.po b/po/zh_CN.po
index 77f4a22..f7ddd31 100644
--- a/po/zh_CN.po
+++ b/po/zh_CN.po
@@ -5649,7 +5649,7 @@ msgstr "块位图"
#: lib/ext2fs/ext2_err.c:41
#, fuzzy
-msgid "Can't read an block bitmap"
+msgid "Can't read a block bitmap"
msgstr "块位图"
#: lib/ext2fs/ext2_err.c:42
We should really use the ext2fs memory allocator functions in
copy_file(), and we really should return a value if there are
allocation problems.
Also fix up a minor bogosity in an error message.
v2: Fix the buffer free paths too.
Reviewed-by: Zheng Liu <[email protected]>
Cc: Robert Yang <[email protected]>
Cc: Darren Hart <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>
---
debugfs/debugfs.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/debugfs/debugfs.c b/debugfs/debugfs.c
index ea0f2c4..c5f8a1f 100644
--- a/debugfs/debugfs.c
+++ b/debugfs/debugfs.c
@@ -1602,16 +1602,17 @@ static errcode_t copy_file(int fd, ext2_ino_t newfile, int bufsize, int make_hol
if (retval)
return retval;
- if (!(buf = (char *) malloc(bufsize))){
- com_err("copy_file", errno, "can't allocate buffer\n");
- return;
+ retval = ext2fs_get_mem(bufsize, &buf);
+ if (retval) {
+ com_err("copy_file", retval, "can't allocate buffer\n");
+ return retval;
}
/* This is used for checking whether the whole block is zero */
retval = ext2fs_get_memzero(bufsize, &zero_buf);
if (retval) {
com_err("copy_file", retval, "can't allocate buffer\n");
- free(buf);
+ ext2fs_free_mem(&buf);
return retval;
}
@@ -1649,13 +1650,13 @@ static errcode_t copy_file(int fd, ext2_ino_t newfile, int bufsize, int make_hol
ptr += written;
}
}
- free(buf);
+ ext2fs_free_mem(&buf);
ext2fs_free_mem(&zero_buf);
retval = ext2fs_file_close(e2_file);
return retval;
fail:
- free(buf);
+ ext2fs_free_mem(&buf);
ext2fs_free_mem(&zero_buf);
(void) ext2fs_file_close(e2_file);
return retval;
@@ -2113,7 +2114,7 @@ void do_bmap(int argc, char *argv[])
errcode = ext2fs_bmap2(current_fs, ino, 0, 0, 0, blk, 0, &pblk);
if (errcode) {
- com_err("argv[0]", errcode,
+ com_err(argv[0], errcode,
"while mapping logical block %llu\n", blk);
return;
}
metadata_csum implies uninit_bg, and in fact forces the bit off for
rocompat with older implementations. Therefore, to detect the
presence of checksums, we should use the predicate function to decide
if group descriptor checksums are turned on, not open-coded flag
tests.
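A small sketch of the difference, using what I believe are the on-disk
ro_compat bit values (0x0010 for GDT_CSUM, 0x0400 for METADATA_CSUM);
treat the constants and the function names as assumptions made for this
illustration rather than the library's definitions.

```c
#include <stdint.h>

#define RO_COMPAT_GDT_CSUM	0x0010u	/* assumed value of the uninit_bg bit */
#define RO_COMPAT_METADATA_CSUM	0x0400u	/* assumed value of the metadata_csum bit */

/* Predicate-style test: either feature means descriptors are checksummed. */
int has_group_desc_csum(uint32_t ro_compat)
{
	return (ro_compat &
		(RO_COMPAT_GDT_CSUM | RO_COMPAT_METADATA_CSUM)) != 0;
}

/* Open-coded test of the kind this patch removes. */
int open_coded_gdt_test(uint32_t ro_compat)
{
	return (ro_compat & RO_COMPAT_GDT_CSUM) != 0;
}
```

On a metadata_csum filesystem the GDT_CSUM bit is clear, so the
open-coded test reports no checksums even though the group descriptors
carry them.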
Signed-off-by: Darrick J. Wong <[email protected]>
---
misc/e2image.c | 4 +---
resize/resize2fs.c | 4 +---
2 files changed, 2 insertions(+), 6 deletions(-)
diff --git a/misc/e2image.c b/misc/e2image.c
index aa363fb..25d8d4e 100644
--- a/misc/e2image.c
+++ b/misc/e2image.c
@@ -349,9 +349,7 @@ static void mark_table_blocks(ext2_filsys fs)
ext2fs_inode_table_loc(fs, i)) {
unsigned int end = (unsigned) fs->inode_blocks_per_group;
/* skip unused blocks */
- if (!output_is_blk &&
- EXT2_HAS_RO_COMPAT_FEATURE(fs->super,
- EXT4_FEATURE_RO_COMPAT_GDT_CSUM))
+ if (!output_is_blk && ext2fs_has_group_desc_csum(fs))
end -= (ext2fs_bg_itable_unused(fs, i) /
EXT2_INODES_PER_BLOCK(fs->super));
for (j = 0, b = ext2fs_inode_table_loc(fs, i);
diff --git a/resize/resize2fs.c b/resize/resize2fs.c
index 0feff0f..fa4fe46 100644
--- a/resize/resize2fs.c
+++ b/resize/resize2fs.c
@@ -668,9 +668,7 @@ static errcode_t adjust_superblock(ext2_resize_t rfs, blk64_t new_size)
* supports lazy inode initialization, we can skip
* initializing the inode table.
*/
- if (lazy_itable_init &&
- EXT2_HAS_RO_COMPAT_FEATURE(fs->super,
- EXT4_FEATURE_RO_COMPAT_GDT_CSUM)) {
+ if (lazy_itable_init && ext2fs_has_group_desc_csum(fs)) {
retval = 0;
goto errout;
}
If we have to create a big symlink (i.e. one that doesn't fit into
i_block[]), we are not 64bit block safe and the namei code does not
handle extents at all. Fix both.
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ext2fs/namei.c | 9 +++++++--
lib/ext2fs/symlink.c | 20 ++++++--------------
2 files changed, 13 insertions(+), 16 deletions(-)
diff --git a/lib/ext2fs/namei.c b/lib/ext2fs/namei.c
index efcc02b..307aecc 100644
--- a/lib/ext2fs/namei.c
+++ b/lib/ext2fs/namei.c
@@ -34,6 +34,7 @@ static errcode_t follow_link(ext2_filsys fs, ext2_ino_t root, ext2_ino_t dir,
char *buffer = 0;
errcode_t retval;
struct ext2_inode ei;
+ blk64_t blk;
#ifdef NAMEI_DEBUG
printf("follow_link: root=%lu, dir=%lu, inode=%lu, lc=%d\n",
@@ -49,12 +50,16 @@ static errcode_t follow_link(ext2_filsys fs, ext2_ino_t root, ext2_ino_t dir,
if (link_count++ >= EXT2FS_MAX_NESTED_LINKS)
return EXT2_ET_SYMLINK_LOOP;
- /* FIXME-64: Actually, this is FIXME EXTENTS */
if (ext2fs_inode_data_blocks(fs,&ei)) {
+ retval = ext2fs_bmap2(fs, inode, &ei, NULL, 0, 0, NULL, &blk);
+ if (retval)
+ return retval;
+
retval = ext2fs_get_mem(fs->blocksize, &buffer);
if (retval)
return retval;
- retval = io_channel_read_blk(fs->io, ei.i_block[0], 1, buffer);
+
+ retval = io_channel_read_blk64(fs->io, blk, 1, buffer);
if (retval) {
ext2fs_free_mem(&buffer);
return retval;
diff --git a/lib/ext2fs/symlink.c b/lib/ext2fs/symlink.c
index e943412..ad80444 100644
--- a/lib/ext2fs/symlink.c
+++ b/lib/ext2fs/symlink.c
@@ -91,14 +91,12 @@ errcode_t ext2fs_symlink(ext2_filsys fs, ext2_ino_t parent, ext2_ino_t ino,
memset(block_buf, 0, fs->blocksize);
strcpy(block_buf, target);
if (fs->super->s_feature_incompat &
- EXT3_FEATURE_INCOMPAT_EXTENTS) {
+ EXT3_FEATURE_INCOMPAT_EXTENTS) {
/*
* The extent bmap is setup after the inode and block
* have been written out below.
*/
inode.i_flags |= EXT4_EXTENTS_FL;
- } else {
- inode.i_block[0] = blk;
}
}
@@ -112,20 +110,14 @@ errcode_t ext2fs_symlink(ext2_filsys fs, ext2_ino_t parent, ext2_ino_t ino,
goto cleanup;
if (!fastlink) {
- retval = io_channel_write_blk(fs->io, blk, 1, block_buf);
+ retval = ext2fs_bmap2(fs, ino, &inode, NULL, BMAP_SET, 0, NULL,
+ &blk);
if (retval)
goto cleanup;
- if (fs->super->s_feature_incompat &
- EXT3_FEATURE_INCOMPAT_EXTENTS) {
- retval = ext2fs_extent_open2(fs, ino, &inode, &handle);
- if (retval)
- goto cleanup;
- retval = ext2fs_extent_set_bmap(handle, 0, blk, 0);
- ext2fs_extent_free(handle);
- if (retval)
- goto cleanup;
- }
+ retval = io_channel_write_blk64(fs->io, blk, 1, block_buf);
+ if (retval)
+ goto cleanup;
}
/*
debugfs should use strtoull wrappers for reading block numbers from
the command line. "unsigned long" isn't wide enough to hold 64-bit
block numbers on 32-bit platforms.
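
For illustration only, here is a minimal sketch of what such a wrapper
might look like. parse_blk64() is a hypothetical name invented for this
example and is not the helper the patch calls; the only point is that
the value is parsed with strtoull() into a 64-bit type, and that junk
input is rejected instead of being silently truncated.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for a debugfs-style helper: parse a block
 * number into a 64-bit value.  Returns 0 on success, nonzero if the
 * string is empty, starts with a minus sign, overflows, or has
 * trailing characters.
 */
static int parse_blk64(const char *str, uint64_t *ret)
{
	char *end;
	unsigned long long val;

	if (str == NULL || *str == '\0' || *str == '-')
		return 1;
	errno = 0;
	val = strtoull(str, &end, 0);	/* base 0: decimal, octal or hex */
	if (errno != 0 || end == str || *end != '\0')
		return 1;		/* overflow, no digits, or junk */
	*ret = (uint64_t) val;
	return 0;
}
```

With a helper like this, "4294967296" (2^32) parses to the full value
even where unsigned long is only 32 bits wide.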
Signed-off-by: Darrick J. Wong <[email protected]>
---
debugfs/debugfs.c | 33 ++++++++++++++++++++++-----------
debugfs/extent_inode.c | 22 +++++++++-------------
debugfs/util.c | 2 +-
3 files changed, 32 insertions(+), 25 deletions(-)
diff --git a/debugfs/debugfs.c b/debugfs/debugfs.c
index c5f8a1f..578d577 100644
--- a/debugfs/debugfs.c
+++ b/debugfs/debugfs.c
@@ -181,8 +181,7 @@ void do_open_filesys(int argc, char **argv)
return;
break;
case 's':
- superblock = parse_ulong(optarg, argv[0],
- "superblock number", &err);
+ err = strtoblk(argv[0], optarg, &superblock);
if (err)
return;
break;
@@ -278,14 +277,17 @@ void do_init_filesys(int argc, char **argv)
struct ext2_super_block param;
errcode_t retval;
int err;
+ blk64_t blocks;
if (common_args_process(argc, argv, 3, 3, "initialize",
"<device> <blocks>", CHECK_FS_NOTOPEN))
return;
memset(&param, 0, sizeof(struct ext2_super_block));
- ext2fs_blocks_count_set(&param, parse_ulong(argv[2], argv[0],
- "blocks count", &err));
+ err = strtoblk(argv[0], argv[2], &blocks);
+ if (err)
+ return;
+ ext2fs_blocks_count_set(&param, blocks);
if (err)
return;
retval = ext2fs_initialize(argv[1], 0, &param,
@@ -2110,7 +2112,9 @@ void do_bmap(int argc, char *argv[])
ino = string_to_inode(argv[1]);
if (!ino)
return;
- blk = parse_ulong(argv[2], argv[0], "logical_block", &err);
+ err = strtoblk(argv[0], argv[2], &blk);
+ if (err)
+ return;
errcode = ext2fs_bmap2(current_fs, ino, 0, 0, 0, blk, 0, &pblk);
if (errcode) {
@@ -2255,10 +2259,14 @@ void do_punch(int argc, char *argv[])
ino = string_to_inode(argv[1]);
if (!ino)
return;
- start = parse_ulong(argv[2], argv[0], "logical_block", &err);
- if (argc == 4)
- end = parse_ulong(argv[3], argv[0], "logical_block", &err);
- else
+ err = strtoblk(argv[0], argv[2], &start);
+ if (err)
+ return;
+ if (argc == 4) {
+ err = strtoblk(argv[0], argv[3], &end);
+ if (err)
+ return;
+ } else
end = ~0;
errcode = ext2fs_punch(current_fs, ino, 0, 0, start, end);
@@ -2470,8 +2478,11 @@ int main(int argc, char **argv)
"block size", 0);
break;
case 's':
- superblock = parse_ulong(optarg, argv[0],
- "superblock number", 0);
+ retval = strtoblk(argv[0], optarg, &superblock);
+ if (retval) {
+ com_err(argv[0], retval, 0, debug_prog_name);
+ return 1;
+ }
break;
case 'c':
catastrophic = 1;
diff --git a/debugfs/extent_inode.c b/debugfs/extent_inode.c
index 0bbc4c5..75e328c 100644
--- a/debugfs/extent_inode.c
+++ b/debugfs/extent_inode.c
@@ -264,7 +264,7 @@ void do_replace_node(int argc, char *argv[])
return;
}
- extent.e_lblk = parse_ulong(argv[1], argv[0], "logical block", &err);
+ err = strtoblk(argv[0], argv[1], &extent.e_lblk);
if (err)
return;
@@ -272,7 +272,7 @@ void do_replace_node(int argc, char *argv[])
if (err)
return;
- extent.e_pblk = parse_ulong(argv[3], argv[0], "logical block", &err);
+ err = strtoblk(argv[0], argv[3], &extent.e_pblk);
if (err)
return;
@@ -338,8 +338,7 @@ void do_insert_node(int argc, char *argv[])
return;
}
- extent.e_lblk = parse_ulong(argv[1], cmd,
- "logical block", &err);
+ err = strtoblk(cmd, argv[1], &extent.e_lblk);
if (err)
return;
@@ -348,8 +347,7 @@ void do_insert_node(int argc, char *argv[])
if (err)
return;
- extent.e_pblk = parse_ulong(argv[3], cmd,
- "pysical block", &err);
+ err = strtoblk(cmd, argv[3], &extent.e_pblk);
if (err)
return;
@@ -366,8 +364,8 @@ void do_set_bmap(int argc, char **argv)
const char *usage = "[--uninit] <lblk> <pblk>";
struct ext2fs_extent extent;
errcode_t retval;
- blk_t logical;
- blk_t physical;
+ blk64_t logical;
+ blk64_t physical;
char *cmd = argv[0];
int flags = 0;
int err;
@@ -387,18 +385,16 @@ void do_set_bmap(int argc, char **argv)
return;
}
- logical = parse_ulong(argv[1], cmd,
- "logical block", &err);
+ err = strtoblk(cmd, argv[1], &logical);
if (err)
return;
- physical = parse_ulong(argv[2], cmd,
- "physical block", &err);
+ err = strtoblk(cmd, argv[2], &physical);
if (err)
return;
retval = ext2fs_extent_set_bmap(current_handle, logical,
- (blk64_t) physical, flags);
+ physical, flags);
if (retval) {
com_err(cmd, retval, 0);
return;
diff --git a/debugfs/util.c b/debugfs/util.c
index cf3a6c6..09088e0 100644
--- a/debugfs/util.c
+++ b/debugfs/util.c
@@ -377,7 +377,7 @@ int common_block_args_process(int argc, char *argv[],
}
if (argc > 2) {
- *count = parse_ulong(argv[2], argv[0], "count", &err);
+ err = strtoblk(argv[0], argv[2], count);
if (err)
return 1;
}
When reading or writing file blocks, use the IO manager routines that
can handle 64bit block numbers.
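
As a rough illustration of the failure mode, assume the older entry
points take a block number narrower than 64 bits on some platforms.
The two functions below are hypothetical stand-ins, not the io_channel
API; they only show what the callee ends up seeing.

```c
#include <stdint.h>

/* Hypothetical narrow entry point: the block number is cut to 32 bits. */
static uint64_t block_seen_by_narrow_api(uint32_t block)
{
	return block;
}

/* Hypothetical 64-bit-safe entry point: the value passes through intact. */
static uint64_t block_seen_by_wide_api(uint64_t block)
{
	return block;
}
```

Passing block 2^32 + 5 to the narrow variant yields block 5, so the
wrong block would be read or written without any error being reported.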
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ext2fs/fileio.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/lib/ext2fs/fileio.c b/lib/ext2fs/fileio.c
index a6bcbe7..d092e65 100644
--- a/lib/ext2fs/fileio.c
+++ b/lib/ext2fs/fileio.c
@@ -142,8 +142,7 @@ errcode_t ext2fs_file_flush(ext2_file_t file)
return retval;
}
- retval = io_channel_write_blk(fs->io, file->physblock,
- 1, file->buf);
+ retval = io_channel_write_blk64(fs->io, file->physblock, 1, file->buf);
if (retval)
return retval;
@@ -194,9 +193,9 @@ static errcode_t load_buffer(ext2_file_t file, int dontfill)
return retval;
if (!dontfill) {
if (file->physblock) {
- retval = io_channel_read_blk(fs->io,
- file->physblock,
- 1, file->buf);
+ retval = io_channel_read_blk64(fs->io,
+ file->physblock,
+ 1, file->buf);
if (retval)
return retval;
} else
With the advent of metadata_csum, we now tie extent and directory
blocks to the associated inode number (and generation). Therefore, we
must be careful when remapping inodes. At that point in the resize
process, all the blocks that are going away have been duplicated
elsewhere in the FS (albeit with checksums based on the old inode
numbers). If we're moving the inode, then do that and remember that
new inode number. Now we can update the block mappings for each inode
with the final inode number, and schedule directory blocks for mass
inode relocation. We also have to recalculate the EA block checksum.
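
To make the dependency concrete, here is a toy sketch. It is NOT the
metadata_csum algorithm; it uses a simple FNV-1a style hash, and
toy_block_csum() is a name made up for this example. It only shows
that once the inode number is mixed into the checksum, an unchanged
block needs a new checksum when its owner is renumbered.

```c
#include <stddef.h>
#include <stdint.h>

/* Mix the four bytes of v into the running hash h (FNV-1a style). */
static uint32_t toy_mix(uint32_t h, uint32_t v)
{
	int i;

	for (i = 0; i < 4; i++) {
		h ^= (v >> (8 * i)) & 0xff;
		h *= 16777619u;
	}
	return h;
}

/* Toy checksum over (inode number, generation, block contents). */
static uint32_t toy_block_csum(uint32_t ino, uint32_t gen,
			       const unsigned char *buf, size_t len)
{
	uint32_t h = toy_mix(toy_mix(2166136261u, ino), gen);
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= buf[i];
		h *= 16777619u;
	}
	return h;
}
```

The same block contents checksummed under the old and the new inode
number give different results, which is why the blocks have to be
rewritten after the inode has been moved.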
Signed-off-by: Darrick J. Wong <[email protected]>
---
resize/resize2fs.c | 154 ++++++++++++++++++++++++++++++++++++++--------------
1 file changed, 114 insertions(+), 40 deletions(-)
diff --git a/resize/resize2fs.c b/resize/resize2fs.c
index fa4fe46..ff5e6a2 100644
--- a/resize/resize2fs.c
+++ b/resize/resize2fs.c
@@ -1360,10 +1360,12 @@ __u64 extent_translate(ext2_filsys fs, ext2_extent extent, __u64 old_loc)
struct process_block_struct {
ext2_resize_t rfs;
ext2_ino_t ino;
+ ext2_ino_t old_ino;
struct ext2_inode * inode;
errcode_t error;
int is_dir;
int changed;
+ int has_extents;
};
static int process_block(ext2_filsys fs, blk64_t *block_nr,
@@ -1387,11 +1389,23 @@ static int process_block(ext2_filsys fs, blk64_t *block_nr,
#ifdef RESIZE2FS_DEBUG
if (pb->rfs->flags & RESIZE_DEBUG_BMOVE)
printf("ino=%u, blockcnt=%lld, %llu->%llu\n",
- pb->ino, blockcnt, block, new_block);
+ pb->old_ino, blockcnt, block,
+ new_block);
#endif
block = new_block;
}
}
+
+ /*
+ * If we moved inodes and metadata_csum is enabled, we must force the
+ * extent block to be rewritten with new checksum.
+ */
+ if (EXT2_HAS_RO_COMPAT_FEATURE(fs->super,
+ EXT4_FEATURE_RO_COMPAT_METADATA_CSUM) &&
+ pb->has_extents &&
+ pb->old_ino != pb->ino)
+ ret |= BLOCK_CHANGED;
+
if (pb->is_dir) {
retval = ext2fs_add_dir_block2(fs->dblist, pb->ino,
block, (int) blockcnt);
@@ -1431,6 +1445,46 @@ static errcode_t progress_callback(ext2_filsys fs,
return 0;
}
+static errcode_t migrate_ea_block(ext2_resize_t rfs, ext2_ino_t ino,
+ struct ext2_inode *inode, int *changed)
+{
+ char *buf;
+ blk64_t new_block;
+ errcode_t err = 0;
+
+ /* No EA block or no remapping? Quit early. */
+ if (ext2fs_file_acl_block(rfs->old_fs, inode) == 0 && !rfs->bmap)
+ return 0;
+ new_block = extent_translate(rfs->old_fs, rfs->bmap,
+ ext2fs_file_acl_block(rfs->old_fs, inode));
+ if (new_block == 0)
+ return 0;
+
+ /* Set the new ACL block */
+ ext2fs_file_acl_block_set(rfs->old_fs, inode, new_block);
+
+ /* Update checksum */
+ if (EXT2_HAS_RO_COMPAT_FEATURE(rfs->new_fs->super,
+ EXT4_FEATURE_RO_COMPAT_METADATA_CSUM)) {
+ err = ext2fs_get_mem(rfs->old_fs->blocksize, &buf);
+ if (err)
+ return err;
+ rfs->old_fs->flags |= EXT2_FLAG_IGNORE_CSUM_ERRORS;
+ err = ext2fs_read_ext_attr3(rfs->old_fs, new_block, buf, ino);
+ rfs->old_fs->flags &= ~EXT2_FLAG_IGNORE_CSUM_ERRORS;
+ if (err)
+ goto out;
+ err = ext2fs_write_ext_attr3(rfs->old_fs, new_block, buf, ino);
+ if (err)
+ goto out;
+ }
+ *changed = 1;
+
+out:
+ ext2fs_free_mem(&buf);
+ return err;
+}
+
static errcode_t inode_scan_and_fix(ext2_resize_t rfs)
{
struct process_block_struct pb;
@@ -1441,7 +1495,6 @@ static errcode_t inode_scan_and_fix(ext2_resize_t rfs)
char *block_buf = 0;
ext2_ino_t start_to_move;
blk64_t orig_size;
- blk64_t new_block;
int inode_size;
if ((rfs->old_fs->group_desc_count <=
@@ -1504,37 +1557,19 @@ static errcode_t inode_scan_and_fix(ext2_resize_t rfs)
pb.is_dir = LINUX_S_ISDIR(inode->i_mode);
pb.changed = 0;
- if (ext2fs_file_acl_block(rfs->old_fs, inode) && rfs->bmap) {
- new_block = extent_translate(rfs->old_fs, rfs->bmap,
- ext2fs_file_acl_block(rfs->old_fs, inode));
- if (new_block) {
- ext2fs_file_acl_block_set(rfs->old_fs, inode,
- new_block);
- retval = ext2fs_write_inode_full(rfs->old_fs,
- ino, inode, inode_size);
- if (retval) goto errout;
- }
- }
-
- if (ext2fs_inode_has_valid_blocks2(rfs->old_fs, inode) &&
- (rfs->bmap || pb.is_dir)) {
- pb.ino = ino;
- retval = ext2fs_block_iterate3(rfs->old_fs,
- ino, 0, block_buf,
- process_block, &pb);
- if (retval)
- goto errout;
- if (pb.error) {
- retval = pb.error;
- goto errout;
- }
- }
+ /* Remap EA block */
+ retval = migrate_ea_block(rfs, ino, inode, &pb.changed);
+ if (retval)
+ goto errout;
+ new_inode = ino;
if (ino <= start_to_move)
- continue; /* Don't need to move it. */
+ goto remap_blocks; /* Don't need to move inode. */
/*
- * Find a new inode
+ * Find a new inode. Now that extents and directory blocks
+ * are tied to the inode number through the checksum, we must
+ * set up the new inode before we start rewriting blocks.
*/
retval = ext2fs_new_inode(rfs->new_fs, 0, 0, 0, &new_inode);
if (retval)
@@ -1542,16 +1577,12 @@ static errcode_t inode_scan_and_fix(ext2_resize_t rfs)
ext2fs_inode_alloc_stats2(rfs->new_fs, new_inode, +1,
pb.is_dir);
- if (pb.changed) {
- /* Get the new version of the inode */
- retval = ext2fs_read_inode_full(rfs->old_fs, ino,
- inode, inode_size);
- if (retval) goto errout;
- }
inode->i_ctime = time(0);
retval = ext2fs_write_inode_full(rfs->old_fs, new_inode,
inode, inode_size);
- if (retval) goto errout;
+ if (retval)
+ goto errout;
+ pb.changed = 0;
#ifdef RESIZE2FS_DEBUG
if (rfs->flags & RESIZE_DEBUG_INODEMAP)
@@ -1563,6 +1594,37 @@ static errcode_t inode_scan_and_fix(ext2_resize_t rfs)
goto errout;
}
ext2fs_add_extent_entry(rfs->imap, ino, new_inode);
+
+remap_blocks:
+ if (pb.changed)
+ retval = ext2fs_write_inode_full(rfs->old_fs,
+ new_inode,
+ inode, inode_size);
+ if (retval)
+ goto errout;
+
+ /*
+ * Update inodes to point to new blocks; schedule directory
+ * blocks for inode remapping. Need to write out dir blocks
+ * with new inode numbers if we have metadata_csum enabled.
+ */
+ if (ext2fs_inode_has_valid_blocks2(rfs->old_fs, inode) &&
+ (rfs->bmap || pb.is_dir)) {
+ pb.ino = new_inode;
+ pb.old_ino = ino;
+ pb.has_extents = inode->i_flags & EXT4_EXTENTS_FL;
+ rfs->old_fs->flags |= EXT2_FLAG_IGNORE_CSUM_ERRORS;
+ retval = ext2fs_block_iterate3(rfs->old_fs,
+ new_inode, 0, block_buf,
+ process_block, &pb);
+ rfs->old_fs->flags &= ~EXT2_FLAG_IGNORE_CSUM_ERRORS;
+ if (retval)
+ goto errout;
+ if (pb.error) {
+ retval = pb.error;
+ goto errout;
+ }
+ }
}
io_channel_flush(rfs->old_fs->io);
@@ -1605,6 +1667,7 @@ static int check_and_change_inodes(ext2_ino_t dir,
struct ext2_inode inode;
ext2_ino_t new_inode;
errcode_t retval;
+ int ret = 0;
if (is->rfs->progress && offset == 0) {
io_channel_flush(is->rfs->old_fs->io);
@@ -1615,13 +1678,22 @@ static int check_and_change_inodes(ext2_ino_t dir,
return DIRENT_ABORT;
}
+ /*
+ * If we have checksums enabled and the inode wasn't present in the
+ * old fs, then we must rewrite all dir blocks with new checksums.
+ */
+ if (EXT2_HAS_RO_COMPAT_FEATURE(is->rfs->old_fs->super,
+ EXT4_FEATURE_RO_COMPAT_METADATA_CSUM) &&
+ !ext2fs_test_inode_bitmap2(is->rfs->old_fs->inode_map, dir))
+ ret |= DIRENT_CHANGED;
+
if (!dirent->inode)
- return 0;
+ return ret;
new_inode = ext2fs_extent_translate(is->rfs->imap, dirent->inode);
if (!new_inode)
- return 0;
+ return ret;
#ifdef RESIZE2FS_DEBUG
if (is->rfs->flags & RESIZE_DEBUG_INODEMAP)
printf("Inode translate (dir=%u, name=%.*s, %u->%u)\n",
@@ -1637,10 +1709,10 @@ static int check_and_change_inodes(ext2_ino_t dir,
inode.i_mtime = inode.i_ctime = time(0);
is->err = ext2fs_write_inode(is->rfs->old_fs, dir, &inode);
if (is->err)
- return DIRENT_ABORT;
+ return ret | DIRENT_ABORT;
}
- return DIRENT_CHANGED;
+ return ret | DIRENT_CHANGED;
}
static errcode_t inode_ref_fix(ext2_resize_t rfs)
@@ -1667,9 +1739,11 @@ static errcode_t inode_ref_fix(ext2_resize_t rfs)
goto errout;
}
+ rfs->old_fs->flags |= EXT2_FLAG_IGNORE_CSUM_ERRORS;
retval = ext2fs_dblist_dir_iterate(rfs->old_fs->dblist,
DIRENT_FLAG_INCLUDE_EMPTY, 0,
check_and_change_inodes, &is);
+ rfs->old_fs->flags &= ~EXT2_FLAG_IGNORE_CSUM_ERRORS;
if (retval)
goto errout;
if (is.err) {
The caller of dump_file provides an fd to write to, so the caller
should also dispose of it. Previously, the fd was never closed when
preserve=1.
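
The ownership rule can be sketched as follows. write_payload() and
dump_to_fd() are hypothetical names for this example, not debugfs
functions: the callee only writes, and the caller that supplied the fd
closes it and checks the result of close().

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical writer: uses the fd but never closes it. */
static int write_payload(int fd, const char *data)
{
	size_t len = strlen(data);

	return write(fd, data, len) == (ssize_t) len ? 0 : -1;
}

/* Hypothetical caller: it supplied the fd, so it also closes it. */
static int dump_to_fd(int fd, const char *data)
{
	int err = write_payload(fd, data);

	if (close(fd) != 0)
		return -1;	/* a failed close can mean lost data */
	return err;
}
```

Keeping close() on one side means every exit path of the writer leaves
the fd in the same state, regardless of flags such as preserve.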
Signed-off-by: Darrick J. Wong <[email protected]>
---
debugfs/dump.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/debugfs/dump.c b/debugfs/dump.c
index c75b9f1..952a752 100644
--- a/debugfs/dump.c
+++ b/debugfs/dump.c
@@ -143,8 +143,6 @@ static void dump_file(const char *cmdname, ext2_ino_t ino, int fd,
if (preserve)
fix_perms("dump_file", &inode, fd, outname);
- else if (fd != 1)
- close(fd);
return;
}
@@ -191,6 +189,11 @@ void do_dump(int argc, char **argv)
}
dump_file(argv[0], inode, fd, preserve, out_fn);
+ if (close(fd) != 0) {
+ com_err(argv[0], errno, "while closing %s for dump_inode",
+ out_fn);
+ return;
+ }
return;
}
@@ -273,6 +276,10 @@ static void rdump_inode(ext2_ino_t ino, struct ext2_inode *inode,
goto errout;
}
dump_file("rdump", ino, fd, 1, fullname);
+ if (close(fd) != 0) {
+ com_err("rdump", errno, "while dumping %s", fullname);
+ goto errout;
+ }
}
else if (LINUX_S_ISDIR(inode->i_mode) && strcmp(name, ".") && strcmp(name, "..")) {
errcode_t retval;
ext2fs_free_mem() takes a pointer to a pointer, similar to
ext2fs_get_mem(). Improve the documentation, and fix debugfs.
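
A minimal sketch of the calling convention, using hypothetical
demo_get_mem()/demo_free_mem() helpers written in the same style (they
are stand-ins for this example, not the library routines). The 'ptr'
argument is declared void * but is treated as the address of the
caller's pointer.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator: 'ptr' is really the address of a pointer. */
static int demo_get_mem(unsigned long size, void *ptr)
{
	void *p = malloc(size);

	if (p == NULL)
		return -1;
	memcpy(ptr, &p, sizeof(p));	/* store address in caller's pointer */
	return 0;
}

/* Hypothetical free: loads the address, frees it, clears the pointer. */
static int demo_free_mem(void *ptr)
{
	void *p;

	memcpy(&p, ptr, sizeof(p));	/* load address from caller's pointer */
	free(p);
	p = NULL;
	memcpy(ptr, &p, sizeof(p));	/* leave caller's pointer NULL */
	return 0;
}
```

Calling demo_free_mem(buf) instead of demo_free_mem(&buf) would make
the helper treat the first bytes of the buffer as the address to free,
and the void * prototype means the compiler cannot catch it.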
Signed-off-by: Darrick J. Wong <[email protected]>
---
debugfs/set_fields.c | 2 +-
lib/ext2fs/ext2fs.h | 6 +++---
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/debugfs/set_fields.c b/debugfs/set_fields.c
index aad1cd8..1e57277 100644
--- a/debugfs/set_fields.c
+++ b/debugfs/set_fields.c
@@ -805,7 +805,7 @@ void do_set_mmp_value(int argc, char *argv[])
if (retval) {
com_err(argv[0], retval, "reading MMP block %llu.\n",
(long long)current_fs->super->s_mmp_block);
- ext2fs_free_mem(mmp_s);
+ ext2fs_free_mem(&mmp_s);
return;
}
current_fs->mmp_buf = mmp_s;
diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index 64e498f..0624350 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -1608,7 +1608,7 @@ _INLINE_ void ext2fs_init_csum_seed(ext2_filsys fs)
#ifndef EXT2_CUSTOM_MEMORY_ROUTINES
#include <string.h>
/*
- * Allocate memory
+ * Allocate memory. The 'ptr' arg must point to a pointer.
*/
_INLINE_ errcode_t ext2fs_get_mem(unsigned long size, void *ptr)
{
@@ -1655,7 +1655,7 @@ _INLINE_ errcode_t ext2fs_get_arrayzero(unsigned long count,
}
/*
- * Free memory
+ * Free memory. The 'ptr' arg must point to a pointer.
*/
_INLINE_ errcode_t ext2fs_free_mem(void *ptr)
{
@@ -1669,7 +1669,7 @@ _INLINE_ errcode_t ext2fs_free_mem(void *ptr)
}
/*
- * Resize memory
+ * Resize memory. The 'ptr' arg must point to a pointer.
*/
_INLINE_ errcode_t ext2fs_resize_mem(unsigned long EXT2FS_ATTR((unused)) old_size,
unsigned long size, void *ptr)
Signed-off-by: Darrick J. Wong <[email protected]>
---
e2fsck/journal.c | 4 +++-
e2fsck/pass3.c | 5 +++--
e2fsck/profile.c | 2 ++
e2fsck/unix.c | 2 ++
4 files changed, 10 insertions(+), 3 deletions(-)
diff --git a/e2fsck/journal.c b/e2fsck/journal.c
index e3f80bc..22f06e7 100644
--- a/e2fsck/journal.c
+++ b/e2fsck/journal.c
@@ -1139,8 +1139,10 @@ int e2fsck_fix_ext3_journal_hint(e2fsck_t ctx)
if (!journal_name)
return 0;
- if (stat(journal_name, &st) < 0)
+ if (stat(journal_name, &st) < 0) {
+ free(journal_name);
return 0;
+ }
if (st.st_rdev != sb->s_journal_dev) {
clear_problem_context(&pctx);
diff --git a/e2fsck/pass3.c b/e2fsck/pass3.c
index fbaadcf..6989f17 100644
--- a/e2fsck/pass3.c
+++ b/e2fsck/pass3.c
@@ -53,7 +53,7 @@ static ext2fs_inode_bitmap inode_done_map = 0;
void e2fsck_pass3(e2fsck_t ctx)
{
ext2_filsys fs = ctx->fs;
- struct dir_info_iter *iter;
+ struct dir_info_iter *iter = NULL;
#ifdef RESOURCE_TRACK
struct resource_track rtrack;
#endif
@@ -108,7 +108,6 @@ void e2fsck_pass3(e2fsck_t ctx)
if (check_directory(ctx, dir->ino, &pctx))
goto abort_exit;
}
- e2fsck_dir_info_iter_end(ctx, iter);
/*
* Force the creation of /lost+found if not present
@@ -123,6 +122,8 @@ void e2fsck_pass3(e2fsck_t ctx)
e2fsck_rehash_directories(ctx);
abort_exit:
+ if (iter)
+ e2fsck_dir_info_iter_end(ctx, iter);
e2fsck_free_dir_info(ctx);
if (inode_loop_detect) {
ext2fs_free_inode_bitmap(inode_loop_detect);
diff --git a/e2fsck/profile.c b/e2fsck/profile.c
index 019c6f5..92aa893 100644
--- a/e2fsck/profile.c
+++ b/e2fsck/profile.c
@@ -318,6 +318,8 @@ profile_init(const char **files, profile_t *ret_profile)
/* if the filenames list is not specified return an empty profile */
if ( files ) {
for (fs = files; !PROFILE_LAST_FILESPEC(*fs); fs++) {
+ if (array)
+ free_list(array);
retval = get_dirlist(*fs, &array);
if (retval == 0) {
if (!array)
diff --git a/e2fsck/unix.c b/e2fsck/unix.c
index a6c8d25..7a8fce2 100644
--- a/e2fsck/unix.c
+++ b/e2fsck/unix.c
@@ -869,6 +869,8 @@ static errcode_t PRS(int argc, char *argv[], e2fsck_t *ret_ctx)
case 'L':
replace_bad_blocks++;
case 'l':
+ if (bad_blocks_file)
+ free(bad_blocks_file);
bad_blocks_file = string_copy(ctx, optarg, 0);
break;
case 'd':
Signed-off-by: Darrick J. Wong <[email protected]>
---
misc/e2image.c | 1 +
misc/filefrag.c | 2 ++
resize/online.c | 12 +++++++++---
3 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/misc/e2image.c b/misc/e2image.c
index 25d8d4e..624525b 100644
--- a/misc/e2image.c
+++ b/misc/e2image.c
@@ -1227,6 +1227,7 @@ static void install_image(char *device, char *image_fn, int type)
exit(1);
}
+ close(fd);
ext2fs_close (fs);
}
diff --git a/misc/filefrag.c b/misc/filefrag.c
index 35b3544..a050a22 100644
--- a/misc/filefrag.c
+++ b/misc/filefrag.c
@@ -360,12 +360,14 @@ static void frag_report(const char *filename)
#else
if (fstat(fd, &st) < 0) {
#endif
+ close(fd);
perror("stat");
return;
}
if (last_device != st.st_dev) {
if (fstatfs(fd, &fsinfo) < 0) {
+ close(fd);
perror("fstatfs");
return;
}
diff --git a/resize/online.c b/resize/online.c
index 2d34640..defcac1 100644
--- a/resize/online.c
+++ b/resize/online.c
@@ -184,12 +184,16 @@ errcode_t online_resize_fs(ext2_filsys fs, const char *mtpt,
ext2fs_blocks_count(sb);
retval = ext2fs_read_bitmaps(fs);
- if (retval)
+ if (retval) {
+ close(fd);
return retval;
+ }
retval = ext2fs_dup_handle(fs, &new_fs);
- if (retval)
+ if (retval) {
+ close(fd);
return retval;
+ }
/* The current method of adding one block group at a time to a
* mounted filesystem means it is impossible to accomodate the
@@ -203,8 +207,10 @@ errcode_t online_resize_fs(ext2_filsys fs, const char *mtpt,
*/
new_fs->super->s_feature_incompat &= ~EXT4_FEATURE_INCOMPAT_FLEX_BG;
retval = adjust_fs_info(new_fs, fs, 0, *new_size);
- if (retval)
+ if (retval) {
+ close(fd);
return retval;
+ }
printf(_("Performing an on-line resize of %s to %llu (%dk) blocks.\n"),
fs->device_name, *new_size, fs->blocksize / 1024);
Signed-off-by: Darrick J. Wong <[email protected]>
---
misc/mke2fs.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/misc/mke2fs.c b/misc/mke2fs.c
index 19b6e85..c1cbcaa 100644
--- a/misc/mke2fs.c
+++ b/misc/mke2fs.c
@@ -93,7 +93,7 @@ gid_t root_gid;
int journal_size;
int journal_flags;
int lazy_itable_init;
-char *bad_blocks_filename;
+char *bad_blocks_filename = NULL;
__u32 fs_stride;
int quotatype = -1; /* Initialize both user and group quotas by default */
@@ -1139,6 +1139,7 @@ static char **parse_fs_type(const char *fs_type,
parse_str = malloc(strlen(usage_types)+1);
if (!parse_str) {
+ free(profile_type);
free(list.list);
return 0;
}
@@ -1509,7 +1510,8 @@ profile_error:
discard = 0;
break;
case 'l':
- bad_blocks_filename = malloc(strlen(optarg)+1);
+ bad_blocks_filename = realloc(bad_blocks_filename,
+ strlen(optarg) + 1);
if (!bad_blocks_filename) {
com_err(program_name, ENOMEM,
_("in malloc for bad_blocks_filename"));
@@ -2262,8 +2264,11 @@ static int mke2fs_setup_tdb(const char *name, io_manager *io_ptr)
}
if (!strcmp(tdb_dir, "none") || (tdb_dir[0] == 0) ||
- access(tdb_dir, W_OK))
+ access(tdb_dir, W_OK)) {
+ if (free_tdb_dir)
+ free(tdb_dir);
return 0;
+ }
tmp_name = strdup(name);
if (!tmp_name)
If someone umounts the filesystem between statfs64 and the getmntent()
iteration, we can exit the loop having never set mnt_type, and strcmp
can crash. Fix the potential NULL deref.
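
The guard amounts to the following sketch. type_is_ext4() is a
hypothetical helper for this example, and the literal "ext4" is an
assumption standing in for FS_EXT4.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if mnt_type names ext4, 0 otherwise (including NULL). */
static int type_is_ext4(const char *mnt_type)
{
	/* The NULL test must come first; strcmp(NULL, ...) is undefined. */
	return mnt_type != NULL && strcmp(mnt_type, "ext4") == 0;
}
```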
Signed-off-by: Darrick J. Wong <[email protected]>
---
misc/e4defrag.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/misc/e4defrag.c b/misc/e4defrag.c
index 4b31d03..b6e2e31 100644
--- a/misc/e4defrag.c
+++ b/misc/e4defrag.c
@@ -374,7 +374,7 @@ static int is_ext4(const char *file, char *devname)
}
endmntent(fp);
- if (strcmp(mnt_type, FS_EXT4) == 0) {
+ if (mnt_type && strcmp(mnt_type, FS_EXT4) == 0) {
FREE(mnt_type);
return 0;
} else {
sysconf(_SC_PAGESIZE) will probably never return an error, but just in
case it does, we shouldn't pass what looks like a huge number to
sync_file_range() and posix_fadvise().
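
A small sketch of the check, assuming a hypothetical page_align_down()
helper that is not part of e4defrag: a page size below 1 is refused
before it is used for any rounding arithmetic.

```c
#include <stdint.h>

/*
 * Round 'offset' down to a multiple of 'pagesize'.  Returns -1 if the
 * page size is not usable, e.g. because sysconf() reported an error.
 */
static int64_t page_align_down(int64_t offset, long pagesize)
{
	if (pagesize < 1)
		return -1;
	return (offset / pagesize) * pagesize;
}
```

Without the check, a page size of -1 would flow into the offset and
length calculations and could come out as a very large unsigned value.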
Signed-off-by: Darrick J. Wong <[email protected]>
---
misc/e4defrag.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/misc/e4defrag.c b/misc/e4defrag.c
index b6e2e31..07d56d9 100644
--- a/misc/e4defrag.c
+++ b/misc/e4defrag.c
@@ -473,6 +473,9 @@ static int defrag_fadvise(int fd, struct move_extent defrag_data,
unsigned int i;
loff_t offset;
+ if (pagesize < 1)
+ return -1;
+
offset = (loff_t)defrag_data.orig_start * block_size;
offset = (offset / pagesize) * pagesize;
Signed-off-by: Darrick J. Wong <[email protected]>
---
misc/e2image.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/misc/e2image.c b/misc/e2image.c
index 624525b..878149e 100644
--- a/misc/e2image.c
+++ b/misc/e2image.c
@@ -1309,7 +1309,11 @@ int main (int argc, char ** argv)
device_name = argv[optind];
image_fn = argv[optind+1];
- ext2fs_check_if_mounted(device_name, &mount_flags);
+ retval = ext2fs_check_if_mounted(device_name, &mount_flags);
+ if (retval) {
+ com_err(program_name, retval, "checking if mounted");
+ exit(1);
+ }
if (img_type && !ignore_rw_mount &&
(mount_flags & EXT2_MF_MOUNTED) &&
Check the return values from ext2fs_get_block_bitmap_range2() and
ext2fs_get_inode_bitmap_range2(); if an error occurred, report it
instead of printing a garbage bitmap.
Signed-off-by: Darrick J. Wong <[email protected]>
---
misc/dumpe2fs.c | 26 ++++++++++++++++++--------
1 file changed, 18 insertions(+), 8 deletions(-)
diff --git a/misc/dumpe2fs.c b/misc/dumpe2fs.c
index 8be7ce2..3dbfcb9 100644
--- a/misc/dumpe2fs.c
+++ b/misc/dumpe2fs.c
@@ -162,6 +162,7 @@ static void list_desc (ext2_filsys fs)
int has_super;
blk64_t blk_itr = EXT2FS_B2C(fs, fs->super->s_first_data_block);
ext2_ino_t ino_itr = 1;
+ errcode_t retval;
if (EXT2_HAS_RO_COMPAT_FEATURE(fs->super,
EXT4_FEATURE_RO_COMPAT_BIGALLOC))
@@ -264,21 +265,30 @@ static void list_desc (ext2_filsys fs)
ext2fs_bg_itable_unused(fs, i));
if (block_bitmap) {
fputs(_(" Free blocks: "), stdout);
- ext2fs_get_block_bitmap_range2(fs->block_map,
+ retval = ext2fs_get_block_bitmap_range2(fs->block_map,
blk_itr, block_nbytes << 3, block_bitmap);
- print_free(i, block_bitmap,
- fs->super->s_clusters_per_group,
- fs->super->s_first_data_block,
- EXT2FS_CLUSTER_RATIO(fs));
+ if (retval)
+ com_err("list_desc", retval,
+ "while reading block bitmap");
+ else
+ print_free(i, block_bitmap,
+ fs->super->s_clusters_per_group,
+ fs->super->s_first_data_block,
+ EXT2FS_CLUSTER_RATIO(fs));
fputc('\n', stdout);
blk_itr += fs->super->s_clusters_per_group;
}
if (inode_bitmap) {
fputs(_(" Free inodes: "), stdout);
- ext2fs_get_inode_bitmap_range2(fs->inode_map,
+ retval = ext2fs_get_inode_bitmap_range2(fs->inode_map,
ino_itr, inode_nbytes << 3, inode_bitmap);
- print_free(i, inode_bitmap,
- fs->super->s_inodes_per_group, 1, 1);
+ if (retval)
+ com_err("list_desc", retval,
+ "while reading inode bitmap");
+ else
+ print_free(i, inode_bitmap,
+ fs->super->s_inodes_per_group,
+ 1, 1);
fputc('\n', stdout);
ino_itr += fs->super->s_inodes_per_group;
}
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ss/help.c | 1 +
lib/ss/list_rqs.c | 5 +++++
2 files changed, 6 insertions(+)
diff --git a/lib/ss/help.c b/lib/ss/help.c
index 6a61e70..5278c95 100644
--- a/lib/ss/help.c
+++ b/lib/ss/help.c
@@ -110,6 +110,7 @@ void ss_help (argc, argv, sci_idx, info_ptr)
switch (child = fork()) {
case -1:
ss_perror(sci_idx, errno, "Can't fork for pager");
+ (void) close(fd);
return;
case 0:
(void) dup2(fd, 0); /* put file on stdin */
diff --git a/lib/ss/list_rqs.c b/lib/ss/list_rqs.c
index 38e6aef..6baed41 100644
--- a/lib/ss/list_rqs.c
+++ b/lib/ss/list_rqs.c
@@ -45,6 +45,11 @@ void ss_list_requests(int argc __SS_ATTR((unused)),
sigprocmask(SIG_BLOCK, &igmask, &omask);
func = signal(SIGINT, SIG_IGN);
fd = ss_pager_create();
+ if (fd < 0) {
+ perror("ss_pager_create");
+ (void) signal(SIGINT, func);
+ return;
+ }
output = fdopen(fd, "w");
sigprocmask(SIG_SETMASK, &omask, (sigset_t *) 0);
Fix memory allocation calculations and check for NULL pointer returns.
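
Both halves of the fix can be sketched on a generic NULL-terminated
array of pointers. struct table and append_table() are hypothetical
names for this example, not libss code: the array is sized by the
pointer type rather than by the pointed-to struct, and the realloc()
result is checked before it is used.

```c
#include <stdlib.h>

struct table { int id; char pad[60]; };	/* hypothetical element type */

/*
 * Append 'item' to a NULL-terminated array of pointers.  Returns the
 * (possibly moved) array, or NULL on allocation failure, in which case
 * the original array is left untouched.
 */
static struct table **append_table(struct table **list, struct table *item)
{
	size_t n = 0;
	struct table **t;

	if (list != NULL)
		while (list[n] != NULL)
			n++;
	/* n old entries + the new one + the NULL terminator */
	t = realloc(list, (n + 2) * sizeof(*t));
	if (t == NULL)
		return NULL;
	t[n] = item;
	t[n + 1] = NULL;
	return t;
}
```

Writing sizeof(*t) ties the element size to the array's own type, so
it cannot drift to the size of the struct by accident.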
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ss/invocation.c | 5 +++++
lib/ss/parse.c | 4 ++++
lib/ss/request_tbl.c | 2 +-
3 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/lib/ss/invocation.c b/lib/ss/invocation.c
index a711050..08b66f2 100644
--- a/lib/ss/invocation.c
+++ b/lib/ss/invocation.c
@@ -20,6 +20,7 @@
#ifdef HAVE_DLOPEN
#include <dlfcn.h>
#endif
+#include <errno.h>
int ss_create_invocation(subsystem_name, version_string, info_ptr,
request_table_ptr, code_ptr)
@@ -46,6 +47,10 @@ int ss_create_invocation(subsystem_name, version_string, info_ptr,
;
table = (ss_data **) realloc((char *)table,
((unsigned)sci_idx+2)*size);
+ if (table == NULL) {
+ *code_ptr = errno;
+ return 0;
+ }
table[sci_idx+1] = (ss_data *) NULL;
table[sci_idx] = new_table;
diff --git a/lib/ss/parse.c b/lib/ss/parse.c
index b70ad16..baded66 100644
--- a/lib/ss/parse.c
+++ b/lib/ss/parse.c
@@ -90,6 +90,10 @@ char **ss_parse (sci_idx, line_ptr, argc_ptr)
parse_mode = TOKEN;
cp = line_ptr;
argv = NEW_ARGV (argv, argc);
+ if (argv == NULL) {
+ *argc_ptr = errno;
+ return argv;
+ }
argv[argc++] = line_ptr;
argv[argc] = NULL;
}
diff --git a/lib/ss/request_tbl.c b/lib/ss/request_tbl.c
index b0b6f95..efdabfa 100644
--- a/lib/ss/request_tbl.c
+++ b/lib/ss/request_tbl.c
@@ -35,7 +35,7 @@ void ss_add_request_table(sci_idx, rqtbl_ptr, position, code_ptr)
;
/* size == C subscript of NULL == #elements */
size += 2; /* new element, and NULL */
- t = (ssrt **)realloc(info->rqt_tables, (unsigned)size*sizeof(ssrt));
+ t = (ssrt **)realloc(info->rqt_tables, (unsigned)size*sizeof(ssrt *));
if (t == (ssrt **)NULL) {
*code_ptr = errno;
return;
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/quota/mkquota.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/lib/quota/mkquota.c b/lib/quota/mkquota.c
index a0d3a2a..3aa8100 100644
--- a/lib/quota/mkquota.c
+++ b/lib/quota/mkquota.c
@@ -230,6 +230,7 @@ errcode_t quota_init_context(quota_ctx_t *qctx, ext2_filsys fs, int qtype)
err = ext2fs_get_mem(sizeof(dict_t), &dict);
if (err) {
log_err("Failed to allocate dictionary");
+ quota_release_context(&ctx);
return err;
}
ctx->quota_dict[i] = dict;
Fix up a few places where we ignore return values.
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ext2fs/flushb.c | 2 +-
lib/ext2fs/icount.c | 2 ++
lib/ext2fs/imager.c | 7 ++++++-
lib/ext2fs/mkjournal.c | 4 +++-
lib/ext2fs/punch.c | 7 +++++++
5 files changed, 19 insertions(+), 3 deletions(-)
diff --git a/lib/ext2fs/flushb.c b/lib/ext2fs/flushb.c
index ac8923c..98821fc 100644
--- a/lib/ext2fs/flushb.c
+++ b/lib/ext2fs/flushb.c
@@ -70,7 +70,7 @@ errcode_t ext2fs_sync_device(int fd, int flushb)
#warning BLKFLSBUF not defined
#endif
#ifdef FDFLUSH
- ioctl (fd, FDFLUSH, 0); /* In case this is a floppy */
+ return ioctl(fd, FDFLUSH, 0); /* In case this is a floppy */
#elif defined(__linux__)
#warning FDFLUSH not defined
#endif
diff --git a/lib/ext2fs/icount.c b/lib/ext2fs/icount.c
index 84b74a9..c5ebf74 100644
--- a/lib/ext2fs/icount.c
+++ b/lib/ext2fs/icount.c
@@ -193,6 +193,8 @@ errcode_t ext2fs_create_icount_tdb(ext2_filsys fs, char *tdb_dir,
uuid_unparse(fs->super->s_uuid, uuid);
sprintf(fn, "%s/%s-icount-XXXXXX", tdb_dir, uuid);
fd = mkstemp(fn);
+ if (fd < 0)
+ return fd;
/*
* This is an overestimate of the size that we will need; the
diff --git a/lib/ext2fs/imager.c b/lib/ext2fs/imager.c
index 7f3b25b..378a3c8 100644
--- a/lib/ext2fs/imager.c
+++ b/lib/ext2fs/imager.c
@@ -66,6 +66,7 @@ errcode_t ext2fs_image_inode_write(ext2_filsys fs, int fd, int flags)
blk64_t blk;
ssize_t actual;
errcode_t retval;
+ off_t r;
buf = malloc(fs->blocksize * BUF_BLOCKS);
if (!buf)
@@ -97,7 +98,11 @@ errcode_t ext2fs_image_inode_write(ext2_filsys fs, int fd, int flags)
blk++;
left--;
cp += fs->blocksize;
- lseek(fd, fs->blocksize, SEEK_CUR);
+ r = lseek(fd, fs->blocksize, SEEK_CUR);
+ if (r < 0) {
+ retval = errno;
+ goto errout;
+ }
continue;
}
/* Find non-zero blocks */
diff --git a/lib/ext2fs/mkjournal.c b/lib/ext2fs/mkjournal.c
index 2afd3b7..1d5b1a7 100644
--- a/lib/ext2fs/mkjournal.c
+++ b/lib/ext2fs/mkjournal.c
@@ -520,8 +520,10 @@ errcode_t ext2fs_add_journal_inode(ext2_filsys fs, blk_t num_blocks, int flags)
#if HAVE_EXT2_IOCTLS
fd = open(jfile, O_RDONLY);
if (fd >= 0) {
- ioctl(fd, EXT2_IOC_SETFLAGS, &f);
+ retval = ioctl(fd, EXT2_IOC_SETFLAGS, &f);
close(fd);
+ if (retval)
+ return retval;
}
#endif
#endif
diff --git a/lib/ext2fs/punch.c b/lib/ext2fs/punch.c
index 790a0ad8..ceec336 100644
--- a/lib/ext2fs/punch.c
+++ b/lib/ext2fs/punch.c
@@ -192,6 +192,13 @@ static errcode_t ext2fs_punch_extent(ext2_filsys fs, ext2_ino_t ino,
retval = ext2fs_extent_open2(fs, ino, inode, &handle);
if (retval)
return retval;
+ /*
+ * Find the extent closest to the start of the punch range. We don't
+ * check the return value because _goto() sets the current node to the
+ * next-lowest extent if 'start' is in a hole, and doesn't set a
+ * current node if there was a real error reading the extent tree.
+ * In that case, _get() will error out.
+ */
ext2fs_extent_goto(handle, start);
retval = ext2fs_extent_get(handle, EXT2_EXTENT_CURRENT, &extent);
if (retval)
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ext2fs/gen_bitmap64.c | 2 ++
lib/ext2fs/mkjournal.c | 13 ++++++++-----
lib/ext2fs/newdir.c | 8 ++++++--
lib/ext2fs/qcow2.c | 14 ++++++++++----
4 files changed, 26 insertions(+), 11 deletions(-)
diff --git a/lib/ext2fs/gen_bitmap64.c b/lib/ext2fs/gen_bitmap64.c
index 2880afa..36e0240 100644
--- a/lib/ext2fs/gen_bitmap64.c
+++ b/lib/ext2fs/gen_bitmap64.c
@@ -128,6 +128,7 @@ errcode_t ext2fs_alloc_generic_bmap(ext2_filsys fs, errcode_t magic,
if (gettimeofday(&bitmap->stats.created,
(struct timezone *) NULL) == -1) {
perror("gettimeofday");
+ ext2fs_free_mem(&bitmap);
return 1;
}
bitmap->stats.type = type;
@@ -300,6 +301,7 @@ errcode_t ext2fs_copy_generic_bmap(ext2fs_generic_bitmap src,
if (gettimeofday(&new_bmap->stats.created,
(struct timezone *) NULL) == -1) {
perror("gettimeofday");
+ ext2fs_free_mem(&new_bmap);
return 1;
}
new_bmap->stats.type = src->stats.type;
diff --git a/lib/ext2fs/mkjournal.c b/lib/ext2fs/mkjournal.c
index 1d5b1a7..09ca412 100644
--- a/lib/ext2fs/mkjournal.c
+++ b/lib/ext2fs/mkjournal.c
@@ -312,13 +312,15 @@ static errcode_t write_journal_inode(ext2_filsys fs, ext2_ino_t journal_ino,
return retval;
if ((retval = ext2fs_read_bitmaps(fs)))
- return retval;
+ goto out2;
if ((retval = ext2fs_read_inode(fs, journal_ino, &inode)))
- return retval;
+ goto out2;
- if (inode.i_blocks > 0)
- return EEXIST;
+ if (inode.i_blocks > 0) {
+ retval = EEXIST;
+ goto out2;
+ }
es.num_blocks = num_blocks;
es.newblocks = 0;
@@ -330,7 +332,7 @@ static errcode_t write_journal_inode(ext2_filsys fs, ext2_ino_t journal_ino,
if (fs->super->s_feature_incompat & EXT3_FEATURE_INCOMPAT_EXTENTS) {
inode.i_flags |= EXT4_EXTENTS_FL;
if ((retval = ext2fs_write_inode(fs, journal_ino, &inode)))
- return retval;
+ goto out2;
}
/*
@@ -398,6 +400,7 @@ static errcode_t write_journal_inode(ext2_filsys fs, ext2_ino_t journal_ino,
errout:
ext2fs_zero_blocks2(0, 0, 0, 0, 0);
+out2:
ext2fs_free_mem(&buf);
return retval;
}
diff --git a/lib/ext2fs/newdir.c b/lib/ext2fs/newdir.c
index d134bdf..44e4ca9 100644
--- a/lib/ext2fs/newdir.c
+++ b/lib/ext2fs/newdir.c
@@ -50,8 +50,10 @@ errcode_t ext2fs_new_dir_block(ext2_filsys fs, ext2_ino_t dir_ino,
csum_size = sizeof(struct ext2_dir_entry_tail);
retval = ext2fs_set_rec_len(fs, fs->blocksize - csum_size, dir);
- if (retval)
+ if (retval) {
+ ext2fs_free_mem(&buf);
return retval;
+ }
if (dir_ino) {
if (fs->super->s_feature_incompat &
@@ -72,8 +74,10 @@ errcode_t ext2fs_new_dir_block(ext2_filsys fs, ext2_ino_t dir_ino,
*/
dir = (struct ext2_dir_entry *) (buf + dir->rec_len);
retval = ext2fs_set_rec_len(fs, rec_len, dir);
- if (retval)
+ if (retval) {
+ ext2fs_free_mem(&buf);
return retval;
+ }
dir->inode = parent_ino;
ext2fs_dirent_set_name_len(dir, 2);
ext2fs_dirent_set_file_type(dir, filetype);
diff --git a/lib/ext2fs/qcow2.c b/lib/ext2fs/qcow2.c
index 8394270..547edc0 100644
--- a/lib/ext2fs/qcow2.c
+++ b/lib/ext2fs/qcow2.c
@@ -60,8 +60,10 @@ struct ext2_qcow2_hdr *qcow2_read_header(int fd)
return NULL;
memset(buffer, 0, sizeof(struct ext2_qcow2_hdr));
- if (ext2fs_llseek(fd, 0, SEEK_SET < 0))
+ if (ext2fs_llseek(fd, 0, SEEK_SET) < 0) {
+ ext2fs_free_mem(&buffer);
return NULL;
+ }
size = read(fd, buffer, sizeof(struct ext2_qcow2_hdr));
if (size != sizeof(struct ext2_qcow2_hdr)) {
@@ -91,8 +93,10 @@ static int qcow2_read_l1_table(struct ext2_qcow2_image *img)
if (ret)
return ret;
- if (ext2fs_llseek(fd, img->l1_offset, SEEK_SET) < 0)
+ if (ext2fs_llseek(fd, img->l1_offset, SEEK_SET) < 0) {
+ ext2fs_free_mem(&table);
return errno;
+ }
size = read(fd, table, l1_size);
if (size != l1_size) {
@@ -236,8 +240,10 @@ int qcow2_write_raw_image(int qcow2_fd, int raw_fd,
((char *)copy_buf)[0] = 0;
size = write(raw_fd, copy_buf, 1);
- if (size != 1)
- return errno;
+ if (size != 1) {
+ ret = errno;
+ goto out;
+ }
out:
if (copy_buf)
Zero is a valid file descriptor, so close it.
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ext2fs/mkjournal.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/ext2fs/mkjournal.c b/lib/ext2fs/mkjournal.c
index 09ca412..69ac135 100644
--- a/lib/ext2fs/mkjournal.c
+++ b/lib/ext2fs/mkjournal.c
@@ -595,7 +595,7 @@ errcode_t ext2fs_add_journal_inode(ext2_filsys fs, blk_t num_blocks, int flags)
ext2fs_mark_super_dirty(fs);
return 0;
errout:
- if (fd > 0)
+ if (fd >= 0)
close(fd);
return retval;
}
If we're using ext2fs_file_write() to write to a hole in a file,
ensure that we can actually allocate the block before updating i_size.
In other words, don't update i_size and don't return success if we hit
an error while allocating space.
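
The ordering can be sketched with a toy model. Everything here (struct
toy_file, toy_alloc(), toy_write()) is hypothetical and only mimics the
idea; it is not the libext2fs code. The block is mapped first, and the
size is touched only once the mapping has succeeded:

```c
#include <assert.h>
#include <errno.h>

#define TOY_NBLOCKS 4

/* Toy file: a block map (0 means "hole") plus a size in bytes. */
struct toy_file {
	unsigned long map[TOY_NBLOCKS];
	unsigned long size;
	unsigned long free_blocks;	/* blocks the toy allocator has left */
	unsigned long next_phys;	/* next physical block to hand out */
};

/* Hypothetical allocator: fails with ENOSPC when nothing is left. */
static int toy_alloc(struct toy_file *f, unsigned long *phys)
{
	if (f->free_blocks == 0)
		return ENOSPC;
	f->free_blocks--;
	*phys = f->next_phys++;
	return 0;
}

/*
 * Pretend to write into logical block 'lblk', extending the file to
 * 'new_size' bytes.  A hole is allocated before the size is updated.
 */
static int toy_write(struct toy_file *f, unsigned long lblk,
		     unsigned long new_size)
{
	int err;

	if (lblk >= TOY_NBLOCKS)
		return EFBIG;
	if (f->map[lblk] == 0) {
		err = toy_alloc(f, &f->map[lblk]);
		if (err)
			return err;	/* size is left untouched */
	}
	if (new_size > f->size)
		f->size = new_size;
	return 0;
}
```

With one free block left, a first write succeeds and a second write into
a different hole returns ENOSPC without changing the size.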
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ext2fs/fileio.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/lib/ext2fs/fileio.c b/lib/ext2fs/fileio.c
index d092e65..03bdf86 100644
--- a/lib/ext2fs/fileio.c
+++ b/lib/ext2fs/fileio.c
@@ -297,6 +297,20 @@ errcode_t ext2fs_file_write(ext2_file_t file, const void *buf,
if (retval)
goto fail;
+ /*
+ * OK, the physical block hasn't been allocated yet.
+ * Allocate it.
+ */
+ if (!file->physblock) {
+ retval = ext2fs_bmap2(fs, file->ino, &file->inode,
+ BMAP_BUFFER,
+ file->ino ? BMAP_ALLOC : 0,
+ file->blockno, 0,
+ &file->physblock);
+ if (retval)
+ goto fail;
+ }
+
file->flags |= EXT2_FILE_BUF_DIRTY;
memcpy(file->buf+start, ptr, c);
file->pos += c;
When deleting an entire extent, we cannot always slip to the previous
leaf extent because there might not /be/ a previous extent.
Attempting to correct for that error by asking for the 'current' leaf
extent also doesn't work, because the failed attempt to change to the
previous extent leaves us with no current extent.
Fix this problem by recording the lblk of the next extent before
deleting the current extent and _goto()ing to the next extent after
the deletion.
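
As a rough illustration of "remember the next lblk, delete, then goto",
here is a toy model in which a leaf is just a sorted array of extent
start blocks. The names (struct toy_leaf, toy_goto(),
toy_delete_current()) are made up for this sketch, and the real extent
handle is considerably more involved:

```c
#include <assert.h>

#define TOY_MAX 8

/* Toy leaf level: extent start blocks in ascending order. */
struct toy_leaf {
	unsigned long lblk[TOY_MAX];
	int count;
	int cur;		/* index of the current extent, or -1 for none */
};

/* Go to the extent starting at or just below 'lblk'; -1 if there is none. */
static int toy_goto(struct toy_leaf *t, unsigned long lblk)
{
	int i;

	t->cur = -1;
	for (i = 0; i < t->count && t->lblk[i] <= lblk; i++)
		t->cur = i;
	return t->cur < 0 ? -1 : 0;
}

/* Delete the current extent, then move to what used to be its successor. */
static int toy_delete_current(struct toy_leaf *t)
{
	unsigned long old_lblk, next_lblk;
	int i;

	if (t->cur < 0)
		return -1;
	old_lblk = t->lblk[t->cur];
	/* Remember where the next extent starts before anything moves. */
	next_lblk = (t->cur + 1 < t->count) ? t->lblk[t->cur + 1] : old_lblk;

	for (i = t->cur; i + 1 < t->count; i++)
		t->lblk[i] = t->lblk[i + 1];
	t->count--;

	/* May leave no current extent if the leaf is now empty. */
	toy_goto(t, next_lblk);
	return 0;
}
```

Deleting the first extent never needs a "previous" extent in this model,
which is the case that tripped up the old code.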
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ext2fs/punch.c | 41 ++++++++++++++++++++++++++++++-----------
1 file changed, 30 insertions(+), 11 deletions(-)
diff --git a/lib/ext2fs/punch.c b/lib/ext2fs/punch.c
index ceec336..9dfba2e 100644
--- a/lib/ext2fs/punch.c
+++ b/lib/ext2fs/punch.c
@@ -267,22 +267,41 @@ static errcode_t ext2fs_punch_extent(ext2_filsys fs, ext2_ino_t ino,
retval = ext2fs_extent_replace(handle, 0, &extent);
} else {
struct ext2fs_extent newex;
+ blk64_t old_lblk, next_lblk;
dbg_printf("deleting current extent%s\n", "");
- retval = ext2fs_extent_delete(handle, 0);
- if (retval)
- goto errout;
+
/*
- * We just moved the next extent into the current
- * extent's position, so re-read the extent next time.
+ * Save the location of the next leaf, then slip
+ * back to the current extent.
*/
+ retval = ext2fs_extent_get(handle, EXT2_EXTENT_CURRENT,
+ &newex);
+ if (retval)
+ goto errout;
+ old_lblk = newex.e_lblk;
+
retval = ext2fs_extent_get(handle,
- EXT2_EXTENT_PREV_LEAF,
+ EXT2_EXTENT_NEXT_LEAF,
&newex);
- /* Can't go back? Just reread current. */
- if (retval == EXT2_ET_EXTENT_NO_PREV) {
- retval = 0;
- op = EXT2_EXTENT_CURRENT;
- }
+ if (retval == EXT2_ET_EXTENT_NO_NEXT)
+ next_lblk = old_lblk;
+ else if (retval)
+ goto errout;
+ else
+ next_lblk = newex.e_lblk;
+
+ retval = ext2fs_extent_goto(handle, old_lblk);
+ if (retval)
+ goto errout;
+
+ /* Now delete the extent. */
+ retval = ext2fs_extent_delete(handle, 0);
+ if (retval)
+ goto errout;
+
+ /* Jump forward to the next extent. */
+ ext2fs_extent_goto(handle, next_lblk);
+ op = EXT2_EXTENT_CURRENT;
}
if (retval)
goto errout;
If we're asked to punch a file with no data blocks mapped to it and a
non-zero length, we don't need to do any work in ext2fs_punch_extent()
and can return success. Unfortunately, the extent_get() function
returns "no current node" because it (correctly) failed to find any
extents, which is bubbled up to callers. Since finding no extents is
not an error in this corner case, fix up ext2fs_punch_extent() to
return 0 to callers.
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ext2fs/punch.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/lib/ext2fs/punch.c b/lib/ext2fs/punch.c
index 9dfba2e..ff051f7 100644
--- a/lib/ext2fs/punch.c
+++ b/lib/ext2fs/punch.c
@@ -198,10 +198,17 @@ static errcode_t ext2fs_punch_extent(ext2_filsys fs, ext2_ino_t ino,
* next-lowest extent if 'start' is in a hole, and doesn't set a
* current node if there was a real error reading the extent tree.
* In that case, _get() will error out.
+ *
+ * Note: If _get() returns 'no current node', that simply means that
+ * there aren't any blocks mapped past this point in the file, so we're
+ * done.
*/
ext2fs_extent_goto(handle, start);
retval = ext2fs_extent_get(handle, EXT2_EXTENT_CURRENT, &extent);
- if (retval)
+ if (retval == EXT2_ET_NO_CURRENT_NODE) {
+ retval = 0;
+ goto errout;
+ } else if (retval)
goto errout;
while (1) {
op = EXT2_EXTENT_NEXT_LEAF;
When we're rehashing directories, it's possible that an extent block
(or a map block) could be (silently) allocated by the underlying
libext2fs when expanding the directory. This silent allocation is not
captured in block_found_map, which is disastrous if later the rehash
process expands another directory and uses that same block from
before without realizing that it's now in use.
Therefore, if we notice that the free block count has dropped by more
than what e2fsck allocated itself during the expansion, we iterate the
directory's blocks a second time to ensure that these silent
allocations are marked in the found blocks bitmap.
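
The detection itself boils down to one comparison. A small sketch, with
the hypothetical name toy_needs_rescan() (the patch does this inline in
e2fsck_expand_directory()):

```c
#include <assert.h>

/*
 * Return nonzero if more blocks disappeared from the free count than
 * the caller allocated itself, i.e. the library must have allocated
 * some blocks (extent nodes, map blocks) behind the caller's back.
 */
static int toy_needs_rescan(unsigned long long free_before,
			    unsigned long long free_after,
			    unsigned long long own_allocs)
{
	assert(free_after <= free_before);
	return (free_before - free_after) > own_allocs;
}
```

For example, if the free count goes from 100 to 97 while e2fsck itself
allocated only 2 blocks, one block was allocated silently and the
directory's blocks need a second pass.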
Signed-off-by: Darrick J. Wong <[email protected]>
---
e2fsck/pass3.c | 39 ++++++++++++++++++++++++++++++++++++++-
1 file changed, 38 insertions(+), 1 deletion(-)
diff --git a/e2fsck/pass3.c b/e2fsck/pass3.c
index 6989f17..dc9d7c1 100644
--- a/e2fsck/pass3.c
+++ b/e2fsck/pass3.c
@@ -758,6 +758,27 @@ static int expand_dir_proc(ext2_filsys fs,
return BLOCK_CHANGED;
}
+/*
+ * Ensure that all blocks are marked in the block_found_map, since it's
+ * possible that the library allocated an extent node block or a block map
+ * block during the directory rebuilding; these new allocations are not
+ * captured in block_found_map. This is bad since we could later use
+ * block_found_map to allocate more blocks.
+ */
+static int find_new_blocks_proc(ext2_filsys fs,
+ blk64_t *blocknr,
+ e2_blkcnt_t blockcnt,
+ blk64_t ref_block EXT2FS_ATTR((unused)),
+ int ref_offset EXT2FS_ATTR((unused)),
+ void *priv_data)
+{
+ struct expand_dir_struct *es = (struct expand_dir_struct *) priv_data;
+ e2fsck_t ctx = es->ctx;
+
+ ext2fs_mark_block_bitmap2(ctx->block_found_map, *blocknr);
+ return 0;
+}
+
errcode_t e2fsck_expand_directory(e2fsck_t ctx, ext2_ino_t dir,
int num, int guaranteed_size)
{
@@ -765,7 +786,7 @@ errcode_t e2fsck_expand_directory(e2fsck_t ctx, ext2_ino_t dir,
errcode_t retval;
struct expand_dir_struct es;
struct ext2_inode inode;
- blk64_t sz;
+ blk64_t sz, before, after;
if (!(fs->flags & EXT2_FLAG_RW))
return EXT2_ET_RO_FILSYS;
@@ -788,11 +809,27 @@ errcode_t e2fsck_expand_directory(e2fsck_t ctx, ext2_ino_t dir,
es.ctx = ctx;
es.dir = dir;
+ before = ext2fs_free_blocks_count(fs->super);
retval = ext2fs_block_iterate3(fs, dir, BLOCK_FLAG_APPEND,
0, expand_dir_proc, &es);
if (es.err)
return es.err;
+ after = ext2fs_free_blocks_count(fs->super);
+
+ /*
+ * If the free block count has dropped by more than the blocks we
+ * allocated ourselves, then we must've allocated some extent/map
+ * blocks. Therefore, we must iterate this dir's blocks again to
+ * ensure that all newly allocated blocks are captured in
+ * block_found_map.
+ */
+ if ((before - after) > es.newblocks) {
+ retval = ext2fs_block_iterate3(fs, dir, BLOCK_FLAG_READ_ONLY,
+ 0, find_new_blocks_proc, &es);
+ if (es.err)
+ return es.err;
+ }
/*
* Update the size and block count fields in the inode.
When we set the file size, find the block containing EOF, and zero
everything in that block past EOF so that we can't return stale data
if we ever use fallocate or truncate to lengthen the file.
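
The core of the operation is a memset over the tail of the block that
holds EOF. A minimal in-memory sketch, assuming the block has already
been read into a buffer (toy_zero_past_eof() is a hypothetical name; the
real function also has to look up, read, and write back the block):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Zero the part of an in-memory block that lies past 'size' bytes of
 * file data.  'block' holds the block containing EOF.  Returns the
 * number of bytes zeroed (0 if EOF falls on a block boundary).
 */
static size_t toy_zero_past_eof(unsigned char *block, size_t blocksize,
				unsigned long long size)
{
	size_t off;

	assert(blocksize > 0);
	off = (size_t)(size % blocksize);
	if (off == 0)
		return 0;
	memset(block + off, 0, blocksize - off);
	return blocksize - off;
}
```

With a 16-byte "block" and a file size of 21, bytes 0-4 of the last
block are kept and bytes 5-15 are zeroed.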
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ext2fs/fileio.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 50 insertions(+)
diff --git a/lib/ext2fs/fileio.c b/lib/ext2fs/fileio.c
index 03bdf86..582b306 100644
--- a/lib/ext2fs/fileio.c
+++ b/lib/ext2fs/fileio.c
@@ -393,6 +393,52 @@ ext2_off_t ext2fs_file_get_size(ext2_file_t file)
return size;
}
+/* Zero the parts of the last block that are past EOF. */
+errcode_t ext2fs_file_zero_past_offset(ext2_file_t file, ext2_off64_t offset)
+{
+ ext2_filsys fs = file->fs;
+ char *b = NULL;
+ ext2_off64_t off = offset % fs->blocksize;
+ blk64_t blk;
+ int ret_flags;
+ errcode_t retval;
+
+ if (off == 0)
+ return 0;
+
+ retval = sync_buffer_position(file);
+ if (retval)
+ return retval;
+
+ /* Is there an initialized block at the end? */
+ retval = ext2fs_bmap2(fs, file->ino, NULL, NULL, 0,
+ offset / fs->blocksize, &ret_flags, &blk);
+ if (retval)
+ return retval;
+ if ((blk == 0) || (ret_flags & BMAP_RET_UNINIT))
+ return 0;
+
+ /* Zero to the end of the block */
+ retval = ext2fs_get_mem(fs->blocksize, &b);
+ if (retval)
+ return retval;
+
+ /* Read/zero/write block */
+ retval = io_channel_read_blk64(fs->io, blk, 1, b);
+ if (retval)
+ goto out;
+
+ memset(b + off, 0, fs->blocksize - off);
+
+ retval = io_channel_write_blk64(fs->io, blk, 1, b);
+ if (retval)
+ goto out;
+
+out:
+ ext2fs_free_mem(&b);
+ return retval;
+}
+
/*
* This function sets the size of the file, truncating it if necessary
*
@@ -434,6 +480,10 @@ errcode_t ext2fs_file_set_size2(ext2_file_t file, ext2_off64_t size)
return retval;
}
+ retval = ext2fs_file_zero_past_offset(file, size);
+ if (retval)
+ return retval;
+
if (truncate_block >= old_truncate)
return 0;
If ext2fs_descriptor_block_loc2() is called with a meta_bg filesystem
and group_block is not the normal value, the function will return the
location of the backup group descriptor block in the next block group.
Unfortunately, it fails to account for the possibility that the backup
group contains a backup superblock but the regular superblock does
not. This is the case with block groups 48-49 on a meta_bg fs with 1k
blocks; in this case, libext2fs will fail to open the filesystem.
Therefore, teach the function to adjust for superblocks in the backup
group, if necessary.
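
To make the 48-49 case concrete, here is a simplified model of the
calculation. It assumes sparse_super, 8192 blocks per group,
s_first_data_block = 1, and 16 descriptors per 1k block (i.e. 64-byte
descriptors); those numbers are assumptions for the example, and the
end-of-filesystem bounds check is left out. The toy_* names are
hypothetical:

```c
#include <assert.h>

/* Is 'a' a power of 'b'?  (1 counts as b^0.) */
static int toy_is_power_of(unsigned long a, unsigned long b)
{
	while (a > 1 && a % b == 0)
		a /= b;
	return a == 1;
}

/* sparse_super rule: groups 0, 1 and powers of 3, 5 and 7 hold a superblock. */
static int toy_bg_has_super(unsigned long group)
{
	if (group <= 1)
		return 1;
	return toy_is_power_of(group, 3) || toy_is_power_of(group, 5) ||
	       toy_is_power_of(group, 7);
}

/*
 * Location of descriptor block 'i' in the meta_bg region.  With
 * 'use_backup' set, return the copy in the following group instead.
 */
static unsigned long long toy_desc_block_loc(unsigned long first_data_block,
					     unsigned long blocks_per_group,
					     unsigned long desc_per_block,
					     unsigned long i, int use_backup)
{
	unsigned long bg = desc_per_block * i;
	unsigned long long blk;

	if (use_backup)
		bg++;	/* the backup lives in the next group */
	blk = first_data_block + (unsigned long long)bg * blocks_per_group;
	/* Step over this group's superblock copy, if it has one. */
	return blk + toy_bg_has_super(bg);
}
```

Group 48 has no superblock but group 49 (7 squared) does, so the backup
descriptor block sits one block past the start of group 49, at 401410.
Using group 48's has_super value for both groups would yield 401409,
which is the backup superblock itself.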
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ext2fs/openfs.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/lib/ext2fs/openfs.c b/lib/ext2fs/openfs.c
index b2a8abb..92d9e40 100644
--- a/lib/ext2fs/openfs.c
+++ b/lib/ext2fs/openfs.c
@@ -47,7 +47,7 @@ blk64_t ext2fs_descriptor_block_loc2(ext2_filsys fs, blk64_t group_block,
bg = EXT2_DESC_PER_BLOCK(fs->super) * i;
if (ext2fs_bg_has_super(fs, bg))
has_super = 1;
- ret_blk = ext2fs_group_first_block2(fs, bg) + has_super;
+ ret_blk = ext2fs_group_first_block2(fs, bg);
/*
* If group_block is not the normal value, we're trying to use
* the backup group descriptors and superblock --- so use the
@@ -57,10 +57,21 @@ blk64_t ext2fs_descriptor_block_loc2(ext2_filsys fs, blk64_t group_block,
* have the infrastructure in place to do that.
*/
if (group_block != fs->super->s_first_data_block &&
- ((ret_blk + fs->super->s_blocks_per_group) <
- ext2fs_blocks_count(fs->super)))
+ ((ret_blk + has_super + fs->super->s_blocks_per_group) <
+ ext2fs_blocks_count(fs->super))) {
ret_blk += fs->super->s_blocks_per_group;
- return ret_blk;
+
+ /*
+ * If we're going to jump forward a block group, make sure
+ * that we adjust has_super to account for the next group's
+ * backup superblock (or lack thereof).
+ */
+ if (ext2fs_bg_has_super(fs, bg + 1))
+ has_super = 1;
+ else
+ has_super = 0;
+ }
+ return ret_blk + has_super;
}
blk_t ext2fs_descriptor_block_loc(ext2_filsys fs, blk_t group_block, dgrp_t i)
On a filesystem with 1K blocks and meta_bg enabled, opening a
filesystem with automatic superblock detection tries to compensate for
the fact that the superblock lives in block 1. However, the method by
which this is done is later misinterpreted to mean "read the backup
group descriptors", which is not what we want in this case.
Therefore, in ext2fs_open2() separate the 'group zero' adjustment into
its own variable so that we don't get fed backup group descriptors
when we try to load meta_bg group descriptors.
Furthermore, enhance ext2fs_descriptor_block_loc2() to perform its own
group zero correction. The other caller of this function neglects to
do any group-zero correction of its own, so this fixes it too.
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ext2fs/ext2fs.h | 5 +++++
lib/ext2fs/openfs.c | 30 +++++++++++++++++++++++++-----
2 files changed, 30 insertions(+), 5 deletions(-)
diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index 0624350..edd5ee9 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -1449,6 +1449,11 @@ extern errcode_t ext2fs_open2(const char *name, const char *io_options,
int flags, int superblock,
unsigned int block_size, io_manager manager,
ext2_filsys *ret_fs);
+/*
+ * The dgrp_t argument to these two functions is not actually a group number
+ * but a block number offset within a group table! Convert with the formula
+ * (group_number / groups_per_block).
+ */
extern blk64_t ext2fs_descriptor_block_loc2(ext2_filsys fs,
blk64_t group_block, dgrp_t i);
extern blk_t ext2fs_descriptor_block_loc(ext2_filsys fs, blk_t group_block,
diff --git a/lib/ext2fs/openfs.c b/lib/ext2fs/openfs.c
index 92d9e40..5cf6ae4 100644
--- a/lib/ext2fs/openfs.c
+++ b/lib/ext2fs/openfs.c
@@ -37,12 +37,19 @@ blk64_t ext2fs_descriptor_block_loc2(ext2_filsys fs, blk64_t group_block,
dgrp_t i)
{
int bg;
- int has_super = 0;
+ int has_super = 0, group_zero_adjust = 0;
blk64_t ret_blk;
+ /*
+ * On a bigalloc FS with 1K blocks, block 0 is reserved for non-ext4
+ * stuff, so adjust for that if we're being asked for group 0.
+ */
+ if (i == 0 && fs->blocksize == 1024 && EXT2FS_CLUSTER_RATIO(fs) > 1)
+ group_zero_adjust = 1;
+
if (!(fs->super->s_feature_incompat & EXT2_FEATURE_INCOMPAT_META_BG) ||
(i < fs->super->s_first_meta_bg))
- return (group_block + i + 1);
+ return group_block + i + 1 + group_zero_adjust;
bg = EXT2_DESC_PER_BLOCK(fs->super) * i;
if (ext2fs_bg_has_super(fs, bg))
@@ -71,7 +78,7 @@ blk64_t ext2fs_descriptor_block_loc2(ext2_filsys fs, blk64_t group_block,
else
has_super = 0;
}
- return ret_blk + has_super;
+ return ret_blk + has_super + group_zero_adjust;
}
blk_t ext2fs_descriptor_block_loc(ext2_filsys fs, blk_t group_block, dgrp_t i)
@@ -113,6 +120,7 @@ errcode_t ext2fs_open2(const char *name, const char *io_options,
unsigned int blocks_per_group, io_flags;
blk64_t group_block, blk;
char *dest, *cp;
+ int group_zero_adjust = 0;
#ifdef WORDS_BIGENDIAN
unsigned int groups_per_block;
struct ext2_group_desc *gdp;
@@ -380,8 +388,19 @@ errcode_t ext2fs_open2(const char *name, const char *io_options,
goto cleanup;
if (!group_block)
group_block = fs->super->s_first_data_block;
+ /*
+ * On a FS with a 1K blocksize, block 0 is reserved for bootloaders
+ * so we must increment block numbers to any group 0 items.
+ *
+ * However, we cannot touch group_block directly because in the meta_bg
+ * case, the ext2fs_descriptor_block_loc2() function will interpret
+ * group_block != s_first_data_block to mean that we want to access the
+ * backup group descriptors. This is not what we want if the caller
+ * set superblock == 0 (i.e. auto-detect the superblock), which is
+ * what's going on here.
+ */
if (group_block == 0 && fs->blocksize == 1024)
- group_block = 1; /* Deal with 1024 blocksize && bigalloc */
+ group_zero_adjust = 1;
dest = (char *) fs->group_desc;
#ifdef WORDS_BIGENDIAN
groups_per_block = EXT2_DESC_PER_BLOCK(fs->super);
@@ -391,7 +410,8 @@ errcode_t ext2fs_open2(const char *name, const char *io_options,
else
first_meta_bg = fs->desc_blocks;
if (first_meta_bg) {
- retval = io_channel_read_blk(fs->io, group_block+1,
+ retval = io_channel_read_blk(fs->io, group_block +
+ group_zero_adjust + 1,
first_meta_bg, dest);
if (retval)
goto cleanup;
The kernel[1] and e2fsck[2] both react to a BLOCK_UNINIT group by
calculating the block bitmap that's needed to show all the group
blocks for that group (if any) and using that. However, when reading
bitmaps from disk, libext2fs simply imports a block of zeroes into the
bitmap, without bothering to check for group blocks. This erroneous
behavior results in the filesystem having a block bitmap that does not
accurately reflect disk contents, and worse yet makes it seem as
though superblocks, group descriptors, bitmaps, and inode tables are
"free" space on disk.
So, fix the block bitmap loading routines to calculate the correct
block bitmap for all groups and load it into the main fs block bitmap.
This also fixes bogus debugfs output such as:
Group 1: (Blocks 8193-16384) [INODE_UNINIT, BLOCK_UNINIT]
Checksum 0x1310, unused inodes 512
Backup superblock at 8193, Group descriptors at 8194-8217
Reserved GDT blocks at 8218-8473
Block bitmap at 283 (bg #0 + 282), Inode bitmap at 299 (bg #0 + 298)
Inode table at 442-569 (bg #0 + 441)
7911 free blocks, 512 free inodes, 0 directories, 512 unused inodes
Free blocks: 8193-16384
Free inodes: 513-1024
Notice how the "free blocks" range includes the backup sb & GDT area
and doesn't match the free block count.
Worse yet, debugfs' testb command will report those group descriptor
blocks as not being in use unless the user also instructs debugfs to
find a free block first. That is a rather surprising result:
debugfs: testb 8194
Block 8194 not in use
debugfs: ffb 1 16380
Free blocks found: 16380
debugfs: testb 8194
Block 8194 marked in use
Also, remove the part of check_block_uninit() that "fixes" the bitmap
since we're doing that at bitmap load time now.
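
As a sanity check on the numbers in the Group 1 listing above, the
following toy program builds the bitmap the way the commit message
describes: start from zeroes, then mark the group metadata. It uses a
plain byte array rather than the libext2fs bitmap API, and the toy_*
names are hypothetical; the block numbers are copied from the debugfs
output:

```c
#include <assert.h>
#include <string.h>

#define GRP_FIRST	8193UL		/* first block of group 1 */
#define GRP_BLOCKS	8192UL		/* blocks per group */

/* Mark 'num' blocks starting at 'blk' in use; ignore blocks outside group 1. */
static void toy_mark_range(unsigned char *in_use, unsigned long blk,
			   unsigned long num)
{
	unsigned long b;

	for (b = blk; b < blk + num; b++)
		if (b >= GRP_FIRST && b < GRP_FIRST + GRP_BLOCKS)
			in_use[b - GRP_FIRST] = 1;
}

/* Build the bitmap for BLOCK_UNINIT group 1 and count its free blocks. */
static unsigned long toy_group1_free_blocks(void)
{
	static unsigned char in_use[GRP_BLOCKS];
	unsigned long i, nfree = 0;

	memset(in_use, 0, sizeof(in_use));	/* the all-zero starting point */
	toy_mark_range(in_use, 8193, 1);	/* backup superblock */
	toy_mark_range(in_use, 8194, 24);	/* group descriptors 8194-8217 */
	toy_mark_range(in_use, 8218, 256);	/* reserved GDT blocks 8218-8473 */
	toy_mark_range(in_use, 283, 1);		/* block bitmap (lives in group 0) */
	toy_mark_range(in_use, 299, 1);		/* inode bitmap (lives in group 0) */
	toy_mark_range(in_use, 442, 128);	/* inode table 442-569 (group 0) */

	for (i = 0; i < GRP_BLOCKS; i++)
		if (!in_use[i])
			nfree++;
	return nfree;
}
```

The result is 8192 - (1 + 24 + 256) = 7911, which agrees with the
"7911 free blocks" in the group descriptor and not with the bogus
"Free blocks: 8193-16384" range.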
[1] kernel git 717d50e4971b81b96c0199c91cdf0039a8cb181a
"Ext4: Uninitialized Block Groups"
[2] e2fsprogs git f5fa20078bfc05b554294fe9c5505375d7913e8c
"Add support for EXT2_FEATURE_COMPAT_LAZY_BG"
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ext2fs/alloc.c | 17 +----------------
lib/ext2fs/rw_bitmaps.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 46 insertions(+), 16 deletions(-)
diff --git a/lib/ext2fs/alloc.c b/lib/ext2fs/alloc.c
index ce72ffe..39ef24f 100644
--- a/lib/ext2fs/alloc.c
+++ b/lib/ext2fs/alloc.c
@@ -51,23 +51,8 @@ static void check_block_uninit(ext2_filsys fs, ext2fs_block_bitmap map,
else
old_desc_blocks = fs->desc_blocks + fs->super->s_reserved_gdt_blocks;
- for (i=0; i < fs->super->s_blocks_per_group; i++, blk++)
- ext2fs_fast_unmark_block_bitmap2(map, blk);
+ /* uninit block bitmaps are now initialized in read_bitmaps() */
- blk = ext2fs_group_first_block2(fs, group);
- for (i=0; i < fs->super->s_blocks_per_group; i++, blk++) {
- if ((blk == super_blk) ||
- (old_desc_blk && old_desc_blocks &&
- (blk >= old_desc_blk) &&
- (blk < old_desc_blk + old_desc_blocks)) ||
- (new_desc_blk && (blk == new_desc_blk)) ||
- (blk == ext2fs_block_bitmap_loc(fs, group)) ||
- (blk == ext2fs_inode_bitmap_loc(fs, group)) ||
- (blk >= ext2fs_inode_table_loc(fs, group) &&
- (blk < ext2fs_inode_table_loc(fs, group)
- + fs->inode_blocks_per_group)))
- ext2fs_fast_mark_block_bitmap2(map, blk);
- }
ext2fs_bg_flags_clear(fs, group, EXT2_BG_BLOCK_UNINIT);
ext2fs_group_desc_csum_set(fs, group);
ext2fs_mark_super_dirty(fs);
diff --git a/lib/ext2fs/rw_bitmaps.c b/lib/ext2fs/rw_bitmaps.c
index 386cbeb..ad6cfc1 100644
--- a/lib/ext2fs/rw_bitmaps.c
+++ b/lib/ext2fs/rw_bitmaps.c
@@ -156,6 +156,43 @@ errout:
return retval;
}
+static errcode_t mark_uninit_bg_group_blocks(ext2_filsys fs)
+{
+ dgrp_t i;
+ blk64_t blk;
+ ext2fs_block_bitmap bmap = fs->block_map;
+
+ for (i = 0; i < fs->group_desc_count; i++) {
+ if (!ext2fs_bg_flags_test(fs, i, EXT2_BG_BLOCK_UNINIT))
+ continue;
+
+ ext2fs_reserve_super_and_bgd(fs, i, bmap);
+
+ /*
+ * Mark the blocks used for the inode table
+ */
+ blk = ext2fs_inode_table_loc(fs, i);
+ if (blk)
+ ext2fs_mark_block_bitmap_range2(bmap, blk,
+ fs->inode_blocks_per_group);
+
+ /*
+ * Mark block used for the block bitmap
+ */
+ blk = ext2fs_block_bitmap_loc(fs, i);
+ if (blk)
+ ext2fs_mark_block_bitmap2(bmap, blk);
+
+ /*
+ * Mark block used for the inode bitmap
+ */
+ blk = ext2fs_inode_bitmap_loc(fs, i);
+ if (blk)
+ ext2fs_mark_block_bitmap2(bmap, blk);
+ }
+ return 0;
+}
+
static errcode_t read_bitmaps(ext2_filsys fs, int do_inode, int do_block)
{
dgrp_t i;
@@ -320,6 +357,14 @@ static errcode_t read_bitmaps(ext2_filsys fs, int do_inode, int do_block)
ino_itr += inode_nbytes << 3;
}
}
+
+ /* Mark group blocks for any BLOCK_UNINIT groups */
+ if (do_block) {
+ retval = mark_uninit_bg_group_blocks(fs);
+ if (retval)
+ goto cleanup;
+ }
+
success_cleanup:
if (inode_bitmap)
ext2fs_free_mem(&inode_bitmap);
Since libext2fs now detects a BLOCK_UNINIT group and calculates the
group's block bitmap, we no longer need to emulate this behavior in
e2fsck. We can simply compare the found block map against the
filesystem's, and proceed from there.
Signed-off-by: Darrick J. Wong <[email protected]>
---
e2fsck/pass5.c | 103 +++++++-------------------------------------------------
1 file changed, 12 insertions(+), 91 deletions(-)
diff --git a/e2fsck/pass5.c b/e2fsck/pass5.c
index 498c041..6d7b968 100644
--- a/e2fsck/pass5.c
+++ b/e2fsck/pass5.c
@@ -326,7 +326,6 @@ static void check_block_bitmaps(e2fsck_t ctx)
int fixit, had_problem;
errcode_t retval;
int csum_flag;
- int skip_group = 0;
int old_desc_blocks = 0;
int count = 0;
int cmp_block = 0;
@@ -378,9 +377,6 @@ redo_counts:
had_problem = 0;
save_problem = 0;
pctx.blk = pctx.blk2 = NO_BLK;
- if (csum_flag &&
- (ext2fs_bg_flags_test(fs, group, EXT2_BG_BLOCK_UNINIT)))
- skip_group++;
for (i = B2C(fs->super->s_first_data_block);
i < ext2fs_blocks_count(fs->super);
i += EXT2FS_CLUSTER_RATIO(fs)) {
@@ -411,15 +407,11 @@ redo_counts:
actual_buf);
if (retval)
goto no_optimize;
- if (ext2fs_bg_flags_test(fs, group, EXT2_BG_BLOCK_UNINIT))
- memset(bitmap_buf, 0, nbytes);
- else {
- retval = ext2fs_get_block_bitmap_range2(fs->block_map,
- B2C(i), fs->super->s_clusters_per_group,
- bitmap_buf);
- if (retval)
- goto no_optimize;
- }
+ retval = ext2fs_get_block_bitmap_range2(fs->block_map,
+ B2C(i), fs->super->s_clusters_per_group,
+ bitmap_buf);
+ if (retval)
+ goto no_optimize;
if (memcmp(actual_buf, bitmap_buf, nbytes) != 0)
goto no_optimize;
n = ext2fs_bitcount(actual_buf, nbytes);
@@ -429,73 +421,7 @@ redo_counts:
goto next_group;
no_optimize:
- if (skip_group) {
- if (first_block_in_bg) {
- super_blk = 0;
- old_desc_blk = 0;
- new_desc_blk = 0;
- ext2fs_super_and_bgd_loc2(fs, group, &super_blk,
- &old_desc_blk, &new_desc_blk, 0);
-
- if (fs->super->s_feature_incompat &
- EXT2_FEATURE_INCOMPAT_META_BG)
- old_desc_blocks =
- fs->super->s_first_meta_bg;
- else
- old_desc_blocks = fs->desc_blocks +
- fs->super->s_reserved_gdt_blocks;
-
- count = 0;
- cmp_block = fs->super->s_clusters_per_group;
- if (group == (int)fs->group_desc_count - 1)
- cmp_block = EXT2FS_NUM_B2C(fs,
- ext2fs_group_blocks_count(fs, group));
- }
-
- bitmap = 0;
- if (EQ_CLSTR(i, super_blk) ||
- (old_desc_blk && old_desc_blocks &&
- GE_CLSTR(i, old_desc_blk) &&
- LE_CLSTR(i, old_desc_blk + old_desc_blocks-1)) ||
- (new_desc_blk && EQ_CLSTR(i, new_desc_blk)) ||
- EQ_CLSTR(i, ext2fs_block_bitmap_loc(fs, group)) ||
- EQ_CLSTR(i, ext2fs_inode_bitmap_loc(fs, group)) ||
- (GE_CLSTR(i, ext2fs_inode_table_loc(fs, group)) &&
- LE_CLSTR(i, (ext2fs_inode_table_loc(fs, group) +
- fs->inode_blocks_per_group - 1)))) {
- bitmap = 1;
- actual = (actual != 0);
- count++;
- cmp_block--;
- } else if ((EXT2FS_B2C(fs, i) - count -
- EXT2FS_B2C(fs, fs->super->s_first_data_block)) %
- fs->super->s_clusters_per_group == 0) {
- /*
- * When the compare data blocks in block bitmap
- * are 0, count the free block,
- * skip the current block group.
- */
- if (ext2fs_test_block_bitmap_range2(
- ctx->block_found_map,
- EXT2FS_B2C(fs, i),
- cmp_block)) {
- /*
- * -1 means to skip the current block
- * group.
- */
- blocks = fs->super->s_clusters_per_group - 1;
- group_free = cmp_block;
- free_blocks += cmp_block;
- /*
- * The current block group's last block
- * is set to i.
- */
- i += EXT2FS_C2B(fs, cmp_block - 1);
- bitmap = 1;
- goto do_counts;
- }
- }
- } else if (redo_flag)
+ if (redo_flag)
bitmap = actual;
else
bitmap = ext2fs_fast_test_block_bitmap2(fs->block_map, i);
@@ -514,14 +440,15 @@ redo_counts:
*/
problem = PR_5_BLOCK_USED;
- if (skip_group) {
+ if (ext2fs_bg_flags_test(fs, group,
+ EXT2_BG_BLOCK_UNINIT)) {
struct problem_context pctx2;
pctx2.blk = i;
pctx2.group = group;
- if (fix_problem(ctx, PR_5_BLOCK_UNINIT,&pctx2)){
- ext2fs_bg_flags_clear(fs, group, EXT2_BG_BLOCK_UNINIT);
- skip_group = 0;
- }
+ if (fix_problem(ctx, PR_5_BLOCK_UNINIT,
+ &pctx2))
+ ext2fs_bg_flags_clear(fs, group,
+ EXT2_BG_BLOCK_UNINIT);
}
}
if (pctx.blk == NO_BLK) {
@@ -575,16 +502,10 @@ redo_counts:
group ++;
blocks = 0;
group_free = 0;
- skip_group = 0;
if (ctx->progress)
if ((ctx->progress)(ctx, 5, group,
fs->group_desc_count*2))
goto errout;
- if (csum_flag &&
- (i != ext2fs_blocks_count(fs->super)-1) &&
- ext2fs_bg_flags_test(fs, group,
- EXT2_BG_BLOCK_UNINIT))
- skip_group++;
}
}
if (pctx.blk != NO_BLK)
Since the beginning of the uninit_bg feature, the kernel[1] and
e2fsck[2] have always been careful to detect the presence of the
BLOCK_UNINIT flag, and compute a block bitmap with any group metadata
blocks marked in that bitmap. With that in mind, I think it's safe to
say that this is a design feature of uninit_bg.
Now that we've trained libext2fs to have this same behavior whenever
it's loading a block bitmap, we no longer need to unset BLOCK_UNINIT
for a group that contains only its own group metadata -- the kernel,
e2fsck, and libext2fs will handle this correctly.
[1] kernel git 717d50e4971b81b96c0199c91cdf0039a8cb181a
"Ext4: Uninitialized Block Groups"
[2] e2fsprogs git f5fa20078bfc05b554294fe9c5505375d7913e8c
"Add support for EXT2_FEATURE_COMPAT_LAZY_BG"
Reported-by: Akira Fujita <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ext2fs/alloc_sb.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/lib/ext2fs/alloc_sb.c b/lib/ext2fs/alloc_sb.c
index 223ec51..8788c00 100644
--- a/lib/ext2fs/alloc_sb.c
+++ b/lib/ext2fs/alloc_sb.c
@@ -65,8 +65,6 @@ int ext2fs_reserve_super_and_bgd(ext2_filsys fs,
ext2fs_mark_block_bitmap2(bmap, 0);
if (old_desc_blk) {
- if (fs->super->s_reserved_gdt_blocks && fs->block_map == bmap)
- ext2fs_bg_flags_clear(fs, group, EXT2_BG_BLOCK_UNINIT);
num_blocks = old_desc_blocks;
if (old_desc_blk + num_blocks >= ext2fs_blocks_count(fs->super))
num_blocks = ext2fs_blocks_count(fs->super) -
Now that libext2fs marks group metadata in the fs block bitmap, adjust
the expected test output to reflect expanded use of block_uninit and
the fact that debugfs no longer prints block bitmap data that fails to
account for group metadata blocks.
Signed-off-by: Darrick J. Wong <[email protected]>
---
tests/m_bigjournal/expect.1 | 16 ++++++++--------
tests/m_uninit/expect.1 | 28 ++++++++++++++--------------
2 files changed, 22 insertions(+), 22 deletions(-)
diff --git a/tests/m_bigjournal/expect.1 b/tests/m_bigjournal/expect.1
index 312c276..b45d02c 100644
--- a/tests/m_bigjournal/expect.1
+++ b/tests/m_bigjournal/expect.1
@@ -55,7 +55,7 @@ Group 0: (Blocks 0-32767)
31836 free blocks, 5 free inodes, 2 directories, 5 unused inodes
Free blocks: 764-1184, 1269-1696, 1781-32767
Free inodes: 12-16
-Group 1: (Blocks 32768-65535) [INODE_UNINIT]
+Group 1: (Blocks 32768-65535) [INODE_UNINIT, BLOCK_UNINIT]
Backup superblock at 32768, Group descriptors at 32769-32769
Reserved GDT blocks at 32770-33440
Block bitmap at 674 (bg #0 + 674), Inode bitmap at 1186 (bg #0 + 1186)
@@ -69,7 +69,7 @@ Group 2: (Blocks 65536-98303) [INODE_UNINIT, BLOCK_UNINIT]
32768 free blocks, 16 free inodes, 0 directories, 16 unused inodes
Free blocks: 65536-98303
Free inodes: 33-48
-Group 3: (Blocks 98304-131071) [INODE_UNINIT]
+Group 3: (Blocks 98304-131071) [INODE_UNINIT, BLOCK_UNINIT]
Backup superblock at 98304, Group descriptors at 98305-98305
Reserved GDT blocks at 98306-98976
Block bitmap at 676 (bg #0 + 676), Inode bitmap at 1188 (bg #0 + 1188)
@@ -83,7 +83,7 @@ Group 4: (Blocks 131072-163839) [INODE_UNINIT, BLOCK_UNINIT]
32768 free blocks, 16 free inodes, 0 directories, 16 unused inodes
Free blocks: 131072-163839
Free inodes: 65-80
-Group 5: (Blocks 163840-196607) [INODE_UNINIT]
+Group 5: (Blocks 163840-196607) [INODE_UNINIT, BLOCK_UNINIT]
Backup superblock at 163840, Group descriptors at 163841-163841
Reserved GDT blocks at 163842-164512
Block bitmap at 678 (bg #0 + 678), Inode bitmap at 1190 (bg #0 + 1190)
@@ -97,7 +97,7 @@ Group 6: (Blocks 196608-229375) [INODE_UNINIT, BLOCK_UNINIT]
32768 free blocks, 16 free inodes, 0 directories, 16 unused inodes
Free blocks: 196608-229375
Free inodes: 97-112
-Group 7: (Blocks 229376-262143) [INODE_UNINIT]
+Group 7: (Blocks 229376-262143) [INODE_UNINIT, BLOCK_UNINIT]
Backup superblock at 229376, Group descriptors at 229377-229377
Reserved GDT blocks at 229378-230048
Block bitmap at 680 (bg #0 + 680), Inode bitmap at 1192 (bg #0 + 1192)
@@ -111,7 +111,7 @@ Group 8: (Blocks 262144-294911) [INODE_UNINIT, BLOCK_UNINIT]
32768 free blocks, 16 free inodes, 0 directories, 16 unused inodes
Free blocks: 262144-294911
Free inodes: 129-144
-Group 9: (Blocks 294912-327679) [INODE_UNINIT]
+Group 9: (Blocks 294912-327679) [INODE_UNINIT, BLOCK_UNINIT]
Backup superblock at 294912, Group descriptors at 294913-294913
Reserved GDT blocks at 294914-295584
Block bitmap at 682 (bg #0 + 682), Inode bitmap at 1194 (bg #0 + 1194)
@@ -209,7 +209,7 @@ Group 24: (Blocks 786432-819199) [INODE_UNINIT, BLOCK_UNINIT]
32768 free blocks, 16 free inodes, 0 directories, 16 unused inodes
Free blocks: 786432-819199
Free inodes: 385-400
-Group 25: (Blocks 819200-851967) [INODE_UNINIT]
+Group 25: (Blocks 819200-851967) [INODE_UNINIT, BLOCK_UNINIT]
Backup superblock at 819200, Group descriptors at 819201-819201
Reserved GDT blocks at 819202-819872
Block bitmap at 698 (bg #0 + 698), Inode bitmap at 1210 (bg #0 + 1210)
@@ -223,7 +223,7 @@ Group 26: (Blocks 851968-884735) [INODE_UNINIT, BLOCK_UNINIT]
32768 free blocks, 16 free inodes, 0 directories, 16 unused inodes
Free blocks: 851968-884735
Free inodes: 417-432
-Group 27: (Blocks 884736-917503) [INODE_UNINIT]
+Group 27: (Blocks 884736-917503) [INODE_UNINIT, BLOCK_UNINIT]
Backup superblock at 884736, Group descriptors at 884737-884737
Reserved GDT blocks at 884738-885408
Block bitmap at 700 (bg #0 + 700), Inode bitmap at 1212 (bg #0 + 1212)
@@ -551,7 +551,7 @@ Group 80: (Blocks 2621440-2654207) [INODE_UNINIT, BLOCK_UNINIT]
32768 free blocks, 16 free inodes, 0 directories, 16 unused inodes
Free blocks: 2621440-2654207
Free inodes: 1281-1296
-Group 81: (Blocks 2654208-2686975) [INODE_UNINIT]
+Group 81: (Blocks 2654208-2686975) [INODE_UNINIT, BLOCK_UNINIT]
Backup superblock at 2654208, Group descriptors at 2654209-2654209
Reserved GDT blocks at 2654210-2654880
Block bitmap at 754 (bg #0 + 754), Inode bitmap at 1266 (bg #0 + 1266)
diff --git a/tests/m_uninit/expect.1 b/tests/m_uninit/expect.1
index 3212e10..4af5955 100644
--- a/tests/m_uninit/expect.1
+++ b/tests/m_uninit/expect.1
@@ -64,7 +64,7 @@ Group 0: (Blocks 1-8192) [ITABLE_ZEROED]
7662 free blocks, 2037 free inodes, 2 directories, 2037 unused inodes
Free blocks: 531-8192
Free inodes: 12-2048
-Group 1: (Blocks 8193-16384) [INODE_UNINIT, ITABLE_ZEROED]
+Group 1: (Blocks 8193-16384) [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Backup superblock at 8193, Group descriptors at 8194-8194
Reserved GDT blocks at 8195-8450
Block bitmap at 8451 (+258), Inode bitmap at 8452 (+259)
@@ -76,9 +76,9 @@ Group 2: (Blocks 16385-24576) [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Block bitmap at 16385 (+0), Inode bitmap at 16386 (+1)
Inode table at 16387-16642 (+2)
7934 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
- Free blocks: 16385-24576
+ Free blocks: 16643-24576
Free inodes: 4097-6144
-Group 3: (Blocks 24577-32768) [INODE_UNINIT, ITABLE_ZEROED]
+Group 3: (Blocks 24577-32768) [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Backup superblock at 24577, Group descriptors at 24578-24578
Reserved GDT blocks at 24579-24834
Block bitmap at 24835 (+258), Inode bitmap at 24836 (+259)
@@ -90,9 +90,9 @@ Group 4: (Blocks 32769-40960) [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Block bitmap at 32769 (+0), Inode bitmap at 32770 (+1)
Inode table at 32771-33026 (+2)
7934 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
- Free blocks: 32769-40960
+ Free blocks: 33027-40960
Free inodes: 8193-10240
-Group 5: (Blocks 40961-49152) [INODE_UNINIT, ITABLE_ZEROED]
+Group 5: (Blocks 40961-49152) [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Backup superblock at 40961, Group descriptors at 40962-40962
Reserved GDT blocks at 40963-41218
Block bitmap at 41219 (+258), Inode bitmap at 41220 (+259)
@@ -104,9 +104,9 @@ Group 6: (Blocks 49153-57344) [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Block bitmap at 49153 (+0), Inode bitmap at 49154 (+1)
Inode table at 49155-49410 (+2)
7934 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
- Free blocks: 49153-57344
+ Free blocks: 49411-57344
Free inodes: 12289-14336
-Group 7: (Blocks 57345-65536) [INODE_UNINIT, ITABLE_ZEROED]
+Group 7: (Blocks 57345-65536) [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Backup superblock at 57345, Group descriptors at 57346-57346
Reserved GDT blocks at 57347-57602
Block bitmap at 57603 (+258), Inode bitmap at 57604 (+259)
@@ -118,9 +118,9 @@ Group 8: (Blocks 65537-73728) [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Block bitmap at 65537 (+0), Inode bitmap at 65538 (+1)
Inode table at 65539-65794 (+2)
7934 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
- Free blocks: 65537-73728
+ Free blocks: 65795-73728
Free inodes: 16385-18432
-Group 9: (Blocks 73729-81920) [INODE_UNINIT, ITABLE_ZEROED]
+Group 9: (Blocks 73729-81920) [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Backup superblock at 73729, Group descriptors at 73730-73730
Reserved GDT blocks at 73731-73986
Block bitmap at 73987 (+258), Inode bitmap at 73988 (+259)
@@ -132,31 +132,31 @@ Group 10: (Blocks 81921-90112) [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Block bitmap at 81921 (+0), Inode bitmap at 81922 (+1)
Inode table at 81923-82178 (+2)
7934 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
- Free blocks: 81921-90112
+ Free blocks: 82179-90112
Free inodes: 20481-22528
Group 11: (Blocks 90113-98304) [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Block bitmap at 90113 (+0), Inode bitmap at 90114 (+1)
Inode table at 90115-90370 (+2)
7934 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
- Free blocks: 90113-98304
+ Free blocks: 90371-98304
Free inodes: 22529-24576
Group 12: (Blocks 98305-106496) [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Block bitmap at 98305 (+0), Inode bitmap at 98306 (+1)
Inode table at 98307-98562 (+2)
7934 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
- Free blocks: 98305-106496
+ Free blocks: 98563-106496
Free inodes: 24577-26624
Group 13: (Blocks 106497-114688) [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Block bitmap at 106497 (+0), Inode bitmap at 106498 (+1)
Inode table at 106499-106754 (+2)
7934 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
- Free blocks: 106497-114688
+ Free blocks: 106755-114688
Free inodes: 26625-28672
Group 14: (Blocks 114689-122880) [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Block bitmap at 114689 (+0), Inode bitmap at 114690 (+1)
Inode table at 114691-114946 (+2)
7934 free blocks, 2048 free inodes, 0 directories, 2048 unused inodes
- Free blocks: 114689-122880
+ Free blocks: 114947-122880
Free inodes: 28673-30720
Group 15: (Blocks 122881-131071) [INODE_UNINIT, ITABLE_ZEROED]
Block bitmap at 122881 (+0), Inode bitmap at 122882 (+1)
When bigalloc is enabled, using ext2fs_block_alloc_stats2() to free
any block in a cluster has the effect of freeing the entire cluster.
This is problematic if a caller instructs us to punch, say, blocks
12-15 of a 16-block cluster, because blocks 0-11 now point to a "free"
cluster.
The naive way to solve this problem is to see if any of the other
blocks in this logical cluster map to a physical cluster. If so, then
we know that the cluster is still in use and it mustn't be freed.
Otherwise, we are punching the last mapped block in this cluster, so
we can free the cluster.
This implementation performs the rigorous check only for the partial
clusters at the beginning and end of the punch range; whole clusters in
the middle of the range are freed directly.
v2: Refactor the block free code into a separate helper function that
should be more efficient.
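The rule described above (free a cluster only when the punch removes its
last mapped block) can be sketched in a standalone form. This is only an
illustration under simplifying assumptions: the file is modelled as a
flat array of "is this logical block mapped" flags, CLUSTER_RATIO and
punch_range() are hypothetical names, and every touched cluster is
checked rather than only the partial ones at the ends.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CLUSTER_RATIO 16u	/* hypothetical: blocks per cluster */

/*
 * Unmap logical blocks [start, end] (inclusive) and return how many
 * clusters became completely unmapped as a result.
 */
size_t punch_range(bool *mapped, size_t nblocks, size_t start, size_t end)
{
	size_t freed = 0;

	assert(start <= end && end < nblocks);

	for (size_t c = start / CLUSTER_RATIO; c <= end / CLUSTER_RATIO; c++) {
		size_t cstart = c * CLUSTER_RATIO;
		size_t cend = cstart + CLUSTER_RATIO;	/* exclusive */
		bool punched_any = false, still_used = false;

		if (cend > nblocks)
			cend = nblocks;

		for (size_t b = cstart; b < cend; b++) {
			if (b >= start && b <= end) {
				punched_any |= mapped[b];
				mapped[b] = false;
			} else if (mapped[b]) {
				still_used = true;
			}
		}
		/* Only free a cluster once nothing else maps into it. */
		if (punched_any && !still_used)
			freed++;
	}
	return freed;
}
```

Punching blocks 12-15 of a fully mapped 16-block cluster frees nothing
in this model, because blocks 0-11 still map into the cluster; punching
0-11 afterwards frees it.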
Reviewed-by: Zheng Liu <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ext2fs/bmap.c | 29 ++++++++++++++++++
lib/ext2fs/ext2fs.h | 3 ++
lib/ext2fs/punch.c | 82 ++++++++++++++++++++++++++++++++++++++++++++++++---
3 files changed, 109 insertions(+), 5 deletions(-)
diff --git a/lib/ext2fs/bmap.c b/lib/ext2fs/bmap.c
index 32788f6..3a18d76 100644
--- a/lib/ext2fs/bmap.c
+++ b/lib/ext2fs/bmap.c
@@ -173,6 +173,35 @@ static errcode_t implied_cluster_alloc(ext2_filsys fs, ext2_ino_t ino,
return 0;
}
+/* Try to map a logical block to an already-allocated physical cluster. */
+errcode_t ext2fs_map_cluster_block(ext2_filsys fs, ext2_ino_t ino,
+ struct ext2_inode *inode, blk64_t lblk,
+ blk64_t *pblk)
+{
+ ext2_extent_handle_t handle;
+ errcode_t retval;
+
+ /* Need bigalloc and extents to be enabled */
+ *pblk = 0;
+ if (!EXT2_HAS_RO_COMPAT_FEATURE(fs->super,
+ EXT4_FEATURE_RO_COMPAT_BIGALLOC) ||
+ !(inode->i_flags & EXT4_EXTENTS_FL))
+ return 0;
+
+ retval = ext2fs_extent_open2(fs, ino, inode, &handle);
+ if (retval)
+ goto out;
+
+ retval = implied_cluster_alloc(fs, ino, inode, handle, lblk, pblk);
+ if (retval)
+ goto out2;
+
+out2:
+ ext2fs_extent_free(handle);
+out:
+ return retval;
+}
+
static errcode_t extent_bmap(ext2_filsys fs, ext2_ino_t ino,
struct ext2_inode *inode,
ext2_extent_handle_t handle,
diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index edd5ee9..da518df 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -924,6 +924,9 @@ extern errcode_t ext2fs_bmap2(ext2_filsys fs, ext2_ino_t ino,
struct ext2_inode *inode,
char *block_buf, int bmap_flags, blk64_t block,
int *ret_flags, blk64_t *phys_blk);
+errcode_t ext2fs_map_cluster_block(ext2_filsys fs, ext2_ino_t ino,
+ struct ext2_inode *inode, blk64_t lblk,
+ blk64_t *pblk);
#if 0
/* bmove.c */
diff --git a/lib/ext2fs/punch.c b/lib/ext2fs/punch.c
index ff051f7..f138297 100644
--- a/lib/ext2fs/punch.c
+++ b/lib/ext2fs/punch.c
@@ -177,6 +177,75 @@ static void dbg_print_extent(char *desc, struct ext2fs_extent *extent)
#define dbg_printf(f, a...) do { } while (0)
#endif
+/* Free a range of blocks, respecting cluster boundaries */
+static errcode_t punch_extent_blocks(ext2_filsys fs, ext2_ino_t ino,
+ struct ext2_inode *inode,
+ blk64_t lfree_start, blk64_t free_start,
+ __u32 free_count, int *freed)
+{
+ blk64_t pblk;
+ int freed_now = 0;
+ __u32 cluster_freed;
+ errcode_t retval = 0;
+
+ /* No bigalloc? Just free each block. */
+ if (EXT2FS_CLUSTER_RATIO(fs) == 1) {
+ *freed += free_count;
+ while (free_count-- > 0)
+ ext2fs_block_alloc_stats2(fs, free_start++, -1);
+ return retval;
+ }
+
+ /*
+ * Try to free up to the next cluster boundary. We assume that all
+ * blocks in a logical cluster map to blocks from the same physical
+ * cluster, and that the offsets within the [pl]clusters match.
+ */
+ if (free_start & EXT2FS_CLUSTER_MASK(fs)) {
+ retval = ext2fs_map_cluster_block(fs, ino, inode,
+ lfree_start, &pblk);
+ if (retval)
+ goto errout;
+ if (!pblk) {
+ ext2fs_block_alloc_stats2(fs, free_start, -1);
+ freed_now++;
+ }
+ cluster_freed = EXT2FS_CLUSTER_RATIO(fs) -
+ (free_start & EXT2FS_CLUSTER_MASK(fs));
+ if (cluster_freed > free_count)
+ cluster_freed = free_count;
+ free_count -= cluster_freed;
+ free_start += cluster_freed;
+ lfree_start += cluster_freed;
+ }
+
+ /* Free whole clusters from the middle of the range. */
+ while (free_count > 0 && free_count >= EXT2FS_CLUSTER_RATIO(fs)) {
+ ext2fs_block_alloc_stats2(fs, free_start, -1);
+ freed_now++;
+ cluster_freed = EXT2FS_CLUSTER_RATIO(fs);
+ free_count -= cluster_freed;
+ free_start += cluster_freed;
+ lfree_start += cluster_freed;
+ }
+
+ /* Try to free the last cluster. */
+ if (free_count > 0) {
+ retval = ext2fs_map_cluster_block(fs, ino, inode,
+ lfree_start, &pblk);
+ if (retval)
+ goto errout;
+ if (!pblk) {
+ ext2fs_block_alloc_stats2(fs, free_start, -1);
+ freed_now++;
+ }
+ }
+
+errout:
+ *freed += freed_now;
+ return retval;
+}
+
static errcode_t ext2fs_punch_extent(ext2_filsys fs, ext2_ino_t ino,
struct ext2_inode *inode,
blk64_t start, blk64_t end)
@@ -184,7 +253,7 @@ static errcode_t ext2fs_punch_extent(ext2_filsys fs, ext2_ino_t ino,
ext2_extent_handle_t handle = 0;
struct ext2fs_extent extent;
errcode_t retval;
- blk64_t free_start, next;
+ blk64_t free_start, next, lfree_start;
__u32 free_count, newlen;
int freed = 0;
int op;
@@ -225,6 +294,7 @@ static errcode_t ext2fs_punch_extent(ext2_filsys fs, ext2_ino_t ino,
/* Start of deleted region before extent;
adjust beginning of extent */
free_start = extent.e_pblk;
+ lfree_start = extent.e_lblk;
if (next > end)
free_count = end - extent.e_lblk + 1;
else
@@ -240,6 +310,7 @@ static errcode_t ext2fs_punch_extent(ext2_filsys fs, ext2_ino_t ino,
dbg_printf("Case #%d\n", 2);
newlen = start - extent.e_lblk;
free_start = extent.e_pblk + newlen;
+ lfree_start = extent.e_lblk + newlen;
free_count = extent.e_len - newlen;
extent.e_len = newlen;
} else {
@@ -255,6 +326,7 @@ static errcode_t ext2fs_punch_extent(ext2_filsys fs, ext2_ino_t ino,
extent.e_len = start - extent.e_lblk;
free_start = extent.e_pblk + extent.e_len;
+ lfree_start = extent.e_lblk + extent.e_len;
free_count = end - start + 1;
dbg_print_extent("inserting", &newex);
@@ -314,10 +386,10 @@ static errcode_t ext2fs_punch_extent(ext2_filsys fs, ext2_ino_t ino,
goto errout;
dbg_printf("Free start %llu, free count = %u\n",
free_start, free_count);
- while (free_count-- > 0) {
- ext2fs_block_alloc_stats2(fs, free_start++, -1);
- freed++;
- }
+ retval = punch_extent_blocks(fs, ino, inode, lfree_start,
+ free_start, free_count, &freed);
+ if (retval)
+ goto errout;
next_extent:
retval = ext2fs_extent_get(handle, op,
&extent);
When we're appending a block to a directory file or the journal file,
and the new block is part of a cluster that has already been allocated
to the file (implied cluster allocation), don't update the bitmap or
the summary counts because that was performed when the cluster was
allocated.
Reviewed-by: Zheng Liu <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ext2fs/expanddir.c | 2 +-
lib/ext2fs/mkjournal.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/lib/ext2fs/expanddir.c b/lib/ext2fs/expanddir.c
index 22558d6..09a15fa 100644
--- a/lib/ext2fs/expanddir.c
+++ b/lib/ext2fs/expanddir.c
@@ -55,6 +55,7 @@ static int expand_dir_proc(ext2_filsys fs,
return BLOCK_ABORT;
}
es->newblocks++;
+ ext2fs_block_alloc_stats2(fs, new_blk, +1);
}
if (blockcnt > 0) {
retval = ext2fs_new_dir_block(fs, 0, 0, &block);
@@ -82,7 +83,6 @@ static int expand_dir_proc(ext2_filsys fs,
}
ext2fs_free_mem(&block);
*blocknr = new_blk;
- ext2fs_block_alloc_stats2(fs, new_blk, +1);
if (es->done)
return (BLOCK_CHANGED | BLOCK_ABORT);
diff --git a/lib/ext2fs/mkjournal.c b/lib/ext2fs/mkjournal.c
index 69ac135..d09c458 100644
--- a/lib/ext2fs/mkjournal.c
+++ b/lib/ext2fs/mkjournal.c
@@ -250,6 +250,7 @@ static int mkjournal_proc(ext2_filsys fs,
es->err = retval;
return BLOCK_ABORT;
}
+ ext2fs_block_alloc_stats2(fs, new_blk, +1);
es->newblocks++;
}
if (blockcnt >= 0)
@@ -285,7 +286,6 @@ static int mkjournal_proc(ext2_filsys fs,
return BLOCK_ABORT;
}
*blocknr = es->goal = new_blk;
- ext2fs_block_alloc_stats2(fs, new_blk, +1);
if (es->num_blocks == 0)
return (BLOCK_CHANGED | BLOCK_ABORT);
When the rehash process is running on a bigalloc filesystem, it
compresses all the directory entries and hash structures into the
beginning of the directory file and then uses block_iterate3() to free
the blocks off the end of the file. It seems to call
ext2fs_block_alloc_stats2() for every block in a cluster, which is
unfortunate because this function allocates and frees entire clusters
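(and updates the summary counts accordingly). In this case e2fsck
writes out incorrect summary counts. Fix this by releasing each cluster
only once, when the iterator reaches the first block of that cluster.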
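A minimal sketch of the "release once per cluster" idea, assuming blocks
are visited in ascending order from the tail of the directory.
release_tail() is a hypothetical helper, and incrementing the counter
stands in for the real bitmap and summary-count update.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Walk the blocks being dropped from the end of a directory and count
 * how many cluster frees are issued when only cluster-aligned blocks
 * trigger a free.
 */
unsigned release_tail(uint64_t first, uint64_t count, unsigned ratio)
{
	unsigned cleared = 0;
	uint64_t blk;

	assert(ratio > 0);
	for (blk = first; blk < first + count; blk++) {
		if (blk % ratio == 0)
			cleared++;	/* stands in for one cluster free */
	}
	return cleared;
}
```

If the released range begins in the middle of a cluster, that first
partial cluster is left alone in this model, since earlier blocks of the
directory still live in it.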
Reviewed-by: Zheng Liu <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>
---
e2fsck/rehash.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/e2fsck/rehash.c b/e2fsck/rehash.c
index 20704a9..9b90353 100644
--- a/e2fsck/rehash.c
+++ b/e2fsck/rehash.c
@@ -719,10 +719,18 @@ static int write_dir_block(ext2_filsys fs,
/* We don't need this block, so release it */
e2fsck_read_bitmaps(wd->ctx);
blk = *block_nr;
- ext2fs_unmark_block_bitmap2(wd->ctx->block_found_map, blk);
- ext2fs_block_alloc_stats2(fs, blk, -1);
+ /*
+ * In theory, we only release blocks from the end of the
+ * directory file, so it's fine to clobber a whole cluster at
+ * once.
+ */
+ if (blk % EXT2FS_CLUSTER_RATIO(fs) == 0) {
+ ext2fs_unmark_block_bitmap2(wd->ctx->block_found_map,
+ blk);
+ ext2fs_block_alloc_stats2(fs, blk, -1);
+ wd->cleared++;
+ }
*block_nr = 0;
- wd->cleared++;
return BLOCK_CHANGED;
}
If pass5 finds bitmap errors in a range of clusters, don't print each
cluster number individually when we could print only the start and end
cluster number. e2fsck already does this for the non-bigalloc case.
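The coalescing can be sketched as follows, assuming the problem blocks
arrive in ascending order and that on a bigalloc filesystem consecutive
problems are one cluster ratio apart. struct blk_range and
coalesce_ranges() are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct blk_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Merge a sorted list of problem block numbers into ranges.  Two entries
 * are adjacent when they are exactly one cluster (ratio blocks) apart.
 * Returns the number of ranges written to out[], which must have room
 * for n entries.
 */
size_t coalesce_ranges(const uint64_t *bad, size_t n, unsigned ratio,
		       struct blk_range *out)
{
	size_t nr = 0;
	size_t i;

	assert(ratio > 0);
	for (i = 0; i < n; i++) {
		if (nr > 0 && out[nr - 1].end + ratio == bad[i]) {
			out[nr - 1].end = bad[i];	/* extend current range */
		} else {
			out[nr].start = out[nr].end = bad[i];
			nr++;
		}
	}
	return nr;
}
```

With a ratio of 1 this reduces to the usual "previous block plus one"
test used for non-bigalloc filesystems.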
Reviewed-by: Zheng Liu <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>
---
e2fsck/pass5.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/e2fsck/pass5.c b/e2fsck/pass5.c
index 6d7b968..04d8843 100644
--- a/e2fsck/pass5.c
+++ b/e2fsck/pass5.c
@@ -456,8 +456,8 @@ redo_counts:
save_problem = problem;
} else {
if ((problem == save_problem) &&
- (pctx.blk2 == i-1))
- pctx.blk2++;
+ (pctx.blk2 == i - EXT2FS_CLUSTER_RATIO(fs)))
+ pctx.blk2 += EXT2FS_CLUSTER_RATIO(fs);
else {
print_bitmap_problem(ctx, save_problem, &pctx);
pctx.blk = pctx.blk2 = i;
When we're expanding a directory, check to see if we're doing an
implied cluster allocation; if so, we don't need to allocate a new
block, and we certainly don't need to update the summary counts.
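To make the implied-allocation test concrete, here is a small standalone
model. struct dir_alloc and append_block() are hypothetical; the real
code asks the allocator for a block and updates the bitmaps, whereas this
sketch just hands out clusters from a counter.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bookkeeping for a directory that is being grown. */
struct dir_alloc {
	uint64_t last_blk;		/* last physical block appended */
	uint64_t next_free_cluster;	/* stand-in for the allocator */
	unsigned clusters_used;		/* stand-in for the summary counts */
	unsigned nblocks;		/* blocks appended so far */
};

uint64_t append_block(struct dir_alloc *d, unsigned ratio)
{
	uint64_t new_blk;

	assert(ratio > 0);
	if (d->nblocks > 0 &&
	    d->last_blk / ratio == (d->last_blk + 1) / ratio) {
		/* Implied allocation: same cluster, nothing to account. */
		new_blk = d->last_blk + 1;
	} else {
		/* New cluster: allocate it and update the counts once. */
		new_blk = d->next_free_cluster++ * ratio;
		d->clusters_used++;
	}
	d->last_blk = new_blk;
	d->nblocks++;
	return new_blk;
}
```

With a ratio of 4, appending five blocks touches the counts only twice:
once for the first block and once when the fifth block spills into a new
cluster.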
Reported-by: Zheng Liu <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>
---
e2fsck/pass3.c | 23 ++++++++++++++++-------
1 file changed, 16 insertions(+), 7 deletions(-)
diff --git a/e2fsck/pass3.c b/e2fsck/pass3.c
index dc9d7c1..c57aab8 100644
--- a/e2fsck/pass3.c
+++ b/e2fsck/pass3.c
@@ -718,12 +718,23 @@ static int expand_dir_proc(ext2_filsys fs,
last_blk = *blocknr;
return 0;
}
- retval = ext2fs_new_block2(fs, last_blk, ctx->block_found_map,
- &new_blk);
- if (retval) {
- es->err = retval;
- return BLOCK_ABORT;
+
+ if (blockcnt &&
+ (EXT2FS_B2C(fs, last_blk) == EXT2FS_B2C(fs, last_blk + 1)))
+ new_blk = last_blk + 1;
+ else {
+ last_blk &= ~EXT2FS_CLUSTER_MASK(fs);
+ retval = ext2fs_new_block2(fs, last_blk, ctx->block_found_map,
+ &new_blk);
+ if (retval) {
+ es->err = retval;
+ return BLOCK_ABORT;
+ }
+ es->newblocks++;
+ ext2fs_block_alloc_stats2(fs, new_blk, +1);
}
+ last_blk = new_blk;
+
if (blockcnt > 0) {
retval = ext2fs_new_dir_block(fs, 0, 0, &block);
if (retval) {
@@ -749,8 +760,6 @@ static int expand_dir_proc(ext2_filsys fs,
ext2fs_free_mem(&block);
*blocknr = new_blk;
ext2fs_mark_block_bitmap2(ctx->block_found_map, new_blk);
- ext2fs_block_alloc_stats2(fs, new_blk, +1);
- es->newblocks++;
if (es->num == 0)
return (BLOCK_CHANGED | BLOCK_ABORT);
When freeing a group's metadata blocks, be careful not to free
clusters belonging to other groups!
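A simplified sketch of the check, working at cluster granularity.
release_group_metadata() and its two arrays are hypothetical:
keep_cluster[] plays the role of a bitmap of metadata that the resized
filesystem still needs, and freed_cluster[] is an addition of this
sketch (not something the patch does) so that a cluster is counted at
most once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Release the metadata blocks of a group that is going away, but skip
 * any block whose cluster still holds metadata for a surviving group.
 * Returns the number of clusters actually released.
 */
size_t release_group_metadata(const uint64_t *blks, size_t n, unsigned ratio,
			      const bool *keep_cluster, bool *freed_cluster)
{
	size_t freed = 0;
	size_t i;

	assert(ratio > 0);
	for (i = 0; i < n; i++) {
		uint64_t c = blks[i] / ratio;

		if (keep_cluster[c] || freed_cluster[c])
			continue;
		freed_cluster[c] = true;
		freed++;
	}
	return freed;
}
```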
Signed-off-by: Darrick J. Wong <[email protected]>
---
resize/resize2fs.c | 78 +++++++++++++++++++++++++++++++++-------------------
1 file changed, 49 insertions(+), 29 deletions(-)
diff --git a/resize/resize2fs.c b/resize/resize2fs.c
index ff5e6a2..49fe986 100644
--- a/resize/resize2fs.c
+++ b/resize/resize2fs.c
@@ -270,40 +270,60 @@ static void fix_uninit_block_bitmaps(ext2_filsys fs)
* release them in the new filesystem data structure, and mark them as
* reserved so the old inode table blocks don't get overwritten.
*/
-static void free_gdp_blocks(ext2_filsys fs,
- ext2fs_block_bitmap reserve_blocks,
- ext2_filsys old_fs,
- dgrp_t group)
+static errcode_t free_gdp_blocks(ext2_filsys fs,
+ ext2fs_block_bitmap reserve_blocks,
+ ext2_filsys old_fs,
+ dgrp_t group, dgrp_t count)
{
blk64_t blk;
int j;
+ dgrp_t i;
+ ext2fs_block_bitmap bg_map = NULL;
+ errcode_t retval = 0;
- blk = ext2fs_block_bitmap_loc(old_fs, group);
- if (blk &&
- (blk < ext2fs_blocks_count(fs->super))) {
- ext2fs_block_alloc_stats2(fs, blk, -1);
- ext2fs_mark_block_bitmap2(reserve_blocks, blk);
- }
+ /* If bigalloc, don't free metadata living in the same cluster */
+ if (EXT2FS_CLUSTER_RATIO(fs) > 1) {
+ retval = ext2fs_allocate_block_bitmap(fs, "bgdata", &bg_map);
+ if (retval)
+ goto out;
- blk = ext2fs_inode_bitmap_loc(old_fs, group);
- if (blk &&
- (blk < ext2fs_blocks_count(fs->super))) {
- ext2fs_block_alloc_stats2(fs, blk, -1);
- ext2fs_mark_block_bitmap2(reserve_blocks, blk);
+ retval = mark_table_blocks(fs, bg_map);
+ if (retval)
+ goto out;
}
- blk = ext2fs_inode_table_loc(old_fs, group);
- if (blk == 0 ||
- (blk >= ext2fs_blocks_count(fs->super)))
- return;
+ for (i = group; i < group + count; i++) {
+ blk = ext2fs_block_bitmap_loc(old_fs, i);
+ if (blk &&
+ (blk < ext2fs_blocks_count(fs->super)) &&
+ !(bg_map && ext2fs_test_block_bitmap2(bg_map, blk))) {
+ ext2fs_block_alloc_stats2(fs, blk, -1);
+ ext2fs_mark_block_bitmap2(reserve_blocks, blk);
+ }
- for (j = 0;
- j < fs->inode_blocks_per_group; j++, blk++) {
- if (blk >= ext2fs_blocks_count(fs->super))
- break;
- ext2fs_block_alloc_stats2(fs, blk, -1);
- ext2fs_mark_block_bitmap2(reserve_blocks, blk);
+ blk = ext2fs_inode_bitmap_loc(old_fs, i);
+ if (blk &&
+ (blk < ext2fs_blocks_count(fs->super)) &&
+ !(bg_map && ext2fs_test_block_bitmap2(bg_map, blk))) {
+ ext2fs_block_alloc_stats2(fs, blk, -1);
+ ext2fs_mark_block_bitmap2(reserve_blocks, blk);
+ }
+
+ blk = ext2fs_inode_table_loc(old_fs, i);
+ for (j = 0;
+ j < fs->inode_blocks_per_group; j++, blk++) {
+ if (blk >= ext2fs_blocks_count(fs->super) ||
+ (bg_map && ext2fs_test_block_bitmap2(bg_map, blk)))
+ continue;
+ ext2fs_block_alloc_stats2(fs, blk, -1);
+ ext2fs_mark_block_bitmap2(reserve_blocks, blk);
+ }
}
+
+out:
+ if (bg_map)
+ ext2fs_free_block_bitmap(bg_map);
+ return retval;
}
/*
@@ -467,10 +487,10 @@ retry:
* Check the block groups that we are chopping off
* and free any blocks associated with their metadata
*/
- for (i = fs->group_desc_count;
- i < old_fs->group_desc_count; i++)
- free_gdp_blocks(fs, reserve_blocks, old_fs, i);
- retval = 0;
+ retval = free_gdp_blocks(fs, reserve_blocks, old_fs,
+ fs->group_desc_count,
+ old_fs->group_desc_count -
+ fs->group_desc_count);
goto errout;
}
When we're moving blocks around the filesystem, ensure that freeing
the old blocks only frees the clusters if they're not in use by other
metadata.
Signed-off-by: Darrick J. Wong <[email protected]>
---
resize/resize2fs.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++------
1 file changed, 63 insertions(+), 9 deletions(-)
diff --git a/resize/resize2fs.c b/resize/resize2fs.c
index 49fe986..6c2c870 100644
--- a/resize/resize2fs.c
+++ b/resize/resize2fs.c
@@ -870,12 +870,12 @@ static errcode_t blocks_to_move(ext2_resize_t rfs)
int j, has_super;
dgrp_t i, max_groups, g;
blk64_t blk, group_blk;
- blk64_t old_blocks, new_blocks;
+ blk64_t old_blocks, new_blocks, group_end, cluster_freed;
blk64_t new_size;
unsigned int meta_bg, meta_bg_size;
errcode_t retval;
ext2_filsys fs, old_fs;
- ext2fs_block_bitmap meta_bmap;
+ ext2fs_block_bitmap meta_bmap, new_meta_bmap = NULL;
int flex_bg;
fs = rfs->new_fs;
@@ -982,15 +982,42 @@ static errcode_t blocks_to_move(ext2_resize_t rfs)
* blocks as free.
*/
if (old_blocks > new_blocks) {
+ if (EXT2FS_CLUSTER_RATIO(fs) > 1) {
+ retval = ext2fs_allocate_block_bitmap(fs,
+ _("new meta blocks"),
+ &new_meta_bmap);
+ if (retval)
+ goto errout;
+
+ retval = mark_table_blocks(fs, new_meta_bmap);
+ if (retval)
+ goto errout;
+ }
+
for (i = 0; i < max_groups; i++) {
if (!ext2fs_bg_has_super(fs, i)) {
group_blk += fs->super->s_blocks_per_group;
continue;
}
- for (blk = group_blk+1+new_blocks;
- blk < group_blk+1+old_blocks; blk++) {
- ext2fs_block_alloc_stats2(fs, blk, -1);
+ group_end = group_blk + 1 + old_blocks;
+ for (blk = group_blk + 1 + new_blocks;
+ blk < group_end;) {
+ if (new_meta_bmap == NULL ||
+ !ext2fs_test_block_bitmap2(new_meta_bmap,
+ blk)) {
+ cluster_freed =
+ EXT2FS_CLUSTER_RATIO(fs) -
+ (blk &
+ EXT2FS_CLUSTER_MASK(fs));
+ if (cluster_freed > group_end - blk)
+ cluster_freed = group_end - blk;
+ ext2fs_block_alloc_stats2(fs, blk, -1);
+ blk += EXT2FS_CLUSTER_RATIO(fs);
+ rfs->needed_blocks -= cluster_freed;
+ continue;
+ }
rfs->needed_blocks--;
+ blk++;
}
group_blk += fs->super->s_blocks_per_group;
}
@@ -1136,6 +1163,8 @@ static errcode_t blocks_to_move(ext2_resize_t rfs)
retval = 0;
errout:
+ if (new_meta_bmap)
+ ext2fs_free_block_bitmap(new_meta_bmap);
if (meta_bmap)
ext2fs_free_block_bitmap(meta_bmap);
@@ -1809,9 +1838,10 @@ static errcode_t move_itables(ext2_resize_t rfs)
dgrp_t i, max_groups;
ext2_filsys fs = rfs->new_fs;
char *cp;
- blk64_t old_blk, new_blk, blk;
+ blk64_t old_blk, new_blk, blk, cluster_freed;
errcode_t retval;
int j, to_move, moved;
+ ext2fs_block_bitmap new_bmap = NULL;
max_groups = fs->group_desc_count;
if (max_groups > rfs->old_fs->group_desc_count)
@@ -1824,6 +1854,17 @@ static errcode_t move_itables(ext2_resize_t rfs)
return retval;
}
+ if (EXT2FS_CLUSTER_RATIO(fs) > 1) {
+ retval = ext2fs_allocate_block_bitmap(fs, _("new meta blocks"),
+ &new_bmap);
+ if (retval)
+ return retval;
+
+ retval = mark_table_blocks(fs, new_bmap);
+ if (retval)
+ goto errout;
+ }
+
/*
* Figure out how many inode tables we need to move
*/
@@ -1901,8 +1942,19 @@ static errcode_t move_itables(ext2_resize_t rfs)
}
for (blk = ext2fs_inode_table_loc(rfs->old_fs, i), j=0;
- j < fs->inode_blocks_per_group ; j++, blk++)
- ext2fs_block_alloc_stats2(fs, blk, -1);
+ j < fs->inode_blocks_per_group;) {
+ if (new_bmap == NULL ||
+ !ext2fs_test_block_bitmap2(new_bmap, blk)) {
+ ext2fs_block_alloc_stats2(fs, blk, -1);
+ cluster_freed = EXT2FS_CLUSTER_RATIO(fs) -
+ (blk & EXT2FS_CLUSTER_MASK(fs));
+ blk += cluster_freed;
+ j += cluster_freed;
+ continue;
+ }
+ blk++;
+ j++;
+ }
ext2fs_inode_table_loc_set(rfs->old_fs, i, new_blk);
ext2fs_group_desc_csum_set(rfs->old_fs, i);
@@ -1922,9 +1974,11 @@ static errcode_t move_itables(ext2_resize_t rfs)
if (rfs->flags & RESIZE_DEBUG_ITABLEMOVE)
printf("Inode table move finished.\n");
#endif
- return 0;
+ retval = 0;
errout:
+ if (new_bmap)
+ ext2fs_free_block_bitmap(new_bmap);
return retval;
}
The block_validity mount option spot-checks block allocations against
a bitmap of known group metadata blocks. This helps us to prevent
self-inflicted catastrophic failures such as trying to "share"
critical metadata (think bitmaps) with file data, which usually
results in filesystem destruction.
In order to test the overhead of the mount option, I re-used the speed
tests in the metadata checksum testing script. In short, the program
creates what looks like 15 copies of a kernel source tree, except that
it uses fallocate to strip out the overhead of writing the file data
so that we can focus on metadata overhead. On a 64G RAM disk, the
overhead was generally about 0.9% and at most 1.6%. On a 160G USB
disk, the overhead was about 0.8% and peaked at 1.2%.
When I changed the test to write out files instead of merely
fallocating space, the overhead was negligible.
Signed-off-by: Darrick J. Wong <[email protected]>
---
misc/mke2fs.conf.in | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/misc/mke2fs.conf.in b/misc/mke2fs.conf.in
index 178733f..3919f3b 100644
--- a/misc/mke2fs.conf.in
+++ b/misc/mke2fs.conf.in
@@ -1,6 +1,6 @@
[defaults]
base_features = sparse_super,filetype,resize_inode,dir_index,ext_attr
- default_mntopts = acl,user_xattr
+ default_mntopts = acl,user_xattr,block_validity
enable_periodic_fsck = 0
blocksize = 4096
inode_size = 256
In order to support fallocate, we need to be able to have
ext2fs_bmap2() allocate blocks and put them into uninitialized
extents. There's a flag to do this in the extent code, but it's not
exposed to the bmap2 interface, so plumb that in. Eventually fuse2fs
or somebody will use it.
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ext2fs/bmap.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++--
lib/ext2fs/ext2fs.h | 1 +
2 files changed, 48 insertions(+), 2 deletions(-)
diff --git a/lib/ext2fs/bmap.c b/lib/ext2fs/bmap.c
index 3a18d76..ca89eb1 100644
--- a/lib/ext2fs/bmap.c
+++ b/lib/ext2fs/bmap.c
@@ -33,6 +33,32 @@ extern errcode_t ext2fs_bmap(ext2_filsys fs, ext2_ino_t ino,
#define inode_bmap(inode, nr) ((inode)->i_block[(nr)])
+static errcode_t zero_block(ext2_filsys fs, blk64_t blk)
+{
+ void *b;
+ errcode_t retval;
+
+ if (io_channel_discard_zeroes_data(fs->io)) {
+ retval = io_channel_discard(fs->io, blk, 1);
+ if (retval == 0)
+ return 0;
+ }
+
+ retval = ext2fs_get_memzero(fs->blocksize, &b);
+ if (retval)
+ return retval;
+
+ memset(b, 0, fs->blocksize);
+
+ retval = io_channel_write_blk64(fs->io, blk, 1, b);
+ if (retval)
+ goto out;
+
+out:
+ ext2fs_free_mem(&b);
+ return retval;
+}
+
static _BMAP_INLINE_ errcode_t block_ind_bmap(ext2_filsys fs, int flags,
blk_t ind, char *block_buf,
int *blocks_alloc,
@@ -72,6 +98,11 @@ static _BMAP_INLINE_ errcode_t block_ind_bmap(ext2_filsys fs, int flags,
block_buf + fs->blocksize, &b);
if (retval)
return retval;
+ if (flags & BMAP_UNINIT) {
+ retval = zero_block(fs, b);
+ if (retval)
+ return retval;
+ }
#ifdef WORDS_BIGENDIAN
((blk_t *) block_buf)[nr] = ext2fs_swab32(b);
@@ -214,10 +245,13 @@ static errcode_t extent_bmap(ext2_filsys fs, ext2_ino_t ino,
errcode_t retval = 0;
blk64_t blk64 = 0;
int alloc = 0;
+ int set_flags;
+
+ set_flags = bmap_flags & BMAP_UNINIT ? EXT2_EXTENT_SET_BMAP_UNINIT : 0;
if (bmap_flags & BMAP_SET) {
retval = ext2fs_extent_set_bmap(handle, block,
- *phys_blk, 0);
+ *phys_blk, set_flags);
return retval;
}
retval = ext2fs_extent_goto(handle, block);
@@ -254,7 +288,7 @@ got_block:
alloc++;
set_extent:
retval = ext2fs_extent_set_bmap(handle, block,
- blk64, 0);
+ blk64, set_flags);
if (retval)
return retval;
/* Update inode after setting extent */
@@ -336,6 +370,12 @@ errcode_t ext2fs_bmap2(ext2_filsys fs, ext2_ino_t ino, struct ext2_inode *inode,
goto done;
}
+ if ((bmap_flags & BMAP_SET) && (bmap_flags & BMAP_UNINIT)) {
+ retval = zero_block(fs, *phys_blk);
+ if (retval)
+ goto done;
+ }
+
if (block < EXT2_NDIR_BLOCKS) {
if (bmap_flags & BMAP_SET) {
b = *phys_blk;
@@ -351,6 +391,11 @@ errcode_t ext2fs_bmap2(ext2_filsys fs, ext2_ino_t ino, struct ext2_inode *inode,
retval = ext2fs_alloc_block(fs, b, block_buf, &b);
if (retval)
goto done;
+ if (bmap_flags & BMAP_UNINIT) {
+ retval = zero_block(fs, b);
+ if (retval)
+ goto done;
+ }
inode_bmap(inode, block) = b;
blocks_alloc++;
*phys_blk = b;
diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index da518df..316e6f5 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -525,6 +525,7 @@ typedef struct ext2_icount *ext2_icount_t;
*/
#define BMAP_ALLOC 0x0001
#define BMAP_SET 0x0002
+#define BMAP_UNINIT 0x0004
/*
* Returned flags from ext2fs_bmap
The file IO routines do not handle uninit (unwritten) blocks at all.
The read method should check for the uninit flag and return a buffer of
zeroes, and the write routine should convert unwritten extents to
initialized ones.
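
As an illustration of the read-side rule described above, here is a minimal
standalone sketch. It does not use libext2fs; struct blk_mapping and
fill_read_buffer() are hypothetical names invented for this example, and the
"device" is just an in-memory byte array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the result of a block-map lookup. */
struct blk_mapping {
	unsigned long long physblock;	/* 0 means "hole, nothing mapped" */
	int uninit;			/* nonzero if the extent is flagged uninit */
};

/*
 * Fill buf for a read of one block.  disk is a toy in-memory device that
 * holds nblocks blocks of blocksize bytes each.
 * Returns 0 on success, -1 if the mapping points past the device.
 */
static int fill_read_buffer(const struct blk_mapping *m,
			    const unsigned char *disk, size_t nblocks,
			    size_t blocksize, unsigned char *buf)
{
	/* Holes and uninit blocks both read back as zeroes. */
	if (m->physblock == 0 || m->uninit) {
		memset(buf, 0, blocksize);
		return 0;
	}
	if (m->physblock >= nblocks)
		return -1;
	memcpy(buf, disk + m->physblock * blocksize, blocksize);
	return 0;
}
```

The point of the sketch is only that an uninit mapping must never cause the
stale on-disk contents to reach the caller, even though a physical block is
allocated.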
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ext2fs/fileio.c | 24 ++++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-)
diff --git a/lib/ext2fs/fileio.c b/lib/ext2fs/fileio.c
index 582b306..40438d0 100644
--- a/lib/ext2fs/fileio.c
+++ b/lib/ext2fs/fileio.c
@@ -122,6 +122,8 @@ errcode_t ext2fs_file_flush(ext2_file_t file)
{
errcode_t retval;
ext2_filsys fs;
+ int ret_flags;
+ blk64_t dontcare;
EXT2_CHECK_MAGIC(file, EXT2_ET_MAGIC_EXT2_FILE);
fs = file->fs;
@@ -130,6 +132,22 @@ errcode_t ext2fs_file_flush(ext2_file_t file)
!(file->flags & EXT2_FILE_BUF_DIRTY))
return 0;
+ /* Is this an uninit block? */
+ if (file->physblock && file->inode.i_flags & EXT4_EXTENTS_FL) {
+ retval = ext2fs_bmap2(fs, file->ino, &file->inode, BMAP_BUFFER,
+ 0, file->blockno, &ret_flags, &dontcare);
+ if (retval)
+ return retval;
+ if (ret_flags & BMAP_RET_UNINIT) {
+ retval = ext2fs_bmap2(fs, file->ino, &file->inode,
+ BMAP_BUFFER, BMAP_SET,
+ file->blockno, 0,
+ &file->physblock);
+ if (retval)
+ return retval;
+ }
+ }
+
/*
* OK, the physical block hasn't been allocated yet.
* Allocate it.
@@ -184,15 +202,17 @@ static errcode_t load_buffer(ext2_file_t file, int dontfill)
{
ext2_filsys fs = file->fs;
errcode_t retval;
+ int ret_flags;
if (!(file->flags & EXT2_FILE_BUF_VALID)) {
retval = ext2fs_bmap2(fs, file->ino, &file->inode,
- BMAP_BUFFER, 0, file->blockno, 0,
+ BMAP_BUFFER, 0, file->blockno, &ret_flags,
&file->physblock);
if (retval)
return retval;
if (!dontfill) {
- if (file->physblock) {
+ if (file->physblock &&
+ !(ret_flags & BMAP_RET_UNINIT)) {
retval = io_channel_read_blk64(fs->io,
file->physblock,
1, file->buf);
resize2fs does its magic by loading a filesystem, duplicating the
in-memory image of that fs, moving relevant blocks out of the way of
whatever new metadata gets created, and finally writing everything back
out to disk. Enabling 64bit mode enlarges the group descriptors,
which makes resize2fs a reasonable vehicle for taking care of the rest
of the bookkeeping requirements. Therefore, add to resize2fs the
ability to convert a filesystem to 64bit mode and back.
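
To make the descriptor enlargement concrete, here is a small standalone
sketch of the two pieces of arithmetic involved: how many descriptor blocks
a given descriptor size needs, and how descriptors are copied between tables
with different strides. The function names are made up for this example and
the tables are plain byte arrays, not libext2fs structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Descriptor blocks needed: ceil(groups / descriptors-per-block). */
static unsigned int desc_blocks_needed(unsigned int groups,
				       unsigned int blocksize,
				       unsigned int desc_size)
{
	unsigned int per_block = blocksize / desc_size;

	return (groups + per_block - 1) / per_block;
}

/*
 * Copy count descriptors from a table with stride old_size into a table
 * with stride new_size.  Only the bytes common to both sizes are copied;
 * any extra space in a larger descriptor is left zeroed.
 */
static void restride_descs(const unsigned char *old_tbl, size_t old_size,
			   unsigned char *new_tbl, size_t new_size,
			   size_t count)
{
	size_t copy = old_size < new_size ? old_size : new_size;
	size_t i;

	memset(new_tbl, 0, count * new_size);
	for (i = 0; i < count; i++)
		memcpy(new_tbl + i * new_size, old_tbl + i * old_size, copy);
}
```

With 4096-byte blocks and 200 groups, going from 32-byte to 64-byte
descriptors doubles the descriptor blocks from 2 to 4, which is why other
blocks may have to move out of the way.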
Signed-off-by: Darrick J. Wong <[email protected]>
---
resize/main.c | 40 ++++++-
resize/resize2fs.8.in | 18 +++
resize/resize2fs.c | 282 ++++++++++++++++++++++++++++++++++++++++++++++++-
resize/resize2fs.h | 3 +
4 files changed, 336 insertions(+), 7 deletions(-)
diff --git a/resize/main.c b/resize/main.c
index 1394ae1..cf37e26 100644
--- a/resize/main.c
+++ b/resize/main.c
@@ -41,7 +41,7 @@ char *program_name, *device_name, *io_options;
static void usage (char *prog)
{
fprintf (stderr, _("Usage: %s [-d debug_flags] [-f] [-F] [-M] [-P] "
- "[-p] device [new_size]\n\n"), prog);
+ "[-p] device [-b|-s|new_size]\n\n"), prog);
exit (1);
}
@@ -199,7 +199,7 @@ int main (int argc, char ** argv)
if (argc && *argv)
program_name = *argv;
- while ((c = getopt (argc, argv, "d:fFhMPpS:")) != EOF) {
+ while ((c = getopt(argc, argv, "d:fFhMPpS:bs")) != EOF) {
switch (c) {
case 'h':
usage(program_name);
@@ -225,6 +225,12 @@ int main (int argc, char ** argv)
case 'S':
use_stride = atoi(optarg);
break;
+ case 'b':
+ flags |= RESIZE_ENABLE_64BIT;
+ break;
+ case 's':
+ flags |= RESIZE_DISABLE_64BIT;
+ break;
default:
usage(program_name);
}
@@ -383,6 +389,10 @@ int main (int argc, char ** argv)
if (sys_page_size > fs->blocksize)
new_size &= ~((sys_page_size / fs->blocksize)-1);
}
+ /* If changing 64bit, don't change the filesystem size. */
+ if (flags & (RESIZE_DISABLE_64BIT | RESIZE_ENABLE_64BIT)) {
+ new_size = ext2fs_blocks_count(fs->super);
+ }
if (!EXT2_HAS_INCOMPAT_FEATURE(fs->super,
EXT4_FEATURE_INCOMPAT_64BIT)) {
/* Take 16T down to 2^32-1 blocks */
@@ -434,7 +444,31 @@ int main (int argc, char ** argv)
fs->blocksize / 1024, new_size);
exit(1);
}
- if (new_size == ext2fs_blocks_count(fs->super)) {
+ if ((flags & RESIZE_DISABLE_64BIT) && (flags & RESIZE_ENABLE_64BIT)) {
+ fprintf(stderr, _("Cannot set and unset 64bit feature.\n"));
+ exit(1);
+ } else if (flags & (RESIZE_DISABLE_64BIT | RESIZE_ENABLE_64BIT)) {
+ new_size = ext2fs_blocks_count(fs->super);
+ if (new_size >= (1ULL << 32)) {
+ fprintf(stderr, _("Cannot change the 64bit feature "
+ "on a filesystem that is larger than "
+ "2^32 blocks.\n"));
+ exit(1);
+ }
+ if (mount_flags & EXT2_MF_MOUNTED) {
+ fprintf(stderr, _("Cannot change the 64bit feature "
+ "while the filesystem is mounted.\n"));
+ exit(1);
+ }
+ if (flags & RESIZE_ENABLE_64BIT &&
+ !EXT2_HAS_INCOMPAT_FEATURE(fs->super,
+ EXT3_FEATURE_INCOMPAT_EXTENTS)) {
+ fprintf(stderr, _("Please enable the extents feature "
+ "with tune2fs before enabling the 64bit "
+ "feature.\n"));
+ exit(1);
+ }
+ } else if (new_size == ext2fs_blocks_count(fs->super)) {
fprintf(stderr, _("The filesystem is already %llu blocks "
"long. Nothing to do!\n\n"), new_size);
exit(0);
diff --git a/resize/resize2fs.8.in b/resize/resize2fs.8.in
index a1f3099..1c75816 100644
--- a/resize/resize2fs.8.in
+++ b/resize/resize2fs.8.in
@@ -8,7 +8,7 @@ resize2fs \- ext2/ext3/ext4 file system resizer
.SH SYNOPSIS
.B resize2fs
[
-.B \-fFpPM
+.B \-fFpPMbs
]
[
.B \-d
@@ -85,8 +85,21 @@ to shrink the size of filesystem. Then you may use
to shrink the size of the partition. When shrinking the size of
the partition, make sure you do not make it smaller than the new size
of the ext2 filesystem!
+.PP
+The
+.B \-b
+and
+.B \-s
+options enable and disable the 64bit feature, respectively. The resize2fs
+program takes care of resizing the block group descriptors and moving
+other data blocks out of the way, as needed. It is not possible to resize
+the filesystem while also changing the 64bit status.
.SH OPTIONS
.TP
+.B \-b
+Turns on the 64bit feature, resizes the group descriptors as necessary, and
+moves other metadata out of the way.
+.TP
.B \-d \fIdebug-flags
Turns on various resize2fs debugging features, if they have been compiled
into the binary.
@@ -126,6 +139,9 @@ of what the program is doing.
.B \-P
Print the minimum size of the filesystem and exit.
.TP
+.B \-s
+Turns off the 64bit feature and frees blocks that are no longer in use.
+.TP
.B \-S \fIRAID-stride
The
.B resize2fs
diff --git a/resize/resize2fs.c b/resize/resize2fs.c
index 6c2c870..3ee8ee4 100644
--- a/resize/resize2fs.c
+++ b/resize/resize2fs.c
@@ -53,6 +53,9 @@ static errcode_t ext2fs_calculate_summary_stats(ext2_filsys fs);
static errcode_t fix_sb_journal_backup(ext2_filsys fs);
static errcode_t mark_table_blocks(ext2_filsys fs,
ext2fs_block_bitmap bmap);
+static errcode_t resize_group_descriptors(ext2_resize_t rfs, blk64_t new_size);
+static errcode_t move_bg_metadata(ext2_resize_t rfs);
+static errcode_t zero_high_bits_in_inodes(ext2_resize_t rfs);
/*
* Some helper CPP macros
@@ -119,13 +122,30 @@ errcode_t resize_fs(ext2_filsys fs, blk64_t *new_size, int flags,
if (retval)
goto errout;
+ init_resource_track(&rtrack, "resize_group_descriptors", fs->io);
+ retval = resize_group_descriptors(rfs, *new_size);
+ if (retval)
+ goto errout;
+ print_resource_track(rfs, &rtrack, fs->io);
+
+ init_resource_track(&rtrack, "move_bg_metadata", fs->io);
+ retval = move_bg_metadata(rfs);
+ if (retval)
+ goto errout;
+ print_resource_track(rfs, &rtrack, fs->io);
+
+ init_resource_track(&rtrack, "zero_high_bits_in_metadata", fs->io);
+ retval = zero_high_bits_in_inodes(rfs);
+ if (retval)
+ goto errout;
+ print_resource_track(rfs, &rtrack, fs->io);
+
init_resource_track(&rtrack, "adjust_superblock", fs->io);
retval = adjust_superblock(rfs, *new_size);
if (retval)
goto errout;
print_resource_track(rfs, &rtrack, fs->io);
-
init_resource_track(&rtrack, "fix_uninit_block_bitmaps 2", fs->io);
fix_uninit_block_bitmaps(rfs->new_fs);
print_resource_track(rfs, &rtrack, fs->io);
@@ -221,6 +241,259 @@ errout:
return retval;
}
+/* Toggle 64bit mode */
+static errcode_t resize_group_descriptors(ext2_resize_t rfs, blk64_t new_size)
+{
+ void *o, *n, *new_group_desc;
+ dgrp_t i;
+ int copy_size;
+ errcode_t retval;
+
+ if (!(rfs->flags & (RESIZE_DISABLE_64BIT | RESIZE_ENABLE_64BIT)))
+ return 0;
+
+ if (new_size != ext2fs_blocks_count(rfs->new_fs->super) ||
+ ext2fs_blocks_count(rfs->new_fs->super) >= (1ULL << 32) ||
+ (rfs->flags & RESIZE_DISABLE_64BIT &&
+ rfs->flags & RESIZE_ENABLE_64BIT))
+ return EXT2_ET_INVALID_ARGUMENT;
+
+ if (rfs->flags & RESIZE_DISABLE_64BIT) {
+ rfs->new_fs->super->s_feature_incompat &=
+ ~EXT4_FEATURE_INCOMPAT_64BIT;
+ rfs->new_fs->super->s_desc_size = EXT2_MIN_DESC_SIZE;
+ } else if (rfs->flags & RESIZE_ENABLE_64BIT) {
+ rfs->new_fs->super->s_feature_incompat |=
+ EXT4_FEATURE_INCOMPAT_64BIT;
+ rfs->new_fs->super->s_desc_size = EXT2_MIN_DESC_SIZE_64BIT;
+ }
+
+ if (EXT2_DESC_SIZE(rfs->old_fs->super) ==
+ EXT2_DESC_SIZE(rfs->new_fs->super))
+ return 0;
+
+ o = rfs->new_fs->group_desc;
+ rfs->new_fs->desc_blocks = ext2fs_div_ceil(
+ rfs->old_fs->group_desc_count,
+ EXT2_DESC_PER_BLOCK(rfs->new_fs->super));
+ retval = ext2fs_get_arrayzero(rfs->new_fs->desc_blocks,
+ rfs->old_fs->blocksize, &new_group_desc);
+ if (retval)
+ return retval;
+
+ n = new_group_desc;
+
+ if (EXT2_DESC_SIZE(rfs->old_fs->super) <=
+ EXT2_DESC_SIZE(rfs->new_fs->super))
+ copy_size = EXT2_DESC_SIZE(rfs->old_fs->super);
+ else
+ copy_size = EXT2_DESC_SIZE(rfs->new_fs->super);
+ for (i = 0; i < rfs->old_fs->group_desc_count; i++) {
+ memcpy(n, o, copy_size);
+ n += EXT2_DESC_SIZE(rfs->new_fs->super);
+ o += EXT2_DESC_SIZE(rfs->old_fs->super);
+ }
+
+ ext2fs_free_mem(&rfs->new_fs->group_desc);
+ rfs->new_fs->group_desc = new_group_desc;
+
+ for (i = 0; i < rfs->old_fs->group_desc_count; i++)
+ ext2fs_group_desc_csum_set(rfs->new_fs, i);
+
+ return 0;
+}
+
+/* Move bitmaps/inode tables out of the way. */
+static errcode_t move_bg_metadata(ext2_resize_t rfs)
+{
+ dgrp_t i;
+ blk64_t b, c, d;
+ ext2fs_block_bitmap old_map, new_map;
+ int old, new;
+ errcode_t retval;
+ int zero = 0, one = 1;
+
+ if (!(rfs->flags & (RESIZE_DISABLE_64BIT | RESIZE_ENABLE_64BIT)))
+ return 0;
+
+ retval = ext2fs_allocate_block_bitmap(rfs->old_fs, "oldfs", &old_map);
+ if (retval)
+ return retval;
+
+ retval = ext2fs_allocate_block_bitmap(rfs->new_fs, "newfs", &new_map);
+ if (retval)
+ goto out;
+
+ /* Construct bitmaps of super/descriptor blocks in old and new fs */
+ for (i = 0; i < rfs->old_fs->group_desc_count; i++) {
+ retval = ext2fs_super_and_bgd_loc2(rfs->old_fs, i, &b, &c, &d,
+ NULL);
+ if (retval)
+ goto out;
+ ext2fs_mark_block_bitmap2(old_map, b);
+ ext2fs_mark_block_bitmap2(old_map, c);
+ ext2fs_mark_block_bitmap2(old_map, d);
+
+ retval = ext2fs_super_and_bgd_loc2(rfs->new_fs, i, &b, &c, &d,
+ NULL);
+ if (retval)
+ goto out;
+ ext2fs_mark_block_bitmap2(new_map, b);
+ ext2fs_mark_block_bitmap2(new_map, c);
+ ext2fs_mark_block_bitmap2(new_map, d);
+ }
+
+ /* Find changes in block allocations for bg metadata */
+ for (b = 0;
+ b < ext2fs_blocks_count(rfs->new_fs->super);
+ b += EXT2FS_CLUSTER_RATIO(rfs->new_fs)) {
+ old = ext2fs_test_block_bitmap2(old_map, b);
+ new = ext2fs_test_block_bitmap2(new_map, b);
+
+ if (old && !new)
+ ext2fs_unmark_block_bitmap2(rfs->new_fs->block_map, b);
+ else if (!old && new)
+ ; /* empty ext2fs_mark_block_bitmap2(new_map, b); */
+ else
+ ext2fs_unmark_block_bitmap2(new_map, b);
+ }
+ /* new_map now shows blocks that have been newly allocated. */
+
+ /* Move any conflicting bitmaps and inode tables */
+ for (i = 0; i < rfs->old_fs->group_desc_count; i++) {
+ b = ext2fs_block_bitmap_loc(rfs->new_fs, i);
+ if (ext2fs_test_block_bitmap2(new_map, b))
+ ext2fs_block_bitmap_loc_set(rfs->new_fs, i, 0);
+
+ b = ext2fs_inode_bitmap_loc(rfs->new_fs, i);
+ if (ext2fs_test_block_bitmap2(new_map, b))
+ ext2fs_inode_bitmap_loc_set(rfs->new_fs, i, 0);
+
+ c = ext2fs_inode_table_loc(rfs->new_fs, i);
+ for (b = 0; b < rfs->new_fs->inode_blocks_per_group; b++) {
+ if (ext2fs_test_block_bitmap2(new_map, b + c)) {
+ ext2fs_inode_table_loc_set(rfs->new_fs, i, 0);
+ break;
+ }
+ }
+ }
+
+out:
+ if (old_map)
+ ext2fs_free_block_bitmap(old_map);
+ if (new_map)
+ ext2fs_free_block_bitmap(new_map);
+ return retval;
+}
+
+/* Zero out the high bits of extent fields */
+static errcode_t zero_high_bits_in_extents(ext2_filsys fs, ext2_ino_t ino,
+ struct ext2_inode *inode)
+{
+ ext2_extent_handle_t handle;
+ struct ext2fs_extent extent;
+ int op = EXT2_EXTENT_ROOT;
+ errcode_t errcode;
+
+ if (!(inode->i_flags & EXT4_EXTENTS_FL))
+ return 0;
+
+ errcode = ext2fs_extent_open(fs, ino, &handle);
+ if (errcode)
+ return errcode;
+
+ while (1) {
+ errcode = ext2fs_extent_get(handle, op, &extent);
+ if (errcode)
+ break;
+
+ op = EXT2_EXTENT_NEXT_SIB;
+
+ if (extent.e_pblk >= (1ULL << 32)) {
+ extent.e_pblk &= (1ULL << 32) - 1;
+ errcode = ext2fs_extent_replace(handle, 0, &extent);
+ if (errcode)
+ break;
+ }
+ }
+
+ /* Ok if we run off the end */
+ if (errcode == EXT2_ET_EXTENT_NO_NEXT)
+ errcode = 0;
+ return errcode;
+}
+
+/* Zero out the high bits of inodes. */
+static errcode_t zero_high_bits_in_inodes(ext2_resize_t rfs)
+{
+ ext2_filsys fs = rfs->new_fs;
+ int length = EXT2_INODE_SIZE(fs->super);
+ struct ext2_inode *inode = NULL;
+ ext2_inode_scan scan = NULL;
+ errcode_t retval;
+ ext2_ino_t ino;
+ blk64_t file_acl_block;
+ int inode_dirty;
+
+ if (!(rfs->flags & (RESIZE_DISABLE_64BIT | RESIZE_ENABLE_64BIT)))
+ return 0;
+
+ if (fs->super->s_creator_os != EXT2_OS_LINUX)
+ return 0;
+
+ retval = ext2fs_open_inode_scan(fs, 0, &scan);
+ if (retval)
+ return retval;
+
+ retval = ext2fs_get_mem(length, &inode);
+ if (retval)
+ goto out;
+
+ do {
+ retval = ext2fs_get_next_inode_full(scan, &ino, inode, length);
+ if (retval)
+ goto out;
+ if (!ino)
+ break;
+ if (!ext2fs_test_inode_bitmap2(fs->inode_map, ino))
+ continue;
+
+ /*
+ * Here's how we deal with high block number fields:
+ *
+ * - i_size_high has been written out with i_size_lo
+ * since the ext2 days, so no conversion is needed.
+ *
+ * - i_blocks_hi is guarded by both the huge_file feature and
+ * inode flags and has always been written out with
+ * i_blocks_lo if the feature is set. The field is only
+ * ever read if both feature and inode flag are set, so
+ * we don't need to zero it now.
+ *
+ * - i_file_acl_high can be uninitialized, so zero it if
+ * it isn't already.
+ */
+ if (inode->osd2.linux2.l_i_file_acl_high) {
+ inode->osd2.linux2.l_i_file_acl_high = 0;
+ retval = ext2fs_write_inode_full(fs, ino, inode,
+ length);
+ if (retval)
+ goto out;
+ }
+
+ retval = zero_high_bits_in_extents(fs, ino, inode);
+ if (retval)
+ goto out;
+ } while (ino);
+
+out:
+ if (inode)
+ ext2fs_free_mem(&inode);
+ if (scan)
+ ext2fs_close_inode_scan(scan);
+ return retval;
+}
+
/*
* Clean up the bitmaps for unitialized bitmaps
*/
@@ -444,7 +717,8 @@ retry:
/*
* Reallocate the group descriptors as necessary.
*/
- if (old_fs->desc_blocks != fs->desc_blocks) {
+ if (EXT2_DESC_SIZE(old_fs->super) == EXT2_DESC_SIZE(fs->super) &&
+ old_fs->desc_blocks != fs->desc_blocks) {
retval = ext2fs_resize_mem(old_fs->desc_blocks *
fs->blocksize,
fs->desc_blocks * fs->blocksize,
@@ -967,7 +1241,9 @@ static errcode_t blocks_to_move(ext2_resize_t rfs)
new_blocks = fs->desc_blocks + fs->super->s_reserved_gdt_blocks;
}
- if (old_blocks == new_blocks) {
+ if (EXT2_DESC_SIZE(rfs->old_fs->super) ==
+ EXT2_DESC_SIZE(rfs->new_fs->super) &&
+ old_blocks == new_blocks) {
retval = 0;
goto errout;
}
diff --git a/resize/resize2fs.h b/resize/resize2fs.h
index 52319b5..5a1c5dc 100644
--- a/resize/resize2fs.h
+++ b/resize/resize2fs.h
@@ -82,6 +82,9 @@ typedef struct ext2_sim_progress *ext2_sim_progmeter;
#define RESIZE_PERCENT_COMPLETE 0x0100
#define RESIZE_VERBOSE 0x0200
+#define RESIZE_ENABLE_64BIT 0x0400
+#define RESIZE_DISABLE_64BIT 0x0800
+
/*
* This structure is used for keeping track of how much resources have
* been used for a particular resize2fs pass.
Currently, move_bg_metadata() assumes that if a block containing a
superblock or a group descriptor is no longer needed, then it is safe
to free the whole cluster. This isn't true on bigalloc filesystems,
where bitmaps and inode tables can share these clusters. Therefore,
check a little more carefully before freeing clusters.
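
The underlying rule can be sketched in isolation: a cluster may only be
freed when none of the blocks inside it is still in use. The example below is
a simplified illustration, not the approach the patch takes with old_map and
new_map; cluster_is_unused() and the per-block byte array are assumptions
made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * block_in_use[] holds one byte per filesystem block (nonzero = in use).
 * ratio is the number of blocks per cluster.  Returns 1 if no block in the
 * cluster containing blk is in use, 0 otherwise.
 */
static int cluster_is_unused(const unsigned char *block_in_use,
			     unsigned long long blk, unsigned int ratio)
{
	unsigned long long start = blk - (blk % ratio);
	unsigned int i;

	for (i = 0; i < ratio; i++)
		if (block_in_use[start + i])
			return 0;
	return 1;
}
```

For example, with four blocks per cluster, a descriptor block that is no
longer needed at block 4 must not release cluster 1 while a bitmap still
lives at block 5.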
Signed-off-by: Darrick J. Wong <[email protected]>
---
resize/resize2fs.c | 71 ++++++++++++++++++++++++++++++++++++++++------------
1 file changed, 55 insertions(+), 16 deletions(-)
diff --git a/resize/resize2fs.c b/resize/resize2fs.c
index 3ee8ee4..e95179d 100644
--- a/resize/resize2fs.c
+++ b/resize/resize2fs.c
@@ -307,11 +307,11 @@ static errcode_t resize_group_descriptors(ext2_resize_t rfs, blk64_t new_size)
static errcode_t move_bg_metadata(ext2_resize_t rfs)
{
dgrp_t i;
- blk64_t b, c, d;
+ blk64_t b, c, d, old_desc_blocks, new_desc_blocks, j;
ext2fs_block_bitmap old_map, new_map;
int old, new;
errcode_t retval;
- int zero = 0, one = 1;
+ int zero = 0, one = 1, cluster_ratio;
if (!(rfs->flags & (RESIZE_DISABLE_64BIT | RESIZE_ENABLE_64BIT)))
return 0;
@@ -324,6 +324,17 @@ static errcode_t move_bg_metadata(ext2_resize_t rfs)
if (retval)
goto out;
+ if (EXT2_HAS_INCOMPAT_FEATURE(rfs->old_fs->super,
+ EXT2_FEATURE_INCOMPAT_META_BG)) {
+ old_desc_blocks = rfs->old_fs->super->s_first_meta_bg;
+ new_desc_blocks = rfs->new_fs->super->s_first_meta_bg;
+ } else {
+ old_desc_blocks = rfs->old_fs->desc_blocks +
+ rfs->old_fs->super->s_reserved_gdt_blocks;
+ new_desc_blocks = rfs->new_fs->desc_blocks +
+ rfs->new_fs->super->s_reserved_gdt_blocks;
+ }
+
/* Construct bitmaps of super/descriptor blocks in old and new fs */
for (i = 0; i < rfs->old_fs->group_desc_count; i++) {
retval = ext2fs_super_and_bgd_loc2(rfs->old_fs, i, &b, &c, &d,
@@ -331,7 +342,8 @@ static errcode_t move_bg_metadata(ext2_resize_t rfs)
if (retval)
goto out;
ext2fs_mark_block_bitmap2(old_map, b);
- ext2fs_mark_block_bitmap2(old_map, c);
+ for (j = 0; c != 0 && j < old_desc_blocks; j++)
+ ext2fs_mark_block_bitmap2(old_map, c + j);
ext2fs_mark_block_bitmap2(old_map, d);
retval = ext2fs_super_and_bgd_loc2(rfs->new_fs, i, &b, &c, &d,
@@ -339,45 +351,72 @@ static errcode_t move_bg_metadata(ext2_resize_t rfs)
if (retval)
goto out;
ext2fs_mark_block_bitmap2(new_map, b);
- ext2fs_mark_block_bitmap2(new_map, c);
+ for (j = 0; c != 0 && j < new_desc_blocks; j++)
+ ext2fs_mark_block_bitmap2(new_map, c + j);
ext2fs_mark_block_bitmap2(new_map, d);
}
+ cluster_ratio = EXT2FS_CLUSTER_RATIO(rfs->new_fs);
+
/* Find changes in block allocations for bg metadata */
for (b = 0;
b < ext2fs_blocks_count(rfs->new_fs->super);
- b += EXT2FS_CLUSTER_RATIO(rfs->new_fs)) {
+ b += cluster_ratio) {
old = ext2fs_test_block_bitmap2(old_map, b);
new = ext2fs_test_block_bitmap2(new_map, b);
- if (old && !new)
- ext2fs_unmark_block_bitmap2(rfs->new_fs->block_map, b);
- else if (!old && new)
- ; /* empty ext2fs_mark_block_bitmap2(new_map, b); */
- else
+ if (old && !new) {
+ /* mark old_map, unmark new_map */
+ if (cluster_ratio == 1)
+ ext2fs_unmark_block_bitmap2(
+ rfs->new_fs->block_map, b);
+ } else if (!old && new)
+ ; /* unmark old_map, mark new_map */
+ else {
+ ext2fs_unmark_block_bitmap2(old_map, b);
ext2fs_unmark_block_bitmap2(new_map, b);
+ }
}
- /* new_map now shows blocks that have been newly allocated. */
- /* Move any conflicting bitmaps and inode tables */
+ /*
+ * new_map now shows blocks that have been newly allocated.
+ * old_map now shows blocks that have been newly freed.
+ */
+
+ /*
+ * Move any conflicting bitmaps and inode tables. Ensure that we
+ * don't try to free clusters associated with bitmaps or tables.
+ */
for (i = 0; i < rfs->old_fs->group_desc_count; i++) {
b = ext2fs_block_bitmap_loc(rfs->new_fs, i);
if (ext2fs_test_block_bitmap2(new_map, b))
ext2fs_block_bitmap_loc_set(rfs->new_fs, i, 0);
+ else if (ext2fs_test_block_bitmap2(old_map, b))
+ ext2fs_unmark_block_bitmap2(old_map, b);
b = ext2fs_inode_bitmap_loc(rfs->new_fs, i);
if (ext2fs_test_block_bitmap2(new_map, b))
ext2fs_inode_bitmap_loc_set(rfs->new_fs, i, 0);
+ else if (ext2fs_test_block_bitmap2(old_map, b))
+ ext2fs_unmark_block_bitmap2(old_map, b);
c = ext2fs_inode_table_loc(rfs->new_fs, i);
- for (b = 0; b < rfs->new_fs->inode_blocks_per_group; b++) {
- if (ext2fs_test_block_bitmap2(new_map, b + c)) {
+ for (b = 0;
+ b < rfs->new_fs->inode_blocks_per_group;
+ b++) {
+ if (ext2fs_test_block_bitmap2(new_map, b + c))
ext2fs_inode_table_loc_set(rfs->new_fs, i, 0);
- break;
- }
+ else if (ext2fs_test_block_bitmap2(old_map, b + c))
+ ext2fs_unmark_block_bitmap2(old_map, b + c);
}
}
+ /* Free unused clusters */
+ for (b = 0;
+ cluster_ratio > 1 && b < ext2fs_blocks_count(rfs->new_fs->super);
+ b += cluster_ratio)
+ if (ext2fs_test_block_bitmap2(old_map, b))
+ ext2fs_unmark_block_bitmap2(rfs->new_fs->block_map, b);
out:
if (old_map)
ext2fs_free_block_bitmap(old_map);
Since we're constructing the fantasy that new_fs has always been a
64bit fs, we need to adjust reserved_gdt_blocks when we start resizing
the metadata so that the size of the gdt space in the new fs reflects
the fantasy throughout the resize process.
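
The adjustment itself is simple arithmetic, sketched below as a standalone
function. It mirrors the clamping in the helper this patch adds, but leaves
out the resize_inode feature check; adjusted_reserved_gdt() is a name chosen
for this example only.

```c
#include <assert.h>

/*
 * Keep (descriptor blocks + reserved GDT blocks) constant when the number
 * of descriptor blocks changes, clamping the result to [0, blocksize / 4].
 */
static unsigned int adjusted_reserved_gdt(unsigned int reserved,
					  unsigned int old_desc_blocks,
					  unsigned int new_desc_blocks,
					  unsigned int blocksize)
{
	long n = (long)reserved +
		 ((long)old_desc_blocks - (long)new_desc_blocks);
	long max = (long)(blocksize / 4);

	if (n < 0)
		n = 0;
	if (n > max)
		n = max;
	return (unsigned int)n;
}
```

If the descriptors grow from 2 to 4 blocks and 256 blocks were reserved, the
reservation shrinks to 254 so that the region still spans 258 blocks.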
Signed-off-by: Darrick J. Wong <[email protected]>
---
resize/resize2fs.c | 37 ++++++++++++++++++++++++-------------
1 file changed, 24 insertions(+), 13 deletions(-)
diff --git a/resize/resize2fs.c b/resize/resize2fs.c
index e95179d..f33ec01 100644
--- a/resize/resize2fs.c
+++ b/resize/resize2fs.c
@@ -241,6 +241,24 @@ errout:
return retval;
}
+/* Keep the size of the group descriptor region constant */
+static void adjust_reserved_gdt_blocks(ext2_filsys old_fs, ext2_filsys fs)
+{
+ if ((fs->super->s_feature_compat &
+ EXT2_FEATURE_COMPAT_RESIZE_INODE) &&
+ (old_fs->desc_blocks != fs->desc_blocks)) {
+ int new;
+
+ new = ((int) fs->super->s_reserved_gdt_blocks) +
+ (old_fs->desc_blocks - fs->desc_blocks);
+ if (new < 0)
+ new = 0;
+ if (new > (int) fs->blocksize/4)
+ new = fs->blocksize/4;
+ fs->super->s_reserved_gdt_blocks = new;
+ }
+}
+
/* Toggle 64bit mode */
static errcode_t resize_group_descriptors(ext2_resize_t rfs, blk64_t new_size)
{
@@ -300,6 +318,8 @@ static errcode_t resize_group_descriptors(ext2_resize_t rfs, blk64_t new_size)
for (i = 0; i < rfs->old_fs->group_desc_count; i++)
ext2fs_group_desc_csum_set(rfs->new_fs, i);
+ adjust_reserved_gdt_blocks(rfs->old_fs, rfs->new_fs);
+
return 0;
}
@@ -776,20 +796,11 @@ retry:
* number of descriptor blocks, then adjust
* s_reserved_gdt_blocks if possible to avoid needing to move
* the inode table either now or in the future.
+ *
+ * Note: If we're converting to 64bit mode, we did this earlier.
*/
- if ((fs->super->s_feature_compat &
- EXT2_FEATURE_COMPAT_RESIZE_INODE) &&
- (old_fs->desc_blocks != fs->desc_blocks)) {
- int new;
Add functions to allow clients to get, set, and remove extended
attributes from any file. The new code also supports modifying EAs
stored in the block referenced by i_file_acl.
v2: Put the header declarations in the correct part of ext2fs.h,
provide a function to release an EA block from an inode, and check
i_extra_isize to make sure we actually have space for in-inode EAs.
v3: Add system.richacl prefix support, and only allow the new
ext2fs_xattr_* functions to run if we have either ext_attr or
inline_data set. Fix some memory leaks and stack disclosure problems,
and an accounting problem when freeing an EA block.
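
As an illustration of how a full attribute name maps to an on-disk name
index plus a short name, here is a standalone sketch. The prefix table copies
the index values used in this patch; split_ea_name() is a hypothetical helper
written for the example and uses strncmp() so that short names are never
read past their terminating NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct ea_prefix {
	int index;
	const char *prefix;
};

/* Most specific prefixes first, so "system.richacl" wins over "system.". */
static const struct ea_prefix ea_prefixes[] = {
	{3, "system.posix_acl_default"},
	{2, "system.posix_acl_access"},
	{8, "system.richacl"},
	{6, "security."},
	{4, "trusted."},
	{7, "system."},
	{1, "user."},
};

/*
 * Split fullname into a prefix index and the remainder of the name.
 * Returns 1 if a known prefix matched, 0 otherwise.
 */
static int split_ea_name(const char *fullname, const char **shortname,
			 int *index)
{
	size_t i;

	for (i = 0; i < sizeof(ea_prefixes) / sizeof(ea_prefixes[0]); i++) {
		size_t len = strlen(ea_prefixes[i].prefix);

		if (strncmp(fullname, ea_prefixes[i].prefix, len) == 0) {
			*shortname = fullname + len;
			*index = ea_prefixes[i].index;
			return 1;
		}
	}
	return 0;
}
```

The ordering of the table matters: if "system." were checked first,
"system.richacl" would be stored with index 7 and the short name "richacl"
instead of index 8 and an empty short name.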
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ext2fs/ext2_err.et.in | 21 +
lib/ext2fs/ext2fs.h | 28 ++
lib/ext2fs/ext_attr.c | 757 +++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 806 insertions(+)
diff --git a/lib/ext2fs/ext2_err.et.in b/lib/ext2fs/ext2_err.et.in
index 93a1106..0a69aa3 100644
--- a/lib/ext2fs/ext2_err.et.in
+++ b/lib/ext2fs/ext2_err.et.in
@@ -482,4 +482,25 @@ ec EXT2_ET_BLOCK_BITMAP_CSUM_INVALID,
ec EXT2_ET_INLINE_DATA_CANT_ITERATE,
"Cannot iterate data blocks of an inode containing inline data"
+ec EXT2_ET_EA_BAD_NAME_LEN,
+ "Extended attribute has an invalid name length"
+
+ec EXT2_ET_EA_BAD_VALUE_SIZE,
+ "Extended attribute has an invalid value length"
+
+ec EXT2_ET_BAD_EA_HASH,
+ "Extended attribute has an incorrect hash"
+
+ec EXT2_ET_BAD_EA_HEADER,
+ "Extended attribute block has a bad header"
+
+ec EXT2_ET_EA_KEY_NOT_FOUND,
+ "Extended attribute key not found"
+
+ec EXT2_ET_EA_NO_SPACE,
+ "Insufficient space to store extended attribute data"
+
+ec EXT2_ET_MISSING_EA_FEATURE,
+ "Filesystem is missing ext_attr or inline_data feature"
+
end
diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index 316e6f5..f3cb3a0 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -638,6 +638,13 @@ typedef struct stat ext2fs_struct_stat;
#define EXT2_FLAG_FLUSH_NO_SYNC 1
/*
+ * Modify and iterate extended attributes
+ */
+struct ext2_xattr_handle;
+#define XATTR_ABORT 1
+#define XATTR_CHANGED 2
+
+/*
* function prototypes
*/
static inline int ext2fs_has_group_desc_csum(ext2_filsys fs)
@@ -1152,6 +1159,27 @@ extern errcode_t ext2fs_adjust_ea_refcount3(ext2_filsys fs, blk64_t blk,
char *block_buf,
int adjust, __u32 *newcount,
ext2_ino_t inum);
+errcode_t ext2fs_xattrs_expand(struct ext2_xattr_handle *h,
+ unsigned int expandby);
+errcode_t ext2fs_xattrs_write(struct ext2_xattr_handle *handle);
+errcode_t ext2fs_xattrs_read(struct ext2_xattr_handle *handle);
+errcode_t ext2fs_xattrs_iterate(struct ext2_xattr_handle *h,
+ int (*func)(char *name, char *value,
+ void *data),
+ void *data);
+errcode_t ext2fs_xattr_get(struct ext2_xattr_handle *h, const char *key,
+ void **value, unsigned int *value_len);
+errcode_t ext2fs_xattr_set(struct ext2_xattr_handle *handle,
+ const char *key,
+ const void *value,
+ unsigned int value_len);
+errcode_t ext2fs_xattr_remove(struct ext2_xattr_handle *handle,
+ const char *key);
+errcode_t ext2fs_xattrs_open(ext2_filsys fs, ext2_ino_t ino,
+ struct ext2_xattr_handle **handle);
+errcode_t ext2fs_xattrs_close(struct ext2_xattr_handle **handle);
+errcode_t ext2fs_free_ext_attr(ext2_filsys fs, ext2_ino_t ino,
+ struct ext2_inode_large *inode);
/* extent.c */
extern errcode_t ext2fs_extent_header_verify(void *ptr, int size);
diff --git a/lib/ext2fs/ext_attr.c b/lib/ext2fs/ext_attr.c
index 9649a14..6eadca2 100644
--- a/lib/ext2fs/ext_attr.c
+++ b/lib/ext2fs/ext_attr.c
@@ -186,3 +186,760 @@ errcode_t ext2fs_adjust_ea_refcount(ext2_filsys fs, blk_t blk,
return ext2fs_adjust_ea_refcount2(fs, blk, block_buf, adjust,
newcount);
}
+
+/* Manipulate the contents of extended attribute regions */
+struct ext2_xattr {
+ char *name;
+ void *value;
+ unsigned int value_len;
+};
+
+struct ext2_xattr_handle {
+ ext2_filsys fs;
+ struct ext2_xattr *attrs;
+ unsigned int length;
+ ext2_ino_t ino;
+ int dirty;
+};
+
+errcode_t ext2fs_xattrs_expand(struct ext2_xattr_handle *h,
+ unsigned int expandby)
+{
+ struct ext2_xattr *new_attrs;
+ errcode_t err;
+
+ err = ext2fs_get_arrayzero(h->length + expandby,
+ sizeof(struct ext2_xattr), &new_attrs);
+ if (err)
+ return err;
+
+ memcpy(new_attrs, h->attrs, h->length * sizeof(struct ext2_xattr));
+ ext2fs_free_mem(&h->attrs);
+ h->length += expandby;
+ h->attrs = new_attrs;
+
+ return 0;
+}
+
+struct ea_name_index {
+ int index;
+ const char *name;
+};
+
+/* Keep these names sorted in order of decreasing specificity. */
+static struct ea_name_index ea_names[] = {
+ {3, "system.posix_acl_default"},
+ {2, "system.posix_acl_access"},
+ {8, "system.richacl"},
+ {6, "security."},
+ {4, "trusted."},
+ {7, "system."},
+ {1, "user."},
+ {0, NULL},
+};
+
+static const char *find_ea_prefix(int index)
+{
+ struct ea_name_index *e;
+
+ for (e = ea_names; e->name; e++)
+ if (e->index == index)
+ return e->name;
+
+ return NULL;
+}
+
+static int find_ea_index(const char *fullname, char **name, int *index)
+{
+ struct ea_name_index *e;
+
+ for (e = ea_names; e->name; e++) {
+ if (strncmp(fullname, e->name, strlen(e->name)) == 0) {
+ *name = (char *)fullname + strlen(e->name);
+ *index = e->index;
+ return 1;
+ }
+ }
+ return 0;
+}
+
+errcode_t ext2fs_free_ext_attr(ext2_filsys fs, ext2_ino_t ino,
+ struct ext2_inode_large *inode)
+{
+ struct ext2_ext_attr_header *header;
+ void *block_buf = NULL;
+ dgrp_t grp;
+ blk64_t blk, goal;
+ errcode_t err;
+ struct ext2_inode_large i;
+
+ /* Read inode? */
+ if (inode == NULL) {
+ err = ext2fs_read_inode_full(fs, ino, (struct ext2_inode *)&i,
+ sizeof(struct ext2_inode_large));
+ if (err)
+ return err;
+ inode = &i;
+ }
+
+ /* Do we already have an EA block? */
+ blk = ext2fs_file_acl_block(fs, (struct ext2_inode *)inode);
+ if (blk == 0)
+ return 0;
+
+ /* Find block, drop our reference, write back */
+ if ((blk < fs->super->s_first_data_block) ||
+ (blk >= ext2fs_blocks_count(fs->super))) {
+ err = EXT2_ET_BAD_EA_BLOCK_NUM;
+ goto out;
+ }
+
+ err = ext2fs_get_mem(fs->blocksize, &block_buf);
+ if (err)
+ goto out;
+
+ err = ext2fs_read_ext_attr3(fs, blk, block_buf, ino);
+ if (err)
+ goto out2;
+
+ header = (struct ext2_ext_attr_header *) block_buf;
+ if (header->h_magic != EXT2_EXT_ATTR_MAGIC) {
+ err = EXT2_ET_BAD_EA_HEADER;
+ goto out2;
+ }
+
+ header->h_refcount--;
+ err = ext2fs_write_ext_attr3(fs, blk, block_buf, ino);
+ if (err)
+ goto out2;
+
+ /* Erase link to block */
+ ext2fs_file_acl_block_set(fs, (struct ext2_inode *)inode, 0);
+ if (header->h_refcount == 0)
+ ext2fs_block_alloc_stats2(fs, blk, -1);
+ err = ext2fs_iblk_sub_blocks(fs, (struct ext2_inode *)inode, 1);
+ if (err)
+ goto out2;
+
+ /* Write inode? */
+ if (inode == &i) {
+ err = ext2fs_write_inode_full(fs, ino, (struct ext2_inode *)&i,
+ sizeof(struct ext2_inode_large));
+ if (err)
+ goto out2;
+ }
+
+out2:
+ ext2fs_free_mem(&block_buf);
+out:
+ return err;
+}
+
+static errcode_t prep_ea_block_for_write(ext2_filsys fs, ext2_ino_t ino,
+ struct ext2_inode_large *inode)
+{
+ struct ext2_ext_attr_header *header;
+ void *block_buf = NULL;
+ dgrp_t grp;
+ blk64_t blk, goal;
+ errcode_t err;
+
+ /* Do we already have an EA block? */
+ blk = ext2fs_file_acl_block(fs, (struct ext2_inode *)inode);
+ if (blk != 0) {
+ if ((blk < fs->super->s_first_data_block) ||
+ (blk >= ext2fs_blocks_count(fs->super))) {
+ err = EXT2_ET_BAD_EA_BLOCK_NUM;
+ goto out;
+ }
+
+ err = ext2fs_get_mem(fs->blocksize, &block_buf);
+ if (err)
+ goto out;
+
+ err = ext2fs_read_ext_attr3(fs, blk, block_buf, ino);
+ if (err)
+ goto out2;
+
+ header = (struct ext2_ext_attr_header *) block_buf;
+ if (header->h_magic != EXT2_EXT_ATTR_MAGIC) {
+ err = EXT2_ET_BAD_EA_HEADER;
+ goto out2;
+ }
+
+ /* Single-user block. We're done here. */
+ if (header->h_refcount == 1)
+ goto out2;
+
+ /* We need to CoW the block. */
+ header->h_refcount--;
+ err = ext2fs_write_ext_attr3(fs, blk, block_buf, ino);
+ if (err)
+ goto out2;
+ } else {
+ /* No block, we must increment i_blocks */
+ err = ext2fs_iblk_add_blocks(fs, (struct ext2_inode *)inode,
+ 1);
+ if (err)
+ goto out;
+ }
+
+ /* Allocate a block */
+ grp = ext2fs_group_of_ino(fs, ino);
+ goal = ext2fs_inode_table_loc(fs, grp);
+ err = ext2fs_alloc_block2(fs, goal, NULL, &blk);
+ if (err)
+ goto out2;
+ ext2fs_file_acl_block_set(fs, (struct ext2_inode *)inode, blk);
+out2:
+ if (block_buf)
+ ext2fs_free_mem(&block_buf);
+out:
+ return err;
+}
+
+
+static errcode_t write_xattrs_to_buffer(struct ext2_xattr_handle *handle,
+ struct ext2_xattr **pos,
+ void *entries_start,
+ unsigned int storage_size,
+ unsigned int value_offset_correction)
+{
+ struct ext2_xattr *x = *pos;
+ struct ext2_ext_attr_entry *e = entries_start;
+ void *end = entries_start + storage_size;
+ char *shortname;
+ unsigned int entry_size, value_size;
+ int idx, ret;
+
+ /* For all remaining x... */
+ for (; x < handle->attrs + handle->length; x++) {
+ if (!x->name)
+ continue;
+
+ /* Calculate index and shortname position */
+ shortname = x->name;
+ ret = find_ea_index(x->name, &shortname, &idx);
+
+ /* Calculate entry and value size */
+ entry_size = (sizeof(*e) + strlen(shortname) +
+ EXT2_EXT_ATTR_PAD - 1) &
+ ~(EXT2_EXT_ATTR_PAD - 1);
+ value_size = ((x->value_len + EXT2_EXT_ATTR_PAD - 1) /
+ EXT2_EXT_ATTR_PAD) * EXT2_EXT_ATTR_PAD;
+
+ /*
+ * Would entry collide with value?
+ * Note that we must leave sufficient room for a (u32)0 to
+ * mark the end of the entries.
+ */
+ if ((void *)e + entry_size + sizeof(__u32) > end - value_size)
+ break;
+
+ /* Fill out e appropriately */
+ e->e_name_len = strlen(shortname);
+ e->e_name_index = (ret ? idx : 0);
+ e->e_value_offs = end - value_size - (void *)entries_start +
+ value_offset_correction;
+ e->e_value_block = 0;
+ e->e_value_size = x->value_len;
+
+ /* Store name and value */
+ end -= value_size;
+ memcpy((void *)e + sizeof(*e), shortname, e->e_name_len);
+ memcpy(end, x->value, e->e_value_size);
+
+ e->e_hash = ext2fs_ext_attr_hash_entry(e, end);
+
+ e = EXT2_EXT_ATTR_NEXT(e);
+ *(__u32 *)e = 0;
+ }
+ *pos = x;
+
+ return 0;
+}
+
+errcode_t ext2fs_xattrs_write(struct ext2_xattr_handle *handle)
+{
+ struct ext2_xattr *x;
+ struct ext2_inode_large *inode;
+ void *start, *block_buf = NULL;
+ struct ext2_ext_attr_header *header;
+ __u32 ea_inode_magic;
+ blk64_t blk;
+ unsigned int storage_size;
+ unsigned int i;
+ errcode_t err;
+
+ i = EXT2_INODE_SIZE(handle->fs->super);
+ if (i < sizeof(*inode))
+ i = sizeof(*inode);
+ err = ext2fs_get_memzero(i, &inode);
+ if (err)
+ return err;
+
+ err = ext2fs_read_inode_full(handle->fs, handle->ino,
+ (struct ext2_inode *)inode,
+ EXT2_INODE_SIZE(handle->fs->super));
+ if (err)
+ goto out;
+
+ x = handle->attrs;
+ /* Does the inode have size for EA? */
+ if (EXT2_INODE_SIZE(handle->fs->super) <= EXT2_GOOD_OLD_INODE_SIZE +
+ inode->i_extra_isize +
+ sizeof(__u32))
+ goto write_ea_block;
+
+ /* Write the inode EA */
+ ea_inode_magic = EXT2_EXT_ATTR_MAGIC;
+ memcpy(((char *) inode) + EXT2_GOOD_OLD_INODE_SIZE +
+ inode->i_extra_isize, &ea_inode_magic, sizeof(__u32));
+ storage_size = EXT2_INODE_SIZE(handle->fs->super) -
+ EXT2_GOOD_OLD_INODE_SIZE - inode->i_extra_isize -
+ sizeof(__u32);
+ start = ((char *) inode) + EXT2_GOOD_OLD_INODE_SIZE +
+ inode->i_extra_isize + sizeof(__u32);
+
+ err = write_xattrs_to_buffer(handle, &x, start, storage_size, 0);
+ if (err)
+ goto out;
+
+ /* Are we done? */
+ if (x == handle->attrs + handle->length)
+ goto skip_ea_block;
+
+write_ea_block:
+ /* Write the EA block */
+ err = ext2fs_get_memzero(handle->fs->blocksize, &block_buf);
+ if (err)
+ goto out;
+
+ storage_size = handle->fs->blocksize -
+ sizeof(struct ext2_ext_attr_header);
+ start = block_buf + sizeof(struct ext2_ext_attr_header);
+
+ err = write_xattrs_to_buffer(handle, &x, start, storage_size,
+ (void *)start - block_buf);
+ if (err)
+ goto out2;
+
+ if (x < handle->attrs + handle->length) {
+ err = EXT2_ET_EA_NO_SPACE;
+ goto out2;
+ }
+
+ /* Write a header on the EA block */
+ header = block_buf;
+ header->h_magic = EXT2_EXT_ATTR_MAGIC;
+ header->h_refcount = 1;
+ header->h_blocks = 1;
+
+ /* Get a new block for writing */
+ err = prep_ea_block_for_write(handle->fs, handle->ino, inode);
+ if (err)
+ goto out2;
+
+ /* Finally, write the new EA block */
+ blk = ext2fs_file_acl_block(handle->fs,
+ (struct ext2_inode *)inode);
+ err = ext2fs_write_ext_attr3(handle->fs, blk, block_buf,
+ handle->ino);
+ if (err)
+ goto out2;
+
+skip_ea_block:
+ blk = ext2fs_file_acl_block(handle->fs, (struct ext2_inode *)inode);
+ if (!block_buf && blk) {
+ /* xattrs shrunk, free the block */
+ err = ext2fs_free_ext_attr(handle->fs, handle->ino, inode);
+ if (err)
+ goto out;
+ }
+
+ /* Write the inode */
+ err = ext2fs_write_inode_full(handle->fs, handle->ino,
+ (struct ext2_inode *)inode,
+ EXT2_INODE_SIZE(handle->fs->super));
+ if (err)
+ goto out2;
+
+out2:
+ ext2fs_free_mem(&block_buf);
+out:
+ ext2fs_free_mem(&inode);
+ handle->dirty = 0;
+ return err;
+}
+
+static errcode_t read_xattrs_from_buffer(struct ext2_xattr_handle *handle,
+ struct ext2_ext_attr_entry *entries,
+ unsigned int storage_size,
+ void *value_start)
+{
+ struct ext2_xattr *x;
+ struct ext2_ext_attr_entry *entry;
+ const char *prefix;
+ void *ptr;
+ unsigned int remain, prefix_len;
+ errcode_t err;
+
+ x = handle->attrs;
+ while (x->name)
+ x++;
+
+ entry = entries;
+ remain = storage_size;
+ while (!EXT2_EXT_IS_LAST_ENTRY(entry)) {
+ __u32 hash;
+
+ /* header eats this space */
+ remain -= sizeof(struct ext2_ext_attr_entry);
+
+ /* is attribute name valid? */
+ if (EXT2_EXT_ATTR_SIZE(entry->e_name_len) > remain)
+ return EXT2_ET_EA_BAD_NAME_LEN;
+
+ /* attribute len eats this space */
+ remain -= EXT2_EXT_ATTR_SIZE(entry->e_name_len);
+
+ /* check value size */
+ if (entry->e_value_size > remain)
+ return EXT2_ET_EA_BAD_VALUE_SIZE;
+
+ /* e_value_block must be 0 in inode's ea */
+ if (entry->e_value_block != 0)
+ return EXT2_ET_BAD_EA_BLOCK_NUM;
+
+ hash = ext2fs_ext_attr_hash_entry(entry, value_start +
+ entry->e_value_offs);
+
+ /* e_hash may be 0 in older inode's ea */
+ if (entry->e_hash != 0 && entry->e_hash != hash)
+ return EXT2_ET_BAD_EA_HASH;
+
+ remain -= entry->e_value_size;
+
+ /* Allocate space for more attrs? */
+ if (x == handle->attrs + handle->length) {
+ err = ext2fs_xattrs_expand(handle, 4);
+ if (err)
+ return err;
+ x = handle->attrs + handle->length - 4;
+ }
+
+ /* Extract name/value */
+ prefix = find_ea_prefix(entry->e_name_index);
+ prefix_len = (prefix ? strlen(prefix) : 0);
+ err = ext2fs_get_memzero(entry->e_name_len + prefix_len + 1,
+ &x->name);
+ if (err)
+ return err;
+ if (prefix)
+ memcpy(x->name, prefix, prefix_len);
+ if (entry->e_name_len)
+ memcpy(x->name + prefix_len,
+ (void *)entry + sizeof(*entry),
+ entry->e_name_len);
+
+ err = ext2fs_get_mem(entry->e_value_size, &x->value);
+ if (err)
+ return err;
+ x->value_len = entry->e_value_size;
+ memcpy(x->value, value_start + entry->e_value_offs,
+ entry->e_value_size);
+ x++;
+ entry = EXT2_EXT_ATTR_NEXT(entry);
+ }
+
+ return 0;
+}
+
+errcode_t ext2fs_xattrs_read(struct ext2_xattr_handle *handle)
+{
+ struct ext2_xattr *attrs = NULL, *x;
+ struct ext2_inode_large *inode;
+ struct ext2_ext_attr_header *header;
+ __u32 ea_inode_magic;
+ unsigned int storage_size;
+ void *start, *block_buf = NULL;
+ blk64_t blk;
+ int i;
+ errcode_t err;
+
+ i = EXT2_INODE_SIZE(handle->fs->super);
+ if (i < sizeof(*inode))
+ i = sizeof(*inode);
+ err = ext2fs_get_memzero(i, &inode);
+ if (err)
+ return err;
+
+ err = ext2fs_read_inode_full(handle->fs, handle->ino,
+ (struct ext2_inode *)inode,
+ EXT2_INODE_SIZE(handle->fs->super));
+ if (err)
+ goto out;
+
+ /* Does the inode have size for EA? */
+ if (EXT2_INODE_SIZE(handle->fs->super) <= EXT2_GOOD_OLD_INODE_SIZE +
+ inode->i_extra_isize +
+ sizeof(__u32))
+ goto read_ea_block;
+
+ /* Look for EA in the inode */
+ memcpy(&ea_inode_magic, ((char *) inode) + EXT2_GOOD_OLD_INODE_SIZE +
+ inode->i_extra_isize, sizeof(__u32));
+ if (ea_inode_magic == EXT2_EXT_ATTR_MAGIC) {
+ storage_size = EXT2_INODE_SIZE(handle->fs->super) -
+ EXT2_GOOD_OLD_INODE_SIZE - inode->i_extra_isize -
+ sizeof(__u32);
+ start = ((char *) inode) + EXT2_GOOD_OLD_INODE_SIZE +
+ inode->i_extra_isize + sizeof(__u32);
+
+ err = read_xattrs_from_buffer(handle, start, storage_size,
+ start);
+ if (err)
+ goto out;
+ }
+
+read_ea_block:
+ /* Look for EA in a separate EA block */
+ blk = ext2fs_file_acl_block(handle->fs, (struct ext2_inode *)inode);
+ if (blk != 0) {
+ if ((blk < handle->fs->super->s_first_data_block) ||
+ (blk >= ext2fs_blocks_count(handle->fs->super))) {
+ err = EXT2_ET_BAD_EA_BLOCK_NUM;
+ goto out;
+ }
+
+ err = ext2fs_get_mem(handle->fs->blocksize, &block_buf);
+ if (err)
+ goto out;
+
+ err = ext2fs_read_ext_attr3(handle->fs, blk, block_buf,
+ handle->ino);
+ if (err)
+ goto out3;
+
+ header = (struct ext2_ext_attr_header *) block_buf;
+ if (header->h_magic != EXT2_EXT_ATTR_MAGIC) {
+ err = EXT2_ET_BAD_EA_HEADER;
+ goto out3;
+ }
+
+ if (header->h_blocks != 1) {
+ err = EXT2_ET_BAD_EA_HEADER;
+ goto out3;
+ }
+
+ /* Read EAs */
+ storage_size = handle->fs->blocksize -
+ sizeof(struct ext2_ext_attr_header);
+ start = block_buf + sizeof(struct ext2_ext_attr_header);
+ err = read_xattrs_from_buffer(handle, start, storage_size,
+ block_buf);
+ if (err)
+ goto out3;
+
+ ext2fs_free_mem(&block_buf);
+ }
+
+ ext2fs_free_mem(&block_buf);
+ ext2fs_free_mem(&inode);
+ return 0;
+
+out3:
+ ext2fs_free_mem(&block_buf);
+out:
+ ext2fs_free_mem(&inode);
+ return err;
+}
+
+errcode_t ext2fs_xattrs_iterate(struct ext2_xattr_handle *h,
+ int (*func)(char *name, char *value,
+ void *data),
+ void *data)
+{
+ struct ext2_xattr *x;
+ errcode_t err;
+ int ret;
+
+ for (x = h->attrs; x < h->attrs + h->length; x++) {
+ if (!x->name)
+ continue;
+
+ ret = func(x->name, x->value, data);
+ if (ret & XATTR_CHANGED)
+ h->dirty = 1;
+ if (ret & XATTR_ABORT)
+ return 0;
+ }
+
+ return 0;
+}
+
+errcode_t ext2fs_xattr_get(struct ext2_xattr_handle *h, const char *key,
+ void **value, unsigned int *value_len)
+{
+ struct ext2_xattr *x;
+ void *val;
+ errcode_t err;
+
+ for (x = h->attrs; x < h->attrs + h->length; x++) {
+ if (!x->name)
+ continue;
+
+ if (strcmp(x->name, key) == 0) {
+ err = ext2fs_get_mem(x->value_len, &val);
+ if (err)
+ return err;
+ memcpy(val, x->value, x->value_len);
+ *value = val;
+ *value_len = x->value_len;
+ return 0;
+ }
+ }
+
+ return EXT2_ET_EA_KEY_NOT_FOUND;
+}
+
+errcode_t ext2fs_xattr_set(struct ext2_xattr_handle *handle,
+ const char *key,
+ const void *value,
+ unsigned int value_len)
+{
+ struct ext2_xattr *x, *last_empty;
+ char *new_value;
+ errcode_t err;
+
+ last_empty = NULL;
+ for (x = handle->attrs; x < handle->attrs + handle->length; x++) {
+ if (!x->name) {
+ last_empty = x;
+ continue;
+ }
+
+ /* Replace xattr */
+ if (strcmp(x->name, key) == 0) {
+ err = ext2fs_get_mem(value_len, &new_value);
+ if (err)
+ return err;
+ memcpy(new_value, value, value_len);
+ ext2fs_free_mem(&x->value);
+ x->value = new_value;
+ x->value_len = value_len;
+ handle->dirty = 1;
+ return 0;
+ }
+ }
+
+ /* Add attr to empty slot */
+ if (last_empty) {
+ err = ext2fs_get_mem(strlen(key) + 1, &last_empty->name);
+ if (err)
+ return err;
+ strcpy(last_empty->name, key);
+
+ err = ext2fs_get_mem(value_len, &last_empty->value);
+ if (err)
+ return err;
+ memcpy(last_empty->value, value, value_len);
+ last_empty->value_len = value_len;
+ handle->dirty = 1;
+ return 0;
+ }
+
+ /* Expand array, append slot */
+ err = ext2fs_xattrs_expand(handle, 4);
+ if (err)
+ return err;
+
+ x = handle->attrs + handle->length - 4;
+ err = ext2fs_get_mem(strlen(key) + 1, &x->name);
+ if (err)
+ return err;
+ strcpy(x->name, key);
+
+ err = ext2fs_get_mem(value_len, &x->value);
+ if (err)
+ return err;
+ memcpy(x->value, value, value_len);
+ x->value_len = value_len;
+ handle->dirty = 1;
+ return 0;
+}
+
+errcode_t ext2fs_xattr_remove(struct ext2_xattr_handle *handle,
+ const char *key)
+{
+ struct ext2_xattr *x;
+ errcode_t err;
+
+ for (x = handle->attrs; x < handle->attrs + handle->length; x++) {
+ if (!x->name)
+ continue;
+
+ if (strcmp(x->name, key) == 0) {
+ ext2fs_free_mem(&x->name);
+ ext2fs_free_mem(&x->value);
+ x->value_len = 0;
+ handle->dirty = 1;
+ return 0;
+ }
+ }
+
+ return EXT2_ET_EA_KEY_NOT_FOUND;
+}
+
+errcode_t ext2fs_xattrs_open(ext2_filsys fs, ext2_ino_t ino,
+ struct ext2_xattr_handle **handle)
+{
+ struct ext2_xattr_handle *h;
+ errcode_t err;
+
+ if (!EXT2_HAS_COMPAT_FEATURE(fs->super,
+ EXT2_FEATURE_COMPAT_EXT_ATTR) &&
+ !EXT2_HAS_INCOMPAT_FEATURE(fs->super,
+ EXT4_FEATURE_INCOMPAT_INLINE_DATA))
+ return EXT2_ET_MISSING_EA_FEATURE;
+
+ err = ext2fs_get_memzero(sizeof(*h), &h);
+ if (err)
+ return err;
+
+ h->length = 4;
+ err = ext2fs_get_arrayzero(h->length, sizeof(struct ext2_xattr),
+ &h->attrs);
+ if (err) {
+ ext2fs_free_mem(&h);
+ return err;
+ }
+ h->ino = ino;
+ h->fs = fs;
+ *handle = h;
+ return 0;
+}
+
+errcode_t ext2fs_xattrs_close(struct ext2_xattr_handle **handle)
+{
+ unsigned int i;
+ struct ext2_xattr_handle *h = *handle;
+ struct ext2_xattr *a = h->attrs;
+ errcode_t err;
+
+ if (h->dirty) {
+ err = ext2fs_xattrs_write(h);
+ if (err)
+ return err;
+ }
+
+ for (i = 0; i < h->length; i++) {
+ if (a[i].name)
+ ext2fs_free_mem(&a[i].name);
+ if (a[i].value)
+ ext2fs_free_mem(&a[i].value);
+ }
+
+ ext2fs_free_mem(&h->attrs);
+ ext2fs_free_mem(handle);
+ return 0;
+}
A few tweaks to the extended attribute editing APIs:
* Use size_t, not unsigned int, in the new extended attribute editing
API.
* Don't expose the _expand() call since there should be no external
users.
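
To illustrate why the iterator callback gains a length argument, here
is a minimal, self-contained sketch. It does not use libext2fs; the
demo_* names are hypothetical stand-ins for the private structures in
the patch. Attribute values are binary blobs, so a callback cannot rely
on strlen() and needs the length passed in. The real callbacks return
XATTR_CHANGED/XATTR_ABORT flags; the sketch simplifies that to "any
non-zero return stops the walk".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the library's private attribute slot. */
struct demo_xattr {
	const char *name;	/* NULL marks an empty slot */
	const void *value;	/* binary blob, not NUL-terminated */
	size_t value_len;
};

typedef int (*demo_iter_fn)(const char *name, const void *value,
			    size_t value_len, void *data);

/* Walk every populated slot, handing the callback the value length. */
static void demo_iterate(const struct demo_xattr *attrs, size_t length,
			 demo_iter_fn func, void *data)
{
	size_t i;

	assert(func != NULL);
	for (i = 0; i < length; i++) {
		if (!attrs[i].name)
			continue;
		/* Simplification: any non-zero return stops the walk. */
		if (func(attrs[i].name, attrs[i].value,
			 attrs[i].value_len, data))
			break;
	}
}

/* Example callback: add up value bytes using only the passed length. */
static int demo_sum_bytes(const char *name, const void *value,
			  size_t value_len, void *data)
{
	size_t *total = data;

	(void)name;
	(void)value;
	*total += value_len;
	return 0;
}

/* Total number of value bytes held in the populated slots. */
size_t demo_total_value_bytes(const struct demo_xattr *attrs, size_t length)
{
	size_t total = 0;

	demo_iterate(attrs, length, demo_sum_bytes, &total);
	return total;
}
```

A value containing embedded NUL bytes is counted in full here, which a
strlen()-based callback would get wrong.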
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ext2fs/ext2fs.h | 8 +++-----
lib/ext2fs/ext_attr.c | 16 ++++++++--------
2 files changed, 11 insertions(+), 13 deletions(-)
diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index f3cb3a0..9f631e6 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -1159,20 +1159,18 @@ extern errcode_t ext2fs_adjust_ea_refcount3(ext2_filsys fs, blk64_t blk,
char *block_buf,
int adjust, __u32 *newcount,
ext2_ino_t inum);
-errcode_t ext2fs_xattrs_expand(struct ext2_xattr_handle *h,
- unsigned int expandby);
errcode_t ext2fs_xattrs_write(struct ext2_xattr_handle *handle);
errcode_t ext2fs_xattrs_read(struct ext2_xattr_handle *handle);
errcode_t ext2fs_xattrs_iterate(struct ext2_xattr_handle *h,
int (*func)(char *name, char *value,
- void *data),
+ size_t value_len, void *data),
void *data);
errcode_t ext2fs_xattr_get(struct ext2_xattr_handle *h, const char *key,
- void **value, unsigned int *value_len);
+ void **value, size_t *value_len);
errcode_t ext2fs_xattr_set(struct ext2_xattr_handle *handle,
const char *key,
const void *value,
- unsigned int value_len);
+ size_t value_len);
errcode_t ext2fs_xattr_remove(struct ext2_xattr_handle *handle,
const char *key);
errcode_t ext2fs_xattrs_open(ext2_filsys fs, ext2_ino_t ino,
diff --git a/lib/ext2fs/ext_attr.c b/lib/ext2fs/ext_attr.c
index 6eadca2..8101c7f 100644
--- a/lib/ext2fs/ext_attr.c
+++ b/lib/ext2fs/ext_attr.c
@@ -191,19 +191,19 @@ errcode_t ext2fs_adjust_ea_refcount(ext2_filsys fs, blk_t blk,
struct ext2_xattr {
char *name;
void *value;
- unsigned int value_len;
+ size_t value_len;
};
struct ext2_xattr_handle {
ext2_filsys fs;
struct ext2_xattr *attrs;
- unsigned int length;
+ size_t length;
ext2_ino_t ino;
int dirty;
};
-errcode_t ext2fs_xattrs_expand(struct ext2_xattr_handle *h,
- unsigned int expandby)
+static errcode_t ext2fs_xattrs_expand(struct ext2_xattr_handle *h,
+ unsigned int expandby)
{
struct ext2_xattr *new_attrs;
errcode_t err;
@@ -756,7 +756,7 @@ out:
errcode_t ext2fs_xattrs_iterate(struct ext2_xattr_handle *h,
int (*func)(char *name, char *value,
- void *data),
+ size_t value_len, void *data),
void *data)
{
struct ext2_xattr *x;
@@ -767,7 +767,7 @@ errcode_t ext2fs_xattrs_iterate(struct ext2_xattr_handle *h,
if (!x->name)
continue;
- ret = func(x->name, x->value, data);
+ ret = func(x->name, x->value, x->value_len, data);
if (ret & XATTR_CHANGED)
h->dirty = 1;
if (ret & XATTR_ABORT)
@@ -778,7 +778,7 @@ errcode_t ext2fs_xattrs_iterate(struct ext2_xattr_handle *h,
}
errcode_t ext2fs_xattr_get(struct ext2_xattr_handle *h, const char *key,
- void **value, unsigned int *value_len)
+ void **value, size_t *value_len)
{
struct ext2_xattr *x;
void *val;
@@ -805,7 +805,7 @@ errcode_t ext2fs_xattr_get(struct ext2_xattr_handle *h, const char *key,
errcode_t ext2fs_xattr_set(struct ext2_xattr_handle *handle,
const char *key,
const void *value,
- unsigned int value_len)
+ size_t value_len)
{
struct ext2_xattr *x, *last_empty;
char *new_value;
Add another API to query the number of extended attributes.
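
A separate count is needed because the slot array can contain empty
entries, so its length is not the number of attributes. A minimal
sketch of that bookkeeping, assuming a fixed-size array and
hypothetical demo_* names (none of this is libext2fs code):

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_SLOTS 4

/* Hypothetical miniature of the handle: slots plus a live count. */
struct demo_handle {
	const char *names[DEMO_SLOTS];	/* NULL marks an empty slot */
	size_t length;			/* slots allocated */
	size_t count;			/* slots currently in use */
};

static void demo_init(struct demo_handle *h)
{
	size_t i;

	for (i = 0; i < DEMO_SLOTS; i++)
		h->names[i] = NULL;
	h->length = DEMO_SLOTS;
	h->count = 0;
}

/* Store a name in the first free slot; 0 on success, -1 if full. */
static int demo_add(struct demo_handle *h, const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < h->length; i++) {
		if (!h->names[i]) {
			h->names[i] = name;
			h->count++;
			return 0;
		}
	}
	return -1;
}

/* Empty slot i; 0 on success, -1 if it was out of range or empty. */
static int demo_remove_slot(struct demo_handle *h, size_t i)
{
	if (i >= h->length || !h->names[i])
		return -1;
	h->names[i] = NULL;
	h->count--;
	return 0;
}
```

After a removal, length stays at the allocated size while count drops,
which is the distinction the new query call exposes.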
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ext2fs/ext2fs.h | 1 +
lib/ext2fs/ext_attr.c | 19 +++++++++++++++----
2 files changed, 16 insertions(+), 4 deletions(-)
diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index 9f631e6..d94fdd4 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -1178,6 +1178,7 @@ errcode_t ext2fs_xattrs_open(ext2_filsys fs, ext2_ino_t ino,
errcode_t ext2fs_xattrs_close(struct ext2_xattr_handle **handle);
errcode_t ext2fs_free_ext_attr(ext2_filsys fs, ext2_ino_t ino,
struct ext2_inode_large *inode);
+size_t ext2fs_xattrs_count(struct ext2_xattr_handle *handle);
/* extent.c */
extern errcode_t ext2fs_extent_header_verify(void *ptr, int size);
diff --git a/lib/ext2fs/ext_attr.c b/lib/ext2fs/ext_attr.c
index 8101c7f..772bb07 100644
--- a/lib/ext2fs/ext_attr.c
+++ b/lib/ext2fs/ext_attr.c
@@ -197,7 +197,7 @@ struct ext2_xattr {
struct ext2_xattr_handle {
ext2_filsys fs;
struct ext2_xattr *attrs;
- size_t length;
+ size_t length, count;
ext2_ino_t ino;
int dirty;
};
@@ -575,7 +575,8 @@ out:
static errcode_t read_xattrs_from_buffer(struct ext2_xattr_handle *handle,
struct ext2_ext_attr_entry *entries,
unsigned int storage_size,
- void *value_start)
+ void *value_start,
+ size_t *nr_read)
{
struct ext2_xattr *x;
struct ext2_ext_attr_entry *entry;
@@ -648,6 +649,7 @@ static errcode_t read_xattrs_from_buffer(struct ext2_xattr_handle *handle,
memcpy(x->value, value_start + entry->e_value_offs,
entry->e_value_size);
x++;
+ (*nr_read)++;
entry = EXT2_EXT_ATTR_NEXT(entry);
}
@@ -696,7 +698,7 @@ errcode_t ext2fs_xattrs_read(struct ext2_xattr_handle *handle)
inode->i_extra_isize + sizeof(__u32);
err = read_xattrs_from_buffer(handle, start, storage_size,
- start);
+ start, &handle->count);
if (err)
goto out;
}
@@ -736,7 +738,7 @@ read_ea_block:
sizeof(struct ext2_ext_attr_header);
start = block_buf + sizeof(struct ext2_ext_attr_header);
err = read_xattrs_from_buffer(handle, start, storage_size,
- block_buf);
+ block_buf, &handle->count);
if (err)
goto out3;
@@ -845,6 +847,7 @@ errcode_t ext2fs_xattr_set(struct ext2_xattr_handle *handle,
memcpy(last_empty->value, value, value_len);
last_empty->value_len = value_len;
handle->dirty = 1;
+ handle->count++;
return 0;
}
@@ -865,6 +868,7 @@ errcode_t ext2fs_xattr_set(struct ext2_xattr_handle *handle,
memcpy(x->value, value, value_len);
x->value_len = value_len;
handle->dirty = 1;
+ handle->count++;
return 0;
}
@@ -883,6 +887,7 @@ errcode_t ext2fs_xattr_remove(struct ext2_xattr_handle *handle,
ext2fs_free_mem(&x->value);
x->value_len = 0;
handle->dirty = 1;
+ handle->count--;
return 0;
}
}
@@ -913,6 +918,7 @@ errcode_t ext2fs_xattrs_open(ext2_filsys fs, ext2_ino_t ino,
ext2fs_free_mem(&h);
return err;
}
+ h->count = 0;
h->ino = ino;
h->fs = fs;
*handle = h;
@@ -943,3 +949,8 @@ errcode_t ext2fs_xattrs_close(struct ext2_xattr_handle **handle)
ext2fs_free_mem(handle);
return 0;
}
+
+size_t ext2fs_xattrs_count(struct ext2_xattr_handle *handle)
+{
+ return handle->count;
+}
Before loading extended attributes, free any key/value pairs that
might already be cached in the handle from an earlier read of the file.
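
The pattern is "clear, then reload". A minimal sketch with
hypothetical demo_* names and plain malloc/free in place of the
library allocators; without the clearing step, a second load would
overwrite the old pointers and leak them.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_SLOTS 4

/* Hypothetical cache of heap-allocated attribute names. */
struct demo_cache {
	char *names[DEMO_SLOTS];	/* NULL means empty */
	size_t count;
};

/* Release every cached name and reset the count. */
static void demo_free_keys(struct demo_cache *c)
{
	size_t i;

	for (i = 0; i < DEMO_SLOTS; i++) {
		free(c->names[i]);	/* free(NULL) is a no-op */
		c->names[i] = NULL;
	}
	c->count = 0;
}

/* Reload from src; stale entries are dropped first. 0 on success. */
static int demo_load(struct demo_cache *c, const char *const *src, size_t n)
{
	size_t i;

	assert(n <= DEMO_SLOTS);
	demo_free_keys(c);
	for (i = 0; i < n; i++) {
		size_t len = strlen(src[i]) + 1;

		c->names[i] = malloc(len);
		if (!c->names[i]) {
			demo_free_keys(c);
			return -1;
		}
		memcpy(c->names[i], src[i], len);
		c->count++;
	}
	return 0;
}
```

Calling demo_load() twice leaves only the second set of names in the
cache, with the first set freed rather than leaked.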
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ext2fs/ext_attr.c | 26 +++++++++++++++++---------
1 file changed, 17 insertions(+), 9 deletions(-)
diff --git a/lib/ext2fs/ext_attr.c b/lib/ext2fs/ext_attr.c
index 772bb07..e69275e 100644
--- a/lib/ext2fs/ext_attr.c
+++ b/lib/ext2fs/ext_attr.c
@@ -656,6 +656,20 @@ static errcode_t read_xattrs_from_buffer(struct ext2_xattr_handle *handle,
return 0;
}
+static void xattrs_free_keys(struct ext2_xattr_handle *h)
+{
+ struct ext2_xattr *a = h->attrs;
+ size_t i;
+
+ for (i = 0; i < h->length; i++) {
+ if (a[i].name)
+ ext2fs_free_mem(&a[i].name);
+ if (a[i].value)
+ ext2fs_free_mem(&a[i].value);
+ }
+ h->count = 0;
+}
+
errcode_t ext2fs_xattrs_read(struct ext2_xattr_handle *handle)
{
struct ext2_xattr *attrs = NULL, *x;
@@ -681,6 +695,8 @@ errcode_t ext2fs_xattrs_read(struct ext2_xattr_handle *handle)
if (err)
goto out;
+ xattrs_free_keys(handle);
+
/* Does the inode have size for EA? */
if (EXT2_INODE_SIZE(handle->fs->super) <= EXT2_GOOD_OLD_INODE_SIZE +
inode->i_extra_isize +
@@ -927,9 +943,7 @@ errcode_t ext2fs_xattrs_open(ext2_filsys fs, ext2_ino_t ino,
errcode_t ext2fs_xattrs_close(struct ext2_xattr_handle **handle)
{
- unsigned int i;
struct ext2_xattr_handle *h = *handle;
- struct ext2_xattr *a = h->attrs;
errcode_t err;
if (h->dirty) {
@@ -938,13 +952,7 @@ errcode_t ext2fs_xattrs_close(struct ext2_xattr_handle **handle)
return err;
}
- for (i = 0; i < h->length; i++) {
- if (a[i].name)
- ext2fs_free_mem(&a[i].name);
- if (a[i].value)
- ext2fs_free_mem(&a[i].value);
- }
Use the new extended attribute APIs in debugfs to display all extended
attributes (the current code does not look in the EA block) and to
display full names (the current code also ignores the name index).
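
For reference, the full name is the prefix selected by the on-disk
name index followed by the stored short name, which is why the test
output changes from "name" to "user.name". A minimal sketch follows;
the demo_* names are hypothetical and the index table is an
illustrative subset, not a copy of the library's ea_names[] table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct demo_prefix {
	int index;
	const char *prefix;
};

/* Illustrative subset of name-index to prefix mappings. */
static const struct demo_prefix demo_prefixes[] = {
	{ 1, "user." },
	{ 4, "trusted." },
	{ 6, "security." },
	{ 7, "system." },
	{ 0, NULL },
};

static const char *demo_find_prefix(int index)
{
	const struct demo_prefix *p;

	for (p = demo_prefixes; p->prefix; p++)
		if (p->index == index)
			return p->prefix;
	return NULL;	/* unknown index: no prefix */
}

/*
 * Build "<prefix><short name>" in out. The short name is length-counted,
 * not NUL-terminated. Returns 0 on success, -1 if out is too small.
 */
static int demo_full_name(int index, const char *shortname, size_t name_len,
			  char *out, size_t out_len)
{
	const char *prefix = demo_find_prefix(index);
	size_t prefix_len = prefix ? strlen(prefix) : 0;

	assert(out != NULL && shortname != NULL);
	if (prefix_len + name_len + 1 > out_len)
		return -1;
	if (prefix_len)
		memcpy(out, prefix, prefix_len);
	memcpy(out + prefix_len, shortname, name_len);
	out[prefix_len + name_len] = '\0';
	return 0;
}
```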
Signed-off-by: Darrick J. Wong <[email protected]>
---
debugfs/debugfs.c | 68 +++++++++++++++++++++++++------------------
tests/r_inline_xattr/expect | 6 +---
2 files changed, 42 insertions(+), 32 deletions(-)
diff --git a/debugfs/debugfs.c b/debugfs/debugfs.c
index 578d577..e37d3f5 100644
--- a/debugfs/debugfs.c
+++ b/debugfs/debugfs.c
@@ -543,34 +543,45 @@ static void internal_dump_inode_extra(FILE *out,
inode->i_extra_isize);
return;
}
- storage_size = EXT2_INODE_SIZE(current_fs->super) -
- EXT2_GOOD_OLD_INODE_SIZE -
- inode->i_extra_isize;
- magic = (__u32 *)((char *)inode + EXT2_GOOD_OLD_INODE_SIZE +
- inode->i_extra_isize);
- if (*magic == EXT2_EXT_ATTR_MAGIC) {
- fprintf(out, "Extended attributes stored in inode body: \n");
- end = (char *) inode + EXT2_INODE_SIZE(current_fs->super);
- start = (char *) magic + sizeof(__u32);
- entry = (struct ext2_ext_attr_entry *) start;
- while (!EXT2_EXT_IS_LAST_ENTRY(entry)) {
- struct ext2_ext_attr_entry *next =
- EXT2_EXT_ATTR_NEXT(entry);
- if (entry->e_value_size > storage_size ||
- (char *) next >= end) {
- fprintf(out, "invalid EA entry in inode\n");
- return;
- }
- fprintf(out, " ");
- dump_xattr_string(out, EXT2_EXT_ATTR_NAME(entry),
- entry->e_name_len);
- fprintf(out, " = \"");
- dump_xattr_string(out, start + entry->e_value_offs,
- entry->e_value_size);
- fprintf(out, "\" (%u)\n", entry->e_value_size);
- entry = next;
- }
- }
+}
+
+/* Dump extended attributes */
+static int dump_attr(char *name, char *value, size_t value_len, void *data)
+{
+ FILE *out = data;
+
+ fprintf(out, " ");
+ dump_xattr_string(out, name, strlen(name));
+ fprintf(out, " = \"");
+ dump_xattr_string(out, value, value_len);
+ fprintf(out, "\" (%zu)\n", value_len);
+
+ return 0;
+}
+
+static void dump_inode_attributes(FILE *out, ext2_ino_t ino)
+{
+ struct ext2_xattr_handle *h;
+ errcode_t err;
+
+ err = ext2fs_xattrs_open(current_fs, ino, &h);
+ if (err)
+ return;
+
+ err = ext2fs_xattrs_read(h);
+ if (err)
+ goto out;
+
+ if (ext2fs_xattrs_count(h) == 0)
+ goto out;
+
+ fprintf(out, "Extended attributes:\n");
+ err = ext2fs_xattrs_iterate(h, dump_attr, out);
+ if (err)
+ goto out;
+
+out:
+ err = ext2fs_xattrs_close(&h);
}
static void dump_blocks(FILE *f, const char *prefix, ext2_ino_t inode)
@@ -818,6 +829,7 @@ void internal_dump_inode(FILE *out, const char *prefix,
if (EXT2_INODE_SIZE(current_fs->super) > EXT2_GOOD_OLD_INODE_SIZE)
internal_dump_inode_extra(out, prefix, inode_num,
(struct ext2_inode_large *) inode);
+ dump_inode_attributes(out, inode_num);
if (current_fs->super->s_creator_os == EXT2_OS_LINUX &&
current_fs->super->s_feature_ro_compat &
EXT4_FEATURE_RO_COMPAT_METADATA_CSUM) {
diff --git a/tests/r_inline_xattr/expect b/tests/r_inline_xattr/expect
index 9e71264..c7aa088 100644
--- a/tests/r_inline_xattr/expect
+++ b/tests/r_inline_xattr/expect
@@ -1,8 +1,7 @@
resize2fs test
debugfs -R ''stat file'' test.img 2>&1 | grep ''^Inode\|in inode body\|name = ''
Inode: 1550 Type: regular Mode: 0644 Flags: 0x0
-Extended attributes stored in inode body:
- name = "propervalue" (11)
+ user.name = "propervalue" (11)
Exit status is 0
resize2fs test.img 5M
Resizing the filesystem on test.img to 5120 (1k) blocks.
@@ -11,6 +10,5 @@ The filesystem on test.img is now 5120 blocks long.
Exit status is 0
debugfs -R ''stat file'' test.img 2>&1 | grep ''^Inode\|in inode body\|name = ''
Inode: 12 Type: regular Mode: 0644 Flags: 0x0
-Extended attributes stored in inode body:
- name = "propervalue" (11)
+ user.name = "propervalue" (11)
Exit status is 0
When writing xattrs to disk, move the inline_data attribute to the
front of the list so that inline data always ends up in the inode body
(and not a separate EA block).
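
The reordering is a single swap with slot 0, since attributes are
written in array order and the inode body is filled first. A minimal
sketch with hypothetical demo_* names (struct assignment is used here
in place of the patch's memcpy calls):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an attribute slot. */
struct demo_attr {
	const char *name;	/* NULL marks an empty slot */
	size_t value_len;
};

/* Swap the "system.data" entry with slot 0 so it is emitted first. */
static void demo_move_to_front(struct demo_attr *attrs, size_t length)
{
	size_t i;

	assert(attrs != NULL || length == 0);
	/* Start at 1: if the entry is already in slot 0, nothing to do. */
	for (i = 1; i < length; i++) {
		if (!attrs[i].name)
			continue;
		if (strcmp(attrs[i].name, "system.data") == 0) {
			struct demo_attr tmp = attrs[i];

			attrs[i] = attrs[0];
			attrs[0] = tmp;
			return;
		}
	}
}
```

The other entries keep their relative order except for the one that
trades places with the inline data entry.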
Signed-off-by: Darrick J. Wong <[email protected]>
---
lib/ext2fs/ext_attr.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/lib/ext2fs/ext_attr.c b/lib/ext2fs/ext_attr.c
index e69275e..50f7300 100644
--- a/lib/ext2fs/ext_attr.c
+++ b/lib/ext2fs/ext_attr.c
@@ -238,6 +238,24 @@ static struct ea_name_index ea_names[] = {
{0, NULL},
};
+static void move_inline_data_to_front(struct ext2_xattr_handle *h)
+{
+ struct ext2_xattr *x;
+ struct ext2_xattr tmp;
+
+ for (x = h->attrs + 1; x < h->attrs + h->length; x++) {
+ if (!x->name)
+ continue;
+
+ if (strcmp(x->name, "system.data") == 0) {
+ memcpy(&tmp, x, sizeof(tmp));
+ memcpy(x, h->attrs, sizeof(tmp));
+ memcpy(h->attrs, &tmp, sizeof(tmp));
+ return;
+ }
+ }
+}
+
static const char *find_ea_prefix(int index)
{
struct ea_name_index *e;
@@ -484,6 +502,8 @@ errcode_t ext2fs_xattrs_write(struct ext2_xattr_handle *handle)
if (err)
goto out;
+ move_inline_data_to_front(handle);
+
x = handle->attrs;
/* Does the inode have size for EA? */
if (EXT2_INODE_SIZE(handle->fs->super) <= EXT2_GOOD_OLD_INODE_SIZE +
Since it's possible for very large filesystems to store backup
superblocks at very large (> 2^32) block numbers, we need to be able
to handle the case of a caller directing us to read one of these
high-numbered backups.
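
To make the failure mode concrete, here is a minimal sketch of what
happens when a block number above 2^32 passes through a 32-bit
parameter. The demo_* names and the block number are illustrative
assumptions. The narrowing is modeled with uint32_t because unsigned
truncation is well-defined in C, whereas converting an out-of-range
value to a signed int is implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t demo_blk64_t;	/* stand-in for blk64_t */

/* What a 32-bit parameter would receive: the high bits are dropped. */
static uint32_t demo_narrow(demo_blk64_t blk)
{
	return (uint32_t)blk;
}

/* Byte offset of a block, computed entirely in 64-bit arithmetic. */
static uint64_t demo_block_offset(demo_blk64_t blk, unsigned int block_size)
{
	assert(block_size != 0);
	return blk * (uint64_t)block_size;
}
```

With a backup at block 2^32 + 32768, the narrowed value is 32768, so a
32-bit interface would read a different block entirely.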
Signed-off-by: Darrick J. Wong <[email protected]>
---
debugfs/debugfs.c | 4 ++--
e2fsck/journal.c | 6 +++---
e2fsck/unix.c | 8 ++++----
lib/ext2fs/ext2fs.h | 4 ++++
lib/ext2fs/openfs.c | 23 ++++++++++++++++-------
misc/dumpe2fs.c | 4 ++--
6 files changed, 31 insertions(+), 18 deletions(-)
diff --git a/debugfs/debugfs.c b/debugfs/debugfs.c
index e37d3f5..f9eb578 100644
--- a/debugfs/debugfs.c
+++ b/debugfs/debugfs.c
@@ -94,8 +94,8 @@ static void open_filesystem(char *device, int open_flags, blk64_t superblock,
if (catastrophic)
open_flags |= EXT2_FLAG_SKIP_MMP;
- retval = ext2fs_open(device, open_flags, superblock, blocksize,
- unix_io_manager, &current_fs);
+ retval = ext2fs_open3(device, NULL, open_flags, superblock, blocksize,
+ unix_io_manager, &current_fs);
if (retval) {
com_err(device, retval, "while opening filesystem");
current_fs = NULL;
diff --git a/e2fsck/journal.c b/e2fsck/journal.c
index 22f06e7..a7a714f 100644
--- a/e2fsck/journal.c
+++ b/e2fsck/journal.c
@@ -966,9 +966,9 @@ errcode_t e2fsck_run_ext3_journal(e2fsck_t ctx)
ext2fs_mmp_stop(ctx->fs);
ext2fs_free(ctx->fs);
- retval = ext2fs_open(ctx->filesystem_name, EXT2_FLAG_RW,
- ctx->superblock, blocksize, io_ptr,
- &ctx->fs);
+ retval = ext2fs_open3(ctx->filesystem_name, NULL, EXT2_FLAG_RW,
+ ctx->superblock, blocksize, io_ptr,
+ &ctx->fs);
if (retval) {
com_err(ctx->program_name, retval,
_("while trying to re-open %s"),
diff --git a/e2fsck/unix.c b/e2fsck/unix.c
index 7a8fce2..261b84b 100644
--- a/e2fsck/unix.c
+++ b/e2fsck/unix.c
@@ -1042,7 +1042,7 @@ static errcode_t try_open_fs(e2fsck_t ctx, int flags, io_manager io_ptr,
*ret_fs = NULL;
if (ctx->superblock && ctx->blocksize) {
- retval = ext2fs_open2(ctx->filesystem_name, ctx->io_options,
+ retval = ext2fs_open3(ctx->filesystem_name, ctx->io_options,
flags, ctx->superblock, ctx->blocksize,
io_ptr, ret_fs);
} else if (ctx->superblock) {
@@ -1053,7 +1053,7 @@ static errcode_t try_open_fs(e2fsck_t ctx, int flags, io_manager io_ptr,
ext2fs_free(*ret_fs);
*ret_fs = NULL;
}
- retval = ext2fs_open2(ctx->filesystem_name,
+ retval = ext2fs_open3(ctx->filesystem_name,
ctx->io_options, flags,
ctx->superblock, blocksize,
io_ptr, ret_fs);
@@ -1061,7 +1061,7 @@ static errcode_t try_open_fs(e2fsck_t ctx, int flags, io_manager io_ptr,
break;
}
} else
- retval = ext2fs_open2(ctx->filesystem_name, ctx->io_options,
+ retval = ext2fs_open3(ctx->filesystem_name, ctx->io_options,
flags, 0, 0, io_ptr, ret_fs);
if (retval == 0)
@@ -1377,7 +1377,7 @@ failure:
* don't need to update the mount count and last checked
* fields in the backup superblock (the kernel doesn't update
* the backup superblocks anyway). With newer versions of the
- * library this flag is set by ext2fs_open2(), but we set this
+ * library this flag is set by ext2fs_open3(), but we set this
* here just to be sure. (No, we don't support e2fsck running
* with some other libext2fs than the one that it was shipped
* with, but just in case....)
diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index d94fdd4..ba5c388 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -1480,6 +1480,10 @@ extern errcode_t ext2fs_open2(const char *name, const char *io_options,
int flags, int superblock,
unsigned int block_size, io_manager manager,
ext2_filsys *ret_fs);
+extern errcode_t ext2fs_open3(const char *name, const char *io_options,
+ int flags, blk64_t superblock,
+ unsigned int block_size, io_manager manager,
+ ext2_filsys *ret_fs);
/*
* The dgrp_t argument to these two functions is not actually a group number
* but a block number offset within a group table! Convert with the formula
diff --git a/lib/ext2fs/openfs.c b/lib/ext2fs/openfs.c
index 5cf6ae4..2639ae5 100644
--- a/lib/ext2fs/openfs.c
+++ b/lib/ext2fs/openfs.c
@@ -94,6 +94,15 @@ errcode_t ext2fs_open(const char *name, int flags, int superblock,
manager, ret_fs);
}
+errcode_t ext2fs_open2(const char *name, const char *io_options,
+ int flags, int superblock,
+ unsigned int block_size, io_manager manager,
+ ext2_filsys *ret_fs)
+{
+ return ext2fs_open3(name, io_options, flags, superblock, block_size,
+ manager, ret_fs);
+}
+
/*
* Note: if superblock is non-zero, block-size must also be non-zero.
* Superblock and block_size can be zero to use the default size.
@@ -108,8 +117,8 @@ errcode_t ext2fs_open(const char *name, int flags, int superblock,
* EXT2_FLAG_64BITS - Allow 64-bit bitfields (needed for large
* filesystems)
*/
-errcode_t ext2fs_open2(const char *name, const char *io_options,
- int flags, int superblock,
+errcode_t ext2fs_open3(const char *name, const char *io_options,
+ int flags, blk64_t superblock,
unsigned int block_size, io_manager manager,
ext2_filsys *ret_fs)
{
@@ -208,8 +217,8 @@ errcode_t ext2fs_open2(const char *name, const char *io_options,
if (retval)
goto cleanup;
}
- retval = io_channel_read_blk(fs->io, superblock, -SUPERBLOCK_SIZE,
- fs->super);
+ retval = io_channel_read_blk64(fs->io, superblock, -SUPERBLOCK_SIZE,
+ fs->super);
if (retval)
goto cleanup;
if (fs->orig_super)
@@ -410,9 +419,9 @@ errcode_t ext2fs_open2(const char *name, const char *io_options,
else
first_meta_bg = fs->desc_blocks;
if (first_meta_bg) {
- retval = io_channel_read_blk(fs->io, group_block +
- group_zero_adjust + 1,
- first_meta_bg, dest);
+ retval = io_channel_read_blk64(fs->io, group_block +
+ group_zero_adjust + 1,
+ first_meta_bg, dest);
if (retval)
goto cleanup;
#ifdef WORDS_BIGENDIAN
diff --git a/misc/dumpe2fs.c b/misc/dumpe2fs.c
index 3dbfcb9..0446f7f 100644
--- a/misc/dumpe2fs.c
+++ b/misc/dumpe2fs.c
@@ -621,7 +621,7 @@ int main (int argc, char ** argv)
for (use_blocksize = EXT2_MIN_BLOCK_SIZE;
use_blocksize <= EXT2_MAX_BLOCK_SIZE;
use_blocksize *= 2) {
- retval = ext2fs_open (device_name, flags,
+ retval = ext2fs_open3(device_name, NULL, flags,
use_superblock,
use_blocksize, unix_io_manager,
&fs);
@@ -629,7 +629,7 @@ int main (int argc, char ** argv)
break;
}
} else
- retval = ext2fs_open (device_name, flags, use_superblock,
+ retval = ext2fs_open3(device_name, NULL, flags, use_superblock,
use_blocksize, unix_io_manager, &fs);
if (retval) {
com_err (program_name, retval, _("while trying to open %s"),
This is the initial implementation of a FUSE server based on
e2fsprogs. Its purpose is to make ext4 usable on any OS that FUSE
supports but that lacks a native ext4 driver, such as Mac OS X, the
BSDs, and Windows. The code requires FUSE API v28, which the Linux
fuse and osxfuse releases current as of August 2013 provide.
v2: Remove unnecessary calls to ext2fs_flush(), and ensure that xattr
blocks are freed when removing an inode. Modify xattr calls to
reflect the API adjustments.
v3: Zero out large inodes before reading them, so that i_extra_isize is
always accurate; disable FUSE attribute caching to prevent stat() from
returning stale contents; use rmdir when renaming onto a directory;
fix some resource leaks with error handling; update inode ctime when
modifying extended attributes; emulate kernel behavior when punching a
zero-length range; fix permission checking; expand directory if we run
out of space while creating a symlink.
Signed-off-by: Darrick J. Wong <[email protected]>
---
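For illustration only: the extra-timestamp helpers added in
misc/fuse2fs.c pack two epoch bits and a 30-bit nanosecond count into
one 32-bit field. Below is a minimal standalone sketch of that packing,
not the code from the patch. The names struct ts64, pack_extra() and
unpack_extra() are hypothetical, and the sketch zero-extends the low
32 bits of the seconds value, whereas the macros in the patch use a
signed cast.

```c
#include <assert.h>
#include <stdint.h>

#define EPOCH_BITS 2
#define EPOCH_MASK ((1u << EPOCH_BITS) - 1)	/* low 2 bits: seconds bits 32-33 */
#define NSEC_MASK  (~0u << EPOCH_BITS)		/* high 30 bits: nanoseconds */

/* Hypothetical fixed-width stand-in for struct timespec. */
struct ts64 {
	uint64_t sec;
	uint32_t nsec;
};

/* Build the 32-bit "extra" word from bits 32-33 of sec and from nsec. */
static uint32_t pack_extra(const struct ts64 *t)
{
	assert(t->nsec < 1000000000u);	/* must fit in 30 bits */
	return ((uint32_t)(t->sec >> 32) & EPOCH_MASK) |
	       ((t->nsec << EPOCH_BITS) & NSEC_MASK);
}

/* Rebuild a timestamp from the low seconds word and the extra word. */
static struct ts64 unpack_extra(uint32_t sec_lo, uint32_t extra)
{
	struct ts64 t;

	t.sec = (uint64_t)sec_lo | ((uint64_t)(extra & EPOCH_MASK) << 32);
	t.nsec = (extra & NSEC_MASK) >> EPOCH_BITS;
	return t;
}
```

With sec = 2^32 + 1000 and nsec = 123456789, the extra word carries
epoch bits of 1, and unpacking it together with the low word 1000
returns the original pair.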
MCONFIG.in | 1
configure | 268 +++++
configure.in | 50 +
misc/Makefile.in | 15
misc/fuse2fs.c | 2967 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
5 files changed, 3299 insertions(+), 2 deletions(-)
create mode 100644 misc/fuse2fs.c
diff --git a/MCONFIG.in b/MCONFIG.in
index 557b37a..8fa3391 100644
--- a/MCONFIG.in
+++ b/MCONFIG.in
@@ -107,6 +107,7 @@ LIBCOM_ERR = $(LIB)/libcom_err@LIB_EXT@ @PRIVATE_LIBS_CMT@ @SEM_INIT_LIB@
LIBE2P = $(LIB)/libe2p@LIB_EXT@
LIBEXT2FS = $(LIB)/libext2fs@LIB_EXT@
LIBUUID = @LIBUUID@ @SOCKET_LIB@
+LIBFUSE = @FUSE_LIB@
LIBQUOTA = @STATIC_LIBQUOTA@
LIBBLKID = @LIBBLKID@ @PRIVATE_LIBS_CMT@ $(LIBUUID)
LIBINTL = @LIBINTL@
diff --git a/configure b/configure
index 2338fbe..feea2ae 100755
--- a/configure
+++ b/configure
@@ -639,6 +639,8 @@ CYGWIN_CMT
LINUX_CMT
UNI_DIFF_OPTS
SEM_INIT_LIB
+FUSE_CMT
+FUSE_LIB
SOCKET_LIB
SIZEOF_OFF_T
SIZEOF_LONG_LONG
@@ -861,6 +863,7 @@ enable_rpath
with_libiconv_prefix
with_included_gettext
with_libintl_prefix
+enable_fuse2fs
with_multiarch
'
ac_precious_vars='build_alias
@@ -1516,6 +1519,7 @@ Optional Features:
--enable-bmap-stats-ops enable collection of additional bitmap stats
--disable-nls do not use Native Language Support
--disable-rpath do not hardcode runtime library paths
+ --disable-fuse2fs do not build fuse2fs
Optional Packages:
--with-PACKAGE[=ARG] use PACKAGE [ARG=yes]
@@ -11172,6 +11176,270 @@ if test "x$ac_cv_lib_socket_socket" = xyes; then :
fi
+FUSE_CMT=
+FUSE_LIB=
+# Check whether --enable-fuse2fs was given.
+if test "${enable_fuse2fs+set}" = set; then :
+ enableval=$enable_fuse2fs; if test "$enableval" = "no"
+then
+ FUSE_CMT="#"
+ { $as_echo "$as_me:${as_lineno-$LINENO}: result: Disabling fuse2fs" >&5
+$as_echo "Disabling fuse2fs" >&6; }
+else
+ for ac_header in pthread.h fuse.h
+do :
+ as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh`
+ac_fn_c_check_header_compile "$LINENO" "$ac_header" "$as_ac_Header" "#define _FILE_OFFSET_BITS 64
+"
+if eval test \"x\$"$as_ac_Header"\" = x"yes"; then :
+ cat >>confdefs.h <<_ACEOF
+#define `$as_echo "HAVE_$ac_header" | $as_tr_cpp` 1
+_ACEOF
+
+else
+ { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5
+$as_echo "$as_me: error: in \`$ac_pwd':" >&2;}
+as_fn_error $? "Cannot find fuse2fs headers.
+See \`config.log' for more details" "$LINENO" 5; }
+fi
+
+done
+
+
+ cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h. */
+#ifdef __linux__
+#include <linux/fs.h>
+#include <linux/falloc.h>
+#include <linux/xattr.h>
+#endif
+
+int
+main ()
+{
+
+ ;
+ return 0;
+}
+_ACEOF
+if ac_fn_c_try_cpp "$LINENO"; then :
+
+else
+ { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5
+$as_echo "$as_me: error: in \`$ac_pwd':" >&2;}
+as_fn_error $? "Cannot find fuse2fs Linux headers.
+See \`config.log' for more details" "$LINENO" 5; }
+fi
+rm -f conftest.err conftest.i conftest.$ac_ext
+
+ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for fuse_main in -losxfuse" >&5
+$as_echo_n "checking for fuse_main in -losxfuse... " >&6; }
+if ${ac_cv_lib_osxfuse_fuse_main+:} false; then :
+ $as_echo_n "(cached) " >&6
+else
+ ac_check_lib_save_LIBS=$LIBS
+LIBS="-losxfuse $LIBS"
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h. */
+
+/* Override any GCC internal prototype to avoid an error.
+ Use char because int might match the return type of a GCC
+ builtin and then its argument prototype would still apply. */
+#ifdef __cplusplus
+extern "C"
+#endif
+char fuse_main ();
+int
+main ()
+{
+return fuse_main ();
+ ;
+ return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+ ac_cv_lib_osxfuse_fuse_main=yes
+else
+ ac_cv_lib_osxfuse_fuse_main=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+ conftest$ac_exeext conftest.$ac_ext
+LIBS=$ac_check_lib_save_LIBS
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_osxfuse_fuse_main" >&5
+$as_echo "$ac_cv_lib_osxfuse_fuse_main" >&6; }
+if test "x$ac_cv_lib_osxfuse_fuse_main" = xyes; then :
+ FUSE_LIB=-losxfuse
+else
+ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for fuse_main in -lfuse" >&5
+$as_echo_n "checking for fuse_main in -lfuse... " >&6; }
+if ${ac_cv_lib_fuse_fuse_main+:} false; then :
+ $as_echo_n "(cached) " >&6
+else
+ ac_check_lib_save_LIBS=$LIBS
+LIBS="-lfuse $LIBS"
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h. */
+
+/* Override any GCC internal prototype to avoid an error.
+ Use char because int might match the return type of a GCC
+ builtin and then its argument prototype would still apply. */
+#ifdef __cplusplus
+extern "C"
+#endif
+char fuse_main ();
+int
+main ()
+{
+return fuse_main ();
+ ;
+ return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+ ac_cv_lib_fuse_fuse_main=yes
+else
+ ac_cv_lib_fuse_fuse_main=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+ conftest$ac_exeext conftest.$ac_ext
+LIBS=$ac_check_lib_save_LIBS
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_fuse_fuse_main" >&5
+$as_echo "$ac_cv_lib_fuse_fuse_main" >&6; }
+if test "x$ac_cv_lib_fuse_fuse_main" = xyes; then :
+ FUSE_LIB=-lfuse
+else
+ { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5
+$as_echo "$as_me: error: in \`$ac_pwd':" >&2;}
+as_fn_error $? "Cannot find fuse library.
+See \`config.log' for more details" "$LINENO" 5; }
+fi
+
+fi
+
+ { $as_echo "$as_me:${as_lineno-$LINENO}: result: Enabling fuse2fs" >&5
+$as_echo "Enabling fuse2fs" >&6; }
+fi
+
+else
+ for ac_header in pthread.h fuse.h
+do :
+ as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh`
+ac_fn_c_check_header_compile "$LINENO" "$ac_header" "$as_ac_Header" "#define _FILE_OFFSET_BITS 64
+#ifdef __linux__
+# include <linux/fs.h>
+# include <linux/falloc.h>
+# include <linux/xattr.h>
+#endif
+"
+if eval test \"x\$"$as_ac_Header"\" = x"yes"; then :
+ cat >>confdefs.h <<_ACEOF
+#define `$as_echo "HAVE_$ac_header" | $as_tr_cpp` 1
+_ACEOF
+
+else
+ FUSE_CMT="#"
+fi
+
+done
+
+if test -z "$FUSE_CMT"
+then
+ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for fuse_main in -losxfuse" >&5
+$as_echo_n "checking for fuse_main in -losxfuse... " >&6; }
+if ${ac_cv_lib_osxfuse_fuse_main+:} false; then :
+ $as_echo_n "(cached) " >&6
+else
+ ac_check_lib_save_LIBS=$LIBS
+LIBS="-losxfuse $LIBS"
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h. */
+
+/* Override any GCC internal prototype to avoid an error.
+ Use char because int might match the return type of a GCC
+ builtin and then its argument prototype would still apply. */
+#ifdef __cplusplus
+extern "C"
+#endif
+char fuse_main ();
+int
+main ()
+{
+return fuse_main ();
+ ;
+ return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+ ac_cv_lib_osxfuse_fuse_main=yes
+else
+ ac_cv_lib_osxfuse_fuse_main=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+ conftest$ac_exeext conftest.$ac_ext
+LIBS=$ac_check_lib_save_LIBS
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_osxfuse_fuse_main" >&5
+$as_echo "$ac_cv_lib_osxfuse_fuse_main" >&6; }
+if test "x$ac_cv_lib_osxfuse_fuse_main" = xyes; then :
+ FUSE_LIB=-losxfuse
+else
+ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for fuse_main in -lfuse" >&5
+$as_echo_n "checking for fuse_main in -lfuse... " >&6; }
+if ${ac_cv_lib_fuse_fuse_main+:} false; then :
+ $as_echo_n "(cached) " >&6
+else
+ ac_check_lib_save_LIBS=$LIBS
+LIBS="-lfuse $LIBS"
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h. */
+
+/* Override any GCC internal prototype to avoid an error.
+ Use char because int might match the return type of a GCC
+ builtin and then its argument prototype would still apply. */
+#ifdef __cplusplus
+extern "C"
+#endif
+char fuse_main ();
+int
+main ()
+{
+return fuse_main ();
+ ;
+ return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+ ac_cv_lib_fuse_fuse_main=yes
+else
+ ac_cv_lib_fuse_fuse_main=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+ conftest$ac_exeext conftest.$ac_ext
+LIBS=$ac_check_lib_save_LIBS
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_fuse_fuse_main" >&5
+$as_echo "$ac_cv_lib_fuse_fuse_main" >&6; }
+if test "x$ac_cv_lib_fuse_fuse_main" = xyes; then :
+ FUSE_LIB=-lfuse
+else
+ FUSE_CMT="#"
+fi
+
+fi
+
+fi
+if test -z "$FUSE_CMT"
+then
+ { $as_echo "$as_me:${as_lineno-$LINENO}: result: Enabling fuse2fs by default." >&5
+$as_echo "Enabling fuse2fs by default." >&6; }
+fi
+
+fi
+
+
+
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for optreset" >&5
$as_echo_n "checking for optreset... " >&6; }
if ${ac_cv_have_optreset+:} false; then :
diff --git a/configure.in b/configure.in
index 049dc11..27655e3 100644
--- a/configure.in
+++ b/configure.in
@@ -1127,6 +1127,56 @@ SOCKET_LIB=''
AC_CHECK_LIB(socket, socket, [SOCKET_LIB=-lsocket])
AC_SUBST(SOCKET_LIB)
dnl
+dnl Check to see if the FUSE library is -lfuse or -losxfuse
+dnl
+FUSE_CMT=
+FUSE_LIB=
+dnl osxfuse.dylib supersedes fuselib.dylib
+AC_ARG_ENABLE([fuse2fs],
+[ --disable-fuse2fs do not build fuse2fs],
+if test "$enableval" = "no"
+then
+ FUSE_CMT="#"
+ AC_MSG_RESULT([Disabling fuse2fs])
+else
+ AC_CHECK_HEADERS([pthread.h fuse.h], [],
+[AC_MSG_FAILURE([Cannot find fuse2fs headers.])],
+[#define _FILE_OFFSET_BITS 64])
+
+ AC_PREPROC_IFELSE(
+[AC_LANG_PROGRAM([[#ifdef __linux__
+#include <linux/fs.h>
+#include <linux/falloc.h>
+#include <linux/xattr.h>
+#endif
+]], [])], [], [AC_MSG_FAILURE([Cannot find fuse2fs Linux headers.])])
+
+ AC_CHECK_LIB(osxfuse, fuse_main, [FUSE_LIB=-losxfuse],
+ [AC_CHECK_LIB(fuse, fuse_main, [FUSE_LIB=-lfuse],
+ [AC_MSG_FAILURE([Cannot find fuse library.])])])
+ AC_MSG_RESULT([Enabling fuse2fs])
+fi
+,
+AC_CHECK_HEADERS([pthread.h fuse.h], [], [FUSE_CMT="#"],
+[#define _FILE_OFFSET_BITS 64
+#ifdef __linux__
+# include <linux/fs.h>
+# include <linux/falloc.h>
+# include <linux/xattr.h>
+#endif])
+if test -z "$FUSE_CMT"
+then
+ AC_CHECK_LIB(osxfuse, fuse_main, [FUSE_LIB=-losxfuse],
+[AC_CHECK_LIB(fuse, fuse_main, [FUSE_LIB=-lfuse], [FUSE_CMT="#"])])
+fi
+if test -z "$FUSE_CMT"
+then
+ AC_MSG_RESULT([Enabling fuse2fs by default.])
+fi
+)
+AC_SUBST(FUSE_LIB)
+AC_SUBST(FUSE_CMT)
+dnl
dnl See if optreset exists
dnl
AC_MSG_CHECKING(for optreset)
diff --git a/misc/Makefile.in b/misc/Makefile.in
index a798f96..1838d03 100644
--- a/misc/Makefile.in
+++ b/misc/Makefile.in
@@ -26,9 +26,12 @@ INSTALL = @INSTALL@
@BLKID_CMT@FINDFS_LINK= findfs
@BLKID_CMT@FINDFS_MAN= findfs.8
+@FUSE_CMT@FUSE_PROG= fuse2fs
+
SPROGS= mke2fs badblocks tune2fs dumpe2fs $(BLKID_PROG) logsave \
$(E2IMAGE_PROG) @FSCK_PROG@ e2undo
-USPROGS= mklost+found filefrag e2freefrag $(UUIDD_PROG) $(E4DEFRAG_PROG)
+USPROGS= mklost+found filefrag e2freefrag $(UUIDD_PROG) $(E4DEFRAG_PROG) \
+ $(FUSE_PROG)
SMANPAGES= tune2fs.8 mklost+found.8 mke2fs.8 dumpe2fs.8 badblocks.8 \
e2label.8 $(FINDFS_MAN) $(BLKID_MAN) $(E2IMAGE_MAN) \
logsave.8 filefrag.8 e2freefrag.8 e2undo.8 \
@@ -56,6 +59,7 @@ FILEFRAG_OBJS= filefrag.o
E2UNDO_OBJS= e2undo.o
E4DEFRAG_OBJS= e4defrag.o
E2FREEFRAG_OBJS= e2freefrag.o
+FUSE2FS_OBJS= fuse2fs.o
PROFILED_TUNE2FS_OBJS= profiled/tune2fs.o profiled/util.o
PROFILED_MKLPF_OBJS= profiled/mklost+found.o
@@ -75,6 +79,7 @@ PROFILED_FILEFRAG_OBJS= profiled/filefrag.o
PROFILED_E2FREEFRAG_OBJS= profiled/e2freefrag.o
PROFILED_E2UNDO_OBJS= profiled/e2undo.o
PROFILED_E4DEFRAG_OBJS= profiled/e4defrag.o
+PROFILED_FUSE2FS_OBJS= profiled/fuse2fs.o
SRCS= $(srcdir)/tune2fs.c $(srcdir)/mklost+found.c $(srcdir)/mke2fs.c \
$(srcdir)/chattr.c $(srcdir)/lsattr.c $(srcdir)/dumpe2fs.c \
@@ -82,7 +87,7 @@ SRCS= $(srcdir)/tune2fs.c $(srcdir)/mklost+found.c $(srcdir)/mke2fs.c \
$(srcdir)/uuidgen.c $(srcdir)/blkid.c $(srcdir)/logsave.c \
$(srcdir)/filefrag.c $(srcdir)/base_device.c \
$(srcdir)/ismounted.c $(srcdir)/../e2fsck/profile.c \
- $(srcdir)/e2undo.c $(srcdir)/e2freefrag.c
+ $(srcdir)/e2undo.c $(srcdir)/e2freefrag.c $(srcdir)/fuse2fs.c
LIBS= $(LIBEXT2FS) $(LIBCOM_ERR)
DEPLIBS= $(LIBEXT2FS) $(DEPLIBCOM_ERR)
@@ -335,6 +340,12 @@ filefrag.profiled: $(FILEFRAG_OBJS)
$(Q) $(CC) $(ALL_LDFLAGS) -g -pg -o filefrag.profiled \
$(PROFILED_FILEFRAG_OBJS)
+fuse2fs: $(FUSE2FS_OBJS) $(DEPLIBS) $(DEPLIBBLKID) $(DEPLIBUUID) \
+ $(DEPLIBQUOTA) $(LIBEXT2FS)
+ $(E) " LD $@"
+ $(Q) $(CC) $(ALL_LDFLAGS) -o fuse2fs $(FUSE2FS_OBJS) $(LIBS) \
+ $(LIBFUSE) $(LIBBLKID) $(LIBUUID) $(LIBEXT2FS)
+
tst_ismounted: $(srcdir)/ismounted.c $(STATIC_LIBEXT2FS) $(DEPLIBCOM_ERR)
$(E) " LD $@"
$(CC) -o tst_ismounted $(srcdir)/ismounted.c -DDEBUG $(ALL_CFLAGS) \
diff --git a/misc/fuse2fs.c b/misc/fuse2fs.c
new file mode 100644
index 0000000..8be9070
--- /dev/null
+++ b/misc/fuse2fs.c
@@ -0,0 +1,2967 @@
+/*
+ * fuse2fs.c - FUSE server for e2fsprogs.
+ *
+ * Copyright (C) 2013 Oracle.
+ *
+ * %Begin-Header%
+ * This file may be redistributed under the terms of the GNU Public
+ * License.
+ * %End-Header%
+ */
+#define _FILE_OFFSET_BITS 64
+#define FUSE_USE_VERSION 29
+#define _GNU_SOURCE
+#include <pthread.h>
+#ifdef __linux__
+# include <linux/fs.h>
+# include <linux/falloc.h>
+# include <linux/xattr.h>
+#endif
+#include <sys/ioctl.h>
+#include <unistd.h>
+#include <fuse.h>
+#include "ext2fs/ext2fs.h"
+#include "ext2fs/ext2_fs.h"
+
+#if FUSE_VERSION >= FUSE_MAKE_VERSION(2, 8)
+# ifdef _IOR
+# ifdef _IOW
+# define SUPPORT_I_FLAGS
+# endif
+# endif
+#endif
+
+#ifdef FALLOC_FL_KEEP_SIZE
+# define FL_KEEP_SIZE_FLAG FALLOC_FL_KEEP_SIZE
+#else
+# define FL_KEEP_SIZE_FLAG (0)
+#endif
+
+#ifdef FALLOC_FL_PUNCH_HOLE
+# define FL_PUNCH_HOLE_FLAG FALLOC_FL_PUNCH_HOLE
+#else
+# define FL_PUNCH_HOLE_FLAG (0)
+#endif
+
+/*
+ * ext2_file_t contains a struct inode, so we can't leave files open.
+ * Use this as a proxy instead.
+ */
+struct fuse2fs_file_handle {
+ ext2_ino_t ino;
+ int open_flags;
+};
+
+/* Main program context */
+struct fuse2fs {
+ ext2_filsys fs;
+ pthread_mutex_t bfl;
+ int panic_on_error;
+ FILE *err_fp;
+ unsigned int next_generation;
+};
+
+static int __translate_error(ext2_filsys fs, errcode_t err, ext2_ino_t ino,
+ const char *file, int line);
+#define translate_error(fs, ino, err) __translate_error((fs), (err), (ino), \
+ __FILE__, __LINE__)
+
+/* for macosx */
+#ifndef W_OK
+# define W_OK 2
+#endif
+
+#ifndef R_OK
+# define R_OK 4
+#endif
+
+#define EXT4_EPOCH_BITS 2
+#define EXT4_EPOCH_MASK ((1 << EXT4_EPOCH_BITS) - 1)
+#define EXT4_NSEC_MASK (~0UL << EXT4_EPOCH_BITS)
+
+/*
+ * Extended fields will fit into an inode if the filesystem was formatted
+ * with large inodes (-I 256 or larger) and there are not currently any EAs
+ * consuming all of the available space. For new inodes we always reserve
+ * enough space for the kernel's known extended fields, but for inodes
+ * created with an old kernel this might not have been the case. None of
+ * the extended inode fields is critical for correct filesystem operation.
+ * This macro checks if a certain field fits in the inode. Note that
+ * inode-size = GOOD_OLD_INODE_SIZE + i_extra_isize
+ */
+#define EXT4_FITS_IN_INODE(ext4_inode, field) \
+ ((offsetof(typeof(*ext4_inode), field) + \
+ sizeof((ext4_inode)->field)) \
+ <= (EXT2_GOOD_OLD_INODE_SIZE + \
+ (ext4_inode)->i_extra_isize)) \
+
+static inline __u32 ext4_encode_extra_time(const struct timespec *time)
+{
+ return (sizeof(time->tv_sec) > 4 ?
+ (time->tv_sec >> 32) & EXT4_EPOCH_MASK : 0) |
+ ((time->tv_nsec << EXT4_EPOCH_BITS) & EXT4_NSEC_MASK);
+}
+
+static inline void ext4_decode_extra_time(struct timespec *time, __u32 extra)
+{
+ if (sizeof(time->tv_sec) > 4)
+ time->tv_sec |= (__u64)((extra) & EXT4_EPOCH_MASK) << 32;
+ time->tv_nsec = ((extra) & EXT4_NSEC_MASK) >> EXT4_EPOCH_BITS;
+}
+
+#define EXT4_INODE_SET_XTIME(xtime, timespec, raw_inode) \
+do { \
+ (raw_inode)->xtime = (timespec)->tv_sec; \
+ if (EXT4_FITS_IN_INODE(raw_inode, xtime ## _extra)) \
+ (raw_inode)->xtime ## _extra = \
+ ext4_encode_extra_time(timespec); \
+} while (0)
+
+#define EXT4_EINODE_SET_XTIME(xtime, timespec, raw_inode) \
+do { \
+ if (EXT4_FITS_IN_INODE(raw_inode, xtime)) \
+ (raw_inode)->xtime = (timespec)->tv_sec; \
+ if (EXT4_FITS_IN_INODE(raw_inode, xtime ## _extra)) \
+ (raw_inode)->xtime ## _extra = \
+ ext4_encode_extra_time(timespec); \
+} while (0)
+
+#define EXT4_INODE_GET_XTIME(xtime, timespec, raw_inode) \
+do { \
+ (timespec)->tv_sec = (signed)((raw_inode)->xtime); \
+ if (EXT4_FITS_IN_INODE(raw_inode, xtime ## _extra)) \
+ ext4_decode_extra_time((timespec), \
+ raw_inode->xtime ## _extra); \
+ else \
+ (timespec)->tv_nsec = 0; \
+} while (0)
+
+#define EXT4_EINODE_GET_XTIME(xtime, timespec, raw_inode) \
+do { \
+ if (EXT4_FITS_IN_INODE(raw_inode, xtime)) \
+ (timespec)->tv_sec = \
+ (signed)((raw_inode)->xtime); \
+ if (EXT4_FITS_IN_INODE(raw_inode, xtime ## _extra)) \
+ ext4_decode_extra_time((timespec), \
+ raw_inode->xtime ## _extra); \
+ else \
+ (timespec)->tv_nsec = 0; \
+} while (0)
+
+static void get_now(struct timespec *now)
+{
+#ifdef CLOCK_REALTIME
+ if (!clock_gettime(CLOCK_REALTIME, now))
+ return;
+#endif
+
+ now->tv_sec = time(NULL);
+ now->tv_nsec = 0;
+}
+
+static void increment_version(struct ext2_inode_large *inode)
+{
+ __u64 ver;
+
+ ver = inode->osd1.linux1.l_i_version;
+ if (EXT4_FITS_IN_INODE(inode, i_version_hi))
+ ver |= (__u64)inode->i_version_hi << 32;
+ ver++;
+ inode->osd1.linux1.l_i_version = ver;
+ if (EXT4_FITS_IN_INODE(inode, i_version_hi))
+ inode->i_version_hi = ver >> 32;
+}
+
+static void init_times(struct ext2_inode_large *inode)
+{
+ struct timespec now;
+
+ get_now(&now);
+ EXT4_INODE_SET_XTIME(i_atime, &now, inode);
+ EXT4_INODE_SET_XTIME(i_ctime, &now, inode);
+ EXT4_INODE_SET_XTIME(i_mtime, &now, inode);
+ EXT4_EINODE_SET_XTIME(i_crtime, &now, inode);
+ increment_version(inode);
+}
+
+static int update_ctime(ext2_filsys fs, ext2_ino_t ino,
+ struct ext2_inode_large *pinode)
+{
+ errcode_t err;
+ struct timespec now;
+ struct ext2_inode_large inode;
+
+ get_now(&now);
+
+ /* If user already has an inode buffer, just update that */
+ if (pinode) {
+ increment_version(pinode);
+ EXT4_INODE_SET_XTIME(i_ctime, &now, pinode);
+ return 0;
+ }
+
+ /* Otherwise we have to read-modify-write the inode */
+ memset(&inode, 0, sizeof(inode));
+ err = ext2fs_read_inode_full(fs, ino, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err)
+ return translate_error(fs, ino, err);
+
+ increment_version(&inode);
+ EXT4_INODE_SET_XTIME(i_ctime, &now, &inode);
+
+ err = ext2fs_write_inode_full(fs, ino, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err)
+ return translate_error(fs, ino, err);
+
+ return 0;
+}
+
+static int update_atime(ext2_filsys fs, ext2_ino_t ino)
+{
+ errcode_t err;
+ struct ext2_inode_large inode, *pinode;
+ struct timespec atime, mtime, now;
+
+ if (!(fs->flags & EXT2_FLAG_RW))
+ return 0;
+ memset(&inode, 0, sizeof(inode));
+ err = ext2fs_read_inode_full(fs, ino, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err)
+ return translate_error(fs, ino, err);
+
+ pinode = &inode;
+ EXT4_INODE_GET_XTIME(i_atime, &atime, pinode);
+ EXT4_INODE_GET_XTIME(i_mtime, &mtime, pinode);
+ get_now(&now);
+ /*
+ * If atime is not older than mtime and atime was updated within the
+ * last day, skip the atime update. Same idea as Linux "relatime".
+ */
+ if (atime.tv_sec >= mtime.tv_sec && atime.tv_sec >= now.tv_sec - 86400)
+ return 0;
+ EXT4_INODE_SET_XTIME(i_atime, &now, &inode);
+
+ err = ext2fs_write_inode_full(fs, ino, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err)
+ return translate_error(fs, ino, err);
+
+ return 0;
+}
+
+static int update_mtime(ext2_filsys fs, ext2_ino_t ino)
+{
+ errcode_t err;
+ struct ext2_inode_large inode;
+ struct timespec now;
+
+ memset(&inode, 0, sizeof(inode));
+ err = ext2fs_read_inode_full(fs, ino, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err)
+ return translate_error(fs, ino, err);
+
+ get_now(&now);
+ EXT4_INODE_SET_XTIME(i_mtime, &now, &inode);
+ EXT4_INODE_SET_XTIME(i_ctime, &now, &inode);
+ increment_version(&inode);
+
+ err = ext2fs_write_inode_full(fs, ino, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err)
+ return translate_error(fs, ino, err);
+
+ return 0;
+}
+
+static int ext2_file_type(unsigned int mode)
+{
+ if (LINUX_S_ISREG(mode))
+ return EXT2_FT_REG_FILE;
+
+ if (LINUX_S_ISDIR(mode))
+ return EXT2_FT_DIR;
+
+ if (LINUX_S_ISCHR(mode))
+ return EXT2_FT_CHRDEV;
+
+ if (LINUX_S_ISBLK(mode))
+ return EXT2_FT_BLKDEV;
+
+ if (LINUX_S_ISLNK(mode))
+ return EXT2_FT_SYMLINK;
+
+ if (LINUX_S_ISFIFO(mode))
+ return EXT2_FT_FIFO;
+
+ if (LINUX_S_ISSOCK(mode))
+ return EXT2_FT_SOCK;
+
+ return 0;
+}
+
+static int fs_writeable(ext2_filsys fs)
+{
+ return (fs->flags & EXT2_FLAG_RW) && (fs->super->s_error_count == 0);
+}
+
+static int check_inum_access(struct fuse_context *ctxt, ext2_filsys fs,
+ ext2_ino_t ino, int mask)
+{
+ errcode_t err;
+ struct ext2_inode inode;
+ mode_t perms;
+
+ /* no writing to read-only or broken fs */
+ if ((mask & W_OK) && !fs_writeable(fs))
+ return -EROFS;
+
+ err = ext2fs_read_inode(fs, ino, &inode);
+ if (err)
+ return translate_error(fs, ino, err);
+ perms = inode.i_mode & 0777;
+
+ printf("access ino=%d mask=e%s%s%s perms=0%o uid=%d\n", ino,
+ (mask & R_OK ? "r" : ""), (mask & W_OK ? "w" : ""),
+ (mask & X_OK ? "x" : ""), perms, ctxt->uid);
+
+ /* existence check */
+ if (mask == 0)
+ return 0;
+
+ /* is immutable? */
+ if ((mask & W_OK) &&
+ (inode.i_flags & EXT2_IMMUTABLE_FL))
+ return -EPERM;
+
+ /* Figure out what root's allowed to do */
+ if (ctxt->uid == 0) {
+ /* Non-file access always ok */
+ if (!LINUX_S_ISREG(inode.i_mode))
+ return 0;
+
+ /* R/W access to a file always ok */
+ if (!(mask & X_OK))
+ return 0;
+
+ /* X access to a file ok if a user/group/other can X */
+ if (perms & 0111)
+ return 0;
+
+ /* Trying to execute a file that's not executable. BZZT! */
+ return -EPERM;
+ }
+
+ /* allow owner, if perms match */
+ if (inode.i_uid == ctxt->uid) {
+ if ((mask & (perms >> 6)) == mask)
+ return 0;
+ return -EPERM;
+ }
+
+ /* allow group, if perms match */
+ if (inode.i_gid == ctxt->gid) {
+ if ((mask & (perms >> 3)) == mask)
+ return 0;
+ return -EPERM;
+ }
+
+ /* otherwise check other */
+ if ((mask & perms) == mask)
+ return 0;
+ return -EPERM;
+}
+
+static void op_destroy(void *p)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ errcode_t err;
+
+ if (fs->flags & EXT2_FLAG_RW) {
+ fs->super->s_state |= EXT2_VALID_FS;
+ if (fs->super->s_error_count)
+ fs->super->s_state |= EXT2_ERROR_FS;
+ ext2fs_mark_super_dirty(fs);
+ err = ext2fs_set_gdt_csum(fs);
+ if (err)
+ translate_error(fs, 0, err);
+
+ err = ext2fs_flush2(fs, 0);
+ if (err)
+ translate_error(fs, 0, err);
+ }
+}
+
+static void *op_init(struct fuse_conn_info *conn)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ errcode_t err;
+
+ if (fs->flags & EXT2_FLAG_RW) {
+ fs->super->s_mnt_count++;
+ fs->super->s_mtime = time(NULL);
+ fs->super->s_state &= ~EXT2_VALID_FS;
+ ext2fs_mark_super_dirty(fs);
+ err = ext2fs_flush2(fs, 0);
+ if (err)
+ translate_error(fs, 0, err);
+ }
+ return ff;
+}
+
+static int stat_inode(ext2_filsys fs, ext2_ino_t ino, struct stat *statbuf)
+{
+ struct ext2_inode_large inode;
+ dev_t fakedev = 0;
+ errcode_t err;
+ int ret = 0;
+
+ memset(&inode, 0, sizeof(inode));
+ err = ext2fs_read_inode_full(fs, ino, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err)
+ return translate_error(fs, ino, err);
+
+ memcpy(&fakedev, fs->super->s_uuid, sizeof(fakedev));
+ statbuf->st_dev = fakedev;
+ statbuf->st_ino = ino;
+ statbuf->st_mode = inode.i_mode;
+ statbuf->st_nlink = inode.i_links_count;
+ statbuf->st_uid = inode.i_uid;
+ statbuf->st_gid = inode.i_gid;
+ statbuf->st_size = inode.i_size;
+ statbuf->st_blksize = fs->blocksize;
+ statbuf->st_blocks = inode.i_blocks;
+ statbuf->st_atime = inode.i_atime;
+ statbuf->st_mtime = inode.i_mtime;
+ statbuf->st_ctime = inode.i_ctime;
+ if (LINUX_S_ISCHR(inode.i_mode) ||
+ LINUX_S_ISBLK(inode.i_mode)) {
+ if (inode.i_block[0])
+ statbuf->st_rdev = inode.i_block[0];
+ else
+ statbuf->st_rdev = inode.i_block[1];
+ }
+
+ return ret;
+}
+
+static int op_getattr(const char *path, struct stat *statbuf)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ ext2_ino_t ino;
+ errcode_t err;
+ int ret = 0;
+
+ pthread_mutex_lock(&ff->bfl);
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, path, &ino);
+ if (err) {
+ ret = translate_error(fs, 0, err);
+ goto out;
+ }
+ ret = stat_inode(fs, ino, statbuf);
+out:
+ pthread_mutex_unlock(&ff->bfl);
+ return ret;
+}
+
+static int op_readlink(const char *path, char *buf, size_t len)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ errcode_t err;
+ ext2_ino_t ino;
+ struct ext2_inode inode;
+ unsigned int got;
+ ext2_file_t file;
+ int ret = 0;
+
+ pthread_mutex_lock(&ff->bfl);
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, path, &ino);
+ if (err || ino == 0) {
+ ret = translate_error(fs, 0, err);
+ goto out;
+ }
+
+ err = ext2fs_read_inode(fs, ino, &inode);
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out;
+ }
+
+ if (!LINUX_S_ISLNK(inode.i_mode)) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ len--;
+ if (inode.i_size < len)
+ len = inode.i_size;
+ if (ext2fs_inode_data_blocks2(fs, &inode)) {
+ /* big symlink */
+
+ err = ext2fs_file_open(fs, ino, 0, &file);
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out;
+ }
+
+ err = ext2fs_file_read(file, buf, len, &got);
+ if (err || got != len) {
+ /* out2 below closes the file; do not close it twice */
+ ret = translate_error(fs, ino, err);
+ goto out2;
+ }
+
+out2:
+ err = ext2fs_file_close(file);
+ if (ret)
+ goto out;
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out;
+ }
+ } else
+ /* inline symlink */
+ memcpy(buf, (char *)inode.i_block, len);
+ buf[len] = 0;
+
+ if (fs_writeable(fs)) {
+ ret = update_atime(fs, ino);
+ if (ret)
+ goto out;
+ }
+
+out:
+ pthread_mutex_unlock(&ff->bfl);
+ return ret;
+}
+
+static int op_mknod(const char *path, mode_t mode, dev_t dev)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ ext2_ino_t parent, child;
+ char *temp_path = strdup(path);
+ errcode_t err;
+ char *node_name, a;
+ int filetype;
+ struct ext2_inode_large inode;
+ int ret = 0;
+
+ if (!temp_path) {
+ ret = -ENOMEM;
+ goto out;
+ }
+ node_name = strrchr(temp_path, '/');
+ if (!node_name) {
+ ret = -ENOMEM;
+ goto out;
+ }
+ node_name++;
+ a = *node_name;
+ *node_name = 0;
+
+ pthread_mutex_lock(&ff->bfl);
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, temp_path,
+ &parent);
+ if (err) {
+ ret = translate_error(fs, 0, err);
+ goto out2;
+ }
+
+ ret = check_inum_access(ctxt, fs, parent, W_OK);
+ if (ret)
+ goto out2;
+
+ *node_name = a;
+
+ if (LINUX_S_ISCHR(mode))
+ filetype = EXT2_FT_CHRDEV;
+ else if (LINUX_S_ISBLK(mode))
+ filetype = EXT2_FT_BLKDEV;
+ else if (LINUX_S_ISFIFO(mode))
+ filetype = EXT2_FT_FIFO;
+ else {
+ ret = -EINVAL;
+ goto out2;
+ }
+
+ err = ext2fs_new_inode(fs, parent, mode, 0, &child);
+ if (err) {
+ ret = translate_error(fs, 0, err);
+ goto out2;
+ }
+
+ err = ext2fs_link(fs, parent, node_name, child, filetype);
+ if (err == EXT2_ET_DIR_NO_SPACE) {
+ err = ext2fs_expand_dir(fs, parent);
+ if (err) {
+ ret = translate_error(fs, parent, err);
+ goto out2;
+ }
+
+ err = ext2fs_link(fs, parent, node_name, child,
+ filetype);
+ }
+ if (err) {
+ ret = translate_error(fs, parent, err);
+ goto out2;
+ }
+
+ ret = update_mtime(fs, parent);
+ if (ret)
+ goto out2;
+
+ memset(&inode, 0, sizeof(inode));
+ inode.i_mode = mode;
+
+ if (dev & ~0xFFFF)
+ inode.i_block[1] = dev;
+ else
+ inode.i_block[0] = dev;
+ inode.i_links_count = 1;
+ inode.i_extra_isize = sizeof(struct ext2_inode_large) -
+ EXT2_GOOD_OLD_INODE_SIZE;
+
+ err = ext2fs_write_new_inode(fs, child, (struct ext2_inode *)&inode);
+ if (err) {
+ ret = translate_error(fs, child, err);
+ goto out2;
+ }
+
+ inode.i_generation = ff->next_generation++;
+ init_times(&inode);
+ err = ext2fs_write_inode_full(fs, child, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err) {
+ ret = translate_error(fs, child, err);
+ goto out2;
+ }
+
+ ext2fs_inode_alloc_stats2(fs, child, 1, 0);
+
+out2:
+ pthread_mutex_unlock(&ff->bfl);
+out:
+ free(temp_path);
+ return ret;
+}
+
+static int op_mkdir(const char *path, mode_t mode)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ ext2_ino_t parent, child;
+ char *temp_path = strdup(path);
+ errcode_t err;
+ char *node_name, a;
+ struct ext2_inode_large inode;
+ char *block;
+ blk64_t blk;
+ int ret = 0;
+ mode_t parent_sgid;
+
+ if (!temp_path) {
+ ret = -ENOMEM;
+ goto out;
+ }
+ node_name = strrchr(temp_path, '/');
+ if (!node_name) {
+ ret = -ENOMEM;
+ goto out;
+ }
+ node_name++;
+ a = *node_name;
+ *node_name = 0;
+
+ pthread_mutex_lock(&ff->bfl);
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, temp_path,
+ &parent);
+ if (err) {
+ ret = translate_error(fs, 0, err);
+ goto out2;
+ }
+
+ ret = check_inum_access(ctxt, fs, parent, W_OK);
+ if (ret)
+ goto out2;
+
+ /* Is the parent dir sgid? */
+ err = ext2fs_read_inode_full(fs, parent, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err) {
+ ret = translate_error(fs, parent, err);
+ goto out2;
+ }
+ parent_sgid = inode.i_mode & S_ISGID;
+
+ *node_name = a;
+
+ err = ext2fs_mkdir(fs, parent, 0, node_name);
+ if (err == EXT2_ET_DIR_NO_SPACE) {
+ err = ext2fs_expand_dir(fs, parent);
+ if (err) {
+ ret = translate_error(fs, parent, err);
+ goto out2;
+ }
+
+ err = ext2fs_mkdir(fs, parent, 0, node_name);
+ }
+ if (err) {
+ ret = translate_error(fs, parent, err);
+ goto out2;
+ }
+
+ ret = update_mtime(fs, parent);
+ if (ret)
+ goto out2;
+
+ /* Still have to update the uid/gid of the dir */
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, temp_path,
+ &child);
+ if (err) {
+ ret = translate_error(fs, 0, err);
+ goto out2;
+ }
+
+ memset(&inode, 0, sizeof(inode));
+ err = ext2fs_read_inode_full(fs, child, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err) {
+ ret = translate_error(fs, child, err);
+ goto out2;
+ }
+
+ inode.i_uid = ctxt->uid;
+ inode.i_gid = ctxt->gid;
+ inode.i_mode = LINUX_S_IFDIR | (mode & ~(S_ISUID | fs->umask)) |
+ parent_sgid;
+ inode.i_generation = ff->next_generation++;
+
+ err = ext2fs_write_inode_full(fs, child, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err) {
+ ret = translate_error(fs, child, err);
+ goto out2;
+ }
+
+ /* Rewrite the directory block checksum, having set i_generation */
+ if (!EXT2_HAS_RO_COMPAT_FEATURE(fs->super,
+ EXT4_FEATURE_RO_COMPAT_METADATA_CSUM))
+ goto out2;
+ err = ext2fs_new_dir_block(fs, child, parent, &block);
+ if (err) {
+ ret = translate_error(fs, child, err);
+ goto out2;
+ }
+ err = ext2fs_bmap2(fs, child, (struct ext2_inode *)&inode, NULL, 0, 0,
+ NULL, &blk);
+ if (err) {
+ ret = translate_error(fs, child, err);
+ goto out3;
+ }
+ err = ext2fs_write_dir_block4(fs, blk, block, 0, child);
+ if (err) {
+ ret = translate_error(fs, child, err);
+ goto out3;
+ }
+
+out3:
+ ext2fs_free_mem(&block);
+out2:
+ pthread_mutex_unlock(&ff->bfl);
+out:
+ free(temp_path);
+ return ret;
+}
+
+static int unlink_file_by_name(struct fuse_context *ctxt, ext2_filsys fs,
+ const char *path)
+{
+ errcode_t err;
+ ext2_ino_t dir;
+ char *filename = strdup(path);
+ char *base_name;
+ int ret;
+
+ if (!filename)
+ return -ENOMEM;
+
+ base_name = strrchr(filename, '/');
+ if (base_name) {
+ *base_name++ = '\0';
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, filename,
+ &dir);
+ if (err) {
+ free(filename);
+ return translate_error(fs, 0, err);
+ }
+ } else {
+ dir = EXT2_ROOT_INO;
+ base_name = filename;
+ }
+
+ ret = check_inum_access(ctxt, fs, dir, W_OK);
+ if (ret) {
+ free(filename);
+ return ret;
+ }
+
+ err = ext2fs_unlink(fs, dir, base_name, 0, 0);
+ free(filename);
+ if (err)
+ return translate_error(fs, dir, err);
+
+ return update_mtime(fs, dir);
+}
+
+static int release_blocks_proc(ext2_filsys fs, blk64_t *blocknr,
+ e2_blkcnt_t blockcnt EXT2FS_ATTR((unused)),
+ blk64_t ref_block EXT2FS_ATTR((unused)),
+ int ref_offset EXT2FS_ATTR((unused)),
+ void *private EXT2FS_ATTR((unused)))
+{
+ blk64_t blk = *blocknr;
+
+ if (blk % EXT2FS_CLUSTER_RATIO(fs) == 0)
+ ext2fs_block_alloc_stats2(fs, *blocknr, -1);
+ return 0;
+}
+
+static int remove_inode(struct fuse2fs *ff, ext2_ino_t ino)
+{
+ ext2_filsys fs = ff->fs;
+ errcode_t err;
+ struct ext2_inode_large inode;
+ int ret = 0;
+
+ memset(&inode, 0, sizeof(inode));
+ err = ext2fs_read_inode_full(fs, ino, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out;
+ }
+
+ switch (inode.i_links_count) {
+ case 0:
+ return 0; /* XXX: already done? */
+ case 1:
+ inode.i_links_count--;
+ inode.i_dtime = fs->now ? fs->now : time(0);
+ break;
+ default:
+ inode.i_links_count--;
+ }
+
+ ret = update_ctime(fs, ino, &inode);
+ if (ret)
+ goto out;
+
+ err = ext2fs_write_inode_full(fs, ino, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out;
+ }
+
+ if (inode.i_links_count)
+ goto out;
+
+ err = ext2fs_free_ext_attr(fs, ino, &inode);
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out;
+ }
+ if (ext2fs_inode_has_valid_blocks2(fs, (struct ext2_inode *)&inode))
+ ext2fs_block_iterate3(fs, ino, BLOCK_FLAG_READ_ONLY, NULL,
+ release_blocks_proc, NULL);
+ ext2fs_inode_alloc_stats2(fs, ino, -1,
+ LINUX_S_ISDIR(inode.i_mode));
+out:
+ return ret;
+}
+
+static int __op_unlink(const char *path)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ ext2_ino_t ino;
+ errcode_t err;
+ int ret = 0;
+
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, path, &ino);
+ if (err) {
+ ret = translate_error(fs, 0, err);
+ goto out;
+ }
+
+ ret = unlink_file_by_name(ctxt, fs, path);
+ if (ret)
+ goto out;
+
+ ret = remove_inode(ff, ino);
+ if (ret)
+ goto out;
+out:
+ return ret;
+}
+
+static int op_unlink(const char *path)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ int ret;
+
+ pthread_mutex_lock(&ff->bfl);
+ ret = __op_unlink(path);
+ pthread_mutex_unlock(&ff->bfl);
+ return ret;
+}
+
+struct rd_struct {
+ ext2_ino_t parent;
+ int empty;
+};
+
+static int rmdir_proc(ext2_ino_t dir EXT2FS_ATTR((unused)),
+ int entry EXT2FS_ATTR((unused)),
+ struct ext2_dir_entry *dirent,
+ int offset EXT2FS_ATTR((unused)),
+ int blocksize EXT2FS_ATTR((unused)),
+ char *buf EXT2FS_ATTR((unused)),
+ void *private)
+{
+ struct rd_struct *rds = (struct rd_struct *) private;
+
+ if (dirent->inode == 0)
+ return 0;
+ if (((dirent->name_len & 0xFF) == 1) && (dirent->name[0] == '.'))
+ return 0;
+ if (((dirent->name_len & 0xFF) == 2) && (dirent->name[0] == '.') &&
+ (dirent->name[1] == '.')) {
+ rds->parent = dirent->inode;
+ return 0;
+ }
+ rds->empty = 0;
+ return 0;
+}
+
+static int __op_rmdir(const char *path)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ ext2_ino_t child;
+ errcode_t err;
+ struct ext2_inode inode;
+ struct rd_struct rds;
+ int ret = 0;
+
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, path, &child);
+ if (err) {
+ ret = translate_error(fs, 0, err);
+ goto out;
+ }
+
+ rds.parent = 0;
+ rds.empty = 1;
+
+ err = ext2fs_dir_iterate2(fs, child, 0, 0, rmdir_proc, &rds);
+ if (err) {
+ ret = translate_error(fs, child, err);
+ goto out;
+ }
+
+ if (rds.empty == 0) {
+ ret = -ENOTEMPTY;
+ goto out;
+ }
+
+ ret = unlink_file_by_name(ctxt, fs, path);
+ if (ret)
+ goto out;
+ /* Directories have to be "removed" twice. */
+ ret = remove_inode(ff, child);
+ if (ret)
+ goto out;
+ ret = remove_inode(ff, child);
+ if (ret)
+ goto out;
+
+ if (rds.parent) {
+ err = ext2fs_read_inode(fs, rds.parent, &inode);
+ if (err) {
+ ret = translate_error(fs, rds.parent, err);
+ goto out;
+ }
+ if (inode.i_links_count > 1)
+ inode.i_links_count--;
+ err = ext2fs_write_inode(fs, rds.parent, &inode);
+ if (err) {
+ ret = translate_error(fs, rds.parent, err);
+ goto out;
+ }
+ /* Update mtime last so the stale in-memory inode can't clobber it */
+ ret = update_mtime(fs, rds.parent);
+ if (ret)
+ goto out;
+ }
+
+out:
+ return ret;
+}
+
+static int op_rmdir(const char *path)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ int ret;
+
+ pthread_mutex_lock(&ff->bfl);
+ ret = __op_rmdir(path);
+ pthread_mutex_unlock(&ff->bfl);
+ return ret;
+}
+
+static int op_symlink(const char *src, const char *dest)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ ext2_ino_t parent, child;
+ char *temp_path = strdup(dest);
+ errcode_t err;
+ char *node_name, a;
+ struct ext2_inode_large inode;
+ int len = strlen(src);
+ int ret = 0;
+
+ if (!temp_path) {
+ ret = -ENOMEM;
+ goto out;
+ }
+ node_name = strrchr(temp_path, '/');
+ if (!node_name) {
+ ret = -EINVAL;
+ goto out;
+ }
+ node_name++;
+ a = *node_name;
+ *node_name = 0;
+
+ pthread_mutex_lock(&ff->bfl);
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, temp_path,
+ &parent);
+ *node_name = a;
+ if (err) {
+ ret = translate_error(fs, 0, err);
+ goto out2;
+ }
+
+ ret = check_inum_access(ctxt, fs, parent, W_OK);
+ if (ret)
+ goto out2;
+
+
+ /* Create symlink */
+ err = ext2fs_symlink(fs, parent, 0, node_name, (char *)src);
+ if (err == EXT2_ET_DIR_NO_SPACE) {
+ err = ext2fs_expand_dir(fs, parent);
+ if (err) {
+ ret = translate_error(fs, parent, err);
+ goto out2;
+ }
+
+ err = ext2fs_symlink(fs, parent, 0, node_name, (char *)src);
+ }
+ if (err) {
+ ret = translate_error(fs, parent, err);
+ goto out2;
+ }
+
+ /* Update parent dir's mtime */
+ ret = update_mtime(fs, parent);
+ if (ret)
+ goto out2;
+
+ /* Still have to update the uid/gid of the symlink */
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, temp_path,
+ &child);
+ if (err) {
+ ret = translate_error(fs, 0, err);
+ goto out2;
+ }
+
+ memset(&inode, 0, sizeof(inode));
+ err = ext2fs_read_inode_full(fs, child, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err) {
+ ret = translate_error(fs, child, err);
+ goto out2;
+ }
+
+ inode.i_uid = ctxt->uid;
+ inode.i_gid = ctxt->gid;
+ inode.i_generation = ff->next_generation++;
+
+ err = ext2fs_write_inode_full(fs, child, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err) {
+ ret = translate_error(fs, child, err);
+ goto out2;
+ }
+out2:
+ pthread_mutex_unlock(&ff->bfl);
+out:
+ free(temp_path);
+ return ret;
+}
+
+static int op_rename(const char *from, const char *to)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ errcode_t err;
+ ext2_ino_t from_ino, to_ino, to_dir_ino, from_dir_ino;
+ char *temp_to = NULL, *temp_from = NULL;
+ char *cp, a;
+ struct ext2_inode inode;
+ int ret = 0;
+
+ pthread_mutex_lock(&ff->bfl);
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, from, &from_ino);
+ if (err || from_ino == 0) {
+ ret = translate_error(fs, 0, err);
+ goto out;
+ }
+
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, to, &to_ino);
+ if (err && err != EXT2_ET_FILE_NOT_FOUND) {
+ ret = translate_error(fs, 0, err);
+ goto out;
+ }
+
+ if (err == EXT2_ET_FILE_NOT_FOUND)
+ to_ino = 0;
+
+ /* Already the same file? */
+ if (to_ino != 0 && to_ino == from_ino) {
+ ret = 0;
+ goto out;
+ }
+
+ temp_to = strdup(to);
+ if (!temp_to) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ temp_from = strdup(from);
+ if (!temp_from) {
+ ret = -ENOMEM;
+ goto out2;
+ }
+
+ /* Find parent dir of the source and check write access */
+ cp = strrchr(temp_from, '/');
+ if (!cp) {
+ ret = -EINVAL;
+ goto out2;
+ }
+
+ a = *(cp + 1);
+ *(cp + 1) = 0;
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, temp_from,
+ &from_dir_ino);
+ *(cp + 1) = a;
+ if (err) {
+ ret = translate_error(fs, 0, err);
+ goto out2;
+ }
+ if (from_dir_ino == 0) {
+ ret = -ENOENT;
+ goto out2;
+ }
+
+ ret = check_inum_access(ctxt, fs, from_dir_ino, W_OK);
+ if (ret)
+ goto out2;
+
+ /* Find parent dir of the destination and check write access */
+ cp = strrchr(temp_to, '/');
+ if (!cp) {
+ ret = -EINVAL;
+ goto out2;
+ }
+
+ a = *(cp + 1);
+ *(cp + 1) = 0;
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, temp_to,
+ &to_dir_ino);
+ *(cp + 1) = a;
+ if (err) {
+ ret = translate_error(fs, 0, err);
+ goto out2;
+ }
+ if (to_dir_ino == 0) {
+ ret = -ENOENT;
+ goto out2;
+ }
+
+ ret = check_inum_access(ctxt, fs, to_dir_ino, W_OK);
+ if (ret)
+ goto out2;
+
+ /* If the target exists, unlink it first */
+ if (to_ino != 0) {
+ err = ext2fs_read_inode(fs, to_ino, &inode);
+ if (err) {
+ ret = translate_error(fs, to_ino, err);
+ goto out2;
+ }
+
+ if (LINUX_S_ISDIR(inode.i_mode))
+ ret = __op_rmdir(to);
+ else
+ ret = __op_unlink(to);
+ if (ret)
+ goto out2;
+ }
+
+ /* Get ready to do the move */
+ err = ext2fs_read_inode(fs, from_ino, &inode);
+ if (err) {
+ ret = translate_error(fs, from_ino, err);
+ goto out2;
+ }
+
+ /* Link in the new file */
+ err = ext2fs_link(fs, to_dir_ino, cp + 1, from_ino,
+ ext2_file_type(inode.i_mode));
+ if (err == EXT2_ET_DIR_NO_SPACE) {
+ err = ext2fs_expand_dir(fs, to_dir_ino);
+ if (err) {
+ ret = translate_error(fs, to_dir_ino, err);
+ goto out2;
+ }
+
+ err = ext2fs_link(fs, to_dir_ino, cp + 1, from_ino,
+ ext2_file_type(inode.i_mode));
+ }
+ if (err) {
+ ret = translate_error(fs, to_dir_ino, err);
+ goto out2;
+ }
+
+ ret = update_ctime(fs, from_ino, NULL);
+ if (ret)
+ goto out2;
+
+ ret = update_mtime(fs, to_dir_ino);
+ if (ret)
+ goto out2;
+
+ /* Remove the old file */
+ ret = unlink_file_by_name(ctxt, fs, from);
+ if (ret)
+ goto out2;
+
+ /* Flush the whole mess out */
+ err = ext2fs_flush2(fs, 0);
+ if (err)
+ ret = translate_error(fs, 0, err);
+
+out2:
+ free(temp_from);
+ free(temp_to);
+out:
+ pthread_mutex_unlock(&ff->bfl);
+ return ret;
+}
+
+static int op_link(const char *src, const char *dest)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ char *temp_path = strdup(dest);
+ errcode_t err;
+ char *node_name, a;
+ ext2_ino_t parent, ino;
+ struct ext2_inode_large inode;
+ int ret = 0;
+
+ if (!temp_path) {
+ ret = -ENOMEM;
+ goto out;
+ }
+ node_name = strrchr(temp_path, '/');
+ if (!node_name) {
+ ret = -EINVAL;
+ goto out;
+ }
+ node_name++;
+ a = *node_name;
+ *node_name = 0;
+
+ pthread_mutex_lock(&ff->bfl);
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, temp_path,
+ &parent);
+ *node_name = a;
+ if (err) {
+ ret = translate_error(fs, 0, err);
+ goto out2;
+ }
+
+ ret = check_inum_access(ctxt, fs, parent, W_OK);
+ if (ret)
+ goto out2;
+
+
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, src, &ino);
+ if (err || ino == 0) {
+ ret = translate_error(fs, 0, err);
+ goto out2;
+ }
+
+ memset(&inode, 0, sizeof(inode));
+ err = ext2fs_read_inode_full(fs, ino, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out2;
+ }
+
+ inode.i_links_count++;
+ ret = update_ctime(fs, ino, &inode);
+ if (ret)
+ goto out2;
+
+ err = ext2fs_write_inode_full(fs, ino, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out2;
+ }
+
+ err = ext2fs_link(fs, parent, node_name, ino,
+ ext2_file_type(inode.i_mode));
+ if (err == EXT2_ET_DIR_NO_SPACE) {
+ err = ext2fs_expand_dir(fs, parent);
+ if (err) {
+ ret = translate_error(fs, parent, err);
+ goto out2;
+ }
+
+ err = ext2fs_link(fs, parent, node_name, ino,
+ ext2_file_type(inode.i_mode));
+ }
+ if (err) {
+ ret = translate_error(fs, parent, err);
+ goto out2;
+ }
+
+ ret = update_mtime(fs, parent);
+ if (ret)
+ goto out2;
+
+out2:
+ pthread_mutex_unlock(&ff->bfl);
+out:
+ free(temp_path);
+ return ret;
+}
+
+static int op_chmod(const char *path, mode_t mode)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ errcode_t err;
+ ext2_ino_t ino;
+ struct ext2_inode_large inode;
+ int ret = 0;
+
+ pthread_mutex_lock(&ff->bfl);
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, path, &ino);
+ if (err) {
+ ret = translate_error(fs, 0, err);
+ goto out;
+ }
+
+ memset(&inode, 0, sizeof(inode));
+ err = ext2fs_read_inode_full(fs, ino, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out;
+ }
+
+ if (ctxt->uid != 0 && ctxt->uid != inode.i_uid) {
+ ret = -EPERM;
+ goto out;
+ }
+
+ /*
+ * XXX: We should really check that the inode gid is not in /any/
+ * of the user's groups, but FUSE only tells us about the primary
+ * group.
+ */
+ if (ctxt->uid != 0 && ctxt->gid != inode.i_gid)
+ mode &= ~S_ISGID;
+
+ inode.i_mode &= ~0xFFF;
+ inode.i_mode |= mode & 0xFFF;
+ ret = update_ctime(fs, ino, &inode);
+ if (ret)
+ goto out;
+
+ err = ext2fs_write_inode_full(fs, ino, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out;
+ }
+
+out:
+ pthread_mutex_unlock(&ff->bfl);
+ return ret;
+}
+
+static int op_chown(const char *path, uid_t owner, gid_t group)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ errcode_t err;
+ ext2_ino_t ino;
+ struct ext2_inode_large inode;
+ int ret = 0;
+
+ pthread_mutex_lock(&ff->bfl);
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, path, &ino);
+ if (err) {
+ ret = translate_error(fs, 0, err);
+ goto out;
+ }
+
+ memset(&inode, 0, sizeof(inode));
+ err = ext2fs_read_inode_full(fs, ino, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out;
+ }
+
+ /* FUSE seems to feed us ~0 to mean "don't change" */
+ if (owner != ~0) {
+ /* Only root gets to change UID. */
+ if (ctxt->uid != 0 &&
+ !(inode.i_uid == ctxt->uid && owner == ctxt->uid)) {
+ ret = -EPERM;
+ goto out;
+ }
+ inode.i_uid = owner;
+ }
+
+ if (group != ~0) {
+ /* Only root or the owner get to change GID. */
+ if (ctxt->uid != 0 && inode.i_uid != ctxt->uid) {
+ ret = -EPERM;
+ goto out;
+ }
+
+ /*
+ * XXX: We /should/ check group membership, but FUSE only
+ * tells us about the primary group.
+ */
+ inode.i_gid = group;
+ }
+
+ ret = update_ctime(fs, ino, &inode);
+ if (ret)
+ goto out;
+
+ err = ext2fs_write_inode_full(fs, ino, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out;
+ }
+
+out:
+ pthread_mutex_unlock(&ff->bfl);
+ return ret;
+}
+
+static int op_truncate(const char *path, off_t len)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ errcode_t err;
+ ext2_ino_t ino;
+ ext2_file_t file;
+ int ret = 0;
+
+ pthread_mutex_lock(&ff->bfl);
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, path, &ino);
+ if (err || ino == 0) {
+ ret = translate_error(fs, 0, err);
+ goto out;
+ }
+
+ ret = check_inum_access(ctxt, fs, ino, W_OK);
+ if (ret)
+ goto out;
+
+ err = ext2fs_file_open(fs, ino, EXT2_FILE_WRITE, &file);
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out;
+ }
+
+ err = ext2fs_file_set_size2(file, len);
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out2;
+ }
+
+out2:
+ err = ext2fs_file_close(file);
+ if (ret)
+ goto out;
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out;
+ }
+
+ ret = update_mtime(fs, ino);
+
+out:
+ pthread_mutex_unlock(&ff->bfl);
+ return ret;
+}
+
+#ifdef __linux__
+void detect_linux_executable_open(int kernel_flags, int *access_check,
+ int *e2fs_open_flags)
+{
+ /*
+ * On Linux, execve will bleed __FMODE_EXEC into the file mode flags,
+ * and FUSE is more than happy to let that slip through.
+ */
+ if (kernel_flags & 0x20) {
+ *access_check = X_OK;
+ *e2fs_open_flags &= ~EXT2_FILE_WRITE;
+ }
+}
+#else
+void detect_linux_executable_open(int kernel_flags, int *access_check,
+ int *e2fs_open_flags)
+{
+ /* empty */
+}
+#endif /* __linux__ */
+
+static int __op_open(const char *path, struct fuse_file_info *fp)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ errcode_t err;
+ ext2_ino_t ino;
+ struct fuse2fs_file_handle *file;
+ int check = 0, ret = 0;
+
+ file = calloc(1, sizeof(*file));
+ if (!file)
+ return -ENOMEM;
+
+ file->open_flags = 0;
+ switch (fp->flags & O_ACCMODE) {
+ case O_RDONLY:
+ check = R_OK;
+ break;
+ case O_WRONLY:
+ check = W_OK;
+ file->open_flags |= EXT2_FILE_WRITE;
+ break;
+ case O_RDWR:
+ check = R_OK | W_OK;
+ file->open_flags |= EXT2_FILE_WRITE;
+ break;
+ }
+
+ detect_linux_executable_open(fp->flags, &check, &file->open_flags);
+
+ if (fp->flags & O_CREAT)
+ file->open_flags |= EXT2_FILE_CREATE;
+
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, path, &file->ino);
+ if (err || file->ino == 0) {
+ ret = translate_error(fs, 0, err);
+ goto out;
+ }
+
+ ret = check_inum_access(ctxt, fs, file->ino, check);
+ if (ret) {
+ /*
+ * In a regular (Linux) fs driver, the kernel will open
+ * binaries for reading if the user has --x privileges (i.e.
+ * execute without read). Since the kernel doesn't have any
+ * way to tell us if it's opening a file via execve, we'll
+ * just assume that allowing access is ok if asking for ro mode
+ * fails but asking for x mode succeeds. Of course we can
+ * also employ undocumented hacks (see above).
+ */
+ if (check == R_OK) {
+ ret = check_inum_access(ctxt, fs, file->ino, X_OK);
+ if (ret)
+ goto out;
+ } else
+ goto out;
+ }
+ fp->fh = (uint64_t)file;
+
+out:
+ if (ret)
+ free(file);
+ return ret;
+}
+
+static int op_open(const char *path, struct fuse_file_info *fp)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ int ret;
+
+ pthread_mutex_lock(&ff->bfl);
+ ret = __op_open(path, fp);
+ pthread_mutex_unlock(&ff->bfl);
+ return ret;
+}
+
+static int op_read(const char *path, char *buf, size_t len, off_t offset,
+ struct fuse_file_info *fp)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ struct fuse2fs_file_handle *fh = (struct fuse2fs_file_handle *)fp->fh;
+ ext2_file_t efp;
+ errcode_t err;
+ unsigned int got = 0;
+ int ret = 0;
+
+ pthread_mutex_lock(&ff->bfl);
+ err = ext2fs_file_open(fs, fh->ino, fh->open_flags, &efp);
+ if (err) {
+ ret = translate_error(fs, fh->ino, err);
+ goto out;
+ }
+
+ err = ext2fs_file_llseek(efp, offset, SEEK_SET, NULL);
+ if (err) {
+ ret = translate_error(fs, fh->ino, err);
+ goto out2;
+ }
+
+ err = ext2fs_file_read(efp, buf, len, &got);
+ if (err) {
+ ret = translate_error(fs, fh->ino, err);
+ goto out2;
+ }
+
+out2:
+ err = ext2fs_file_close(efp);
+ if (ret)
+ goto out;
+ if (err) {
+ ret = translate_error(fs, fh->ino, err);
+ goto out;
+ }
+
+ if (fs_writeable(fs)) {
+ ret = update_atime(fs, fh->ino);
+ if (ret)
+ goto out;
+ }
+out:
+ pthread_mutex_unlock(&ff->bfl);
+ return got ? got : ret;
+}
+
+static int op_write(const char *path, const char *buf, size_t len, off_t offset,
+ struct fuse_file_info *fp)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ struct fuse2fs_file_handle *fh = (struct fuse2fs_file_handle *)fp->fh;
+ ext2_file_t efp;
+ errcode_t err;
+ unsigned int got = 0;
+ __u64 fsize;
+ int ret = 0;
+
+ pthread_mutex_lock(&ff->bfl);
+ if (!fs_writeable(fs)) {
+ ret = -EROFS;
+ goto out;
+ }
+
+ err = ext2fs_file_open(fs, fh->ino, fh->open_flags, &efp);
+ if (err) {
+ ret = translate_error(fs, fh->ino, err);
+ goto out;
+ }
+
+ err = ext2fs_file_llseek(efp, offset, SEEK_SET, NULL);
+ if (err) {
+ ret = translate_error(fs, fh->ino, err);
+ goto out2;
+ }
+
+ err = ext2fs_file_write(efp, buf, len, &got);
+ if (err) {
+ ret = translate_error(fs, fh->ino, err);
+ goto out2;
+ }
+
+ err = ext2fs_file_flush(efp);
+ if (err) {
+ got = 0;
+ ret = translate_error(fs, fh->ino, err);
+ goto out2;
+ }
+
+out2:
+ err = ext2fs_file_close(efp);
+ if (ret)
+ goto out;
+ if (err) {
+ ret = translate_error(fs, fh->ino, err);
+ goto out;
+ }
+
+ ret = update_mtime(fs, fh->ino);
+ if (ret)
+ goto out;
+
+out:
+ pthread_mutex_unlock(&ff->bfl);
+ return got ? got : ret;
+}
+
+static int op_release(const char *path, struct fuse_file_info *fp)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ struct fuse2fs_file_handle *fh = (struct fuse2fs_file_handle *)fp->fh;
+ errcode_t err;
+ int ret = 0;
+
+ pthread_mutex_lock(&ff->bfl);
+ if (fs_writeable(fs) && fh->open_flags & EXT2_FILE_WRITE) {
+ err = ext2fs_flush2(fs, EXT2_FLAG_FLUSH_NO_SYNC);
+ if (err)
+ ret = translate_error(fs, fh->ino, err);
+ }
+ fp->fh = 0;
+ pthread_mutex_unlock(&ff->bfl);
+
+ free(fh);
+
+ return ret;
+}
+
+static int op_fsync(const char *path, int datasync, struct fuse_file_info *fp)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ struct fuse2fs_file_handle *fh = (struct fuse2fs_file_handle *)fp->fh;
+ errcode_t err;
+ int ret = 0;
+
+ /* For now, flush everything, even if it's slow */
+ pthread_mutex_lock(&ff->bfl);
+ if (fs_writeable(fs) && fh->open_flags & EXT2_FILE_WRITE) {
+ err = ext2fs_flush2(fs, 0);
+ if (err)
+ ret = translate_error(fs, fh->ino, err);
+ }
+ pthread_mutex_unlock(&ff->bfl);
+
+ return ret;
+}
+
+static int op_statfs(const char *path, struct statvfs *buf)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ uint64_t fsid, *f;
+
+ buf->f_bsize = fs->blocksize;
+ buf->f_frsize = fs->blocksize;
+ /* Use the 64-bit accessors so large filesystems report correctly */
+ buf->f_blocks = ext2fs_blocks_count(fs->super);
+ buf->f_bfree = ext2fs_free_blocks_count(fs->super);
+ if (ext2fs_free_blocks_count(fs->super) <
+ ext2fs_r_blocks_count(fs->super))
+ buf->f_bavail = 0;
+ else
+ buf->f_bavail = ext2fs_free_blocks_count(fs->super) -
+ ext2fs_r_blocks_count(fs->super);
+ buf->f_files = fs->super->s_inodes_count;
+ buf->f_ffree = fs->super->s_free_inodes_count;
+ buf->f_favail = fs->super->s_free_inodes_count;
+ f = (uint64_t *)fs->super->s_uuid;
+ fsid = *f;
+ f++;
+ fsid ^= *f;
+ buf->f_fsid = fsid;
+ buf->f_flag = 0;
+ if (!(fs->flags & EXT2_FLAG_RW))
+ buf->f_flag |= ST_RDONLY;
+ buf->f_namemax = EXT2_NAME_LEN;
+
+ return 0;
+}
+
+static int op_getxattr(const char *path, const char *key, char *value,
+ size_t len)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ struct ext2_xattr_handle *h;
+ void *ptr;
+ unsigned int plen;
+ ext2_ino_t ino;
+ errcode_t err;
+ int ret = 0;
+
+ pthread_mutex_lock(&ff->bfl);
+ if (!EXT2_HAS_COMPAT_FEATURE(fs->super,
+ EXT2_FEATURE_COMPAT_EXT_ATTR)) {
+ ret = -ENOTSUP;
+ goto out;
+ }
+
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, path, &ino);
+ if (err || ino == 0) {
+ ret = translate_error(fs, 0, err);
+ goto out;
+ }
+
+ ret = check_inum_access(ctxt, fs, ino, R_OK);
+ if (ret)
+ goto out;
+
+ err = ext2fs_xattrs_open(fs, ino, &h);
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out;
+ }
+
+ err = ext2fs_xattrs_read(h);
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out2;
+ }
+
+ err = ext2fs_xattr_get(h, key, &ptr, &plen);
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out2;
+ }
+
+ if (!len) {
+ ret = plen;
+ } else if (len < plen) {
+ ret = -ERANGE;
+ } else {
+ memcpy(value, ptr, plen);
+ ret = plen;
+ }
+
+ ext2fs_free_mem(&ptr);
+out2:
+ err = ext2fs_xattrs_close(&h);
+ if (err)
+ ret = translate_error(fs, ino, err);
+out:
+ pthread_mutex_unlock(&ff->bfl);
+
+ return ret;
+}
+
+static int count_buffer_space(char *name, char *value, size_t value_len,
+ void *data)
+{
+ unsigned int *x = data;
+
+ *x = *x + strlen(name) + 1;
+ return 0;
+}
+
+static int copy_names(char *name, char *value, size_t value_len, void *data)
+{
+ char **b = data;
+
+ strncpy(*b, name, strlen(name));
+ *b = *b + strlen(name) + 1;
+
+ return 0;
+}
+
+static int op_listxattr(const char *path, char *names, size_t len)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ struct ext2_xattr_handle *h;
+ unsigned int bufsz;
+ ext2_ino_t ino;
+ errcode_t err;
+ int ret = 0;
+
+ pthread_mutex_lock(&ff->bfl);
+ if (!EXT2_HAS_COMPAT_FEATURE(fs->super,
+ EXT2_FEATURE_COMPAT_EXT_ATTR)) {
+ ret = -ENOTSUP;
+ goto out;
+ }
+
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, path, &ino);
+ if (err || ino == 0) {
+ ret = translate_error(fs, 0, err);
+ goto out;
+ }
+
+ ret = check_inum_access(ctxt, fs, ino, R_OK);
+ if (ret)
+ goto out;
+
+ err = ext2fs_xattrs_open(fs, ino, &h);
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out;
+ }
+
+ err = ext2fs_xattrs_read(h);
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out2;
+ }
+
+ /* Count buffer space needed for names */
+ bufsz = 0;
+ err = ext2fs_xattrs_iterate(h, count_buffer_space, &bufsz);
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out2;
+ }
+
+ if (len == 0) {
+ ret = bufsz;
+ goto out2;
+ } else if (len < bufsz) {
+ ret = -ERANGE;
+ goto out2;
+ }
+
+ /* Copy names out */
+ memset(names, 0, len);
+ err = ext2fs_xattrs_iterate(h, copy_names, &names);
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out2;
+ }
+ ret = bufsz;
+out2:
+ err = ext2fs_xattrs_close(&h);
+ if (err)
+ ret = translate_error(fs, ino, err);
+out:
+ pthread_mutex_unlock(&ff->bfl);
+
+ return ret;
+}
+
+static int op_setxattr(const char *path, const char *key, const char *value,
+ size_t len, int flags)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ struct ext2_xattr_handle *h;
+ ext2_ino_t ino;
+ errcode_t err;
+ int ret = 0;
+
+ pthread_mutex_lock(&ff->bfl);
+ if (!EXT2_HAS_COMPAT_FEATURE(fs->super,
+ EXT2_FEATURE_COMPAT_EXT_ATTR)) {
+ ret = -ENOTSUP;
+ goto out;
+ }
+
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, path, &ino);
+ if (err || ino == 0) {
+ ret = translate_error(fs, 0, err);
+ goto out;
+ }
+
+ ret = check_inum_access(ctxt, fs, ino, W_OK);
+ if (ret)
+ goto out;
+
+ err = ext2fs_xattrs_open(fs, ino, &h);
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out;
+ }
+
+ err = ext2fs_xattrs_read(h);
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out2;
+ }
+
+ err = ext2fs_xattr_set(h, key, value, len);
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out2;
+ }
+
+ err = ext2fs_xattrs_write(h);
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out2;
+ }
+
+ ret = update_ctime(fs, ino, NULL);
+out2:
+ err = ext2fs_xattrs_close(&h);
+ if (!ret && err)
+ ret = translate_error(fs, ino, err);
+out:
+ pthread_mutex_unlock(&ff->bfl);
+
+ return ret;
+}
+
+static int op_removexattr(const char *path, const char *key)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ struct ext2_xattr_handle *h;
+ ext2_ino_t ino;
+ errcode_t err;
+ int ret = 0;
+
+ pthread_mutex_lock(&ff->bfl);
+ if (!EXT2_HAS_COMPAT_FEATURE(fs->super,
+ EXT2_FEATURE_COMPAT_EXT_ATTR)) {
+ ret = -ENOTSUP;
+ goto out;
+ }
+
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, path, &ino);
+ if (err || ino == 0) {
+ ret = translate_error(fs, 0, err);
+ goto out;
+ }
+
+ ret = check_inum_access(ctxt, fs, ino, W_OK);
+ if (ret)
+ goto out;
+
+ err = ext2fs_xattrs_open(fs, ino, &h);
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out;
+ }
+
+ err = ext2fs_xattrs_read(h);
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out2;
+ }
+
+ err = ext2fs_xattr_remove(h, key);
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out2;
+ }
+
+ err = ext2fs_xattrs_write(h);
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out2;
+ }
+
+ ret = update_ctime(fs, ino, NULL);
+out2:
+ err = ext2fs_xattrs_close(&h);
+ if (err)
+ ret = translate_error(fs, ino, err);
+out:
+ pthread_mutex_unlock(&ff->bfl);
+
+ return ret;
+}
+
+struct readdir_iter {
+ void *buf;
+ fuse_fill_dir_t func;
+};
+
+static int op_readdir_iter(ext2_ino_t dir, int entry,
+ struct ext2_dir_entry *dirent, int offset,
+ int blocksize, char *buf, void *data)
+{
+ struct readdir_iter *i = data;
+ struct stat statbuf;
+ char namebuf[EXT2_NAME_LEN + 1];
+ int ret;
+
+ memcpy(namebuf, dirent->name, dirent->name_len & 0xFF);
+ namebuf[dirent->name_len & 0xFF] = 0;
+ statbuf.st_ino = dirent->inode;
+ statbuf.st_mode = S_IFREG;
+ ret = i->func(i->buf, namebuf, NULL, 0);
+ if (ret)
+ return DIRENT_ABORT;
+
+ return 0;
+}
+
+static int op_readdir(const char *path, void *buf, fuse_fill_dir_t fill_func,
+ off_t offset, struct fuse_file_info *fp)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ struct fuse2fs_file_handle *fh = (struct fuse2fs_file_handle *)fp->fh;
+ errcode_t err;
+ ext2_ino_t ino;
+ struct readdir_iter i;
+ int ret = 0;
+
+ pthread_mutex_lock(&ff->bfl);
+ i.buf = buf;
+ i.func = fill_func;
+ err = ext2fs_dir_iterate2(fs, fh->ino, 0, NULL, op_readdir_iter, &i);
+ if (err) {
+ ret = translate_error(fs, fh->ino, err);
+ goto out;
+ }
+
+ if (fs_writeable(fs)) {
+ ret = update_atime(fs, fh->ino);
+ if (ret)
+ goto out;
+ }
+out:
+ pthread_mutex_unlock(&ff->bfl);
+ return ret;
+}
+
+static int op_access(const char *path, int mask)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ errcode_t err;
+ ext2_ino_t ino;
+ int ret = 0;
+
+ pthread_mutex_lock(&ff->bfl);
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, path, &ino);
+ if (err || ino == 0) {
+ ret = translate_error(fs, 0, err);
+ goto out;
+ }
+
+ ret = check_inum_access(ctxt, fs, ino, mask);
+ if (ret)
+ goto out;
+
+out:
+ pthread_mutex_unlock(&ff->bfl);
+ return ret;
+}
+
+static int op_create(const char *path, mode_t mode, struct fuse_file_info *fp)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ struct ext3_extent_header *eh;
+ ext2_ino_t parent, child;
+ char *temp_path = strdup(path);
+ errcode_t err;
+ char *node_name, a;
+ int filetype, i;
+ struct ext2_inode_large inode;
+ int ret = 0;
+
+ if (!temp_path) {
+ ret = -ENOMEM;
+ goto out;
+ }
+ node_name = strrchr(temp_path, '/');
+ if (!node_name) {
+ ret = -EINVAL;
+ goto out;
+ }
+ node_name++;
+ a = *node_name;
+ *node_name = 0;
+
+ pthread_mutex_lock(&ff->bfl);
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, temp_path,
+ &parent);
+ if (err) {
+ ret = translate_error(fs, 0, err);
+ goto out2;
+ }
+
+ ret = check_inum_access(ctxt, fs, parent, W_OK);
+ if (ret)
+ goto out2;
+
+ *node_name = a;
+
+ filetype = ext2_file_type(mode);
+
+ err = ext2fs_new_inode(fs, parent, mode, 0, &child);
+ if (err) {
+ ret = translate_error(fs, parent, err);
+ goto out2;
+ }
+
+ err = ext2fs_link(fs, parent, node_name, child, filetype);
+ if (err == EXT2_ET_DIR_NO_SPACE) {
+ err = ext2fs_expand_dir(fs, parent);
+ if (err) {
+ ret = translate_error(fs, parent, err);
+ goto out2;
+ }
+
+ err = ext2fs_link(fs, parent, node_name, child,
+ filetype);
+ }
+ if (err) {
+ ret = translate_error(fs, parent, err);
+ goto out2;
+ }
+
+ ret = update_mtime(fs, parent);
+ if (ret)
+ goto out2;
+
+ memset(&inode, 0, sizeof(inode));
+ inode.i_mode = mode;
+ inode.i_links_count = 1;
+ inode.i_extra_isize = sizeof(struct ext2_inode_large) -
+ EXT2_GOOD_OLD_INODE_SIZE;
+ if (fs->super->s_feature_incompat & EXT3_FEATURE_INCOMPAT_EXTENTS) {
+ inode.i_flags = EXT4_EXTENTS_FL;
+
+ /* This must be initialized, even for a zero byte file. */
+ eh = (struct ext3_extent_header *) &inode.i_block[0];
+ eh->eh_magic = ext2fs_cpu_to_le16(EXT3_EXT_MAGIC);
+ eh->eh_depth = 0;
+ eh->eh_entries = 0;
+ i = (sizeof(inode.i_block) - sizeof(*eh)) /
+ sizeof(struct ext3_extent);
+ eh->eh_max = ext2fs_cpu_to_le16(i);
+ }
+
+ err = ext2fs_write_new_inode(fs, child, (struct ext2_inode *)&inode);
+ if (err) {
+ ret = translate_error(fs, child, err);
+ goto out2;
+ }
+
+ inode.i_generation = ff->next_generation++;
+ init_times(&inode);
+ err = ext2fs_write_inode_full(fs, child, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err) {
+ ret = translate_error(fs, child, err);
+ goto out2;
+ }
+
+ ext2fs_inode_alloc_stats2(fs, child, 1, 0);
+
+ ret = __op_open(path, fp);
+ if (ret)
+ goto out2;
+out2:
+ pthread_mutex_unlock(&ff->bfl);
+out:
+ free(temp_path);
+ return ret;
+}
+
+static int op_ftruncate(const char *path, off_t len, struct fuse_file_info *fp)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ struct fuse2fs_file_handle *fh = (struct fuse2fs_file_handle *)fp->fh;
+ ext2_file_t efp;
+ errcode_t err;
+ int ret = 0;
+
+ pthread_mutex_lock(&ff->bfl);
+ if (!fs_writeable(fs)) {
+ ret = -EROFS;
+ goto out;
+ }
+
+ err = ext2fs_file_open(fs, fh->ino, fh->open_flags, &efp);
+ if (err) {
+ ret = translate_error(fs, fh->ino, err);
+ goto out;
+ }
+
+ err = ext2fs_file_set_size2(efp, len);
+ if (err) {
+ ret = translate_error(fs, fh->ino, err);
+ goto out2;
+ }
+
+out2:
+ err = ext2fs_file_close(efp);
+ if (ret)
+ goto out;
+ if (err) {
+ ret = translate_error(fs, fh->ino, err);
+ goto out;
+ }
+
+ ret = update_mtime(fs, fh->ino);
+ if (ret)
+ goto out;
+
+out:
+ pthread_mutex_unlock(&ff->bfl);
+ return ret;
+}
+
+static int op_fgetattr(const char *path, struct stat *statbuf,
+ struct fuse_file_info *fp)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ struct fuse2fs_file_handle *fh = (struct fuse2fs_file_handle *)fp->fh;
+ int ret = 0;
+
+ pthread_mutex_lock(&ff->bfl);
+ ret = stat_inode(fs, fh->ino, statbuf);
+ pthread_mutex_unlock(&ff->bfl);
+
+ return ret;
+}
+
+static int op_utimens(const char *path, const struct timespec tv[2])
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ errcode_t err;
+ ext2_ino_t ino;
+ struct ext2_inode_large inode;
+ int ret = 0;
+
+ pthread_mutex_lock(&ff->bfl);
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, path, &ino);
+ if (err) {
+ ret = translate_error(fs, 0, err);
+ goto out;
+ }
+
+ ret = check_inum_access(ctxt, fs, ino, W_OK);
+ if (ret)
+ goto out;
+
+ memset(&inode, 0, sizeof(inode));
+ err = ext2fs_read_inode_full(fs, ino, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out;
+ }
+
+ EXT4_INODE_SET_XTIME(i_atime, tv, &inode);
+ EXT4_INODE_SET_XTIME(i_mtime, tv + 1, &inode);
+ ret = update_ctime(fs, ino, &inode);
+ if (ret)
+ goto out;
+
+ err = ext2fs_write_inode_full(fs, ino, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out;
+ }
+
+out:
+ pthread_mutex_unlock(&ff->bfl);
+ return ret;
+}
+
+#ifdef SUPPORT_I_FLAGS
+static int ioctl_getflags(ext2_filsys fs, struct fuse2fs_file_handle *fh,
+ void *data)
+{
+ errcode_t err;
+ struct ext2_inode_large inode;
+
+ memset(&inode, 0, sizeof(inode));
+ err = ext2fs_read_inode_full(fs, fh->ino, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err)
+ return translate_error(fs, fh->ino, err);
+
+ *(__u32 *)data = inode.i_flags & EXT2_FL_USER_VISIBLE;
+ return 0;
+}
+
+#define FUSE2FS_MODIFIABLE_IFLAGS \
+ (EXT2_IMMUTABLE_FL | EXT2_APPEND_FL | EXT2_NODUMP_FL | \
+ EXT2_NOATIME_FL | EXT3_JOURNAL_DATA_FL | EXT2_DIRSYNC_FL | \
+ EXT2_TOPDIR_FL)
+
+static int ioctl_setflags(ext2_filsys fs, struct fuse2fs_file_handle *fh,
+ void *data)
+{
+ errcode_t err;
+ struct ext2_inode_large inode;
+ int ret;
+ __u32 flags = *(__u32 *)data;
+ struct fuse_context *ctxt = fuse_get_context();
+
+ memset(&inode, 0, sizeof(inode));
+ err = ext2fs_read_inode_full(fs, fh->ino, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err)
+ return translate_error(fs, fh->ino, err);
+
+ if (ctxt->uid != 0 && inode.i_uid != ctxt->uid)
+ return -EPERM;
+
+ if ((inode.i_flags ^ flags) & ~FUSE2FS_MODIFIABLE_IFLAGS)
+ return -EINVAL;
+
+ inode.i_flags = (inode.i_flags & ~FUSE2FS_MODIFIABLE_IFLAGS) |
+ (flags & FUSE2FS_MODIFIABLE_IFLAGS);
+
+ err = ext2fs_write_inode_full(fs, fh->ino, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err)
+ return translate_error(fs, fh->ino, err);
+
+ return 0;
+}
+#endif /* SUPPORT_I_FLAGS */
+
+#if FUSE_VERSION >= FUSE_MAKE_VERSION(2, 8)
+static int op_ioctl(const char *path, int cmd, void *arg,
+ struct fuse_file_info *fp, unsigned int flags, void *data)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ struct fuse2fs_file_handle *fh = (struct fuse2fs_file_handle *)fp->fh;
+ int ret = 0;
+
+ pthread_mutex_lock(&ff->bfl);
+ switch (cmd) {
+#ifdef SUPPORT_I_FLAGS
+ case EXT2_IOC_GETFLAGS:
+ ret = ioctl_getflags(fs, fh, data);
+ break;
+ case EXT2_IOC_SETFLAGS:
+ ret = ioctl_setflags(fs, fh, data);
+ break;
+#endif
+ default:
+ ret = -ENOTTY;
+ }
+ pthread_mutex_unlock(&ff->bfl);
+
+ return ret;
+}
+#endif /* FUSE 28 */
+
+static int op_bmap(const char *path, size_t blocksize, uint64_t *idx)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ ext2_ino_t ino;
+ errcode_t err;
+ int ret = 0;
+
+ pthread_mutex_lock(&ff->bfl);
+ err = ext2fs_namei(fs, EXT2_ROOT_INO, EXT2_ROOT_INO, path, &ino);
+ if (err) {
+ ret = translate_error(fs, 0, err);
+ goto out;
+ }
+
+ err = ext2fs_bmap2(fs, ino, NULL, NULL, 0, *idx, 0, (blk64_t *)idx);
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out;
+ }
+
+out:
+ pthread_mutex_unlock(&ff->bfl);
+ return ret;
+}
+
+#if FUSE_VERSION >= FUSE_MAKE_VERSION(2, 9)
+static int fallocate_helper(struct fuse_file_info *fp, int mode, off_t offset,
+ off_t len)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ struct fuse2fs_file_handle *fh = (struct fuse2fs_file_handle *)fp->fh;
+ blk64_t blk, end, x;
+ __u64 fsize;
+ ext2_file_t efp;
+ errcode_t err;
+ int ret = 0;
+
+ /* Allocate a bunch of blocks */
+ end = (offset + len - 1) / fs->blocksize;
+ for (blk = offset / fs->blocksize; blk <= end; blk++) {
+ err = ext2fs_bmap2(fs, fh->ino, NULL, NULL, BMAP_ALLOC, blk,
+ 0, &x);
+ if (err)
+ return translate_error(fs, fh->ino, err);
+ }
+
+ /* Update i_size */
+ if (!(mode & FL_KEEP_SIZE_FLAG)) {
+ err = ext2fs_file_open(fs, fh->ino, fh->open_flags, &efp);
+ if (err)
+ return translate_error(fs, fh->ino, err);
+
+ err = ext2fs_file_get_lsize(efp, &fsize);
+ if (err) {
+ ret = translate_error(fs, fh->ino, err);
+ goto out_isize;
+ }
+ if (offset + len > fsize) {
+ fsize = offset + len;
+ err = ext2fs_file_set_size2(efp, fsize);
+ if (err) {
+ ret = translate_error(fs, fh->ino, err);
+ goto out_isize;
+ }
+ }
+
+out_isize:
+ err = ext2fs_file_close(efp);
+ if (ret)
+ return ret;
+ if (err)
+ return translate_error(fs, fh->ino, err);
+ }
+
+ return update_mtime(fs, fh->ino);
+}
+
+static int punch_helper(struct fuse_file_info *fp, int mode, off_t offset,
+ off_t len)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ struct fuse2fs_file_handle *fh = (struct fuse2fs_file_handle *)fp->fh;
+ blk64_t blk, start, end, x;
+ __u64 fsize;
+ ext2_file_t efp;
+ errcode_t err;
+ int ret = 0;
+
+ /* kernel ext4 punch requires this flag to be set */
+ if (!(mode & FL_KEEP_SIZE_FLAG))
+ return -EINVAL;
+
+ if (len < fs->blocksize)
+ return 0;
+
+ /* Punch out a bunch of blocks */
+ start = (offset + fs->blocksize - 1) / fs->blocksize;
+ end = (offset + len - fs->blocksize) / fs->blocksize;
+
+ if (start > end)
+ return 0;
+
+ err = ext2fs_punch(fs, fh->ino, NULL, NULL, start, end);
+ if (err)
+ return translate_error(fs, fh->ino, err);
+
+ return update_mtime(fs, fh->ino);
+}
+
+static int op_fallocate(const char *path, int mode, off_t offset, off_t len,
+ struct fuse_file_info *fp)
+{
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ ext2_filsys fs = ff->fs;
+ int ret;
+
+ /* Catch unknown flags */
+ if (mode & ~(FL_PUNCH_HOLE_FLAG | FL_KEEP_SIZE_FLAG))
+ return -EINVAL;
+
+ pthread_mutex_lock(&ff->bfl);
+ if (!fs_writeable(fs)) {
+ ret = -EROFS;
+ goto out;
+ }
+ if (mode & FL_PUNCH_HOLE_FLAG)
+ ret = punch_helper(fp, mode, offset, len);
+ else
+ ret = fallocate_helper(fp, mode, offset, len);
+out:
+ pthread_mutex_unlock(&ff->bfl);
+
+ return ret;
+}
+#endif /* FUSE 29 */
+
+static struct fuse_operations fs_ops = {
+ .init = op_init,
+ .destroy = op_destroy,
+ .getattr = op_getattr,
+ .readlink = op_readlink,
+ .mknod = op_mknod,
+ .mkdir = op_mkdir,
+ .unlink = op_unlink,
+ .rmdir = op_rmdir,
+ .symlink = op_symlink,
+ .rename = op_rename,
+ .link = op_link,
+ .chmod = op_chmod,
+ .chown = op_chown,
+ .truncate = op_truncate,
+ .open = op_open,
+ .read = op_read,
+ .write = op_write,
+ .statfs = op_statfs,
+ .release = op_release,
+ .fsync = op_fsync,
+ .setxattr = op_setxattr,
+ .getxattr = op_getxattr,
+ .listxattr = op_listxattr,
+ .removexattr = op_removexattr,
+ .opendir = op_open,
+ .readdir = op_readdir,
+ .releasedir = op_release,
+ .fsyncdir = op_fsync,
+ .access = op_access,
+ .create = op_create,
+ .ftruncate = op_ftruncate,
+ .fgetattr = op_fgetattr,
+ .utimens = op_utimens,
+ .bmap = op_bmap,
+#ifdef SUPERFLUOUS
+ .lock = op_lock,
+ .poll = op_poll,
+#endif
+#if FUSE_VERSION >= FUSE_MAKE_VERSION(2, 8)
+ .ioctl = op_ioctl,
+ .flag_nullpath_ok = 1,
+#endif
+#if FUSE_VERSION >= FUSE_MAKE_VERSION(2, 9)
+ .flag_nopath = 1,
+ .fallocate = op_fallocate,
+#endif
+};
+
+static int get_random_bytes(void *p, size_t sz)
+{
+ int fd;
+ ssize_t r;
+
+ fd = open("/dev/random", O_RDONLY);
+ if (fd < 0) {
+ perror("/dev/random");
+ return 0;
+ }
+
+ r = read(fd, p, sz);
+
+ close(fd);
+ return r == sz;
+}
+
+static void print_help(const char *progname)
+{
+ printf("Usage: %s dev mntpt [-o options] [fuse_args]\n", progname);
+}
+
+int main(int argc, char *argv[])
+{
+ errcode_t err;
+ ext2_filsys fs;
+ char *tok, *arg, *logfile;
+ int i;
+ int readwrite = 1, panic_on_error = 0;
+ struct fuse2fs *ff;
+ char extra_args[BUFSIZ];
+ int ret = 0, flags = EXT2_FLAG_64BITS | EXT2_FLAG_EXCLUSIVE;
+
+ if (argc < 3) {
+ print_help(argv[0]);
+ return 1;
+ }
+
+ for (i = 1; i < argc; i++) {
+ if (strcmp(argv[i], "--help") == 0) {
+ print_help(argv[0]);
+ return 1;
+ }
+ }
+
+ for (i = 1; i < argc - 1; i++) {
+ if (strcmp(argv[i], "-o"))
+ continue;
+ arg = argv[i + 1];
+ while ((tok = strtok(arg, ","))) {
+ arg = NULL;
+ if (!strcmp(tok, "ro"))
+ readwrite = 0;
+ else if (!strcmp(tok, "errors=panic"))
+ panic_on_error = 1;
+ }
+ }
+
+ if (!readwrite)
+ printf("Mounting read-only.\n");
+
+#ifdef ENABLE_NLS
+ setlocale(LC_MESSAGES, "");
+ setlocale(LC_CTYPE, "");
+ bindtextdomain(NLS_CAT_NAME, LOCALEDIR);
+ textdomain(NLS_CAT_NAME);
+ set_com_err_gettext(gettext);
+#endif
+ add_error_table(&et_ext2_error_table);
+
+ ff = calloc(1, sizeof(*ff));
+ if (!ff) {
+ perror("init");
+ return 1;
+ }
+ ff->panic_on_error = panic_on_error;
+
+ /* Set up error logging */
+ logfile = getenv("FUSE2FS_LOGFILE");
+ if (logfile) {
+ ff->err_fp = fopen(logfile, "a");
+ if (!ff->err_fp) {
+ perror(logfile);
+ goto out_nofs;
+ }
+ } else
+ ff->err_fp = stderr;
+
+ /* Start up the fs (while we still can use stdout) */
+ ret = 2;
+ if (readwrite)
+ flags |= EXT2_FLAG_RW;
+ err = ext2fs_open3(argv[1], NULL, flags, 0, 0, unix_io_manager, &fs);
+ if (err) {
+ printf("%s: %s.\n", argv[1], error_message(err));
+ printf("Please run e2fsck -fy %s.\n", argv[1]);
+ goto out_nofs;
+ }
+ ff->fs = fs;
+ fs->priv_data = ff;
+
+ ret = 3;
+ if (EXT2_HAS_INCOMPAT_FEATURE(fs->super,
+ EXT3_FEATURE_INCOMPAT_RECOVER)) {
+ printf("Journal needs recovery; running `e2fsck -E "
+ "journal_only' is required.\n");
+ goto out;
+ }
+
+ if (readwrite) {
+ if (EXT2_HAS_COMPAT_FEATURE(fs->super,
+ EXT3_FEATURE_COMPAT_HAS_JOURNAL))
+ printf("Journal mode will not be used.\n");
+ err = ext2fs_read_inode_bitmap(fs);
+ if (err) {
+ translate_error(fs, 0, err);
+ goto out;
+ }
+ err = ext2fs_read_block_bitmap(fs);
+ if (err) {
+ translate_error(fs, 0, err);
+ goto out;
+ }
+ }
+
+ if (!(fs->super->s_state & EXT2_VALID_FS))
+ printf("Warning: Mounting unchecked fs, running e2fsck "
+ "is recommended.\n");
+ if (fs->super->s_max_mnt_count > 0 &&
+ fs->super->s_mnt_count >= fs->super->s_max_mnt_count)
+ printf("Warning: Maximal mount count reached, running "
+ "e2fsck is recommended.\n");
+ if (fs->super->s_checkinterval > 0 &&
+ fs->super->s_lastcheck + fs->super->s_checkinterval <= time(0))
+ printf("Warning: Check time reached; running e2fsck "
+ "is recommended.\n");
+ if (fs->super->s_last_orphan)
+ printf("Orphans detected; running e2fsck is recommended.\n");
+
+ if (fs->super->s_state & EXT2_ERROR_FS) {
+ printf("Errors detected; running e2fsck is required.\n");
+ goto out;
+ }
+
+ /* Initialize generation counter */
+ get_random_bytes(&ff->next_generation, sizeof(unsigned int));
+
+ /* Stuff in some fuse parameters of our own */
+ snprintf(extra_args, BUFSIZ, "-okernel_cache,subtype=ext4,use_ino,"
+ "fsname=%s,attr_timeout=0,nonempty,allow_other", argv[1]);
+ argv[0] = argv[1];
+ argv[1] = argv[2];
+ argv[2] = extra_args;
+
+ pthread_mutex_init(&ff->bfl, NULL);
+ fuse_main(argc, argv, &fs_ops, ff);
+ pthread_mutex_destroy(&ff->bfl);
+
+ ret = 0;
+out:
+ err = ext2fs_close(fs);
+ if (err)
+ ret = translate_error(fs, 0, err);
+out_nofs:
+ free(ff);
+
+ return ret;
+}
+
+static int __translate_error(ext2_filsys fs, errcode_t err, ext2_ino_t ino,
+ const char *file, int line)
+{
+ struct timespec now;
+ int ret;
+ struct fuse2fs *ff = fs->priv_data;
+ int is_err = 0;
+
+ /* Translate ext2 error to unix error code */
+ switch (err) {
+ case EXT2_ET_NO_MEMORY:
+ case EXT2_ET_TDB_ERR_OOM:
+ ret = -ENOMEM;
+ break;
+ case EXT2_ET_INVALID_ARGUMENT:
+ case EXT2_ET_LLSEEK_FAILED:
+ ret = -EINVAL;
+ break;
+ case EXT2_ET_NO_DIRECTORY:
+ ret = -ENOTDIR;
+ break;
+ case EXT2_ET_FILE_NOT_FOUND:
+ ret = -ENOENT;
+ break;
+ case EXT2_ET_DIR_NO_SPACE:
+ is_err = 1; /* fall through */
+ case EXT2_ET_TOOSMALL:
+ case EXT2_ET_BLOCK_ALLOC_FAIL:
+ case EXT2_ET_INODE_ALLOC_FAIL:
+ case EXT2_ET_EA_NO_SPACE:
+ ret = -ENOSPC;
+ break;
+ case EXT2_ET_SYMLINK_LOOP:
+ ret = -EMLINK;
+ break;
+ case EXT2_ET_FILE_TOO_BIG:
+ ret = -EFBIG;
+ break;
+ case EXT2_ET_TDB_ERR_EXISTS:
+ case EXT2_ET_FILE_EXISTS:
+ ret = -EEXIST;
+ break;
+ case EXT2_ET_MMP_FAILED:
+ case EXT2_ET_MMP_FSCK_ON:
+ ret = -EBUSY;
+ break;
+ case EXT2_ET_EA_KEY_NOT_FOUND:
+ ret = -ENODATA;
+ break;
+ default:
+ is_err = 1;
+ ret = -EIO;
+ break;
+ }
+
+ if (!is_err)
+ return ret;
+
+ if (ino)
+ fprintf(ff->err_fp, "FUSE2FS (%s): %s (inode #%u) at %s:%d.\n",
+ fs && fs->device_name ? fs->device_name : "???",
+ error_message(err), ino, file, line);
+ else
+ fprintf(ff->err_fp, "FUSE2FS (%s): %s at %s:%d.\n",
+ fs && fs->device_name ? fs->device_name : "???",
+ error_message(err), file, line);
+ fflush(ff->err_fp);
+
+ /* Make a note in the error log */
+ get_now(&now);
+ fs->super->s_last_error_time = now.tv_sec;
+ fs->super->s_last_error_ino = ino;
+ fs->super->s_last_error_line = line;
+ fs->super->s_last_error_block = 0;
+ strncpy(fs->super->s_last_error_func, file,
+ sizeof(fs->super->s_last_error_func));
+ if (fs->super->s_first_error_time == 0) {
+ fs->super->s_first_error_time = now.tv_sec;
+ fs->super->s_first_error_ino = ino;
+ fs->super->s_first_error_line = line;
+ fs->super->s_first_error_block = 0;
+ strncpy(fs->super->s_first_error_func, file,
+ sizeof(fs->super->s_first_error_func));
+ }
+
+ fs->super->s_error_count++;
+ ext2fs_mark_super_dirty(fs);
+ ext2fs_flush(fs);
+ if (ff->panic_on_error)
+ abort();
+
+ return ret;
+}
Translate "native" ACL structures into ext4 ACL structures when
reading or writing the ACL EAs.
Signed-off-by: Darrick J. Wong <[email protected]>
---
misc/fuse2fs.c | 253 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 248 insertions(+), 5 deletions(-)
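
A note on buffer sizing, for review only and not part of the patch. On disk,
ACL_USER and ACL_GROUP entries carry an id (8 bytes), the other four tags do
not (4 bytes), and a 4-byte version header precedes them. The sketch below
sizes the on-disk buffer by walking the tags instead of relying on the entry
count alone. sketch_ext4_acl_size() and the SK_* names are made up for
illustration; the tag values are assumed to match <sys/acl.h> on Linux.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tag values assumed to match ACL_USER_OBJ etc. in <sys/acl.h> on Linux. */
enum {
	SK_ACL_USER_OBJ  = 0x01,
	SK_ACL_USER      = 0x02,
	SK_ACL_GROUP_OBJ = 0x04,
	SK_ACL_GROUP     = 0x08,
	SK_ACL_MASK      = 0x10,
	SK_ACL_OTHER     = 0x20,
};

#define SK_HDR_SZ	4	/* __u32 a_version */
#define SK_SHORT_SZ	4	/* e_tag + e_perm */
#define SK_FULL_SZ	8	/* e_tag + e_perm + e_id */

/*
 * Hypothetical helper: bytes needed for an on-disk ACL holding these
 * tags, or 0 if a tag is not recognized.
 */
static size_t sketch_ext4_acl_size(const uint16_t *tags, int count)
{
	size_t sz = SK_HDR_SZ;
	int i;

	assert(count >= 0 && (tags != NULL || count == 0));
	for (i = 0; i < count; i++) {
		switch (tags[i]) {
		case SK_ACL_USER:
		case SK_ACL_GROUP:
			sz += SK_FULL_SZ;	/* entry stores an id */
			break;
		case SK_ACL_USER_OBJ:
		case SK_ACL_GROUP_OBJ:
		case SK_ACL_MASK:
		case SK_ACL_OTHER:
			sz += SK_SHORT_SZ;	/* entry has no id */
			break;
		default:
			return 0;
		}
	}
	return sz;
}
```

For the common five-entry ACL (user_obj, one named user, group_obj, mask,
other) this comes to 4 + 4*4 + 8 = 28 bytes, the same as ext4_acl_size(5) in
the patch. The two can differ for unusual tag mixes, such as a named entry in
an ACL of four or fewer entries.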
diff --git a/misc/fuse2fs.c b/misc/fuse2fs.c
index 8be9070..317032c 100644
--- a/misc/fuse2fs.c
+++ b/misc/fuse2fs.c
@@ -18,6 +18,7 @@
# include <linux/xattr.h>
#endif
#include <sys/ioctl.h>
+#include <sys/acl.h>
#include <unistd.h>
#include <fuse.h>
#include "ext2fs/ext2fs.h"
@@ -43,6 +44,197 @@
# define FL_PUNCH_HOLE_FLAG (0)
#endif
+/* ACL translation stuff */
+/*
+ * Copied from acl_ea.h in libacl source; ACLs have to be sent to and from fuse
+ * in this format... at least on Linux.
+ */
+#define ACL_EA_ACCESS "system.posix_acl_access"
+#define ACL_EA_DEFAULT "system.posix_acl_default"
+
+#define ACL_EA_VERSION 0x0002
+
+typedef struct {
+ u_int16_t e_tag;
+ u_int16_t e_perm;
+ u_int32_t e_id;
+} acl_ea_entry;
+
+typedef struct {
+ u_int32_t a_version;
+ acl_ea_entry a_entries[0];
+} acl_ea_header;
+
+static inline size_t acl_ea_size(int count)
+{
+ return sizeof(acl_ea_header) + count * sizeof(acl_ea_entry);
+}
+
+static inline int acl_ea_count(size_t size)
+{
+ if (size < sizeof(acl_ea_header))
+ return -1;
+ size -= sizeof(acl_ea_header);
+ if (size % sizeof(acl_ea_entry))
+ return -1;
+ return size / sizeof(acl_ea_entry);
+}
+
+/*
+ * ext4 ACL structures, copied from fs/ext4/acl.h.
+ */
+#define EXT4_ACL_VERSION 0x0001
+
+typedef struct {
+ __u16 e_tag;
+ __u16 e_perm;
+ __u32 e_id;
+} ext4_acl_entry;
+
+typedef struct {
+ __u16 e_tag;
+ __u16 e_perm;
+} ext4_acl_entry_short;
+
+typedef struct {
+ __u32 a_version;
+} ext4_acl_header;
+
+static inline size_t ext4_acl_size(int count)
+{
+ if (count <= 4) {
+ return sizeof(ext4_acl_header) +
+ count * sizeof(ext4_acl_entry_short);
+ } else {
+ return sizeof(ext4_acl_header) +
+ 4 * sizeof(ext4_acl_entry_short) +
+ (count - 4) * sizeof(ext4_acl_entry);
+ }
+}
+
+static inline int ext4_acl_count(size_t size)
+{
+ ssize_t s;
+ size -= sizeof(ext4_acl_header);
+ s = size - 4 * sizeof(ext4_acl_entry_short);
+ if (s < 0) {
+ if (size % sizeof(ext4_acl_entry_short))
+ return -1;
+ return size / sizeof(ext4_acl_entry_short);
+ } else {
+ if (s % sizeof(ext4_acl_entry))
+ return -1;
+ return s / sizeof(ext4_acl_entry) + 4;
+ }
+}
+
+static errcode_t fuse_to_ext4_acl(acl_ea_header *facl, size_t facl_sz,
+ ext4_acl_header **eacl, size_t *eacl_sz)
+{
+ int i, facl_count;
+ ext4_acl_header *h;
+ size_t h_sz;
+ ext4_acl_entry *e;
+ acl_ea_entry *a;
+ void *hptr;
+ errcode_t err;
+
+ facl_count = acl_ea_count(facl_sz);
+ if (facl_count < 0 || facl->a_version != ACL_EA_VERSION)
+ return EXT2_ET_INVALID_ARGUMENT;
+ h_sz = ext4_acl_size(facl_count);
+
+ err = ext2fs_get_mem(h_sz, &h);
+ if (err)
+ return err;
+
+ h->a_version = ext2fs_cpu_to_le32(EXT4_ACL_VERSION);
+ hptr = h + 1;
+ for (i = 0, a = facl->a_entries; i < facl_count; i++, a++) {
+ e = hptr;
+ e->e_tag = ext2fs_cpu_to_le16(a->e_tag);
+ e->e_perm = ext2fs_cpu_to_le16(a->e_perm);
+
+ switch (a->e_tag) {
+ case ACL_USER:
+ case ACL_GROUP:
+ e->e_id = ext2fs_cpu_to_le32(a->e_id);
+ hptr += sizeof(ext4_acl_entry);
+ break;
+ case ACL_USER_OBJ:
+ case ACL_GROUP_OBJ:
+ case ACL_MASK:
+ case ACL_OTHER:
+ hptr += sizeof(ext4_acl_entry_short);
+ break;
+ default:
+ err = EXT2_ET_INVALID_ARGUMENT;
+ goto out;
+ }
+ }
+
+ *eacl = h;
+ *eacl_sz = h_sz;
+ return err;
+out:
+ ext2fs_free_mem(&h);
+ return err;
+}
+
+static errcode_t ext4_to_fuse_acl(acl_ea_header **facl, size_t *facl_sz,
+ ext4_acl_header *eacl, size_t eacl_sz)
+{
+ int i, eacl_count;
+ acl_ea_header *f;
+ ext4_acl_entry *e;
+ acl_ea_entry *a;
+ size_t f_sz;
+ void *hptr;
+ errcode_t err;
+
+ eacl_count = ext4_acl_count(eacl_sz);
+ if (eacl_count < 0 ||
+ eacl->a_version != ext2fs_cpu_to_le32(EXT4_ACL_VERSION))
+ return EXT2_ET_INVALID_ARGUMENT;
+ f_sz = acl_ea_size(eacl_count);
+
+ err = ext2fs_get_mem(f_sz, &f);
+ if (err)
+ return err;
+
+ f->a_version = ACL_EA_VERSION;
+ hptr = eacl + 1;
+ for (i = 0, a = f->a_entries; i < eacl_count; i++, a++) {
+ e = hptr;
+ a->e_tag = ext2fs_le16_to_cpu(e->e_tag);
+ a->e_perm = ext2fs_le16_to_cpu(e->e_perm);
+
+ switch (a->e_tag) {
+ case ACL_USER:
+ case ACL_GROUP:
+ a->e_id = ext2fs_le32_to_cpu(e->e_id);
+ hptr += sizeof(ext4_acl_entry);
+ break;
+ case ACL_USER_OBJ:
+ case ACL_GROUP_OBJ:
+ case ACL_MASK:
+ case ACL_OTHER:
+ hptr += sizeof(ext4_acl_entry_short);
+ break;
+ default:
+ err = EXT2_ET_INVALID_ARGUMENT;
+ goto out;
+ }
+ }
+
+ *facl = f;
+ *facl_sz = f_sz;
+ return err;
+out:
+ ext2fs_free_mem(&f);
+ return err;
+}
+
/*
* ext2_file_t contains a struct inode, so we can't leave files open.
* Use this as a proxy instead.
@@ -1837,6 +2029,28 @@ static int op_statfs(const char *path, struct statvfs *buf)
return 0;
}
+typedef errcode_t (*xattr_xlate_get)(void **cooked_buf, size_t *cooked_sz,
+ const void *raw_buf, size_t raw_sz);
+typedef errcode_t (*xattr_xlate_set)(const void *cooked_buf, size_t cooked_sz,
+ void **raw_buf, size_t *raw_sz);
+struct xattr_translate {
+ const char *prefix;
+ xattr_xlate_get get;
+ xattr_xlate_set set;
+};
+
+#define XATTR_TRANSLATOR(p, g, s) \
+ {.prefix = (p), \
+ .get = (xattr_xlate_get)(g), \
+ .set = (xattr_xlate_set)(s)}
+
+static struct xattr_translate xattr_translators[] = {
+ XATTR_TRANSLATOR(ACL_EA_ACCESS, ext4_to_fuse_acl, fuse_to_ext4_acl),
+ XATTR_TRANSLATOR(ACL_EA_DEFAULT, ext4_to_fuse_acl, fuse_to_ext4_acl),
+ XATTR_TRANSLATOR(NULL, NULL, NULL),
+};
+#undef XATTR_TRANSLATOR
+
static int op_getxattr(const char *path, const char *key, char *value,
size_t len)
{
@@ -1844,8 +2058,9 @@ static int op_getxattr(const char *path, const char *key, char *value,
struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
ext2_filsys fs = ff->fs;
struct ext2_xattr_handle *h;
- void *ptr;
- unsigned int plen;
+ struct xattr_translate *xt;
+ void *ptr, *cptr;
+ size_t plen, clen;
ext2_ino_t ino;
errcode_t err;
int ret = 0;
@@ -1885,6 +2100,17 @@ static int op_getxattr(const char *path, const char *key, char *value,
goto out2;
}
+ for (xt = xattr_translators; xt->prefix != NULL; xt++) {
+ if (strncmp(key, xt->prefix, strlen(xt->prefix)) == 0) {
+ err = xt->get(&cptr, &clen, ptr, plen);
+ if (err)
+ goto out3;
+ ext2fs_free_mem(&ptr);
+ ptr = cptr;
+ plen = clen;
+ }
+ }
+
if (!len) {
ret = plen;
} else if (len < plen) {
@@ -1894,6 +2120,7 @@ static int op_getxattr(const char *path, const char *key, char *value,
ret = plen;
}
+out3:
ext2fs_free_mem(&ptr);
out2:
err = ext2fs_xattrs_close(&h);
@@ -2005,6 +2232,9 @@ static int op_setxattr(const char *path, const char *key, const char *value,
struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
ext2_filsys fs = ff->fs;
struct ext2_xattr_handle *h;
+ struct xattr_translate *xt;
+ void *cvalue;
+ size_t clen;
ext2_ino_t ino;
errcode_t err;
int ret = 0;
@@ -2038,19 +2268,32 @@ static int op_setxattr(const char *path, const char *key, const char *value,
goto out2;
}
- err = ext2fs_xattr_set(h, key, value, len);
+ cvalue = (void *)value;
+ clen = len;
+ for (xt = xattr_translators; xt->prefix != NULL; xt++) {
+ if (strncmp(key, xt->prefix, strlen(xt->prefix)) == 0) {
+ err = xt->set(value, len, &cvalue, &clen);
+ if (err)
+ goto out3;
+ }
+ }
+
+ err = ext2fs_xattr_set(h, key, cvalue, clen);
if (err) {
ret = translate_error(fs, ino, err);
- goto out2;
+ goto out3;
}
err = ext2fs_xattrs_write(h);
if (err) {
ret = translate_error(fs, ino, err);
- goto out2;
+ goto out3;
}
ret = update_ctime(fs, ino, NULL);
+out3:
+ if (cvalue != value)
+ ext2fs_free_mem(&cvalue);
out2:
err = ext2fs_xattrs_close(&h);
if (!ret && err)
Use the new ext2fs_bmap2 flag to allocate uninitialized extents
when doing fallocate.
Signed-off-by: Darrick J. Wong <[email protected]>
---
misc/fuse2fs.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
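
For review only, not part of the patch: the allocation loop touched here
covers every block that the byte range touches, whereas punch_helper() only
covers blocks that lie wholly inside the range. The sketch below restates
both calculations side by side. The sketch_* names are hypothetical, and the
preconditions (len >= 1 for allocation, len >= blocksize for punch) are
assumptions that mirror the early returns in the helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical inclusive block range; empty when start > end. */
struct sketch_blk_range {
	uint64_t start;
	uint64_t end;
};

/* Allocation rounds outward: every block touched by [offset, offset + len). */
static struct sketch_blk_range sketch_alloc_range(uint64_t offset,
						  uint64_t len,
						  uint32_t blocksize)
{
	struct sketch_blk_range r;

	assert(blocksize > 0 && len >= 1);
	r.start = offset / blocksize;
	r.end = (offset + len - 1) / blocksize;
	return r;
}

/* Punch rounds inward: only blocks wholly inside [offset, offset + len). */
static struct sketch_blk_range sketch_punch_range(uint64_t offset,
						  uint64_t len,
						  uint32_t blocksize)
{
	struct sketch_blk_range r;

	assert(blocksize > 0 && len >= blocksize);
	r.start = (offset + blocksize - 1) / blocksize;
	r.end = (offset + len - blocksize) / blocksize;
	return r;
}
```

With 4096-byte blocks, a 2-byte allocation at offset 4095 spans blocks 0-1,
while punching 4096 bytes at offset 100 yields an empty range because no
block is fully covered.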
diff --git a/misc/fuse2fs.c b/misc/fuse2fs.c
index 317032c..105c54f 100644
--- a/misc/fuse2fs.c
+++ b/misc/fuse2fs.c
@@ -2798,8 +2798,8 @@ static int fallocate_helper(struct fuse_file_info *fp, int mode, off_t offset,
/* Allocate a bunch of blocks */
end = (offset + len - 1) / fs->blocksize;
for (blk = offset / fs->blocksize; blk <= end; blk++) {
- err = ext2fs_bmap2(fs, fh->ino, NULL, NULL, BMAP_ALLOC, blk,
- 0, &x);
+ err = ext2fs_bmap2(fs, fh->ino, NULL, NULL,
+ BMAP_ALLOC | BMAP_UNINIT, blk, 0, &x);
if (err)
return translate_error(fs, fh->ino, err);
}
@@ -3164,6 +3164,9 @@ static int __translate_error(ext2_filsys fs, errcode_t err, ext2_ino_t ino,
case EXT2_ET_EA_KEY_NOT_FOUND:
ret = -ENODATA;
break;
+ case EXT2_ET_UNIMPLEMENTED:
+ ret = -EOPNOTSUPP;
+ break;
default:
is_err = 1;
ret = -EIO;
Fix fuse2fs' interpretation of 64-bit date quantities to match the
kernel.
Signed-off-by: Darrick J. Wong <[email protected]>
---
misc/fuse2fs.c | 31 ++++++++++++++++++++++---------
1 file changed, 22 insertions(+), 9 deletions(-)
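
For review only, not part of the patch: a standalone restatement of the
encode/decode arithmetic using fixed-width types, to make the round trip easy
to check. The sketch_* names are hypothetical. The constants are assumed to
match EXT4_EPOCH_BITS (2) and the derived masks, and the int32_t conversions
assume the usual two's-complement wraparound.

```c
#include <stdint.h>

#define SK_EPOCH_BITS	2
#define SK_EPOCH_MASK	((1U << SK_EPOCH_BITS) - 1)
#define SK_NSEC_MASK	(~0U << SK_EPOCH_BITS)

/* Low 32 bits of the seconds count, as stored in the base time field. */
static uint32_t sketch_encode_sec(int64_t sec)
{
	return (uint32_t)sec;
}

/* Epoch bits in the low 2 bits, nanoseconds in the upper 30. */
static uint32_t sketch_encode_extra(int64_t sec, uint32_t nsec)
{
	/* What the 32-bit field decodes to on its own (sign-extended). */
	int64_t lo = (int32_t)(uint32_t)sec;
	uint32_t epoch = (uint32_t)((uint64_t)(sec - lo) >> 32) &
			 SK_EPOCH_MASK;

	return epoch | (nsec << SK_EPOCH_BITS);
}

/* Sign-extend the base field, then add the epoch bits times 2^32. */
static int64_t sketch_decode_sec(uint32_t raw, uint32_t extra)
{
	int64_t sec = (int32_t)raw;

	sec += (int64_t)(extra & SK_EPOCH_MASK) << 32;
	return sec;
}

static uint32_t sketch_decode_nsec(uint32_t extra)
{
	return (extra & SK_NSEC_MASK) >> SK_EPOCH_BITS;
}
```

The interesting case is the first second past 2038: the base field reads as
a negative 32-bit value, so the epoch bits must be 1 for the sum to come out
right. Encoding the difference against the sign-extended low word, rather
than taking the high word directly, is what makes that work.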
diff --git a/misc/fuse2fs.c b/misc/fuse2fs.c
index 105c54f..3b5e5e7 100644
--- a/misc/fuse2fs.c
+++ b/misc/fuse2fs.c
@@ -289,15 +289,24 @@ static int __translate_error(ext2_filsys fs, errcode_t err, ext2_ino_t ino,
static inline __u32 ext4_encode_extra_time(const struct timespec *time)
{
- return (sizeof(time->tv_sec) > 4 ?
- (time->tv_sec >> 32) & EXT4_EPOCH_MASK : 0) |
- ((time->tv_nsec << EXT4_EPOCH_BITS) & EXT4_NSEC_MASK);
+ __u32 extra = sizeof(time->tv_sec) > 4 ?
+ ((time->tv_sec - (__s32)time->tv_sec) >> 32) &
+ EXT4_EPOCH_MASK : 0;
+ return extra | (time->tv_nsec << EXT4_EPOCH_BITS);
}
static inline void ext4_decode_extra_time(struct timespec *time, __u32 extra)
{
- if (sizeof(time->tv_sec) > 4)
- time->tv_sec |= (__u64)((extra) & EXT4_EPOCH_MASK) << 32;
+ if (sizeof(time->tv_sec) > 4 && (extra & EXT4_EPOCH_MASK)) {
+ __u64 extra_bits = extra & EXT4_EPOCH_MASK;
+ /*
+ * Prior to kernel 3.14?, we had a broken decode function,
+ * wherein we effectively did this:
+ * if (extra_bits == 3)
+ * extra_bits = 0;
+ */
+ time->tv_sec += extra_bits << 32;
+ }
time->tv_nsec = ((extra) & EXT4_NSEC_MASK) >> EXT4_EPOCH_BITS;
}
@@ -323,7 +332,7 @@ do { \
(timespec)->tv_sec = (signed)((raw_inode)->xtime); \
if (EXT4_FITS_IN_INODE(raw_inode, xtime ## _extra)) \
ext4_decode_extra_time((timespec), \
- raw_inode->xtime ## _extra); \
+ (raw_inode)->xtime ## _extra); \
else \
(timespec)->tv_nsec = 0; \
} while (0)
@@ -614,6 +623,7 @@ static int stat_inode(ext2_filsys fs, ext2_ino_t ino, struct stat *statbuf)
dev_t fakedev = 0;
errcode_t err;
int ret = 0;
+ struct timespec tv;
memset(&inode, 0, sizeof(inode));
err = ext2fs_read_inode_full(fs, ino, (struct ext2_inode *)&inode,
@@ -631,9 +641,12 @@ static int stat_inode(ext2_filsys fs, ext2_ino_t ino, struct stat *statbuf)
statbuf->st_size = inode.i_size;
statbuf->st_blksize = fs->blocksize;
statbuf->st_blocks = inode.i_blocks;
- statbuf->st_atime = inode.i_atime;
- statbuf->st_mtime = inode.i_mtime;
- statbuf->st_ctime = inode.i_ctime;
+ EXT4_INODE_GET_XTIME(i_atime, &tv, &inode);
+ statbuf->st_atime = tv.tv_sec;
+ EXT4_INODE_GET_XTIME(i_mtime, &tv, &inode);
+ statbuf->st_mtime = tv.tv_sec;
+ EXT4_INODE_GET_XTIME(i_ctime, &tv, &inode);
+ statbuf->st_ctime = tv.tv_sec;
if (LINUX_S_ISCHR(inode.i_mode) ||
LINUX_S_ISBLK(inode.i_mode)) {
if (inode.i_block[0])
Check that fallocate actually creates uninitialized extents, that
reads from uninit regions don't return garbage data, and that
subsequent writes to an uninitialized extent actually get recorded.
Signed-off-by: Darrick J. Wong <[email protected]>
---
tests/metadata-checksum-test.sh | 62 ++++++++++++++++++++++++++++++++++++++-
1 file changed, 61 insertions(+), 1 deletion(-)
diff --git a/tests/metadata-checksum-test.sh b/tests/metadata-checksum-test.sh
index 987e653..1c73e58 100755
--- a/tests/metadata-checksum-test.sh
+++ b/tests/metadata-checksum-test.sh
@@ -21,6 +21,7 @@ DIR="$(readlink -f "$(dirname "$0")")"
E2FSPROGS="${DIR}/../"
export LD_LIBRARY_PATH="${E2FSPROGS}/lib/:${LD_LIBRARY_PATH}"
BLK_SZ=4096
+MAX_BLK_SZ=65536
#MOUNT_OPTS="errors=remount-ro"
HUGE_DEV_NAME="HUGE"
FUZZ_DEV=0
@@ -132,7 +133,7 @@ function msg {
fi
}
-for prog in attr /usr/bin/time truncate fallocate gcc; do
+for prog in od attr /usr/bin/time truncate fallocate gcc; do
type "${prog}" 2> /dev/null || msg "WARNING: ${prog} not found!"
done
@@ -3095,6 +3096,65 @@ MKE2FS_CONFIG=/tmp/mke2fs.conf
export MKE2FS_CONFIG
}
+##########################
+function fallocate_dirty_test {
+msg "fallocate_dirty_test"
+$VALGRIND ${E2FSPROGS}/misc/mke2fs -T ext4icsum $MKFS_OPTS $MKFS_FEATURES -F "${DEV}"
+test -z "$NO_CSUM" && $VALGRIND ${E2FSPROGS}/misc/tune2fs -O metadata_csum $DEV
+${E2FSPROGS}/misc/dumpe2fs -h $DEV 2> /dev/null | egrep -q "^Filesystem state:[ ]*clean$" || ${fsck_cmd} -fDy $DEV || true
+
+${mount_cmd} ${MOUNT_OPTS} "${DEV}" "${MNT}" -t ext4 -o journal_checksum
+echo "moo" > "${MNT}/a"
+fallocate -l "$((6 * MAX_BLK_SZ))" "${MNT}/a"
+umount "${MNT}"
+${fsck_cmd} -f -n "${DEV}"
+
+str="$(${E2FSPROGS}/debugfs/debugfs -R 'ex /a' "${DEV}")"
+echo "${str}"
+echo "${str}" | grep -i uninit
+
+zap="$(${E2FSPROGS}/debugfs/debugfs -R 'bmap /a 1' "${DEV}")"
+${E2FSPROGS}/debugfs/debugfs -w -R "zap -p 0x55 ${zap}" "${DEV}"
+
+${mount_cmd} ${MOUNT_OPTS} "${DEV}" "${MNT}" -t ext4 -o journal_checksum
+od -tx1 -Ad -c "${MNT}/a" > /tmp/a
+cat > /tmp/b << ENDL
+0000000 6d 6f 6f 0a 00 00 00 00 00 00 00 00 00 00 00 00
+ m o o \n \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
+0000016 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
+ \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
+*
+0393216
+ENDL
+diff -u /tmp/a /tmp/b
+echo "cow" | dd of="${MNT}/a" bs="${BLK_SZ}" count=1 seek=3 conv=notrunc
+umount "${MNT}"
+${fsck_cmd} -f -n "${DEV}"
+
+${E2FSPROGS}/debugfs/debugfs -R 'ex /a' "${DEV}" | cat -
+
+${mount_cmd} ${MOUNT_OPTS} "${DEV}" "${MNT}" -t ext4 -o journal_checksum
+od -tx1 -Ad -c "${MNT}/a" > /tmp/a
+OFF1="$(printf '%07d\n' "$((3 * BLK_SZ))")"
+OFF2="$(printf '%07d\n' "$(((3 * BLK_SZ) + 16))")"
+cat > /tmp/b << ENDL
+0000000 6d 6f 6f 0a 00 00 00 00 00 00 00 00 00 00 00 00
+ m o o \n \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
+0000016 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
+ \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
+*
+${OFF1} 63 6f 77 0a 00 00 00 00 00 00 00 00 00 00 00 00
+ c o w \n \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
+${OFF2} 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
+ \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
+*
+0393216
+ENDL
+diff -u /tmp/a /tmp/b
+umount "${MNT}"
+${fsck_cmd} -f -n "${DEV}"
+}
+
# This test should be the last one (before speed tests, anyway)
#### ALL SPEED TESTS GO AT THE END
Extend the metadata checksum speed test to evaluate the performance
impact of the block_validity mount option.
Signed-off-by: Darrick J. Wong <[email protected]>
---
tests/metadata-checksum-test.sh | 72 +++++++++++++++++++++++++++++++++++----
1 file changed, 65 insertions(+), 7 deletions(-)
diff --git a/tests/metadata-checksum-test.sh b/tests/metadata-checksum-test.sh
index 1c73e58..0703501 100755
--- a/tests/metadata-checksum-test.sh
+++ b/tests/metadata-checksum-test.sh
@@ -192,6 +192,15 @@ cat > "${MKE2FS_CONFIG}" << ENDL
inode_ratio = 16384
[fs_types]
+ ext4icsum_no_bv = {
+ features = has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize,64bit$MKFS_OPTS
+ default_mntopts = acl,user_xattr
+ inode_size = ${INODE_SZ}
+ blocksize = ${BLK_SZ}
+ options = mmp_update_interval=5 #${RESIZE_PARAM}
+ lazy_itable_init = 1
+ cluster_size = $((BLK_SZ * 2))
+ }
ext4icsum = {
features = has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize,64bit$MKFS_OPTS
inode_size = ${INODE_SZ}
@@ -3171,11 +3180,14 @@ if [ ! -r /tmp/tarfiles ]; then
fi
if [ ! -r /tmp/dirs -o ! -r /tmp/files -o ! -r /tmp/topdirs ]; then
rm -rf /tmp/dirs /tmp/files /tmp/topdirs
+ set +x
+ echo '+ create_fab_config'
for i in a1 a2 a3 a4 a5 a6 a7 a8 a9 aA aB aC aD aE aF; do
cat /tmp/tarfiles | grep ^d | awk "{printf(\"$i/%s\n\", \$6);}" >> /tmp/dirs
cat /tmp/tarfiles | grep -v ^d | awk "{printf(\"%s $i/%s\n\", \$3, \$6);}" >> /tmp/files
echo "$i" >> /tmp/topdirs
done
+ set -x
fi
if [ ! -x /tmp/fab ]; then
cat > /tmp/fab.c << ENDL
@@ -3188,19 +3200,28 @@ if [ ! -x /tmp/fab ]; then
#include <stdlib.h>
#include <errno.h>
+#define ZERO_BUF_SZ 65536
+char zeroes[ZERO_BUF_SZ];
+
int main(int argc, char *argv[])
{
FILE *fp;
int fd, ret;
- size_t size;
+ size_t size, off, retoff;
char *space;
char buf[1024];
+ int write_files = 0;
if (argc < 2) {
printf("Usage: %s file_containing_sz_name_pairs\n", argv[0]);
return 4;
}
+ if (getenv("FAB_WRITE_FILES")) {
+ write_files = 1;
+ memset(zeroes, 0, ZERO_BUF_SZ);
+ }
+
fp = fopen(argv[1], "r");
if (!fp) {
perror(argv[1]);
@@ -3223,12 +3244,23 @@ int main(int argc, char *argv[])
}
size = strtoul(buf, NULL, 0);
if (size) {
- ret = posix_fallocate(fd, 0, size);
- if (ret) {
- errno = ret;
- perror(space);
- close(fd);
- break;
+ if (write_files) {
+ for (off = 0; off < size; off += ZERO_BUF_SZ) {
+ retoff = pwrite(fd, zeroes, (size - off) < ZERO_BUF_SZ ? (size - off) : ZERO_BUF_SZ, off);
+ if (retoff != ((size - off) < ZERO_BUF_SZ ? (size - off) : ZERO_BUF_SZ)) {
+ perror(space);
+ close(fd);
+ break;
+ }
+ }
+ } else {
+ ret = posix_fallocate(fd, 0, size);
+ if (ret) {
+ errno = ret;
+ perror(space);
+ close(fd);
+ break;
+ }
}
}
close(fd);
@@ -3331,6 +3363,32 @@ umount $MNT
}
#####################
+function block_validity_speed_test {
+test "${SKIP_SPEED_TESTS}" -gt 0 && return
+msg "block_validity_speed_test"
+prep_speed_test
+
+msg "No block_validity"
+$VALGRIND ${E2FSPROGS}/misc/mke2fs -T ext4icsum_no_bv $MKFS_OPTS $MKFS_FEATURES -F "${DEV}"
+test -z "$NO_CSUM" && $VALGRIND ${E2FSPROGS}/misc/tune2fs -O metadata_csum $DEV
+${E2FSPROGS}/misc/dumpe2fs -h $DEV 2> /dev/null | egrep -q "^Filesystem state:[ ]*clean$" || ${fsck_cmd} -fDy $DEV || true
+${mount_cmd} ${MOUNT_OPTS} "${DEV}" "${MNT}" -t ext4
+cd "${MNT}"
+/usr/bin/time bash -c "mkdir -p \$(cat /tmp/dirs); /tmp/fab /tmp/files; sync"
+/usr/bin/time bash -c "rm -rf \$(cat /tmp/topdirs); sync"
+cd -
+umount "${MNT}"
+
+msg "Yes block_validity"
+${mount_cmd} ${MOUNT_OPTS} "${DEV}" "${MNT}" -t ext4 -o block_validity
+cd "${MNT}"
+/usr/bin/time bash -c "mkdir -p \$(cat /tmp/dirs); /tmp/fab /tmp/files; sync"
+/usr/bin/time bash -c "rm -rf \$(cat /tmp/topdirs); sync"
+cd -
+umount "${MNT}"
+}
+
+#####################
# Allow restarting of a test run by giving the test name and a plus sign.
VERBS_LEN="${#VERBS}"
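For reference, a zero-fill loop like the FAB_WRITE_FILES path in fab.c normally writes the smaller of the remaining byte count and the buffer size on each pass. The sketch below prints that plan as offset/length pairs. chunk_plan is a made-up helper for illustration only and is not part of the patch.

```bash
#!/bin/bash
# Print one "offset length" pair for each write needed to fill `size`
# bytes from a buffer of `bufsz` bytes. (Hypothetical helper.)
chunk_plan() (
	size="$1"
	bufsz="$2"
	off=0
	while [ "${off}" -lt "${size}" ]; do
		# Never write more than what is left, and never more than the buffer.
		len=$(( size - off ))
		if [ "${len}" -gt "${bufsz}" ]; then
			len="${bufsz}"
		fi
		echo "${off} ${len}"
		off=$(( off + len ))
	done
)

# A 150000-byte file with the 64k buffer used by fab.c.
chunk_plan 150000 65536
```

Only the last pair should be shorter than the buffer, and a size that is an exact multiple of the buffer should end with a full-sized write.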
Make sure we test behavior when running out of space.
Signed-off-by: Darrick J. Wong <[email protected]>
---
tests/metadata-checksum-test.sh | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/tests/metadata-checksum-test.sh b/tests/metadata-checksum-test.sh
index 0703501..75b92a0 100755
--- a/tests/metadata-checksum-test.sh
+++ b/tests/metadata-checksum-test.sh
@@ -3164,6 +3164,26 @@ umount "${MNT}"
${fsck_cmd} -f -n "${DEV}"
}
+##########################
+function enospc_test {
+msg "enospc_test"
+$VALGRIND ${E2FSPROGS}/misc/mke2fs -T ext4icsum $MKFS_OPTS $MKFS_FEATURES -F "${DEV}" 12800
+test -z "$NO_CSUM" && $VALGRIND ${E2FSPROGS}/misc/tune2fs -O metadata_csum $DEV
+${E2FSPROGS}/misc/dumpe2fs -h $DEV 2> /dev/null | egrep -q "^Filesystem state:[ ]*clean$" || ${fsck_cmd} -fDy $DEV || true
+
+${mount_cmd} ${MOUNT_OPTS} "${DEV}" "${MNT}" -t ext4 -o journal_checksum
+dd if=/dev/zero of="${MNT}/a" || true
+dd if=/dev/zero of="${MNT}/b" || true
+ls -la "${MNT}/"
+for i in 1 2 3; do
+ XYZ="$(dd if=/dev/zero of="${MNT}/b" 2>&1 || true)"
+ echo "${XYZ}" | grep -c "No space left"
+ ls -la "${MNT}/"
+done
+umount "${MNT}"
+${fsck_cmd} -f -n "${DEV}"
+}
+
# This test should be the last one (before speed tests, anyway)
#### ALL SPEED TESTS GO AT THE END
Add a test to look for stale data after a file truncation.
Signed-off-by: Darrick J. Wong <[email protected]>
---
tests/metadata-checksum-test.sh | 47 +++++++++++++++++++++++++++++++++++++++
1 file changed, 47 insertions(+)
diff --git a/tests/metadata-checksum-test.sh b/tests/metadata-checksum-test.sh
index 75b92a0..1f0d5bc 100755
--- a/tests/metadata-checksum-test.sh
+++ b/tests/metadata-checksum-test.sh
@@ -3184,6 +3184,53 @@ umount "${MNT}"
${fsck_cmd} -f -n "${DEV}"
}
+##########################
+function truncate_stale_test {
+msg "truncate_stale_test"
+$VALGRIND ${E2FSPROGS}/misc/mke2fs -T ext4icsum $MKFS_OPTS $MKFS_FEATURES -F "${DEV}" 12800
+test -z "$NO_CSUM" && $VALGRIND ${E2FSPROGS}/misc/tune2fs -O metadata_csum $DEV
+${E2FSPROGS}/misc/dumpe2fs -h $DEV 2> /dev/null | egrep -q "^Filesystem state:[ ]*clean$" || ${fsck_cmd} -fDy $DEV || true
+
+# a: truncate, fallocate
+# b: truncate, truncate
+${mount_cmd} ${MOUNT_OPTS} "${DEV}" "${MNT}" -t ext4 -o journal_checksum
+echo "${FUZZ}" > "${MNT}/original"
+echo "${FUZZ}" >> "${MNT}/original"
+echo "${FUZZ}" >> "${MNT}/original"
+
+cp "${MNT}/original" "${MNT}/a"
+cp "${MNT}/original" "${MNT}/b"
+
+dd if=/dev/zero of="${MNT}/correct" bs=20480 count=1
+echo -n "${FUZZ}" | dd of="${MNT}/correct" conv=notrunc
+
+umount "${MNT}"
+${fsck_cmd} -f -n "${DEV}"
+
+${mount_cmd} ${MOUNT_OPTS} "${DEV}" "${MNT}" -t ext4 -o journal_checksum
+FUZZ_LEN="${#FUZZ}"
+truncate -s "${FUZZ_LEN}" "${MNT}/a"
+truncate -s "${FUZZ_LEN}" "${MNT}/b"
+umount "${MNT}"
+${fsck_cmd} -f -n "${DEV}"
+
+${mount_cmd} ${MOUNT_OPTS} "${DEV}" "${MNT}" -t ext4 -o journal_checksum
+fallocate -l 20k "${MNT}/a"
+truncate -s 20k "${MNT}/b"
+umount "${MNT}"
+${fsck_cmd} -f -n "${DEV}"
+
+${mount_cmd} ${MOUNT_OPTS} "${DEV}" "${MNT}" -t ext4 -o journal_checksum
+for i in a b correct original; do
+ od -tx1 -Ad -c "${MNT}/${i}" > "${MNT}/${i}.txt"
+done
+diff -u "${MNT}/original.txt" "${MNT}/correct.txt" || true
+diff -u "${MNT}/a.txt" "${MNT}/correct.txt"
+diff -u "${MNT}/b.txt" "${MNT}/correct.txt"
+umount "${MNT}"
+${fsck_cmd} -f -n "${DEV}"
+}
+
# This test should be the last one (before speed tests, anyway)
#### ALL SPEED TESTS GO AT THE END
Create a test to try to map logical blocks at the upper end of where
we can map blocks. Then check the stat results, and run the whole
thing by e2fsck.
Signed-off-by: Darrick J. Wong <[email protected]>
---
tests/metadata-checksum-test.sh | 46 +++++++++++++++++++++++++++++++++++++++
1 file changed, 46 insertions(+)
diff --git a/tests/metadata-checksum-test.sh b/tests/metadata-checksum-test.sh
index 1f0d5bc..4e353e1 100755
--- a/tests/metadata-checksum-test.sh
+++ b/tests/metadata-checksum-test.sh
@@ -3231,6 +3231,52 @@ umount "${MNT}"
${fsck_cmd} -f -n "${DEV}"
}
+##########################
+function enormous_extent_file_test {
+msg "enormous_extent_file_test"
+
+# Try to create a file that's too big (with extents)
+$VALGRIND ${E2FSPROGS}/misc/mke2fs -T ext4icsum $MKFS_OPTS -O extent,^bigalloc,^64bit -F "${DEV}"
+test -z "$NO_CSUM" && $VALGRIND ${E2FSPROGS}/misc/tune2fs -O metadata_csum $DEV
+${E2FSPROGS}/misc/dumpe2fs -h $DEV 2> /dev/null | egrep -q "^Filesystem state:[ ]*clean$" || ${fsck_cmd} -fDy $DEV || true
+
+MAX_BLK=$(( (2**32) - 1 ))
+FILE_SIZE="$(( MAX_BLK * BLK_SZ ))"
+${mount_cmd} ${MOUNT_OPTS} "${DEV}" "${MNT}" -t ext4 -o journal_checksum
+dd if=/dev/zero of="${MNT}/bigfile" bs="${BLK_SZ}" count=32 seek=4294967288 conv=notrunc || true
+dd if=/dev/zero of="${MNT}/bigfile2" bs="${BLK_SZ}" count=32 seek=4294967288 || true
+umount "${MNT}"
+${fsck_cmd} -C0 -f -n "${DEV}"
+${E2FSPROGS}/debugfs/debugfs -R 'stat /bigfile' "${DEV}" | cat -
+${E2FSPROGS}/debugfs/debugfs -R 'stat /bigfile' "${DEV}" | grep -c "Size: ${FILE_SIZE}"
+${E2FSPROGS}/debugfs/debugfs -R 'stat /bigfile2' "${DEV}" | grep -c "Size: ${FILE_SIZE}"
+}
+
+##########################
+function enormous_blockmap_file_test {
+msg "enormous_blockmap_file_test"
+
+# Try to create a file that's too big (with block maps)
+$VALGRIND ${E2FSPROGS}/misc/mke2fs -T ext4icsum $MKFS_OPTS -O ^extent,^bigalloc,^64bit -F "${DEV}"
+test -z "$NO_CSUM" && $VALGRIND ${E2FSPROGS}/misc/tune2fs -O metadata_csum $DEV
+${E2FSPROGS}/misc/dumpe2fs -h $DEV 2> /dev/null | egrep -q "^Filesystem state:[ ]*clean$" || ${fsck_cmd} -fDy $DEV || true
+
+ADDR_PER_BLOCK="$((BLK_SZ / 4))"
+MAX_BLK="$(( (ADDR_PER_BLOCK ** 3) + (ADDR_PER_BLOCK ** 2) + (ADDR_PER_BLOCK) + 12))"
+if [ "${MAX_BLK}" -gt "$(( (2**32) - 1 ))" ]; then
+ MAX_BLK="$(( (2**32) - 1 ))"
+fi
+FILE_SIZE="$(( MAX_BLK * BLK_SZ ))"
+${mount_cmd} ${MOUNT_OPTS} "${DEV}" "${MNT}" -t ext4 -o journal_checksum
+dd if=/dev/zero of="${MNT}/bigfile" bs="${BLK_SZ}" count=32 seek="$((MAX_BLK - 8))" conv=notrunc || true
+dd if=/dev/zero of="${MNT}/bigfile2" bs="${BLK_SZ}" count=32 seek="$((MAX_BLK - 8))" || true
+umount "${MNT}"
+${E2FSPROGS}/debugfs/debugfs -R 'stat /bigfile' "${DEV}" | cat -
+${fsck_cmd} -C0 -f -n "${DEV}"
+${E2FSPROGS}/debugfs/debugfs -R 'stat /bigfile' "${DEV}" | grep -c "Size: ${FILE_SIZE}"
+${E2FSPROGS}/debugfs/debugfs -R 'stat /bigfile2' "${DEV}" | grep -c "Size: ${FILE_SIZE}"
+}
+
# This test should be the last one (before speed tests, anyway)
#### ALL SPEED TESTS GO AT THE END
Create custom mount/umount commands so that we can run the metadata
checksumming tests against fuse2fs.
Signed-off-by: Darrick J. Wong <[email protected]>
---
tests/fuse2fs/mount | 22 ++++++++++++++++++++++
tests/fuse2fs/umount | 21 +++++++++++++++++++++
2 files changed, 43 insertions(+)
create mode 100755 tests/fuse2fs/mount
create mode 100755 tests/fuse2fs/umount
diff --git a/tests/fuse2fs/mount b/tests/fuse2fs/mount
new file mode 100755
index 0000000..96a89b7
--- /dev/null
+++ b/tests/fuse2fs/mount
@@ -0,0 +1,22 @@
+#!/bin/bash
+
+# Mount ext4 via fuse. Put tests/fuse2fs/ at the start of PATH if you want
+# to run the metadata checksumming tests with fuse2fs.
+
+for arg in "$@"; do
+ if [ -b "${arg}" ]; then
+ DEV="${arg}"
+ elif [ -d "${arg}" ]; then
+ MNT="${arg}"
+ fi
+done
+
+if [ -z "${DEV}" -o -z "${MNT}" ]; then
+ echo "Please specify a device and a mountpoint."
+fi
+
+DIR="$(readlink -f "$(dirname "$0")")"
+"${DIR}/../../misc/fuse2fs" "${DEV}" "${MNT}"
+ERR=$?
+sleep 1
+exit "${ERR}"
diff --git a/tests/fuse2fs/umount b/tests/fuse2fs/umount
new file mode 100755
index 0000000..715bee1
--- /dev/null
+++ b/tests/fuse2fs/umount
@@ -0,0 +1,21 @@
+#!/bin/bash
+
+# unmount a filesystem
+sync
+sync
+sync
+
+N=1
+if [ -x /bin/umount ]; then
+ /bin/umount "$@"
+ ERR=$?
+elif [ -x /sbin/umount ]; then
+ /sbin/umount "$@"
+ ERR=$?
+else
+ echo "Where is umount?"
+ exit 5
+fi
+sleep 1
+
+exit "${ERR}"
Test the creation and reading of a large symlink.
Signed-off-by: Darrick J. Wong <[email protected]>
---
tests/metadata-checksum-test.sh | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/tests/metadata-checksum-test.sh b/tests/metadata-checksum-test.sh
index 4e353e1..d34985b 100755
--- a/tests/metadata-checksum-test.sh
+++ b/tests/metadata-checksum-test.sh
@@ -3277,6 +3277,30 @@ ${E2FSPROGS}/debugfs/debugfs -R 'stat /bigfile' "${DEV}" | grep -c "Size: ${FILE
${E2FSPROGS}/debugfs/debugfs -R 'stat /bigfile2' "${DEV}" | grep -c "Size: ${FILE_SIZE}"
}
+##########################
+function big_symlink_test {
+msg "big_symlink_test"
+
+$VALGRIND ${E2FSPROGS}/misc/mke2fs -T ext4icsum $MKFS_OPTS $MKFS_FEATURES -F "${DEV}"
+test -z "$NO_CSUM" && $VALGRIND ${E2FSPROGS}/misc/tune2fs -O metadata_csum $DEV
+${E2FSPROGS}/misc/dumpe2fs -h $DEV 2> /dev/null | egrep -q "^Filesystem state:[ ]*clean$" || ${fsck_cmd} -fDy $DEV || true
+
+set +x
+LINK_TARGET="/$(printf "x%.0s" {1..1000})"
+set -x
+${mount_cmd} ${MOUNT_OPTS} "${DEV}" "${MNT}" -t ext4 -o journal_checksum
+ln -s "${LINK_TARGET}" "${MNT}/biglink"
+umount "${MNT}"
+${E2FSPROGS}/debugfs/debugfs -R 'stat /biglink' "${DEV}" | cat -
+${fsck_cmd} -C0 -f -n "${DEV}"
+
+${mount_cmd} ${MOUNT_OPTS} "${DEV}" "${MNT}" -t ext4 -o journal_checksum
+LINK_NOW="$(readlink "${MNT}/biglink")"
+umount "${MNT}"
+
+test "${LINK_NOW}" = "${LINK_TARGET}"
+}
+
# This test should be the last one (before speed tests, anyway)
#### ALL SPEED TESTS GO AT THE END
Test our ability to handle the entire range of valid dates.
Signed-off-by: Darrick J. Wong <[email protected]>
---
tests/metadata-checksum-test.sh | 62 +++++++++++++++++++++++++++++++++++++++
1 file changed, 62 insertions(+)
diff --git a/tests/metadata-checksum-test.sh b/tests/metadata-checksum-test.sh
index d34985b..507d355 100755
--- a/tests/metadata-checksum-test.sh
+++ b/tests/metadata-checksum-test.sh
@@ -3301,6 +3301,68 @@ umount "${MNT}"
test "${LINK_NOW}" = "${LINK_TARGET}"
}
+##########################
+function date_test {
+msg "date_test"
+
+$VALGRIND ${E2FSPROGS}/misc/mke2fs -T ext4icsum $MKFS_OPTS $MKFS_FEATURES -F "${DEV}"
+test -z "$NO_CSUM" && $VALGRIND ${E2FSPROGS}/misc/tune2fs -O metadata_csum $DEV
+${E2FSPROGS}/misc/dumpe2fs -h $DEV 2> /dev/null | egrep -q "^Filesystem state:[ ]*clean$" || ${fsck_cmd} -fDy $DEV || true
+rm -rf /tmp/ls.before /tmp/ls.after /tmp/debugfs.diff
+
+INODE_SIZE="$(${E2FSPROGS}/misc/dumpe2fs -h "${DEV}" | grep 'Inode size:' | awk '{print $3}')"
+if [ "${INODE_SIZE}" -gt 128 ]; then
+ LAST_YEAR=2430
+else
+ LAST_YEAR=2030
+fi
+
+# Write dates
+${mount_cmd} ${MOUNT_OPTS} "${DEV}" "${MNT}" -t ext4 -o journal_checksum
+seq 1910 20 "${LAST_YEAR}" | while read year; do
+ DATE="${year}-01-01 00:00:00.000000000"
+ FNAME="$(echo "${DATE}" | tr '[ \-:.]' '____')"
+ touch -d "${DATE}" "${MNT}/${FNAME}"
+ echo "${FNAME} ${DATE}" >> /tmp/ls.before
+done
+umount "${MNT}"
+${fsck_cmd} -C0 -f -n "${DEV}"
+
+# debugfs
+seq 1910 20 "${LAST_YEAR}" | while read year; do
+ DATE="${year}-01-01 00:00:00.000000000"
+ FNAME="$(echo "${DATE}" | tr '[ \-:.]' '____')"
+ echo "${FNAME}" "$(${E2FSPROGS}/debugfs/debugfs -R "stat ${FNAME}" "${DEV}" | grep 'mtime:')"
+done > /tmp/debugfs.before
+
+# Re-read from kernel
+${mount_cmd} ${MOUNT_OPTS} "${DEV}" "${MNT}" -t ext4 -o journal_checksum
+seq 1910 20 "${LAST_YEAR}" | while read year; do
+ DATE="${year}-01-01 00:00:00.000000000"
+ FNAME="$(echo "${DATE}" | tr '[ \-:.]' '____')"
+ FDATE="$(stat -c '%y' "${MNT}/${FNAME}" | sed -e 's/......$//g')"
+ echo "${FNAME}" "${FDATE}" >> /tmp/ls.after
+done
+umount "${MNT}"
+
+# Did the kernel work?
+diff -u /tmp/ls.before /tmp/ls.after > /tmp/ls.diff || true
+
+# Does debugfs work?
+touch /tmp/debugfs.diff
+cat /tmp/debugfs.before | sed -e 's/^\(....\).*\(....\)$/\1 \2/g' | while read date fdate crap; do
+ if [ "${date}" != "${fdate}" ]; then
+ echo "${date} != ${fdate}" >> /tmp/debugfs.diff
+ fi
+done
+
+if [ "$(cat /tmp/debugfs.diff /tmp/ls.diff | wc -l)" -gt 0 ]; then
+ echo "BROKEN DATE HANDLING"
+ cat /tmp/debugfs.diff /tmp/ls.diff
+ false
+fi
+}
+
# This test should be the last one (before speed tests, anyway)
#### ALL SPEED TESTS GO AT THE END
On Tue, Dec 10, 2013 at 05:18:21PM -0800, Darrick J. Wong wrote:
> On a FS with a rather large blocksize (> 4K), the old block map
> structure can construct a fat enough "tree" (or whatever we call that
> lopsided thing) that (at least in theory) one could create mappings
> for logical blocks higher than 32 bits. In practice this doesn't
> happen, but the 'max' and 'iter' variables that the punch helpers use
> will overflow because the BLOCK_SIZE_BITS shifts are too large to fit
> a 32-bit variable. The current variable declarations also cause punch
> to fail on TIND-mapped blocks even if the file is < 16T. So enlarge
> the fields to fit.
>
> Yes this is an obscure corner case, but it seems a little silly if we
> can't punch a file's block 300,000,000 on a 64k-block filesystem.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
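To put rough numbers on the overflow described above, here is a small sketch of how many logical blocks a block map can describe at a given block size, and what is left if that number is kept in a 32-bit variable. blockmap_span and as_u32 are hypothetical names. The 4-bytes-per-pointer figure is the same assumption enormous_blockmap_file_test makes with BLK_SZ / 4.

```bash
#!/bin/bash
# Logical blocks reachable through an old-style block map: 12 direct
# slots plus the single-, double- and triple-indirect spans.
# (Hypothetical helper; assumes 4-byte block pointers.)
blockmap_span() (
	blksz="$1"
	addr=$(( blksz / 4 ))
	echo "$(( 12 + addr + addr * addr + addr * addr * addr ))"
)

# Value that survives if the same number is stored in 32 bits.
as_u32() (
	echo "$(( $1 & 0xFFFFFFFF ))"
)

for blksz in 4096 65536; do
	span="$(blockmap_span "${blksz}")"
	echo "blocksize=${blksz} span=${span} as_u32=$(as_u32 "${span}")"
done
```

With 4k blocks the span still fits in 32 bits. With 64k blocks the triple-indirect term alone is 2^42 and vanishes when truncated. Block 300,000,000 lies past the double-indirect range on a 64k-block filesystem (12 + 16384 + 16384^2 = 268451852), so it is TIND-mapped.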
On Tue, Dec 10, 2013 at 05:18:27PM -0800, Darrick J. Wong wrote:
> For each site where we test for a large file (> 2GB) and set the
> LARGE_FILE feature, use a helper function to make the size test
> consistent with the test that's in e2fsck. This fixes the fsck
> complaints when we try to create a 2GB journal (not so hard with 64k
> block size) and fixes the incorrect test in fileio.c.
>
> v2: Fix another site in e2fsck/pass2.c that Zheng Liu pointed out.
Thanks, applied.
- Ted
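As an illustration of why one shared size test matters, here is a sketch of such a predicate in shell. needs_large_file is a hypothetical name, the 2^31-byte threshold is an assumption inferred from the 2GB journal case above, and the 32768-block journal is just an example figure.

```bash
#!/bin/bash
# Hypothetical helper: succeeds (exit status 0) when a file of the given
# size in bytes would need the LARGE_FILE feature.
# Assumption: the boundary is 2^31 bytes, inclusive.
needs_large_file() (
	[ "$1" -ge "$(( 0x80000000 ))" ]
)

# Example: a journal of 32768 blocks on a 64k-block filesystem.
journal_bytes=$(( 32768 * 65536 ))
if needs_large_file "${journal_bytes}"; then
	echo "${journal_bytes} bytes: LARGE_FILE needed"
else
	echo "${journal_bytes} bytes: LARGE_FILE not needed"
fi
```

A journal of that size lands exactly on the boundary, which is where call sites that disagree about > versus >= give different answers.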
On Tue, Dec 10, 2013 at 05:18:37PM -0800, Darrick J. Wong wrote:
> mke2fs has a series of checks to ensure that we don't create a
> filesystem too big for its blocksize -- if auto-64bit is on, then it
> turns on 64bit; otherwise it complains. Unfortunately, it performs
> these checks before looking in mke2fs.conf for a blocksize, which
> means that the checks are incorrect if the user specifies a non-4096
> blocksize in the config file and says nothing on the command line.
> The bug also has the effect of mandating a 4k block size on any block
> device larger than 4T in that situation. Therefore, read the block
> size from the config file before performing the 64bit checks.
>
> Reviewed-by: Zheng Liu <[email protected]>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
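A quick sketch of why the block size has to be known before the 64bit check runs: the same device can need 64bit at one block size and not at another. needs_64bit is a hypothetical helper, and the 2^32 - 1 limit on 32-bit block counts is my assumption about where the cutoff sits.

```bash
#!/bin/bash
# Hypothetical check: succeeds when a device of dev_bytes holds more
# blocks of size blksz than a 32-bit block number can address.
# Assumption: the 32-bit limit is 2^32 - 1 blocks.
needs_64bit() (
	dev_bytes="$1"
	blksz="$2"
	[ "$(( dev_bytes / blksz ))" -gt "$(( 0xFFFFFFFF ))" ]
)

# An 8 TiB device checked at two block sizes.
dev_bytes=$(( 8 * 1024 * 1024 * 1024 * 1024 ))
for blksz in 1024 4096; do
	if needs_64bit "${dev_bytes}" "${blksz}"; then
		echo "blocksize=${blksz}: needs 64bit"
	else
		echo "blocksize=${blksz}: fits in 32-bit block numbers"
	fi
done
```

If the check runs while the block size is still assumed to be 4096, a 1k blocksize from mke2fs.conf never gets a correct answer.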
On Tue, Dec 10, 2013 at 05:18:43PM -0800, Darrick J. Wong wrote:
> Use the new ext2fs_punch() call to truncate the quota file. This also
> eliminates the need to fix it to work with bigalloc.
>
> Reviewed-by: Zheng Liu <[email protected]>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Thu, Dec 12, 2013 at 12:28:40PM -0500, Theodore Ts'o wrote:
> On Tue, Dec 10, 2013 at 05:18:43PM -0800, Darrick J. Wong wrote:
> > Use the new ext2fs_punch() call to truncate the quota file. This also
> > eliminates the need to fix it to work with bigalloc.
> >
> > Reviewed-by: Zheng Liu <[email protected]>
> > Signed-off-by: Darrick J. Wong <[email protected]>
>
> Thanks, applied.
Hmm, I spoke too soon. This is causing a test failure when applied
against the maint branch. I know that you developed this versus the
next branch, so it's possible the bug in ext2fs_punch() was fixed in
next. I'm going to drop this from the maint branch for now, I'll take
a closer look at this after going through the rest of the bug fix
patches, and see whether it works when applied against next (in which
case I'll have to see which bug fix we need to apply to backport the
fix to the maint branch).
- Ted
On Tue, Dec 10, 2013 at 05:18:50PM -0800, Darrick J. Wong wrote:
> The help text for debugfs' init_filesys command is incorrect; the
> second parameter is the size of the filesystem in blocks, not the size
> of an individual filesystem block. There is in fact no way to set
> that parameter.
>
> Reported-by: Zheng Liu <[email protected]>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:19:10PM -0800, Darrick J. Wong wrote:
> Forbid clients from trying to map logical block numbers that are
> larger than the lblk->pblk data structures are capable of handling.
> While we're at it, don't let clients set the file size to a number
> that's beyond what can be mapped.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:19:16PM -0800, Darrick J. Wong wrote:
> 'an block' should be 'a block'. Missed the read case in the first patch.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:19:28PM -0800, Darrick J. Wong wrote:
> We should really use the ext2fs memory allocator functions in
> copy_file(), and we really should return a value if there are allocation
> problems.
>
> Also fix up a minor bogosity in an error message.
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:19:41PM -0800, Darrick J. Wong wrote:
> If we have to create a big symlink (i.e. one that doesn't fit into
> i_block[]), we are not 64bit block safe and the namei code does not
> handle extents at all. Fix both.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:19:48PM -0800, Darrick J. Wong wrote:
> debugfs should use strtoull wrappers for reading block numbers from
> the command line. "unsigned long" isn't wide enough to handle block
> numbers on 32bit platforms.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:19:54PM -0800, Darrick J. Wong wrote:
> When reading or writing file blocks, use the IO manager routines that
> can handle 64bit block numbers.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:20:07PM -0800, Darrick J. Wong wrote:
> The caller of dump_file provides a fd to write to, so the caller
> should also dispose of the fd. Also, the fd never gets closed if
> preserve=1.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:20:14PM -0800, Darrick J. Wong wrote:
> ext2fs_free_mem() takes a pointer to a pointer, similar to
> ext2fs_get_mem(). Improve the documentation, and fix debugfs.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:20:20PM -0800, Darrick J. Wong wrote:
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:20:27PM -0800, Darrick J. Wong wrote:
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:20:33PM -0800, Darrick J. Wong wrote:
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:20:40PM -0800, Darrick J. Wong wrote:
> If someone umounts the filesystem between statfs64 and the getmntent()
> iteration, we can exit the loop having never set mnt_type, and strcmp
> can crash. Fix the potential NULL deref.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:20:46PM -0800, Darrick J. Wong wrote:
> sysconf(_SC_PAGESIZE) will probably never return an error, but just in
> case it does, we shouldn't pass what looks like a huge number to
> sync_file_range() and posix_fadvise().
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:20:53PM -0800, Darrick J. Wong wrote:
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:20:59PM -0800, Darrick J. Wong wrote:
> Check the return values from ext2fs_get_block_bitmap_range2(); if an
> error happened, print that and don't print a garbage bitmap.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:21:08PM -0800, Darrick J. Wong wrote:
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:21:15PM -0800, Darrick J. Wong wrote:
> Fix memory allocation calculations and check for NULL pointer returns.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:21:21PM -0800, Darrick J. Wong wrote:
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:21:28PM -0800, Darrick J. Wong wrote:
> Fix up a few places where we ignore return values.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:21:34PM -0800, Darrick J. Wong wrote:
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:21:41PM -0800, Darrick J. Wong wrote:
> Zero is a valid file descriptor, so close it.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:21:47PM -0800, Darrick J. Wong wrote:
> If we're using ext2fs_file_write() to write to a hole in a file,
> ensure that we can actually allocate the block before updating i_size.
> In other words, don't update i_size and don't return success if we hit
> an error while allocating space.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:21:54PM -0800, Darrick J. Wong wrote:
> When deleting an entire extent, we cannot always slip to the previous
> leaf extent because there might not /be/ a previous extent.
> Attempting to correct for that error by asking for the 'current' leaf
> extent also doesn't work, because the failed attempt to change to the
> previous extent leaves us with no current extent.
>
> Fix this problem by recording the lblk of the next extent before
> deleting the current extent and _goto()ing to the next extent after
> the deletion.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:22:00PM -0800, Darrick J. Wong wrote:
> If we're asked to punch a file with no data blocks mapped to it and a
> non-zero length, we don't need to do any work in ext2fs_punch_extent()
> and can return success. Unfortunately, the extent_get() function
> returns "no current node" because it (correctly) failed to find any
> extents, which is bubbled up to callers. Since no extents being found
> is not an error in this corner case, fix up ext2fs_punch_extent() to
> return 0 to callers.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:22:07PM -0800, Darrick J. Wong wrote:
> When we're rehashing directories, it's possible that an extent block
> (or a map block) could be (silently) allocated by the underlying
> libext2fs when expanding the directory. This silent allocation is not
> captured in block_found_map, which is disastrous if later the rehash
> process expands another directory and uses that same block from
> before without realizing that it's now in use.
>
> Therefore, if we notice that the free block count has dropped by more
> than what e2fsck allocated itself during the expansion, we iterate the
> directory's blocks a second time to ensure that these silent
> allocations are marked in the found blocks bitmap.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:22:13PM -0800, Darrick J. Wong wrote:
> When we set the file size, find the block containing EOF, and zero
> everything in that block past EOF so that we can't return stale data
> if we ever use fallocate or truncate to lengthen the file.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
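The range that has to be zeroed is just the tail of the block containing EOF. A sketch of that arithmetic follows. eof_tail is a hypothetical helper that only computes the range and does not touch any file.

```bash
#!/bin/bash
# Print "offset length" for the bytes between EOF and the end of the
# block that contains EOF. Prints nothing when EOF is block aligned.
# (Hypothetical helper.)
eof_tail() (
	size="$1"
	blksz="$2"
	used=$(( size % blksz ))
	if [ "${used}" -ne 0 ]; then
		echo "${size} $(( blksz - used ))"
	fi
)

eof_tail 10 4096      # short file: zero from byte 10 to the end of block 0
eof_tail 8192 4096    # block-aligned size: nothing to zero
```

This is the region truncate_stale_test checks after it shortens a file and then grows it again with fallocate or truncate.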
On Thu, Dec 12, 2013 at 12:36:34PM -0500, Theodore Ts'o wrote:
> On Thu, Dec 12, 2013 at 12:28:40PM -0500, Theodore Ts'o wrote:
> > On Tue, Dec 10, 2013 at 05:18:43PM -0800, Darrick J. Wong wrote:
> > > Use the new ext2fs_punch() call to truncate the quota file. This also
> > > eliminates the need to fix it to work with bigalloc.
> > >
> > > Reviewed-by: Zheng Liu <[email protected]>
> > > Signed-off-by: Darrick J. Wong <[email protected]>
> >
> > Thanks, applied.
>
> Hmm, I spoke too soon. This is causing a test failure when applied
> against the maint branch. I know that you developed this versus the
> next branch, so it's possible the bug in ext2fs_punch() was fixed in
> next. I'm going to drop this from the maint branch for now, I'll take
> a closer look at this after going through the rest of the bug fix
> patches, and see whether it works when applied against next (in which
> case I'll have to see which bug fix we need to apply to backport the
> fix to the maint branch).
Hmm, I'll try to reproduce it here; which test is failing?
It's possible that I fubar'd the ordering when I re-sorted the patches just
prior to sending them out. :/
--D
>
> - Ted
On Thu, Dec 12, 2013 at 12:07:09PM -0800, Darrick J. Wong wrote:
>
> Hmm, I'll try to reproduce it here; which test is failing?
>
> It's possible that I fubar'd the ordering when I re-sorted the patches just
> prior to sending them out. :/
It's failing when I apply this patch against both maint and next, so
you may be quite right that it's a commit ordering issue.
The failing test was t_quota_2off; it looks like punch wasn't
adjusting the block bitmap correctly?
- Ted
tune2fs 1.43-WIP (8-Jul-2013)
e2fsck 1.43-WIP (8-Jul-2013)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: -(12--15)
Fix? yes
Free blocks count wrong for group #0 (84, counted=88).
Fix? yes
Free blocks count wrong (84, counted=88).
Fix? yes
/tmp/e2fsprogs-tmp.fMP6jN: ***** FILE SYSTEM WAS MODIFIED *****
/tmp/e2fsprogs-tmp.fMP6jN: 11/64 files (0.0% non-contiguous), 12/100 blocks
On Thu, Dec 12, 2013 at 12:07:09PM -0800, Darrick J. Wong wrote:
> On Thu, Dec 12, 2013 at 12:36:34PM -0500, Theodore Ts'o wrote:
> > On Thu, Dec 12, 2013 at 12:28:40PM -0500, Theodore Ts'o wrote:
> > > On Tue, Dec 10, 2013 at 05:18:43PM -0800, Darrick J. Wong wrote:
> > > > Use the new ext2fs_punch() call to truncate the quota file. This also
> > > > eliminates the need to fix it to work with bigalloc.
> > > >
> > > > Reviewed-by: Zheng Liu <[email protected]>
> > > > Signed-off-by: Darrick J. Wong <[email protected]>
> > >
> > > Thanks, applied.
> >
> > Hmm, I spoke too soon. This is causing a test failure when applied
> > against the maint branch. I know that you developed this versus the
> > next branch, so it's possible the bug in ext2fs_punch() was fixed in
> > next. I'm going to drop this from the maint branch for now, I'll take
> > a closer look at this after going through the rest of the bug fix
> > patches, and see whether it works when applied against next (in which
> > case I'll have to see which bug fix we need to apply to backport the
> > fix to the maint branch).
>
> Hmm, I'll try to reproduce it here; which test is failing?
Is t_quota_2off the culprit?
> It's possible that I fubar'd the ordering when I re-sorted the patches just
> prior to sending them out. :/
I discovered that at some point, configure.in got touched, which caused
configure to be re-run during a build. Unfortunately --disable-quota is the
default, so the quota tests never got run, and that's why I never
noticed on subsequent runs of make check. :(
The problem is that on a ^extents filesystem (such as what t_quota_2off
creates), the end parameter of ~0ULL overflows the (end - start + 1)
calculation, effectively setting count to 0, and no blocks get punched.
libquota then zeroes the inode, and the blocks are lost.
Since block map files shouldn't really have more than 2^32 blocks anyway, I
think it's safe to clamp 'end' to 2^32 in the non-extent case.
I'll send a patch shortly.
--D
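A small sketch of the wraparound, using the shell's 64-bit arithmetic. Here -1 stands in for ~0ULL, since both are "all bits set" in 64 bits. The function names are made up, and the clamp value passed below (2^32 - 1, the largest 32-bit block number) is my assumption for illustration, not necessarily what the follow-up patch uses.

```bash
#!/bin/bash
# -1 has the same 64-bit pattern as ~0ULL.
ALL_ONES=-1

# The (end - start + 1) calculation with no clamping.
punch_count() (
	start="$1"
	end="$2"
	echo "$(( end - start + 1 ))"
)

# Same calculation, but with end clamped to max_lblk first. A negative
# end is treated as "huge unsigned value" in this sketch.
punch_count_clamped() (
	start="$1"
	end="$2"
	max_lblk="$3"
	if [ "${end}" -lt 0 ] || [ "${end}" -gt "${max_lblk}" ]; then
		end="${max_lblk}"
	fi
	echo "$(( end - start + 1 ))"
)

echo "unclamped count: $(punch_count 0 "${ALL_ONES}")"
echo "clamped count:   $(punch_count_clamped 0 "${ALL_ONES}" "$(( 0xFFFFFFFF ))")"
```

The unclamped count comes out as 0, which is the "no blocks get punched" case. Once end is clamped, the count is non-zero again.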
On Dec 10, 2013, at 6:18 PM, Darrick J. Wong <[email protected]> wrote:
> mke2fs has a series of checks to ensure that we don't create a
> filesystem too big for its blocksize -- if auto-64bit is on, then it
> turns on 64bit; otherwise it complains. Unfortunately, it performs
> these checks before looking in mke2fs.conf for a blocksize, which
> means that the checks are incorrect if the user specifies a non-4096
> blocksize in the config file and says nothing on the command line.
> The bug also has the effect of mandating a 4k block size on any block
> device larger than 4T in that situation. Therefore, read the block
> size from the config file before performing the 64bit checks.
>
> Reviewed-by: Zheng Liu <[email protected]>
> Signed-off-by: Darrick J. Wong <[email protected]>
> ---
> misc/mke2fs.c | 134 +++++++++++++++++++++++++++++++--------------------------
> 1 file changed, 72 insertions(+), 62 deletions(-)
>
>
> diff --git a/misc/mke2fs.c b/misc/mke2fs.c
> index 67c9225..19b6e85 100644
> --- a/misc/mke2fs.c
> +++ b/misc/mke2fs.c
> @@ -1780,15 +1795,67 @@ profile_error:
> }
> }
>
> + /* Get the hardware sector sizes, if available */
> + retval = ext2fs_get_device_sectsize(device_name, &lsector_size);
> + if (retval) {
> + com_err(program_name, retval,
> + _("while trying to determine hardware sector size"));
> + exit(1);
> + }
> + retval = ext2fs_get_device_phys_sectsize(device_name, &psector_size);
> + if (retval) {
> + com_err(program_name, retval,
> + _("while trying to determine physical sector size"));
> + exit(1);
> + }
> +
> + tmp = getenv("MKE2FS_DEVICE_SECTSIZE");
> + if (tmp != NULL)
> + lsector_size = atoi(tmp);
> + tmp = getenv("MKE2FS_DEVICE_PHYS_SECTSIZE");
> + if (tmp != NULL)
> + psector_size = atoi(tmp);
> +
> + /* Older kernels may not have physical/logical distinction */
> + if (!psector_size)
> + psector_size = lsector_size;
> +
> + if (blocksize <= 0) {
> + use_bsize = get_int_from_profile(fs_types, "blocksize", 4096);
> +
> + if (use_bsize == -1) {
> + use_bsize = sys_page_size;
> + if ((linux_version_code < (2*65536 + 6*256)) &&
Would be nice to have a helper to compute the linux_version_code comparison,
the above is a bit too much detail for this code.
> + (use_bsize > 4096))
> + use_bsize = 4096;
> + }
> + if (lsector_size && use_bsize < lsector_size)
> + use_bsize = lsector_size;
> + if ((blocksize < 0) && (use_bsize < (-blocksize)))
> + use_bsize = -blocksize;
> + blocksize = use_bsize;
> + fs_blocks_count /= (blocksize / 1024);
> + } else {
> + if (blocksize < lsector_size) { /* Impossible */
> + com_err(program_name, EINVAL,
> + _("while setting blocksize; too small "
> + "for device\n"));
> + exit(1);
> + } else if ((blocksize < psector_size) &&
> + (psector_size <= sys_page_size)) { /* Suboptimal */
> + fprintf(stderr, _("Warning: specified blocksize %d is "
> + "less than device physical sectorsize %d\n"),
> + blocksize, psector_size);
> + }
> + }
> +
> + fs_param.s_log_block_size =
> + int_log2(blocksize >> EXT2_MIN_BLOCK_LOG_SIZE);
Does it make sense to wrap this whole block size guessing dance into a small
helper routine, like "figure_fs_blocksize()" or similar?
Cheers, Andreas
On Dec 10, 2013, at 6:20 PM, Darrick J. Wong <[email protected]> wrote:
> ext2fs_free_mem() takes a pointer to a pointer, similar to
> ext2fs_get_mem(). Improve the documentation, and fix debugfs.
>
> diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
> index 64e498f..0624350 100644
> --- a/lib/ext2fs/ext2fs.h
> +++ b/lib/ext2fs/ext2fs.h
> @@ -1608,7 +1608,7 @@ _INLINE_ void ext2fs_init_csum_seed(ext2_filsys fs)
> #ifndef EXT2_CUSTOM_MEMORY_ROUTINES
> #include <string.h>
> /*
> - * Allocate memory
> + * Allocate memory. The 'ptr' arg must point to a pointer.
> */
> _INLINE_ errcode_t ext2fs_get_mem(unsigned long size, void *ptr)
> {
Would that imply the second argument in all of these functions is "void **ptr"?
Does GCC handle that correctly? Do other compilers? Am I just clueless?
Cheers, Andreas
> @@ -1655,7 +1655,7 @@ _INLINE_ errcode_t ext2fs_get_arrayzero(unsigned long count,
> }
>
> /*
> - * Free memory
> + * Free memory. The 'ptr' arg must point to a pointer.
> */
> _INLINE_ errcode_t ext2fs_free_mem(void *ptr)
> {
> @@ -1669,7 +1669,7 @@ _INLINE_ errcode_t ext2fs_free_mem(void *ptr)
> }
>
> /*
> - * Resize memory
> + * Resize memory. The 'ptr' arg must point to a pointer.
> */
> _INLINE_ errcode_t ext2fs_resize_mem(unsigned long EXT2FS_ATTR((unused)) old_size,
> unsigned long size, void *ptr)
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
Cheers, Andreas
On Thu, Dec 12, 2013 at 03:33:12PM -0700, Andreas Dilger wrote:
> On Dec 10, 2013, at 6:20 PM, Darrick J. Wong <[email protected]> wrote:
> > ext2fs_free_mem() takes a pointer to a pointer, similar to
> > ext2fs_get_mem(). Improve the documentation, and fix debugfs.
> >
> > diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
> > index 64e498f..0624350 100644
> > --- a/lib/ext2fs/ext2fs.h
> > +++ b/lib/ext2fs/ext2fs.h
> > @@ -1608,7 +1608,7 @@ _INLINE_ void ext2fs_init_csum_seed(ext2_filsys fs)
> > #ifndef EXT2_CUSTOM_MEMORY_ROUTINES
> > #include <string.h>
> > /*
> > - * Allocate memory
> > + * Allocate memory. The 'ptr' arg must point to a pointer.
> > */
> > _INLINE_ errcode_t ext2fs_get_mem(unsigned long size, void *ptr)
> > {
>
> Would that imply the second argument in all of these functions is "void **ptr"?
> Does GCC handle that correctly? Do other compilers? Am I just clueless?
In the first function, pp is a pointer to malloc()'d memory, and we copy the
contents of pp into wherever ptr points to. In the second function, we free p,
set p to NULL, then copy that NULL into wherever ptr points to. In both cases,
ptr has to point to a pointer.
So, yes, the second argument is a pointer to a pointer. I'm not sure why the
declarations are so strange here, since it seems to promote confusion and
broken code.
Frankly I was tempted just to fix the declarations, since anybody passing a
pointer (not a pointer-pointer) to these functions has always been broken
anyway. However, maybe Ted knows of some reason why things are this way?
(That said, the programming errors mostly seem to be on the free end.)
_INLINE_ errcode_t ext2fs_get_mem(unsigned long size, void *ptr)
{
	void *pp;

	pp = malloc(size);
	if (!pp)
		return EXT2_ET_NO_MEMORY;
	memcpy(ptr, &pp, sizeof (pp));
	return 0;
}

_INLINE_ errcode_t ext2fs_free_mem(void *ptr)
{
	void *p;

	memcpy(&p, ptr, sizeof(p));
	free(p);
	p = 0;
	memcpy(ptr, &p, sizeof(p));
	return 0;
}
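
For illustration, a self-contained sketch of the calling convention. The
demo_* functions are stand-ins that mirror the two functions quoted above;
they are not the library routines, and the error value -1 is a placeholder
for EXT2_ET_NO_MEMORY.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for ext2fs_get_mem(): 'ptr' must be the address of a pointer. */
int demo_get_mem(unsigned long size, void *ptr)
{
	void *pp = malloc(size);

	if (!pp)
		return -1;
	memcpy(ptr, &pp, sizeof(pp));	/* store the new address in *ptr */
	return 0;
}

/* Stand-in for ext2fs_free_mem(): 'ptr' must be the address of a pointer. */
int demo_free_mem(void *ptr)
{
	void *p;

	memcpy(&p, ptr, sizeof(p));	/* read the address stored in *ptr */
	free(p);
	p = NULL;
	memcpy(ptr, &p, sizeof(p));	/* NULL out the caller's pointer */
	return 0;
}

/* Correct usage: pass &buf to both calls.  Returns 0 on success. */
int demo_roundtrip(void)
{
	char *buf = NULL;

	if (demo_get_mem(64, &buf))
		return -1;
	assert(buf != NULL);
	strcpy(buf, "hello");
	demo_free_mem(&buf);		/* not demo_free_mem(buf)! */
	return buf == NULL ? 0 : 1;
}
```

Because the parameter is declared void *, the compiler accepts
demo_free_mem(buf) without complaint, which is exactly the class of bug
being discussed.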
--D
>
> Cheers, Andreas
>
> > @@ -1655,7 +1655,7 @@ _INLINE_ errcode_t ext2fs_get_arrayzero(unsigned long count,
> > }
> >
> > /*
> > - * Free memory
> > + * Free memory. The 'ptr' arg must point to a pointer.
> > */
> > _INLINE_ errcode_t ext2fs_free_mem(void *ptr)
> > {
> > @@ -1669,7 +1669,7 @@ _INLINE_ errcode_t ext2fs_free_mem(void *ptr)
> > }
> >
> > /*
> > - * Resize memory
> > + * Resize memory. The 'ptr' arg must point to a pointer.
> > */
> > _INLINE_ errcode_t ext2fs_resize_mem(unsigned long EXT2FS_ATTR((unused)) old_size,
> > unsigned long size, void *ptr)
> >
>
>
> Cheers, Andreas
On Thu, Dec 12, 2013 at 03:28:14PM -0700, Andreas Dilger wrote:
>
> On Dec 10, 2013, at 6:18 PM, Darrick J. Wong <[email protected]> wrote:
>
> > mke2fs has a series of checks to ensure that we don't create a
> > filesystem too big for its blocksize -- if auto-64bit is on, then it
> > turns on 64bit; otherwise it complains. Unfortunately, it performs
> > these checks before looking in mke2fs.conf for a blocksize, which
> > means that the checks are incorrect if the user specifies a non-4096
> > blocksize in the config file and says nothing on the command line.
> > The bug also has the effect of mandating a 4k block size on any block
> > device larger than 4T in that situation. Therefore, read the block
> > size from the config file before performing the 64bit checks.
> >
> > Reviewed-by: Zheng Liu <[email protected]>
> > Signed-off-by: Darrick J. Wong <[email protected]>
> > ---
> > misc/mke2fs.c | 134 +++++++++++++++++++++++++++++++--------------------------
> > 1 file changed, 72 insertions(+), 62 deletions(-)
> >
> >
> > diff --git a/misc/mke2fs.c b/misc/mke2fs.c
> > index 67c9225..19b6e85 100644
> > --- a/misc/mke2fs.c
> > +++ b/misc/mke2fs.c
> > @@ -1780,15 +1795,67 @@ profile_error:
> > }
> > }
> >
> > + /* Get the hardware sector sizes, if available */
> > + retval = ext2fs_get_device_sectsize(device_name, &lsector_size);
> > + if (retval) {
> > + com_err(program_name, retval,
> > + _("while trying to determine hardware sector size"));
> > + exit(1);
> > + }
> > + retval = ext2fs_get_device_phys_sectsize(device_name, &psector_size);
> > + if (retval) {
> > + com_err(program_name, retval,
> > + _("while trying to determine physical sector size"));
> > + exit(1);
> > + }
> > +
> > + tmp = getenv("MKE2FS_DEVICE_SECTSIZE");
> > + if (tmp != NULL)
> > + lsector_size = atoi(tmp);
> > + tmp = getenv("MKE2FS_DEVICE_PHYS_SECTSIZE");
> > + if (tmp != NULL)
> > + psector_size = atoi(tmp);
> > +
> > + /* Older kernels may not have physical/logical distinction */
> > + if (!psector_size)
> > + psector_size = lsector_size;
> > +
> > + if (blocksize <= 0) {
> > + use_bsize = get_int_from_profile(fs_types, "blocksize", 4096);
> > +
> > + if (use_bsize == -1) {
> > + use_bsize = sys_page_size;
> > + if ((linux_version_code < (2*65536 + 6*256)) &&
>
> Would be nice to have a helper to compute the linux_version_code comparison,
> the above is a bit too much detail for this code.
Hmm, yes, that would be a nice cleanup.
Yikes, there's a test for pre-2.2 kernels elsewhere in mke2fs.c.
How many people still run on Linux 2.0? :D
>
> > + (use_bsize > 4096))
> > + use_bsize = 4096;
> > + }
> > + if (lsector_size && use_bsize < lsector_size)
> > + use_bsize = lsector_size;
> > + if ((blocksize < 0) && (use_bsize < (-blocksize)))
> > + use_bsize = -blocksize;
> > + blocksize = use_bsize;
> > + fs_blocks_count /= (blocksize / 1024);
> > + } else {
> > + if (blocksize < lsector_size) { /* Impossible */
> > + com_err(program_name, EINVAL,
> > + _("while setting blocksize; too small "
> > + "for device\n"));
> > + exit(1);
> > + } else if ((blocksize < psector_size) &&
> > + (psector_size <= sys_page_size)) { /* Suboptimal */
> > + fprintf(stderr, _("Warning: specified blocksize %d is "
> > + "less than device physical sectorsize %d\n"),
> > + blocksize, psector_size);
> > + }
> > + }
> > +
> > + fs_param.s_log_block_size =
> > + int_log2(blocksize >> EXT2_MIN_BLOCK_LOG_SIZE);
>
> Does it make sense to wrap this whole block size guessing dance into a small
> helper routine, like "figure_fs_blocksize()" or similar?
I suppose it could be cut out into its own helper function, to shrink PRS() a
bit. I'd have to pass in pointers to all the variables that it touches
(blocksize, fs_blocks_count, fs_param, psector_size), which makes doing that
less attractive.
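
One possible shape for such a helper, sketched under assumptions: it takes
its inputs by value and only returns the chosen block size, leaving PRS()
to update fs_blocks_count and fs_param itself. guess_blocksize() and its
parameter names are hypothetical, old_kernel stands in for the pre-2.6
version check, and the sector-size validation of an explicit blocksize is
left to the caller.

```c
#include <assert.h>

/*
 * requested:     > 0 explicit size; 0 = unspecified; < 0 = "at least -requested"
 * profile_bsize: value from mke2fs.conf, or -1 meaning "use the page size"
 */
int guess_blocksize(int requested, int profile_bsize, int page_size,
		    int lsector_size, int old_kernel)
{
	int use_bsize;

	assert(page_size > 0);
	if (requested > 0)
		return requested;	/* caller still checks it against sectors */

	use_bsize = profile_bsize;
	if (use_bsize == -1) {
		use_bsize = page_size;
		if (old_kernel && use_bsize > 4096)
			use_bsize = 4096;
	}
	if (lsector_size && use_bsize < lsector_size)
		use_bsize = lsector_size;
	if (requested < 0 && use_bsize < -requested)
		use_bsize = -requested;
	return use_bsize;
}
```

This only moves the guessing logic; whether it is worth the churn in PRS()
is a separate question.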
--D
>
>
> Cheers, Andreas
Refactor the running kernel version checks to hide the details of
version code checking, etc.
Signed-off-by: Darrick J. Wong <[email protected]>
---
misc/mke2fs.c | 40 +++++++++++++++++++++++++---------------
1 file changed, 25 insertions(+), 15 deletions(-)
diff --git a/misc/mke2fs.c b/misc/mke2fs.c
index c1cbcaa..a76ebe6 100644
--- a/misc/mke2fs.c
+++ b/misc/mke2fs.c
@@ -27,6 +27,7 @@
#include <time.h>
#ifdef __linux__
#include <sys/utsname.h>
+#include <linux/version.h>
#endif
#ifdef HAVE_GETOPT_H
#include <getopt.h>
@@ -109,7 +110,6 @@ char **fs_types;
profile_t profile;
int sys_page_size = 4096;
-int linux_version_code = 0;
static void usage(void)
{
@@ -168,7 +168,28 @@ static int parse_version_number(const char *s)
rev = strtol(cp, &endptr, 10);
if (cp == endptr)
return 0;
- return ((((major * 256) + minor) * 256) + rev);
+ return KERNEL_VERSION(major, minor, rev);
+}
+
+static int is_before_linux_ver(unsigned int major, unsigned int minor)
+{
+ struct utsname ut;
+ int linux_version_code = 0;
+
+ if (uname(&ut)) {
+ perror("uname");
+ exit(1);
+ }
+ linux_version_code = parse_version_number(ut.release);
+ if (linux_version_code == 0)
+ return 0;
+
+ return linux_version_code < KERNEL_VERSION(major, minor, 0);
+}
+#else
+static int is_before_linux_ver(unsigned int major, unsigned int minor)
+{
+ return 0;
}
#endif
@@ -1315,9 +1336,6 @@ static void PRS(int argc, char *argv[])
* Finally, we complain about fs_blocks_count > 2^32 on a non-64bit fs.
*/
blk64_t fs_blocks_count = 0;
-#ifdef __linux__
- struct utsname ut;
-#endif
long sysval;
int s_opt = -1, r_opt = -1;
char *fs_features = 0;
@@ -1382,15 +1400,8 @@ profile_error:
memset(&fs_param, 0, sizeof(struct ext2_super_block));
fs_param.s_rev_level = 1; /* Create revision 1 filesystems now */
-#ifdef __linux__
- if (uname(&ut)) {
- perror("uname");
- exit(1);
- }
- linux_version_code = parse_version_number(ut.release);
- if (linux_version_code && linux_version_code < (2*65536 + 2*256))
+ if (is_before_linux_ver(2, 2))
fs_param.s_rev_level = 0;
-#endif
if (argc && *argv) {
program_name = get_progname(*argv);
@@ -1827,8 +1838,7 @@ profile_error:
if (use_bsize == -1) {
use_bsize = sys_page_size;
- if ((linux_version_code < (2*65536 + 6*256)) &&
- (use_bsize > 4096))
+ if (is_before_linux_ver(2, 6) && use_bsize > 4096)
use_bsize = 4096;
}
if (lsector_size && use_bsize < lsector_size)
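
For readers unfamiliar with the version-code packing, a portable sketch of
what the helper above does. DEMO_KERNEL_VERSION is a local macro assumed to
match the classic KERNEL_VERSION() packing (major << 16, minor << 8, rev);
the demo_* functions are illustrative names, and sscanf() is used here
instead of the strtol()-based parser in the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Same packing as the classic KERNEL_VERSION(): major<<16 | minor<<8 | rev. */
#define DEMO_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Parse "major.minor.rev..." from a uname-style release string; 0 on failure. */
int demo_parse_version(const char *release)
{
	int major, minor, rev;

	if (sscanf(release, "%d.%d.%d", &major, &minor, &rev) != 3)
		return 0;
	return DEMO_KERNEL_VERSION(major, minor, rev);
}

/* Nonzero if 'release' is older than major.minor.0; unknown is "not older". */
int demo_is_before(const char *release, int major, int minor)
{
	int code = demo_parse_version(release);

	if (code == 0)
		return 0;
	return code < DEMO_KERNEL_VERSION(major, minor, 0);
}
```

One difference visible in the diff: the old 2.6 test had no guard for
linux_version_code == 0, so an unparsed or non-Linux version used to cap
use_bsize at 4096, whereas the helper returns 0 in that case and the cap
no longer applies.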
On Tue, Dec 10, 2013 at 05:19:03PM -0800, Darrick J. Wong wrote:
> Tweak the wording to be a little less ambiguous, since 'block' can be
> a noun or a verb.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied to the next branch.
- Ted
On Tue, Dec 10, 2013 at 05:19:35PM -0800, Darrick J. Wong wrote:
> metadata_csum implies uninit_bg, and in fact forces the bit off for
> rocompat with older implementations. Therefore, to detect the
> presence of checksums, we should use the predicate function to decide
> if group descriptor checksums are turned on, not open-coded flag
> tests.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied to the next branch.
- Ted
On Tue, Dec 10, 2013 at 05:20:01PM -0800, Darrick J. Wong wrote:
> With the advent of metadata_csum, we now tie extent and directory
> blocks to the associated inode number (and generation). Therefore, we
> must be careful when remapping inodes. At that point in the resize
> process, all the blocks that are going away have been duplicated
> elsewhere in the FS (albeit with checksums based on the old inode
> numbers). If we're moving the inode, then do that and remember that
> new inode number. Now we can update the block mappings for each inode
> with the final inode number, and schedule directory blocks for mass
> inode relocation. We also have to recalculate the EA block checksum.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied to the next branch.
- Ted
It looks like the subject line for this commit was truncated --- I
could try to make something up, but could you suggest something?
Thanks!!
- Ted
On Tue, Dec 10, 2013 at 05:22:20PM -0800, Darrick J. Wong wrote:
> If ext2fs_descriptor_block_loc2() is called with a meta_bg filesystem
> and group_block is not the normal value, the function will return the
> location of the backup group descriptor block in the next block group.
> Unfortunately, it fails to account for the possibility that the backup
> group contains a backup superblock but the regular superblock does
> not. This is the case with block groups 48-49 on a meta_bg fs with 1k
> blocks; in this case, libext2fs will fail to open the filesystem.
>
> Therefore, teach the function to adjust for superblocks in the backup
> group, if necessary.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
> ---
> lib/ext2fs/openfs.c | 19 +++++++++++++++----
> 1 file changed, 15 insertions(+), 4 deletions(-)
>
>
> diff --git a/lib/ext2fs/openfs.c b/lib/ext2fs/openfs.c
> index b2a8abb..92d9e40 100644
> --- a/lib/ext2fs/openfs.c
> +++ b/lib/ext2fs/openfs.c
> @@ -47,7 +47,7 @@ blk64_t ext2fs_descriptor_block_loc2(ext2_filsys fs, blk64_t group_block,
> bg = EXT2_DESC_PER_BLOCK(fs->super) * i;
> if (ext2fs_bg_has_super(fs, bg))
> has_super = 1;
> - ret_blk = ext2fs_group_first_block2(fs, bg) + has_super;
> + ret_blk = ext2fs_group_first_block2(fs, bg);
> /*
> * If group_block is not the normal value, we're trying to use
> * the backup group descriptors and superblock --- so use the
> @@ -57,10 +57,21 @@ blk64_t ext2fs_descriptor_block_loc2(ext2_filsys fs, blk64_t group_block,
> * have the infrastructure in place to do that.
> */
> if (group_block != fs->super->s_first_data_block &&
> - ((ret_blk + fs->super->s_blocks_per_group) <
> - ext2fs_blocks_count(fs->super)))
> + ((ret_blk + has_super + fs->super->s_blocks_per_group) <
> + ext2fs_blocks_count(fs->super))) {
> ret_blk += fs->super->s_blocks_per_group;
> - return ret_blk;
> +
> + /*
> + * If we're going to jump forward a block group, make sure
> + * that we adjust has_super to account for the next group's
> + * backup superblock (or lack thereof).
> + */
> + if (ext2fs_bg_has_super(fs, bg + 1))
> + has_super = 1;
> + else
> + has_super = 0;
> + }
> + return ret_blk + has_super;
> }
>
> blk_t ext2fs_descriptor_block_loc(ext2_filsys fs, blk_t group_block, dgrp_t i)
>
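
A small sketch of the arithmetic behind the 48-49 case. This is a
simplified model, not the libext2fs implementation: demo_bg_has_super()
assumes the plain sparse_super rule (groups 0, 1 and powers of 3, 5, 7),
the end-of-filesystem bounds check from the patch is omitted, and the
numbers in the usage note assume a 1k-block filesystem with
s_first_data_block = 1 and 8192 blocks per group.

```c
#include <assert.h>
#include <stdint.h>

/* 1 if a == b^k for some k >= 1. */
static int is_power_of(uint32_t a, uint32_t b)
{
	uint64_t num = b;

	while (num < a)
		num *= b;
	return num == a;
}

/* Simplified sparse_super rule: groups 0, 1 and powers of 3, 5, 7. */
int demo_bg_has_super(uint32_t group)
{
	if (group <= 1)
		return 1;
	return is_power_of(group, 3) || is_power_of(group, 5) ||
	       is_power_of(group, 7);
}

/*
 * Location of the group descriptor block for the meta_bg whose first
 * group is 'bg'.  'backup' selects the copy in the following group.
 */
uint64_t demo_desc_loc(uint32_t bg, int backup, uint64_t first_data_block,
		       uint64_t blocks_per_group)
{
	uint64_t blk = first_data_block + (uint64_t)bg * blocks_per_group;
	int has_super = demo_bg_has_super(bg);

	if (backup) {
		blk += blocks_per_group;
		/* The fix: use the NEXT group's superblock status. */
		has_super = demo_bg_has_super(bg + 1);
	}
	return blk + has_super;
}
```

Under these assumptions, group 48 has no backup superblock but group 49
(7^2) does, so the backup descriptor lives at block 401410. Reusing group
48's has_super value would give 401409, which is where group 49's backup
superblock sits.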
On Tue, Dec 10, 2013 at 05:18:57PM -0800, Darrick J. Wong wrote:
> The old uninit_bg checksums depend on the UUID, so prohibit changes to
> the UUID if a checksumming filesystem is mounted, because this
> introduces a nasty race where the kernel and tune2fs are both trying
> to rewrite group descriptors at the same time, with different ideas
> about what the UUID is.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
I made the following change to the maint branch, which pulls back some
changes that I had first made in the next branch into maint, and then
merged the maint branch back into the next branch --- and was
impressed how painlessly git handled the merge. :-)
The result was slightly cleaner than what resulted after applying your
original patch to the next branch, since it resulted in:
	if (ext2fs_has_group_desc_csum(fs)) {
		/*
		 * Changing the UUID requires rewriting all metadata,
		 * which can race with a mounted fs. Don't allow that.
		 */
		...
	}
	if (ext2fs_has_group_desc_csum(fs)) {
		/*
		 * Determine if the block group checksums are
		 * correct so we know whether or not to set
		 * them later on.
		 */
		...
	}
- Ted
From 66457fcb842300e757a69c49c2eb4d8e335be34c Mon Sep 17 00:00:00 2001
From: "Darrick J. Wong" <[email protected]>
Date: Sat, 14 Dec 2013 20:51:04 -0500
Subject: [PATCH] tune2fs: forbid changing uuid on an uninit_bg filesystem
The old uninit_bg checksums depend on the UUID, so prohibit changes to
the UUID if a checksumming filesystem is mounted, because this
introduces a nasty race where the kernel and tune2fs are both trying
to rewrite group descriptors at the same time, with different ideas
about what the UUID is.
Signed-off-by: Darrick J. Wong <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>
---
misc/tune2fs.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/misc/tune2fs.c b/misc/tune2fs.c
index 822df74..a8dc111 100644
--- a/misc/tune2fs.c
+++ b/misc/tune2fs.c
@@ -358,6 +358,16 @@ static int update_mntopts(ext2_filsys fs, char *mntopts)
return 0;
}
+static int check_fsck_needed(ext2_filsys fs)
+{
+ if (fs->super->s_state & EXT2_VALID_FS)
+ return 0;
+ printf("\n%s\n", _(please_fsck));
+ if (mount_flags & EXT2_MF_READONLY)
+ printf(_("(and reboot afterwards!)\n"));
+ return 1;
+}
+
static void request_fsck_afterwards(ext2_filsys fs)
{
static int requested = 0;
@@ -2147,6 +2157,19 @@ retry_open:
if (sb->s_feature_ro_compat &
EXT4_FEATURE_RO_COMPAT_GDT_CSUM) {
/*
+ * Changing the UUID requires rewriting all metadata,
+ * which can race with a mounted fs. Don't allow that.
+ */
+ if (mount_flags & EXT2_MF_MOUNTED) {
+ fputs(_("The UUID may only be "
+ "changed when the filesystem is "
+ "unmounted.\n"), stderr);
+ exit(1);
+ }
+ if (check_fsck_needed(fs))
+ exit(1);
+
+ /*
* Determine if the block group checksums are
* correct so we know whether or not to set
* them later on.
--
1.8.5.rc3.362.gdf10213
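
To illustrate why a UUID change invalidates every group descriptor
checksum, here is a toy model. It is only a sketch: demo_crc16() is a
generic bitwise CRC-16 and is not claimed to be the exact routine or
polynomial libext2fs uses, and demo_gd_csum() ignores details such as
byte order and skipping the checksum field itself. The one property it is
meant to show is that the UUID is fed into the checksum first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Plain bitwise CRC-16 (reflected polynomial 0xA001), for illustration only. */
uint16_t demo_crc16(uint16_t crc, const void *buf, size_t len)
{
	const unsigned char *p = buf;

	while (len--) {
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc & 1) ? (crc >> 1) ^ 0xA001 : crc >> 1;
	}
	return crc;
}

/* Checksum over the fs UUID, then the group number, then the descriptor. */
uint16_t demo_gd_csum(const unsigned char uuid[16], uint32_t group,
		      const void *desc, size_t desc_len)
{
	uint16_t crc = demo_crc16(0xFFFF, uuid, 16);

	crc = demo_crc16(crc, &group, sizeof(group));
	return demo_crc16(crc, desc, desc_len);
}
```

If tune2fs and the kernel disagree about the UUID, each computes a
different checksum for the same descriptor, so whichever writes last
leaves descriptors that the other side considers corrupt.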
On Tue, Dec 10, 2013 at 05:22:59PM -0800, Darrick J. Wong wrote:
> When bigalloc is enabled, using ext2fs_block_alloc_stats2() to free
> any block in a cluster has the effect of freeing the entire cluster.
> This is problematic if a caller instructs us to punch, say, blocks
> 12-15 of a 16-block cluster, because blocks 0-11 now point to a "free"
> cluster.
>
> The naive way to solve this problem is to see if any of the other
> blocks in this logical cluster map to a physical cluster. If so, then
> we know that the cluster is still in use and it mustn't be freed.
> Otherwise, we are punching the last mapped block in this cluster, so
> we can free the cluster.
>
> The implementation given only does the rigorous checks for the partial
> clusters at the beginning and end of the punching range.
>
> v2: Refactor the block free code into a separate helper function that
> should be more efficient.
>
> Reviewed-by: Zheng Liu <[email protected]>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
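
The "naive" check described in the patch can be sketched as follows. This
is a simplified model over a flat array, not the libext2fs extent code;
demo_can_free_cluster() and the map layout (0 meaning "unmapped") are
assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * map[i] is the physical block for logical block i, or 0 if unmapped.
 * Returns 1 if logical block 'lblk' is the only mapped block left in its
 * logical cluster, i.e. punching it may also free the physical cluster.
 */
int demo_can_free_cluster(const uint64_t *map, uint64_t nblocks,
			  uint64_t lblk, unsigned int cluster_ratio)
{
	uint64_t first = lblk - (lblk % cluster_ratio);
	uint64_t i;

	for (i = first; i < first + cluster_ratio && i < nblocks; i++) {
		if (i != lblk && map[i] != 0)
			return 0;	/* a neighbour still uses the cluster */
	}
	return 1;
}
```

For the example in the commit message (punching blocks 12-15 of a 16-block
cluster while 0-11 stay mapped), the check returns 0 and the cluster must
stay allocated.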
On Tue, Dec 10, 2013 at 05:23:06PM -0800, Darrick J. Wong wrote:
> When we're appending a block to a directory file or the journal file,
> and the new block is part of a cluster that has already been allocated
> to the file (implied cluster allocation), don't update the bitmap or
> the summary counts because that was performed when the cluster was
> allocated.
>
> Reviewed-by: Zheng Liu <[email protected]>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
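
A sketch of what "implied cluster allocation" means in practice, using the
same flat-array model (0 meaning "unmapped"). demo_implied_alloc() is a
hypothetical name, and the code assumes a block's offset within its
logical cluster equals its offset within the physical cluster.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Look for an already-mapped block in the same logical cluster as 'lblk'.
 * If one exists, the physical block for 'lblk' is implied by the cluster:
 * *pblk is set and no bitmap or summary count needs updating (returns 1).
 * Otherwise returns 0 and the caller must allocate a new cluster.
 */
int demo_implied_alloc(const uint64_t *map, uint64_t nblocks, uint64_t lblk,
		       unsigned int cluster_ratio, uint64_t *pblk)
{
	uint64_t first = lblk - (lblk % cluster_ratio);
	uint64_t i;

	for (i = first; i < first + cluster_ratio && i < nblocks; i++) {
		if (i == lblk || map[i] == 0)
			continue;
		/* physical cluster base + our offset within the cluster */
		*pblk = (map[i] - (i % cluster_ratio)) + (lblk % cluster_ratio);
		return 1;
	}
	return 0;
}
```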
On Tue, Dec 10, 2013 at 05:23:12PM -0800, Darrick J. Wong wrote:
> When the rehash process is running on a bigalloc filesystem, it
> compresses all the directory entries and hash structures into the
> beginning of the directory file and then uses block_iterate3() to free
> the blocks off the end of the file. It seems to call
> ext2fs_block_alloc_stats2() for every block in a cluster, which is
> unfortunate because this function allocates and frees entire clusters
> (and updates the summary counts accordingly). In this case e2fsck
> writes out incorrect summary counts.
>
> Reviewed-by: Zheng Liu <[email protected]>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:23:20PM -0800, Darrick J. Wong wrote:
> If pass5 finds bitmap errors in a range of clusters, don't print each
> cluster number individually when we could print only the start and end
> cluster number. e2fsck already does this for the non-bigalloc case.
>
> Reviewed-by: Zheng Liu <[email protected]>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
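
The range-collapsing idea, sketched outside of e2fsck. This is not the
pass5 code; demo_format_ranges() and its output format are assumptions
chosen for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a sorted list of cluster numbers into 'out', collapsing runs of
 * consecutive numbers into "start-end".  Returns the number of ranges
 * written; stops early if 'out' is too small.
 */
int demo_format_ranges(const unsigned long *nums, int n, char *out, size_t outlen)
{
	int i = 0, ranges = 0;
	size_t used = 0;

	if (outlen)
		out[0] = '\0';
	while (i < n) {
		int j = i;
		int len;

		while (j + 1 < n && nums[j + 1] == nums[j] + 1)
			j++;	/* extend the run */
		if (j > i)
			len = snprintf(out + used, outlen - used, "%s%lu-%lu",
				       ranges ? " " : "", nums[i], nums[j]);
		else
			len = snprintf(out + used, outlen - used, "%s%lu",
				       ranges ? " " : "", nums[i]);
		if (len < 0 || (size_t)len >= outlen - used) {
			if (outlen)
				out[used] = '\0';	/* drop the partial entry */
			break;
		}
		used += (size_t)len;
		ranges++;
		i = j + 1;
	}
	return ranges;
}
```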
On Tue, Dec 10, 2013 at 05:23:27PM -0800, Darrick J. Wong wrote:
> When we're expanding a directory, check to see if we're doing an
> implied cluster allocation; if so, we don't need to allocate a new
> block, and we certainly don't need to update the summary counts.
>
> Reported-by: Zheng Liu <[email protected]>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:23:33PM -0800, Darrick J. Wong wrote:
> When freeing a group's metadata blocks, be careful not to free
> clusters belonging to other groups!
>
> Signed-off-by: Darrick J. Wong <[email protected]>
> ---
> resize/resize2fs.c | 78 +++++++++++++++++++++++++++++++++-------------------
> 1 file changed, 49 insertions(+), 29 deletions(-)
>
>
> diff --git a/resize/resize2fs.c b/resize/resize2fs.c
> index ff5e6a2..49fe986 100644
> --- a/resize/resize2fs.c
> +++ b/resize/resize2fs.c
> @@ -270,40 +270,60 @@ static void fix_uninit_block_bitmaps(ext2_filsys fs)
> * release them in the new filesystem data structure, and mark them as
> * reserved so the old inode table blocks don't get overwritten.
> */
> -static void free_gdp_blocks(ext2_filsys fs,
> - ext2fs_block_bitmap reserve_blocks,
> - ext2_filsys old_fs,
> - dgrp_t group)
> +static errcode_t free_gdp_blocks(ext2_filsys fs,
> + ext2fs_block_bitmap reserve_blocks,
> + ext2_filsys old_fs,
> + dgrp_t group, dgrp_t count)
This function is only used in one place, and "count" is calculated
using values from fs and old_fs.
old_fs->group_desc_count - fs->group_desc_count
Wouldn't it be clearer to do this calculation in free_gdp_blocks, i.e:
dgrp_t count = old_fs->group_desc_count - fs->group_desc_count;
This makes it clearer what's going on, instead of using a generic
parameter name such as "count" which isn't as clear...
- Ted
On Mon, Dec 16, 2013 at 12:01:47AM -0500, Theodore Ts'o wrote:
> On Tue, Dec 10, 2013 at 05:23:33PM -0800, Darrick J. Wong wrote:
> > When freeing a group's metadata blocks, be careful not to free
> > clusters belonging to other groups!
> >
> > Signed-off-by: Darrick J. Wong <[email protected]>
> > ---
> > resize/resize2fs.c | 78 +++++++++++++++++++++++++++++++++-------------------
> > 1 file changed, 49 insertions(+), 29 deletions(-)
> >
> >
> > diff --git a/resize/resize2fs.c b/resize/resize2fs.c
> > index ff5e6a2..49fe986 100644
> > --- a/resize/resize2fs.c
> > +++ b/resize/resize2fs.c
> > @@ -270,40 +270,60 @@ static void fix_uninit_block_bitmaps(ext2_filsys fs)
> > * release them in the new filesystem data structure, and mark them as
> > * reserved so the old inode table blocks don't get overwritten.
> > */
> > -static void free_gdp_blocks(ext2_filsys fs,
> > - ext2fs_block_bitmap reserve_blocks,
> > - ext2_filsys old_fs,
> > - dgrp_t group)
> > +static errcode_t free_gdp_blocks(ext2_filsys fs,
> > + ext2fs_block_bitmap reserve_blocks,
> > + ext2_filsys old_fs,
> > + dgrp_t group, dgrp_t count)
>
> This function is only used in one place, and "count" is calculated
> using values from fs and old_fs.
>
> old_fs->group_desc_count - fs->group_desc_count
>
> Wouldn't it be clearer to do this calculation in free_gdp_blocks, i.e:
>
> dgrp_t count = old_fs->group_desc_count - fs->group_desc_count;
>
> This makes it clearer what's going on, instead of using a generic
> parameter name such as "count" which isn't as clear...
Yep, that makes sense. How about something like this?
---
resize2fs: during shrink, don't free in-use bg data clusters
When freeing a group's metadata blocks, be careful not to free
clusters belonging to other groups!
v2: Remove unnecessary parameters, per Ted's suggestion.
Signed-off-by: Darrick J. Wong <[email protected]>
---
resize/resize2fs.c | 77 ++++++++++++++++++++++++++++++++--------------------
1 file changed, 48 insertions(+), 29 deletions(-)
diff --git a/resize/resize2fs.c b/resize/resize2fs.c
index ff5e6a2..fce5a70 100644
--- a/resize/resize2fs.c
+++ b/resize/resize2fs.c
@@ -270,40 +270,61 @@ static void fix_uninit_block_bitmaps(ext2_filsys fs)
* release them in the new filesystem data structure, and mark them as
* reserved so the old inode table blocks don't get overwritten.
*/
-static void free_gdp_blocks(ext2_filsys fs,
- ext2fs_block_bitmap reserve_blocks,
- ext2_filsys old_fs,
- dgrp_t group)
+static errcode_t free_gdp_blocks(ext2_filsys fs,
+ ext2fs_block_bitmap reserve_blocks,
+ ext2_filsys old_fs,
+ dgrp_t group)
{
blk64_t blk;
int j;
+ dgrp_t i;
+ ext2fs_block_bitmap bg_map = NULL;
+ errcode_t retval = 0;
+ dgrp_t count = old_fs->group_desc_count - fs->group_desc_count;
+
+ /* If bigalloc, don't free metadata living in the same cluster */
+ if (EXT2FS_CLUSTER_RATIO(fs) > 1) {
+ retval = ext2fs_allocate_block_bitmap(fs, "bgdata", &bg_map);
+ if (retval)
+ goto out;
- blk = ext2fs_block_bitmap_loc(old_fs, group);
- if (blk &&
- (blk < ext2fs_blocks_count(fs->super))) {
- ext2fs_block_alloc_stats2(fs, blk, -1);
- ext2fs_mark_block_bitmap2(reserve_blocks, blk);
+ retval = mark_table_blocks(fs, bg_map);
+ if (retval)
+ goto out;
}
- blk = ext2fs_inode_bitmap_loc(old_fs, group);
- if (blk &&
- (blk < ext2fs_blocks_count(fs->super))) {
- ext2fs_block_alloc_stats2(fs, blk, -1);
- ext2fs_mark_block_bitmap2(reserve_blocks, blk);
- }
+ for (i = group; i < group + count; i++) {
+ blk = ext2fs_block_bitmap_loc(old_fs, i);
+ if (blk &&
+ (blk < ext2fs_blocks_count(fs->super)) &&
+ !(bg_map && ext2fs_test_block_bitmap2(bg_map, blk))) {
+ ext2fs_block_alloc_stats2(fs, blk, -1);
+ ext2fs_mark_block_bitmap2(reserve_blocks, blk);
+ }
- blk = ext2fs_inode_table_loc(old_fs, group);
- if (blk == 0 ||
- (blk >= ext2fs_blocks_count(fs->super)))
- return;
+ blk = ext2fs_inode_bitmap_loc(old_fs, i);
+ if (blk &&
+ (blk < ext2fs_blocks_count(fs->super)) &&
+ !(bg_map && ext2fs_test_block_bitmap2(bg_map, blk))) {
+ ext2fs_block_alloc_stats2(fs, blk, -1);
+ ext2fs_mark_block_bitmap2(reserve_blocks, blk);
+ }
- for (j = 0;
- j < fs->inode_blocks_per_group; j++, blk++) {
- if (blk >= ext2fs_blocks_count(fs->super))
- break;
- ext2fs_block_alloc_stats2(fs, blk, -1);
- ext2fs_mark_block_bitmap2(reserve_blocks, blk);
+ blk = ext2fs_inode_table_loc(old_fs, i);
+ for (j = 0;
+ j < fs->inode_blocks_per_group; j++, blk++) {
+ if (blk >= ext2fs_blocks_count(fs->super) ||
+ (bg_map && ext2fs_test_block_bitmap2(bg_map, blk)))
+ continue;
+ ext2fs_block_alloc_stats2(fs, blk, -1);
+ ext2fs_mark_block_bitmap2(reserve_blocks, blk);
+ }
}
+
+out:
+ if (bg_map)
+ ext2fs_free_block_bitmap(bg_map);
+ return retval;
}
/*
@@ -467,10 +488,8 @@ retry:
* Check the block groups that we are chopping off
* and free any blocks associated with their metadata
*/
- for (i = fs->group_desc_count;
- i < old_fs->group_desc_count; i++)
- free_gdp_blocks(fs, reserve_blocks, old_fs, i);
- retval = 0;
+ retval = free_gdp_blocks(fs, reserve_blocks, old_fs,
+ fs->group_desc_count);
goto errout;
}
On 12/10/13, 7:20 PM, Darrick J. Wong wrote:
> Signed-off-by: Darrick J. Wong <[email protected]>
> ---
> e2fsck/journal.c | 4 +++-
> e2fsck/pass3.c | 5 +++--
> e2fsck/profile.c | 2 ++
> e2fsck/unix.c | 2 ++
> 4 files changed, 10 insertions(+), 3 deletions(-)
>
>
> diff --git a/e2fsck/journal.c b/e2fsck/journal.c
> index e3f80bc..22f06e7 100644
> --- a/e2fsck/journal.c
> +++ b/e2fsck/journal.c
> @@ -1139,8 +1139,10 @@ int e2fsck_fix_ext3_journal_hint(e2fsck_t ctx)
> if (!journal_name)
> return 0;
>
> - if (stat(journal_name, &st) < 0)
> + if (stat(journal_name, &st) < 0) {
> + free(journal_name);
> return 0;
> + }
>
> if (st.st_rdev != sb->s_journal_dev) {
> clear_problem_context(&pctx);
> diff --git a/e2fsck/pass3.c b/e2fsck/pass3.c
> index fbaadcf..6989f17 100644
> --- a/e2fsck/pass3.c
> +++ b/e2fsck/pass3.c
> @@ -53,7 +53,7 @@ static ext2fs_inode_bitmap inode_done_map = 0;
> void e2fsck_pass3(e2fsck_t ctx)
> {
> ext2_filsys fs = ctx->fs;
> - struct dir_info_iter *iter;
> + struct dir_info_iter *iter = NULL;
> #ifdef RESOURCE_TRACK
> struct resource_track rtrack;
> #endif
> @@ -108,7 +108,6 @@ void e2fsck_pass3(e2fsck_t ctx)
> if (check_directory(ctx, dir->ino, &pctx))
> goto abort_exit;
> }
> - e2fsck_dir_info_iter_end(ctx, iter);
>
> /*
> * Force the creation of /lost+found if not present
> @@ -123,6 +122,8 @@ void e2fsck_pass3(e2fsck_t ctx)
> e2fsck_rehash_directories(ctx);
>
> abort_exit:
> + if (iter)
> + e2fsck_dir_info_iter_end(ctx, iter);
> e2fsck_free_dir_info(ctx);
> if (inode_loop_detect) {
> ext2fs_free_inode_bitmap(inode_loop_detect);
> diff --git a/e2fsck/profile.c b/e2fsck/profile.c
> index 019c6f5..92aa893 100644
> --- a/e2fsck/profile.c
> +++ b/e2fsck/profile.c
> @@ -318,6 +318,8 @@ profile_init(const char **files, profile_t *ret_profile)
> /* if the filenames list is not specified return an empty profile */
> if ( files ) {
> for (fs = files; !PROFILE_LAST_FILESPEC(*fs); fs++) {
> + if (array)
> + free_list(array);
> retval = get_dirlist(*fs, &array);
> if (retval == 0) {
> if (!array)
Coverity didn't quite like this. You free it, but then it's later tested,
so we get double frees and such. Need to assign it to NULL after freeing.
Darrick I think you're on the scan project right, so you can take a look,
CID 1138576.
-Eric
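
A minimal sketch of the free-then-NULL pattern being asked for. The names
are made up: demo_load() stands in for a loader like get_dirlist() and is
assumed to leave its output untouched on failure, which is the situation
that turns a stale pointer into a double free.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical loader: fails on odd 'id' and leaves *out untouched. */
int demo_load(int id, int **out)
{
	if (id % 2)
		return -1;
	*out = malloc(sizeof(**out));
	if (!*out)
		return -1;
	**out = id;
	return 0;
}

/* Runs 'n' iterations; returns how many times a buffer was freed. */
int demo_loop(int n)
{
	int *array = NULL;
	int frees = 0;

	for (int id = 0; id < n; id++) {
		if (array) {
			free(array);
			array = NULL;	/* without this, a failed load leaves a
					 * dangling pointer that is freed again */
			frees++;
		}
		if (demo_load(id, &array))
			continue;
	}
	if (array) {
		free(array);
		frees++;
	}
	return frees;
}
```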
> diff --git a/e2fsck/unix.c b/e2fsck/unix.c
> index a6c8d25..7a8fce2 100644
> --- a/e2fsck/unix.c
> +++ b/e2fsck/unix.c
> @@ -869,6 +869,8 @@ static errcode_t PRS(int argc, char *argv[], e2fsck_t *ret_ctx)
> case 'L':
> replace_bad_blocks++;
> case 'l':
> + if (bad_blocks_file)
> + free(bad_blocks_file);
> bad_blocks_file = string_copy(ctx, optarg, 0);
> break;
> case 'd':
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
On 12/10/13, 7:21 PM, Darrick J. Wong wrote:
> Fix up a few places where we ignore return values.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
> ---
> lib/ext2fs/flushb.c | 2 +-
> lib/ext2fs/icount.c | 2 ++
> lib/ext2fs/imager.c | 7 ++++++-
> lib/ext2fs/mkjournal.c | 4 +++-
> lib/ext2fs/punch.c | 7 +++++++
> 5 files changed, 19 insertions(+), 3 deletions(-)
>
>
> diff --git a/lib/ext2fs/flushb.c b/lib/ext2fs/flushb.c
> index ac8923c..98821fc 100644
> --- a/lib/ext2fs/flushb.c
> +++ b/lib/ext2fs/flushb.c
> @@ -70,7 +70,7 @@ errcode_t ext2fs_sync_device(int fd, int flushb)
> #warning BLKFLSBUF not defined
> #endif
> #ifdef FDFLUSH
> - ioctl (fd, FDFLUSH, 0); /* In case this is a floppy */
> + return ioctl(fd, FDFLUSH, 0); /* In case this is a floppy */
> #elif defined(__linux__)
> #warning FDFLUSH not defined
> #endif
> diff --git a/lib/ext2fs/icount.c b/lib/ext2fs/icount.c
> index 84b74a9..c5ebf74 100644
> --- a/lib/ext2fs/icount.c
> +++ b/lib/ext2fs/icount.c
> @@ -193,6 +193,8 @@ errcode_t ext2fs_create_icount_tdb(ext2_filsys fs, char *tdb_dir,
> uuid_unparse(fs->super->s_uuid, uuid);
> sprintf(fn, "%s/%s-icount-XXXXXX", tdb_dir, uuid);
> fd = mkstemp(fn);
> + if (fd < 0)
> + return fd;
Turns out this leaks "fn" (Coverity spotted this, CID 1138575).
Thanks,
-Eric
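A hedged sketch of one way to avoid this kind of leak: route every failure through a single exit label that frees the name buffer. This is not the icount.c code; create_scratch_file() and the "/icount-XXXXXX" suffix are illustrative assumptions. It also returns errno rather than the raw -1 from mkstemp(), which is an assumption about what a caller expecting an error code would want.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for the icount setup: returns 0 or an errno value. */
static int create_scratch_file(const char *dir, int *ret_fd)
{
	static const char suffix[] = "/icount-XXXXXX";
	char *fn;
	int fd, retval = 0;

	/* sizeof(suffix) already counts the terminating NUL */
	fn = malloc(strlen(dir) + sizeof(suffix));
	if (!fn)
		return ENOMEM;
	sprintf(fn, "%s%s", dir, suffix);

	fd = mkstemp(fn);
	if (fd < 0) {
		retval = errno;	/* report errno, not the raw -1 */
		goto errout;	/* fn is still freed below */
	}
	unlink(fn);		/* the open descriptor keeps the file alive */
	*ret_fd = fd;
errout:
	free(fn);
	return retval;
}
```

The same label is the natural place to release any other object allocated earlier in the function, which is the situation with icount noted in the follow-up.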
>
> /*
> * This is an overestimate of the size that we will need; the
> diff --git a/lib/ext2fs/imager.c b/lib/ext2fs/imager.c
> index 7f3b25b..378a3c8 100644
> --- a/lib/ext2fs/imager.c
> +++ b/lib/ext2fs/imager.c
> @@ -66,6 +66,7 @@ errcode_t ext2fs_image_inode_write(ext2_filsys fs, int fd, int flags)
> blk64_t blk;
> ssize_t actual;
> errcode_t retval;
> + off_t r;
>
> buf = malloc(fs->blocksize * BUF_BLOCKS);
> if (!buf)
> @@ -97,7 +98,11 @@ errcode_t ext2fs_image_inode_write(ext2_filsys fs, int fd, int flags)
> blk++;
> left--;
> cp += fs->blocksize;
> - lseek(fd, fs->blocksize, SEEK_CUR);
> + r = lseek(fd, fs->blocksize, SEEK_CUR);
> + if (r < 0) {
> + retval = errno;
> + goto errout;
> + }
> continue;
> }
> /* Find non-zero blocks */
> diff --git a/lib/ext2fs/mkjournal.c b/lib/ext2fs/mkjournal.c
> index 2afd3b7..1d5b1a7 100644
> --- a/lib/ext2fs/mkjournal.c
> +++ b/lib/ext2fs/mkjournal.c
> @@ -520,8 +520,10 @@ errcode_t ext2fs_add_journal_inode(ext2_filsys fs, blk_t num_blocks, int flags)
> #if HAVE_EXT2_IOCTLS
> fd = open(jfile, O_RDONLY);
> if (fd >= 0) {
> - ioctl(fd, EXT2_IOC_SETFLAGS, &f);
> + retval = ioctl(fd, EXT2_IOC_SETFLAGS, &f);
> close(fd);
> + if (retval)
> + return retval;
> }
> #endif
> #endif
> diff --git a/lib/ext2fs/punch.c b/lib/ext2fs/punch.c
> index 790a0ad8..ceec336 100644
> --- a/lib/ext2fs/punch.c
> +++ b/lib/ext2fs/punch.c
> @@ -192,6 +192,13 @@ static errcode_t ext2fs_punch_extent(ext2_filsys fs, ext2_ino_t ino,
> retval = ext2fs_extent_open2(fs, ino, inode, &handle);
> if (retval)
> return retval;
> + /*
> + * Find the extent closest to the start of the punch range. We don't
> + * check the return value because _goto() sets the current node to the
> + * next-lowest extent if 'start' is in a hole, and doesn't set a
> + * current node if there was a real error reading the extent tree.
> + * In that case, _get() will error out.
> + */
> ext2fs_extent_goto(handle, start);
> retval = ext2fs_extent_get(handle, EXT2_EXTENT_CURRENT, &extent);
> if (retval)
>
On 12/17/13, 10:57 AM, Eric Sandeen wrote:
> On 12/10/13, 7:21 PM, Darrick J. Wong wrote:
>> Fix up a few places where we ignore return values.
>>
>> Signed-off-by: Darrick J. Wong <[email protected]>
>> ---
>> lib/ext2fs/flushb.c | 2 +-
>> lib/ext2fs/icount.c | 2 ++
>> lib/ext2fs/imager.c | 7 ++++++-
>> lib/ext2fs/mkjournal.c | 4 +++-
>> lib/ext2fs/punch.c | 7 +++++++
>> 5 files changed, 19 insertions(+), 3 deletions(-)
>>
>>
>> diff --git a/lib/ext2fs/flushb.c b/lib/ext2fs/flushb.c
>> index ac8923c..98821fc 100644
>> --- a/lib/ext2fs/flushb.c
>> +++ b/lib/ext2fs/flushb.c
>> @@ -70,7 +70,7 @@ errcode_t ext2fs_sync_device(int fd, int flushb)
>> #warning BLKFLSBUF not defined
>> #endif
>> #ifdef FDFLUSH
>> - ioctl (fd, FDFLUSH, 0); /* In case this is a floppy */
>> + return ioctl(fd, FDFLUSH, 0); /* In case this is a floppy */
>> #elif defined(__linux__)
>> #warning FDFLUSH not defined
>> #endif
>> diff --git a/lib/ext2fs/icount.c b/lib/ext2fs/icount.c
>> index 84b74a9..c5ebf74 100644
>> --- a/lib/ext2fs/icount.c
>> +++ b/lib/ext2fs/icount.c
>> @@ -193,6 +193,8 @@ errcode_t ext2fs_create_icount_tdb(ext2_filsys fs, char *tdb_dir,
>> uuid_unparse(fs->super->s_uuid, uuid);
>> sprintf(fn, "%s/%s-icount-XXXXXX", tdb_dir, uuid);
>> fd = mkstemp(fn);
>> + if (fd < 0)
>> + return fd;
>
> Turns out this leaks "fn" (coverity spotted this, CID 1138575)
oops, and icount as well.
-Eric
> Thanks,
> -Eric
>
>>
>> /*
>> * This is an overestimate of the size that we will need; the
>> diff --git a/lib/ext2fs/imager.c b/lib/ext2fs/imager.c
>> index 7f3b25b..378a3c8 100644
>> --- a/lib/ext2fs/imager.c
>> +++ b/lib/ext2fs/imager.c
>> @@ -66,6 +66,7 @@ errcode_t ext2fs_image_inode_write(ext2_filsys fs, int fd, int flags)
>> blk64_t blk;
>> ssize_t actual;
>> errcode_t retval;
>> + off_t r;
>>
>> buf = malloc(fs->blocksize * BUF_BLOCKS);
>> if (!buf)
>> @@ -97,7 +98,11 @@ errcode_t ext2fs_image_inode_write(ext2_filsys fs, int fd, int flags)
>> blk++;
>> left--;
>> cp += fs->blocksize;
>> - lseek(fd, fs->blocksize, SEEK_CUR);
>> + r = lseek(fd, fs->blocksize, SEEK_CUR);
>> + if (r < 0) {
>> + retval = errno;
>> + goto errout;
>> + }
>> continue;
>> }
>> /* Find non-zero blocks */
>> diff --git a/lib/ext2fs/mkjournal.c b/lib/ext2fs/mkjournal.c
>> index 2afd3b7..1d5b1a7 100644
>> --- a/lib/ext2fs/mkjournal.c
>> +++ b/lib/ext2fs/mkjournal.c
>> @@ -520,8 +520,10 @@ errcode_t ext2fs_add_journal_inode(ext2_filsys fs, blk_t num_blocks, int flags)
>> #if HAVE_EXT2_IOCTLS
>> fd = open(jfile, O_RDONLY);
>> if (fd >= 0) {
>> - ioctl(fd, EXT2_IOC_SETFLAGS, &f);
>> + retval = ioctl(fd, EXT2_IOC_SETFLAGS, &f);
>> close(fd);
>> + if (retval)
>> + return retval;
>> }
>> #endif
>> #endif
>> diff --git a/lib/ext2fs/punch.c b/lib/ext2fs/punch.c
>> index 790a0ad8..ceec336 100644
>> --- a/lib/ext2fs/punch.c
>> +++ b/lib/ext2fs/punch.c
>> @@ -192,6 +192,13 @@ static errcode_t ext2fs_punch_extent(ext2_filsys fs, ext2_ino_t ino,
>> retval = ext2fs_extent_open2(fs, ino, inode, &handle);
>> if (retval)
>> return retval;
>> + /*
>> + * Find the extent closest to the start of the punch range. We don't
>> + * check the return value because _goto() sets the current node to the
>> + * next-lowest extent if 'start' is in a hole, and doesn't set a
>> + * current node if there was a real error reading the extent tree.
>> + * In that case, _get() will error out.
>> + */
>> ext2fs_extent_goto(handle, start);
>> retval = ext2fs_extent_get(handle, EXT2_EXTENT_CURRENT, &extent);
>> if (retval)
>>
>
On 12/10/13, 7:19 PM, Darrick J. Wong wrote:
> debugfs should use strtoull wrappers for reading block numbers from
> the command line. "unsigned long" isn't wide enough to handle block
> numbers on 32bit platforms.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
> ---
> debugfs/debugfs.c | 33 ++++++++++++++++++++++-----------
> debugfs/extent_inode.c | 22 +++++++++-------------
> debugfs/util.c | 2 +-
> 3 files changed, 32 insertions(+), 25 deletions(-)
>
>
> diff --git a/debugfs/debugfs.c b/debugfs/debugfs.c
> index c5f8a1f..578d577 100644
> --- a/debugfs/debugfs.c
> +++ b/debugfs/debugfs.c
> @@ -181,8 +181,7 @@ void do_open_filesys(int argc, char **argv)
> return;
> break;
> case 's':
> - superblock = parse_ulong(optarg, argv[0],
> - "superblock number", &err);
> + err = strtoblk(argv[0], optarg, &superblock);
> if (err)
> return;
> break;
> @@ -278,14 +277,17 @@ void do_init_filesys(int argc, char **argv)
> struct ext2_super_block param;
> errcode_t retval;
> int err;
> + blk64_t blocks;
>
> if (common_args_process(argc, argv, 3, 3, "initialize",
> "<device> <blocks>", CHECK_FS_NOTOPEN))
> return;
>
> memset(&param, 0, sizeof(struct ext2_super_block));
> - ext2fs_blocks_count_set(&param, parse_ulong(argv[2], argv[0],
> - "blocks count", &err));
> + err = strtoblk(argv[0], argv[2], &blocks);
> + if (err)
> + return;
> + ext2fs_blocks_count_set(&param, blocks);
> if (err)
> return;
err cannot be nonzero here; is the second "if (err)" extraneous?
CID 1138573
Thanks,
-Eric
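For context on why the patch moves away from parse_ulong(): on a 32-bit platform "unsigned long" is 32 bits wide, so a block number of 2^32 or more cannot be represented. The sketch below shows a strtoull()-based parser that always yields a 64-bit value. It is an illustration only, not debugfs's strtoblk(); the name parse_block_number() and its return convention are assumptions.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical parser: returns 0 on success and stores the value in *ret,
 * or returns nonzero on empty input, trailing junk, or overflow.
 * Base 0 lets the caller write decimal, octal (0...) or hex (0x...).
 */
static int parse_block_number(const char *str, uint64_t *ret)
{
	char *end;
	unsigned long long val;

	if (!str || *str == '\0' || *str == '-')
		return 1;
	errno = 0;
	val = strtoull(str, &end, 0);
	if (errno || end == str || *end != '\0')
		return 1;
	*ret = val;
	return 0;
}
```

Because the error is reported through the return value, the caller needs exactly one check immediately after the call.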
On 12/10/13, 7:21 PM, Darrick J. Wong wrote:
> Fix memory allocation calculations and check for NULL pointer returns.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
> ---
> lib/ss/invocation.c | 5 +++++
> lib/ss/parse.c | 4 ++++
> lib/ss/request_tbl.c | 2 +-
> 3 files changed, 10 insertions(+), 1 deletion(-)
>
>
> diff --git a/lib/ss/invocation.c b/lib/ss/invocation.c
> index a711050..08b66f2 100644
> --- a/lib/ss/invocation.c
> +++ b/lib/ss/invocation.c
> @@ -20,6 +20,7 @@
> #ifdef HAVE_DLOPEN
> #include <dlfcn.h>
> #endif
> +#include <errno.h>
>
> int ss_create_invocation(subsystem_name, version_string, info_ptr,
> request_table_ptr, code_ptr)
> @@ -46,6 +47,10 @@ int ss_create_invocation(subsystem_name, version_string, info_ptr,
> ;
> table = (ss_data **) realloc((char *)table,
> ((unsigned)sci_idx+2)*size);
> + if (table == NULL) {
> + *code_ptr = errno;
> + return 0;
> + }
According to Coverity CID 295143, this leaks "new_table".
Just a free() before the return would suffice, I think.
Thanks,
-Eric
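A minimal sketch of the cleanup being suggested, under stated assumptions: grow_table() and its "scratch" parameter are hypothetical and stand in for the table and the earlier allocation that would otherwise be orphaned. The realloc() result goes into a temporary, so the old table stays valid on failure, and the earlier allocation is freed before the error return.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical: grow *tablep to hold new_count pointers.  On failure the
 * old table is left untouched, and "scratch" (an earlier allocation that
 * nothing else references yet) is released before returning.
 */
static int grow_table(void ***tablep, size_t new_count, void *scratch)
{
	void **tmp = realloc(*tablep, new_count * sizeof(void *));

	if (!tmp) {
		free(scratch);
		return ENOMEM;
	}
	*tablep = tmp;
	return 0;
}
```

Returning ENOMEM directly is an assumption made to keep the sketch simple; it avoids depending on whether the C library sets errno when realloc() fails.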
On Tue, Dec 17, 2013 at 11:04:15AM -0600, Eric Sandeen wrote:
> On 12/10/13, 7:21 PM, Darrick J. Wong wrote:
> > Fix memory allocation calculations and check for NULL pointer returns.
> >
> > Signed-off-by: Darrick J. Wong <[email protected]>
> > ---
> > lib/ss/invocation.c | 5 +++++
> > lib/ss/parse.c | 4 ++++
> > lib/ss/request_tbl.c | 2 +-
> > 3 files changed, 10 insertions(+), 1 deletion(-)
> >
> >
> > diff --git a/lib/ss/invocation.c b/lib/ss/invocation.c
> > index a711050..08b66f2 100644
> > --- a/lib/ss/invocation.c
> > +++ b/lib/ss/invocation.c
> > @@ -20,6 +20,7 @@
> > #ifdef HAVE_DLOPEN
> > #include <dlfcn.h>
> > #endif
> > +#include <errno.h>
> >
> > int ss_create_invocation(subsystem_name, version_string, info_ptr,
> > request_table_ptr, code_ptr)
> > @@ -46,6 +47,10 @@ int ss_create_invocation(subsystem_name, version_string, info_ptr,
> > ;
> > table = (ss_data **) realloc((char *)table,
> > ((unsigned)sci_idx+2)*size);
> > + if (table == NULL) {
> > + *code_ptr = errno;
> > + return 0;
> > + }
>
> According to coverity CID 295143, this leaks "new_table"
>
> Just a free() before return would suffice I think.
All right, I'll send out a couple of cleanup patches in a bit.
--D
>
> Thanks,
> -Eric
>
Hi,
>Now that we've trained libext2fs to have this same behavior whenever
>it's loading a block bitmap, we no longer need to unset BLOCK_UNINIT
>for a group that contains only its own group metadata -- kernel,
>e2fsck, and e2fsprogs will handle this correctly.
It seems to me that the problem (*) I reported is not fixed
after applying your patches 38-41. Do we need an extra patch for this?
(*)
http://marc.info/?l=linux-ext4&m=138242796915518&w=2
Regards,
Akira Fujita
>From: Darrick J. Wong [mailto:[email protected]]
>Sent: Wednesday, December 11, 2013 10:23 AM
>To: [email protected]; [email protected]
>Cc: Akira Fujita; [email protected]
>Subject: [PATCH 40/74] libext2fs: no need to clear BLOCK_UNINIT during ext2fs_reserve_super_and_bgd
>
>Since the beginning of the uninit_bg feature, the kernel[1] and
>e2fsck[2] have always been careful to detect the presence of the
>BLOCK_UNINIT flag, and compute a block bitmap with any group metadata
>blocks marked in that bitmap. With that in mind, I think it's safe to
>say that this is a design feature of uninit_bg.
>
>Now that we've trained libext2fs to have this same behavior whenever
>it's loading a block bitmap, we no longer need to unset BLOCK_UNINIT
>for a group that contains only its own group metadata -- kernel,
>e2fsck, and e2fsprogs will handle this correctly.
>
>[1] kernel git 717d50e4971b81b96c0199c91cdf0039a8cb181a
> "Ext4: Uninitialized Block Groups"
>[2] e2fsprogs git f5fa20078bfc05b554294fe9c5505375d7913e8c
> "Add support for EXT2_FEATURE_COMPAT_LAZY_BG"
>
>Reported-by: Akira Fujita <[email protected]>
>Signed-off-by: Darrick J. Wong <[email protected]>
>---
> lib/ext2fs/alloc_sb.c | 2 --
> 1 file changed, 2 deletions(-)
>
>
>diff --git a/lib/ext2fs/alloc_sb.c b/lib/ext2fs/alloc_sb.c
>index 223ec51..8788c00 100644
>--- a/lib/ext2fs/alloc_sb.c
>+++ b/lib/ext2fs/alloc_sb.c
>@@ -65,8 +65,6 @@ int ext2fs_reserve_super_and_bgd(ext2_filsys fs,
> ext2fs_mark_block_bitmap2(bmap, 0);
>
> if (old_desc_blk) {
>- if (fs->super->s_reserved_gdt_blocks && fs->block_map == bmap)
>- ext2fs_bg_flags_clear(fs, group, EXT2_BG_BLOCK_UNINIT);
> num_blocks = old_desc_blocks;
> if (old_desc_blk + num_blocks >= ext2fs_blocks_count(fs->super))
> num_blocks = ext2fs_blocks_count(fs->super) -
On Tue, Dec 10, 2013 at 05:22:26PM -0800, Darrick J. Wong wrote:
> On a filesystem with 1K blocks and meta_bg enabled, opening a
> filesystem with automatic superblock detection tries to compensate for
> the fact that the superblock lives in block 1. However, the method by
> which this is done is later misinterpreted to mean "read the backup
> group descriptors", which is not what we want in this case.
>
> Therefore, in ext2fs_open3() separate the 'group zero' adjustment into
> its own variable so that we don't get fed backup group descriptors
> when we try to load meta_bg group descriptors.
>
> Furthermore, enhance ext2fs_descriptor_block_loc2() to perform its own
> group zero correction. The other caller of this function neglects to
> do any group-zero correction of their own, so this fixes them too.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied to the maint branch.
- Ted
On Tue, Dec 10, 2013 at 05:22:33PM -0800, Darrick J. Wong wrote:
> The kernel[1] and e2fsck[2] both react to a BLOCK_UNINIT group by
> calculating the block bitmap that's needed to show all the group
> blocks for that group (if any) and using that. However, when reading
> bitmaps from disk, libext2fs simply imports a block of zeroes into the
> bitmap, without bothering to check for group blocks. This erroneous
> behavior results in the filesystem having a block bitmap that does not
> accurately reflect disk contents, and worse yet makes it seem as
> though superblocks, group descriptors, bitmaps, and inode tables are
> "free" space on disk.
>
> So, fix the block bitmap loading routines to calculate the correct
> block bitmap for all groups and load it into the main fs block bitmap....
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:22:39PM -0800, Darrick J. Wong wrote:
> Since libext2fs now detects a BLOCK_UNINIT group and calculates the
> group's block bitmap, we no longer need to emulate this behavior in
> e2fsck. We can simply compare the found block map against the
> filesystem's, and proceed from there.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:22:46PM -0800, Darrick J. Wong wrote:
> Since the beginning of the uninit_bg feature, the kernel[1] and
> e2fsck[2] have always been careful to detect the presence of the
> BLOCK_UNINIT flag, and compute a block bitmap with any group metadata
> blocks marked in that bitmap. With that in mind, I think it's safe to
> say that this is a design feature of uninit_bg.
>
> Now that we've trained libext2fs to have this same behavior whenever
> it's loading a block bitmap, we no longer need to unset BLOCK_UNINIT
> for a group that contains only its own group metadata -- kernel,
> e2fsck, and e2fsprogs will handle this correctly.
>
> [1] kernel git 717d50e4971b81b96c0199c91cdf0039a8cb181a
> "Ext4: Uninitialized Block Groups"
> [2] e2fsprogs git f5fa20078bfc05b554294fe9c5505375d7913e8c
> "Add support for EXT2_FEATURE_COMPAT_LAZY_BG"
>
> Reported-by: Akira Fujita <[email protected]>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:22:52PM -0800, Darrick J. Wong wrote:
> Now that libext2fs marks group metadata in the fs block bitmap, adjust
> the expected test output to reflect expanded use of block_uninit and
> the fact debugfs no longer prints block bitmap data that fails to
> account for group data blocks.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:23:53PM -0800, Darrick J. Wong wrote:
> @@ -336,6 +370,12 @@ errcode_t ext2fs_bmap2(ext2_filsys fs, ext2_ino_t ino, struct ext2_inode *inode,
> goto done;
> }
>
> + if ((bmap_flags & BMAP_SET) && (bmap_flags & BMAP_UNINIT)) {
> + retval = zero_block(fs, *phys_blk);
> + if (retval)
> + goto done;
> + }
> +
We should use a new flag (say, BMAP_ZERO) if we want ext2fs_bmap2() to
zero out the data block. Otherwise, a number of tools which are
currently using ext2fs_bmap, or debugfs "write" command to copy files
into a file system will end up doing double writes into the file
system --- once to zero the block, and a second time to write data
into said block.
The libext2fs library is designed to be used for low-level tools, so
we shouldn't presume that we should force blocks to be zero'ed unless
the application really wants it.
The other thing to note about this patch is that if you want to
implement fallocate, ext2fs_bmap2() is really the wrong tool to use.
I've been working on a program for work which pre-creates a bunch of
large files allocated contiguously on the disk as part of the mke2fs
process, and it turns out that if you try to allocate several
gigabytes worth of files using ext2fs_bmap2(), you end up burning a
huge amount of CPU time (as in around 30 seconds of CPU time while
fallocating 10GB worth of blocks; so if you try to allocate a
terabyte or three worth of blocks, it would take a truly long time,
while you turn your CPU into a space heater :-).
The top profile user was update_path() in fs/ext4/extents.c, which is
caused by the very large number of extent operations that are needed
for each extent operation. The second largest profile user is
ext2fs_crc16(), caused by the large number of calls to
ext2fs_block_alloc_stats2(), which causes the block group
descriptors to get incremented one at a time.
What we need to do if we want to create an optimized fallocate() is to
allocate blocks until we either exceed the max number of blocks in an
extent, or we get a non-contiguous allocation, and then insert the
extent into the extent tree one extent at a time. Similarly, we need to
update the block group descriptors in batched chunks, instead of after
each individual block allocation.
Similarly, as far as calling zero_block(), you really don't want to
issue each 4k write separately.
Cheers,
- Ted
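The batching idea in the message above can be sketched as follows: instead of touching the extent tree once per block, first coalesce physically contiguous blocks into runs, then insert one extent per run. This is an illustration only; struct extent_run and coalesce_blocks() are hypothetical names, and the 32768-block cap is an assumed stand-in for the limit on blocks in an initialized extent.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_EXTENT_LEN 32768u	/* assumed cap on blocks per extent */

struct extent_run {
	uint64_t lblk;	/* first logical block of the run */
	uint64_t pblk;	/* first physical block of the run */
	uint32_t len;	/* number of blocks in the run */
};

/*
 * Coalesce n physical blocks, backing logical blocks 0..n-1, into runs.
 * A run ends when the next block is not physically contiguous or the
 * run has reached MAX_EXTENT_LEN.  Writes at most max_runs entries and
 * returns the number written.
 */
static size_t coalesce_blocks(const uint64_t *pblks, size_t n,
			      struct extent_run *runs, size_t max_runs)
{
	size_t i, nr = 0;

	for (i = 0; i < n; i++) {
		if (nr > 0) {
			struct extent_run *last = &runs[nr - 1];

			if (last->pblk + last->len == pblks[i] &&
			    last->len < MAX_EXTENT_LEN) {
				last->len++;
				continue;
			}
		}
		if (nr == max_runs)
			break;
		runs[nr].lblk = i;
		runs[nr].pblk = pblks[i];
		runs[nr].len = 1;
		nr++;
	}
	return nr;
}
```

With this shape, the number of extent insertions scales with the number of runs rather than the number of blocks, which is the cost the profile above points at.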
On Thu, Dec 12, 2013 at 11:39:38PM -0500, Theodore Ts'o wrote:
> It looks like the subject line for this commit was truncated --- I
> could try to make something up, but could you suggest something?
I have no recollection if I ever sent a reply to this, but the subject line
should be:
"libext2fs: detect superblock when jumping ahead trying to read gdt"
--D
>
> Thanks!!
>
> - Ted
>
>
> On Tue, Dec 10, 2013 at 05:22:20PM -0800, Darrick J. Wong wrote:
> > If ext2fs_descriptor_block_loc2() is called with a meta_bg filesystem
> > and group_block is not the normal value, the function will return the
> > location of the backup group descriptor block in the next block group.
> > Unfortunately, it fails to account for the possibility that the backup
> > group contains a backup superblock but the regular superblock does
> > not. This is the case with block groups 48-49 on a meta_bg fs with 1k
> > blocks; in this case, libext2fs will fail to open the filesystem.
> >
> > Therefore, teach the function to adjust for superblocks in the backup
> > group, if necessary.
> >
> > Signed-off-by: Darrick J. Wong <[email protected]>
> > ---
> > lib/ext2fs/openfs.c | 19 +++++++++++++++----
> > 1 file changed, 15 insertions(+), 4 deletions(-)
> >
> >
> > diff --git a/lib/ext2fs/openfs.c b/lib/ext2fs/openfs.c
> > index b2a8abb..92d9e40 100644
> > --- a/lib/ext2fs/openfs.c
> > +++ b/lib/ext2fs/openfs.c
> > @@ -47,7 +47,7 @@ blk64_t ext2fs_descriptor_block_loc2(ext2_filsys fs, blk64_t group_block,
> > bg = EXT2_DESC_PER_BLOCK(fs->super) * i;
> > if (ext2fs_bg_has_super(fs, bg))
> > has_super = 1;
> > - ret_blk = ext2fs_group_first_block2(fs, bg) + has_super;
> > + ret_blk = ext2fs_group_first_block2(fs, bg);
> > /*
> > * If group_block is not the normal value, we're trying to use
> > * the backup group descriptors and superblock --- so use the
> > @@ -57,10 +57,21 @@ blk64_t ext2fs_descriptor_block_loc2(ext2_filsys fs, blk64_t group_block,
> > * have the infrastructure in place to do that.
> > */
> > if (group_block != fs->super->s_first_data_block &&
> > - ((ret_blk + fs->super->s_blocks_per_group) <
> > - ext2fs_blocks_count(fs->super)))
> > + ((ret_blk + has_super + fs->super->s_blocks_per_group) <
> > + ext2fs_blocks_count(fs->super))) {
> > ret_blk += fs->super->s_blocks_per_group;
> > - return ret_blk;
> > +
> > + /*
> > + * If we're going to jump forward a block group, make sure
> > + * that we adjust has_super to account for the next group's
> > + * backup superblock (or lack thereof).
> > + */
> > + if (ext2fs_bg_has_super(fs, bg + 1))
> > + has_super = 1;
> > + else
> > + has_super = 0;
> > + }
> > + return ret_blk + has_super;
> > }
> >
> > blk_t ext2fs_descriptor_block_loc(ext2_filsys fs, blk_t group_block, dgrp_t i)
> >
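To make the "block groups 48-49" case in the patch description concrete: assuming the classic sparse_super layout, backup superblocks live in groups 0 and 1 and in groups that are powers of 3, 5 or 7. Group 49 is 7^2 and so carries a backup superblock, while group 48 does not. The sketch below is a standalone illustration of that rule, not libext2fs's ext2fs_bg_has_super(); both function names are hypothetical.

```c
#include <stdint.h>

/* Returns 1 if n equals base^k for some k >= 1, else 0. */
static int is_power_of(uint32_t n, uint32_t base)
{
	while (n > base) {
		if (n % base)
			return 0;
		n /= base;
	}
	return n == base;
}

/*
 * Hypothetical model of the classic sparse_super rule: groups 0 and 1,
 * plus powers of 3, 5 and 7, hold a superblock copy.
 */
static int group_has_backup_super(uint32_t group)
{
	if (group <= 1)
		return 1;
	return is_power_of(group, 3) || is_power_of(group, 5) ||
	       is_power_of(group, 7);
}
```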
On Sat, Jan 11, 2014 at 05:57:55PM -0500, Theodore Ts'o wrote:
> On Tue, Dec 10, 2013 at 05:23:53PM -0800, Darrick J. Wong wrote:
> > @@ -336,6 +370,12 @@ errcode_t ext2fs_bmap2(ext2_filsys fs, ext2_ino_t ino, struct ext2_inode *inode,
> > goto done;
> > }
> >
> > + if ((bmap_flags & BMAP_SET) && (bmap_flags & BMAP_UNINIT)) {
> > + retval = zero_block(fs, *phys_blk);
> > + if (retval)
> > + goto done;
> > + }
> > +
>
> We should use a new flag (say, BMAP_ZERO) if we want ext2fs_bmap2() to
> zero out the data block. Otherwise, a number of tools which are
> currently using ext2fs_bmap, or debugfs "write" command to copy files
> into a file system will end up doing double writes into the file
> system --- once to zero the block, and a second time to write data
> into said block.
Ok, I'll create a BMAP_ZERO to do this.
> The libext2fs library is designed to be used for low-level tools, so
> we shouldn't presume that we should force blocks to be zero'ed unless
> the application really wants it.
>
> The other thing to note about this patch is that if you want to
> implement fallocate, ext2fs_bmap2() is really the wrong tool to use.
> I've been working on a program for work which pre-creates a bunch of
I think that ext2fs_fallocate would be a good addition to the library. Is your
program far enough along to share? fuse2fs would benefit greatly.
That said, I've also found a couple of bugs in the extent code by implementing
fallocate in such a stupid way. :) It turns out that if (a) we need to split
an extent into three pieces (say we write to a block in the middle of an
unwritten extent and don't want to convert the whole extent) and (b) either of
the extent_insert calls requires us to split the extent block and (c) we ENOSPC
while trying to allocate a new extent block, we don't put the extent tree back
the way it was before the split, and all the blocks after that point are lost.
I will send patches to avoid this corruption by checking for enough space soon.
I think your local git tree has patches in it that aren't on kernel.org yet, so
I'll hold off until I see them show up.
Fortunately there are only 5 new patches since last month. :)
> large files allocated contiguously on the disk as part of the mke2fs
> process, and it turns out that if you try to allocate several
> gigabytes worth of files using ext2fs_bmap2(), you end up burning a
> huge amount of CPU time (as in around 30 seconds of CPU time while
> fallocating 10GB worth of blocks; so if you try to allocate a
> terabyte or three worth of blocks, it would take a truly long time,
> while you turn your CPU into a space heater :-).
>
> The top profile user was update_path() in fs/ext4/extents.c, which is
> caused by the very large number of extent operations that are needed
> for each extent operation. The second largest profile user is
> ext2fs_crc16(), caused by the large number of calls to
> ext2fs_block_alloc_stats2(), which causes the block group
> descriptors to get incremented one at a time.
>
> What we need to do if we want to create an optimized fallocate() is to
> allocate blocks until we either exceed the max number of blocks in an
> extent, or we get a non-contiguous allocation, and then insert the
> extent into the extent tree one extent at a time. Similarly, we need to
> update the block group descriptors in batched chunks, instead of after
> each individual block allocation.
>
> Similarly, as far as calling zero_block(), you really don't want to
> issue each 4k write separately.
Alternately, we could simply not allow BMAP_UNINIT for non-extent files.
That's the only reason why there's any zeroing going on at all.
--D
>
> Cheers,
>
> - Ted
On Wed, Jan 15, 2014 at 01:11:22PM -0800, Darrick J. Wong wrote:
> > The other thing to note about this patch is that if you want to
> > implement fallocate, ext2fs_bmap2() is really the wrong tool to use.
> > I've been working on a program for work which pre-creates a bunch of
>
> I think that ext2fs_fallocate would be a good addition to the library. Is your
> program far enough along to share? fuse2fs would benefit greatly.
An ext2fs_fallocate() is way more difficult than what I've done, since
you have to deal with all sorts of corner cases where the file has
pre-existing sparse extents, which may or may not be initialized, and
making sure that it works in that case. Allocating blocks to a file
which you know started as a zero length file is in fact much easier.
Here are the key bits from my program:
/*
 * This should eventually be cleaned up and put into libext2fs.
 * This is much faster than calling ext2fs_block_alloc_stats() for each
 * block, since that requires recalculating the bg descriptor checksum
 * for every single block that you allocate.
 */
static void ext2fs_block_alloc_stats_range(ext2_filsys fs, blk64_t blk,
					   blk_t num, int inuse)
{
	int group;

#ifndef OMIT_COM_ERR
	if (blk + num >= ext2fs_blocks_count(fs->super)) {
		com_err("ext2fs_block_alloc_stats_range", 0,
			"Illegal block range: %llu (%u) ",
			(unsigned long long) blk, num);
		return;
	}
#endif
	if (inuse == 0)
		return;
	if (inuse > 0) {
		ext2fs_mark_block_bitmap_range2(fs->block_map, blk, num);
		inuse = 1;
	} else {
		ext2fs_unmark_block_bitmap_range2(fs->block_map, blk, num);
		inuse = -1;
	}
	while (num) {
		group = ext2fs_group_of_blk2(fs, blk);
		blk64_t last_blk = ext2fs_group_last_block2(fs, group);
		blk_t n = num;

		if (blk + num > last_blk)
			n = last_blk - blk + 1;
		ext2fs_bg_free_blocks_count_set(fs, group,
			ext2fs_bg_free_blocks_count(fs, group) -
			inuse*n/EXT2FS_CLUSTER_RATIO(fs));
		ext2fs_bg_flags_clear(fs, group, EXT2_BG_BLOCK_UNINIT);
		ext2fs_group_desc_csum_set(fs, group);
		ext2fs_free_blocks_count_add(fs->super, -inuse * n);
		ext2fs_mark_super_dirty(fs);
		ext2fs_mark_bb_dirty(fs);
		blk += n;
		num -= n;
	}
}

/*
 * ext2fs_allocate_tables() is not optimally allocating blocks in all
 * situations. We need to take a look at this at some point. For
 * now, just replace it with something simple and stupid.
 */
errcode_t my_allocate_tables(ext2_filsys fs)
{
	errcode_t retval;
	dgrp_t i;

	for (i = 0; i < fs->group_desc_count; i++) {
		retval = ext2fs_new_block2(fs, goal, NULL, &goal);
		if (retval)
			return retval;
		ext2fs_block_alloc_stats2(fs, goal, +1);
		ext2fs_block_bitmap_loc_set(fs, i, goal);
	}
	for (i = 0; i < fs->group_desc_count; i++) {
		retval = ext2fs_new_block2(fs, goal, NULL, &goal);
		if (retval)
			return retval;
		ext2fs_block_alloc_stats2(fs, goal, +1);
		ext2fs_inode_bitmap_loc_set(fs, i, goal);
	}
	for (i = 0; i < fs->group_desc_count; i++) {
		blk64_t end = ext2fs_blocks_count(fs->super) - 1;

		retval = ext2fs_get_free_blocks2(fs, goal, end,
						 fs->inode_blocks_per_group,
						 fs->block_map, &goal);
		if (retval)
			return retval;
		ext2fs_block_alloc_stats_range(fs, goal,
					       fs->inode_blocks_per_group, +1);
		ext2fs_inode_table_loc_set(fs, i, goal);
	}
	return 0;
}

/*
 * Some of this could eventually get turned into fallocate, but that's
 * actually a much more difficult and tricky thing to implement.
 */
static errcode_t mk_hugefile(ext2_filsys fs, unsigned int num,
			     ext2_ino_t dir, int idx, ext2_ino_t *ino)
{
	errcode_t retval;
	blk64_t lblk, blk, bend;
	__u64 size;
	unsigned int i;
	struct ext2_inode inode;
	ext2_extent_handle_t handle;
	char fn[32];

	retval = ext2fs_new_inode(fs, 0, LINUX_S_IFREG, NULL, ino);
	if (retval)
		return retval;
	memset(&inode, 0, sizeof(struct ext2_inode));
	inode.i_mode = LINUX_S_IFREG | 0600;
	ext2fs_iblk_set(fs, &inode, num / EXT2FS_CLUSTER_RATIO(fs));
	size = (__u64) num * fs->blocksize;
	inode.i_size = size & 0xffffffff;
	inode.i_size_high = (size >> 32);
	inode.i_links_count = 1;
	retval = ext2fs_write_new_inode(fs, *ino, &inode);
	if (retval)
		return retval;
	ext2fs_inode_alloc_stats2(fs, *ino, +1, 0);
	retval = ext2fs_extent_open2(fs, *ino, &inode, &handle);
	if (retval)
		return retval;
	{
		struct ext2_inode t;

		ext2fs_read_inode(fs, *ino, &t);
		printf("eo: i_size_high: %lu size: %llu\n", t.i_size_high,
		       EXT2_I_SIZE(&t));
	}
	lblk = 0;
	while (num) {
		blk64_t pblk, end;
		blk_t n = num;

		retval = ext2fs_find_first_zero_block_bitmap2(fs->block_map,
			goal, ext2fs_blocks_count(fs->super) - 1, &end);
		if (retval)
			return ENOSPC;
		goal = end;
		retval = ext2fs_find_first_set_block_bitmap2(fs->block_map, goal,
			ext2fs_blocks_count(fs->super) - 1, &bend);
		/* ENOENT comes back in retval, not bend: no set bit was found */
		if (retval == ENOENT)
			bend = ext2fs_blocks_count(fs->super);
		if (bend - goal < num)
			n = bend - goal;
		printf("goal %llu bend %llu num %u n %u\n", goal, bend, num, n);
		pblk = goal;
		num -= n;
		goal += n;
		ext2fs_block_alloc_stats_range(fs, pblk, n, +1);
		while (n) {
			blk_t l = n;
			struct ext2fs_extent newextent;

			{
				struct ext2_inode t;

				ext2fs_read_inode(fs, *ino, &t);
				printf("i_size_high: %lu size: %llu\n", t.i_size_high,
				       EXT2_I_SIZE(&t));
			}
			if (l > EXT_INIT_MAX_LEN)
				l = EXT_INIT_MAX_LEN;
			newextent.e_len = l;
			newextent.e_pblk = pblk;
			newextent.e_lblk = lblk;
			newextent.e_flags = 0;
			printf("inserting extent: %llu %llu %u\n", lblk, pblk, l);
			retval = ext2fs_extent_insert(handle,
					EXT2_EXTENT_INSERT_AFTER, &newextent);
			if (retval)
				return retval;
			pblk += l;
			lblk += l;
			n -= l;
		}
	}
	{
		struct ext2_inode t;

		ext2fs_read_inode(fs, *ino, &t);
		printf("i_size_high: %lu size: %llu\n", t.i_size_high,
		       EXT2_I_SIZE(&t));
	}
	sprintf(fn, "hugefile%05d", idx);
retry:
	retval = ext2fs_link(fs, dir, fn, *ino, EXT2_FT_REG_FILE);
	if (retval == EXT2_ET_DIR_NO_SPACE) {
		retval = ext2fs_expand_dir(fs, dir);
		if (retval)
			goto errout;
		goto retry;
	}
	if (retval)
		goto errout;
errout:
	if (handle)
		ext2fs_extent_free(handle);
	return retval;
}

Note that this requires some of the test patches I've been sending
out, since it uses ext2fs_find_first_{set,zero}_block_bitmap2().
There are also some bugs in the versions which I sent out; I'm working
on fixing them....
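
The run-finding idea in the code above can be shown without libext2fs. What follows is a minimal sketch, not the library's implementation: toy_bitmap, toy_find_first(), toy_collect_runs() and toy_run are hypothetical names, and the bitmap is one byte per block rather than a real ext2fs bitmap. It looks for the first free block at or after the goal, then for the next in-use block, and treats everything in between as a single run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a block bitmap: one byte per block, nonzero = in use. */
struct toy_bitmap {
	const uint8_t *used;
	uint64_t nblocks;
};

/* A contiguous run of free blocks. */
struct toy_run {
	uint64_t start;
	uint64_t len;
};

/* Return the first block >= start in the wanted state, or nblocks if there is none. */
uint64_t toy_find_first(const struct toy_bitmap *bm, uint64_t start, int want_used)
{
	uint64_t b;

	for (b = start; b < bm->nblocks; b++)
		if ((bm->used[b] != 0) == (want_used != 0))
			return b;
	return bm->nblocks;
}

/*
 * Walk forward from goal, collecting free runs until num blocks are covered.
 * Returns the number of runs stored in out[], or -1 if the bitmap runs out
 * of free blocks or out[] is too small.
 */
int toy_collect_runs(const struct toy_bitmap *bm, uint64_t goal, uint64_t num,
		     struct toy_run *out, size_t max_runs)
{
	size_t n = 0;

	assert(bm != NULL && out != NULL);
	while (num > 0) {
		uint64_t start = toy_find_first(bm, goal, 0);
		uint64_t end, len;

		if (start == bm->nblocks || n == max_runs)
			return -1;
		/* The run ends at the next in-use block (or the end of the bitmap). */
		end = toy_find_first(bm, start, 1);
		len = end - start;
		if (len > num)
			len = num;
		out[n].start = start;
		out[n].len = len;
		n++;
		num -= len;
		goal = start + len;
	}
	return (int) n;
}
```

Each run would then be marked allocated in one call and chopped into extents no longer than the maximum extent length, as the inner loop in the real code does.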
> That said, I've also found a couple of bugs in the extent code by implementing
> fallocate in such a stupid way. :) It turns out that if (a) we need to split
> an extent into three pieces (say we write to a block in the middle of an
> unwritten extent and don't want to convert the whole extent) and (b) either of
> the extent_insert calls requires us to split the extent block and (c) we ENOSPC
> while trying to allocate a new extent block, we don't put the extent tree back
> the way it was before the split, and all the blocks after that point are lost.
Well, I found a bug in ext2fs_extent_insert() which showed up when I
tried to implement the block allocation in an intelligent way. :-)
I'll send out that bug fix in a bit.
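
For reference, the three-way split described in the quoted text can be sketched as below. This is only an illustration under assumed names (toy_extent and toy_split_extent() are hypothetical, not libext2fs structures or calls). It computes the pieces without touching any tree, which is one way to know how many inserts are coming before modifying anything.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified extent record. */
struct toy_extent {
	uint64_t lblk;	/* first logical block */
	uint64_t pblk;	/* first physical block */
	uint32_t len;	/* number of blocks */
	int uninit;	/* nonzero = unwritten */
};

/*
 * Split ex around the written range [lblk, lblk + len), which must lie
 * inside ex.  Fills out[] with 1 to 3 pieces in logical order and returns
 * how many were produced.  Only the middle piece is marked written.
 */
int toy_split_extent(const struct toy_extent *ex, uint64_t lblk, uint32_t len,
		     struct toy_extent out[3])
{
	uint64_t ex_end = ex->lblk + ex->len;
	uint64_t end = lblk + len;
	int n = 0;

	assert(len > 0 && lblk >= ex->lblk && end <= ex_end);

	if (lblk > ex->lblk) {		/* left remainder stays unwritten */
		out[n].lblk = ex->lblk;
		out[n].pblk = ex->pblk;
		out[n].len = (uint32_t) (lblk - ex->lblk);
		out[n].uninit = ex->uninit;
		n++;
	}

	out[n].lblk = lblk;		/* the part being written */
	out[n].pblk = ex->pblk + (lblk - ex->lblk);
	out[n].len = len;
	out[n].uninit = 0;
	n++;

	if (end < ex_end) {		/* right remainder stays unwritten */
		out[n].lblk = end;
		out[n].pblk = ex->pblk + (end - ex->lblk);
		out[n].len = (uint32_t) (ex_end - end);
		out[n].uninit = ex->uninit;
		n++;
	}
	return n;
}
```

A return value of 3 means two extra extents must be inserted, so a caller could check for room (or for a free block for a new extent block) before making any change.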
> I will send patches to avoid this corruption by checking for enough space soon.
> I think your local git tree has patches in it that aren't on kernel.org yet, so
> I'll hold off until I see them show up.
Yeah, some of those patches still need some clean up, so I haven't
pushed my maint branch to kernel.org yet.
But anyway, the above code will give you an idea where I'm going ---
this is **way** faster than trying to allocate blocks using the
set_bmap() function. :-)
- Ted
On Wed, Jan 15, 2014 at 05:19:51PM -0500, Theodore Ts'o wrote:
> {
> struct ext2_inode t;
>
> ext2fs_read_inode(fs, *ino, &t);
> printf("i_size_high: %lu size: %llu\n", t.i_size_high,
> EXT2_I_SIZE(&t));
> }
Oops, ignore this debugging code. This was to find a bug in
ext2fs_extent_insert() where under some circumstances it ends up
running into a buffer overrun bug which causes it to wipe out
i_size_high.
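
To show why a wiped i_size_high is so visible on a huge file, here is a minimal sketch of the size being split across two 32-bit fields, as the original code does with i_size and i_size_high. The struct and function names (toy_inode_size, toy_set_size(), toy_get_size()) are hypothetical stand-ins, not the real inode layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-field layout mirroring i_size / i_size_high. */
struct toy_inode_size {
	uint32_t size_lo;	/* low 32 bits of the file size */
	uint32_t size_hi;	/* high 32 bits of the file size */
};

/* Store a 64-bit size into the two halves. */
void toy_set_size(struct toy_inode_size *in, uint64_t size)
{
	assert(in != 0);
	in->size_lo = (uint32_t) (size & 0xffffffffULL);
	in->size_hi = (uint32_t) (size >> 32);
}

/* Recombine the two halves into the 64-bit size. */
uint64_t toy_get_size(const struct toy_inode_size *in)
{
	return ((uint64_t) in->size_hi << 32) | in->size_lo;
}
```

With a 5 GiB file the high half is 1 and the low half is 0x40000000, so zeroing the high half makes the file look like 1 GiB.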
- Ted
On Mon, Dec 16, 2013 at 12:10:56PM -0800, Darrick J. Wong wrote:
> Yep, that makes sense. How about something like this?
>
> ---
> resize2fs: during shrink, don't free in-use bg data clusters
>
> When freeing a group's metadata blocks, be careful not to free
> clusters belonging to other groups!
>
> v2: Remove unnecessary parameters, per Ted's suggestion.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
On Tue, Dec 10, 2013 at 05:23:40PM -0800, Darrick J. Wong wrote:
> When we're moving blocks around the filesystem, ensure that freeing
> the old blocks only frees the clusters if they're not in use by other
> metadata.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, applied.
- Ted
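
The check behind the two bigalloc patches above can be sketched as follows. This is an assumption-laden illustration, not the resize2fs code: toy_cluster_in_use() is a hypothetical helper, the cluster size is given as a power-of-two shift, and the in-use metadata blocks are passed as a plain array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return nonzero if any block in in_use[] shares a cluster with blk.
 * cluster_bits is log2(blocks per cluster); a caller freeing blk would
 * release the cluster only when this returns 0.
 */
int toy_cluster_in_use(uint64_t blk, unsigned cluster_bits,
		       const uint64_t *in_use, size_t n)
{
	size_t i;

	assert(cluster_bits < 64);
	for (i = 0; i < n; i++)
		if ((in_use[i] >> cluster_bits) == (blk >> cluster_bits))
			return 1;
	return 0;
}
```

With 16 blocks per cluster, blocks 32 and 35 share cluster 2, so freeing block 32 must not release that cluster while block 35 still holds metadata.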
On Tue, Dec 10, 2013 at 05:24:26PM -0800, Darrick J. Wong wrote:
> Add functions to allow clients to get, set, and remove extended
> attributes from any file. It also supports modifying EAs living in
> i_file_acl.
>
> v2: Put the header declarations in the correct part of ext2fs.h,
> provide a function to release an EA block from an inode, and check
> i_extra_isize to make sure we actually have space for in-inode EAs.
>
> v3: Add system.richacl prefix support, and only allow the new
> ext2fs_xattr_* functions to run if we have either ext_attr or
> inline_data set. Fix some memory leaks and stack disclosure problems,
> and an accounting problem when freeing an EA block.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Applied, thanks.
- Ted
On Tue, Dec 10, 2013 at 05:24:32PM -0800, Darrick J. Wong wrote:
> A few tweaks to the extended attribute editing APIs:
>
> * Use size_t, not unsigned int, in the new extended attribute editing
> API.
>
> * Don't expose the _expand() call since there should be no external
> users.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Applied, thanks.
- Ted
On Tue, Dec 10, 2013 at 05:24:39PM -0800, Darrick J. Wong wrote:
> Add another API to query the number of extended attributes.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Applied, thanks.
- Ted
On Tue, Dec 10, 2013 at 05:24:45PM -0800, Darrick J. Wong wrote:
> Before loading extended attributes, free any key/value pairs that
> might already be associated with the file.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Applied, thanks.
- Ted
On Tue, Dec 10, 2013 at 05:24:52PM -0800, Darrick J. Wong wrote:
> Use the new extended attribute APIs to display all extended attributes
> (current code does not look in the EA block) and display full names
> (current code ignores name index too).
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Applied, thanks.
- Ted