2008-03-24 22:13:44

by Eric Sandeen

[permalink] [raw]
Subject: [PATCH e2fsprogs-next] Fix extent flag validity tests in pass1 on big endian boxes.

Extent data is shared with the i_block[] space in the inode,
but it is always swapped on access, not when the inode is read.

In e2fsck/pass1.c we must be careful when checking validity
of the extents flag on the inode. If the flag was set when
the inode was read & swapped, then the extents data itself
(in ->i_block[]) was NOT swapped, so testing for a valid
extent header requires some swapping first. Then, if we
ultimately set the extents flag, all of i_block[] must be
re/un-swapped.

This passes the f_extent regression test on both ppc & x86.

Signed-off-by: Eric Sandeen <[email protected]>
---
(I said I wanted to make it not-ugly... not sure I succeeded)
(also feel free to edit the comments if too verbose)

e2fsck/pass1.c | 52 +++++++++++++++++++++++++++++++++++++++++++++++++----
1 files changed, 48 insertions(+), 4 deletions(-)

diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
index 79e9f23..0d6307d 100644
--- a/e2fsck/pass1.c
+++ b/e2fsck/pass1.c
@@ -680,7 +680,22 @@ void e2fsck_pass1(e2fsck_t ctx)
return;
}
}
-
+
+ /*
+ * Test for incorrect extent flag settings.
+ *
+ * On big-endian machines we must be careful:
+ * When the inode is read, the i_block array is not swapped
+ * if the extent flag is set. Therefore if we are testing
+ * for or fixing a wrongly-set flag, we must potentially
+ * (un)swap before testing, or after fixing.
+ */
+
+ /*
+ * In this case the extents flag was set when read, so
+ * extent_header_verify is ok. If the inode is cleared,
+ * no need to swap... so no extra swapping here.
+ */
if ((inode->i_flags & EXT4_EXTENTS_FL) && !extent_fs &&
(inode->i_links_count || (ino == EXT2_BAD_INO) ||
(ino == EXT2_ROOT_INO) || (ino == EXT2_JOURNAL_INO))) {
@@ -700,19 +716,48 @@ void e2fsck_pass1(e2fsck_t ctx)
}
}

+ /*
+ * For big-endian machines:
+ * If the inode didn't have the extents flag set when it
+ * was read, then the i_blocks array was swapped. To test
+ * as an extents header, we must swap it back first.
+ * IF we then set the extents flag, the entire i_block
+ * array must be un/re-swapped to make it proper extents data.
+ */
+
if (extent_fs && !(inode->i_flags & EXT4_EXTENTS_FL) &&
(inode->i_links_count || (ino == EXT2_BAD_INO) ||
(ino == EXT2_ROOT_INO) || (ino == EXT2_JOURNAL_INO)) &&
(LINUX_S_ISREG(inode->i_mode) ||
- LINUX_S_ISDIR(inode->i_mode)) &&
- (ext2fs_extent_header_verify(inode->i_block,
- sizeof(inode->i_block)) == 0)) {
+ LINUX_S_ISDIR(inode->i_mode))) {
+ void *ehp;
+#ifdef WORDS_BIGENDIAN
+ __u32 tmp_block[3];
+
+ for (i = 0; i < 3; i++)
+ tmp_block[i] = ext2fs_swab32(inode->i_block[i]);
+ ehp = tmp_block;
+#else
+ ehp = inode->i_block;
+#endif
+ if (ext2fs_extent_header_verify(ehp,
+ sizeof(inode->i_block)) == 0) {
if (fix_problem(ctx, PR_1_UNSET_EXTENT_FL, &pctx)) {
inode->i_flags |= EXT4_EXTENTS_FL;
+#ifdef WORDS_BIGENDIAN
+ for (i = 0; i < EXT2_N_BLOCKS; i++)
+ inode->i_block[i] =
+ ext2fs_swab32(inode->i_block[i]);
+#endif
e2fsck_write_inode(ctx, ino, inode, "pass1");
}
+ }
}

+ /*
+ * ext2fs_inode_has_valid_blocks does not actually look
+ * at i_block[] values, so not endian-sensitive here.
+ */
if (extent_fs && (inode->i_flags & EXT4_EXTENTS_FL) &&
LINUX_S_ISLNK(inode->i_mode) &&
!ext2fs_inode_has_valid_blocks(inode) &&
-- 1.5.4.1



2008-03-25 00:33:03

by Andreas Dilger

[permalink] [raw]
Subject: Re: [PATCH e2fsprogs-next] Fix extent flag validity tests in pass1 on big endian boxes.

On Mar 24, 2008 17:13 -0500, Eric Sandeen wrote:
> Extent data is shared with the i_block[] space in the inode,
> but it is always swapped on access, not when the inode is read.
>
> In e2fsck/pass1.c we must be careful when checking validity
> of the extents flag on the inode. If the flag was set when
> the inode was read & swapped, then the extents data itself
> (in ->i_block[]) was NOT swapped, so testing for a valid
> extent header requires some swapping first. Then, if we
> ultimately set the extents flag, all of i_block[] must be
> re/un-swapped.

This seems pretty awkward for any other users of the library. Having the
i_block[] array NOT be swabbed if it is an extent file means that every
place in the code which is accessing this array also needs to do the
swabbing itself. This would break the abstraction that the in-memory
inode is in host-endian order, and also forces every application to
understand the difference between extent- and non-extent-mapped inodes,
and the on-disk byte order. Ugh.

IMHO, it would be better to swab the i_block[] array in
ext2fs_swap_inode_full() if the EXTENTS_FL is set, and in the rare
case of e2fsck needing to clear that flag then it should un-swap (if
needed), clear the flag, and re-swap (if needed). This will very rarely
happen, I think. Note that the Lustre extents patches did NOT do the
swap+clear_swap operation when clearing the extents flags because we
don't have any big-endian server systems (our PPC testing is limited to
userspace "make check"), though I think that is the right thing to do.


Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


2008-03-25 00:56:26

by Eric Sandeen

[permalink] [raw]
Subject: Re: [PATCH e2fsprogs-next] Fix extent flag validity tests in pass1 on big endian boxes.

Andreas Dilger wrote:
> On Mar 24, 2008 17:13 -0500, Eric Sandeen wrote:
>> Extent data is shared with the i_block[] space in the inode,
>> but it is always swapped on access, not when the inode is read.
>>
>> In e2fsck/pass1.c we must be careful when checking validity
>> of the extents flag on the inode. If the flag was set when
>> the inode was read & swapped, then the extents data itself
>> (in ->i_block[]) was NOT swapped, so testing for a valid
>> extent header requires some swapping first. Then, if we
>> ultimately set the extents flag, all of i_block[] must be
>> re/un-swapped.
>
> This seems pretty awkward for any other users of the library. Having the
> i_block[] array NOT be swabbed if it is an extent file means that every
> place in the code which is accessing this array also needs to do the
> swabbing itself. This would break the abstraction that the in-memory
> inode is in host-endian order, and also forces every application to
> understand the difference between extent- and non-extent-mapped inodes,
> and the on-disk byte order. Ugh.

Well, I tend to agree, but I thought that's how it was implemented
throughout the library... as well as in the kernel?

if (unlikely(le16_to_cpu(eh->eh_depth) != depth)) {
error_msg = "unexpected eh_depth";
goto corrupted;
}

...

static inline unsigned short ext_depth(struct inode *inode)
{
return le16_to_cpu(ext_inode_hdr(inode)->eh_depth);
}

etc... or am I missing something silly...

-Eric

> IMHO, it would be better to swab the i_block[] array in
> ext2fs_swap_inode_full() if the EXTENTS_FL is set, and in the rare
> case of e2fsck needing to clear that flag then it should un-swap (if
> needed), clear the flag, and re-swap (if needed). This will very rarely
> happen, I think. Note that the Lustre extents patches did NOT do the
> swap+clear_swap operation when clearing the extents flags because we
> don't have any big-endian server systems (our PPC testing is limited to
> userspace "make check"), though I think that is the right thing to do.
>
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
>


2008-03-25 02:20:19

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH e2fsprogs-next] Fix extent flag validity tests in pass1 on big endian boxes.

On Mon, Mar 24, 2008 at 06:32:51PM -0600, Andreas Dilger wrote:
> On Mar 24, 2008 17:13 -0500, Eric Sandeen wrote:
> > Extent data is shared with the i_block[] space in the inode,
> > but it is always swapped on access, not when the inode is read.
> >
> > In e2fsck/pass1.c we must be careful when checking validity
> > of the extents flag on the inode. If the flag was set when
> > the inode was read & swapped, then the extents data itself
> > (in ->i_block[]) was NOT swapped, so testing for a valid
> > extent header requires some swapping first. Then, if we
> > ultimately set the extents flag, all of i_block[] must be
> > re/un-swapped.
>
> This seems pretty awkward for any other users of the library. Having the
> i_block[] array NOT be swabbed if it is an extent file means that every
> place in the code which is accessing this array also needs to do the
> swabbing itself. This would break the abstraction that the in-memory
> inode is in host-endian order, and also forces every application to
> understand the difference between extent- and non-extent-mapped inodes,
> and the on-disk byte order. Ugh.

I did this design intentionally, because the *only* part of e2fsprogs
which is supposed to know about byte-swapping is lib/ext2fs/extents.c.
Well, at least normally unless EXTENTS_FL is wrongly set or unset. So
e2fsck needs to have some special case code to undo swapping
i_blocks[], but that's the only part of the library that will need to
deal with byte-swapping extents.

The reason why I did that is was because I didn't want to have a lot
of messy code in lib/ext2fs/swapfs.c that might need to change if we
needed to support new extents format (some of which might have more
complicated byte-swapping formats) in order to support 64-bit block
numbers, for example --- or, if we end up using a bit-packed more
compressible format so we can fit more extents into i_blocks[]. So
right now, by *design* the only place that needs to know about extents
formats is lib/ext2fs/extent.c.

- Ted