2009-05-18 21:30:29

by Eric Sandeen

[permalink] [raw]
Subject: [PATCH] resize2fs: fix ENOSPC corruption case

http://people.redhat.com/esandeen/livecd-creator-imagefile.bz2
contains an image (for now) which, when resized to 578639, corrupts
the filesystem.

This is a bit crazy, I guess, because the fs currently has only
1 free block, but still, we should be graceful about the failure.
Perhaps it would make sense to check the requested valuea against
the minimum value resize2fs would compute for "-P" and fail (at
least without a force).

But in any case, this exposed 2 bugs when moving that one block
required an extent split, which is what hit the ENOSPC.

For starters, ext2fs_extent_set_bmap() in the "(re/un)mapping last
block in extent" case was replacing the old extent before the
new one was created; when the new extent creation failed, it
left us in an inconsistent state. Simply changing the order of
the two should fix this problem.

Next, ext2fs_extent_insert was calling ext2fs_extent_delete()
on *any* error, including one caused by failure to allocate a new
block to split the node to hold that extent ... the handle was left
unchanged, and we deleted the -original- extent.

As a quick fix for this, just don't do the delete if we fail the split,
though this may need to be smarter. I don't think we have terribly
consistent behavior about where a handle is left on various errors.

Signed-off-by: Eric Sandeen <[email protected]>
---

Index: e2fsprogs/lib/ext2fs/extent.c
===================================================================
--- e2fsprogs.orig/lib/ext2fs/extent.c
+++ e2fsprogs/lib/ext2fs/extent.c
@@ -1068,16 +1068,17 @@ errcode_t ext2fs_extent_insert(ext2_exte

retval = ext2fs_extent_replace(handle, 0, extent);
if (retval)
- goto errout;
+ goto errout_delete;

retval = update_path(handle);
if (retval)
- goto errout;
+ goto errout_delete;

return 0;

-errout:
+errout_delete:
ext2fs_extent_delete(handle, 0);
+errout:
return retval;
}

@@ -1239,16 +1240,17 @@ again:
#ifdef DEBUG
printf("(re/un)mapping last block in extent\n");
#endif
- extent.e_len--;
- retval = ext2fs_extent_replace(handle, 0, &extent);
- if (retval)
- goto done;
+ /* Make sure insert works before replacing old extent */
if (physical) {
retval = ext2fs_extent_insert(handle,
EXT2_EXTENT_INSERT_AFTER, &newextent);
if (retval)
goto done;
}
+ extent.e_len--;
+ retval = ext2fs_extent_replace(handle, 0, &extent);
+ if (retval)
+ goto done;
} else if (logical == extent.e_lblk) {
#ifdef DEBUG
printf("(re/un)mapping first block in extent\n");



2009-05-18 21:35:13

by Eric Sandeen

[permalink] [raw]
Subject: Re: [PATCH] resize2fs: fix ENOSPC corruption case

Eric Sandeen wrote:
> http://people.redhat.com/esandeen/livecd-creator-imagefile.bz2
> contains an image (for now) which, when resized to 578639, corrupts
> the filesystem.
>
> This is a bit crazy, I guess, because the fs currently has only
> 1 free block, but still, we should be graceful about the failure.
> Perhaps it would make sense to check the requested valuea against
> the minimum value resize2fs would compute for "-P" and fail (at
> least without a force).

heh, well, maybe not:

[root@inode e2fsprogs]# dumpe2fs -h /mnt/test/livecd-creator-imagefile |
grep "Block count"
dumpe2fs 1.41.5 (23-Apr-2009)
Block count: 578640
[root@inode e2fsprogs]# resize/resize2fs.static -P
/mnt/test/livecd-creator-imagefile
resize2fs 1.41.5 (23-Apr-2009)
Estimated minimum size of the filesystem: 9171138

The estimated minimum size is 16x the current size.

So much for that plan, at least without a bit more work.

-Eric

2009-05-20 11:37:49

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] resize2fs: fix ENOSPC corruption case

On Mon, May 18, 2009 at 04:30:29PM -0500, Eric Sandeen wrote:
>
> Index: e2fsprogs/lib/ext2fs/extent.c
> ===================================================================
> --- e2fsprogs.orig/lib/ext2fs/extent.c
> +++ e2fsprogs/lib/ext2fs/extent.c
> @@ -1068,16 +1068,17 @@ errcode_t ext2fs_extent_insert(ext2_exte
>
> retval = ext2fs_extent_replace(handle, 0, extent);
> if (retval)
> - goto errout;
> + goto errout_delete;
>
> retval = update_path(handle);
> if (retval)
> - goto errout;
> + goto errout_delete;
>
> return 0;
>
> -errout:
> +errout_delete:
> ext2fs_extent_delete(handle, 0);
> +errout:
> return retval;
> }


Instead adding an errout_delete and changing what errout means, why
not just change:

retval = extent_node_split(handle);
if (retval)
- goto errout;
+ return retval;
path = handle->path + handle->level;

I also took a quick scan of extent_node_split(), and I'm not 100%
convinced it handles all of its error cases sanely. But that should
be the focus of a different patch....


> @@ -1239,16 +1240,17 @@ again:
> #ifdef DEBUG
> printf("(re/un)mapping last block in extent\n");
> #endif
> - extent.e_len--;
> - retval = ext2fs_extent_replace(handle, 0, &extent);
> - if (retval)
> - goto done;
> + /* Make sure insert works before replacing old extent */
> if (physical) {
> retval = ext2fs_extent_insert(handle,
> EXT2_EXTENT_INSERT_AFTER, &newextent);
> if (retval)
> goto done;
> }
> + extent.e_len--;
> + retval = ext2fs_extent_replace(handle, 0, &extent);
> + if (retval)
> + goto done;
> } else if (logical == extent.e_lblk) {

Have you tested this by hand, using the tst_extents test progam? Code
inspection says that ext2fs_extent_insert leaves extent handle
pointing at the same place, but I admit the semantics need to be
better documented and tested to make sure they are correct in all
situations. The extent code is new enough and tricky enough that I'm
always cautious about changes to it.

Looks good, though.

- Ted

2009-05-20 15:57:07

by Eric Sandeen

[permalink] [raw]
Subject: Re: [PATCH] resize2fs: fix ENOSPC corruption case

Theodore Tso wrote:
> On Mon, May 18, 2009 at 04:30:29PM -0500, Eric Sandeen wrote:
>> Index: e2fsprogs/lib/ext2fs/extent.c
>> ===================================================================
>> --- e2fsprogs.orig/lib/ext2fs/extent.c
>> +++ e2fsprogs/lib/ext2fs/extent.c
>> @@ -1068,16 +1068,17 @@ errcode_t ext2fs_extent_insert(ext2_exte
>>
>> retval = ext2fs_extent_replace(handle, 0, extent);
>> if (retval)
>> - goto errout;
>> + goto errout_delete;
>>
>> retval = update_path(handle);
>> if (retval)
>> - goto errout;
>> + goto errout_delete;
>>
>> return 0;
>>
>> -errout:
>> +errout_delete:
>> ext2fs_extent_delete(handle, 0);
>> +errout:
>> return retval;
>> }
>
>
> Instead adding an errout_delete and changing what errout means, why
> not just change:
>
> retval = extent_node_split(handle);
> if (retval)
> - goto errout;
> + return retval;
> path = handle->path + handle->level;

I guess there are other earlier direct returns, so sure, that'd be fine.

> I also took a quick scan of extent_node_split(), and I'm not 100%
> convinced it handles all of its error cases sanely. But that should
> be the focus of a different patch....

yeah, it probably needs more careful inspection. Who wrote that thing
anyway ... ;)

>
>> @@ -1239,16 +1240,17 @@ again:
>> #ifdef DEBUG
>> printf("(re/un)mapping last block in extent\n");
>> #endif
>> - extent.e_len--;
>> - retval = ext2fs_extent_replace(handle, 0, &extent);
>> - if (retval)
>> - goto done;
>> + /* Make sure insert works before replacing old extent */
>> if (physical) {
>> retval = ext2fs_extent_insert(handle,
>> EXT2_EXTENT_INSERT_AFTER, &newextent);
>> if (retval)
>> goto done;
>> }
>> + extent.e_len--;
>> + retval = ext2fs_extent_replace(handle, 0, &extent);
>> + if (retval)
>> + goto done;
>> } else if (logical == extent.e_lblk) {
>
> Have you tested this by hand, using the tst_extents test progam?

No, but I should.

> Code
> inspection says that ext2fs_extent_insert leaves extent handle
> pointing at the same place, but I admit the semantics need to be
> better documented and tested to make sure they are correct in all
> situations. The extent code is new enough and tricky enough that I'm
> always cautious about changes to it.

yep, this likely needs more work to document the semantics, make sure
they're consistent, and check all those error cases.

For F11 I will likely put in the later 2 patches, to fix up the minimum
size reporting and then enforce that as the minimum size, so we
shouldn't(tm) get into these error paths. Seems like the safer
last-minute change.

-Eric

> Looks good, though.
>
> - Ted


2009-05-20 20:58:35

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] resize2fs: fix ENOSPC corruption case

On Wed, May 20, 2009 at 10:57:05AM -0500, Eric Sandeen wrote:
> For F11 I will likely put in the later 2 patches, to fix up the minimum
> size reporting and then enforce that as the minimum size, so we
> shouldn't(tm) get into these error paths. Seems like the safer
> last-minute change.

The first half of this patch I think is quite safe, and after stepping
through tst_etent, I'm pretty sure the second half of this patch would
be pretty easy to validate. We probably don't want to enter any of
these error paths for now, though, since it's not clear we've fied all
of them, particularly in the split_node code.

- Ted

2009-05-20 21:36:28

by Eric Sandeen

[permalink] [raw]
Subject: [PATCH V2] resize2fs: fix ENOSPC corruption case

http://people.redhat.com/esandeen/livecd-creator-imagefile.bz2
contains an image (for now) which, when resized to 578639, corrupts
the filesystem.

This is a bit crazy, I guess, because the fs currently has only
1 free block, but still, we should be graceful about the failure.
Perhaps it would make sense to check the requested valuea against
the minimum value resize2fs would compute for "-P" and fail (at
least without a force).

But in any case, this exposed 2 bugs when moving that one block
required an extent split, which is what hit the ENOSPC.

For starters, ext2fs_extent_set_bmap() in the "(re/un)mapping last
block in extent" case was replacing the old extent before the
new one was created; when the new extent creation failed, it
left us in an inconsistent state. Simply changing the order of
the two should fix this problem.

Next, ext2fs_extent_insert was calling ext2fs_extent_delete()
on *any* error, including one caused by failure to allocate a new
block to split the node to hold that extent ... the handle was left
unchanged, and we deleted the -original- extent.

As a quick fix for this, just don't do the delete if we fail the split,
though this may need to be smarter. I don't think we have terribly
consistent behavior about where a handle is left on various errors.

v2: lose the goto change complexity

Signed-off-by: Eric Sandeen <[email protected]>
---


diff --git a/lib/ext2fs/extent.c b/lib/ext2fs/extent.c
index 143929e..ddb2d2a 100644
--- a/lib/ext2fs/extent.c
+++ b/lib/ext2fs/extent.c
@@ -1040,7 +1040,7 @@ errcode_t ext2fs_extent_insert(ext2_extent_handle_t handle, int flags,
#endif
retval = extent_node_split(handle);
if (retval)
- goto errout;
+ return retval;
path = handle->path + handle->level;
}
}
@@ -1239,16 +1239,17 @@ again:
#ifdef DEBUG
printf("(re/un)mapping last block in extent\n");
#endif
- extent.e_len--;
- retval = ext2fs_extent_replace(handle, 0, &extent);
- if (retval)
- goto done;
+ /* Make sure insert works before replacing old extent */
if (physical) {
retval = ext2fs_extent_insert(handle,
EXT2_EXTENT_INSERT_AFTER, &newextent);
if (retval)
goto done;
}
+ extent.e_len--;
+ retval = ext2fs_extent_replace(handle, 0, &extent);
+ if (retval)
+ goto done;
} else if (logical == extent.e_lblk) {
#ifdef DEBUG
printf("(re/un)mapping first block in extent\n");


2009-05-26 02:38:49

by Theodore Ts'o

[permalink] [raw]