We have hit the same BUG_ON as described in
https://bugzilla.kernel.org/show_bug.cgi?id=31222
on some of our servers that have disk failures or corrupted inodes. After
looking at the code, I think the problem is that we are not freeing inode's
preallocation list when check_eofblocks_fl fails in ext4_ext_map_blocks(),
which leaves the inode's preallocation list in an inconsistent state.
Below is a proposed patch to fix the bug. I have tested it by manually
inserting a random failure in check_eofblocks_fl() and run a test that
creates and uses an inode's preallocated blocks. Without the fix, the kernel
crashes after a few runs. With the fix, no crash is observed.
ext4: free allocated and pre-allocated blocks when check_eofblocks_fl fails
Upon corrupted inode or disk failures, we may fail after we already allocate
some blocks from the inode or take some blocks from the inode's preallocation
list, but before we successfully insert the corresponding extent to the extent
tree. In this case, we should free any allocated blocks and discard the inode's
preallocated blocks because the entries in the inode's preallocation list may
be in an inconsistent state.
Signed-off-by: Jiaying Zhang <[email protected]>
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 5199bac..8cf6ec9 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3596,10 +3596,8 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
}
err = check_eofblocks_fl(handle, inode, map->m_lblk, path, ar.len);
- if (err)
- goto out2;
On Mon, 20 Jun 2011, Jiaying Zhang wrote:
> We have hit the same BUG_ON as described in
> https://bugzilla.kernel.org/show_bug.cgi?id=31222
> on some of our servers that have disk failures or corrupted inodes. After
> looking at the code, I think the problem is that we are not freeing inode's
> preallocation list when check_eofblocks_fl fails in ext4_ext_map_blocks(),
> which leaves the inode's preallocation list in an inconsistent state.
>
> Below is a proposed patch to fix the bug. I have tested it by manually
> inserting a random failure in check_eofblocks_fl() and run a test that
> creates and uses an inode's preallocated blocks. Without the fix, the kernel
> crashes after a few runs. With the fix, no crash is observed.
>
> ext4: free allocated and pre-allocated blocks when check_eofblocks_fl fails
>
> Upon corrupted inode or disk failures, we may fail after we already allocate
> some blocks from the inode or take some blocks from the inode's preallocation
> list, but before we successfully insert the corresponding extent to the extent
> tree. In this case, we should free any allocated blocks and discard the inode's
> preallocated blocks because the entries in the inode's preallocation list may
> be in an inconsistent state.
>
> Signed-off-by: Jiaying Zhang <[email protected]>
Hi,
the change looks good to me (with one note), but the patch itself is not very
well formated, and Ted would have to copy it out. Please use this:
ext4: free allocated and pre-allocated blocks when check_eofblocks_fl fails
for Subject line and put any comments which should not appear in the
commit description after signed-off-by line starting with "--- " line.
Or just use git format-patch.
>
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 5199bac..8cf6ec9 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -3596,10 +3596,8 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
> }
>
> err = check_eofblocks_fl(handle, inode, map->m_lblk, path, ar.len);
> - if (err)
> - goto out2;
> -
> - err = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
> + if (!err)
> + err = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
> if (err) {
why not to use:
else {
> /* free data blocks we just allocated */
> /* not a good idea to call discard here directly,
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
Thanks!
-Lukas
On Tue, Jun 21, 2011 at 8:24 AM, Lukas Czerner <[email protected]> wrote:
>
> On Mon, 20 Jun 2011, Jiaying Zhang wrote:
>
> > We have hit the same BUG_ON as described in
> > https://bugzilla.kernel.org/show_bug.cgi?id=31222
> > on some of our servers that have disk failures or corrupted inodes. After
> > looking at the code, I think the problem is that we are not freeing inode's
> > preallocation list when check_eofblocks_fl fails in ext4_ext_map_blocks(),
> > which leaves the inode's preallocation list in an inconsistent state.
> >
> > Below is a proposed patch to fix the bug. I have tested it by manually
> > inserting a random failure in check_eofblocks_fl() and run a test that
> > creates and uses an inode's preallocated blocks. Without the fix, the kernel
> > crashes after a few runs. With the fix, no crash is observed.
> >
> > ext4: free allocated and pre-allocated blocks when check_eofblocks_fl fails
> >
> > Upon corrupted inode or disk failures, we may fail after we already allocate
> > some blocks from the inode or take some blocks from the inode's preallocation
> > list, but before we successfully insert the corresponding extent to the extent
> > tree. In this case, we should free any allocated blocks and discard the inode's
> > preallocated blocks because the entries in the inode's preallocation list may
> > be in an inconsistent state.
> >
> > Signed-off-by: Jiaying Zhang <[email protected]>
>
> Hi,
>
> the change looks good to me (with one note), but the patch itself is not very
> well formated, and Ted would have to copy it out. Please use this:
>
> ext4: free allocated and pre-allocated blocks when check_eofblocks_fl fails
>
> for Subject line and put any comments which should not appear in the
> commit description after signed-off-by line starting with "--- " line.
> Or just use git format-patch.
Thanks for the suggestion. I will re-send the patch with correct format.
>
> >
> > diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> > index 5199bac..8cf6ec9 100644
> > --- a/fs/ext4/extents.c
> > +++ b/fs/ext4/extents.c
> > @@ -3596,10 +3596,8 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
> > ? ? ? }
> >
> > ? ? ? err = check_eofblocks_fl(handle, inode, map->m_lblk, path, ar.len);
> > - ? ? if (err)
> > - ? ? ? ? ? ? goto out2;
> > -
> > - ? ? err = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
> > + ? ? if (!err)
> > + ? ? ? ? ? ? err = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
> > ? ? ? if (err) {
>
> why not to use:
> ? ? ? ?else {
>
It is not the same. If there is an error from ext4_ext_insert_extent(),
we also want to free allocated blocks.
Jiaying
>
> > ? ? ? ? ? ? ? /* free data blocks we just allocated */
> > ? ? ? ? ? ? ? /* not a good idea to call discard here directly,
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > the body of a message to [email protected]
> > More majordomo info at ?http://vger.kernel.org/majordomo-info.html
> >
>
> Thanks!
> -Lukas
On Tue, 21 Jun 2011, Jiaying Zhang wrote:
> > >
> > > diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> > > index 5199bac..8cf6ec9 100644
> > > --- a/fs/ext4/extents.c
> > > +++ b/fs/ext4/extents.c
> > > @@ -3596,10 +3596,8 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
> > > ? ? ? }
> > >
> > > ? ? ? err = check_eofblocks_fl(handle, inode, map->m_lblk, path, ar.len);
> > > - ? ? if (err)
> > > - ? ? ? ? ? ? goto out2;
> > > -
> > > - ? ? err = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
> > > + ? ? if (!err)
> > > + ? ? ? ? ? ? err = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
> > > ? ? ? if (err) {
> >
> > why not to use:
> > ? ? ? ?else {
> >
> It is not the same. If there is an error from ext4_ext_insert_extent(),
> we also want to free allocated blocks.
Oh, of course I have missed the assignment, sorry.
Thanks!
-Lukas
>
> Jiaying
> >
> > > ? ? ? ? ? ? ? /* free data blocks we just allocated */
> > > ? ? ? ? ? ? ? /* not a good idea to call discard here directly,
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > > the body of a message to [email protected]
> > > More majordomo info at ?http://vger.kernel.org/majordomo-info.html
> > >
> >
> > Thanks!
> > -Lukas
>