From: Forrest Liu Subject: Re: [PATCH] ext4: fix extent tree corruption that incurred by hole punch Date: Thu, 6 Dec 2012 19:35:32 +0800 Message-ID: References: <1354623069-8124-1-git-send-email-forrestl@synology.com> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Cc: "Theodore Ts'o" , ext4 development To: Ashish Sangwan Return-path: Received: from mail-ie0-f170.google.com ([209.85.223.170]:64700 "EHLO mail-ie0-f170.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1423208Ab2LFLku (ORCPT ); Thu, 6 Dec 2012 06:40:50 -0500 Received: by mail-ie0-f170.google.com with SMTP id k10so17421519iea.1 for ; Thu, 06 Dec 2012 03:40:49 -0800 (PST) In-Reply-To: Sender: linux-ext4-owner@vger.kernel.org List-ID: Hi Ashish, Thanks for your review and fixed hole punch bugs. We can't propogate ei_block only in ext4_ext_rm_leave. There have two cases to call ext4_ext_rm_idx 1) extent block has no entry, ext4_ext_rm_leave calls ext4_ext_rm_idx to release extent block. 2) extent index block has no entry, ext4_ext_remove_space calls ext4_ext_rm_idx to release extent index block. Consider remove third level of extent index block that indexed by second level of extent index, and the extent index in second level is first entry in block. In this case, ei_block will not propogate to top level correctly, if we only do propogation in ext4_ext_rm_leave. Regards, Forrest 2012/12/5 Ashish Sangwan : > Sorry 1 typo. > It should be > + path->p_idx->ei_block =(path + 1)->p_idx->ei_block ; > instead of > + path->p_idx->ei_block = border; > > On Wed, Dec 5, 2012 at 1:23 PM, Ashish Sangwan wrote: >> I have tested it and the problem is real. >> It happens when depth => 2 and punch hole completely removes any of >> the first extent indexes at depth 1. >> The change in ei_block is not propagated backwards to index at depth 0. >> >> However, AFAICS, the problematic call to ext4_ext_rm_idx is from >> ext4_ext_rm_leaf and there is no need to change in >> ext4_ext_remove_space or in ext4_ext_rm_idx. >> >> Forrest, what do you think about below change ? >> basically its your code, but its been added to ext4_ext_rm_leaf after >> removing index. >> >> @@ -2448,8 +2464,28 @@ ext4_ext_rm_leaf(handle_t *handle, struct inode *inode, >> >> /* if this leaf is free, then we should >> * remove it from index block above */ >> - if (err == 0 && eh->eh_entries == 0 && path[depth].p_bh != NULL) >> + if (err == 0 && eh->eh_entries == 0 && path[depth].p_bh != NULL) { >> err = ext4_ext_rm_idx(handle, inode, path + depth); >> + /* If this was the first extent index in node, >> + * propagates the ei_block updation info to top */ >> + if(!err) { >> + --depth; >> + path = path + depth; >> + while (--depth >= 0) { >> + if (path->p_idx != EXT_FIRST_INDEX(path->p_hdr)) >> + break; >> + path--; >> + err = ext4_ext_get_access(handle, inode, path); >> + if (err) >> + break; >> + path->p_idx->ei_block = border; >> + err = ext4_ext_dirty(handle, inode, path); >> + if (err) >> + break; >> + } >> + } >> + } >> + >> >> On Wed, Dec 5, 2012 at 11:43 AM, Forrest Liu wrote: >>> 2012/12/4 forrest : >>>> This problem is easily to reproduce >>>> >>>> Create a file with size larger than 1GB. >>>> >>>> dd if=/dev/zero of=/test_file bs=1M count=1024 >>>> >>>> Punch every even block in test_file, and then use debugfs to dump >>>> extents, followings is dumped result >>>> >>>> 2/ 2 339/340 231197 - 231197 3917597 - 3917597 1 >>>> 2/ 2 340/340 231199 - 231199 3917599 - 3917599 1 >>>> 0/ 2 2/ 2 231201 - 262143 3901486 30943 >>>> 1/ 2 1/ 46 231201 - 231880 3901488 680 >>>> 2/ 2 1/340 231201 - 231201 3917601 - 3917601 1 >>>> 2/ 2 2/340 231203 - 231203 3917603 - 3917603 1 >>>> >>>> Punch blocks #231779 ~#231201 , to remove extent index, and then use >>>> debugfs to dump extents, followings is dumped result >>>> >>>> 2/ 2 340/340 231199 - 231199 3917599 - 3917599 1 >>>> 0/ 2 2/ 2 231201 - 262143 3901486 30943 >>>> -------> logical block index didn't update when remove extent index >>>> 1/ 2 1/ 45 231881 - 232560 3901490 680 >>>> 2/ 2 1/340 231881 - 231881 3918281 - 3918281 1 >>>> 2/ 2 2/340 231883 - 231883 3918283 - 3918283 1 >>> >>> After blocks #231202~#231880 had been punch out, theses blocks can't map again. >>> If we do map a block in #231202~#231880, we will get error messages >>> like followings. >>> >>> EXT4-fs error (device md2): ext4_ext_search_left:1239: inode #13: comm >>> flush-9:2: ix (231201) != EXT_FIRST_INDEX (1) (depth 0)! >>> EXT4-fs (md2): delayed block allocation failed for inode 13 at logical >>> offset 231203 with max blocks 1 with error -5 >>> EXT4-fs (md2): This should not happen!! Data will be lost >>> >>> Regards, >>> Forrest >>> >>>> >>>> 2012/12/4 Forrest Liu : >>>>> Extent indexes didn't update correctly in ext4_ext_rm_idx, when depth >>>>> of extent tree is greater than 1. >>>>> >>>>> Signed-off-by: Forrest Liu >>>>> --- >>>>> fs/ext4/extents.c | 24 ++++++++++++++++++++---- >>>>> 1 files changed, 20 insertions(+), 4 deletions(-) >>>>> >>>>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c >>>>> index d3dd618..b10b8c0 100644 >>>>> --- a/fs/ext4/extents.c >>>>> +++ b/fs/ext4/extents.c >>>>> @@ -2190,13 +2190,15 @@ errout: >>>>> * removes index from the index block. >>>>> */ >>>>> static int ext4_ext_rm_idx(handle_t *handle, struct inode *inode, >>>>> - struct ext4_ext_path *path) >>>>> + struct ext4_ext_path *path, int depth) >>>>> { >>>>> int err; >>>>> ext4_fsblk_t leaf; >>>>> + __le32 border; >>>>> >>>>> /* free index block */ >>>>> - path--; >>>>> + depth--; >>>>> + path = path + depth; >>>>> leaf = ext4_idx_pblock(path->p_idx); >>>>> if (unlikely(path->p_hdr->eh_entries == 0)) { >>>>> EXT4_ERROR_INODE(inode, "path->p_hdr->eh_entries == 0"); >>>>> @@ -2221,6 +2223,20 @@ static int ext4_ext_rm_idx(handle_t *handle, struct inode *inode, >>>>> >>>>> ext4_free_blocks(handle, inode, NULL, leaf, 1, >>>>> EXT4_FREE_BLOCKS_METADATA | EXT4_FREE_BLOCKS_FORGET); >>>>> + >>>>> + border = path->p_idx->ei_block; >>>>> + while (--depth >= 0) { >>>>> + if (path->p_idx != EXT_FIRST_INDEX(path->p_hdr)) >>>>> + break; >>>>> + path--; >>>>> + err = ext4_ext_get_access(handle, inode, path); >>>>> + if (err) >>>>> + break; >>>>> + path->p_idx->ei_block = border; >>>>> + err = ext4_ext_dirty(handle, inode, path); >>>>> + if (err) >>>>> + break; >>>>> + } >>>>> return err; >>>>> } >>>>> >>>>> @@ -2557,7 +2573,7 @@ ext4_ext_rm_leaf(handle_t *handle, struct inode *inode, >>>>> /* if this leaf is free, then we should >>>>> * remove it from index block above */ >>>>> if (err == 0 && eh->eh_entries == 0 && path[depth].p_bh != NULL) >>>>> - err = ext4_ext_rm_idx(handle, inode, path + depth); >>>>> + err = ext4_ext_rm_idx(handle, inode, path, depth); >>>>> >>>>> out: >>>>> return err; >>>>> @@ -2760,7 +2776,7 @@ again: >>>>> /* index is empty, remove it; >>>>> * handle must be already prepared by the >>>>> * truncatei_leaf() */ >>>>> - err = ext4_ext_rm_idx(handle, inode, path + i); >>>>> + err = ext4_ext_rm_idx(handle, inode, path, i); >>>>> } >>>>> /* root level has p_bh == NULL, brelse() eats this */ >>>>> brelse(path[i].p_bh); >>>>> -- >>>>> 1.7.5.4 >>>>> >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html