Subject: Re: [PATCH RFC] ext4: fix partial cluster initialization when splitting extent
From: JeffleXu
To: tytso@mit.edu, enwlinux@gmail.com
Cc: linux-ext4@vger.kernel.org, joseph.qi@linux.alibaba.com
Date: Fri, 22 May 2020 12:15:39 +0800
Message-ID: <56098e56-0388-d11c-b822-291b5c8f77da@linux.alibaba.com>
In-Reply-To: <1590120164-26949-1-git-send-email-jefflexu@linux.alibaba.com>
References: <1590120164-26949-1-git-send-email-jefflexu@linux.alibaba.com>
X-Mailing-List: linux-ext4@vger.kernel.org

Sorry, I forgot to remove the "RFC" flag. Please ignore this patch; I will
resend another one later.

On 5/22/20 12:02 PM, Jeffle Xu wrote:
> Fix the bug when calculating the physical block number of the first
> block in the split extent.
>
> This bug occasionally causes xfstests shared/298 to fail on ext4 with
> bigalloc enabled. Ext4 error messages indicate that previously freed
> blocks are being freed again, and the following fsck fails due to
> the inconsistency between the block bitmap and the bg descriptor.
>
> The following is an example case:
>
> 1. First, initialize an ext4 filesystem with cluster size '16K' and block
> size '4K', in which case one cluster contains four blocks.
>
> 2. Create one file (e.g., xxx.img) on this ext4 filesystem. Now the extent
> tree of this file is like:
>
> ...
> 36864:[0]4:220160
> 36868:[0]14332:145408
> 51200:[0]2:231424
> ...
>
> 3. Then execute PUNCH_HOLE fallocate on this file. The hole range is
> like:
>
> ...
> ext4_ext_remove_space: dev 254,16 ino 12 since 49506 end 49506 depth 1
> ext4_ext_remove_space: dev 254,16 ino 12 since 49544 end 49546 depth 1
> ext4_ext_remove_space: dev 254,16 ino 12 since 49605 end 49607 depth 1
> ...
>
> 4. Then the extent tree of this file after punching is like:
>
> ...
> 49507:[0]37:158047
> 49547:[0]58:158087
> ...
>
> 5. Detailed procedure of punching hole [49544, 49546]
>
> 5.1. The block address space:
> ```
> lblk        ~49505   49506   49507~49543     49544~49546     49547~
>          ---------+--------+---------------+---------------+--------
>            extent |  hole  |    extent     |     hole      | extent
>          ---------+--------+---------------+---------------+--------
> pblk       ~158045  158046   158047~158083   158084~158086   158087~
> ```
>
> 5.2. The detailed layout of cluster 39521:
> ```
>                cluster 39521
>      <------------------------------->
>
>                hole             extent
>      <----------------------><--------
>
> lblk   49544   49545   49546   49547
>      +-------+-------+-------+-------+
>      |       |       |       |       |
>      +-------+-------+-------+-------+
> pblk  158084  158085  158086  158087
> ```
>
> 5.3. The ftrace output when punching hole [49544, 49546]:
> - ext4_ext_remove_space (start 49544, end 49546)
>   - ext4_ext_rm_leaf (start 49544, end 49546, last_extent [49507(158047), 40], partial [pclu 39522 lblk 0 state 2])
>     - ext4_remove_blocks (extent [49507(158047), 40], from 49544 to 49546, partial [pclu 39522 lblk 0 state 2])
>       - ext4_free_blocks: (block 158084 count 4)
>         - ext4_mballoc_free (extent 1/6753/1)
>
> 5.4. Ext4 error message in dmesg:
> EXT4-fs error (device vdb): mb_free_blocks:1457: group 1, block 158084:freeing already freed block (bit 6753); block bitmap corrupt.
> EXT4-fs error (device vdb): ext4_mb_generate_buddy:747: group 1, block bitmap and bg descriptor inconsistent: 19550 vs 19551 free clusters
>
>
> In this case, the whole cluster 39521 is freed mistakenly when freeing
> pblock 158084~158086 (i.e., the first three blocks of this cluster),
> although pblock 158087 (the last remaining block of this cluster) has
> not been freed yet.
>
> The root cause of this issue is that the pclu of the partial cluster is
> calculated incorrectly in ext4_ext_remove_space(). The correct
> partial_cluster.pclu (i.e., the cluster number of the first block in the
> next extent, that is, lblock 49547 (pblock 158087)) should be 39521 rather
> than 39522.
>
> Fixes: f4226d9ea400 ("ext4: fix partial cluster initialization")
> Signed-off-by: Jeffle Xu
> Reviewed-by: Eric Whitney
> Cc: stable@kernel.org # v3.19+
> ---
>  fs/ext4/extents.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index f2b577b..cb74496 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -2828,7 +2828,7 @@ int ext4_ext_remove_space(struct inode *inode, ext4_lblk_t start,
>  			 * in use to avoid freeing it when removing blocks.
>  			 */
>  			if (sbi->s_cluster_ratio > 1) {
> -				pblk = ext4_ext_pblock(ex) + end - ee_block + 2;
> +				pblk = ext4_ext_pblock(ex) + end - ee_block + 1;
>  				partial.pclu = EXT4_B2C(sbi, pblk);
>  				partial.state = nofree;
>  			}