Date: Wed, 21 Apr 2021 11:21:56 +0200
From: Jan Kara
To: Ye Bin
Cc: tytso@mit.edu, adilger.kernel@dilger.ca, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, jack@suse.cz
Subject: Re: [PATCH v2] ext4: Fix bug on in ext4_es_cache_extent as ext4_split_extent_at failed
Message-ID: <20210421092156.GL8706@quack2.suse.cz>
References: <20210420142518.1573012-1-yebin10@huawei.com>
In-Reply-To: <20210420142518.1573012-1-yebin10@huawei.com>
X-Mailing-List: linux-ext4@vger.kernel.org

On Tue 20-04-21 22:25:18, Ye Bin wrote:
> We got follow bug_on when run fsstress with injecting IO fault:
> [130747.323114] kernel BUG at fs/ext4/extents_status.c:762!
> [130747.323117] Internal error: Oops - BUG: 0 [#1] SMP

Which BUG_ON is that please?
I don't see any relevant one near that line in the current upstream kernel...

> ......
> [130747.334329] Call trace:
> [130747.334553]  ext4_es_cache_extent+0x150/0x168 [ext4]
> [130747.334975]  ext4_cache_extents+0x64/0xe8 [ext4]
> [130747.335368]  ext4_find_extent+0x300/0x330 [ext4]
> [130747.335759]  ext4_ext_map_blocks+0x74/0x1178 [ext4]
> [130747.336179]  ext4_map_blocks+0x2f4/0x5f0 [ext4]
> [130747.336567]  ext4_mpage_readpages+0x4a8/0x7a8 [ext4]
> [130747.336995]  ext4_readpage+0x54/0x100 [ext4]
> [130747.337359]  generic_file_buffered_read+0x410/0xae8
> [130747.337767]  generic_file_read_iter+0x114/0x190
> [130747.338152]  ext4_file_read_iter+0x5c/0x140 [ext4]
> [130747.338556]  __vfs_read+0x11c/0x188
> [130747.338851]  vfs_read+0x94/0x150
> [130747.339110]  ksys_read+0x74/0xf0
>
> If call ext4_ext_insert_extent failed but new extent already inserted, we just
> update "ex->ee_len = orig_ex.ee_len", this will lead to extent overlap, then
> cause bug on when cache extent.
> If call ext4_ext_insert_extent failed don't update ex->ee_len with old value.
> Maybe there will lead to block leak, but it can be fixed by fsck later.

Thanks for the analysis and the patch, but the fact is that the extent tree
is going to be corrupted one way or the other. And I think other disk
corruptions can lead to a similarly problematic extent layout anyway. So I
rather think we need to make the code more robust so that it does not take
the kernel down when it sees a corrupted extent tree, and we have done work
to make the code more robust in this sense in the past. Can you reproduce
the issue with the current upstream kernel? If yes, we are probably still
missing extent tree consistency checks on some path and we need to add
them...
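To illustrate the kind of overlap and the kind of check meant here, below is a minimal userspace sketch. It is not ext4 code: struct toy_extent, toy_check_leaf() and toy_demo() are hypothetical names made up for this example, and the block numbers are arbitrary. It models a leaf where the split already inserted the second extent and the first extent's length was then restored to its old value, and it shows a check that reports the inconsistency as an error instead of crashing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an extent; NOT struct ext4_extent. */
struct toy_extent {
	uint32_t lblk;	/* first logical block covered */
	uint32_t len;	/* number of blocks covered */
};

/*
 * Return 0 if the extents are sorted by lblk and do not overlap,
 * -1 on the first inconsistency. The caller gets an error to handle
 * rather than a crash.
 */
static int toy_check_leaf(const struct toy_extent *ex, size_t n)
{
	uint64_t next_free = 0;	/* first block not covered so far */
	size_t i;

	for (i = 0; i < n; i++) {
		if (ex[i].len == 0 || ex[i].lblk < next_free)
			return -1;
		next_free = (uint64_t)ex[i].lblk + ex[i].len;
	}
	return 0;
}

/* Model the failure described in the quoted changelog. */
static void toy_demo(void)
{
	/* [0,100) was split at block 40 and the new extent got inserted. */
	struct toy_extent leaf[2] = { { 0, 40 }, { 40, 60 } };

	assert(toy_check_leaf(leaf, 2) == 0);

	/* Error path restores the old length although [40,100) exists. */
	leaf[0].len = 100;
	assert(toy_check_leaf(leaf, 2) == -1);
}
```

The point of the sketch is only that the check returns an error for the caller to handle. Where such a check would belong in the real ext4 paths is exactly the open question above.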

								Honza

>
> Signed-off-by: Ye Bin
> ---
>  fs/ext4/extents.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 77c84d6f1af6..6161db9c17c9 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -3245,8 +3245,11 @@ static int ext4_split_extent_at(handle_t *handle,
>  		err = ext4_zeroout_es(inode, &zero_ex);
>
>  		goto out;
> -	} else if (err)
> +	} else if (err == -EROFS) {
> +		return err;
> +	} else {
>  		goto fix_extent_len;
> +	}
>
>  out:
>  	ext4_ext_show_leaf(inode, path);
> --
> 2.25.4
>
-- 
Jan Kara
SUSE Labs, CR