Subject: Re: [PATCH v3 0/5] ext4: fix inconsistency since async write metadata buffer error
From: "zhangyi (F)"
Date: Mon, 13 Jul 2020 09:40:47 +0800
Message-ID: <4b8a3738-cf3a-a1fb-06d6-c14436cf2cf4@huawei.com>
In-Reply-To: <20200620025427.1756360-1-yi.zhang@huawei.com>
References: <20200620025427.1756360-1-yi.zhang@huawei.com>
X-Mailing-List: linux-ext4@vger.kernel.org

Hi, Ted and Jan,

What do you think about this solution?

Thanks,
Yi.

On 2020/6/20 10:54, zhangyi (F) wrote:
> Changes since v2:
>  - Christoph is against the solution of adding a callback in the block
>    layer that could let ext4 handle write errors. So for simplicity,
>    switch to checking the bdev mapping->wb_err when ext4 gets journal
>    write access, as Jan suggested. Maybe we could implement the callback
>    by introducing a special inode (e.g. a meta inode) for ext4 in the
>    future.
>  - Patch 1: Add a mapping->wb_err check and invoke ext4_error_err() in
>    ext4_journal_get_write_access() if wb_err is different from the
>    original one saved at mount time.
>  - Patch 2-3: Remove partial fixes <7963e5ac90125> and <9c83a923c67d>.
>  - Patch 4: Fix another inconsistency problem: we may bypass the
>    journal's checkpoint procedure if we free metadata buffers whose
>    async write-out failed.
>  - Patch 5: Just a cleanup patch.
>
> The above 5 patches are based on linux-5.8-rc1 and have been tested with
> xfstests, with no new failures.
>
> Thanks,
> Yi.
>
> -----------------------
>
> Original background
> ===================
>
> This patch set aims to fix the inconsistency problem which has been
> discussed and partially fixed in [1].
>
> The problem occurs on unstable storage which has a flaky transport
> (e.g. an iSCSI transport may disconnect for a few seconds and reconnect
> due to a bad network environment). If we fail to async write metadata in
> the background, the end write routine in the block layer will clear the
> buffer's uptodate flag, but the data in such a buffer is actually
> uptodate. When we get the buffer later, we may read "old && inconsistent"
> metadata from the disk, because the uptodate flag was cleared and we do
> not check the write io error flag, or even worse, because the buffer has
> been freed due to memory pressure.
>
> Fortunately, if jbd2 does a checkpoint after the async IO error happens,
> the checkpoint routine will check the write_io_error flag and abort the
> journal if it detects an IO error. And in the journal recovery case, the
> recovery code will invoke sync_blockdev() after recovery completes; it
> will also detect the IO error and refuse to mount the filesystem.
>
> Current ext4 already deals with this problem in __ext4_get_inode_loc()
> and commit 7963e5ac90125 ("ext4: treat buffers with write errors as
> containing valid data"), but it's not enough.
>
> [1] https://lore.kernel.org/linux-ext4/20190823030207.GC8130@mit.edu/
>
>
> zhangyi (F) (5):
>   ext4: abort the filesystem if failed to async write metadata buffer
>   ext4: remove ext4_buffer_uptodate()
>   ext4: remove write io error check before read inode block
>   jbd2: abort journal if free a async write error metadata buffer
>   jbd2: remove unused parameter in jbd2_journal_try_to_free_buffers()
>
>  fs/ext4/ext4.h        | 16 +++-------------
>  fs/ext4/ext4_jbd2.c   | 25 +++++++++++++++++++++++++
>  fs/ext4/inode.c       | 15 +++------------
>  fs/ext4/super.c       | 23 ++++++++++++++++++++---
>  fs/jbd2/transaction.c | 20 ++++++++++++++------
>  include/linux/jbd2.h  |  2 +-
>  6 files changed, 66 insertions(+), 35 deletions(-)
>
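
For readers skimming the thread, the Patch 1 idea (compare the block
device mapping's wb_err with the value saved at mount time whenever
journal write access is requested) can be modelled in userspace roughly
as below. This is only a sketch under stated assumptions: the model_*
types and functions are hypothetical, a plain counter stands in for the
kernel's error tracking, and setting an "aborted" flag stands in for the
call to ext4_error_err(). It is not the code from the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the block device's address_space. */
struct model_mapping {
	unsigned int wb_err;	/* bumped on every failed async write-back */
};

/* Hypothetical stand-in for the per-filesystem state. */
struct model_fs {
	struct model_mapping *bdev_mapping;
	unsigned int wb_err_at_mount;	/* snapshot taken at mount time */
	bool aborted;			/* set once a new write error is seen */
};

/* Record the current error state so that older errors are ignored. */
static void model_mount(struct model_fs *fs, struct model_mapping *m)
{
	assert(fs && m);
	fs->bdev_mapping = m;
	fs->wb_err_at_mount = m->wb_err;
	fs->aborted = false;
}

/* What the modelled end-write path does when background write-out fails. */
static void model_async_write_failed(struct model_mapping *m)
{
	assert(m);
	m->wb_err++;
}

/*
 * Modelled "get journal write access": refuse to modify metadata once
 * the mapping's error state differs from the mount-time snapshot.
 * Returns 0 on success and -1 if the filesystem has been aborted.
 */
static int model_get_write_access(struct model_fs *fs)
{
	assert(fs && fs->bdev_mapping);
	if (fs->bdev_mapping->wb_err != fs->wb_err_at_mount)
		fs->aborted = true;	/* stands in for ext4_error_err() */
	return fs->aborted ? -1 : 0;
}
```

In this model, an error that was already recorded before mount does not
trip the check; only a failure that happens after model_mount() makes
model_get_write_access() start returning -1.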