From: Zhang Yi
Subject: [RFC PATCH v3 2/8] jbd2: ensure abort the journal if detect IO error when writing original buffer back
Date: Thu, 27 May 2021 21:56:35 +0800
Message-ID: <20210527135641.420514-3-yi.zhang@huawei.com>
X-Mailer: git-send-email 2.25.4
In-Reply-To: <20210527135641.420514-1-yi.zhang@huawei.com>
References: <20210527135641.420514-1-yi.zhang@huawei.com>
Precedence: bulk
X-Mailing-List:
 linux-ext4@vger.kernel.org

Although we merged c044f3d8360 ("jbd2: abort journal if free a async
write error metadata buffer"), there is still a race between
jbd2_journal_try_to_free_buffers() and jbd2_journal_destroy(), so
jbd2_log_do_checkpoint() may fail to detect the buffer write IO error
flag, which may lead to filesystem inconsistency.

 jbd2_journal_try_to_free_buffers()     ext4_put_super()
                                         jbd2_journal_destroy()
  __jbd2_journal_remove_checkpoint()
  detect buffer write error              jbd2_log_do_checkpoint()
                                         jbd2_cleanup_journal_tail()
                                         <--- lead to inconsistency
  jbd2_journal_abort()

Fix this issue by introducing a new atomic flags field, which for now
holds only the JBD2_CHECKPOINT_IO_ERROR bit, and set that bit in
__jbd2_journal_remove_checkpoint() when freeing a checkpoint buffer
that has the write_io_error flag set. jbd2_journal_destroy() then
detects this mark and aborts the journal to prevent updating the log
tail.

Signed-off-by: Zhang Yi
---
 fs/jbd2/checkpoint.c | 12 ++++++++++++
 fs/jbd2/journal.c    | 14 ++++++++++++++
 include/linux/jbd2.h | 11 +++++++++++
 3 files changed, 37 insertions(+)

diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
index bf5511d19ac5..2cbac0e3cff3 100644
--- a/fs/jbd2/checkpoint.c
+++ b/fs/jbd2/checkpoint.c
@@ -564,6 +564,7 @@ int __jbd2_journal_remove_checkpoint(struct journal_head *jh)
 	struct transaction_chp_stats_s *stats;
 	transaction_t *transaction;
 	journal_t *journal;
+	struct buffer_head *bh = jh2bh(jh);
 
 	JBUFFER_TRACE(jh, "entry");
 
@@ -575,6 +576,17 @@ int __jbd2_journal_remove_checkpoint(struct journal_head *jh)
 	journal = transaction->t_journal;
 
 	JBUFFER_TRACE(jh, "removing from transaction");
+
+	/*
+	 * If we have failed to write the buffer out to disk, the filesystem
+	 * may become inconsistent. We cannot abort the journal here since
+	 * we hold j_list_lock and we have to be careful about races with
+	 * jbd2_journal_destroy(). So mark the writeback IO error in the
+	 * journal here and we abort the journal later from a better context.
+	 */
+	if (buffer_write_io_error(bh))
+		set_bit(JBD2_CHECKPOINT_IO_ERROR, &journal->j_atomic_flags);
+
 	__buffer_unlink(jh);
 	jh->b_cp_transaction = NULL;
 	jbd2_journal_put_journal_head(jh);
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 2dc944442802..90146755941f 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -1618,6 +1618,10 @@ int jbd2_journal_update_sb_log_tail(journal_t *journal, tid_t tail_tid,
 
 	if (is_journal_aborted(journal))
 		return -EIO;
+	if (test_bit(JBD2_CHECKPOINT_IO_ERROR, &journal->j_atomic_flags)) {
+		jbd2_journal_abort(journal, -EIO);
+		return -EIO;
+	}
 
 	BUG_ON(!mutex_is_locked(&journal->j_checkpoint_mutex));
 	jbd_debug(1, "JBD2: updating superblock (start %lu, seq %u)\n",
@@ -1995,6 +1999,16 @@ int jbd2_journal_destroy(journal_t *journal)
 	J_ASSERT(journal->j_checkpoint_transactions == NULL);
 	spin_unlock(&journal->j_list_lock);
 
+	/*
+	 * OK, all checkpoint transactions have been checked, now check the
+	 * write out io error flag and abort the journal if some buffer failed
+	 * to write back to the original location, otherwise the filesystem
+	 * may become inconsistent.
+	 */
+	if (!is_journal_aborted(journal) &&
+	    test_bit(JBD2_CHECKPOINT_IO_ERROR, &journal->j_atomic_flags))
+		jbd2_journal_abort(journal, -EIO);
+
 	if (journal->j_sb_buffer) {
 		if (!is_journal_aborted(journal)) {
 			mutex_lock_io(&journal->j_checkpoint_mutex);
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index db0e1920cb12..f9b5e657b8f3 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -779,6 +779,11 @@ struct journal_s
 	 */
 	unsigned long		j_flags;
 
+	/**
+	 * @j_atomic_flags: Atomic journaling state flags.
+	 */
+	unsigned long		j_atomic_flags;
+
 	/**
 	 * @j_errno:
 	 *
@@ -1371,6 +1376,12 @@ JBD2_FEATURE_INCOMPAT_FUNCS(fast_commit, FAST_COMMIT)
 #define JBD2_FAST_COMMIT_ONGOING	0x100	/* Fast commit is ongoing */
 #define JBD2_FULL_COMMIT_ONGOING	0x200	/* Full commit is ongoing */
 
+/*
+ * Journal atomic flag definitions
+ */
+#define JBD2_CHECKPOINT_IO_ERROR	0x001	/* Detect io error while writing
+						 * buffer back to disk */
+
 /*
  * Function declarations for the journaling transaction and buffer
  * management
-- 
2.25.4
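
For readers following the series, the deferred-abort idea in this patch can be modelled in userspace. This is only a rough sketch under stated assumptions: struct model_journal and the model_*() helpers are hypothetical stand-ins invented for illustration, and C11 atomics replace the kernel's set_bit()/test_bit(). It is not the jbd2 implementation. The point it shows is that the path which sees the write error only records it, while the path that would advance the log tail checks the record and aborts instead.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in jbd2. */
enum { MODEL_CHECKPOINT_IO_ERROR = 1u << 0 };

struct model_journal {
	atomic_uint atomic_flags;	/* models j_atomic_flags */
	bool aborted;			/* models the aborted journal state */
	int err;			/* models j_errno */
	unsigned long tail;		/* models the recorded log tail */
};

/*
 * Models the checkpoint-removal path: it runs where aborting is not
 * possible, so it only records that a buffer failed to write back.
 */
static void model_remove_checkpoint(struct model_journal *j, bool write_io_error)
{
	if (write_io_error)
		atomic_fetch_or(&j->atomic_flags, MODEL_CHECKPOINT_IO_ERROR);
}

static void model_abort(struct model_journal *j, int err)
{
	j->aborted = true;
	j->err = err;
}

/*
 * Models the log-tail update: it runs in a context where aborting is
 * safe, so it turns the recorded error into an abort and leaves the
 * tail untouched.
 */
static int model_update_log_tail(struct model_journal *j, unsigned long new_tail)
{
	if (j->aborted)
		return -EIO;
	if (atomic_load(&j->atomic_flags) & MODEL_CHECKPOINT_IO_ERROR) {
		model_abort(j, -EIO);
		return -EIO;
	}
	j->tail = new_tail;
	return 0;
}
```

In this model, a tail update that follows a recorded write error returns -EIO and keeps the old tail, which mirrors the check this patch adds to jbd2_journal_update_sb_log_tail().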