From: Zhihao Cheng
Subject: [PATCH] jbd2: Fix potential data lost in recovering journal raced with synchronizing fs bdev
Date: Fri, 8 Sep 2023 17:28:08 +0800
Message-ID: <20230908092808.2929317-1-chengzhihao1@huawei.com>
X-Mailing-List: linux-ext4@vger.kernel.org

JBD2 relies on sync_blockdev() to make sure replayed journal data has
reached the fs device. However, another process can consume the EIO
recorded in the bdev's mapping first, so journal recovery reports success
even though EIO occurred while the data was being written back to the fs
device.

We found this problem in our product, which uses iscsi + multipath as the
block device for ext4. An unstable network may trigger kpartx to rescan
partitions in the device mapper layer. The detailed sequence is as follows:

      mount                  kpartx                          irq
jbd2_journal_recover
 do_one_pass
  memcpy(nbh->b_data, obh->b_data) // copy data to fs dev from journal
  mark_buffer_dirty // mark bh dirty
                       vfs_read
                        generic_file_read_iter // dio
                         filemap_write_and_wait_range
                          __filemap_fdatawrite_range
                           do_writepages
                            block_write_full_folio
                             submit_bh_wbc
                                  >>  EIO occurs in disk  <<
                                                   end_buffer_async_write
                                                    mark_buffer_write_io_error
                                                     mapping_set_error
                                                      set_bit(AS_EIO, &mapping->flags) // set!
                          filemap_check_errors
                           test_and_clear_bit(AS_EIO, &mapping->flags) // clear!
 err2 = sync_blockdev
  filemap_write_and_wait
   filemap_check_errors
    test_and_clear_bit(AS_EIO, &mapping->flags) // false
 err2 = 0

The filesystem is mounted successfully even though data from the journal
failed to be written to disk, and ext4/ocfs2 could become corrupted.

Fix it by comparing the wb_err state of the fs block device before and
after recovery.

A reproducer is available in [Link].
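
For illustration only, the difference between the two error-reporting
schemes can be sketched in userspace C. This is a toy model under stated
assumptions, not kernel code: struct flag_mapping, struct seq_mapping,
cursor_t and all helper names are hypothetical, and the real errseq_t
encodes more state than the plain counter used here. The sketch only shows
that a shared test-and-clear flag is consumed by the first checker, while a
per-observer cursor lets every observer see the same error.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the AS_EIO bit: one flag shared by all checkers. */
struct flag_mapping { bool as_eio; };

static void flag_set_error(struct flag_mapping *m) { m->as_eio = true; }

/* Test-and-clear: the first caller consumes the error for everyone. */
static int flag_check_errors(struct flag_mapping *m)
{
	bool was_set = m->as_eio;

	m->as_eio = false;
	return was_set ? -EIO : 0;
}

/* Toy stand-in for wb_err: a counter bumped on every writeback error. */
struct seq_mapping { uint32_t wb_err; };
typedef uint32_t cursor_t;	/* hypothetical per-observer sample */

static void seq_set_error(struct seq_mapping *m) { m->wb_err++; }

/* Report an error if the counter moved since this observer's sample. */
static int seq_check_and_advance(const struct seq_mapping *m, cursor_t *since)
{
	if (*since == m->wb_err)
		return 0;
	*since = m->wb_err;
	return -EIO;
}

/* Flag scheme: a concurrent reader checks first, recovery sees nothing. */
int recover_with_flag(void)
{
	struct flag_mapping m = { false };

	flag_set_error(&m);		/* writeback of a replayed block fails */
	(void)flag_check_errors(&m);	/* other process clears the flag */
	return flag_check_errors(&m);	/* recovery's check */
}

/* Cursor scheme: recovery samples before replay and compares afterwards. */
int recover_with_cursor(void)
{
	struct seq_mapping m = { 0 };
	cursor_t recovery = m.wb_err;	/* sampled before replay */
	cursor_t reader = m.wb_err;	/* the other process has its own */
	int reader_err;

	seq_set_error(&m);		/* writeback of a replayed block fails */
	reader_err = seq_check_and_advance(&m, &reader);
	assert(reader_err == -EIO);	/* the reader still sees the error */
	return seq_check_and_advance(&m, &recovery);
}
```

Called from a small main(), recover_with_flag() models the lost error in
the trace above, and recover_with_cursor() models the before/after
comparison that the fix relies on.
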
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217888
Cc: stable@vger.kernel.org
Signed-off-by: Zhihao Cheng
Signed-off-by: Zhang Yi
---
 fs/jbd2/recovery.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c
index c269a7d29a46..0fecaa6a3ac6 100644
--- a/fs/jbd2/recovery.c
+++ b/fs/jbd2/recovery.c
@@ -289,6 +289,8 @@ int jbd2_journal_recover(journal_t *journal)
 	journal_superblock_t *	sb;
 
 	struct recovery_info	info;
+	errseq_t		wb_err;
+	struct address_space	*mapping;
 
 	memset(&info, 0, sizeof(info));
 	sb = journal->j_superblock;
@@ -306,6 +308,8 @@ int jbd2_journal_recover(journal_t *journal)
 		return 0;
 	}
 
+	mapping = journal->j_fs_dev->bd_inode->i_mapping;
+	errseq_check_and_advance(&mapping->wb_err, &wb_err);
 	err = do_one_pass(journal, &info, PASS_SCAN);
 	if (!err)
 		err = do_one_pass(journal, &info, PASS_REVOKE);
@@ -327,6 +331,9 @@ int jbd2_journal_recover(journal_t *journal)
 
 	jbd2_journal_clear_revoke(journal);
 	err2 = sync_blockdev(journal->j_fs_dev);
+	if (!err)
+		err = err2;
+	err2 = errseq_check_and_advance(&mapping->wb_err, &wb_err);
 	if (!err)
 		err = err2;
 	/* Make sure all replayed data is on permanent storage */
-- 
2.39.2