Subject: Re: [Bug report] journal data mode trigger panic in jbd2_journal_commit_transaction
From: yangerkun
To: Mauricio Oliveira
CC: "Theodore Y. Ts'o", Jan Kara, "zhangyi (F)", Hou Tao, Ye Bin
References: <68b9650e-bef2-69e2-ab5e-8aaddaf46cfe@huawei.com> <17d7ecde-5fda-cd03-6fef-e7b8250489f9@huawei.com>
Message-ID: <14879a89-b6d2-e142-2ea3-23fbb041444b@huawei.com>
In-Reply-To: <17d7ecde-5fda-cd03-6fef-e7b8250489f9@huawei.com>
Date: Fri, 20 Nov 2020 11:03:06 +0800
X-Mailing-List: linux-ext4@vger.kernel.org

On 2020/11/20 10:54, yangerkun wrote:
>
>
> On 2020/11/19 21:12, Mauricio Oliveira wrote:
>> On Thu, Nov 19, 2020 at 1:25 AM yangerkun wrote:
>>>
>>>
>>>
>>> On 2020/11/16 21:50, Mauricio Oliveira wrote:
>>>> Hi Kun,
>>>>
>>>> On Sat, Nov 14, 2020 at 5:18 AM yangerkun wrote:
>>>>> While using ext4 with data=journal (3.10 kernel), we met a problem
>>>>> that we thought could never happen...
>>>> [...]
>>>>
>>>> Could you please confirm you mean 5.10-rc* kernel instead of 3.10?
>>>> (It seems so as you mention a recent commit below.) Thanks!
>>>>
>>>>> For now, what I have seen that can dirty a buffer directly is
>>>>> ext4_page_mkwrite (64a9f1449950 ("ext4: data=journal: fixes for
>>>>> ext4_page_mkwrite()")), and running ext4_punch_hole with keep_size
>>>>> in parallel with ext4_page_mkwrite can trigger the above warning
>>>>> easily.
>>>> [...]
>>>>
>>>>
>>>
>>> Hi,
>>>
>>> Sorry for the long delay in replying... And thanks a lot for your
>>> advice! The bug triggers with a very low probability, so failing to
>>> trigger it on 5.10 does not prove that the bug is absent from 5.10.
>>>
>>
>> No worries, and thanks for following up.
>> So I understand that the bug report was indeed on 3.10, and 5.10-rcN
>> is not yet confirmed.
>>
>>> I googled a lot and noticed that someone has reported the same bug before[1].
>>> '3b136499e906 ("ext4: fix data corruption in data=journal mode")' seems
>>> to fix the problem. I will try to understand this, and give an analysis
>>> of how to reproduce it!
>>
>> Cool, thanks!
>>
>>> Thanks,
>>> Kun.
>>
>>
>>
> Hi,
>
> The following steps easily reproduce the bug[1] reported before, and the
> bug we met seems to be the same one. The following patches fix the bug:
>
> 3b136499e906 ext4: fix data corruption in data=journal mode
> b90197b65518 ext4: use private version of page_zero_new_buffers() for
>              data=journal mode
>
>
> 1. mkfs.ext4
> 2. touch $tofile (ino == 12)
> 3. touch $fromfile (ino == 13), write 4k to $fromfile, and sync
>
> mmap $fromfile 4k
> and write 4k
> to $tofile
>
> ...
> generic_perform_write
>  ext4_write_begin
>   ext4_journal_start
>   (trans 1)
>  if (ino == 12) sleep for 30s
>  ...                           truncate $fromfile
>                                to 0
>  copied=0,bytes=4k
>  ext4_journalled_write_end
>   page_zero_new_buffers
>    mark_buffer_dirty
>   write_end_fn
>    ...
>    __jbd2_journal_file_buffer
>     test_clear_buffer_dirty
>     __jbd2_journal_temp_unlink_buffer this will mark buffer dirty again!
>   ext4_journal_stop
>   (trans 1)
>                                                  trans1 commit
>                                                  ...
>   ext4_truncate_failed_write
>    ...
>    journal_unmap_buffer
>     set_buffer_freed
>                                                  forget list
>                                                   ...
>                                                   clear_buffer_jbddirty
>                                                   ...
>                                                   J_ASSERT_BH(bh,
>                                                   !buffer_dirty(bh))
>                                                   ^^^^^^^^^^^^^^^^^
>                                                   trigger the bug...
>
>
>
> [1]. https://www.spinics.net/lists/linux-ext4/msg56447.html
>
> Thanks,
> Kun.
> .
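
For readers following the trace above, here is a minimal sketch that replays the flag changes it describes on a toy buffer. This is not kernel code: ToyBuffer, TRACE_STEPS, replay and commit_assert_holds are hypothetical names made up for illustration, and each effect is taken from the annotations in the trace (for example "this will mark buffer dirty again!"), not from reading the jbd2 source. It only shows why the final J_ASSERT_BH(bh, !buffer_dirty(bh)) check would fail if the flags evolve as the trace says.

```python
from dataclasses import dataclass


@dataclass
class ToyBuffer:
    """Toy stand-in for a buffer_head; holds only the flags named in the trace."""
    dirty: bool = False
    jbddirty: bool = False
    freed: bool = False


def set_dirty(bh):
    bh.dirty = True


def clear_dirty(bh):
    bh.dirty = False


def set_freed(bh):
    bh.freed = True


def clear_jbddirty(bh):
    bh.jbddirty = False


# (context, step name as written in the trace, assumed effect on the toy buffer)
TRACE_STEPS = [
    ("write", "page_zero_new_buffers -> mark_buffer_dirty", set_dirty),
    ("write", "__jbd2_journal_file_buffer -> test_clear_buffer_dirty", clear_dirty),
    # Effect assumed from the trace's own note: "this will mark buffer dirty again!"
    ("write", "__jbd2_journal_temp_unlink_buffer", set_dirty),
    ("write", "journal_unmap_buffer -> set_buffer_freed", set_freed),
    ("commit", "forget list -> clear_buffer_jbddirty", clear_jbddirty),
]


def replay(steps):
    """Apply each step in order and record the flags after every step."""
    bh = ToyBuffer()
    log = []
    for context, name, effect in steps:
        effect(bh)
        log.append((context, name, bh.dirty, bh.jbddirty, bh.freed))
    return bh, log


def commit_assert_holds(bh):
    """Models the condition in J_ASSERT_BH(bh, !buffer_dirty(bh))."""
    return not bh.dirty


if __name__ == "__main__":
    final, log = replay(TRACE_STEPS)
    for context, name, dirty, jbddirty, freed in log:
        print(f"{context:6} {name:55} dirty={dirty} jbddirty={jbddirty} freed={freed}")
    print("assertion holds:", commit_assert_holds(final))
```

In this model the dirty flag is still set when the commit side reaches the check, so the modelled assertion does not hold. Removing the __jbd2_journal_temp_unlink_buffer step from the list leaves the flag clear, which is only a property of the toy model and says nothing about how the two patches listed above actually fix the problem.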