From: Amir Goldstein
Subject: Re: [PATCH] ext4: fix interaction between i_size, fallocate, and delalloc after a crash
Date: Tue, 17 Oct 2017 00:11:40 +0300
To: Ashlie Martinez
Cc: Xiao Yang, Theodore Tso, Eryu Guan, Josef Bacik, fstests, Ext4, Vijay Chidambaram
References: <1503830683-21455-1-git-send-email-amir73il@gmail.com>
 <59C8D147.1060608@cn.fujitsu.com>
 <59D5DEE0.6080506@cn.fujitsu.com>
 <20171007032917.bntgnubthdstmrrt@thunk.org>
 <59DDFC47.3050300@cn.fujitsu.com>
Sender: fstests-owner@vger.kernel.org
List-Id: linux-ext4.vger.kernel.org

On Mon, Oct 16, 2017 at 10:32 PM, Ashlie Martinez wrote:
> Amir,
>
> I know this is a bit late, but I've spent some time working through
> the disk image that you provided (so that I could determine how/if I
> could modify CrashMonkey to catch errors like this) and I don't think
> I understand what state the disk image reflects.

The disk image SHOULD reflect the state of a disk after the power was
cut while the fs was mounted. Then power came back on, the filesystem
was mounted, the journal was recovered, and then the filesystem was
cleanly unmounted. At this stage, I don't expect there to be anything
interesting in the journal.

> After digging around
> the journal of the disk image you provided, I found that the first 10
> journal blocks are used, with the journal superblock being placed in
> the very first block of the journal. The journal superblock says that
> the first journal transaction ID that should be in the journal is
> transaction ID 4. However, dumping the other journal blocks, I found
> that the next block is a descriptor block for transaction ID 2. The
> rest of the journal blocks are data blocks for that transaction plus a
> transaction commit block.
> This seems a little odd considering that the
> journal refers to a 4th transaction, which I have not been able to
> find (I quickly dumped the first 50 blocks in debugfs and found the
> rest to contain only zeros).
>

I did not spend time analyzing the image, so I'll take your word for
it, but I can't help you understand your findings.

> With this in mind, I looked back at the xfstests code for controlling
> the dm_flakey device. What I realized is the `nolockfs` flag is
> provided both when it switches from the real device to the flakey
> device that drops writes and when it switches from the flakey device
> back to the real device. I know there is a call to umount once the
> flakey device that drops writes is inserted, but do you think it is
> possible that the flakey device is swapped back to the real device
> before all the writes forced out by umount have made it to the flakey
> device?

I believe the umount call should block until all writes have been
flushed out to the flakey device.

> Unfortunately I still don't have a local machine that is
> capable of reproducing your test results and I have not made any gce
> test appliance images to test this yet, so I'm not sure if this is a
> valid theory.
>

Ted explained that the bug is related to a very specific timing of the
flusher thread vs. the fallocate thread.
I was under the impression that CrashMonkey can only reorder writes
between recorded FLUSH requests, so I am not really sure how you
intend to modify CrashMonkey to catch this bug.

Cheers,
Amir.