From: Amir Goldstein
Subject: Re: [RFC PATCH] fstests: Check if a fs can survive random (emulated) power loss
Date: Mon, 26 Feb 2018 10:45:11 +0200
References: <20180226073111.3066-1-wqu@suse.com> <5c46dfaa-296e-4882-5205-13a2a6739d79@gmx.com>
In-Reply-To: <5c46dfaa-296e-4882-5205-13a2a6739d79@gmx.com>
To: Qu Wenruo
Cc: Qu Wenruo, fstests, Linux Btrfs, linux-xfs, Ext4, Josef Bacik
Sender: linux-btrfs-owner@vger.kernel.org
List-Id: linux-ext4.vger.kernel.org

On Mon, Feb 26, 2018 at 10:41 AM, Qu Wenruo wrote:
>
>
> On 2018-02-26 16:33, Amir Goldstein wrote:
>> On Mon, Feb 26, 2018 at 10:20 AM, Qu Wenruo wrote:
>>>
>>>
>>> On 2018-02-26 16:15, Amir Goldstein wrote:
>>>> On Mon, Feb 26, 2018 at 9:31 AM, Qu Wenruo wrote:
>>>>> This test case is originally designed to expose unexpected corruption
>>>>> for btrfs, where there are several reports about btrfs serious metadata
>>>>> corruption after power loss.
>>>>>
>>>>> The test case itself will trigger heavy fsstress for the fs, and use
>>>>> dm-flakey to emulate power loss by dropping all later writes.
>>>>>
>>>>
>>>> Come on... dm-flakey is so 2016
>>>> You should take Josef's fsstress+log-writes test and bring it to fstests:
>>>> https://github.com/josefbacik/log-writes
>>>>
>>>> By doing that you will gain two very important features from the test:
>>>>
>>>> 1. Problems will be discovered much faster, because the test can run fsck
>>>>    after every single block write has been replayed instead of just at random
>>>>    times like in your test
>>>
>>> That's what exactly I want!!!
>>>
>>> Great thanks for this one! I would definitely look into this.
>>> (Although the initial commit is even older than 2016)
>>>
>>
>> Please note that Josef's replay-individual-faster.sh script runs fsck
>> every 1000 writes (i.e. --check 1000), so you can play with this argument
>> in your test. Can also run --fsck every --check fua or --check flush, which
>> may be more indicative of real world problems. Not sure.
>>
>>>
>>> But the test itself could already expose something on EXT4, it still
>>> makes some sense for ext4 developers as a verification test case.
>>>
>>
>> Please take a look at generic/456.
>> When generic/455 found a reproducible problem in ext4,
>> I created a specific test without any randomness to pinpoint the
>> problem found (using dm-flakey).
>> If the problem you found is reproducible, then it will be easy for you
>> to create a similar "bisected" test.
>
> Yep, it's definitely needed for a pin-point test case, but I'm also
> wondering if a random, stress test could also help.
>
> Test case with plain fsstress is already super helpful to expose some
> bugs, such stress test won't hurt.
>

Yes, but the same stress test with dm-log-writes instead of dm-flakey will
be at least as useful, and usually much more so, so there is no reason to
merge the less useful stress test.

Thanks,
Amir.
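
P.S. To illustrate why the check granularity matters, here is a toy model
of "replay the write log, then check". It is only a sketch under stated
assumptions: the block device is a Python dict, the "fsck" is a made-up
invariant (sector 0 names a sector that must already be written), and
LogEntry, replay() and the every/on parameters are hypothetical names that
merely mimic the idea of "--check N", "--check fua" and "--check flush".
It does not use or describe the real replay tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class LogEntry:
    """One logged write (hypothetical format, not the dm-log-writes layout)."""
    sector: int
    data: bytes
    flush: bool = False
    fua: bool = False


def replay(log: List[LogEntry],
           check: Callable[[Dict[int, bytes]], bool],
           every: Optional[int] = None,
           on: Optional[str] = None) -> List[int]:
    """Replay writes in log order onto an empty device.

    The check runs after every `every`-th write, and/or after each entry
    carrying the flag named by `on` ("flush" or "fua"). Returns the
    1-based entry numbers at which the check failed.
    """
    device: Dict[int, bytes] = {}
    failures: List[int] = []
    for index, entry in enumerate(log, start=1):
        device[entry.sector] = entry.data
        due = (
            (every is not None and index % every == 0)
            or (on == "flush" and entry.flush)
            or (on == "fua" and entry.fua)
        )
        if due and not check(device):
            failures.append(index)
    return failures


def toy_fsck(device: Dict[int, bytes]) -> bool:
    """Made-up invariant: sector 0 must point at a sector that exists."""
    if 0 not in device:
        return True  # nothing written yet, trivially consistent
    return int(device[0].decode()) in device


# A deliberately buggy write order: the pointer in sector 0 is updated
# before the sector it points to has been written.
LOG = [
    LogEntry(0, b"5"),
    LogEntry(5, b"tree", fua=True),
    LogEntry(0, b"7"),
    LogEntry(7, b"tree2", flush=True),
]

if __name__ == "__main__":
    print(replay(LOG, toy_fsck, every=1))     # [1, 3]
    print(replay(LOG, toy_fsck, every=2))     # []
    print(replay(LOG, toy_fsck, on="fua"))    # []
    print(replay(LOG, toy_fsck, on="flush"))  # []
```

In this toy log, checking after every write sees both transient bad states,
while the coarser schedules happen to land only on consistent ones. Which
schedule is the right trade-off for a real test is a separate question.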