2022-03-16 02:11:42

by Ritesh Harjani

Subject: [PATCHv2 3/4] generic/676: Add a new shutdown recovery test

In certain cases (observed with the ext4 fast_commit feature), the replay phase
may not delete the right range of blocks after a sudden FS shutdown,
because some operations depend on inode->i_size (which, during replay of
an inode with fast_commit, could be 0 for some time).
This fstest is added to test such scenarios on all generic filesystems.

This test case is based on the test case shared by Xin Yin.

Signed-off-by: Ritesh Harjani <[email protected]>
---
tests/generic/676 | 72 +++++++++++++++++++++++++++++++++++++++++++
tests/generic/676.out | 7 +++++
2 files changed, 79 insertions(+)
create mode 100755 tests/generic/676
create mode 100644 tests/generic/676.out

diff --git a/tests/generic/676 b/tests/generic/676
new file mode 100755
index 00000000..315edcdf
--- /dev/null
+++ b/tests/generic/676
@@ -0,0 +1,72 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2022 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 676
+#
+# This test with ext4 fast_commit feature w/o below patch missed to delete the right
+# range during replay phase, since it depends upon inode->i_size (which might not be
+# stable during replay phase, at least for ext4).
+# 0b5b5a62b945a141: ext4: use ext4_ext_remove_space() for fast commit replay delete range
+# (Based on test case shared by Xin Yin <[email protected]>)
+#
+
+. ./common/preamble
+_begin_fstest auto shutdown quick log recoveryloop
+
+# Override the default cleanup function.
+_cleanup()
+{
+ cd /
+ rm -r -f $tmp.*
+ _scratch_unmount > /dev/null 2>&1
+}
+
+# Import common functions.
+. ./common/filter
+. ./common/punch
+
+# real QA test starts here
+
+# Modify as appropriate.
+_supported_fs generic
+_require_scratch
+_require_xfs_io_command "fpunch"
+_require_xfs_io_command "fzero"
+_require_xfs_io_command "fiemap"
+
+t1=$SCRATCH_MNT/foo
+t2=$SCRATCH_MNT/bar
+
+_scratch_mkfs > $seqres.full 2>&1
+
+_scratch_mount >> $seqres.full 2>&1
+
+bs=$(_get_block_size $SCRATCH_MNT)
+
+# create and write data to t1
+$XFS_IO_PROG -f -c "pwrite 0 $((100*$bs))" $t1 | _filter_xfs_io_numbers
+
+# fzero certain range in between with -k
+$XFS_IO_PROG -c "fzero -k $((40*$bs)) $((20*$bs))" $t1
+
+# create and fsync a new file t2
+$XFS_IO_PROG -f -c "fsync" $t2
+
+# fpunch within the i_size of a file
+$XFS_IO_PROG -c "fpunch $((30*$bs)) $((20*$bs))" $t1
+
+# fsync t1 to trigger journal operation
+$XFS_IO_PROG -c "fsync" $t1
+
+# shutdown FS now for replay journal to kick in next mount
+_scratch_shutdown -v >> $seqres.full 2>&1
+
+_scratch_cycle_mount
+
+# check fiemap reported is valid or not
+$XFS_IO_PROG -c "fiemap -v" $t1 | _filter_fiemap_flags $bs
+
+# success, all done
+status=0
+exit
diff --git a/tests/generic/676.out b/tests/generic/676.out
new file mode 100644
index 00000000..78375940
--- /dev/null
+++ b/tests/generic/676.out
@@ -0,0 +1,7 @@
+QA output created by 676
+wrote XXXX/XXXX bytes at offset XXXX
+XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+0: [0..29]: none
+1: [30..49]: hole
+2: [50..59]: unwritten
+3: [60..99]: nonelast
--
2.31.1
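
The expected extent layout in 676.out follows from the block arithmetic in the test: 100 blocks written, then blocks 40..59 zeroed with fzero -k, then blocks 30..49 punched. The sketch below is only a hypothetical model of that arithmetic in plain bash. It is not part of fstests, it does not touch a filesystem, and the names `mark_range` and `print_layout` are made up for illustration. It assumes fzero turns written blocks into unwritten extents, as the golden output implies.

```shell
#!/bin/bash
# Hypothetical model of the test's block arithmetic; not part of fstests.
# All offsets and lengths are in units of the file block size.

nblocks=100
declare -a state

# pwrite 0 100*bs: every block holds written data
for ((i = 0; i < nblocks; i++)); do
	state[i]=data
done

# mark_range START LEN NEW_STATE: set the state of LEN blocks from START
mark_range()
{
	local start=$1 len=$2 new=$3 i
	for ((i = start; i < start + len; i++)); do
		state[i]=$new
	done
}

mark_range 40 20 unwritten	# models: fzero -k 40*bs 20*bs
mark_range 30 20 hole		# models: fpunch 30*bs 20*bs

# print_layout: coalesce adjacent blocks in the same state into ranges
print_layout()
{
	local start=0 i
	for ((i = 1; i <= nblocks; i++)); do
		if ((i == nblocks)) || [ "${state[i]}" != "${state[start]}" ]; then
			echo "[$start..$((i - 1))]: ${state[start]}"
			start=$i
		fi
	done
}

print_layout
```

The model prints four ranges: [0..29] data, [30..49] hole, [50..59] unwritten, [60..99] data. These line up with the four extents in 676.out, where the two data extents appear with the flags "none" and "nonelast". The punch overlaps the first half of the zeroed range, which is why only blocks 50..59 remain unwritten.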


2022-03-16 14:57:38

by Darrick J. Wong

Subject: Re: [PATCHv2 3/4] generic/676: Add a new shutdown recovery test

On Tue, Mar 15, 2022 at 07:58:58PM +0530, Ritesh Harjani wrote:
> In certain cases (it is noted with ext4 fast_commit feature) that, replay phase
> may not delete the right range of blocks (after sudden FS shutdown)
> due to some operations which depends on inode->i_size (which during replay of
> an inode with fast_commit could be 0 for sometime).
> This fstest is added to test for such scenarios for all generic fs.
>
> This test case is based on the test case shared via Xin Yin.
>
> Signed-off-by: Ritesh Harjani <[email protected]>
> ---
> tests/generic/676 | 72 +++++++++++++++++++++++++++++++++++++++++++
> tests/generic/676.out | 7 +++++
> 2 files changed, 79 insertions(+)
> create mode 100755 tests/generic/676
> create mode 100644 tests/generic/676.out
>
> diff --git a/tests/generic/676 b/tests/generic/676
> new file mode 100755
> index 00000000..315edcdf
> --- /dev/null
> +++ b/tests/generic/676
> @@ -0,0 +1,72 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2022 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 676
> +#
> +# This test with ext4 fast_commit feature w/o below patch missed to delete the right
> +# range during replay phase, since it depends upon inode->i_size (which might not be
> +# stable during replay phase, at least for ext4).
> +# 0b5b5a62b945a141: ext4: use ext4_ext_remove_space() for fast commit replay delete range
> +# (Based on test case shared by Xin Yin <[email protected]>)
> +#
> +
> +. ./common/preamble
> +_begin_fstest auto shutdown quick log recoveryloop

This isn't a looping recovery test. Maybe we should create a 'recovery'
group for tests that only run once? I think we already have a few
fstests like that.

> +
> +# Override the default cleanup function.
> +_cleanup()
> +{
> + cd /
> + rm -r -f $tmp.*
> + _scratch_unmount > /dev/null 2>&1

I think the test harness does this for you already, right?

> +}
> +
> +# Import common functions.
> +. ./common/filter
> +. ./common/punch
> +
> +# real QA test starts here
> +
> +# Modify as appropriate.
> +_supported_fs generic
> +_require_scratch
> +_require_xfs_io_command "fpunch"
> +_require_xfs_io_command "fzero"
> +_require_xfs_io_command "fiemap"

_require_scratch_shutdown

> +
> +t1=$SCRATCH_MNT/foo
> +t2=$SCRATCH_MNT/bar
> +
> +_scratch_mkfs > $seqres.full 2>&1
> +
> +_scratch_mount >> $seqres.full 2>&1
> +
> +bs=$(_get_block_size $SCRATCH_MNT)

_get_file_block_size, in case the file allocation unit isn't the same as
the fs blocksize? (e.g. bigalloc, xfs realtime, etc.)

--D

> +
> +# create and write data to t1
> +$XFS_IO_PROG -f -c "pwrite 0 $((100*$bs))" $t1 | _filter_xfs_io_numbers
> +
> +# fzero certain range in between with -k
> +$XFS_IO_PROG -c "fzero -k $((40*$bs)) $((20*$bs))" $t1
> +
> +# create and fsync a new file t2
> +$XFS_IO_PROG -f -c "fsync" $t2
> +
> +# fpunch within the i_size of a file
> +$XFS_IO_PROG -c "fpunch $((30*$bs)) $((20*$bs))" $t1
> +
> +# fsync t1 to trigger journal operation
> +$XFS_IO_PROG -c "fsync" $t1
> +
> +# shutdown FS now for replay journal to kick in next mount
> +_scratch_shutdown -v >> $seqres.full 2>&1
> +
> +_scratch_cycle_mount
> +
> +# check fiemap reported is valid or not
> +$XFS_IO_PROG -c "fiemap -v" $t1 | _filter_fiemap_flags $bs
> +
> +# success, all done
> +status=0
> +exit
> diff --git a/tests/generic/676.out b/tests/generic/676.out
> new file mode 100644
> index 00000000..78375940
> --- /dev/null
> +++ b/tests/generic/676.out
> @@ -0,0 +1,7 @@
> +QA output created by 676
> +wrote XXXX/XXXX bytes at offset XXXX
> +XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> +0: [0..29]: none
> +1: [30..49]: hole
> +2: [50..59]: unwritten
> +3: [60..99]: nonelast
> --
> 2.31.1
>

2022-03-29 12:04:30

by Ritesh Harjani

Subject: Re: [PATCHv2 3/4] generic/676: Add a new shutdown recovery test

On 22/03/15 09:55AM, Darrick J. Wong wrote:
> On Tue, Mar 15, 2022 at 07:58:58PM +0530, Ritesh Harjani wrote:
> > In certain cases (it is noted with ext4 fast_commit feature) that, replay phase
> > may not delete the right range of blocks (after sudden FS shutdown)
> > due to some operations which depends on inode->i_size (which during replay of
> > an inode with fast_commit could be 0 for sometime).
> > This fstest is added to test for such scenarios for all generic fs.
> >
> > This test case is based on the test case shared via Xin Yin.
> >
> > Signed-off-by: Ritesh Harjani <[email protected]>
> > ---
> > tests/generic/676 | 72 +++++++++++++++++++++++++++++++++++++++++++
> > tests/generic/676.out | 7 +++++
> > 2 files changed, 79 insertions(+)
> > create mode 100755 tests/generic/676
> > create mode 100644 tests/generic/676.out
> >
> > diff --git a/tests/generic/676 b/tests/generic/676
> > new file mode 100755
> > index 00000000..315edcdf
> > --- /dev/null
> > +++ b/tests/generic/676
> > @@ -0,0 +1,72 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2022 IBM Corporation. All Rights Reserved.
> > +#
> > +# FS QA Test 676
> > +#
> > +# This test with ext4 fast_commit feature w/o below patch missed to delete the right
> > +# range during replay phase, since it depends upon inode->i_size (which might not be
> > +# stable during replay phase, at least for ext4).
> > +# 0b5b5a62b945a141: ext4: use ext4_ext_remove_space() for fast commit replay delete range
> > +# (Based on test case shared by Xin Yin <[email protected]>)
> > +#
> > +
> > +. ./common/preamble
> > +_begin_fstest auto shutdown quick log recoveryloop
>
> This isn't a looping recovery test. Maybe we should create a 'recovery'
> group for tests that only run once? I think we already have a few
> fstests like that.

I gave it some thought, but I feel it might be unnecessary.
From a developer/tester perspective, anyone who wanted to test anything related to
recovery would then have to run both the recovery and recoveryloop groups.
Thoughts?

>
> > +
> > +# Override the default cleanup function.
> > +_cleanup()
> > +{
> > + cd /
> > + rm -r -f $tmp.*
> > + _scratch_unmount > /dev/null 2>&1
>
> I think the test harness does this for you already, right?

It does look like, after running a test, run_section() in the check script
will do _test_unmount and _scratch_unmount by default.
But I still feel it's better if each individual test cleans up whatever it did
during the run in its own cleanup routine, before exiting.

>
> > +}
> > +
> > +# Import common functions.
> > +. ./common/filter
> > +. ./common/punch
> > +
> > +# real QA test starts here
> > +
> > +# Modify as appropriate.
> > +_supported_fs generic
> > +_require_scratch
> > +_require_xfs_io_command "fpunch"
> > +_require_xfs_io_command "fzero"
> > +_require_xfs_io_command "fiemap"
>
> _require_scratch_shutdown
>
> > +
> > +t1=$SCRATCH_MNT/foo
> > +t2=$SCRATCH_MNT/bar
> > +
> > +_scratch_mkfs > $seqres.full 2>&1
> > +
> > +_scratch_mount >> $seqres.full 2>&1
> > +
> > +bs=$(_get_block_size $SCRATCH_MNT)
>
> _get_file_block_size, in case the file allocation unit isn't the same as
> the fs blocksize? (e.g. bigalloc, xfs realtime, etc.)

Sure. Agreed. Will make the change.

Thanks
-ritesh

2022-03-31 09:53:47

by Ritesh Harjani

Subject: Re: [PATCHv2 3/4] generic/676: Add a new shutdown recovery test

On 22/03/29 05:02PM, Ritesh Harjani wrote:
> On 22/03/15 09:55AM, Darrick J. Wong wrote:
> > On Tue, Mar 15, 2022 at 07:58:58PM +0530, Ritesh Harjani wrote:
> > > In certain cases (it is noted with ext4 fast_commit feature) that, replay phase
> > > may not delete the right range of blocks (after sudden FS shutdown)
> > > due to some operations which depends on inode->i_size (which during replay of
> > > an inode with fast_commit could be 0 for sometime).
> > > This fstest is added to test for such scenarios for all generic fs.
> > >
> > > This test case is based on the test case shared via Xin Yin.
> > >
> > > Signed-off-by: Ritesh Harjani <[email protected]>
> > > ---
> > > tests/generic/676 | 72 +++++++++++++++++++++++++++++++++++++++++++
> > > tests/generic/676.out | 7 +++++
> > > 2 files changed, 79 insertions(+)
> > > create mode 100755 tests/generic/676
> > > create mode 100644 tests/generic/676.out
> > >
> > > diff --git a/tests/generic/676 b/tests/generic/676
> > > new file mode 100755
> > > index 00000000..315edcdf
> > > --- /dev/null
> > > +++ b/tests/generic/676
> > > @@ -0,0 +1,72 @@
> > > +#! /bin/bash
> > > +# SPDX-License-Identifier: GPL-2.0
> > > +# Copyright (c) 2022 IBM Corporation. All Rights Reserved.
> > > +#
> > > +# FS QA Test 676
> > > +#
> > > +# This test with ext4 fast_commit feature w/o below patch missed to delete the right
> > > +# range during replay phase, since it depends upon inode->i_size (which might not be
> > > +# stable during replay phase, at least for ext4).
> > > +# 0b5b5a62b945a141: ext4: use ext4_ext_remove_space() for fast commit replay delete range
> > > +# (Based on test case shared by Xin Yin <[email protected]>)
> > > +#
> > > +
> > > +. ./common/preamble
> > > +_begin_fstest auto shutdown quick log recoveryloop
> >
> > This isn't a looping recovery test. Maybe we should create a 'recovery'
> > group for tests that only run once? I think we already have a few
> > fstests like that.
>
> I gave it a thought, but I feel it might be unncessary.
> From a developer/tester perspective who wanted to test anything related to
> recovery would then have to use both recovery and recoveryloop.
> Thoughts?
>
> >
> > > +
> > > +# Override the default cleanup function.
> > > +_cleanup()
> > > +{
> > > + cd /
> > > + rm -r -f $tmp.*
> > > + _scratch_unmount > /dev/null 2>&1
> >
> > I think the test harness does this for you already, right?

Ok, I agree with this. I will remove the _scratch_unmount call
from these two new tests in v3.

-ritesh