2017-05-31 13:08:28

by Jeff Layton

[permalink] [raw]
Subject: [xfstests PATCH v3 0/5] add a test for reporting writeback errors across all fds on fsync

This patchset is a companion to the Linux kernel patch series I recently
posted with the cover letter:

[PATCH v5 00/17] fs: introduce new writeback error reporting and convert ext2 and ext4 to use it

That patchset adds a new userland-visible change to report errors on
all open file descriptions when there is an error on fsync, not just
the first one to race in.

Note that this set contains a patch to emulate $SCRATCH_LOGDEV on btrfs,
but the kernel patches for that are not quite ready yet. The test did
pass on btrfs in an earlier incarnation of the set, however.

Jeff Layton (5):
generic: add a writeback error handling test
ext4: allow ext4 to use $SCRATCH_LOGDEV
generic: test writeback error handling on dmerror devices
ext3: allow it to put journal on a separate device when doing
scratch_mkfs
btrfs: allow it to use $SCRATCH_LOGDEV

common/dmerror | 13 ++--
common/rc | 16 ++++-
doc/auxiliary-programs.txt | 8 +++
src/Makefile | 2 +-
src/fsync-err.c | 161 +++++++++++++++++++++++++++++++++++++++++++++
tests/generic/998 | 64 ++++++++++++++++++
tests/generic/998.out | 2 +
tests/generic/999 | 76 +++++++++++++++++++++
tests/generic/999.out | 3 +
tests/generic/group | 2 +
tools/dmerror | 44 +++++++++++++
11 files changed, 384 insertions(+), 7 deletions(-)
create mode 100644 src/fsync-err.c
create mode 100755 tests/generic/998
create mode 100644 tests/generic/998.out
create mode 100755 tests/generic/999
create mode 100644 tests/generic/999.out
create mode 100755 tools/dmerror

--
2.9.4


2017-05-31 13:08:16

by Jeff Layton

[permalink] [raw]
Subject: [xfstests PATCH v3 1/5] generic: add a writeback error handling test

I'm working on a set of kernel patches to change how writeback errors
are handled and reported in the kernel. Instead of reporting a
writeback error to only the first fsync caller on the file, I aim
to make the kernel report them once on every file description.

This patch adds a test for the new behavior. Basically, open many fds
to the same file, turn on dm_error, write to each of the fds, and then
fsync them all to ensure that they all get an error back.

To do that, I'm adding a new tools/dmerror script that the C program
can use to load the error table. For now, that's all it can do, but
we can fill it out with other commands as necessary.

Signed-off-by: Jeff Layton <[email protected]>
---
common/dmerror | 13 ++--
doc/auxiliary-programs.txt | 8 +++
src/Makefile | 2 +-
src/fsync-err.c | 161 +++++++++++++++++++++++++++++++++++++++++++++
tests/generic/999 | 76 +++++++++++++++++++++
tests/generic/999.out | 3 +
tests/generic/group | 1 +
tools/dmerror | 44 +++++++++++++
8 files changed, 302 insertions(+), 6 deletions(-)
create mode 100644 src/fsync-err.c
create mode 100755 tests/generic/999
create mode 100644 tests/generic/999.out
create mode 100755 tools/dmerror

diff --git a/common/dmerror b/common/dmerror
index d46c5d0b7266..238baa213b1f 100644
--- a/common/dmerror
+++ b/common/dmerror
@@ -23,22 +23,25 @@ if [ $? -eq 0 ]; then
_notrun "Cannot run tests with DAX on dmerror devices"
fi

-_dmerror_init()
+_dmerror_setup()
{
local dm_backing_dev=$SCRATCH_DEV

- $DMSETUP_PROG remove error-test > /dev/null 2>&1
-
local blk_dev_size=`blockdev --getsz $dm_backing_dev`

DMERROR_DEV='/dev/mapper/error-test'

DMLINEAR_TABLE="0 $blk_dev_size linear $dm_backing_dev 0"

+ DMERROR_TABLE="0 $blk_dev_size error $dm_backing_dev 0"
+}
+
+_dmerror_init()
+{
+ _dmerror_setup
+ $DMSETUP_PROG remove error-test > /dev/null 2>&1
$DMSETUP_PROG create error-test --table "$DMLINEAR_TABLE" || \
_fatal "failed to create dm linear device"
-
- DMERROR_TABLE="0 $blk_dev_size error $dm_backing_dev 0"
}

_dmerror_mount()
diff --git a/doc/auxiliary-programs.txt b/doc/auxiliary-programs.txt
index 21ef118596b6..191ac0596511 100644
--- a/doc/auxiliary-programs.txt
+++ b/doc/auxiliary-programs.txt
@@ -16,6 +16,7 @@ note the dependency with:
Contents:

- af_unix -- Create an AF_UNIX socket
+ - fsync-err -- tests fsync error reporting after failed writeback
- open_by_handle -- open_by_handle_at syscall exercise
- stat_test -- statx syscall exercise
- t_dir_type -- print directory entries and their file type
@@ -30,6 +31,13 @@ af_unix

The af_unix program creates an AF_UNIX socket at the given location.

+fsync-err
+ Specialized program for testing how the kernel reports errors that
+ occur during writeback. Works in conjunction with the dmerror script
+ in tools/ to write data to a device, and then force it to fail
+ writeback and test that errors are reported during fsync and cleared
+ afterward.
+
open_by_handle

The open_by_handle program exercises the open_by_handle_at() system
diff --git a/src/Makefile b/src/Makefile
index 4ec01975f8f7..b79c4d84d31b 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -13,7 +13,7 @@ TARGETS = dirstress fill fill2 getpagesize holes lstat64 \
multi_open_unlink dmiperf unwritten_sync genhashnames t_holes \
t_mmap_writev t_truncate_cmtime dirhash_collide t_rename_overwrite \
holetest t_truncate_self t_mmap_dio af_unix t_mmap_stale_pmd \
- t_mmap_cow_race
+ t_mmap_cow_race fsync-err

LINUX_TARGETS = xfsctl bstat t_mtab getdevicesize preallo_rw_pattern_reader \
preallo_rw_pattern_writer ftrunc trunc fs_perms testx looptest \
diff --git a/src/fsync-err.c b/src/fsync-err.c
new file mode 100644
index 000000000000..cbeb37fb1790
--- /dev/null
+++ b/src/fsync-err.c
@@ -0,0 +1,161 @@
+/*
+ * fsync-err.c: test whether writeback errors are reported to all open fds
+ * and properly cleared as expected after being seen once on each
+ *
+ * Copyright (c) 2017: Jeff Layton <[email protected]>
+ */
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <unistd.h>
+
+/*
+ * btrfs has a fixed stripewidth of 64k, so we need to write enough data to
+ * ensure that we hit both stripes.
+ *
+ * FIXME: have the test script pass in the length?
+ */
+#define BUFSIZE (65 * 1024)
+
+/* FIXME: should this be tunable */
+#define NUM_FDS 10
+
+static void usage() {
+ fprintf(stderr, "Usage: fsync-err <filename>\n");
+}
+
+int main(int argc, char **argv)
+{
+ int fd[NUM_FDS], ret, i;
+ char *fname, *buf;
+
+ if (argc < 1) {
+ usage();
+ return 1;
+ }
+
+ /* First argument is filename */
+ fname = argv[1];
+
+ for (i = 0; i < NUM_FDS; ++i) {
+ fd[i] = open(fname, O_WRONLY | O_CREAT | O_TRUNC, 0644);
+ if (fd[i] < 0) {
+ printf("open of fd[%d] failed: %m\n", i);
+ return 1;
+ }
+ }
+
+ buf = malloc(BUFSIZE);
+ if (!buf) {
+ printf("malloc failed: %m\n");
+ return 1;
+ }
+
+ memset(buf, 0x7c, BUFSIZE);
+
+ for (i = 0; i < NUM_FDS; ++i) {
+ ret = write(fd[i], buf, BUFSIZE);
+ if (ret < 0) {
+ printf("First write on fd[%d] failed: %m\n", i);
+ return 1;
+ }
+ }
+
+ for (i = 0; i < NUM_FDS; ++i) {
+ ret = fsync(fd[i]);
+ if (ret < 0) {
+ printf("First fsync on fd[%d] failed: %m\n", i);
+ return 1;
+ }
+ }
+
+ /* flip the device to non-working mode */
+ ret = system("./tools/dmerror load_error_table");
+ if (ret) {
+ if (WIFEXITED(ret))
+ printf("system: program exited: %d\n",
+ WEXITSTATUS(ret));
+ else
+ printf("system: 0x%x\n", (int)ret);
+
+ return 1;
+ }
+
+ for (i = 0; i < NUM_FDS; ++i) {
+ ret = write(fd[i], buf, strlen(buf) + 1);
+ if (ret < 0) {
+ printf("Second write on fd[%d] failed: %m\n", i);
+ return 1;
+ }
+ }
+
+ for (i = 0; i < NUM_FDS; ++i) {
+ ret = fsync(fd[i]);
+ /* Now, we EXPECT the error! */
+ if (ret >= 0) {
+ printf("Success on second fsync on fd[%d]!\n", i);
+ return 1;
+ }
+ }
+
+ for (i = 0; i < NUM_FDS; ++i) {
+ ret = fsync(fd[i]);
+ if (ret < 0) {
+ /* Now the error should be clear */
+ printf("Third fsync on fd[%d] failed: %m\n", i);
+ return 1;
+ }
+ }
+
+ /* flip the device to working mode */
+ ret = system("./tools/dmerror load_working_table");
+ if (ret) {
+ if (WIFEXITED(ret))
+ printf("system: program exited: %d\n",
+ WEXITSTATUS(ret));
+ else
+ printf("system: 0x%x\n", (int)ret);
+
+ return 1;
+ }
+
+ for (i = 0; i < NUM_FDS; ++i) {
+ ret = fsync(fd[i]);
+ if (ret < 0) {
+ /* The error should still be clear */
+ printf("fsync after healing device on fd[%d] failed: %m\n", i);
+ return 1;
+ }
+ }
+
+ /*
+ * reopen each file one at a time to ensure the same inode stays
+ * in core. fsync each one to make sure we see no errors on a fresh
+ * open of the inode.
+ */
+ for (i = 0; i < NUM_FDS; ++i) {
+ ret = close(fd[i]);
+ if (ret < 0) {
+ printf("Close of fd[%d] returned unexpected error: %m\n", i);
+ return 1;
+ }
+ fd[i] = open(fname, O_WRONLY | O_CREAT | O_TRUNC, 0644);
+ if (fd[i] < 0) {
+ printf("Second open of fd[%d] failed: %m\n", i);
+ return 1;
+ }
+ ret = fsync(fd[i]);
+ if (ret < 0) {
+ /* New opens should not return an error */
+ printf("First fsync after reopen of fd[%d] failed: %m\n", i);
+ return 1;
+ }
+ }
+
+ printf("Test passed!\n");
+ return 0;
+}
diff --git a/tests/generic/999 b/tests/generic/999
new file mode 100755
index 000000000000..6de143d1149e
--- /dev/null
+++ b/tests/generic/999
@@ -0,0 +1,76 @@
+#! /bin/bash
+# FS QA Test No. 999
+#
+# Open a file several times, write to it, fsync on all fds and make sure that
+# they all return 0. Change the device to start throwing errors. Write again
+# on all fds and fsync on all fds. Ensure that we get errors on all of them.
+# Then fsync on all one last time and verify that all return 0.
+#
+#-----------------------------------------------------------------------
+# Copyright (c) 2017, Jeff Layton <[email protected]>
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+#-----------------------------------------------------------------------
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1 # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+ cd /
+ rm -rf $tmp.* $testdir
+ _dmerror_cleanup
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/dmerror
+
+# real QA test starts here
+_supported_os Linux
+_require_scratch
+_require_logdev
+_require_dm_target error
+_require_test_program fsync-err
+
+rm -f $seqres.full
+
+echo "Format and mount"
+$XFS_IO_PROG -d -c "pwrite -S 0x7c -b 1048576 0 $((64 * 1048576))" $SCRATCH_DEV >> $seqres.full
+_scratch_mkfs > $seqres.full 2>&1
+_dmerror_init
+_dmerror_mount >> $seqres.full 2>&1
+_dmerror_unmount
+_dmerror_mount
+
+_require_fs_space $SCRATCH_MNT 8192
+
+testfile=$SCRATCH_MNT/fsync-err-test
+
+$here/src/fsync-err $testfile
+
+# success, all done
+_dmerror_load_working_table
+_dmerror_unmount
+_dmerror_cleanup
+_repair_scratch_fs >> $seqres.full
+status=0
+exit
diff --git a/tests/generic/999.out b/tests/generic/999.out
new file mode 100644
index 000000000000..2e48492ff6d1
--- /dev/null
+++ b/tests/generic/999.out
@@ -0,0 +1,3 @@
+QA output created by 999
+Format and mount
+Test passed!
diff --git a/tests/generic/group b/tests/generic/group
index 438c299030f2..39f7b14657f1 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -440,3 +440,4 @@
435 auto encrypt
436 auto quick rw
437 auto quick
+999 auto quick
diff --git a/tools/dmerror b/tools/dmerror
new file mode 100755
index 000000000000..4aaf682ee5f9
--- /dev/null
+++ b/tools/dmerror
@@ -0,0 +1,44 @@
+#!/bin/bash
+#-----------------------------------------------------------------------
+# Copyright (c) 2017, Jeff Layton <[email protected]>
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+#-----------------------------------------------------------------------
+
+. ./common/config
+. ./common/dmerror
+
+_dmerror_setup
+
+case $1 in
+cleanup)
+ _dmerror_cleanup
+ ;;
+init)
+ _dmerror_init
+ ;;
+load_error_table)
+ _dmerror_load_error_table
+ ;;
+load_working_table)
+ _dmerror_load_working_table
+ ;;
+*)
+ echo "Usage: $0 {init|cleanup|load_error_table|load_working_table}"
+ exit 1
+ ;;
+esac
+
+status=0
+exit
--
2.9.4


2017-05-31 13:08:19

by Jeff Layton

[permalink] [raw]
Subject: [xfstests PATCH v3 4/5] ext3: allow it to put journal on a separate device when doing scratch_mkfs

Signed-off-by: Jeff Layton <[email protected]>
---
common/rc | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/common/rc b/common/rc
index 391d36f373cd..83765aacfb06 100644
--- a/common/rc
+++ b/common/rc
@@ -832,7 +832,16 @@ _scratch_mkfs()
mkfs_cmd="$MKFS_BTRFS_PROG"
mkfs_filter="cat"
;;
- ext2|ext3)
+ ext3)
+ mkfs_cmd="$MKFS_PROG -t $FSTYP -- -F"
+ mkfs_filter="grep -v -e ^Warning: -e \"^mke2fs \""
+
+ # put journal on separate device?
+ [ "$USE_EXTERNAL" = yes -a ! -z "$SCRATCH_LOGDEV" ] && \
+ $mkfs_cmd -O journal_dev $SCRATCH_LOGDEV && \
+ mkfs_cmd="$mkfs_cmd -J device=$SCRATCH_LOGDEV"
+ ;;
+ ext2)
mkfs_cmd="$MKFS_PROG -t $FSTYP -- -F"
mkfs_filter="grep -v -e ^Warning: -e \"^mke2fs \""
;;
--
2.9.4

2017-05-31 13:08:20

by Jeff Layton

[permalink] [raw]
Subject: [xfstests PATCH v3 5/5] btrfs: allow it to use $SCRATCH_LOGDEV

With btrfs, we can't really put the log on a separate device. What we
can do however is mirror the metadata across two devices and make the
data striped across all devices. When we turn on dmerror then the
metadata can fall back to using the other mirror while the data errors
out.

Note that the current incarnation of btrfs has a fixed 64k stripe
width. If that ever changes or becomes settable, we may need to adjust
the amount of data that the test program writes.

Signed-off-by: Jeff Layton <[email protected]>
---
common/rc | 2 ++
1 file changed, 2 insertions(+)

diff --git a/common/rc b/common/rc
index 83765aacfb06..078270451b53 100644
--- a/common/rc
+++ b/common/rc
@@ -830,6 +830,8 @@ _scratch_mkfs()
;;
btrfs)
mkfs_cmd="$MKFS_BTRFS_PROG"
+ [ "$USE_EXTERNAL" = yes -a ! -z "$SCRATCH_LOGDEV" ] && \
+ mkfs_cmd="$mkfs_cmd -d raid0 -m raid1 $SCRATCH_LOGDEV"
mkfs_filter="cat"
;;
ext3)
--
2.9.4

2017-05-31 13:08:18

by Jeff Layton

[permalink] [raw]
Subject: [xfstests PATCH v3 3/5] generic: test writeback error handling on dmerror devices

Ensure that we get an error back on all fds when a block device is
open by multiple writers and writeback fails.

Signed-off-by: Jeff Layton <[email protected]>
---
tests/generic/998 | 64 +++++++++++++++++++++++++++++++++++++++++++++++++++
tests/generic/998.out | 2 ++
tests/generic/group | 1 +
3 files changed, 67 insertions(+)
create mode 100755 tests/generic/998
create mode 100644 tests/generic/998.out

diff --git a/tests/generic/998 b/tests/generic/998
new file mode 100755
index 000000000000..fbadb47507c2
--- /dev/null
+++ b/tests/generic/998
@@ -0,0 +1,64 @@
+#! /bin/bash
+# FS QA Test No. 998
+#
+# Test writeback error handling when writing to block devices via pagecache.
+# See src/fsync-err.c for details of what test actually does.
+#
+#-----------------------------------------------------------------------
+# Copyright (c) 2017, Jeff Layton <[email protected]>
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+#-----------------------------------------------------------------------
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1 # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+ cd /
+ rm -rf $tmp.* $testdir
+ _dmerror_cleanup
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/dmerror
+
+# real QA test starts here
+_supported_os Linux
+_require_scratch
+_require_logdev
+_require_dm_target error
+_require_test_program fsync-err
+
+rm -f $seqres.full
+
+$XFS_IO_PROG -d -c "pwrite -S 0x7c -b 1048576 0 $((64 * 1048576))" $SCRATCH_DEV >> $seqres.full
+_dmerror_init
+
+$here/src/fsync-err $DMERROR_DEV
+
+# success, all done
+_dmerror_load_working_table
+_dmerror_cleanup
+_scratch_mkfs > $seqres.full 2>&1
+status=0
+exit
diff --git a/tests/generic/998.out b/tests/generic/998.out
new file mode 100644
index 000000000000..658c438820e2
--- /dev/null
+++ b/tests/generic/998.out
@@ -0,0 +1,2 @@
+QA output created by 998
+Test passed!
diff --git a/tests/generic/group b/tests/generic/group
index 39f7b14657f1..9fc384363ca7 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -440,4 +440,5 @@
435 auto encrypt
436 auto quick rw
437 auto quick
+998 auto quick
999 auto quick
--
2.9.4

2017-05-31 13:08:17

by Jeff Layton

[permalink] [raw]
Subject: [xfstests PATCH v3 2/5] ext4: allow ext4 to use $SCRATCH_LOGDEV

The writeback error handling test requires that you put the journal on a
separate device. This allows us to use dmerror to simulate data
writeback failure, without affecting the journal.

xfs already has infrastructure for this (a'la $SCRATCH_LOGDEV), so wire
up the ext4 code so that it can do the same thing when _scratch_mkfs is
called.

Signed-off-by: Jeff Layton <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
---
common/rc | 3 +++
1 file changed, 3 insertions(+)

diff --git a/common/rc b/common/rc
index 743df427c047..391d36f373cd 100644
--- a/common/rc
+++ b/common/rc
@@ -676,6 +676,9 @@ _scratch_mkfs_ext4()
local tmp=`mktemp`
local mkfs_status

+ [ "$USE_EXTERNAL" = yes -a ! -z "$SCRATCH_LOGDEV" ] && \
+ $mkfs_cmd -O journal_dev $SCRATCH_LOGDEV && \
+ mkfs_cmd="$mkfs_cmd -J device=$SCRATCH_LOGDEV"

_scratch_do_mkfs "$mkfs_cmd" "$mkfs_filter" $* 2>$tmp.mkfserr 1>$tmp.mkfsstd
mkfs_status=$?
--
2.9.4


2017-05-31 18:59:00

by Eduardo Valentin

[permalink] [raw]
Subject: Re: [xfstests PATCH v3 1/5] generic: add a writeback error handling test

Hello,

On Wed, May 31, 2017 at 09:08:16AM -0400, Jeff Layton wrote:
> I'm working on a set of kernel patches to change how writeback errors
> are handled and reported in the kernel. Instead of reporting a
> writeback error to only the first fsync caller on the file, I aim
> to make the kernel report them once on every file description.
>
> This patch adds a test for the new behavior. Basically, open many fds
> to the same file, turn on dm_error, write to each of the fds, and then
> fsync them all to ensure that they all get an error back.
>
> To do that, I'm adding a new tools/dmerror script that the C program
> can use to load the error table. For now, that's all it can do, but
> we can fill it out with other commands as necessary.
>
> Signed-off-by: Jeff Layton <[email protected]>
> ---
> common/dmerror | 13 ++--
> doc/auxiliary-programs.txt | 8 +++
> src/Makefile | 2 +-
> src/fsync-err.c | 161 +++++++++++++++++++++++++++++++++++++++++++++
> tests/generic/999 | 76 +++++++++++++++++++++
> tests/generic/999.out | 3 +
> tests/generic/group | 1 +
> tools/dmerror | 44 +++++++++++++
> 8 files changed, 302 insertions(+), 6 deletions(-)
> create mode 100644 src/fsync-err.c
> create mode 100755 tests/generic/999
> create mode 100644 tests/generic/999.out
> create mode 100755 tools/dmerror
>
> diff --git a/common/dmerror b/common/dmerror
> index d46c5d0b7266..238baa213b1f 100644
> --- a/common/dmerror
> +++ b/common/dmerror
> @@ -23,22 +23,25 @@ if [ $? -eq 0 ]; then
> _notrun "Cannot run tests with DAX on dmerror devices"
> fi
>
> -_dmerror_init()
> +_dmerror_setup()
> {
> local dm_backing_dev=$SCRATCH_DEV
>
> - $DMSETUP_PROG remove error-test > /dev/null 2>&1
> -
> local blk_dev_size=`blockdev --getsz $dm_backing_dev`
>
> DMERROR_DEV='/dev/mapper/error-test'
>
> DMLINEAR_TABLE="0 $blk_dev_size linear $dm_backing_dev 0"
>
> + DMERROR_TABLE="0 $blk_dev_size error $dm_backing_dev 0"
> +}
> +
> +_dmerror_init()
> +{
> + _dmerror_setup
> + $DMSETUP_PROG remove error-test > /dev/null 2>&1
> $DMSETUP_PROG create error-test --table "$DMLINEAR_TABLE" || \
> _fatal "failed to create dm linear device"
> -
> - DMERROR_TABLE="0 $blk_dev_size error $dm_backing_dev 0"
> }
>
> _dmerror_mount()
> diff --git a/doc/auxiliary-programs.txt b/doc/auxiliary-programs.txt
> index 21ef118596b6..191ac0596511 100644
> --- a/doc/auxiliary-programs.txt
> +++ b/doc/auxiliary-programs.txt
> @@ -16,6 +16,7 @@ note the dependency with:
> Contents:
>
> - af_unix -- Create an AF_UNIX socket
> + - fsync-err -- tests fsync error reporting after failed writeback
> - open_by_handle -- open_by_handle_at syscall exercise
> - stat_test -- statx syscall exercise
> - t_dir_type -- print directory entries and their file type
> @@ -30,6 +31,13 @@ af_unix
>
> The af_unix program creates an AF_UNIX socket at the given location.
>
> +fsync-err
> + Specialized program for testing how the kernel reports errors that
> + occur during writeback. Works in conjunction with the dmerror script
> + in tools/ to write data to a device, and then force it to fail
> + writeback and test that errors are reported during fsync and cleared
> + afterward.
> +
> open_by_handle
>
> The open_by_handle program exercises the open_by_handle_at() system
> diff --git a/src/Makefile b/src/Makefile
> index 4ec01975f8f7..b79c4d84d31b 100644
> --- a/src/Makefile
> +++ b/src/Makefile
> @@ -13,7 +13,7 @@ TARGETS = dirstress fill fill2 getpagesize holes lstat64 \
> multi_open_unlink dmiperf unwritten_sync genhashnames t_holes \
> t_mmap_writev t_truncate_cmtime dirhash_collide t_rename_overwrite \
> holetest t_truncate_self t_mmap_dio af_unix t_mmap_stale_pmd \
> - t_mmap_cow_race
> + t_mmap_cow_race fsync-err
>
> LINUX_TARGETS = xfsctl bstat t_mtab getdevicesize preallo_rw_pattern_reader \
> preallo_rw_pattern_writer ftrunc trunc fs_perms testx looptest \
> diff --git a/src/fsync-err.c b/src/fsync-err.c
> new file mode 100644
> index 000000000000..cbeb37fb1790
> --- /dev/null
> +++ b/src/fsync-err.c
> @@ -0,0 +1,161 @@
> +/*
> + * fsync-err.c: test whether writeback errors are reported to all open fds
> + * and properly cleared as expected after being seen once on each
> + *
> + * Copyright (c) 2017: Jeff Layton <[email protected]>
> + */
> +#include <sys/types.h>
> +#include <sys/stat.h>
> +#include <errno.h>
> +#include <fcntl.h>
> +#include <stdlib.h>
> +#include <stdio.h>
> +#include <string.h>
> +#include <unistd.h>
> +
> +/*
> + * btrfs has a fixed stripewidth of 64k, so we need to write enough data to
> + * ensure that we hit both stripes.
> + *
> + * FIXME: have the test script pass in the length?
> + */
> +#define BUFSIZE (65 * 1024)
> +
> +/* FIXME: should this be tunable */
> +#define NUM_FDS 10
> +
> +static void usage() {
> + fprintf(stderr, "Usage: fsync-err <filename>\n");
> +}

Just to follow the same style as your main:

+static void usage()
+{
+ fprintf(stderr, "Usage: fsync-err <filename>\n");
+}

> +
> +int main(int argc, char **argv)
> +{
> + int fd[NUM_FDS], ret, i;
> + char *fname, *buf;
> +
> + if (argc < 1) {
> + usage();
> + return 1;
> + }
> +
> + /* First argument is filename */
> + fname = argv[1];
> +
> + for (i = 0; i < NUM_FDS; ++i) {
> + fd[i] = open(fname, O_WRONLY | O_CREAT | O_TRUNC, 0644);
> + if (fd[i] < 0) {
> + printf("open of fd[%d] failed: %m\n", i);
> + return 1;
> + }
> + }
> +
> + buf = malloc(BUFSIZE);
> + if (!buf) {
> + printf("malloc failed: %m\n");
> + return 1;
> + }
> +
> + memset(buf, 0x7c, BUFSIZE);
> +
> + for (i = 0; i < NUM_FDS; ++i) {
> + ret = write(fd[i], buf, BUFSIZE);
> + if (ret < 0) {
> + printf("First write on fd[%d] failed: %m\n", i);
> + return 1;
> + }
> + }
> +
> + for (i = 0; i < NUM_FDS; ++i) {
> + ret = fsync(fd[i]);
> + if (ret < 0) {
> + printf("First fsync on fd[%d] failed: %m\n", i);
> + return 1;
> + }
> + }
> +
> + /* flip the device to non-working mode */
> + ret = system("./tools/dmerror load_error_table");

Just assuming a hardcoded relative path?

> + if (ret) {
> + if (WIFEXITED(ret))
> + printf("system: program exited: %d\n",
> + WEXITSTATUS(ret));
> + else
> + printf("system: 0x%x\n", (int)ret);
> +
> + return 1;
> + }
> +
> + for (i = 0; i < NUM_FDS; ++i) {
> + ret = write(fd[i], buf, strlen(buf) + 1);

Why not BUFSIZE here?

> + if (ret < 0) {
> + printf("Second write on fd[%d] failed: %m\n", i);
> + return 1;
> + }
> + }
> +
> + for (i = 0; i < NUM_FDS; ++i) {
> + ret = fsync(fd[i]);
> + /* Now, we EXPECT the error! */
> + if (ret >= 0) {
> + printf("Success on second fsync on fd[%d]!\n", i);
> + return 1;
> + }
> + }
> +
> + for (i = 0; i < NUM_FDS; ++i) {
> + ret = fsync(fd[i]);
> + if (ret < 0) {
> + /* Now the error should be clear */
> + printf("Third fsync on fd[%d] failed: %m\n", i);
> + return 1;
> + }
> + }
> +
> + /* flip the device to working mode */
> + ret = system("./tools/dmerror load_working_table");
> + if (ret) {
> + if (WIFEXITED(ret))
> + printf("system: program exited: %d\n",
> + WEXITSTATUS(ret));
> + else
> + printf("system: 0x%x\n", (int)ret);
> +
> + return 1;
> + }
> +
> + for (i = 0; i < NUM_FDS; ++i) {
> + ret = fsync(fd[i]);
> + if (ret < 0) {
> + /* The error should still be clear */
> + printf("fsync after healing device on fd[%d] failed: %m\n", i);
> + return 1;
> + }
> + }
> +
> + /*
> + * reopen each file one at a time to ensure the same inode stays
> + * in core. fsync each one to make sure we see no errors on a fresh
> + * open of the inode.
> + */
> + for (i = 0; i < NUM_FDS; ++i) {
> + ret = close(fd[i]);
> + if (ret < 0) {
> + printf("Close of fd[%d] returned unexpected error: %m\n", i);
> + return 1;
> + }
> + fd[i] = open(fname, O_WRONLY | O_CREAT | O_TRUNC, 0644);
> + if (fd[i] < 0) {
> + printf("Second open of fd[%d] failed: %m\n", i);
> + return 1;
> + }
> + ret = fsync(fd[i]);
> + if (ret < 0) {
> + /* New opens should not return an error */
> + printf("First fsync after reopen of fd[%d] failed: %m\n", i);
> + return 1;
> + }
> + }
> +
> + printf("Test passed!\n");
> + return 0;
> +}
> diff --git a/tests/generic/999 b/tests/generic/999
> new file mode 100755
> index 000000000000..6de143d1149e
> --- /dev/null
> +++ b/tests/generic/999
> @@ -0,0 +1,76 @@
> +#! /bin/bash
> +# FS QA Test No. 999
> +#
> +# Open a file several times, write to it, fsync on all fds and make sure that
> +# they all return 0. Change the device to start throwing errors. Write again
> +# on all fds and fsync on all fds. Ensure that we get errors on all of them.
> +# Then fsync on all one last time and verify that all return 0.
> +#
> +#-----------------------------------------------------------------------
> +# Copyright (c) 2017, Jeff Layton <[email protected]>
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write the Free Software Foundation,
> +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
> +#-----------------------------------------------------------------------

Sure you want to track the address? Maybe just remove it?

> +



> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=1 # failure is the default!
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +_cleanup()
> +{
> + cd /
> + rm -rf $tmp.* $testdir
> + _dmerror_cleanup
> +}
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/filter
> +. ./common/dmerror
> +
> +# real QA test starts here
> +_supported_os Linux
> +_require_scratch
> +_require_logdev
> +_require_dm_target error
> +_require_test_program fsync-err
> +
> +rm -f $seqres.full
> +
> +echo "Format and mount"
> +$XFS_IO_PROG -d -c "pwrite -S 0x7c -b 1048576 0 $((64 * 1048576))" $SCRATCH_DEV >> $seqres.full
> +_scratch_mkfs > $seqres.full 2>&1
> +_dmerror_init
> +_dmerror_mount >> $seqres.full 2>&1
> +_dmerror_unmount
> +_dmerror_mount
> +
> +_require_fs_space $SCRATCH_MNT 8192
> +
> +testfile=$SCRATCH_MNT/fsync-err-test
> +
> +$here/src/fsync-err $testfile
> +
> +# success, all done
> +_dmerror_load_working_table
> +_dmerror_unmount
> +_dmerror_cleanup
> +_repair_scratch_fs >> $seqres.full
> +status=0
> +exit
> diff --git a/tests/generic/999.out b/tests/generic/999.out
> new file mode 100644
> index 000000000000..2e48492ff6d1
> --- /dev/null
> +++ b/tests/generic/999.out
> @@ -0,0 +1,3 @@
> +QA output created by 999
> +Format and mount
> +Test passed!
> diff --git a/tests/generic/group b/tests/generic/group
> index 438c299030f2..39f7b14657f1 100644
> --- a/tests/generic/group
> +++ b/tests/generic/group
> @@ -440,3 +440,4 @@
> 435 auto encrypt
> 436 auto quick rw
> 437 auto quick
> +999 auto quick
> diff --git a/tools/dmerror b/tools/dmerror
> new file mode 100755
> index 000000000000..4aaf682ee5f9
> --- /dev/null
> +++ b/tools/dmerror
> @@ -0,0 +1,44 @@
> +#!/bin/bash
> +#-----------------------------------------------------------------------
> +# Copyright (c) 2017, Jeff Layton <[email protected]>
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write the Free Software Foundation,
> +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
> +#-----------------------------------------------------------------------
> +
> +. ./common/config
> +. ./common/dmerror
> +
> +_dmerror_setup
> +
> +case $1 in
> +cleanup)
> + _dmerror_cleanup
> + ;;
> +init)
> + _dmerror_init
> + ;;
> +load_error_table)
> + _dmerror_load_error_table
> + ;;
> +load_working_table)
> + _dmerror_load_working_table
> + ;;
> +*)
> + echo "Usage: $0 {init|cleanup|load_error_table|load_working_table}"
> + exit 1
> + ;;
> +esac
> +
> +status=0
> +exit
> --
> 2.9.4
>
>

--
All the best,
Eduardo Valentin

2017-05-31 20:02:39

by Jeff Layton

[permalink] [raw]
Subject: Re: [xfstests PATCH v3 1/5] generic: add a writeback error handling test

On Wed, 2017-05-31 at 11:59 -0700, Eduardo Valentin wrote:
> Hello,
>
> On Wed, May 31, 2017 at 09:08:16AM -0400, Jeff Layton wrote:
> > I'm working on a set of kernel patches to change how writeback errors
> > are handled and reported in the kernel. Instead of reporting a
> > writeback error to only the first fsync caller on the file, I aim
> > to make the kernel report them once on every file description.
> >
> > This patch adds a test for the new behavior. Basically, open many fds
> > to the same file, turn on dm_error, write to each of the fds, and then
> > fsync them all to ensure that they all get an error back.
> >
> > To do that, I'm adding a new tools/dmerror script that the C program
> > can use to load the error table. For now, that's all it can do, but
> > we can fill it out with other commands as necessary.
> >
> > Signed-off-by: Jeff Layton <[email protected]>
> > ---
> > common/dmerror | 13 ++--
> > doc/auxiliary-programs.txt | 8 +++
> > src/Makefile | 2 +-
> > src/fsync-err.c | 161 +++++++++++++++++++++++++++++++++++++++++++++
> > tests/generic/999 | 76 +++++++++++++++++++++
> > tests/generic/999.out | 3 +
> > tests/generic/group | 1 +
> > tools/dmerror | 44 +++++++++++++
> > 8 files changed, 302 insertions(+), 6 deletions(-)
> > create mode 100644 src/fsync-err.c
> > create mode 100755 tests/generic/999
> > create mode 100644 tests/generic/999.out
> > create mode 100755 tools/dmerror
> >
> > diff --git a/common/dmerror b/common/dmerror
> > index d46c5d0b7266..238baa213b1f 100644
> > --- a/common/dmerror
> > +++ b/common/dmerror
> > @@ -23,22 +23,25 @@ if [ $? -eq 0 ]; then
> > _notrun "Cannot run tests with DAX on dmerror devices"
> > fi
> >
> > -_dmerror_init()
> > +_dmerror_setup()
> > {
> > local dm_backing_dev=$SCRATCH_DEV
> >
> > - $DMSETUP_PROG remove error-test > /dev/null 2>&1
> > -
> > local blk_dev_size=`blockdev --getsz $dm_backing_dev`
> >
> > DMERROR_DEV='/dev/mapper/error-test'
> >
> > DMLINEAR_TABLE="0 $blk_dev_size linear $dm_backing_dev 0"
> >
> > + DMERROR_TABLE="0 $blk_dev_size error $dm_backing_dev 0"
> > +}
> > +
> > +_dmerror_init()
> > +{
> > + _dmerror_setup
> > + $DMSETUP_PROG remove error-test > /dev/null 2>&1
> > $DMSETUP_PROG create error-test --table "$DMLINEAR_TABLE" || \
> > _fatal "failed to create dm linear device"
> > -
> > - DMERROR_TABLE="0 $blk_dev_size error $dm_backing_dev 0"
> > }
> >
> > _dmerror_mount()
> > diff --git a/doc/auxiliary-programs.txt b/doc/auxiliary-programs.txt
> > index 21ef118596b6..191ac0596511 100644
> > --- a/doc/auxiliary-programs.txt
> > +++ b/doc/auxiliary-programs.txt
> > @@ -16,6 +16,7 @@ note the dependency with:
> > Contents:
> >
> > - af_unix -- Create an AF_UNIX socket
> > + - fsync-err -- tests fsync error reporting after failed writeback
> > - open_by_handle -- open_by_handle_at syscall exercise
> > - stat_test -- statx syscall exercise
> > - t_dir_type -- print directory entries and their file type
> > @@ -30,6 +31,13 @@ af_unix
> >
> > The af_unix program creates an AF_UNIX socket at the given location.
> >
> > +fsync-err
> > + Specialized program for testing how the kernel reports errors that
> > + occur during writeback. Works in conjunction with the dmerror script
> > + in tools/ to write data to a device, and then force it to fail
> > + writeback and test that errors are reported during fsync and cleared
> > + afterward.
> > +
> > open_by_handle
> >
> > The open_by_handle program exercises the open_by_handle_at() system
> > diff --git a/src/Makefile b/src/Makefile
> > index 4ec01975f8f7..b79c4d84d31b 100644
> > --- a/src/Makefile
> > +++ b/src/Makefile
> > @@ -13,7 +13,7 @@ TARGETS = dirstress fill fill2 getpagesize holes lstat64 \
> > multi_open_unlink dmiperf unwritten_sync genhashnames t_holes \
> > t_mmap_writev t_truncate_cmtime dirhash_collide t_rename_overwrite \
> > holetest t_truncate_self t_mmap_dio af_unix t_mmap_stale_pmd \
> > - t_mmap_cow_race
> > + t_mmap_cow_race fsync-err
> >
> > LINUX_TARGETS = xfsctl bstat t_mtab getdevicesize preallo_rw_pattern_reader \
> > preallo_rw_pattern_writer ftrunc trunc fs_perms testx looptest \
> > diff --git a/src/fsync-err.c b/src/fsync-err.c
> > new file mode 100644
> > index 000000000000..cbeb37fb1790
> > --- /dev/null
> > +++ b/src/fsync-err.c
> > @@ -0,0 +1,161 @@
> > +/*
> > + * fsync-err.c: test whether writeback errors are reported to all open fds
> > + * and properly cleared as expected after being seen once on each
> > + *
> > + * Copyright (c) 2017: Jeff Layton <[email protected]>
> > + */
> > +#include <sys/types.h>
> > +#include <sys/stat.h>
> > +#include <errno.h>
> > +#include <fcntl.h>
> > +#include <stdlib.h>
> > +#include <stdio.h>
> > +#include <string.h>
> > +#include <unistd.h>
> > +
> > +/*
> > + * btrfs has a fixed stripewidth of 64k, so we need to write enough data to
> > + * ensure that we hit both stripes.
> > + *
> > + * FIXME: have the test script pass in the length?
> > + */
> > +#define BUFSIZE (65 * 1024)
> > +
> > +/* FIXME: should this be tunable */
> > +#define NUM_FDS 10
> > +
> > +static void usage() {
> > + fprintf(stderr, "Usage: fsync-err <filename>\n");
> > +}
>
> Just to follow the same style as your main:
>
> +static void usage()
> +{
> + fprintf(stderr, "Usage: fsync-err <filename>\n");
> +}
>
> > +
> > +int main(int argc, char **argv)
> > +{
> > + int fd[NUM_FDS], ret, i;
> > + char *fname, *buf;
> > +
> > + if (argc < 1) {
> > + usage();
> > + return 1;
> > + }
> > +
> > + /* First argument is filename */
> > + fname = argv[1];
> > +
> > + for (i = 0; i < NUM_FDS; ++i) {
> > + fd[i] = open(fname, O_WRONLY | O_CREAT | O_TRUNC, 0644);
> > + if (fd[i] < 0) {
> > + printf("open of fd[%d] failed: %m\n", i);
> > + return 1;
> > + }
> > + }
> > +
> > + buf = malloc(BUFSIZE);
> > + if (!buf) {
> > + printf("malloc failed: %m\n");
> > + return 1;
> > + }
> > +
> > + memset(buf, 0x7c, BUFSIZE);
> > +
> > + for (i = 0; i < NUM_FDS; ++i) {
> > + ret = write(fd[i], buf, BUFSIZE);
> > + if (ret < 0) {
> > + printf("First write on fd[%d] failed: %m\n", i);
> > + return 1;
> > + }
> > + }
> > +
> > + for (i = 0; i < NUM_FDS; ++i) {
> > + ret = fsync(fd[i]);
> > + if (ret < 0) {
> > + printf("First fsync on fd[%d] failed: %m\n", i);
> > + return 1;
> > + }
> > + }
> > +
> > + /* flip the device to non-working mode */
> > + ret = system("./tools/dmerror load_error_table");
>
> Just assuming a hardcoded relative path?
>

The callers in tests/ call commands with hardcoded paths like "src/", so
I figured that was OK here.

Is there some other way we should do this? I'd prefer to shell out to
the dmerror tool than reimplement it here.

> > + if (ret) {
> > + if (WIFEXITED(ret))
> > + printf("system: program exited: %d\n",
> > + WEXITSTATUS(ret));
> > + else
> > + printf("system: 0x%x\n", (int)ret);
> > +
> > + return 1;
> > + }
> > +
> > + for (i = 0; i < NUM_FDS; ++i) {
> > + ret = write(fd[i], buf, strlen(buf) + 1);
>
> Why not BUFSIZE here?
>

Good catch. It should be BUFSIZE.

> > + if (ret < 0) {
> > + printf("Second write on fd[%d] failed: %m\n", i);
> > + return 1;
> > + }
> > + }
> > +
> > + for (i = 0; i < NUM_FDS; ++i) {
> > + ret = fsync(fd[i]);
> > + /* Now, we EXPECT the error! */
> > + if (ret >= 0) {
> > + printf("Success on second fsync on fd[%d]!\n", i);
> > + return 1;
> > + }
> > + }
> > +
> > + for (i = 0; i < NUM_FDS; ++i) {
> > + ret = fsync(fd[i]);
> > + if (ret < 0) {
> > + /* Now the error should be clear */
> > + printf("Third fsync on fd[%d] failed: %m\n", i);
> > + return 1;
> > + }
> > + }
> > +
> > + /* flip the device to working mode */
> > + ret = system("./tools/dmerror load_working_table");
> > + if (ret) {
> > + if (WIFEXITED(ret))
> > + printf("system: program exited: %d\n",
> > + WEXITSTATUS(ret));
> > + else
> > + printf("system: 0x%x\n", (int)ret);
> > +
> > + return 1;
> > + }
> > +
> > + for (i = 0; i < NUM_FDS; ++i) {
> > + ret = fsync(fd[i]);
> > + if (ret < 0) {
> > + /* The error should still be clear */
> > + printf("fsync after healing device on fd[%d] failed: %m\n", i);
> > + return 1;
> > + }
> > + }
> > +
> > + /*
> > + * reopen each file one at a time to ensure the same inode stays
> > + * in core. fsync each one to make sure we see no errors on a fresh
> > + * open of the inode.
> > + */
> > + for (i = 0; i < NUM_FDS; ++i) {
> > + ret = close(fd[i]);
> > + if (ret < 0) {
> > + printf("Close of fd[%d] returned unexpected error: %m\n", i);
> > + return 1;
> > + }
> > + fd[i] = open(fname, O_WRONLY | O_CREAT | O_TRUNC, 0644);
> > + if (fd[i] < 0) {
> > + printf("Second open of fd[%d] failed: %m\n", i);
> > + return 1;
> > + }
> > + ret = fsync(fd[i]);
> > + if (ret < 0) {
> > + /* New opens should not return an error */
> > + printf("First fsync after reopen of fd[%d] failed: %m\n", i);
> > + return 1;
> > + }
> > + }
> > +
> > + printf("Test passed!\n");
> > + return 0;
> > +}
> > diff --git a/tests/generic/999 b/tests/generic/999
> > new file mode 100755
> > index 000000000000..6de143d1149e
> > --- /dev/null
> > +++ b/tests/generic/999
> > @@ -0,0 +1,76 @@
> > +#! /bin/bash
> > +# FS QA Test No. 999
> > +#
> > +# Open a file several times, write to it, fsync on all fds and make sure that
> > +# they all return 0. Change the device to start throwing errors. Write again
> > +# on all fds and fsync on all fds. Ensure that we get errors on all of them.
> > +# Then fsync on all one last time and verify that all return 0.
> > +#
> > +#-----------------------------------------------------------------------
> > +# Copyright (c) 2017, Jeff Layton <[email protected]>
> > +#
> > +# This program is free software; you can redistribute it and/or
> > +# modify it under the terms of the GNU General Public License as
> > +# published by the Free Software Foundation.
> > +#
> > +# This program is distributed in the hope that it would be useful,
> > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> > +# GNU General Public License for more details.
> > +#
> > +# You should have received a copy of the GNU General Public License
> > +# along with this program; if not, write the Free Software Foundation,
> > +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
> > +#-----------------------------------------------------------------------
>
> Sure you want to track the address? Maybe just remove it?
>

Will do. I ended up copying and pasting instead of doing the "new"
script like I probably should have. I'll go back and read the docs and
make sure I get the boilerplate stuff right.

> > +
>
>
>
> > +seq=`basename $0`
> > +seqres=$RESULT_DIR/$seq
> > +echo "QA output created by $seq"
> > +
> > +here=`pwd`
> > +tmp=/tmp/$$
> > +status=1 # failure is the default!
> > +trap "_cleanup; exit \$status" 0 1 2 3 15
> > +
> > +_cleanup()
> > +{
> > + cd /
> > + rm -rf $tmp.* $testdir
> > + _dmerror_cleanup
> > +}
> > +
> > +# get standard environment, filters and checks
> > +. ./common/rc
> > +. ./common/filter
> > +. ./common/dmerror
> > +
> > +# real QA test starts here
> > +_supported_os Linux
> > +_require_scratch
> > +_require_logdev
> > +_require_dm_target error
> > +_require_test_program fsync-err
> > +
> > +rm -f $seqres.full
> > +
> > +echo "Format and mount"
> > +$XFS_IO_PROG -d -c "pwrite -S 0x7c -b 1048576 0 $((64 * 1048576))" $SCRATCH_DEV >> $seqres.full
> > +_scratch_mkfs > $seqres.full 2>&1
> > +_dmerror_init
> > +_dmerror_mount >> $seqres.full 2>&1
> > +_dmerror_unmount
> > +_dmerror_mount
> > +
> > +_require_fs_space $SCRATCH_MNT 8192
> > +
> > +testfile=$SCRATCH_MNT/fsync-err-test
> > +
> > +$here/src/fsync-err $testfile
> > +
> > +# success, all done
> > +_dmerror_load_working_table
> > +_dmerror_unmount
> > +_dmerror_cleanup
> > +_repair_scratch_fs >> $seqres.full
> > +status=0
> > +exit
> > diff --git a/tests/generic/999.out b/tests/generic/999.out
> > new file mode 100644
> > index 000000000000..2e48492ff6d1
> > --- /dev/null
> > +++ b/tests/generic/999.out
> > @@ -0,0 +1,3 @@
> > +QA output created by 999
> > +Format and mount
> > +Test passed!
> > diff --git a/tests/generic/group b/tests/generic/group
> > index 438c299030f2..39f7b14657f1 100644
> > --- a/tests/generic/group
> > +++ b/tests/generic/group
> > @@ -440,3 +440,4 @@
> > 435 auto encrypt
> > 436 auto quick rw
> > 437 auto quick
> > +999 auto quick
> > diff --git a/tools/dmerror b/tools/dmerror
> > new file mode 100755
> > index 000000000000..4aaf682ee5f9
> > --- /dev/null
> > +++ b/tools/dmerror
> > @@ -0,0 +1,44 @@
> > +#!/bin/bash
> > +#-----------------------------------------------------------------------
> > +# Copyright (c) 2017, Jeff Layton <[email protected]>
> > +#
> > +# This program is free software; you can redistribute it and/or
> > +# modify it under the terms of the GNU General Public License as
> > +# published by the Free Software Foundation.
> > +#
> > +# This program is distributed in the hope that it would be useful,
> > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> > +# GNU General Public License for more details.
> > +#
> > +# You should have received a copy of the GNU General Public License
> > +# along with this program; if not, write the Free Software Foundation,
> > +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
> > +#-----------------------------------------------------------------------
> > +
> > +. ./common/config
> > +. ./common/dmerror
> > +
> > +_dmerror_setup
> > +
> > +case $1 in
> > +cleanup)
> > + _dmerror_cleanup
> > + ;;
> > +init)
> > + _dmerror_init
> > + ;;
> > +load_error_table)
> > + _dmerror_load_error_table
> > + ;;
> > +load_working_table)
> > + _dmerror_load_working_table
> > + ;;
> > +*)
> > + echo "Usage: $0 {init|cleanup|load_error_table|load_working_table}"
> > + exit 1
> > + ;;
> > +esac
> > +
> > +status=0
> > +exit
> > --
> > 2.9.4
> >
> >
>
>

Thanks for the review!
--
Jeff Layton <[email protected]>

2017-06-06 08:58:55

by Eryu Guan

[permalink] [raw]
Subject: Re: [xfstests PATCH v3 1/5] generic: add a writeback error handling test

On Wed, May 31, 2017 at 09:08:16AM -0400, Jeff Layton wrote:
> I'm working on a set of kernel patches to change how writeback errors
> are handled and reported in the kernel. Instead of reporting a
> writeback error to only the first fsync caller on the file, I aim
> to make the kernel report them once on every file description.
>
> This patch adds a test for the new behavior. Basically, open many fds
> to the same file, turn on dm_error, write to each of the fds, and then
> fsync them all to ensure that they all get an error back.
>
> To do that, I'm adding a new tools/dmerror script that the C program
> can use to load the error table. For now, that's all it can do, but
> we can fill it out with other commands as necessary.
>
> Signed-off-by: Jeff Layton <[email protected]>

Thanks for the new tests! And sorry for the late review..

It's testing a new behavior on error reporting on writeback, I'm not
sure if we can call it a new feature or it fixed a bug? But it's more
like a behavior change, I'm not sure how to categorize it.

Because if it's testing a new feature, we usually let test do proper
detection of current test environment (based on actual behavior not
kernel version) and _notrun on filesystems that don't have this feature
yet, instead of failing the test; if it's testing a bug fix, we could
leave the test fail on unfixed filesystems, this also serves as a
reminder that there's bug to fix.

I pulled your test kernel tree, and test passed on EXT4 but failed on
other local filesystems (XFS, btrfs). I assume that's expected.

Besides this kinda high-level question, some minor comments inline.

> ---
> common/dmerror | 13 ++--
> doc/auxiliary-programs.txt | 8 +++
> src/Makefile | 2 +-
> src/fsync-err.c | 161 +++++++++++++++++++++++++++++++++++++++++++++

New binary needs an entry in .gitignore file.

> tests/generic/999 | 76 +++++++++++++++++++++
> tests/generic/999.out | 3 +
> tests/generic/group | 1 +
> tools/dmerror | 44 +++++++++++++

This file is used by the test, then it should be in src/ directory and
be installed along with other executable files on "make install".
Because files under tools/ are not installed. Most people will run tests
in the root dir of xfstests and this is not a problem, but there're
still cases people do "make && make install" and run fstests from
/var/lib/xfstests (default installation target).

> 8 files changed, 302 insertions(+), 6 deletions(-)
> create mode 100644 src/fsync-err.c
> create mode 100755 tests/generic/999
> create mode 100644 tests/generic/999.out
> create mode 100755 tools/dmerror
>
> diff --git a/common/dmerror b/common/dmerror
> index d46c5d0b7266..238baa213b1f 100644
> --- a/common/dmerror
> +++ b/common/dmerror
> @@ -23,22 +23,25 @@ if [ $? -eq 0 ]; then
> _notrun "Cannot run tests with DAX on dmerror devices"
> fi
>
> -_dmerror_init()
> +_dmerror_setup()
> {
> local dm_backing_dev=$SCRATCH_DEV
>
> - $DMSETUP_PROG remove error-test > /dev/null 2>&1
> -
> local blk_dev_size=`blockdev --getsz $dm_backing_dev`
>
> DMERROR_DEV='/dev/mapper/error-test'
>
> DMLINEAR_TABLE="0 $blk_dev_size linear $dm_backing_dev 0"
>
> + DMERROR_TABLE="0 $blk_dev_size error $dm_backing_dev 0"
> +}
> +
> +_dmerror_init()
> +{
> + _dmerror_setup
> + $DMSETUP_PROG remove error-test > /dev/null 2>&1
> $DMSETUP_PROG create error-test --table "$DMLINEAR_TABLE" || \
> _fatal "failed to create dm linear device"
> -
> - DMERROR_TABLE="0 $blk_dev_size error $dm_backing_dev 0"
> }
>
> _dmerror_mount()
> diff --git a/doc/auxiliary-programs.txt b/doc/auxiliary-programs.txt
> index 21ef118596b6..191ac0596511 100644
> --- a/doc/auxiliary-programs.txt
> +++ b/doc/auxiliary-programs.txt
> @@ -16,6 +16,7 @@ note the dependency with:
> Contents:
>
> - af_unix -- Create an AF_UNIX socket
> + - fsync-err -- tests fsync error reporting after failed writeback
> - open_by_handle -- open_by_handle_at syscall exercise
> - stat_test -- statx syscall exercise
> - t_dir_type -- print directory entries and their file type
> @@ -30,6 +31,13 @@ af_unix
>
> The af_unix program creates an AF_UNIX socket at the given location.
>
> +fsync-err
> + Specialized program for testing how the kernel reports errors that
> + occur during writeback. Works in conjunction with the dmerror script
> + in tools/ to write data to a device, and then force it to fail
> + writeback and test that errors are reported during fsync and cleared
> + afterward.
> +
> open_by_handle
>
> The open_by_handle program exercises the open_by_handle_at() system
> diff --git a/src/Makefile b/src/Makefile
> index 4ec01975f8f7..b79c4d84d31b 100644
> --- a/src/Makefile
> +++ b/src/Makefile
> @@ -13,7 +13,7 @@ TARGETS = dirstress fill fill2 getpagesize holes lstat64 \
> multi_open_unlink dmiperf unwritten_sync genhashnames t_holes \
> t_mmap_writev t_truncate_cmtime dirhash_collide t_rename_overwrite \
> holetest t_truncate_self t_mmap_dio af_unix t_mmap_stale_pmd \
> - t_mmap_cow_race
> + t_mmap_cow_race fsync-err
>
> LINUX_TARGETS = xfsctl bstat t_mtab getdevicesize preallo_rw_pattern_reader \
> preallo_rw_pattern_writer ftrunc trunc fs_perms testx looptest \
> diff --git a/src/fsync-err.c b/src/fsync-err.c
> new file mode 100644
> index 000000000000..cbeb37fb1790
> --- /dev/null
> +++ b/src/fsync-err.c
> @@ -0,0 +1,161 @@
> +/*
> + * fsync-err.c: test whether writeback errors are reported to all open fds
> + * and properly cleared as expected after being seen once on each
> + *
> + * Copyright (c) 2017: Jeff Layton <[email protected]>
> + */
> +#include <sys/types.h>
> +#include <sys/stat.h>
> +#include <errno.h>
> +#include <fcntl.h>
> +#include <stdlib.h>
> +#include <stdio.h>
> +#include <string.h>
> +#include <unistd.h>
> +
> +/*
> + * btrfs has a fixed stripewidth of 64k, so we need to write enough data to
> + * ensure that we hit both stripes.
> + *
> + * FIXME: have the test script pass in the length?
> + */
> +#define BUFSIZE (65 * 1024)
> +
> +/* FIXME: should this be tunable */
> +#define NUM_FDS 10

Passed through command line parameter, and default value is 10 if not
specified?

> +
> +static void usage() {
> + fprintf(stderr, "Usage: fsync-err <filename>\n");
> +}
> +
> +int main(int argc, char **argv)
> +{
> + int fd[NUM_FDS], ret, i;
> + char *fname, *buf;
> +
> + if (argc < 1) {
> + usage();
> + return 1;
> + }
> +
> + /* First argument is filename */
> + fname = argv[1];
> +
> + for (i = 0; i < NUM_FDS; ++i) {
> + fd[i] = open(fname, O_WRONLY | O_CREAT | O_TRUNC, 0644);
> + if (fd[i] < 0) {
> + printf("open of fd[%d] failed: %m\n", i);
> + return 1;
> + }
> + }
> +
> + buf = malloc(BUFSIZE);
> + if (!buf) {
> + printf("malloc failed: %m\n");
> + return 1;
> + }
> +
> + memset(buf, 0x7c, BUFSIZE);
> +
> + for (i = 0; i < NUM_FDS; ++i) {
> + ret = write(fd[i], buf, BUFSIZE);
> + if (ret < 0) {
> + printf("First write on fd[%d] failed: %m\n", i);
> + return 1;
> + }
> + }
> +
> + for (i = 0; i < NUM_FDS; ++i) {
> + ret = fsync(fd[i]);
> + if (ret < 0) {
> + printf("First fsync on fd[%d] failed: %m\n", i);
> + return 1;
> + }
> + }
> +
> + /* flip the device to non-working mode */
> + ret = system("./tools/dmerror load_error_table");

Hmm, how about passing these "load error table" and "load working table"
commands through command line parameters too?

> + if (ret) {
> + if (WIFEXITED(ret))
> + printf("system: program exited: %d\n",
> + WEXITSTATUS(ret));
> + else
> + printf("system: 0x%x\n", (int)ret);
> +
> + return 1;
> + }
> +
> + for (i = 0; i < NUM_FDS; ++i) {
> + ret = write(fd[i], buf, strlen(buf) + 1);
> + if (ret < 0) {
> + printf("Second write on fd[%d] failed: %m\n", i);
> + return 1;
> + }
> + }
> +
> + for (i = 0; i < NUM_FDS; ++i) {
> + ret = fsync(fd[i]);
> + /* Now, we EXPECT the error! */
> + if (ret >= 0) {
> + printf("Success on second fsync on fd[%d]!\n", i);
> + return 1;
> + }
> + }
> +
> + for (i = 0; i < NUM_FDS; ++i) {
> + ret = fsync(fd[i]);
> + if (ret < 0) {
> + /* Now the error should be clear */

It's not obvious to me why error should be clear at this stage, adding
some comments would be good.

> + printf("Third fsync on fd[%d] failed: %m\n", i);
> + return 1;
> + }
> + }
> +
> + /* flip the device to working mode */
> + ret = system("./tools/dmerror load_working_table");
> + if (ret) {
> + if (WIFEXITED(ret))
> + printf("system: program exited: %d\n",
> + WEXITSTATUS(ret));
> + else
> + printf("system: 0x%x\n", (int)ret);
> +
> + return 1;
> + }
> +
> + for (i = 0; i < NUM_FDS; ++i) {
> + ret = fsync(fd[i]);
> + if (ret < 0) {
> + /* The error should still be clear */
> + printf("fsync after healing device on fd[%d] failed: %m\n", i);
> + return 1;
> + }
> + }
> +
> + /*
> + * reopen each file one at a time to ensure the same inode stays
> + * in core. fsync each one to make sure we see no errors on a fresh
> + * open of the inode.
> + */
> + for (i = 0; i < NUM_FDS; ++i) {
> + ret = close(fd[i]);
> + if (ret < 0) {
> + printf("Close of fd[%d] returned unexpected error: %m\n", i);
> + return 1;
> + }
> + fd[i] = open(fname, O_WRONLY | O_CREAT | O_TRUNC, 0644);
> + if (fd[i] < 0) {
> + printf("Second open of fd[%d] failed: %m\n", i);
> + return 1;
> + }
> + ret = fsync(fd[i]);
> + if (ret < 0) {
> + /* New opens should not return an error */
> + printf("First fsync after reopen of fd[%d] failed: %m\n", i);
> + return 1;
> + }
> + }
> +
> + printf("Test passed!\n");
> + return 0;
> +}
> diff --git a/tests/generic/999 b/tests/generic/999
> new file mode 100755
> index 000000000000..6de143d1149e
> --- /dev/null
> +++ b/tests/generic/999
> @@ -0,0 +1,76 @@
> +#! /bin/bash
> +# FS QA Test No. 999
> +#
> +# Open a file several times, write to it, fsync on all fds and make sure that
> +# they all return 0. Change the device to start throwing errors. Write again
> +# on all fds and fsync on all fds. Ensure that we get errors on all of them.
> +# Then fsync on all one last time and verify that all return 0.
> +#
> +#-----------------------------------------------------------------------
> +# Copyright (c) 2017, Jeff Layton <[email protected]>
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write the Free Software Foundation,
> +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
> +#-----------------------------------------------------------------------
> +
> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=1 # failure is the default!
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +_cleanup()
> +{
> + cd /
> + rm -rf $tmp.* $testdir
> + _dmerror_cleanup
> +}
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/filter
> +. ./common/dmerror
> +
> +# real QA test starts here
> +_supported_os Linux
> +_require_scratch
> +_require_logdev
> +_require_dm_target error
> +_require_test_program fsync-err

Test also uses tools/dmerror (or src/dmerror), also should make sure
that file is there.

# Assuming src/dmerror
_require_test_program "dmerror"

> +
> +rm -f $seqres.full
> +
> +echo "Format and mount"
> +$XFS_IO_PROG -d -c "pwrite -S 0x7c -b 1048576 0 $((64 * 1048576))" $SCRATCH_DEV >> $seqres.full

This is not needed.

> +_scratch_mkfs > $seqres.full 2>&1
> +_dmerror_init
> +_dmerror_mount >> $seqres.full 2>&1
> +_dmerror_unmount

This extra _dmerror_mount/unmount cycle doesn't seem necessary to me
either.

> +_dmerror_mount
> +
> +_require_fs_space $SCRATCH_MNT 8192
> +
> +testfile=$SCRATCH_MNT/fsync-err-test
> +
> +$here/src/fsync-err $testfile
> +
> +# success, all done
> +_dmerror_load_working_table
> +_dmerror_unmount
> +_dmerror_cleanup
> +_repair_scratch_fs >> $seqres.full

_require_scratch_fs will return 0 if it found corruption but repaired it
successfully, then we'll miss a fs curruption failure.

Test harness will do fsck after each test by default, we can exit
directly.

Thanks,
Eryu

> +status=0
> +exit
> diff --git a/tests/generic/999.out b/tests/generic/999.out
> new file mode 100644
> index 000000000000..2e48492ff6d1
> --- /dev/null
> +++ b/tests/generic/999.out
> @@ -0,0 +1,3 @@
> +QA output created by 999
> +Format and mount
> +Test passed!
> diff --git a/tests/generic/group b/tests/generic/group
> index 438c299030f2..39f7b14657f1 100644
> --- a/tests/generic/group
> +++ b/tests/generic/group
> @@ -440,3 +440,4 @@
> 435 auto encrypt
> 436 auto quick rw
> 437 auto quick
> +999 auto quick
> diff --git a/tools/dmerror b/tools/dmerror
> new file mode 100755
> index 000000000000..4aaf682ee5f9
> --- /dev/null
> +++ b/tools/dmerror
> @@ -0,0 +1,44 @@
> +#!/bin/bash
> +#-----------------------------------------------------------------------
> +# Copyright (c) 2017, Jeff Layton <[email protected]>
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write the Free Software Foundation,
> +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
> +#-----------------------------------------------------------------------
> +
> +. ./common/config
> +. ./common/dmerror
> +
> +_dmerror_setup
> +
> +case $1 in
> +cleanup)
> + _dmerror_cleanup
> + ;;
> +init)
> + _dmerror_init
> + ;;
> +load_error_table)
> + _dmerror_load_error_table
> + ;;
> +load_working_table)
> + _dmerror_load_working_table
> + ;;
> +*)
> + echo "Usage: $0 {init|cleanup|load_error_table|load_working_table}"
> + exit 1
> + ;;
> +esac
> +
> +status=0
> +exit
> --
> 2.9.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe fstests" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2017-06-06 09:01:07

by Eryu Guan

[permalink] [raw]
Subject: Re: [xfstests PATCH v3 2/5] ext4: allow ext4 to use $SCRATCH_LOGDEV

On Wed, May 31, 2017 at 09:08:17AM -0400, Jeff Layton wrote:
> The writeback error handling test requires that you put the journal on a
> separate device. This allows us to use dmerror to simulate data
> writeback failure, without affecting the journal.
>
> xfs already has infrastructure for this (a'la $SCRATCH_LOGDEV), so wire
> up the ext4 code so that it can do the same thing when _scratch_mkfs is
> called.
>
> Signed-off-by: Jeff Layton <[email protected]>
> Reviewed-by: Darrick J. Wong <[email protected]>
> ---
> common/rc | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/common/rc b/common/rc
> index 743df427c047..391d36f373cd 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -676,6 +676,9 @@ _scratch_mkfs_ext4()
> local tmp=`mktemp`
> local mkfs_status
>
> + [ "$USE_EXTERNAL" = yes -a ! -z "$SCRATCH_LOGDEV" ] && \
> + $mkfs_cmd -O journal_dev $SCRATCH_LOGDEV && \
> + mkfs_cmd="$mkfs_cmd -J device=$SCRATCH_LOGDEV"

Need $MKFS_OPTIONS too when creating journal device, otherwise mkfs will
fail when making non-default block size ext4, i.e. journal device has 4k
block size, but ext4 has 1k block size if we have MKFS_OPTIONS="-b 1024"

Thanks,
Eryu

>
> _scratch_do_mkfs "$mkfs_cmd" "$mkfs_filter" $* 2>$tmp.mkfserr 1>$tmp.mkfsstd
> mkfs_status=$?
> --
> 2.9.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe fstests" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2017-06-06 09:05:42

by Eryu Guan

[permalink] [raw]
Subject: Re: [xfstests PATCH v3 3/5] generic: test writeback error handling on dmerror devices

On Wed, May 31, 2017 at 09:08:18AM -0400, Jeff Layton wrote:
> Ensure that we get an error back on all fds when a block device is
> open by multiple writers and writeback fails.
>
> Signed-off-by: Jeff Layton <[email protected]>
> ---
> tests/generic/998 | 64 +++++++++++++++++++++++++++++++++++++++++++++++++++
> tests/generic/998.out | 2 ++
> tests/generic/group | 1 +
> 3 files changed, 67 insertions(+)
> create mode 100755 tests/generic/998
> create mode 100644 tests/generic/998.out
>
> diff --git a/tests/generic/998 b/tests/generic/998
> new file mode 100755
> index 000000000000..fbadb47507c2
> --- /dev/null
> +++ b/tests/generic/998
> @@ -0,0 +1,64 @@
> +#! /bin/bash
> +# FS QA Test No. 998
> +#
> +# Test writeback error handling when writing to block devices via pagecache.
> +# See src/fsync-err.c for details of what test actually does.
> +#
> +#-----------------------------------------------------------------------
> +# Copyright (c) 2017, Jeff Layton <[email protected]>
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write the Free Software Foundation,
> +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
> +#-----------------------------------------------------------------------
> +
> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=1 # failure is the default!
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +_cleanup()
> +{
> + cd /
> + rm -rf $tmp.* $testdir
> + _dmerror_cleanup
> +}
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/filter
> +. ./common/dmerror
> +
> +# real QA test starts here
> +_supported_os Linux
> +_require_scratch

_require_scratch_nocheck

Then we don't have to re-create a filesystem on SCRATCH_DEV before
exiting.

> +_require_logdev
> +_require_dm_target error
> +_require_test_program fsync-err
> +
> +rm -f $seqres.full
> +
> +$XFS_IO_PROG -d -c "pwrite -S 0x7c -b 1048576 0 $((64 * 1048576))" $SCRATCH_DEV >> $seqres.full

I don't see why this is needed, add some comments? Or remove it if it's
not needed?

> +_dmerror_init
> +
> +$here/src/fsync-err $DMERROR_DEV
> +
> +# success, all done
> +_dmerror_load_working_table
> +_dmerror_cleanup
> +_scratch_mkfs > $seqres.full 2>&1
> +status=0
> +exit
> diff --git a/tests/generic/998.out b/tests/generic/998.out
> new file mode 100644
> index 000000000000..658c438820e2
> --- /dev/null
> +++ b/tests/generic/998.out
> @@ -0,0 +1,2 @@
> +QA output created by 998
> +Test passed!
> diff --git a/tests/generic/group b/tests/generic/group
> index 39f7b14657f1..9fc384363ca7 100644
> --- a/tests/generic/group
> +++ b/tests/generic/group
> @@ -440,4 +440,5 @@
> 435 auto encrypt
> 436 auto quick rw
> 437 auto quick
> +998 auto quick

This is a test for block device, not filesystems, I'd remove it from
auto and quick group, but add it to 'blockdev' group. So it won't be run
if someone wants to test filesystems.

Thanks,
Eryu

> 999 auto quick
> --
> 2.9.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe fstests" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2017-06-06 09:06:22

by Eryu Guan

[permalink] [raw]
Subject: Re: [xfstests PATCH v3 4/5] ext3: allow it to put journal on a separate device when doing scratch_mkfs

On Wed, May 31, 2017 at 09:08:19AM -0400, Jeff Layton wrote:
> Signed-off-by: Jeff Layton <[email protected]>
> ---
> common/rc | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/common/rc b/common/rc
> index 391d36f373cd..83765aacfb06 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -832,7 +832,16 @@ _scratch_mkfs()
> mkfs_cmd="$MKFS_BTRFS_PROG"
> mkfs_filter="cat"
> ;;
> - ext2|ext3)
> + ext3)
> + mkfs_cmd="$MKFS_PROG -t $FSTYP -- -F"
> + mkfs_filter="grep -v -e ^Warning: -e \"^mke2fs \""
> +
> + # put journal on separate device?
> + [ "$USE_EXTERNAL" = yes -a ! -z "$SCRATCH_LOGDEV" ] && \
> + $mkfs_cmd -O journal_dev $SCRATCH_LOGDEV && \
> + mkfs_cmd="$mkfs_cmd -J device=$SCRATCH_LOGDEV"

Similar to that ext4 patch, need $MKFS_OPTIONS when creating journal
device.

Thanks,
Eryu

> + ;;
> + ext2)
> mkfs_cmd="$MKFS_PROG -t $FSTYP -- -F"
> mkfs_filter="grep -v -e ^Warning: -e \"^mke2fs \""
> ;;
> --
> 2.9.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe fstests" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2017-06-06 09:19:30

by Eryu Guan

[permalink] [raw]
Subject: Re: [xfstests PATCH v3 5/5] btrfs: allow it to use $SCRATCH_LOGDEV

On Wed, May 31, 2017 at 09:08:20AM -0400, Jeff Layton wrote:
> With btrfs, we can't really put the log on a separate device. What we
> can do however is mirror the metadata across two devices and make the
> data striped across all devices. When we turn on dmerror then the
> metadata can fall back to using the other mirror while the data errors
> out.
>
> Note that the current incarnation of btrfs has a fixed 64k stripe
> width. If that ever changes or becomes settable, we may need to adjust
> the amount of data that the test program writes.
>
> Signed-off-by: Jeff Layton <[email protected]>
> ---
> common/rc | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/common/rc b/common/rc
> index 83765aacfb06..078270451b53 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -830,6 +830,8 @@ _scratch_mkfs()
> ;;
> btrfs)
> mkfs_cmd="$MKFS_BTRFS_PROG"
> + [ "$USE_EXTERNAL" = yes -a ! -z "$SCRATCH_LOGDEV" ] && \
> + mkfs_cmd="$mkfs_cmd -d raid0 -m raid1 $SCRATCH_LOGDEV"

I don't think this is the correct way to do it. If btrfs doesn't support
external log device, then this test doesn't fit btrfs, or we need other
ways to test btrfs.

One of the problems of this hack is that raid1 requires all devices are
in the same size, we have a _require_scratch_dev_pool_equal_size() rule
to check on it, but this hack doesn't do the proper check and test fails
if SCRATCH_LOGDEV is smaller or bigger in size.

If btrfs "-d raid0 -m raid1" is capable to do this writeback error test,
perhaps you can write a new btrfs test and mkfs with "-d raid0 -m raid1"
explicitly. e.g.

...
_require_scratch_dev_pool 2
_require_scratch_dev_pool_equal_size
...
_scratch_mkfs "-d raid0 -m raid1"
...

Thanks,
Eryu

> mkfs_filter="cat"
> ;;
> ext3)
> --
> 2.9.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe fstests" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2017-06-06 10:15:57

by Jeff Layton

[permalink] [raw]
Subject: Re: [xfstests PATCH v3 1/5] generic: add a writeback error handling test

On Tue, 2017-06-06 at 16:58 +0800, Eryu Guan wrote:
> On Wed, May 31, 2017 at 09:08:16AM -0400, Jeff Layton wrote:
> > I'm working on a set of kernel patches to change how writeback errors
> > are handled and reported in the kernel. Instead of reporting a
> > writeback error to only the first fsync caller on the file, I aim
> > to make the kernel report them once on every file description.
> >
> > This patch adds a test for the new behavior. Basically, open many fds
> > to the same file, turn on dm_error, write to each of the fds, and then
> > fsync them all to ensure that they all get an error back.
> >
> > To do that, I'm adding a new tools/dmerror script that the C program
> > can use to load the error table. For now, that's all it can do, but
> > we can fill it out with other commands as necessary.
> >
> > Signed-off-by: Jeff Layton <[email protected]>
>
> Thanks for the new tests! And sorry for the late review..
>
> It's testing a new behavior on error reporting on writeback, I'm not
> sure if we can call it a new feature or it fixed a bug? But it's more
> like a behavior change, I'm not sure how to categorize it.
>
> Because if it's testing a new feature, we usually let test do proper
> detection of current test environment (based on actual behavior not
> kernel version) and _notrun on filesystems that don't have this feature
> yet, instead of failing the test; if it's testing a bug fix, we could
> leave the test fail on unfixed filesystems, this also serves as a
> reminder that there's bug to fix.
>

Thanks for the review! I'm not sure how to categorize this either. Since
the plan is to convert all the filesystems piecemeal, maybe we should
just consider it a new feature.

> I pulled your test kernel tree, and test passed on EXT4 but failed on
> other local filesystems (XFS, btrfs). I assume that's expected.
>
> Besides this kinda high-level question, some minor comments inline.
>

Yes, ext4 should pass on my latest kernel tree, but everything else
should fail.

> > ---
> > common/dmerror | 13 ++--
> > doc/auxiliary-programs.txt | 8 +++
> > src/Makefile | 2 +-
> > src/fsync-err.c | 161 +++++++++++++++++++++++++++++++++++++++++++++
>
> New binary needs an entry in .gitignore file.
>

OK, thanks. Will fix.

> > tests/generic/999 | 76 +++++++++++++++++++++
> > tests/generic/999.out | 3 +
> > tests/generic/group | 1 +
> > tools/dmerror | 44 +++++++++++++
>
> This file is used by the test, then it should be in src/ directory and
> be installed along with other executable files on "make install".
> Because files under tools/ are not installed. Most people will run tests
> in the root dir of xfstests and this is not a problem, but there're
> still cases people do "make && make install" and run fstests from
> /var/lib/xfstests (default installation target).
>

Ok, no problem. I'll move it. I wasn't sure here since dmerror is a
shell script, and most of the stuff in src/ is stuff that needs to be
built.

> > 8 files changed, 302 insertions(+), 6 deletions(-)
> > create mode 100644 src/fsync-err.c
> > create mode 100755 tests/generic/999
> > create mode 100644 tests/generic/999.out
> > create mode 100755 tools/dmerror
> >
> > diff --git a/common/dmerror b/common/dmerror
> > index d46c5d0b7266..238baa213b1f 100644
> > --- a/common/dmerror
> > +++ b/common/dmerror
> > @@ -23,22 +23,25 @@ if [ $? -eq 0 ]; then
> > _notrun "Cannot run tests with DAX on dmerror devices"
> > fi
> >
> > -_dmerror_init()
> > +_dmerror_setup()
> > {
> > local dm_backing_dev=$SCRATCH_DEV
> >
> > - $DMSETUP_PROG remove error-test > /dev/null 2>&1
> > -
> > local blk_dev_size=`blockdev --getsz $dm_backing_dev`
> >
> > DMERROR_DEV='/dev/mapper/error-test'
> >
> > DMLINEAR_TABLE="0 $blk_dev_size linear $dm_backing_dev 0"
> >
> > + DMERROR_TABLE="0 $blk_dev_size error $dm_backing_dev 0"
> > +}
> > +
> > +_dmerror_init()
> > +{
> > + _dmerror_setup
> > + $DMSETUP_PROG remove error-test > /dev/null 2>&1
> > $DMSETUP_PROG create error-test --table "$DMLINEAR_TABLE" || \
> > _fatal "failed to create dm linear device"
> > -
> > - DMERROR_TABLE="0 $blk_dev_size error $dm_backing_dev 0"
> > }
> >
> > _dmerror_mount()
> > diff --git a/doc/auxiliary-programs.txt b/doc/auxiliary-programs.txt
> > index 21ef118596b6..191ac0596511 100644
> > --- a/doc/auxiliary-programs.txt
> > +++ b/doc/auxiliary-programs.txt
> > @@ -16,6 +16,7 @@ note the dependency with:
> > Contents:
> >
> > - af_unix -- Create an AF_UNIX socket
> > + - fsync-err -- tests fsync error reporting after failed writeback
> > - open_by_handle -- open_by_handle_at syscall exercise
> > - stat_test -- statx syscall exercise
> > - t_dir_type -- print directory entries and their file type
> > @@ -30,6 +31,13 @@ af_unix
> >
> > The af_unix program creates an AF_UNIX socket at the given location.
> >
> > +fsync-err
> > + Specialized program for testing how the kernel reports errors that
> > + occur during writeback. Works in conjunction with the dmerror script
> > + in tools/ to write data to a device, and then force it to fail
> > + writeback and test that errors are reported during fsync and cleared
> > + afterward.
> > +
> > open_by_handle
> >
> > The open_by_handle program exercises the open_by_handle_at() system
> > diff --git a/src/Makefile b/src/Makefile
> > index 4ec01975f8f7..b79c4d84d31b 100644
> > --- a/src/Makefile
> > +++ b/src/Makefile
> > @@ -13,7 +13,7 @@ TARGETS = dirstress fill fill2 getpagesize holes lstat64 \
> > multi_open_unlink dmiperf unwritten_sync genhashnames t_holes \
> > t_mmap_writev t_truncate_cmtime dirhash_collide t_rename_overwrite \
> > holetest t_truncate_self t_mmap_dio af_unix t_mmap_stale_pmd \
> > - t_mmap_cow_race
> > + t_mmap_cow_race fsync-err
> >
> > LINUX_TARGETS = xfsctl bstat t_mtab getdevicesize preallo_rw_pattern_reader \
> > preallo_rw_pattern_writer ftrunc trunc fs_perms testx looptest \
> > diff --git a/src/fsync-err.c b/src/fsync-err.c
> > new file mode 100644
> > index 000000000000..cbeb37fb1790
> > --- /dev/null
> > +++ b/src/fsync-err.c
> > @@ -0,0 +1,161 @@
> > +/*
> > + * fsync-err.c: test whether writeback errors are reported to all open fds
> > + * and properly cleared as expected after being seen once on each
> > + *
> > + * Copyright (c) 2017: Jeff Layton <[email protected]>
> > + */
> > +#include <sys/types.h>
> > +#include <sys/stat.h>
> > +#include <errno.h>
> > +#include <fcntl.h>
> > +#include <stdlib.h>
> > +#include <stdio.h>
> > +#include <string.h>
> > +#include <unistd.h>
> > +
> > +/*
> > + * btrfs has a fixed stripewidth of 64k, so we need to write enough data to
> > + * ensure that we hit both stripes.
> > + *
> > + * FIXME: have the test script pass in the length?
> > + */
> > +#define BUFSIZE (65 * 1024)
> > +
> > +/* FIXME: should this be tunable */
> > +#define NUM_FDS 10
>
> Passed through command line parameter, and default value is 10 if not
> specified?
>

Sure. Perhaps we should do the same with BUFSIZE? Particularly if we
need to vary it between fs'.

> > +
> > +static void usage() {
> > + fprintf(stderr, "Usage: fsync-err <filename>\n");
> > +}
> > +
> > +int main(int argc, char **argv)
> > +{
> > + int fd[NUM_FDS], ret, i;
> > + char *fname, *buf;
> > +
> > + if (argc < 1) {
> > + usage();
> > + return 1;
> > + }
> > +
> > + /* First argument is filename */
> > + fname = argv[1];
> > +
> > + for (i = 0; i < NUM_FDS; ++i) {
> > + fd[i] = open(fname, O_WRONLY | O_CREAT | O_TRUNC, 0644);
> > + if (fd[i] < 0) {
> > + printf("open of fd[%d] failed: %m\n", i);
> > + return 1;
> > + }
> > + }
> > +
> > + buf = malloc(BUFSIZE);
> > + if (!buf) {
> > + printf("malloc failed: %m\n");
> > + return 1;
> > + }
> > +
> > + memset(buf, 0x7c, BUFSIZE);
> > +
> > + for (i = 0; i < NUM_FDS; ++i) {
> > + ret = write(fd[i], buf, BUFSIZE);
> > + if (ret < 0) {
> > + printf("First write on fd[%d] failed: %m\n", i);
> > + return 1;
> > + }
> > + }
> > +
> > + for (i = 0; i < NUM_FDS; ++i) {
> > + ret = fsync(fd[i]);
> > + if (ret < 0) {
> > + printf("First fsync on fd[%d] failed: %m\n", i);
> > + return 1;
> > + }
> > + }
> > +
> > + /* flip the device to non-working mode */
> > + ret = system("./tools/dmerror load_error_table");
>
> Hmm, how about passing these "load error table" and "load working table"
> commands through command line parameters too?
>
> > + if (ret) {
> > + if (WIFEXITED(ret))
> > + printf("system: program exited: %d\n",
> > + WEXITSTATUS(ret));
> > + else
> > + printf("system: 0x%x\n", (int)ret);
> > +
> > + return 1;
> > + }
> > +
> > + for (i = 0; i < NUM_FDS; ++i) {
> > + ret = write(fd[i], buf, strlen(buf) + 1);
> > + if (ret < 0) {
> > + printf("Second write on fd[%d] failed: %m\n", i);
> > + return 1;
> > + }
> > + }
> > +
> > + for (i = 0; i < NUM_FDS; ++i) {
> > + ret = fsync(fd[i]);
> > + /* Now, we EXPECT the error! */
> > + if (ret >= 0) {
> > + printf("Success on second fsync on fd[%d]!\n", i);
> > + return 1;
> > + }
> > + }
> > +
> > + for (i = 0; i < NUM_FDS; ++i) {
> > + ret = fsync(fd[i]);
> > + if (ret < 0) {
> > + /* Now the error should be clear */
>
> It's not obvious to me why error should be clear at this stage, adding
> some comments would be good.
>

Ok. FWIW:

We did some writes to a failing device and called fsync and got an error
back. Since no more data was written since then, we don't expect to see
an error at this point since it should have been "cleared".

> > + printf("Third fsync on fd[%d] failed: %m\n", i);
> > + return 1;
> > + }
> > + }
> > +
> > + /* flip the device to working mode */
> > + ret = system("./tools/dmerror load_working_table");
> > + if (ret) {
> > + if (WIFEXITED(ret))
> > + printf("system: program exited: %d\n",
> > + WEXITSTATUS(ret));
> > + else
> > + printf("system: 0x%x\n", (int)ret);
> > +
> > + return 1;
> > + }
> > +
> > + for (i = 0; i < NUM_FDS; ++i) {
> > + ret = fsync(fd[i]);
> > + if (ret < 0) {
> > + /* The error should still be clear */
> > + printf("fsync after healing device on fd[%d] failed: %m\n", i);
> > + return 1;
> > + }
> > + }
> > +
> > + /*
> > + * reopen each file one at a time to ensure the same inode stays
> > + * in core. fsync each one to make sure we see no errors on a fresh
> > + * open of the inode.
> > + */
> > + for (i = 0; i < NUM_FDS; ++i) {
> > + ret = close(fd[i]);
> > + if (ret < 0) {
> > + printf("Close of fd[%d] returned unexpected error: %m\n", i);
> > + return 1;
> > + }
> > + fd[i] = open(fname, O_WRONLY | O_CREAT | O_TRUNC, 0644);
> > + if (fd[i] < 0) {
> > + printf("Second open of fd[%d] failed: %m\n", i);
> > + return 1;
> > + }
> > + ret = fsync(fd[i]);
> > + if (ret < 0) {
> > + /* New opens should not return an error */
> > + printf("First fsync after reopen of fd[%d] failed: %m\n", i);
> > + return 1;
> > + }
> > + }
> > +
> > + printf("Test passed!\n");
> > + return 0;
> > +}
> > diff --git a/tests/generic/999 b/tests/generic/999
> > new file mode 100755
> > index 000000000000..6de143d1149e
> > --- /dev/null
> > +++ b/tests/generic/999
> > @@ -0,0 +1,76 @@
> > +#! /bin/bash
> > +# FS QA Test No. 999
> > +#
> > +# Open a file several times, write to it, fsync on all fds and make sure that
> > +# they all return 0. Change the device to start throwing errors. Write again
> > +# on all fds and fsync on all fds. Ensure that we get errors on all of them.
> > +# Then fsync on all one last time and verify that all return 0.
> > +#
> > +#-----------------------------------------------------------------------
> > +# Copyright (c) 2017, Jeff Layton <[email protected]>
> > +#
> > +# This program is free software; you can redistribute it and/or
> > +# modify it under the terms of the GNU General Public License as
> > +# published by the Free Software Foundation.
> > +#
> > +# This program is distributed in the hope that it would be useful,
> > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> > +# GNU General Public License for more details.
> > +#
> > +# You should have received a copy of the GNU General Public License
> > +# along with this program; if not, write the Free Software Foundation,
> > +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
> > +#-----------------------------------------------------------------------
> > +
> > +seq=`basename $0`
> > +seqres=$RESULT_DIR/$seq
> > +echo "QA output created by $seq"
> > +
> > +here=`pwd`
> > +tmp=/tmp/$$
> > +status=1 # failure is the default!
> > +trap "_cleanup; exit \$status" 0 1 2 3 15
> > +
> > +_cleanup()
> > +{
> > + cd /
> > + rm -rf $tmp.* $testdir
> > + _dmerror_cleanup
> > +}
> > +
> > +# get standard environment, filters and checks
> > +. ./common/rc
> > +. ./common/filter
> > +. ./common/dmerror
> > +
> > +# real QA test starts here
> > +_supported_os Linux
> > +_require_scratch
> > +_require_logdev
> > +_require_dm_target error
> > +_require_test_program fsync-err
>
> Test also uses tools/dmerror (or src/dmerror), also should make sure
> that file is there.
>
> # Assuming src/dmerror
> _require_test_program "dmerror"
>
> > +
> > +rm -f $seqres.full
> > +
> > +echo "Format and mount"
> > +$XFS_IO_PROG -d -c "pwrite -S 0x7c -b 1048576 0 $((64 * 1048576))" $SCRATCH_DEV >> $seqres.full
>
> This is not needed.
>
> > +_scratch_mkfs > $seqres.full 2>&1
> > +_dmerror_init
> > +_dmerror_mount >> $seqres.full 2>&1
> > +_dmerror_unmount
>
> This extra _dmerror_mount/unmount cycle doesn't seem necessary to me
> either.
>

ACK -- will fix all of the above.

> > +_dmerror_mount
> > +
> > +_require_fs_space $SCRATCH_MNT 8192
> > +
> > +testfile=$SCRATCH_MNT/fsync-err-test
> > +
> > +$here/src/fsync-err $testfile
> > +
> > +# success, all done
> > +_dmerror_load_working_table
> > +_dmerror_unmount
> > +_dmerror_cleanup
> > +_repair_scratch_fs >> $seqres.full
>
> _require_scratch_fs will return 0 if it found corruption but repaired it
> successfully, then we'll miss a fs curruption failure.
>
> Test harness will do fsck after each test by default, we can exit
> directly.
>

I'm not really interested in testing whether the fs is corrupt after
this. It may very well be completely hosed afterward. The only thing we
want to test here is the behavior while the errors actually occur.


> Thanks,
> Eryu
>
> > +status=0
> > +exit
> > diff --git a/tests/generic/999.out b/tests/generic/999.out
> > new file mode 100644
> > index 000000000000..2e48492ff6d1
> > --- /dev/null
> > +++ b/tests/generic/999.out
> > @@ -0,0 +1,3 @@
> > +QA output created by 999
> > +Format and mount
> > +Test passed!
> > diff --git a/tests/generic/group b/tests/generic/group
> > index 438c299030f2..39f7b14657f1 100644
> > --- a/tests/generic/group
> > +++ b/tests/generic/group
> > @@ -440,3 +440,4 @@
> > 435 auto encrypt
> > 436 auto quick rw
> > 437 auto quick
> > +999 auto quick
> > diff --git a/tools/dmerror b/tools/dmerror
> > new file mode 100755
> > index 000000000000..4aaf682ee5f9
> > --- /dev/null
> > +++ b/tools/dmerror
> > @@ -0,0 +1,44 @@
> > +#!/bin/bash
> > +#-----------------------------------------------------------------------
> > +# Copyright (c) 2017, Jeff Layton <[email protected]>
> > +#
> > +# This program is free software; you can redistribute it and/or
> > +# modify it under the terms of the GNU General Public License as
> > +# published by the Free Software Foundation.
> > +#
> > +# This program is distributed in the hope that it would be useful,
> > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> > +# GNU General Public License for more details.
> > +#
> > +# You should have received a copy of the GNU General Public License
> > +# along with this program; if not, write the Free Software Foundation,
> > +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
> > +#-----------------------------------------------------------------------
> > +
> > +. ./common/config
> > +. ./common/dmerror
> > +
> > +_dmerror_setup
> > +
> > +case $1 in
> > +cleanup)
> > + _dmerror_cleanup
> > + ;;
> > +init)
> > + _dmerror_init
> > + ;;
> > +load_error_table)
> > + _dmerror_load_error_table
> > + ;;
> > +load_working_table)
> > + _dmerror_load_working_table
> > + ;;
> > +*)
> > + echo "Usage: $0 {init|cleanup|load_error_table|load_working_table}"
> > + exit 1
> > + ;;
> > +esac
> > +
> > +status=0
> > +exit
> > --
> > 2.9.4
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe fstests" in
> > the body of a message to [email protected]
> > More majordomo info at http://vger.kernel.org/majordomo-info.html

--
Jeff Layton <[email protected]>

2017-06-06 12:23:25

by Eryu Guan

[permalink] [raw]
Subject: Re: [xfstests PATCH v3 1/5] generic: add a writeback error handling test

On Tue, Jun 06, 2017 at 06:15:57AM -0400, Jeff Layton wrote:
> On Tue, 2017-06-06 at 16:58 +0800, Eryu Guan wrote:
> > On Wed, May 31, 2017 at 09:08:16AM -0400, Jeff Layton wrote:
> > > I'm working on a set of kernel patches to change how writeback errors
> > > are handled and reported in the kernel. Instead of reporting a
> > > writeback error to only the first fsync caller on the file, I aim
> > > to make the kernel report them once on every file description.
> > >
> > > This patch adds a test for the new behavior. Basically, open many fds
> > > to the same file, turn on dm_error, write to each of the fds, and then
> > > fsync them all to ensure that they all get an error back.
> > >
> > > To do that, I'm adding a new tools/dmerror script that the C program
> > > can use to load the error table. For now, that's all it can do, but
> > > we can fill it out with other commands as necessary.
> > >
> > > Signed-off-by: Jeff Layton <[email protected]>
> >
> > Thanks for the new tests! And sorry for the late review..
> >
> > It's testing a new behavior on error reporting on writeback, I'm not
> > sure if we can call it a new feature or it fixed a bug? But it's more
> > like a behavior change, I'm not sure how to categorize it.
> >
> > Because if it's testing a new feature, we usually let test do proper
> > detection of current test environment (based on actual behavior not
> > kernel version) and _notrun on filesystems that don't have this feature
> > yet, instead of failing the test; if it's testing a bug fix, we could
> > leave the test fail on unfixed filesystems, this also serves as a
> > reminder that there's bug to fix.
> >
>
> Thanks for the review! I'm not sure how to categorize this either. Since
> the plan is to convert all the filesystems piecemeal, maybe we should
> just consider it a new feature.

Then we need a new _require rule to properly detect for the 'feature'
support. I'm not sure if this is doable, but something like
_require_statx, _require_seek_data_hole would be good.

>
> > I pulled your test kernel tree, and test passed on EXT4 but failed on
> > other local filesystems (XFS, btrfs). I assume that's expected.
> >
> > Besides this kinda high-level question, some minor comments inline.
> >
>
> Yes, ext4 should pass on my latest kernel tree, but everything else
> should fail.

With the new _require rule, test should _notrun on XFS and btrfs then.

>
> > > ---
> > > common/dmerror | 13 ++--
> > > doc/auxiliary-programs.txt | 8 +++
> > > src/Makefile | 2 +-
> > > src/fsync-err.c | 161 +++++++++++++++++++++++++++++++++++++++++++++
> >
> > New binary needs an entry in .gitignore file.
> >
>
> OK, thanks. Will fix.
>
> > > tests/generic/999 | 76 +++++++++++++++++++++
> > > tests/generic/999.out | 3 +
> > > tests/generic/group | 1 +
> > > tools/dmerror | 44 +++++++++++++
> >
> > This file is used by the test, then it should be in src/ directory and
> > be installed along with other executable files on "make install".
> > Because files under tools/ are not installed. Most people will run tests
> > in the root dir of xfstests and this is not a problem, but there're
> > still cases people do "make && make install" and run fstests from
> > /var/lib/xfstests (default installation target).
> >
>
> Ok, no problem. I'll move it. I wasn't sure here since dmerror is a
> shell script, and most of the stuff in src/ is stuff that needs to be
> built.

We do have a few perl or shell scripts in src/ dir, we can see this from
src/Makefile

$(LTINSTALL) -m 755 fill2attr fill2fs fill2fs_check scaleread.sh $(PKG_LIB_DIR)/src

>
> > > 8 files changed, 302 insertions(+), 6 deletions(-)
> > > create mode 100644 src/fsync-err.c
> > > create mode 100755 tests/generic/999
> > > create mode 100644 tests/generic/999.out
> > > create mode 100755 tools/dmerror
> > >
> > > diff --git a/common/dmerror b/common/dmerror
> > > index d46c5d0b7266..238baa213b1f 100644
> > > --- a/common/dmerror
> > > +++ b/common/dmerror
> > > @@ -23,22 +23,25 @@ if [ $? -eq 0 ]; then
> > > _notrun "Cannot run tests with DAX on dmerror devices"
> > > fi
> > >
> > > -_dmerror_init()
> > > +_dmerror_setup()
> > > {
> > > local dm_backing_dev=$SCRATCH_DEV
> > >
> > > - $DMSETUP_PROG remove error-test > /dev/null 2>&1
> > > -
> > > local blk_dev_size=`blockdev --getsz $dm_backing_dev`
> > >
> > > DMERROR_DEV='/dev/mapper/error-test'
> > >
> > > DMLINEAR_TABLE="0 $blk_dev_size linear $dm_backing_dev 0"
> > >
> > > + DMERROR_TABLE="0 $blk_dev_size error $dm_backing_dev 0"
> > > +}
> > > +
> > > +_dmerror_init()
> > > +{
> > > + _dmerror_setup
> > > + $DMSETUP_PROG remove error-test > /dev/null 2>&1
> > > $DMSETUP_PROG create error-test --table "$DMLINEAR_TABLE" || \
> > > _fatal "failed to create dm linear device"
> > > -
> > > - DMERROR_TABLE="0 $blk_dev_size error $dm_backing_dev 0"
> > > }
> > >
> > > _dmerror_mount()
> > > diff --git a/doc/auxiliary-programs.txt b/doc/auxiliary-programs.txt
> > > index 21ef118596b6..191ac0596511 100644
> > > --- a/doc/auxiliary-programs.txt
> > > +++ b/doc/auxiliary-programs.txt
> > > @@ -16,6 +16,7 @@ note the dependency with:
> > > Contents:
> > >
> > > - af_unix -- Create an AF_UNIX socket
> > > + - fsync-err -- tests fsync error reporting after failed writeback
> > > - open_by_handle -- open_by_handle_at syscall exercise
> > > - stat_test -- statx syscall exercise
> > > - t_dir_type -- print directory entries and their file type
> > > @@ -30,6 +31,13 @@ af_unix
> > >
> > > The af_unix program creates an AF_UNIX socket at the given location.
> > >
> > > +fsync-err
> > > + Specialized program for testing how the kernel reports errors that
> > > + occur during writeback. Works in conjunction with the dmerror script
> > > + in tools/ to write data to a device, and then force it to fail
> > > + writeback and test that errors are reported during fsync and cleared
> > > + afterward.
> > > +
> > > open_by_handle
> > >
> > > The open_by_handle program exercises the open_by_handle_at() system
> > > diff --git a/src/Makefile b/src/Makefile
> > > index 4ec01975f8f7..b79c4d84d31b 100644
> > > --- a/src/Makefile
> > > +++ b/src/Makefile
> > > @@ -13,7 +13,7 @@ TARGETS = dirstress fill fill2 getpagesize holes lstat64 \
> > > multi_open_unlink dmiperf unwritten_sync genhashnames t_holes \
> > > t_mmap_writev t_truncate_cmtime dirhash_collide t_rename_overwrite \
> > > holetest t_truncate_self t_mmap_dio af_unix t_mmap_stale_pmd \
> > > - t_mmap_cow_race
> > > + t_mmap_cow_race fsync-err
> > >
> > > LINUX_TARGETS = xfsctl bstat t_mtab getdevicesize preallo_rw_pattern_reader \
> > > preallo_rw_pattern_writer ftrunc trunc fs_perms testx looptest \
> > > diff --git a/src/fsync-err.c b/src/fsync-err.c
> > > new file mode 100644
> > > index 000000000000..cbeb37fb1790
> > > --- /dev/null
> > > +++ b/src/fsync-err.c
> > > @@ -0,0 +1,161 @@
> > > +/*
> > > + * fsync-err.c: test whether writeback errors are reported to all open fds
> > > + * and properly cleared as expected after being seen once on each
> > > + *
> > > + * Copyright (c) 2017: Jeff Layton <[email protected]>
> > > + */
> > > +#include <sys/types.h>
> > > +#include <sys/stat.h>
> > > +#include <errno.h>
> > > +#include <fcntl.h>
> > > +#include <stdlib.h>
> > > +#include <stdio.h>
> > > +#include <string.h>
> > > +#include <unistd.h>
> > > +
> > > +/*
> > > + * btrfs has a fixed stripewidth of 64k, so we need to write enough data to
> > > + * ensure that we hit both stripes.
> > > + *
> > > + * FIXME: have the test script pass in the length?
> > > + */
> > > +#define BUFSIZE (65 * 1024)
> > > +
> > > +/* FIXME: should this be tunable */
> > > +#define NUM_FDS 10
> >
> > Passed through command line parameter, and default value is 10 if not
> > specified?
> >
>
> Sure. Perhaps we should do the same with BUFSIZE? Particularly if we
> need to vary it between fs'.
>
> > > +
> > > +static void usage() {
> > > + fprintf(stderr, "Usage: fsync-err <filename>\n");
> > > +}
> > > +
> > > +int main(int argc, char **argv)
> > > +{
> > > + int fd[NUM_FDS], ret, i;
> > > + char *fname, *buf;
> > > +
> > > + if (argc < 1) {
> > > + usage();
> > > + return 1;
> > > + }
> > > +
> > > + /* First argument is filename */
> > > + fname = argv[1];
> > > +
> > > + for (i = 0; i < NUM_FDS; ++i) {
> > > + fd[i] = open(fname, O_WRONLY | O_CREAT | O_TRUNC, 0644);
> > > + if (fd[i] < 0) {
> > > + printf("open of fd[%d] failed: %m\n", i);
> > > + return 1;
> > > + }
> > > + }
> > > +
> > > + buf = malloc(BUFSIZE);
> > > + if (!buf) {
> > > + printf("malloc failed: %m\n");
> > > + return 1;
> > > + }
> > > +
> > > + memset(buf, 0x7c, BUFSIZE);
> > > +
> > > + for (i = 0; i < NUM_FDS; ++i) {
> > > + ret = write(fd[i], buf, BUFSIZE);
> > > + if (ret < 0) {
> > > + printf("First write on fd[%d] failed: %m\n", i);
> > > + return 1;
> > > + }
> > > + }
> > > +
> > > + for (i = 0; i < NUM_FDS; ++i) {
> > > + ret = fsync(fd[i]);
> > > + if (ret < 0) {
> > > + printf("First fsync on fd[%d] failed: %m\n", i);
> > > + return 1;
> > > + }
> > > + }
> > > +
> > > + /* flip the device to non-working mode */
> > > + ret = system("./tools/dmerror load_error_table");
> >
> > Hmm, how about passing these "load error table" and "load working table"
> > commands through command line parameters too?
> >
> > > + if (ret) {
> > > + if (WIFEXITED(ret))
> > > + printf("system: program exited: %d\n",
> > > + WEXITSTATUS(ret));
> > > + else
> > > + printf("system: 0x%x\n", (int)ret);
> > > +
> > > + return 1;
> > > + }
> > > +
> > > + for (i = 0; i < NUM_FDS; ++i) {
> > > + ret = write(fd[i], buf, strlen(buf) + 1);
> > > + if (ret < 0) {
> > > + printf("Second write on fd[%d] failed: %m\n", i);
> > > + return 1;
> > > + }
> > > + }
> > > +
> > > + for (i = 0; i < NUM_FDS; ++i) {
> > > + ret = fsync(fd[i]);
> > > + /* Now, we EXPECT the error! */
> > > + if (ret >= 0) {
> > > + printf("Success on second fsync on fd[%d]!\n", i);
> > > + return 1;
> > > + }
> > > + }
> > > +
> > > + for (i = 0; i < NUM_FDS; ++i) {
> > > + ret = fsync(fd[i]);
> > > + if (ret < 0) {
> > > + /* Now the error should be clear */
> >
> > It's not obvious to me why error should be clear at this stage, adding
> > some comments would be good.
> >
>
> Ok. FWIW:
>
> We did some writes to a failing device and called fsync and got an error
> back. Since no more data was written since then, we don't expect to see
> an error at this point since it should have been "cleared".

Thanks! It wasn't clear to me until I read your kernel patch :)

Thanks,
Eryu

2017-06-06 17:17:04

by Darrick J. Wong

[permalink] [raw]
Subject: Re: [xfstests PATCH v3 1/5] generic: add a writeback error handling test

On Tue, Jun 06, 2017 at 08:23:25PM +0800, Eryu Guan wrote:
> On Tue, Jun 06, 2017 at 06:15:57AM -0400, Jeff Layton wrote:
> > On Tue, 2017-06-06 at 16:58 +0800, Eryu Guan wrote:
> > > On Wed, May 31, 2017 at 09:08:16AM -0400, Jeff Layton wrote:
> > > > I'm working on a set of kernel patches to change how writeback errors
> > > > are handled and reported in the kernel. Instead of reporting a
> > > > writeback error to only the first fsync caller on the file, I aim
> > > > to make the kernel report them once on every file description.
> > > >
> > > > This patch adds a test for the new behavior. Basically, open many fds
> > > > to the same file, turn on dm_error, write to each of the fds, and then
> > > > fsync them all to ensure that they all get an error back.
> > > >
> > > > To do that, I'm adding a new tools/dmerror script that the C program
> > > > can use to load the error table. For now, that's all it can do, but
> > > > we can fill it out with other commands as necessary.
> > > >
> > > > Signed-off-by: Jeff Layton <[email protected]>
> > >
> > > Thanks for the new tests! And sorry for the late review..
> > >
> > > It's testing a new behavior on error reporting on writeback, I'm not
> > > sure if we can call it a new feature or it fixed a bug? But it's more
> > > like a behavior change, I'm not sure how to categorize it.
> > >
> > > Because if it's testing a new feature, we usually let test do proper
> > > detection of current test environment (based on actual behavior not
> > > kernel version) and _notrun on filesystems that don't have this feature
> > > yet, instead of failing the test; if it's testing a bug fix, we could
> > > leave the test fail on unfixed filesystems, this also serves as a
> > > reminder that there's bug to fix.
> > >
> >
> > Thanks for the review! I'm not sure how to categorize this either. Since
> > the plan is to convert all the filesystems piecemeal, maybe we should
> > just consider it a new feature.
>
> Then we need a new _require rule to properly detect for the 'feature'
> support. I'm not sure if this is doable, but something like
> _require_statx, _require_seek_data_hole would be good.
>
> >
> > > I pulled your test kernel tree, and test passed on EXT4 but failed on
> > > other local filesystems (XFS, btrfs). I assume that's expected.
> > >
> > > Besides this kinda high-level question, some minor comments inline.
> > >
> >
> > Yes, ext4 should pass on my latest kernel tree, but everything else
> > should fail.
>
> With the new _require rule, test should _notrun on XFS and btrfs then.

Frankly I personally prefer that upstream XFS fails until someone fixes it. :)
(But that's just my opinion.)

That said, I'm not 100% sure what's required of XFS to play nicely with
this new mechanism -- glancing at the ext* patches it looks like we'd
need to set a fs flag and possibly change some or all of the "write
cached dirty buffers out to disk" calls to their _since variants?
Metadata writeback errors are handled by retrying writes and/or shutting
down the fs, so I think the f_md_wb_error case is already covered.

That said, I agree that it's useful to detect that the kernel simply
lacks any of the new wb error reporting at all, so therefore we can skip
the tests.

--D

>
> >
> > > > ---
> > > > common/dmerror | 13 ++--
> > > > doc/auxiliary-programs.txt | 8 +++
> > > > src/Makefile | 2 +-
> > > > src/fsync-err.c | 161 +++++++++++++++++++++++++++++++++++++++++++++
> > >
> > > New binary needs an entry in .gitignore file.
> > >
> >
> > OK, thanks. Will fix.
> >
> > > > tests/generic/999 | 76 +++++++++++++++++++++
> > > > tests/generic/999.out | 3 +
> > > > tests/generic/group | 1 +
> > > > tools/dmerror | 44 +++++++++++++
> > >
> > > This file is used by the test, then it should be in src/ directory and
> > > be installed along with other executable files on "make install".
> > > Because files under tools/ are not installed. Most people will run tests
> > > in the root dir of xfstests and this is not a problem, but there're
> > > still cases people do "make && make install" and run fstests from
> > > /var/lib/xfstests (default installation target).
> > >
> >
> > Ok, no problem. I'll move it. I wasn't sure here since dmerror is a
> > shell script, and most of the stuff in src/ is stuff that needs to be
> > built.
>
> We do have a few perl or shell scripts in src/ dir, we can see this from
> src/Makefile
>
> $(LTINSTALL) -m 755 fill2attr fill2fs fill2fs_check scaleread.sh $(PKG_LIB_DIR)/src
>
> >
> > > > 8 files changed, 302 insertions(+), 6 deletions(-)
> > > > create mode 100644 src/fsync-err.c
> > > > create mode 100755 tests/generic/999
> > > > create mode 100644 tests/generic/999.out
> > > > create mode 100755 tools/dmerror
> > > >
> > > > diff --git a/common/dmerror b/common/dmerror
> > > > index d46c5d0b7266..238baa213b1f 100644
> > > > --- a/common/dmerror
> > > > +++ b/common/dmerror
> > > > @@ -23,22 +23,25 @@ if [ $? -eq 0 ]; then
> > > > _notrun "Cannot run tests with DAX on dmerror devices"
> > > > fi
> > > >
> > > > -_dmerror_init()
> > > > +_dmerror_setup()
> > > > {
> > > > local dm_backing_dev=$SCRATCH_DEV
> > > >
> > > > - $DMSETUP_PROG remove error-test > /dev/null 2>&1
> > > > -
> > > > local blk_dev_size=`blockdev --getsz $dm_backing_dev`
> > > >
> > > > DMERROR_DEV='/dev/mapper/error-test'
> > > >
> > > > DMLINEAR_TABLE="0 $blk_dev_size linear $dm_backing_dev 0"
> > > >
> > > > + DMERROR_TABLE="0 $blk_dev_size error $dm_backing_dev 0"
> > > > +}
> > > > +
> > > > +_dmerror_init()
> > > > +{
> > > > + _dmerror_setup
> > > > + $DMSETUP_PROG remove error-test > /dev/null 2>&1
> > > > $DMSETUP_PROG create error-test --table "$DMLINEAR_TABLE" || \
> > > > _fatal "failed to create dm linear device"
> > > > -
> > > > - DMERROR_TABLE="0 $blk_dev_size error $dm_backing_dev 0"
> > > > }
> > > >
> > > > _dmerror_mount()
> > > > diff --git a/doc/auxiliary-programs.txt b/doc/auxiliary-programs.txt
> > > > index 21ef118596b6..191ac0596511 100644
> > > > --- a/doc/auxiliary-programs.txt
> > > > +++ b/doc/auxiliary-programs.txt
> > > > @@ -16,6 +16,7 @@ note the dependency with:
> > > > Contents:
> > > >
> > > > - af_unix -- Create an AF_UNIX socket
> > > > + - fsync-err -- tests fsync error reporting after failed writeback
> > > > - open_by_handle -- open_by_handle_at syscall exercise
> > > > - stat_test -- statx syscall exercise
> > > > - t_dir_type -- print directory entries and their file type
> > > > @@ -30,6 +31,13 @@ af_unix
> > > >
> > > > The af_unix program creates an AF_UNIX socket at the given location.
> > > >
> > > > +fsync-err
> > > > + Specialized program for testing how the kernel reports errors that
> > > > + occur during writeback. Works in conjunction with the dmerror script
> > > > + in tools/ to write data to a device, and then force it to fail
> > > > + writeback and test that errors are reported during fsync and cleared
> > > > + afterward.
> > > > +
> > > > open_by_handle
> > > >
> > > > The open_by_handle program exercises the open_by_handle_at() system
> > > > diff --git a/src/Makefile b/src/Makefile
> > > > index 4ec01975f8f7..b79c4d84d31b 100644
> > > > --- a/src/Makefile
> > > > +++ b/src/Makefile
> > > > @@ -13,7 +13,7 @@ TARGETS = dirstress fill fill2 getpagesize holes lstat64 \
> > > > multi_open_unlink dmiperf unwritten_sync genhashnames t_holes \
> > > > t_mmap_writev t_truncate_cmtime dirhash_collide t_rename_overwrite \
> > > > holetest t_truncate_self t_mmap_dio af_unix t_mmap_stale_pmd \
> > > > - t_mmap_cow_race
> > > > + t_mmap_cow_race fsync-err
> > > >
> > > > LINUX_TARGETS = xfsctl bstat t_mtab getdevicesize preallo_rw_pattern_reader \
> > > > preallo_rw_pattern_writer ftrunc trunc fs_perms testx looptest \
> > > > diff --git a/src/fsync-err.c b/src/fsync-err.c
> > > > new file mode 100644
> > > > index 000000000000..cbeb37fb1790
> > > > --- /dev/null
> > > > +++ b/src/fsync-err.c
> > > > @@ -0,0 +1,161 @@
> > > > +/*
> > > > + * fsync-err.c: test whether writeback errors are reported to all open fds
> > > > + * and properly cleared as expected after being seen once on each
> > > > + *
> > > > + * Copyright (c) 2017: Jeff Layton <[email protected]>
> > > > + */
> > > > +#include <sys/types.h>
> > > > +#include <sys/stat.h>
> > > > +#include <errno.h>
> > > > +#include <fcntl.h>
> > > > +#include <stdlib.h>
> > > > +#include <stdio.h>
> > > > +#include <string.h>
> > > > +#include <unistd.h>
> > > > +
> > > > +/*
> > > > + * btrfs has a fixed stripewidth of 64k, so we need to write enough data to
> > > > + * ensure that we hit both stripes.
> > > > + *
> > > > + * FIXME: have the test script pass in the length?
> > > > + */
> > > > +#define BUFSIZE (65 * 1024)
> > > > +
> > > > +/* FIXME: should this be tunable */
> > > > +#define NUM_FDS 10
> > >
> > > Passed through command line parameter, and default value is 10 if not
> > > specified?
> > >
> >
> > Sure. Perhaps we should do the same with BUFSIZE? Particularly if we
> > need to vary it between fs'.
> >
> > > > +
> > > > +static void usage() {
> > > > + fprintf(stderr, "Usage: fsync-err <filename>\n");
> > > > +}
> > > > +
> > > > +int main(int argc, char **argv)
> > > > +{
> > > > + int fd[NUM_FDS], ret, i;
> > > > + char *fname, *buf;
> > > > +
> > > > + if (argc < 1) {
> > > > + usage();
> > > > + return 1;
> > > > + }
> > > > +
> > > > + /* First argument is filename */
> > > > + fname = argv[1];
> > > > +
> > > > + for (i = 0; i < NUM_FDS; ++i) {
> > > > + fd[i] = open(fname, O_WRONLY | O_CREAT | O_TRUNC, 0644);
> > > > + if (fd[i] < 0) {
> > > > + printf("open of fd[%d] failed: %m\n", i);
> > > > + return 1;
> > > > + }
> > > > + }
> > > > +
> > > > + buf = malloc(BUFSIZE);
> > > > + if (!buf) {
> > > > + printf("malloc failed: %m\n");
> > > > + return 1;
> > > > + }
> > > > +
> > > > + memset(buf, 0x7c, BUFSIZE);
> > > > +
> > > > + for (i = 0; i < NUM_FDS; ++i) {
> > > > + ret = write(fd[i], buf, BUFSIZE);
> > > > + if (ret < 0) {
> > > > + printf("First write on fd[%d] failed: %m\n", i);
> > > > + return 1;
> > > > + }
> > > > + }
> > > > +
> > > > + for (i = 0; i < NUM_FDS; ++i) {
> > > > + ret = fsync(fd[i]);
> > > > + if (ret < 0) {
> > > > + printf("First fsync on fd[%d] failed: %m\n", i);
> > > > + return 1;
> > > > + }
> > > > + }
> > > > +
> > > > + /* flip the device to non-working mode */
> > > > + ret = system("./tools/dmerror load_error_table");
> > >
> > > Hmm, how about passing these "load error table" and "load working table"
> > > commands through command line parameters too?
> > >
> > > > + if (ret) {
> > > > + if (WIFEXITED(ret))
> > > > + printf("system: program exited: %d\n",
> > > > + WEXITSTATUS(ret));
> > > > + else
> > > > + printf("system: 0x%x\n", (int)ret);
> > > > +
> > > > + return 1;
> > > > + }
> > > > +
> > > > + for (i = 0; i < NUM_FDS; ++i) {
> > > > + ret = write(fd[i], buf, strlen(buf) + 1);
> > > > + if (ret < 0) {
> > > > + printf("Second write on fd[%d] failed: %m\n", i);
> > > > + return 1;
> > > > + }
> > > > + }
> > > > +
> > > > + for (i = 0; i < NUM_FDS; ++i) {
> > > > + ret = fsync(fd[i]);
> > > > + /* Now, we EXPECT the error! */
> > > > + if (ret >= 0) {
> > > > + printf("Success on second fsync on fd[%d]!\n", i);
> > > > + return 1;
> > > > + }
> > > > + }
> > > > +
> > > > + for (i = 0; i < NUM_FDS; ++i) {
> > > > + ret = fsync(fd[i]);
> > > > + if (ret < 0) {
> > > > + /* Now the error should be clear */
> > >
> > > It's not obvious to me why error should be clear at this stage, adding
> > > some comments would be good.
> > >
> >
> > Ok. FWIW:
> >
> > We did some writes to a failing device and called fsync and got an error
> > back. Since no more data was written since then, we don't expect to see
> > an error at this point since it should have been "cleared".
>
> Thanks! It wasn't clear to me until I read your kernel patch :)
>
> Thanks,
> Eryu

2017-06-06 20:12:58

by Jeff Layton

[permalink] [raw]
Subject: Re: [xfstests PATCH v3 1/5] generic: add a writeback error handling test

On Tue, 2017-06-06 at 10:17 -0700, Darrick J. Wong wrote:
> On Tue, Jun 06, 2017 at 08:23:25PM +0800, Eryu Guan wrote:
> > On Tue, Jun 06, 2017 at 06:15:57AM -0400, Jeff Layton wrote:
> > > On Tue, 2017-06-06 at 16:58 +0800, Eryu Guan wrote:
> > > > On Wed, May 31, 2017 at 09:08:16AM -0400, Jeff Layton wrote:
> > > > > I'm working on a set of kernel patches to change how writeback errors
> > > > > are handled and reported in the kernel. Instead of reporting a
> > > > > writeback error to only the first fsync caller on the file, I aim
> > > > > to make the kernel report them once on every file description.
> > > > >
> > > > > This patch adds a test for the new behavior. Basically, open many fds
> > > > > to the same file, turn on dm_error, write to each of the fds, and then
> > > > > fsync them all to ensure that they all get an error back.
> > > > >
> > > > > To do that, I'm adding a new tools/dmerror script that the C program
> > > > > can use to load the error table. For now, that's all it can do, but
> > > > > we can fill it out with other commands as necessary.
> > > > >
> > > > > Signed-off-by: Jeff Layton <[email protected]>
> > > >
> > > > Thanks for the new tests! And sorry for the late review..
> > > >
> > > > It's testing a new behavior on error reporting on writeback, I'm not
> > > > sure if we can call it a new feature or it fixed a bug? But it's more
> > > > like a behavior change, I'm not sure how to categorize it.
> > > >
> > > > Because if it's testing a new feature, we usually let test do proper
> > > > detection of current test environment (based on actual behavior not
> > > > kernel version) and _notrun on filesystems that don't have this feature
> > > > yet, instead of failing the test; if it's testing a bug fix, we could
> > > > leave the test fail on unfixed filesystems, this also serves as a
> > > > reminder that there's bug to fix.
> > > >
> > >
> > > Thanks for the review! I'm not sure how to categorize this either. Since
> > > the plan is to convert all the filesystems piecemeal, maybe we should
> > > just consider it a new feature.
> >
> > Then we need a new _require rule to properly detect for the 'feature'
> > support. I'm not sure if this is doable, but something like
> > _require_statx, _require_seek_data_hole would be good.
> >
> > >
> > > > I pulled your test kernel tree, and test passed on EXT4 but failed on
> > > > other local filesystems (XFS, btrfs). I assume that's expected.
> > > >
> > > > Besides this kinda high-level question, some minor comments inline.
> > > >
> > >
> > > Yes, ext4 should pass on my latest kernel tree, but everything else
> > > should fail.

Oh, and I should mention that ext2/3 also pass when mounted using ext4
driver. Legacy ext2 driver sort of works, but it reports a few too many
errors because of the way the ext2_error macro works. That shouldn't be
too hard to fix, I just need some guidance on that one.

I had xfs and btrfs working with an earlier iteration of the patches,
but now that we're converting a fs at a time, it's a little more work to
get there. It shouldn't be too hard to do though. I'll probably re-post
in a few days, and will try to take a stab at XFS and btrfs conversion
too.

> >
> > With the new _require rule, test should _notrun on XFS and btrfs then.
>
> Frankly I personally prefer that upstream XFS fails until someone fixes it. :)
> (But that's just my opinion.)
>
> That said, I'm not 100% sure what's required of XFS to play nicely with
> this new mechanism -- glancing at the ext* patches it looks like we'd
> need to set a fs flag and possibly change some or all of the "write
> cached dirty buffers out to disk" calls to their _since variants?

Yeah, that's pretty much the size of it.

In fact, the latter part (changing to the _since variants) is somewhat
optional. We can have the errseq_t based tracking coexist with the
AS_EIO/AS_ENOSPC flags. It's weird but I don't see a real downside to
preserving them until we've got more of this converted over.

In the latest branch I'm working on, I'm breaking up those changes into
different patches so it should be a little clearer for other fs
maintainers to see what I'm doing and why. Stay tuned...

> Metadata writeback errors are handled by retrying writes and/or shutting
> down the fs, so I think the f_md_wb_error case is already covered.

Thanks. I think we do need f_md_wb_err for ext2/4 though, IIUC?

>
> That said, I agree that it's useful to detect that the kernel simply
> lacks any of the new wb error reporting at all, so therefore we can skip
> the tests.
>

Suggestions on ways to implement such a check would be welcome. Maybe a
file in /sys or in debugfs?

--
Jeff Layton <[email protected]>

2017-06-06 22:07:29

by Darrick J. Wong

[permalink] [raw]
Subject: Re: [xfstests PATCH v3 1/5] generic: add a writeback error handling test

On Tue, Jun 06, 2017 at 04:12:58PM -0400, Jeff Layton wrote:
> On Tue, 2017-06-06 at 10:17 -0700, Darrick J. Wong wrote:
> > On Tue, Jun 06, 2017 at 08:23:25PM +0800, Eryu Guan wrote:
> > > On Tue, Jun 06, 2017 at 06:15:57AM -0400, Jeff Layton wrote:
> > > > On Tue, 2017-06-06 at 16:58 +0800, Eryu Guan wrote:
> > > > > On Wed, May 31, 2017 at 09:08:16AM -0400, Jeff Layton wrote:
> > > > > > I'm working on a set of kernel patches to change how writeback errors
> > > > > > are handled and reported in the kernel. Instead of reporting a
> > > > > > writeback error to only the first fsync caller on the file, I aim
> > > > > > to make the kernel report them once on every file description.
> > > > > >
> > > > > > This patch adds a test for the new behavior. Basically, open many fds
> > > > > > to the same file, turn on dm_error, write to each of the fds, and then
> > > > > > fsync them all to ensure that they all get an error back.
> > > > > >
> > > > > > To do that, I'm adding a new tools/dmerror script that the C program
> > > > > > can use to load the error table. For now, that's all it can do, but
> > > > > > we can fill it out with other commands as necessary.
> > > > > >
> > > > > > Signed-off-by: Jeff Layton <[email protected]>
> > > > >
> > > > > Thanks for the new tests! And sorry for the late review..
> > > > >
> > > > > It's testing a new behavior on error reporting on writeback, I'm not
> > > > > sure if we can call it a new feature or it fixed a bug? But it's more
> > > > > like a behavior change, I'm not sure how to categorize it.
> > > > >
> > > > > Because if it's testing a new feature, we usually let test do proper
> > > > > detection of current test environment (based on actual behavior not
> > > > > kernel version) and _notrun on filesystems that don't have this feature
> > > > > yet, instead of failing the test; if it's testing a bug fix, we could
> > > > > leave the test fail on unfixed filesystems, this also serves as a
> > > > > reminder that there's bug to fix.
> > > > >
> > > >
> > > > Thanks for the review! I'm not sure how to categorize this either. Since
> > > > the plan is to convert all the filesystems piecemeal, maybe we should
> > > > just consider it a new feature.
> > >
> > > Then we need a new _require rule to properly detect for the 'feature'
> > > support. I'm not sure if this is doable, but something like
> > > _require_statx, _require_seek_data_hole would be good.
> > >
> > > >
> > > > > I pulled your test kernel tree, and test passed on EXT4 but failed on
> > > > > other local filesystems (XFS, btrfs). I assume that's expected.
> > > > >
> > > > > Besides this kinda high-level question, some minor comments inline.
> > > > >
> > > >
> > > > Yes, ext4 should pass on my latest kernel tree, but everything else
> > > > should fail.
>
> Oh, and I should mention that ext2/3 also pass when mounted using ext4
> driver. Legacy ext2 driver sort of works, but it reports a few too many
> errors because of the way the ext2_error macro works. That shouldn't be
> too hard to fix, I just need some guidance on that one.
>
> I had xfs and btrfs working with an earlier iteration of the patches,
> but now that we're converting a fs at a time, it's a little more work to
> get there. It shouldn't be too hard to do though. I'll probably re-post
> in a few days, and will try to take a stab at XFS and btrfs conversion
> too.
>
> > >
> > > With the new _require rule, test should _notrun on XFS and btrfs then.
> >
> > Frankly I personally prefer that upstream XFS fails until someone fixes it. :)
> > (But that's just my opinion.)
> >
> > That said, I'm not 100% sure what's required of XFS to play nicely with
> > this new mechanism -- glancing at the ext* patches it looks like we'd
> > need to set a fs flag and possibly change some or all of the "write
> > cached dirty buffers out to disk" calls to their _since variants?
>
> Yeah, that's pretty much the size of it.
>
> In fact, the latter part (changing to the _since variants) is somewhat
> optional. We can have the errseq_t based tracking coexist with the
> AS_EIO/AS_ENOSPC flags. It's weird but I don't see a real downside to
> preserving them until we've got more of this converted over.
>
> In the latest branch I'm working on, I'm breaking up those changes into
> different patches so it should be a little clearer for other fs
> maintainers to see what I'm doing and why. Stay tuned...

Ok.

> > Metadata writeback errors are handled by retrying writes and/or shutting
> > down the fs, so I think the f_md_wb_error case is already covered.
>
> Thanks. I think we do need f_md_wb_err for ext2/4 though, IIUC?

Yes. Sorry, the previous statement applies only to XFS.

> > That said, I agree that it's useful to detect that the kernel simply
> > lacks any of the new wb error reporting at all, so therefore we can skip
> > the tests.
> >
>
> Suggestions on ways to implement such a check would be welcome. Maybe a
> file in /sys or in debugfs?

Assuming that this patchset applies the same wb error reporting behavior
to block devices, you could open a bunch of file descriptors to a linear
dm target sitting atop $SCRATCH_DEV, switch out the table to dm_error,
then write something, fsync, and see if we get more than one EIO. Then
you'd know if the kernel supports it, at least... I think?

--D

>
> --
> Jeff Layton <[email protected]>

2017-06-08 12:48:07

by Jeff Layton

[permalink] [raw]
Subject: Re: [xfstests PATCH v3 5/5] btrfs: allow it to use $SCRATCH_LOGDEV

On Tue, 2017-06-06 at 17:19 +0800, Eryu Guan wrote:
> On Wed, May 31, 2017 at 09:08:20AM -0400, Jeff Layton wrote:
> > With btrfs, we can't really put the log on a separate device. What we
> > can do however is mirror the metadata across two devices and make the
> > data striped across all devices. When we turn on dmerror then the
> > metadata can fall back to using the other mirror while the data errors
> > out.
> >
> > Note that the current incarnation of btrfs has a fixed 64k stripe
> > width. If that ever changes or becomes settable, we may need to adjust
> > the amount of data that the test program writes.
> >
> > Signed-off-by: Jeff Layton <[email protected]>
> > ---
> > common/rc | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/common/rc b/common/rc
> > index 83765aacfb06..078270451b53 100644
> > --- a/common/rc
> > +++ b/common/rc
> > @@ -830,6 +830,8 @@ _scratch_mkfs()
> > ;;
> > btrfs)
> > mkfs_cmd="$MKFS_BTRFS_PROG"
> > + [ "$USE_EXTERNAL" = yes -a ! -z "$SCRATCH_LOGDEV" ] && \
> > + mkfs_cmd="$mkfs_cmd -d raid0 -m raid1 $SCRATCH_LOGDEV"
>
> I don't think this is the correct way to do it. If btrfs doesn't support
> external log device, then this test doesn't fit btrfs, or we need other
> ways to test btrfs.
>
> One of the problems of this hack is that raid1 requires all devices are
> in the same size, we have a _require_scratch_dev_pool_equal_size() rule
> to check on it, but this hack doesn't do the proper check and test fails
> if SCRATCH_LOGDEV is smaller or bigger in size.
>
> If btrfs "-d raid0 -m raid1" is capable to do this writeback error test,
> perhaps you can write a new btrfs test and mkfs with "-d raid0 -m raid1"
> explicitly. e.g.
>
> ...
> _require_scratch_dev_pool 2
> _require_scratch_dev_pool_equal_size
> ...
> _scratch_mkfs "-d raid0 -m raid1"
> ...
>
> Thanks,
> Eryu


Yeah, that's probably the right way to do this. It looks like btrfs also
has $SCRATCH_DEV_POOL, and we can probably base it on that. I'll look at
reworking it.

--
Jeff Layton <[email protected]>