2017-03-14 02:56:46

by Murphy Zhou

Subject: fsx tests on DAX started to fail with msync failure on 0307 -next tree

Hi,

xfstests cases:
generic/075 generic/112 generic/127 generic/231 generic/263

fail with DAX, pass without it. Both xfs and ext4.

It was okay on 0306 -next tree.

+ ./check generic/075
FSTYP -- xfs (non-debug)
PLATFORM -- Linux/x86_64 hp-dl360g9-12 4.11.0-rc1-linux-next-5be4921-next-20170310
MKFS_OPTIONS -- -f -bsize=4096 /dev/pmem0p2
MOUNT_OPTIONS -- -o dax -o context=system_u:object_r:nfs_t:s0 /dev/pmem0p2 /daxsch

generic/075 4s ... [failed, exit status 1] - output mismatch (see /root/xfstests/results//generic/075.out.bad)
--- tests/generic/075.out 2016-12-13 14:38:25.984557426 +0800
+++ /root/xfstests/results//generic/075.out.bad 2017-03-14 10:40:23.083052839 +0800
@@ -4,15 +4,4 @@
-----------------------------------------------
fsx.0 : -d -N numops -S 0
-----------------------------------------------
-
------------------------------------------------
-fsx.1 : -d -N numops -S 0 -x
------------------------------------------------
...
(Run 'diff -u tests/generic/075.out /root/xfstests/results//generic/075.out.bad' to see the entire diff)
..

$ diff -u xfstests/tests/generic/075.out /root/xfstests/results//generic/075.out.bad
--- xfstests/tests/generic/075.out 2016-12-13 14:38:25.984557426 +0800
+++ /root/xfstests/results//generic/075.out.bad 2017-03-14 10:40:23.083052839 +0800
@@ -4,15 +4,4 @@
-----------------------------------------------
fsx.0 : -d -N numops -S 0
-----------------------------------------------
-
------------------------------------------------
-fsx.1 : -d -N numops -S 0 -x
------------------------------------------------
-
------------------------------------------------
-fsx.2 : -d -N numops -l filelen -S 0
------------------------------------------------
-
------------------------------------------------
-fsx.3 : -d -N numops -l filelen -S 0 -x
------------------------------------------------
+ fsx (-d -N 1000 -S 0) failed, 0 - compare /root/xfstests/results//generic/075.0.{good,bad,fsxlog}

$ diff -u /root/xfstests/results//generic/075.0.{good,fsxlog} | tail -20
-03cb30 f903 da03 1103 7503 5403 8903 9f03 6b03
-03cb40 bb03 fb03 5603 7e03 c503 ca03 0103 9603
-03cb50 7f03 7c03 0c03 5103 ed03 dc03 a403 5c03
-03cb60 5403 b903 4403 3c03 4b03 a903 2303 1a03
-03cb70 2b03 5f03 fd03 ee03 1303 9703 2903 d303
-03cb80 4e03 9903 f903 8003 b803 2503 2203 c903
-03cb90 6803 7a03 0f03 6303 de03 ba03 6e03 6503
-03cba0 db03
-03cba2
+skipping zero size read
+skipping insert range behind EOF
+3 mapwrite 0x2e836 thru 0x3cba1 (0xe36c bytes)
+domapwrite: msync: Invalid argument
+LOG DUMP (3 total operations):
+1( 1 mod 256): SKIPPED (no operation)
+2( 2 mod 256): SKIPPED (no operation)
+3( 3 mod 256): MAPWRITE 0x2e836 thru 0x3cba1 (0xe36c bytes)
+Log of operations saved to "075.0.fsxops"; replay with --replay-ops
+Correct content saved for comparison
+(maybe hexdump "075.0" vs "075.0.fsxgood")

https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git/tree/tests/generic/075
https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git/tree/ltp/fsx.c
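
For readers who want to try the failing operation outside of fsx, here is a minimal userspace sketch of a "mapwrite" followed by msync(). It is not fsx's actual code: the helper name mapwrite_and_sync(), the fill byte, and the use of MS_SYNC are assumptions made for illustration. The offset and length in the usage note are the ones from the MAPWRITE line in the log above.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from fsx.c): write `len` bytes at `offset`
 * through a shared file mapping, then msync() the mapping.
 * Returns 0 on success or a negative errno value on failure.
 * Note: this sets the file size to offset + len.
 */
static int mapwrite_and_sync(const char *path, off_t offset, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    /* mmap() file offsets must be page aligned, so round down. */
    off_t aligned = offset - (offset % page);
    size_t map_len = len + (size_t)(offset - aligned);
    char *p;
    int ret = 0;
    int fd;

    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -errno;

    /* Make sure the whole written range is backed by the file. */
    if (ftruncate(fd, offset + (off_t)len) < 0) {
        ret = -errno;
        goto out_close;
    }

    p = mmap(NULL, map_len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, aligned);
    if (p == MAP_FAILED) {
        ret = -errno;
        goto out_close;
    }

    /* The "mapwrite": store data through the mapping. */
    memset(p + (offset - aligned), 0x5a, len);

    /* Flush the mapping; this is the step that failed in the log. */
    if (msync(p, map_len, MS_SYNC) != 0) {
        ret = -errno;
        perror("msync");
    }

    munmap(p, map_len);
out_close:
    close(fd);
    return ret;
}
```

Calling mapwrite_and_sync("/daxsch/testfile", 0x2e836, 0xe36c) on the DAX mount would exercise the same range as operation 3 in the log; the file name is only an example. On a filesystem without the regression the call is expected to return 0.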


2017-03-14 17:36:15

by Ross Zwisler

Subject: Re: fsx tests on DAX started to fail with msync failure on 0307 -next tree

On Tue, Mar 14, 2017 at 10:56:42AM +0800, Xiong Zhou wrote:
> Hi,
>
> xfstests cases:
> generic/075 generic/112 generic/127 generic/231 generic/263
>
> fail with DAX, pass without it. Both xfs and ext4.
>
> It was okay on 0306 -next tree.

Thanks for the report. I'm looking into it. -next is all kinds of broken.
:(

2017-03-14 21:54:07

by Ross Zwisler

Subject: [PATCH] dax: fix regression in dax_writeback_mapping_range()

commit 354ae7432ee8 ("dax: add tracepoints to dax_writeback_mapping_range()")
in the -next tree, which appears in next-20170310, inadvertently changed
dax_writeback_mapping_range() so that it could end up returning a positive
value: the number of bytes flushed, as returned by dax_writeback_one().
This was incorrect. This function needs to return either a negative error
value or zero on success.

This change was causing xfstest failures, as reported by Xiong:

https://lkml.org/lkml/2017/3/13/1220

With this fix applied to next-20170310, all the test failures reported by
Xiong (generic/075 generic/112 generic/127 generic/231 generic/263) are
resolved.

Reported-by: Xiong Zhou <[email protected]>
Signed-off-by: Ross Zwisler <[email protected]>
---
fs/dax.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/dax.c b/fs/dax.c
index 1861ef0..60688c7 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -907,7 +907,7 @@ int dax_writeback_mapping_range(struct address_space *mapping,
 	}
 out:
 	trace_dax_writeback_range_done(inode, start_index, end_index);
-	return ret;
+	return (ret < 0 ? ret : 0);
 }
 EXPORT_SYMBOL_GPL(dax_writeback_mapping_range);

--
2.9.3
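
The return-value convention that the patch above restores can be illustrated with a small userspace sketch. This is not the kernel code: flush_one() and writeback_range() are hypothetical stand-ins for dax_writeback_one() and dax_writeback_mapping_range(), and the 4096-byte count is an arbitrary assumption. The point is only that a positive "bytes flushed" value from the inner helper must not leak out of the outer function.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the per-entry flush helper: returns a positive
 * byte count on success, or a negative errno value on failure.
 * A negative `entry` simulates an I/O error.
 */
static int flush_one(int entry)
{
    if (entry < 0)
        return -EIO;
    return 4096; /* assumed number of bytes flushed */
}

/*
 * Hypothetical stand-in for the range writeback function: callers expect
 * zero on success or a negative errno value, never a positive count.
 */
static int writeback_range(const int *entries, size_t n)
{
    int ret = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        ret = flush_one(entries[i]);
        if (ret < 0)
            break; /* stop at the first error and report it */
    }

    /* Same clamp as in the patch: drop any positive byte count. */
    return (ret < 0 ? ret : 0);
}
```

Without the final clamp, a successful run over a non-empty range would return 4096 in this sketch, and a caller that treats any nonzero value as failure would report an error even though nothing went wrong.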

2017-03-16 15:53:40

by Ross Zwisler

Subject: Re: fsx tests on DAX started to fail with msync failure on 0307 -next tree

On Tue, Mar 14, 2017 at 11:36:10AM -0600, Ross Zwisler wrote:
> On Tue, Mar 14, 2017 at 10:56:42AM +0800, Xiong Zhou wrote:
> > Hi,
> >
> > xfstests cases:
> > generic/075 generic/112 generic/127 generic/231 generic/263
> >
> > fail with DAX, pass without it. Both xfs and ext4.
> >
> > It was okay on 0306 -next tree.
>
> Thanks for the report. I'm looking into it. -next is all kinds of broken.
> :(

Just FYI, in case folks are still testing -next:

One other issue that I was hitting was that, for many of the commits in -next,
kernel modules wouldn't load. That meant my /dev/pmem0 device wasn't
showing up, because I have libnvdimm compiled as a module.

I bisected that issue to this commit:

commit d1091c7fa3d5 ("objtool: Improve detection of BUG() and other dead
ends")

It looks like Xiong also found this issue:

https://lkml.org/lkml/2017/3/2/114

And Linus found it:

https://lkml.org/lkml/2017/2/28/794

- Ross