Greetings,
FYI, we noticed the following commit (built with gcc-9):
commit: a0b99df1aa37d714eb80be5fb54efd56c88a3336 ("mm/readahead: Add large folio readahead")
git://git.infradead.org/users/willy/pagecache for-next
in testcase: xfstests
version: xfstests-x86_64-1de1db8-1_20220217
with following parameters:
disk: 4HDD
fs: xfs
test: xfs-reflink-21
ucode: 0x28
test-description: xfstests is a regression test suite for xfs and other filesystems.
test-url: git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git
on test machine: 8 threads, 1 socket, Intel(R) Core(TM) i7-4790 v3 @ 3.60GHz with 6G memory
caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):
If you fix the issue, kindly add the following tag
Reported-by: kernel test robot <[email protected]>
2022-02-20 20:34:16 export TEST_DIR=/fs/sdb1
2022-02-20 20:34:16 export TEST_DEV=/dev/sdb1
2022-02-20 20:34:16 export FSTYP=xfs
2022-02-20 20:34:16 export SCRATCH_MNT=/fs/scratch
2022-02-20 20:34:16 mkdir /fs/scratch -p
2022-02-20 20:34:16 export SCRATCH_DEV=/dev/sdb4
2022-02-20 20:34:16 export SCRATCH_LOGDEV=/dev/sdb2
2022-02-20 20:34:16 export SCRATCH_XFS_LIST_METADATA_FIELDS=u3.sfdir3.hdr.parent.i4
2022-02-20 20:34:16 export SCRATCH_XFS_LIST_FUZZ_VERBS=random
2022-02-20 20:34:16 export MKFS_OPTIONS=-mreflink=1
2022-02-20 20:34:16 sed "s:^:xfs/:" //lkp/benchmarks/xfstests/tests/xfs-reflink-21
2022-02-20 20:34:16 ./check xfs/420 xfs/421 xfs/435
FSTYP -- xfs (debug)
PLATFORM -- Linux/x86_64 lkp-hsw-d02 5.17.0-rc4-00174-ga0b99df1aa37 #1 SMP Mon Feb 21 03:58:09 CST 2022
MKFS_OPTIONS -- -f -mreflink=1 /dev/sdb4
MOUNT_OPTIONS -- /dev/sdb4 /fs/scratch
xfs/420 - output mismatch (see /lkp/benchmarks/xfstests/results//xfs/420.out.bad)
--- tests/xfs/420.out 2022-02-17 11:55:00.000000000 +0000
+++ /lkp/benchmarks/xfstests/results//xfs/420.out.bad 2022-02-20 20:34:22.430378506 +0000
@@ -13,9 +13,7 @@
Seek holes and data in file2
Whence Result
DATA 0
-HOLE 131072
-DATA 196608
-HOLE 262144
+HOLE 524288
...
(Run 'diff -u /lkp/benchmarks/xfstests/tests/xfs/420.out /lkp/benchmarks/xfstests/results//xfs/420.out.bad' to see the entire diff)
xfs/421 - output mismatch (see /lkp/benchmarks/xfstests/results//xfs/421.out.bad)
--- tests/xfs/421.out 2022-02-17 11:55:00.000000000 +0000
+++ /lkp/benchmarks/xfstests/results//xfs/421.out.bad 2022-02-20 20:34:26.538378346 +0000
@@ -13,9 +13,7 @@
Seek holes and data in file2
Whence Result
DATA 0
-HOLE 131072
-DATA 196608
-HOLE 262144
+HOLE 524288
...
(Run 'diff -u /lkp/benchmarks/xfstests/tests/xfs/421.out /lkp/benchmarks/xfstests/results//xfs/421.out.bad' to see the entire diff)
xfs/435 4s
Ran: xfs/420 xfs/421 xfs/435
Failures: xfs/420 xfs/421
Failed 2 of 3 tests
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# if you come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.
---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation
Thanks,
Oliver Sang
On Mon, Feb 21, 2022 at 09:10:20PM +0000, Matthew Wilcox wrote:
> On Tue, Feb 22, 2022 at 07:55:29AM +1100, Dave Chinner wrote:
> > On Mon, Feb 21, 2022 at 01:56:55PM +0000, Matthew Wilcox wrote:
> > > On Mon, Feb 21, 2022 at 04:02:18PM +0800, kernel test robot wrote:
> > > > commit: a0b99df1aa37d714eb80be5fb54efd56c88a3336 ("mm/readahead: Add large folio readahead")
> > >
> > > > xfs/420 - output mismatch (see /lkp/benchmarks/xfstests/results//xfs/420.out.bad)
> > > > --- tests/xfs/420.out 2022-02-17 11:55:00.000000000 +0000
> > > > +++ /lkp/benchmarks/xfstests/results//xfs/420.out.bad 2022-02-20 20:34:22.430378506 +0000
> > > > @@ -13,9 +13,7 @@
> > > > Seek holes and data in file2
> > > > Whence Result
> > > > DATA 0
> > > > -HOLE 131072
> > > > -DATA 196608
> > > > -HOLE 262144
> > > > +HOLE 524288
> > >
> > > Confirm this test now fails. I don't think it's actually a bug,
> > > though. I think the test is now using larger pages to cache the
> > > file, and it fails to report that there's a hole in the file.
> > > Maybe there actually isn't a hole in the file any more; using
> > > larger pages to cache the file means we'll now write more data
> > > than we used to.
> > >
> > > Adding XFS people for their thoughts.
> > >
> > > Complete output:
> > >
> > > $ diff -u ../ktest/tests/xfstests/tests/xfs/420.out ktest-out/xfstests/xfs/420.out.bad
> > > --- ../ktest/tests/xfstests/tests/xfs/420.out 2021-07-05 15:49:45.539887305 -0400
> > > +++ ktest-out/xfstests/xfs/420.out.bad 2022-02-21 08:14:40.000000000 -0500
> > > @@ -13,9 +13,7 @@
> > > Seek holes and data in file2
> > > Whence Result
> > > DATA 0
> > > -HOLE 131072
> > > -DATA 196608
> > > -HOLE 262144
> > > +HOLE 524288
> > > Compare files
> > > c2803804acc9936eef8aab42c119bfac SCRATCH_MNT/test-420/file1
> > > 017c08a9320aad844ce86aa9631afb98 SCRATCH_MNT/test-420/file2
> > > @@ -28,9 +26,7 @@
> > > Seek holes and data in file2
> > > Whence Result
> > > DATA 0
> > > -HOLE 131072
> > > -DATA 196608
> > > -HOLE 262144
> > > +HOLE 524288
> > > Compare files
> > > c2803804acc9936eef8aab42c119bfac SCRATCH_MNT/test-420/file1
> > > 017c08a9320aad844ce86aa9631afb98 SCRATCH_MNT/test-420/file2
> > >
> > > So the file checksums are right, which means I didn't break the COW
> > > functionality. But we're no longer reporting a hole at 128k.
> >
> > Can you post the contents of the 420.full output file so we can see
> > what the output of the various commands that are run are? e.g.
> > things like cowextsize that is configured, etc?
>
> Sure! It's short, so I've included it inline.
Ok, I'll cut this up so it makes sense...
>
>
> meta-data=/dev/sdc isize=512 agcount=4, agsize=3670016 blks
> = sectsz=512 attr=2, projid32bit=1
> = crc=1 finobt=1, sparse=1, rmapbt=1
> = reflink=1 bigtime=0
> data = bsize=1024 blocks=14680064, imaxpct=25
> = sunit=0 swidth=0 blks
> naming =version 2 bsize=4096 ascii-ci=0, ftype=1
> log =internal log bsize=1024 blocks=10240, version=2
> = sectsz=512 sunit=0 blks, lazy-count=1
> realtime =none extsz=4096 blocks=0, rtextents=0
> Discarding blocks...Done.
> [0] /mnt/scratch/test-420
> [524288] /mnt/scratch/test-420
> wrote 131072/131072 bytes at offset 0
> 128 KiB, 128 ops; 0.0000 sec (1.327 GiB/sec and 1391304.3478 ops/sec)
> wrote 524288/524288 bytes at offset 0
> 512 KiB, 512 ops; 0.0003 sec (1.341 GiB/sec and 1406593.4066 ops/sec)
> wrote 131072/131072 bytes at offset 0
> 128 KiB, 128 ops; 0.0000 sec (1.822 GiB/sec and 1910447.7612 ops/sec)
> CoW the shared part then write into the empty part
> [524288] /mnt/scratch/test-420/file1
> [524288] /mnt/scratch/test-420/file2
> wrote 65536/65536 bytes at offset 0
> 64 KiB, 64 ops; 0.0001 sec (416.667 MiB/sec and 426666.6667 ops/sec)
> wrote 65536/65536 bytes at offset 196608
> 64 KiB, 64 ops; 0.0000 sec (1.695 GiB/sec and 1777777.7778 ops/sec)
> wrote 65536/65536 bytes at offset 0
> 64 KiB, 64 ops; 0.0000 sec (1.387 GiB/sec and 1454545.4545 ops/sec)
> wrote 65536/65536 bytes at offset 196608
> 64 KiB, 64 ops; 0.0000 sec (1.526 GiB/sec and 1600000.0000 ops/sec)
> xfs_io: xfsctl(XFS_IOC_GETBMAPX) iflags=0x28 ["/mnt/scratch/test-420/file1"]: Invalid argument
> /mnt/scratch/test-420/file1:
> EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
> 0: [0..255]: 128..383 0 (128..383) 256 100000
> 1: [256..1023]: hole 768
> xfs_io: xfsctl(XFS_IOC_GETBMAPX) iflags=0x28 ["/mnt/scratch/test-420/file2"]: Invalid argument
> /mnt/scratch/test-420/file2:
> EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
> 0: [0..255]: 128..383 0 (128..383) 256 100000
> 1: [256..1023]: hole 768
So this is the extent list for file2 after the reflink. Note the
hole at 128-512kB. The flags tell us the 128kB data extent is
shared. There are no unwritten extents at all.
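For anyone who wants to double-check that independently of xfs_io, a
FIEMAP walk reports the unwritten/shared state of each extent directly.
A minimal, untested sketch (the file name fiemap-check.c and the
32-extent cap are arbitrary; FIEMAP itself is the standard ioctl):

/* fiemap-check.c: print each extent of a file and whether it is
 * flagged UNWRITTEN or SHARED.  Build with: cc -o fiemap-check fiemap-check.c
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fiemap.h>
#include <linux/fs.h>

int main(int argc, char **argv)
{
	if (argc < 2) {
		fprintf(stderr, "usage: %s <file>\n", argv[0]);
		return 1;
	}
	int fd = open(argv[1], O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	/* room for 32 extents is plenty for these small test files */
	size_t sz = sizeof(struct fiemap) + 32 * sizeof(struct fiemap_extent);
	struct fiemap *fm = calloc(1, sz);
	fm->fm_length = FIEMAP_MAX_OFFSET;
	fm->fm_extent_count = 32;

	if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
		perror("FS_IOC_FIEMAP");
		return 1;
	}

	for (unsigned int i = 0; i < fm->fm_mapped_extents; i++) {
		struct fiemap_extent *fe = &fm->fm_extents[i];
		printf("%llu..%llu (%llu bytes)%s%s\n",
		       (unsigned long long)fe->fe_logical,
		       (unsigned long long)(fe->fe_logical + fe->fe_length - 1),
		       (unsigned long long)fe->fe_length,
		       (fe->fe_flags & FIEMAP_EXTENT_UNWRITTEN) ? " UNWRITTEN" : "",
		       (fe->fe_flags & FIEMAP_EXTENT_SHARED) ? " SHARED" : "");
	}
	free(fm);
	return 0;
}

The SHARED/UNWRITTEN bits are the same information the bmap FLAGS
column above is encoding.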
> > > Seek holes and data in file2
> > > Whence Result
> > > DATA 0
> > > -HOLE 131072
> > > -DATA 196608
> > > -HOLE 262144
> > > +HOLE 524288
This new output indicates the file is completely full of data at this
point. Based on the extent list, and looking at the current
iomap_seek_hole/data() implementation, this looks wrong.
Technically speaking, however, it is valid, because holes are allowed
to be reported as data but not vice versa. That said, in this case
there are no unwritten extents, so iomap_seek_hole/data should be
reporting the exact on-disk extent mapping and not even be looking
at the page cache contents. The page cache is only consulted for
unwritten extents, and there are none of them, so the page cache
footprint should not change this result.
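To make those SEEK_HOLE/SEEK_DATA semantics concrete, here is a minimal
userspace probe that walks a file the same way the test's seek loop
does. An untested sketch (the file name seek-probe.c is arbitrary; the
path comes from the command line):

/* seek-probe.c: alternate SEEK_DATA/SEEK_HOLE over a file and print
 * the Whence/Result pairs, like the test's golden output.
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	if (argc < 2) {
		fprintf(stderr, "usage: %s <file>\n", argv[0]);
		return 1;
	}
	int fd = open(argv[1], O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	off_t pos = 0;
	for (;;) {
		off_t data = lseek(fd, pos, SEEK_DATA);
		if (data < 0)
			break;		/* ENXIO: no more data before EOF */
		off_t hole = lseek(fd, data, SEEK_HOLE);
		printf("DATA\t%lld\nHOLE\t%lld\n",
		       (long long)data, (long long)hole);
		pos = hole;		/* hole always advances past data */
	}
	return 0;
}

On the on-disk layout above this should produce the golden
DATA 0 / HOLE 131072 / DATA 196608 / HOLE 262144 sequence; on the
failing kernel it collapses to DATA 0 / HOLE 524288.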
> xfs_io: xfsctl(XFS_IOC_GETBMAPX) iflags=0x28 ["/mnt/scratch/test-420/file3"]: Invalid argument
> /mnt/scratch/test-420/file3:
> EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
> 0: [0..1023]: 512..1535 0 (512..1535) 1024
> sync filesystem
> xfs_io: xfsctl(XFS_IOC_GETBMAPX) iflags=0x28 ["/mnt/scratch/test-420/file1"]: Invalid argument
> /mnt/scratch/test-420/file1:
> EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
> 0: [0..127]: 128..255 0 (128..255) 128
> 1: [128..255]: 256..383 0 (256..383) 128 100000
> 2: [256..1023]: hole 768
> xfs_io: xfsctl(XFS_IOC_GETBMAPX) iflags=0x28 ["/mnt/scratch/test-420/file2"]: Invalid argument
> /mnt/scratch/test-420/file2:
> EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
> 0: [0..127]: 1536..1663 0 (1536..1663) 128
> 1: [128..255]: 256..383 0 (256..383) 128 100000
> 2: [256..383]: hole 128
> 3: [384..511]: 1920..2047 0 (1920..2047) 128
> 4: [512..1023]: hole 512
And this is the extent list after we've overwritten the first 64kB and
written another 64kB at 192kB. Note that the 64kB hole at 128kB is
still there.
> > > @@ -13,9 +13,7 @@
> > > Seek holes and data in file2
> > > Whence Result
> > > DATA 0
> > > -HOLE 131072
> > > -DATA 196608
> > > -HOLE 262144
> > > +HOLE 524288
And this still looks wrong. Again, no unwritten extents, so we
should just be reporting the on-disk extent layout here.
> xfs_io: xfsctl(XFS_IOC_GETBMAPX) iflags=0x28 ["/mnt/scratch/test-420/file3"]: Invalid argument
> /mnt/scratch/test-420/file3:
> EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
> 0: [0..1023]: 512..1535 0 (512..1535) 1024
> Remount
> xfs_io: xfsctl(XFS_IOC_GETBMAPX) iflags=0x28 ["/mnt/scratch/test-420/file1"]: Invalid argument
> /mnt/scratch/test-420/file1:
> EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
> 0: [0..127]: 128..255 0 (128..255) 128
> 1: [128..255]: 256..383 0 (256..383) 128 100000
> 2: [256..1023]: hole 768
> xfs_io: xfsctl(XFS_IOC_GETBMAPX) iflags=0x28 ["/mnt/scratch/test-420/file2"]: Invalid argument
> /mnt/scratch/test-420/file2:
> EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
> 0: [0..127]: 1536..1663 0 (1536..1663) 128
> 1: [128..255]: 256..383 0 (256..383) 128 100000
> 2: [256..383]: hole 128
> 3: [384..511]: 1920..2047 0 (1920..2047) 128
> 4: [512..1023]: hole 512
This is after the unmount/mount cycle, where there is now no page
cache over the file. This reports the same layout as bmap does
(i.e. matches the golden output) and is obviously correct.
This smells of a bug in the large folio patchset, as the on-disk
extent layouts are correct and unchanged. The question now is: what
is the large folio patchset doing to iomap_seek_hole/data that makes
it behave differently to the current code?
Cheers,
Dave.
--
Dave Chinner
[email protected]
On Mon, Feb 21, 2022 at 01:56:55PM +0000, Matthew Wilcox wrote:
> On Mon, Feb 21, 2022 at 04:02:18PM +0800, kernel test robot wrote:
> > commit: a0b99df1aa37d714eb80be5fb54efd56c88a3336 ("mm/readahead: Add large folio readahead")
>
> > xfs/420 - output mismatch (see /lkp/benchmarks/xfstests/results//xfs/420.out.bad)
> > --- tests/xfs/420.out 2022-02-17 11:55:00.000000000 +0000
> > +++ /lkp/benchmarks/xfstests/results//xfs/420.out.bad 2022-02-20 20:34:22.430378506 +0000
> > @@ -13,9 +13,7 @@
> > Seek holes and data in file2
> > Whence Result
> > DATA 0
> > -HOLE 131072
> > -DATA 196608
> > -HOLE 262144
> > +HOLE 524288
>
> Confirm this test now fails. I don't think it's actually a bug,
> though. I think the test is now using larger pages to cache the
> file, and it fails to report that there's a hole in the file.
> Maybe there actually isn't a hole in the file any more; using
> larger pages to cache the file means we'll now write more data
> than we used to.
>
> Adding XFS people for their thoughts.
>
> Complete output:
>
> $ diff -u ../ktest/tests/xfstests/tests/xfs/420.out ktest-out/xfstests/xfs/420.out.bad
> --- ../ktest/tests/xfstests/tests/xfs/420.out 2021-07-05 15:49:45.539887305 -0400
> +++ ktest-out/xfstests/xfs/420.out.bad 2022-02-21 08:14:40.000000000 -0500
> @@ -13,9 +13,7 @@
> Seek holes and data in file2
> Whence Result
> DATA 0
> -HOLE 131072
> -DATA 196608
> -HOLE 262144
> +HOLE 524288
> Compare files
> c2803804acc9936eef8aab42c119bfac SCRATCH_MNT/test-420/file1
> 017c08a9320aad844ce86aa9631afb98 SCRATCH_MNT/test-420/file2
> @@ -28,9 +26,7 @@
> Seek holes and data in file2
> Whence Result
> DATA 0
> -HOLE 131072
> -DATA 196608
> -HOLE 262144
> +HOLE 524288
> Compare files
> c2803804acc9936eef8aab42c119bfac SCRATCH_MNT/test-420/file1
> 017c08a9320aad844ce86aa9631afb98 SCRATCH_MNT/test-420/file2
>
> So the file checksums are right, which means I didn't break the COW
> functionality. But we're no longer reporting a hole at 128k.
Can you post the contents of the 420.full output file so we can see
what the output of the various commands that are run are? e.g.
things like cowextsize that is configured, etc?
Cheers,
Dave.
--
Dave Chinner
[email protected]
On Mon, Feb 21, 2022 at 04:02:18PM +0800, kernel test robot wrote:
> commit: a0b99df1aa37d714eb80be5fb54efd56c88a3336 ("mm/readahead: Add large folio readahead")
> xfs/420 - output mismatch (see /lkp/benchmarks/xfstests/results//xfs/420.out.bad)
> --- tests/xfs/420.out 2022-02-17 11:55:00.000000000 +0000
> +++ /lkp/benchmarks/xfstests/results//xfs/420.out.bad 2022-02-20 20:34:22.430378506 +0000
> @@ -13,9 +13,7 @@
> Seek holes and data in file2
> Whence Result
> DATA 0
> -HOLE 131072
> -DATA 196608
> -HOLE 262144
> +HOLE 524288
Confirm this test now fails. I don't think it's actually a bug,
though. I think the test is now using larger pages to cache the
file, and it fails to report that there's a hole in the file.
Maybe there actually isn't a hole in the file any more; using
larger pages to cache the file means we'll now write more data
than we used to.
Adding XFS people for their thoughts.
Complete output:
$ diff -u ../ktest/tests/xfstests/tests/xfs/420.out ktest-out/xfstests/xfs/420.out.bad
--- ../ktest/tests/xfstests/tests/xfs/420.out 2021-07-05 15:49:45.539887305 -0400
+++ ktest-out/xfstests/xfs/420.out.bad 2022-02-21 08:14:40.000000000 -0500
@@ -13,9 +13,7 @@
Seek holes and data in file2
Whence Result
DATA 0
-HOLE 131072
-DATA 196608
-HOLE 262144
+HOLE 524288
Compare files
c2803804acc9936eef8aab42c119bfac SCRATCH_MNT/test-420/file1
017c08a9320aad844ce86aa9631afb98 SCRATCH_MNT/test-420/file2
@@ -28,9 +26,7 @@
Seek holes and data in file2
Whence Result
DATA 0
-HOLE 131072
-DATA 196608
-HOLE 262144
+HOLE 524288
Compare files
c2803804acc9936eef8aab42c119bfac SCRATCH_MNT/test-420/file1
017c08a9320aad844ce86aa9631afb98 SCRATCH_MNT/test-420/file2
So the file checksums are right, which means I didn't break the COW
functionality. But we're no longer reporting a hole at 128k.
On Tue, Feb 22, 2022 at 07:55:29AM +1100, Dave Chinner wrote:
> On Mon, Feb 21, 2022 at 01:56:55PM +0000, Matthew Wilcox wrote:
> > On Mon, Feb 21, 2022 at 04:02:18PM +0800, kernel test robot wrote:
> > > commit: a0b99df1aa37d714eb80be5fb54efd56c88a3336 ("mm/readahead: Add large folio readahead")
> >
> > > xfs/420 - output mismatch (see /lkp/benchmarks/xfstests/results//xfs/420.out.bad)
> > > --- tests/xfs/420.out 2022-02-17 11:55:00.000000000 +0000
> > > +++ /lkp/benchmarks/xfstests/results//xfs/420.out.bad 2022-02-20 20:34:22.430378506 +0000
> > > @@ -13,9 +13,7 @@
> > > Seek holes and data in file2
> > > Whence Result
> > > DATA 0
> > > -HOLE 131072
> > > -DATA 196608
> > > -HOLE 262144
> > > +HOLE 524288
> >
> > Confirm this test now fails. I don't think it's actually a bug,
> > though. I think the test is now using larger pages to cache the
> > file, and it fails to report that there's a hole in the file.
> > Maybe there actually isn't a hole in the file any more; using
> > larger pages to cache the file means we'll now write more data
> > than we used to.
> >
> > Adding XFS people for their thoughts.
> >
> > Complete output:
> >
> > $ diff -u ../ktest/tests/xfstests/tests/xfs/420.out ktest-out/xfstests/xfs/420.out.bad
> > --- ../ktest/tests/xfstests/tests/xfs/420.out 2021-07-05 15:49:45.539887305 -0400
> > +++ ktest-out/xfstests/xfs/420.out.bad 2022-02-21 08:14:40.000000000 -0500
> > @@ -13,9 +13,7 @@
> > Seek holes and data in file2
> > Whence Result
> > DATA 0
> > -HOLE 131072
> > -DATA 196608
> > -HOLE 262144
> > +HOLE 524288
> > Compare files
> > c2803804acc9936eef8aab42c119bfac SCRATCH_MNT/test-420/file1
> > 017c08a9320aad844ce86aa9631afb98 SCRATCH_MNT/test-420/file2
> > @@ -28,9 +26,7 @@
> > Seek holes and data in file2
> > Whence Result
> > DATA 0
> > -HOLE 131072
> > -DATA 196608
> > -HOLE 262144
> > +HOLE 524288
> > Compare files
> > c2803804acc9936eef8aab42c119bfac SCRATCH_MNT/test-420/file1
> > 017c08a9320aad844ce86aa9631afb98 SCRATCH_MNT/test-420/file2
> >
> > So the file checksums are right, which means I didn't break the COW
> > functionality. But we're no longer reporting a hole at 128k.
>
> Can you post the contents of the 420.full output file so we can see
> what the output of the various commands that are run are? e.g.
> things like cowextsize that is configured, etc?
Sure! It's short, so I've included it inline.
meta-data=/dev/sdc isize=512 agcount=4, agsize=3670016 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=1, sparse=1, rmapbt=1
= reflink=1 bigtime=0
data = bsize=1024 blocks=14680064, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0, ftype=1
log =internal log bsize=1024 blocks=10240, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
Discarding blocks...Done.
[0] /mnt/scratch/test-420
[524288] /mnt/scratch/test-420
wrote 131072/131072 bytes at offset 0
128 KiB, 128 ops; 0.0000 sec (1.327 GiB/sec and 1391304.3478 ops/sec)
wrote 524288/524288 bytes at offset 0
512 KiB, 512 ops; 0.0003 sec (1.341 GiB/sec and 1406593.4066 ops/sec)
wrote 131072/131072 bytes at offset 0
128 KiB, 128 ops; 0.0000 sec (1.822 GiB/sec and 1910447.7612 ops/sec)
CoW the shared part then write into the empty part
[524288] /mnt/scratch/test-420/file1
[524288] /mnt/scratch/test-420/file2
wrote 65536/65536 bytes at offset 0
64 KiB, 64 ops; 0.0001 sec (416.667 MiB/sec and 426666.6667 ops/sec)
wrote 65536/65536 bytes at offset 196608
64 KiB, 64 ops; 0.0000 sec (1.695 GiB/sec and 1777777.7778 ops/sec)
wrote 65536/65536 bytes at offset 0
64 KiB, 64 ops; 0.0000 sec (1.387 GiB/sec and 1454545.4545 ops/sec)
wrote 65536/65536 bytes at offset 196608
64 KiB, 64 ops; 0.0000 sec (1.526 GiB/sec and 1600000.0000 ops/sec)
xfs_io: xfsctl(XFS_IOC_GETBMAPX) iflags=0x28 ["/mnt/scratch/test-420/file1"]: Invalid argument
/mnt/scratch/test-420/file1:
EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
0: [0..255]: 128..383 0 (128..383) 256 100000
1: [256..1023]: hole 768
xfs_io: xfsctl(XFS_IOC_GETBMAPX) iflags=0x28 ["/mnt/scratch/test-420/file2"]: Invalid argument
/mnt/scratch/test-420/file2:
EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
0: [0..255]: 128..383 0 (128..383) 256 100000
1: [256..1023]: hole 768
xfs_io: xfsctl(XFS_IOC_GETBMAPX) iflags=0x28 ["/mnt/scratch/test-420/file3"]: Invalid argument
/mnt/scratch/test-420/file3:
EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
0: [0..1023]: 512..1535 0 (512..1535) 1024
sync filesystem
xfs_io: xfsctl(XFS_IOC_GETBMAPX) iflags=0x28 ["/mnt/scratch/test-420/file1"]: Invalid argument
/mnt/scratch/test-420/file1:
EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
0: [0..127]: 128..255 0 (128..255) 128
1: [128..255]: 256..383 0 (256..383) 128 100000
2: [256..1023]: hole 768
xfs_io: xfsctl(XFS_IOC_GETBMAPX) iflags=0x28 ["/mnt/scratch/test-420/file2"]: Invalid argument
/mnt/scratch/test-420/file2:
EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
0: [0..127]: 1536..1663 0 (1536..1663) 128
1: [128..255]: 256..383 0 (256..383) 128 100000
2: [256..383]: hole 128
3: [384..511]: 1920..2047 0 (1920..2047) 128
4: [512..1023]: hole 512
xfs_io: xfsctl(XFS_IOC_GETBMAPX) iflags=0x28 ["/mnt/scratch/test-420/file3"]: Invalid argument
/mnt/scratch/test-420/file3:
EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
0: [0..1023]: 512..1535 0 (512..1535) 1024
Remount
xfs_io: xfsctl(XFS_IOC_GETBMAPX) iflags=0x28 ["/mnt/scratch/test-420/file1"]: Invalid argument
/mnt/scratch/test-420/file1:
EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
0: [0..127]: 128..255 0 (128..255) 128
1: [128..255]: 256..383 0 (256..383) 128 100000
2: [256..1023]: hole 768
xfs_io: xfsctl(XFS_IOC_GETBMAPX) iflags=0x28 ["/mnt/scratch/test-420/file2"]: Invalid argument
/mnt/scratch/test-420/file2:
EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
0: [0..127]: 1536..1663 0 (1536..1663) 128
1: [128..255]: 256..383 0 (256..383) 128 100000
2: [256..383]: hole 128
3: [384..511]: 1920..2047 0 (1920..2047) 128
4: [512..1023]: hole 512
xfs_io: xfsctl(XFS_IOC_GETBMAPX) iflags=0x28 ["/mnt/scratch/test-420/file3"]: Invalid argument
/mnt/scratch/test-420/file3:
EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
0: [0..1023]: 512..1535 0 (512..1535) 1024
On Tue, Feb 22, 2022 at 08:43:55AM +1100, Dave Chinner wrote:
> On Mon, Feb 21, 2022 at 09:10:20PM +0000, Matthew Wilcox wrote:
> > On Tue, Feb 22, 2022 at 07:55:29AM +1100, Dave Chinner wrote:
> > > On Mon, Feb 21, 2022 at 01:56:55PM +0000, Matthew Wilcox wrote:
> > > > On Mon, Feb 21, 2022 at 04:02:18PM +0800, kernel test robot wrote:
> > > > > commit: a0b99df1aa37d714eb80be5fb54efd56c88a3336 ("mm/readahead: Add large folio readahead")
> > > >
> > > > > xfs/420 - output mismatch (see /lkp/benchmarks/xfstests/results//xfs/420.out.bad)
> > > > > --- tests/xfs/420.out 2022-02-17 11:55:00.000000000 +0000
> > > > > +++ /lkp/benchmarks/xfstests/results//xfs/420.out.bad 2022-02-20 20:34:22.430378506 +0000
> > > > > @@ -13,9 +13,7 @@
> > > > > Seek holes and data in file2
> > > > > Whence Result
> > > > > DATA 0
> > > > > -HOLE 131072
> > > > > -DATA 196608
> > > > > -HOLE 262144
> > > > > +HOLE 524288
> > > >
> > > > Confirm this test now fails. I don't think it's actually a bug,
> > > > though. I think the test is now using larger pages to cache the
> > > > file, and it fails to report that there's a hole in the file.
> > > > Maybe there actually isn't a hole in the file any more; using
> > > > larger pages to cache the file means we'll now write more data
> > > > than we used to.
> > > >
> > > > Adding XFS people for their thoughts.
> > > >
> > > > Complete output:
> > > >
> > > > $ diff -u ../ktest/tests/xfstests/tests/xfs/420.out ktest-out/xfstests/xfs/420.out.bad
> > > > --- ../ktest/tests/xfstests/tests/xfs/420.out 2021-07-05 15:49:45.539887305 -0400
> > > > +++ ktest-out/xfstests/xfs/420.out.bad 2022-02-21 08:14:40.000000000 -0500
> > > > @@ -13,9 +13,7 @@
> > > > Seek holes and data in file2
> > > > Whence Result
> > > > DATA 0
> > > > -HOLE 131072
> > > > -DATA 196608
> > > > -HOLE 262144
> > > > +HOLE 524288
> > > > Compare files
> > > > c2803804acc9936eef8aab42c119bfac SCRATCH_MNT/test-420/file1
> > > > 017c08a9320aad844ce86aa9631afb98 SCRATCH_MNT/test-420/file2
> > > > @@ -28,9 +26,7 @@
> > > > Seek holes and data in file2
> > > > Whence Result
> > > > DATA 0
> > > > -HOLE 131072
> > > > -DATA 196608
> > > > -HOLE 262144
> > > > +HOLE 524288
> > > > Compare files
> > > > c2803804acc9936eef8aab42c119bfac SCRATCH_MNT/test-420/file1
> > > > 017c08a9320aad844ce86aa9631afb98 SCRATCH_MNT/test-420/file2
> > > >
> > > > So the file checksums are right, which means I didn't break the COW
> > > > functionality. But we're no longer reporting a hole at 128k.
> > >
> > > Can you post the contents of the 420.full output file so we can see
> > > what the output of the various commands that are run are? e.g.
> > > things like cowextsize that is configured, etc?
> >
> > Sure! It's short, so I've included it inline.
>
> Ok, I'll cut this up so it makes sense...
>
> >
> >
> > meta-data=/dev/sdc isize=512 agcount=4, agsize=3670016 blks
> > = sectsz=512 attr=2, projid32bit=1
> > = crc=1 finobt=1, sparse=1, rmapbt=1
> > = reflink=1 bigtime=0
> > data = bsize=1024 blocks=14680064, imaxpct=25
> > = sunit=0 swidth=0 blks
> > naming =version 2 bsize=4096 ascii-ci=0, ftype=1
> > log =internal log bsize=1024 blocks=10240, version=2
> > = sectsz=512 sunit=0 blks, lazy-count=1
> > realtime =none extsz=4096 blocks=0, rtextents=0
> > Discarding blocks...Done.
> > [0] /mnt/scratch/test-420
> > [524288] /mnt/scratch/test-420
> > wrote 131072/131072 bytes at offset 0
> > 128 KiB, 128 ops; 0.0000 sec (1.327 GiB/sec and 1391304.3478 ops/sec)
> > wrote 524288/524288 bytes at offset 0
> > 512 KiB, 512 ops; 0.0003 sec (1.341 GiB/sec and 1406593.4066 ops/sec)
> > wrote 131072/131072 bytes at offset 0
> > 128 KiB, 128 ops; 0.0000 sec (1.822 GiB/sec and 1910447.7612 ops/sec)
> > CoW the shared part then write into the empty part
> > [524288] /mnt/scratch/test-420/file1
> > [524288] /mnt/scratch/test-420/file2
> > wrote 65536/65536 bytes at offset 0
> > 64 KiB, 64 ops; 0.0001 sec (416.667 MiB/sec and 426666.6667 ops/sec)
> > wrote 65536/65536 bytes at offset 196608
> > 64 KiB, 64 ops; 0.0000 sec (1.695 GiB/sec and 1777777.7778 ops/sec)
> > wrote 65536/65536 bytes at offset 0
> > 64 KiB, 64 ops; 0.0000 sec (1.387 GiB/sec and 1454545.4545 ops/sec)
> > wrote 65536/65536 bytes at offset 196608
> > 64 KiB, 64 ops; 0.0000 sec (1.526 GiB/sec and 1600000.0000 ops/sec)
> > xfs_io: xfsctl(XFS_IOC_GETBMAPX) iflags=0x28 ["/mnt/scratch/test-420/file1"]: Invalid argument
> > /mnt/scratch/test-420/file1:
> > EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
> > 0: [0..255]: 128..383 0 (128..383) 256 100000
> > 1: [256..1023]: hole 768
> > xfs_io: xfsctl(XFS_IOC_GETBMAPX) iflags=0x28 ["/mnt/scratch/test-420/file2"]: Invalid argument
> > /mnt/scratch/test-420/file2:
> > EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
> > 0: [0..255]: 128..383 0 (128..383) 256 100000
> > 1: [256..1023]: hole 768
>
> So this is the extent list for file2 after the reflink. Note the
> hole at 128-512kB. The flags tell us the 128kB data extent is
> shared. There are no unwritten extents at all.
My debugging indicates otherwise:
+++ b/fs/iomap/seek.c
@@ -15,6 +15,7 @@ static loff_t iomap_seek_hole_iter(const struct iomap_iter *iter,
{
loff_t length = iomap_length(iter);
+printk("%s %ld %d:%lld,%lld\n", __func__, iter->inode->i_ino, iter->iomap.type, iter->iomap.length);
switch (iter->iomap.type) {
case IOMAP_UNWRITTEN:
*hole_pos = mapping_seek_hole_data(iter->inode->i_mapping,
@@ -61,6 +62,7 @@ static loff_t iomap_seek_data_iter(const struct iomap_iter *iter,
{
loff_t length = iomap_length(iter);
+printk("%s %ld %d:%lld,%lld\n", __func__, iter->inode->i_ino, iter->iomap.type, iter->iomap.length);
switch (iter->iomap.type) {
case IOMAP_HOLE:
return length;
gives me output:
00016 iomap_seek_hole_iter 68 2:131072,-131387284454392
00016 iomap_seek_hole_iter 68 0:393216,-131387284454392
00016 iomap_seek_data_iter 68 2:131072,-131387284454392
00016 iomap_seek_hole_iter 68 2:131072,-131387284454392
00016 iomap_seek_hole_iter 68 0:393216,-131387284454392
00016 iomap_seek_data_iter 68 0:393216,-131387284454392
00016 iomap_seek_hole_iter 69 2:131072,-131387284455352
00016 iomap_seek_hole_iter 69 3:393216,4503599627239424
00016 iomap_seek_data_iter 69 2:131072,-131387284455352
00016 iomap_seek_hole_iter 69 2:131072,-131387284455352
00016 iomap_seek_hole_iter 69 3:393216,4503599627239424
This is after the first call to drop_caches, so in userspace, we're
doing:
echo 1 > /proc/sys/vm/drop_caches
echo "CoW the shared part then write into the empty part" | tee -a $seqres.full
$XFS_IO_PROG -c "cowextsize" $testdir/file1 >> $seqres.full
grep 'order=[^0]' /sys/kernel/debug/tracing/trace
$XFS_IO_PROG -c "cowextsize" $testdir/file2 >> $seqres.full
grep 'order=[^0]' /sys/kernel/debug/tracing/trace
$XFS_IO_PROG -c "pwrite -S 0x63 0 $blksz" $testdir/file2 >> $seqres.full
grep 'order=[^0]' /sys/kernel/debug/tracing/trace
$XFS_IO_PROG -c "pwrite -S 0x63 $((blksz * 3)) $blksz" $testdir/file2 >> $seqres.full
grep 'order=[^0]' /sys/kernel/debug/tracing/trace
The calls to grep were added by me, and should show any order>0 pages
added to the page cache. There are none, which seems like it would
exonerate large folios. Of course, it must be this commit, so there
must be large pages.
Debugging continues ...
On Wed, Feb 23, 2022 at 07:59:39PM +0000, Matthew Wilcox wrote:
> > > /mnt/scratch/test-420/file2:
> > > EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
> > > 0: [0..255]: 128..383 0 (128..383) 256 100000
> > > 1: [256..1023]: hole 768
> >
> > So this is the extent list for file2 after the reflink. Note the
> > hole at 128-512kB. The flags tell us the 128kB data extent is
> > shared. There are no unwritten extents at all.
>
> My debugging indicates otherwise:
>
> +++ b/fs/iomap/seek.c
> @@ -15,6 +15,7 @@ static loff_t iomap_seek_hole_iter(const struct iomap_iter *iter,
> {
> loff_t length = iomap_length(iter);
>
> +printk("%s %ld %d:%lld,%lld\n", __func__, iter->inode->i_ino, iter->iomap.type, iter->iomap.length);
> switch (iter->iomap.type) {
> case IOMAP_UNWRITTEN:
> *hole_pos = mapping_seek_hole_data(iter->inode->i_mapping,
> @@ -61,6 +62,7 @@ static loff_t iomap_seek_data_iter(const struct iomap_iter *iter,
> {
> loff_t length = iomap_length(iter);
>
> +printk("%s %ld %d:%lld,%lld\n", __func__, iter->inode->i_ino, iter->iomap.type, iter->iomap.length);
> switch (iter->iomap.type) {
> case IOMAP_HOLE:
> return length;
>
> gives me output:
>
> 00016 iomap_seek_hole_iter 68 2:131072,-131387284454392
> 00016 iomap_seek_hole_iter 68 0:393216,-131387284454392
> 00016 iomap_seek_data_iter 68 2:131072,-131387284454392
> 00016 iomap_seek_hole_iter 68 2:131072,-131387284454392
> 00016 iomap_seek_hole_iter 68 0:393216,-131387284454392
> 00016 iomap_seek_data_iter 68 0:393216,-131387284454392
> 00016 iomap_seek_hole_iter 69 2:131072,-131387284455352
> 00016 iomap_seek_hole_iter 69 3:393216,4503599627239424
> 00016 iomap_seek_data_iter 69 2:131072,-131387284455352
> 00016 iomap_seek_hole_iter 69 2:131072,-131387284455352
> 00016 iomap_seek_hole_iter 69 3:393216,4503599627239424
I forgot to print 'offset' first, so that's showing (type:len,garbage)
With that fixed, and a call to dump_page() in folio_seek_hole_data(),
I see:
00016 iomap_seek_hole_iter 69 3:131072,65536
00016 page:00000000f7f02915 refcount:18 mapcount:0 mapping:000000008692207c index:0x20 pfn:0x108750
00016 head:00000000f7f02915 order:4 compound_mapcount:0 compound_pincount:0
00016 aops:0xffffffff81c32a00 ino:45 dentry name:"file2"
00016 flags: 0x8000000000012014(uptodate|lru|private|head|zone=2)
00016 raw: 8000000000012014 ffffea000421e008 ffffea000421d008 ffff888103947520
00016 raw: 0000000000000020 ffff8881051cd2a0 00000012ffffffff 0000000000000000
00016 page dumped because: folio_seek_hole_data
(inode 69 decimal is 45 hex)
So I think this explains the output? An order-4 page is 64kB, so it'll
report the entirety of 128kB-192kB as data.
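To spell out the arithmetic (assuming the 4kB base page size of this
x86-64 box), the folio in that dump_page() output covers exactly the
range the golden output expects to be a hole. A tiny standalone check:

/* folio-range.c: byte range covered by the folio from the dump above
 * (index 0x20, order 4), assuming 4kB base pages.
 */
#include <stdio.h>

int main(void)
{
	unsigned long page_size = 4096;	/* base page size on this box */
	unsigned long index = 0x20;	/* page cache index from dump_page() */
	unsigned int order = 4;		/* order-4 folio = 16 pages = 64kB */

	unsigned long start = index * page_size;
	unsigned long len = (1UL << order) * page_size;

	printf("folio covers %lu..%lu (%lu kB at %lu kB)\n",
	       start, start + len - 1, len / 1024, start / 1024);
	/* prints: folio covers 131072..196607 (64 kB at 128 kB) */
	return 0;
}

That is the 131072-196607 span where the golden output expects a hole
(HOLE 131072, then DATA 196608), and it is covered by a single
uptodate folio.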