Greeting,
FYI, we noticed xfstests.btrfs.172.fail due to commit (built with gcc-11):
commit: 5f58d783fd7823b2c2d5954d1126e702f94bfc4c ("btrfs: free device in btrfs_close_devices for a single device filesystem")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: xfstests
version: xfstests-i386-5a5e419-1_20220926
with following parameters:
disk: 6HDD
fs: btrfs
test: btrfs-logwrites
test-description: xfstests is a regression test suite for xfs and other files ystems.
test-url: git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git
on test machine: 8 threads 1 sockets Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz (Haswell) with 8G memory
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
we did this test on i386 and we found the issue doesn't always happen,
on both this commit and v6.3-rc2, it failed in about half of tests.
however, always clean on parent commit 519b7e13b5ae8dd3.
519b7e13b5ae8dd3 5f58d783fd7823b2c2d5954d112 v6.3-rc2
---------------- --------------------------- ---------------------------
fail:runs %reproduction fail:runs %reproduction fail:runs
| | | | |
:14 23% 7:13 33% 10:19 xfstests.btrfs.172.fail
14:14 -27% 6:13 -17% 9:19 xfstests.btrfs.172.pass
If you fix the issue, kindly add following tag
| Reported-by: kernel test robot <[email protected]>
| Link: https://lore.kernel.org/oe-lkp/[email protected]
2023-03-08 04:47:50 export TEST_DIR=/fs/sdb1
2023-03-08 04:47:50 export TEST_DEV=/dev/sdb1
2023-03-08 04:47:50 export FSTYP=btrfs
2023-03-08 04:47:50 export SCRATCH_MNT=/fs/scratch
2023-03-08 04:47:50 mkdir /fs/scratch -p
2023-03-08 04:47:50 export SCRATCH_DEV_POOL="/dev/sdb2 /dev/sdb3 /dev/sdb4 /dev/sdb5 /dev/sdb6"
2023-03-08 04:47:50 export LOGWRITES_DEV=/dev/sdb2
2023-03-08 04:47:50 export SCRATCH_DEV=/dev/sdb6
2023-03-08 04:47:50 unset SCRATCH_DEV_POOL
2023-03-08 04:47:50 sed "s:^:btrfs/:" //lkp/benchmarks/xfstests/tests/btrfs-logwrites
2023-03-08 04:47:50 ./check btrfs/291 btrfs/206 btrfs/196 btrfs/192 btrfs/190 btrfs/172
FSTYP -- btrfs
PLATFORM -- Linux/i686 lkp-hsw-d01 6.1.0-rc8-00284-g5f58d783fd78 #1 SMP Wed Mar 8 11:05:36 CST 2023
MKFS_OPTIONS -- /dev/sdb6
MOUNT_OPTIONS -- /dev/sdb6 /fs/scratch
btrfs/172 [failed, exit status 1]- output mismatch (see /lkp/benchmarks/xfstests/results//btrfs/172.out.bad)
--- tests/btrfs/172.out 2022-09-26 09:38:15.000000000 +0000
+++ /lkp/benchmarks/xfstests/results//btrfs/172.out.bad 2023-03-08 04:47:54.134026209 +0000
@@ -1,3 +1,5 @@
QA output created by 172
wrote 5242880/5242880 bytes at offset 0
XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+_check_btrfs_filesystem: filesystem on /dev/sdb6 is inconsistent
+(see /lkp/benchmarks/xfstests/results//btrfs/172.full for details)
...
(Run 'diff -u /lkp/benchmarks/xfstests/tests/btrfs/172.out /lkp/benchmarks/xfstests/results//btrfs/172.out.bad' to see the entire diff)
btrfs/190 15s
btrfs/192 101s
btrfs/196 163s
btrfs/206 4s
btrfs/291 [not run] kernel btrfs isn't configured with verity support
Ran: btrfs/172 btrfs/190 btrfs/192 btrfs/196 btrfs/206 btrfs/291
Not run: btrfs/291
Failures: btrfs/172
Failed 1 of 6 tests
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# if come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests
On 3/17/23 09:17, kernel test rboot wrote:
>
> Greeting,
>
> FYI, we noticed xfstests.btrfs.172.fail due to commit (built with gcc-11):
>
> commit: 5f58d783fd7823b2c2d5954d1126e702f94bfc4c ("btrfs: free device in btrfs_close_devices for a single device filesystem")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> in testcase: xfstests
> version: xfstests-i386-5a5e419-1_20220926
> with following parameters:
>
> disk: 6HDD
> fs: btrfs
> test: btrfs-logwrites
>
> test-description: xfstests is a regression test suite for xfs and other files ystems.
> test-url: git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git
>
>
> on test machine: 8 threads 1 sockets Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz (Haswell) with 8G memory
>
> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>
>
> we did this test on i386 and we found the issue doesn't always happen,
> on both this commit and v6.3-rc2, it failed in about half of tests.
> however, always clean on parent commit 519b7e13b5ae8dd3.
>
> 519b7e13b5ae8dd3 5f58d783fd7823b2c2d5954d112 v6.3-rc2
> ---------------- --------------------------- ---------------------------
> fail:runs %reproduction fail:runs %reproduction fail:runs
> | | | | |
> :14 23% 7:13 33% 10:19 xfstests.btrfs.172.fail
> 14:14 -27% 6:13 -17% 9:19 xfstests.btrfs.172.pass
>
>
>
> If you fix the issue, kindly add following tag
> | Reported-by: kernel test robot <[email protected]>
> | Link: https://lore.kernel.org/oe-lkp/[email protected]
>
> 2023-03-08 04:47:50 export TEST_DIR=/fs/sdb1
> 2023-03-08 04:47:50 export TEST_DEV=/dev/sdb1
> 2023-03-08 04:47:50 export FSTYP=btrfs
> 2023-03-08 04:47:50 export SCRATCH_MNT=/fs/scratch
> 2023-03-08 04:47:50 mkdir /fs/scratch -p
> 2023-03-08 04:47:50 export SCRATCH_DEV_POOL="/dev/sdb2 /dev/sdb3 /dev/sdb4 /dev/sdb5 /dev/sdb6"
> 2023-03-08 04:47:50 export LOGWRITES_DEV=/dev/sdb2
> 2023-03-08 04:47:50 export SCRATCH_DEV=/dev/sdb6
> 2023-03-08 04:47:50 unset SCRATCH_DEV_POOL
> 2023-03-08 04:47:50 sed "s:^:btrfs/:" //lkp/benchmarks/xfstests/tests/btrfs-logwrites
> 2023-03-08 04:47:50 ./check btrfs/291 btrfs/206 btrfs/196 btrfs/192 btrfs/190 btrfs/172
> FSTYP -- btrfs
> PLATFORM -- Linux/i686 lkp-hsw-d01 6.1.0-rc8-00284-g5f58d783fd78 #1 SMP Wed Mar 8 11:05:36 CST 2023
> MKFS_OPTIONS -- /dev/sdb6
> MOUNT_OPTIONS -- /dev/sdb6 /fs/scratch
>
> btrfs/172 [failed, exit status 1]- output mismatch (see /lkp/benchmarks/xfstests/results//btrfs/172.out.bad)
> --- tests/btrfs/172.out 2022-09-26 09:38:15.000000000 +0000
> +++ /lkp/benchmarks/xfstests/results//btrfs/172.out.bad 2023-03-08 04:47:54.134026209 +0000
> @@ -1,3 +1,5 @@
> QA output created by 172
> wrote 5242880/5242880 bytes at offset 0
> XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> +_check_btrfs_filesystem: filesystem on /dev/sdb6 is inconsistent
Thank you for reporting it. I am currently looking into it. It appears
that the device is being freed after it's closed, which could indicate
that we overlooked something at close, just a hypothesis at this point.
-Anand
> +(see /lkp/benchmarks/xfstests/results//btrfs/172.full for details)
> ...
> (Run 'diff -u /lkp/benchmarks/xfstests/tests/btrfs/172.out /lkp/benchmarks/xfstests/results//btrfs/172.out.bad' to see the entire diff)
> btrfs/190 15s
> btrfs/192 101s
> btrfs/196 163s
> btrfs/206 4s
> btrfs/291 [not run] kernel btrfs isn't configured with verity support
> Ran: btrfs/172 btrfs/190 btrfs/192 btrfs/196 btrfs/206 btrfs/291
> Not run: btrfs/291
> Failures: btrfs/172
> Failed 1 of 6 tests
>
>
>
>
> To reproduce:
>
> git clone https://github.com/intel/lkp-tests.git
> cd lkp-tests
> sudo bin/lkp install job.yaml # job file is attached in this email
> bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
> sudo bin/lkp run generated-yaml-file
>
> # if come across any failure that blocks the test,
> # please remove ~/.lkp and /lkp dir to run from a clean state.
>
>
>
On 17/03/2023 12:14, Anand Jain wrote:
>
>
> On 3/17/23 09:17, kernel test rboot wrote:
>>
>> Greeting,
>>
>> FYI, we noticed xfstests.btrfs.172.fail due to commit (built with
>> gcc-11):
>>
>> commit: 5f58d783fd7823b2c2d5954d1126e702f94bfc4c ("btrfs: free device
>> in btrfs_close_devices for a single device filesystem")
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>>
>> in testcase: xfstests
>> version: xfstests-i386-5a5e419-1_20220926
>> with following parameters:
>>
>> disk: 6HDD
>> fs: btrfs
>> test: btrfs-logwrites
>>
>> test-description: xfstests is a regression test suite for xfs and
>> other files ystems.
>> test-url: git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git
>>
>>
>> on test machine: 8 threads 1 sockets Intel(R) Core(TM) i7-4770 CPU @
>> 3.40GHz (Haswell) with 8G memory
>>
>> caused below changes (please refer to attached dmesg/kmsg for entire
>> log/backtrace):
>>
>>
>> we did this test on i386 and we found the issue doesn't always happen,
>> on both this commit and v6.3-rc2, it failed in about half of tests.
>> however, always clean on parent commit 519b7e13b5ae8dd3.
>>
>> 519b7e13b5ae8dd3 5f58d783fd7823b2c2d5954d112 v6.3-rc2
>> ---------------- --------------------------- ---------------------------
>> fail:runs %reproduction fail:runs %reproduction fail:runs
>> | | | | |
>> :14 23% 7:13 33%
>> 10:19 xfstests.btrfs.172.fail
>> 14:14 -27% 6:13 -17%
>> 9:19 xfstests.btrfs.172.pass
>>
>>
>>
>> If you fix the issue, kindly add following tag
>> | Reported-by: kernel test robot <[email protected]>
>> | Link:
>> https://lore.kernel.org/oe-lkp/[email protected]
>>
>> 2023-03-08 04:47:50 export TEST_DIR=/fs/sdb1
>> 2023-03-08 04:47:50 export TEST_DEV=/dev/sdb1
>> 2023-03-08 04:47:50 export FSTYP=btrfs
>> 2023-03-08 04:47:50 export SCRATCH_MNT=/fs/scratch
>> 2023-03-08 04:47:50 mkdir /fs/scratch -p
>> 2023-03-08 04:47:50 export SCRATCH_DEV_POOL="/dev/sdb2 /dev/sdb3
>> /dev/sdb4 /dev/sdb5 /dev/sdb6"
>> 2023-03-08 04:47:50 export LOGWRITES_DEV=/dev/sdb2
>> 2023-03-08 04:47:50 export SCRATCH_DEV=/dev/sdb6
>> 2023-03-08 04:47:50 unset SCRATCH_DEV_POOL
>> 2023-03-08 04:47:50 sed "s:^:btrfs/:"
>> //lkp/benchmarks/xfstests/tests/btrfs-logwrites
>> 2023-03-08 04:47:50 ./check btrfs/291 btrfs/206 btrfs/196 btrfs/192
>> btrfs/190 btrfs/172
>> FSTYP -- btrfs
>> PLATFORM -- Linux/i686 lkp-hsw-d01 6.1.0-rc8-00284-g5f58d783fd78
>> #1 SMP Wed Mar 8 11:05:36 CST 2023
>> MKFS_OPTIONS -- /dev/sdb6
>> MOUNT_OPTIONS -- /dev/sdb6 /fs/scratch
>>
>> btrfs/172 [failed, exit status 1]- output mismatch (see
>> /lkp/benchmarks/xfstests/results//btrfs/172.out.bad)
>> --- tests/btrfs/172.out 2022-09-26 09:38:15.000000000 +0000
>> +++ /lkp/benchmarks/xfstests/results//btrfs/172.out.bad
>> 2023-03-08 04:47:54.134026209 +0000
>> @@ -1,3 +1,5 @@
>> QA output created by 172
>> wrote 5242880/5242880 bytes at offset 0
>> XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>
>> +_check_btrfs_filesystem: filesystem on /dev/sdb6 is inconsistent
>
>
> Thank you for reporting it. I am currently looking into it. It appears
> that the device is being freed after it's closed, which could indicate
> that we overlooked something at close, just a hypothesis at this point.
>
> -Anand
>
>
>> +(see /lkp/benchmarks/xfstests/results//btrfs/172.full for details)
Hmm, I'm unable to reproduce the issue. However, it's possible that the
O_EXCL open for the 'btrfs check' command failed because 'systemd-udev'
was scanning the btrfs device at the same time. I noticed from the dmesg
that the 'systemd-udevd' thread was running [1] at some point. I'm not
entirely sure if it raced with the 'btrfs check' command to successfully
acquire the O_EXCL lock. If you could send me the 'testcase.full' and
'testcase.out.bad' logs from the system, I could verify the issue.
[1]
[ 337.769932][ T6408] BTRFS: device fsid
8dcaa8fb-b317-4e13-9e04-e9b63fe91948 devid 1 transid 6 /dev/sdb6 scanned
by systemd-udevd (6408)
Could you please also attach the full log and out.bad file for
the test case in the report as shown below?
/lkp/benchmarks/xfstests/results/btrfs/<test-case-number>.full
and
/lkp/benchmarks/xfstests/results/btrfs/<test-case-number>.out.bad
Thanks, Anand
>> ...
>> (Run 'diff -u /lkp/benchmarks/xfstests/tests/btrfs/172.out
>> /lkp/benchmarks/xfstests/results//btrfs/172.out.bad' to see the
>> entire diff)
>> btrfs/190 15s
>> btrfs/192 101s
>> btrfs/196 163s
>> btrfs/206 4s
>> btrfs/291 [not run] kernel btrfs isn't configured with verity
>> support
>> Ran: btrfs/172 btrfs/190 btrfs/192 btrfs/196 btrfs/206 btrfs/291
>> Not run: btrfs/291
>> Failures: btrfs/172
>> Failed 1 of 6 tests
>>
>>
>>
>>
>> To reproduce:
>>
>> git clone https://github.com/intel/lkp-tests.git
>> cd lkp-tests
>> sudo bin/lkp install job.yaml # job file is
>> attached in this email
>> bin/lkp split-job --compatible job.yaml # generate the yaml
>> file for lkp run
>> sudo bin/lkp run generated-yaml-file
>>
>> # if come across any failure that blocks the test,
>> # please remove ~/.lkp and /lkp dir to run from a clean state.
>>
>>
>>
hi, Anand,
On Thu, Mar 23, 2023 at 07:49:27AM +0800, Anand Jain wrote:
> On 17/03/2023 12:14, Anand Jain wrote:
> >
> >
> > Thank you for reporting it. I am currently looking into it. It appears
> > that the device is being freed after it's closed, which could indicate
> > that we overlooked something at close, just a hypothesis at this point.
> >
> > -Anand
> >
> >
>
>
> > > ???? +(see /lkp/benchmarks/xfstests/results//btrfs/172.full for details)
>
>
> Hmm, I'm unable to reproduce the issue. However, it's possible that the
> O_EXCL open for the 'btrfs check' command failed because 'systemd-udev'
> was scanning the btrfs device at the same time. I noticed from the dmesg
> that the 'systemd-udevd' thread was running [1] at some point. I'm not
> entirely sure if it raced with the 'btrfs check' command to successfully
> acquire the O_EXCL lock. If you could send me the 'testcase.full' and
> 'testcase.out.bad' logs from the system, I could verify the issue.
>
> [1]
> [ 337.769932][ T6408] BTRFS: device fsid
> 8dcaa8fb-b317-4e13-9e04-e9b63fe91948 devid 1 transid 6 /dev/sdb6 scanned by
> systemd-udevd (6408)
>
>
> Could you please also attach the full log and out.bad file for
> the test case in the report as shown below?
>
> /lkp/benchmarks/xfstests/results/btrfs/<test-case-number>.full
> and
> /lkp/benchmarks/xfstests/results/btrfs/<test-case-number>.out.bad
>
as attached. Thanks!
>
> Thanks, Anand
>
On 23/03/2023 12:07, Oliver Sang wrote:
> hi, Anand,
>
> On Thu, Mar 23, 2023 at 07:49:27AM +0800, Anand Jain wrote:
>> On 17/03/2023 12:14, Anand Jain wrote:
>>>
>>>
>>> Thank you for reporting it. I am currently looking into it. It appears
>>> that the device is being freed after it's closed, which could indicate
>>> that we overlooked something at close, just a hypothesis at this point.
>>>
>>> -Anand
>>>
>>>
>>
>>
>>>> +(see /lkp/benchmarks/xfstests/results//btrfs/172.full for details)
>>
>>
>> Hmm, I'm unable to reproduce the issue. However, it's possible that the
>> O_EXCL open for the 'btrfs check' command failed because 'systemd-udev'
>> was scanning the btrfs device at the same time. I noticed from the dmesg
>> that the 'systemd-udevd' thread was running [1] at some point. I'm not
>> entirely sure if it raced with the 'btrfs check' command to successfully
>> acquire the O_EXCL lock. If you could send me the 'testcase.full' and
>> 'testcase.out.bad' logs from the system, I could verify the issue.
>>
>> [1]
>> [ 337.769932][ T6408] BTRFS: device fsid
>> 8dcaa8fb-b317-4e13-9e04-e9b63fe91948 devid 1 transid 6 /dev/sdb6 scanned by
>> systemd-udevd (6408)
>>
>>
>> Could you please also attach the full log and out.bad file for
>> the test case in the report as shown below?
>>
>> /lkp/benchmarks/xfstests/results/btrfs/<test-case-number>.full
>> and
>> /lkp/benchmarks/xfstests/results/btrfs/<test-case-number>.out.bad
>>
>
> as attached. Thanks!
Thanks.
My assumption about the reason for fsck failure was correct, it failed
due to device busy.
*** fsck.btrfs output ***
ERROR: cannot open device '/dev/sdb6': Device or resource busy
ERROR: cannot open file system
Opening filesystem to check...
*** end fsck.btrfs output
I am testing a fix, I will send it.
Anand