LinuxLists.cc - [6.2][regression] after commit 947a629988f191807d2d22ba63ae18259bb645c5 btrfs volume periodical forced switch to readonly after a lot of disk writes

2022-12-25 22:07:57

Subject: [6.2][regression] after commit 947a629988f191807d2d22ba63ae18259bb645c5 btrfs volume periodical forced switch to readonly after a lot of disk writes

Hi,
It is curious but it happens only on machine which have BTRFS volume
combined from two high speed nvme (pcie 4) SSD in RAID 0. On machines
with BTRFS volume from one HDD the bug does not appear.

To bisect the problematic commit, I had to sweat a lot. At each step,
I downloaded the 150 GB game "Assassin's Creed Valhalla" 4 times and
deleted it. For make sure that the commit previous to
947a629988f191807d2d22ba63ae18259bb645c5 is definitely not affected by
the bug, I downloaded this game 10 times, which should have provided
more than 1.5 Tb of data writing to the btrfs volume.

Here is result of my bisection:
947a629988f191807d2d22ba63ae18259bb645c5 is the first bad commit
commit 947a629988f191807d2d22ba63ae18259bb645c5
Author: Qu Wenruo <[email protected]>
Date: Wed Sep 14 13:32:51 2022 +0800

btrfs: move tree block parentness check into validate_extent_buffer()

[BACKGROUND]
Although both btrfs metadata and data has their read time verification
done at endio time (btrfs_validate_metadata_buffer() and
btrfs_verify_data_csum()), metadata has extra verification, mostly
parentness check including first key/transid/owner_root/level, done at
read_tree_block() and btrfs_read_extent_buffer().

On the other hand, all the data verification is done at endio context.

[ENHANCEMENT]
This patch will make a new union in btrfs_bio, taking the space of the
old data checksums, thus it will not increase the memory usage.

With that extra btrfs_tree_parent_check inside btrfs_bio, we can just
pass the check parameter into read_extent_buffer_pages(), and before
submitting the bio, we can copy the check structure into btrfs_bio.

And finally at endio time, we can grab btrfs_bio::parent_check and pass
it to validate_extent_buffer(), to move the remaining checks into it.

This brings the following benefits:

- Much simpler btrfs_read_extent_buffer()
Now it only needs to iterate through all mirrors.

- Simpler read-time transid check
Previously we go verify_parent_transid() after reading out the extent
buffer.
Now the transid check is done inside the endio function, no other
code can modify the content.
Thus no need to use the extent lock anymore.

Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>

fs/btrfs/disk-io.c | 73 ++++++++++++++++++++++++++++++++++++++--------------
fs/btrfs/extent_io.c | 18 ++++++++++---
fs/btrfs/extent_io.h | 5 ++--
fs/btrfs/volumes.h | 25 +++++++++++++++---
4 files changed, 93 insertions(+), 28 deletions(-)

Before going to readonly, the preceding line in kernel log display a message:
[ 1908.029663] BTRFS: error (device nvme0n1p3: state A) in
btrfs_run_delayed_refs:2147: errno=-5 IO failure

I also attached a full kernel log.

--
Best Regards,
Mike Gavrilov.

Attachments:

btrfs-issue-dmesg.txt (344.85 kB)

2022-12-26 00:49:37

by Qu Wenruo

[permalink] [raw]

Subject: Re: [6.2][regression] after commit 947a629988f191807d2d22ba63ae18259bb645c5 btrfs volume periodical forced switch to readonly after a lot of disk writes

On 2022/12/26 05:32, Mikhail Gavrilov wrote:
> Hi,
> It is curious but it happens only on machine which have BTRFS volume
> combined from two high speed nvme (pcie 4) SSD in RAID 0. On machines
> with BTRFS volume from one HDD the bug does not appear.
>
> To bisect the problematic commit, I had to sweat a lot. At each step,
> I downloaded the 150 GB game "Assassin's Creed Valhalla" 4 times and
> deleted it. For make sure that the commit previous to
> 947a629988f191807d2d22ba63ae18259bb645c5 is definitely not affected by
> the bug, I downloaded this game 10 times, which should have provided
> more than 1.5 Tb of data writing to the btrfs volume.
>
> Here is result of my bisection:
> 947a629988f191807d2d22ba63ae18259bb645c5 is the first bad commit
> commit 947a629988f191807d2d22ba63ae18259bb645c5
> Author: Qu Wenruo <[email protected]>
> Date: Wed Sep 14 13:32:51 2022 +0800
>
> btrfs: move tree block parentness check into validate_extent_buffer()
>
[...]
> Signed-off-by: Qu Wenruo <[email protected]>
> Signed-off-by: David Sterba <[email protected]>
>
> fs/btrfs/disk-io.c | 73 ++++++++++++++++++++++++++++++++++++++--------------
> fs/btrfs/extent_io.c | 18 ++++++++++---
> fs/btrfs/extent_io.h | 5 ++--
> fs/btrfs/volumes.h | 25 +++++++++++++++---
> 4 files changed, 93 insertions(+), 28 deletions(-)
>
> Before going to readonly, the preceding line in kernel log display a message:
> [ 1908.029663] BTRFS: error (device nvme0n1p3: state A) in
> btrfs_run_delayed_refs:2147: errno=-5 IO failure
>
> I also attached a full kernel log.
>
Thanks a lot for the full kernel log.

It indeed shows something is wrong in the run_one_delayed_ref().
But surprisingly, if there is something wrong, I'd expect more output
from btrfs, as normally if one tree block failed to pass whatever the
checks, it should cause an error message at least.

Since you can reproduce the bug (although I don't think this is easy to
reproduce), mind to apply the extra debug patch and then try to reproduce?

(Part of the patch would go upstreamed soon)

Another thing is, mind to run "btrfs check --readonly" on the fs?
I don't believe that's the case, but just in case.

Thanks,
Qu

Attachments:

0001-btrfs-add-extra-debug-for-run_one_delayed_ref.patch (3.02 kB)

2022-12-26 03:12:08

On 2022/12/26 16:15, Mikhail Gavrilov wrote:
> On Mon, Dec 26, 2022 at 8:29 AM Qu Wenruo <[email protected]> wrote:
>>
>>
>> OK, indeed a level mismatch.
>>
>> From the remaining lines, it shows we're failing at
>> do_free_extent_accounting(), which failed at the btrfs_del_csums().
>>
>> And inside btrfs_del_csums(), what we do are all regular btree
>> operations, thus the tree level check should work without problem.
>>
>> Thus it seems to be a corrupted csum tree.
>
> Do I need to debug anything else to understand the cause of the error?
> Thanks.

With the check output, it's indeed a runtime error.
(At least no corruption to your fs)

And it can be some call paths not properly initializing the level to check.

Here is the new debug patch.
It should be applied without any previous debug patch.

Thanks,
Qu

>
>> Could you please run "btrfs check --readonly" from a liveCD?
>> There are tons of possible false alerts if ran on a RW mounted fs.
>>
>
> # btrfs check --readonly /dev/nvme0n1p3
> Opening filesystem to check...
> Checking filesystem on /dev/nvme0n1p3
> UUID: 40e0b5d2-df54-46e0-b6f4-2f868296271d
> [1/7] checking root items
> [2/7] checking extents
> [3/7] checking free space tree
> [4/7] checking fs roots
> [5/7] checking only csums items (without verifying data)
> [6/7] checking root refs
> [7/7] checking quota groups skipped (not enabled on this FS)
> found 6828416307200 bytes used, no error found
> total csum bytes: 6651838248
> total tree bytes: 16378380288
> total fs tree bytes: 7483179008
> total extent tree bytes: 1228210176
> btree space waste bytes: 2413299694
> file data blocks allocated: 6899999100928
> referenced 7488299450368
> [root@localhost-live ~]#
>
> With liveCD looks like all OK (no errors found).
>

Attachments:

0001-btrfs-add-extra-debug-for-level-mismatch.patch (2.62 kB)

2022-12-26 11:19:28

On 2022/12/26 20:15, Mikhail Gavrilov wrote:
> On Mon, Dec 26, 2022 at 5:11 PM Qu Wenruo <[email protected]> wrote:
>>
>>
>> While at the caller, the structure is properly passed in.
>>
>> So there is something wrong between the endio function and the check.
>>
>> I have created the v2 version patch to debug, please apply without any
>> previous debug patch.
>>
>> Meanwhile this really looks like a race, thus I'm not 100% sure if my
>> debug patch would reduce the possibility to reproduce.
>>
>> Your bisect should be the determining evidence, for the worst case we
>> can revert the offending patch.
>>
>> Thank you very much for all of the testing, it really helps a lot.
>> Qu
>>
>
> Looks like v2 version patch is missed in last message
>
Oh, my bad.

Please be aware that, this newer version may cause extra kernel
warnings, if it found any uninitialized btrfs_tree_parent_check
structure at submit time.

But there are completely valid cases we can do that, thus it may cause
much larger dmesg output, and you should not rely on the kernel warning
to determine if the bug is triggered.

Thanks,
Qu

Attachments:

0001-btrfs-add-extra-debug-for-level-mismatch.patch (3.69 kB)

2022-12-27 00:17:19

by Mikhail Gavrilov

On 2022/12/28 22:12, Mikhail Gavrilov wrote:
> On Wed, Dec 28, 2022 at 6:08 AM Qu Wenruo <[email protected]> wrote:
>>
>> From the very first dmesg with calltrack, it already shows the
>> submit_one_bio() is called from submit_extent_page(), which means cases
>> cross stripe boundary, and has no parent_check populated at all.
>>
>> And since you're using RAID0 on two NVMEs, it matches the symptom, while
>> most tests done here are using single device (DUP and SINGLE), thus no
>> stripe boundary cases at all.
>> (In fact it should still be possible to trigger on SINGLE, but way too
>> hard to trigger)
>>
>> With proper root cause found, this version should mostly handle the
>> regression correctly.
>>
>> This version should mostly be the formal one I'd later send to the
>> mailing list.
>>
>> I can not thank you more for all the testing you have provided, it not
>> only pinned down the bug, but also proves I'm a total idiot...
>
> I have already written over 1.6Tb of data to disk and there are no
> hints of errors.
> For me, this is a sign that the problem has been fixed.
> Tested-by: Mikhail Gavrilov <[email protected]>

Just one last thing to confirm, mind to provide the following dump?

# btrfs ins dump-tree -t chunk <device>

If the fix works, it means you have some tree block crossing stripe
boundary, which is not the ideal situation.

But on the other hand, you have provided "btrfs check" output, which
shows no such sign.
I'm not sure if it's a bug in newer check.

Thanks for your comprehensive testing and detailed report!
Qu

>
> ❯ dmesg | grep -i btrfs
> [ 0.000000] Linux version
> 6.2.0-rc1-1b929c02afd37871d5afb9d498426f83432e71c2-btrfs-fix+
> (mikhail@mikhail-laptop) (gcc (GCC) 12.2.1 20221121 (Red Hat
> 12.2.1-4), GNU ld version 2.39-6.fc38) #7 SMP PREEMPT_DYNAMIC Wed Dec
> 28 10:00:39 +05 2022
> [ 0.000000] Command line:
> BOOT_IMAGE=(hd0,gpt1)/@root/boot/vmlinuz-6.2.0-rc1-1b929c02afd37871d5afb9d498426f83432e71c2-btrfs-fix+
> root=UUID=40e0b5d2-df54-46e0-b6f4-2f868296271d ro
> rootflags=subvol=@root
> resume=UUID=db79988f-6b70-4b52-84f5-3e505471c85e log_buf_len=16M
> sysrq_always_enabled=1 nmi_watchdog=1
> amdgpu.lockup_timeout=-1,-1,-1,-1
> [ 0.154567] Kernel command line:
> BOOT_IMAGE=(hd0,gpt1)/@root/boot/vmlinuz-6.2.0-rc1-1b929c02afd37871d5afb9d498426f83432e71c2-btrfs-fix+
> root=UUID=40e0b5d2-df54-46e0-b6f4-2f868296271d ro
> rootflags=subvol=@root
> resume=UUID=db79988f-6b70-4b52-84f5-3e505471c85e log_buf_len=16M
> sysrq_always_enabled=1 nmi_watchdog=1
> amdgpu.lockup_timeout=-1,-1,-1,-1
> [ 0.154654] Unknown kernel command line parameters
> "BOOT_IMAGE=(hd0,gpt1)/@root/boot/vmlinuz-6.2.0-rc1-1b929c02afd37871d5afb9d498426f83432e71c2-btrfs-fix+",
> will be passed to user space.
> [ 4.496766] usb usb2: Manufacturer: Linux
> 6.2.0-rc1-1b929c02afd37871d5afb9d498426f83432e71c2-btrfs-fix+ xhci-hcd
> [ 4.498963] usb usb1: Manufacturer: Linux
> 6.2.0-rc1-1b929c02afd37871d5afb9d498426f83432e71c2-btrfs-fix+ xhci-hcd
> [ 4.500665] usb usb3: Manufacturer: Linux
> 6.2.0-rc1-1b929c02afd37871d5afb9d498426f83432e71c2-btrfs-fix+ xhci-hcd
> [ 4.501851] usb usb4: Manufacturer: Linux
> 6.2.0-rc1-1b929c02afd37871d5afb9d498426f83432e71c2-btrfs-fix+ xhci-hcd
> [ 4.735212] Btrfs loaded, crc32c=crc32c-generic, assert=on,
> zoned=yes, fsverity=yes
> [ 5.223368]
> BOOT_IMAGE=(hd0,gpt1)/@root/boot/vmlinuz-6.2.0-rc1-1b929c02afd37871d5afb9d498426f83432e71c2-btrfs-fix+
> [ 6.923453] BTRFS: device label fedora_localhost-live devid 2
> transid 652981 /dev/nvme1n1p1 scanned by systemd-udevd (448)
> [ 6.974412] BTRFS: device label fedora_localhost-live devid 1
> transid 652981 /dev/nvme0n1p3 scanned by systemd-udevd (484)
> [ 11.113437] CPU: 15 PID: 478 Comm: systemd-udevd Tainted: G
> L 6.2.0-rc1-1b929c02afd37871d5afb9d498426f83432e71c2-btrfs-fix+
> #7
> [ 11.221359] CPU: 15 PID: 478 Comm: systemd-udevd Tainted: G
> W L 6.2.0-rc1-1b929c02afd37871d5afb9d498426f83432e71c2-btrfs-fix+
> #7
> [ 13.731015] BTRFS info (device nvme0n1p3): using crc32c
> (crc32c-intel) checksum algorithm
> [ 13.731147] BTRFS info (device nvme0n1p3): using free space tree
> [ 14.328439] BTRFS info (device nvme0n1p3): enabling ssd optimizations
> [ 14.328469] BTRFS info (device nvme0n1p3): auto enabling async discard
> [ 16.592713] BTRFS info (device nvme0n1p3: state M): use zstd
> compression, level 1
> [11691.071176] CPU: 11 PID: 2068 Comm: gnome-shell Tainted: G W
> L 6.2.0-rc1-1b929c02afd37871d5afb9d498426f83432e71c2-btrfs-fix+
> #7
>
>
> <OFFTOPIC>
> As I mentioned at the first message I also have a computer where the
> btrfs partition is located on a slow HDD.
> When I update the container (podman pull), the system becomes
> unresposible for half an hour, which is how long it takes to update
> the container.
> I do not expect any super-speed from the HDD, I just would like to do
> something else with this computer. Yes, at least watching videos on
> youtube. Is there anything that can be done here or is there nothing
> that we can do?
> [46944.301588] INFO: task btrfs-transacti:1184 blocked for more than
> 122 seconds.
> [46944.301825] Tainted: G W L ------- ---
> 6.2.0-0.rc1.14.fc38.x86_64+debug #1
> [46944.301829] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [46944.301832] task:btrfs-transacti state:D stack:12000 pid:1184
> ppid:2 flags:0x00004000
> [46944.301840] Call Trace:
> [46944.301843] <TASK>
> [46944.301851] __schedule+0x50c/0x1780
> [46944.301863] ? _raw_spin_unlock_irqrestore+0x30/0x60
> [46944.301876] schedule+0x5d/0xe0
> [46944.301881] wait_current_trans+0x110/0x170
> [46944.301888] ? __pfx_autoremove_wake_function+0x10/0x10
> [46944.301895] start_transaction+0x36c/0x680
> [46944.301904] transaction_kthread+0xb6/0x1b0
> [46944.301912] ? __pfx_transaction_kthread+0x10/0x10
> [46944.301916] kthread+0xf5/0x120
> [46944.301920] ? __pfx_kthread+0x10/0x10
> [46944.301926] ret_from_fork+0x2c/0x50
> [46944.301941] </TASK>
>
>
> I attached a full kernel log from this machine.
> I can start a separate thread if it makes sense.
> Sorry for oftop.
> </OFFTOPIC>
>
> --
> Best Regards,
> Mike Gavrilov.

2022-12-29 00:07:29

On Mon, Dec 26, 2022 at 7:50 AM David Arendt <[email protected]> wrote:
>
> Hi,
>
> I am experiencing I similiar issue using kernel 6.1.1. The problem has
> started in kernel 6.1. It never happend before.
>
> A btrfs scrub and a btrfs check --readonly returned 0 errors.
>
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - BTRFS critical
> (device sda2): corrupt leaf: root=18446744073709551610
> block=203366612992 slot=73, bad key order, prev (484119 96 1312873)
> current (484119 96 1312849)
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - BTRFS info
> (device sda2): leaf 203366612992 gen 5068802 total ptrs 105 free space
> 10820 owner 18446744073709551610
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 0 key
> (484119 1 0) itemoff 16123 itemsize 160
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09inode
> generation 45 size 2250 mode 40700
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 1 key
> (484119 12 484118) itemoff 16097 itemsize 26
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 2 key
> (484119 72 15) itemoff 16089 itemsize 8
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 3 key
> (484119 72 20) itemoff 16081 itemsize 8
> ...

You've removed the most interesting part of the message.

Can you paste it without any cuts (...)?

Thanks.

> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 82 key
> (484119 96 1312873) itemoff 14665 itemsize 51
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 83 key
> (484119 96 1312877) itemoff 14609 itemsize 56
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 84 key
> (484128 1 0) itemoff 14449 itemsize 160
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09inode
> generation 45 size 98304 mode 100644
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 85 key
> (484128 108 0) itemoff 14396 itemsize 53
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09extent
> data disk bytenr 10674830381056 nr 65536
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09extent
> data offset 0 nr 65536 ram 65536
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 86 key
> (484129 1 0) itemoff 14236 itemsize 160
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09inode
> generation 45 size 26214400 mode 100644
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 87 key
> (484129 108 589824) itemoff 14183 itemsize 53
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09extent
> data disk bytenr 10665699962880 nr 32768
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09extent
> data offset 0 nr 32768 ram 32768
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 88 key
> (484129 108 2850816) itemoff 14130 itemsize 53
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09extent
> data disk bytenr 10665848733696 nr 32768
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09extent
> data offset 0 nr 32768 ram 32768
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 89 key
> (484129 108 11042816) itemoff 14077 itemsize 53
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09extent
> data disk bytenr 10660869349376 nr 32768
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09extent
> data offset 0 nr 32768 ram 32768
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 90 key
> (484129 108 13402112) itemoff 14024 itemsize 53
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09extent
> data disk bytenr 10660207378432 nr 32768
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09extent
> data offset 0 nr 32768 ram 32768
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 91 key
> (484129 108 19628032) itemoff 13971 itemsize 53
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09extent
> data disk bytenr 10665835618304 nr 32768
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09extent
> data offset 0 nr 32768 ram 32768
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 92 key
> (484129 108 21266432) itemoff 13918 itemsize 53
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09extent
> data disk bytenr 10661222666240 nr 32768
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09extent
> data offset 0 nr 32768 ram 32768
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 93 key
> (484129 108 22740992) itemoff 13865 itemsize 53
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09extent
> data disk bytenr 10665565814784 nr 32768
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09extent
> data offset 0 nr 32768 ram 32768
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 94 key
> (484129 108 23101440) itemoff 13812 itemsize 53
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09extent
> data disk bytenr 10665836371968 nr 32768
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09extent
> data offset 0 nr 32768 ram 32768
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 95 key
> (484129 108 24084480) itemoff 13759 itemsize 53
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09extent
> data disk bytenr 10665836404736 nr 32768
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09extent
> data offset 0 nr 32768 ram 32768
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 96 key
> (484129 108 24150016) itemoff 13706 itemsize 53
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09extent
> data disk bytenr 10665849159680 nr 32768
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09extent
> data offset 0 nr 32768 ram 32768
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 97 key
> (484129 108 24313856) itemoff 13653 itemsize 53
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09extent
> data disk bytenr 10665849192448 nr 32768
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09extent
> data offset 0 nr 32768 ram 32768
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 98 key
> (484147 1 0) itemoff 13493 itemsize 160
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09\x09inode
> generation 45 size 886 mode 40755
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 99 key
> (484147 72 4) itemoff 13485 itemsize 8
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 100 key
> (484147 72 27) itemoff 13477 itemsize 8
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 101 key
> (484147 72 35) itemoff 13469 itemsize 8
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 102 key
> (484147 72 40) itemoff 13461 itemsize 8
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 103 key
> (484147 72 45) itemoff 13453 itemsize 8
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - \x09item 104 key
> (484147 72 52) itemoff 13445 itemsize 8
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - BTRFS error
> (device sda2): block=203366612992 write time tree block corruption detected
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - BTRFS: error
> (device sda2: state AL) in free_log_tree:3284: errno=-5 IO failure
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - BTRFS info
> (device sda2: state EAL): forced readonly
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - BTRFS warning
> (device sda2: state EAL): Skipping commit of aborted transaction.
> 2022-12-26T07:44:45.000000+01:00 server02 kernel - - - BTRFS: error
> (device sda2: state EAL) in cleanup_transaction:1958: errno=-5 IO failure
>
>
> There are no SSD access errors in the kernel logs. Smart data for the
> SSD is normal. I also did a 12 hour memtest to rule out bad RAM.
>
> The filesystem consists of a single 480GB SATA SSD (Corsair Neutron
> XTI). Like for the original poster, the problems occurs only on one
> machine.
>
> The error appears about every few days and seems to be triggered by a
> combination specially under high cpu utilization combined with some disk
> IO. CPU temperature never exceeds 60 degrees.
>
>
> On 26/12/2022 04:29, Qu Wenruo wrote:
> >
> >
> > On 2022/12/26 10:47, Mikhail Gavrilov wrote:
> >> On Mon, Dec 26, 2022 at 5:15 AM Qu Wenruo <[email protected]>
> >> wrote:
> >>>
> >>> Thanks a lot for the full kernel log.
> >>>
> >>> It indeed shows something is wrong in the run_one_delayed_ref().
> >>> But surprisingly, if there is something wrong, I'd expect more output
> >>> from btrfs, as normally if one tree block failed to pass whatever the
> >>> checks, it should cause an error message at least.
> >>>
> >>> Since you can reproduce the bug (although I don't think this is easy to
> >>> reproduce), mind to apply the extra debug patch and then try to
> >>> reproduce?
> >>
> >> Of course I am still able to reproduce.
> >> The number of messages foreshadowing readonly has become somewhat more:
> >> [ 2295.155437] BTRFS error (device nvme0n1p3): level check failed on
> >> logical 4957418700800 mirror 1 wanted 0 found 1
> >
> > OK, indeed a level mismatch.
> >
> > From the remaining lines, it shows we're failing at
> > do_free_extent_accounting(), which failed at the btrfs_del_csums().
> >
> > And inside btrfs_del_csums(), what we do are all regular btree
> > operations, thus the tree level check should work without problem.
> >
> > Thus it seems to be a corrupted csum tree.
> >
> >> [ 2295.155831] BTRFS error (device nvme0n1p3: state A): Transaction
> >> aborted (error -5)
> >> [ 2295.155946] BTRFS: error (device nvme0n1p3: state A) in
> >> do_free_extent_accounting:2849: errno=-5 IO failure
> >> [ 2295.155978] BTRFS info (device nvme0n1p3: state EA): forced readonly
> >> [ 2295.155985] BTRFS error (device nvme0n1p3: state EA):
> >> run_one_delayed_ref returned -5
> >> [ 2295.156051] BTRFS: error (device nvme0n1p3: state EA) in
> >> btrfs_run_delayed_refs:2153: errno=-5 IO failure
> >>
> >> Of course full logs are also attached.
> >>
> >>> Another thing is, mind to run "btrfs check --readonly" on the fs?
> >> Result of check attached too.
> >>
> > Could you please run "btrfs check --readonly" from a liveCD?
> > There are tons of possible false alerts if ran on a RW mounted fs.
> >
> > Thanks,
> > Qu
>
>