Hi Helge Deller,
Greeting!
Reproduced code: https://github.com/xupengfe/syzkaller_logs/blob/main/230222_152458_xfs_dquot_alloc_bug/repro.c
Kconfig: https://github.com/xupengfe/syzkaller_logs/blob/main/230222_152458_xfs_dquot_alloc_bug/kconfig_origin
Issue dmesg: https://github.com/xupengfe/syzkaller_logs/blob/main/230222_152458_xfs_dquot_alloc_bug/v6.2_c9c3395d5e3dcc6daee66c6908354d47bf98cb0c_dmesg.log
Bisect info(Might not be correct this time, but just gave some clues to the problem)
https://github.com/xupengfe/syzkaller_logs/blob/main/230222_152458_xfs_dquot_alloc_bug/bisect_info.log
All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/230222_152458_xfs_dquot_alloc_bug
Platform: ADL-S, and it could be reproduced on x86 platform in guest.
There is "xfs_dquot_alloc" related BUG in v6.2:
[ 71.149963] xfs filesystem being mounted at /root/syzkaller.6TPmw0/0/file0 supports timestamps until 2038 (0x7fffffff)
[ 71.150653] 00000000: 58 41 47 49 00 00 00 01 00 00 00 00 00 00 80 00 XAGI............
[ 71.151006] 00000010: 00 00 00 40 00 00 00 06 00 00 00 01 00 00 00 37 [email protected]
[ 71.151321] 00000020: 00 00 00 20 ff ff ff ff ff ff ff ff ff ff ff ff ... ............
[ 71.151633] 00000030: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 71.151946] 00000040: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 71.152259] 00000050: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 71.152570] 00000060: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 71.152881] 00000070: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 71.153193] 00000080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 71.153607] 00000090: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 71.153921] 000000a0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 71.154237] 000000b0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 71.154549] 000000c0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 71.154865] 000000d0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 71.155180] 000000e0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 71.155494] 000000f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 71.155807] 00000100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 71.156119] 00000110: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 71.156433] 00000120: ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00 ................
[ 71.156747] 00000130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[ 71.157059] 00000140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[ 71.157382] 00000150: 00 00 00 00 00 00 00 00 ........
[ 71.157671] XFS (loop3): Internal error xfs_iunlink_remove_inode at line 2013 of file fs/xfs/xfs_inode.c. Caller xfs_ifree+0xed/0x9e0
[ 71.158154] CPU: 1 PID: 137 Comm: kworker/1:3 Not tainted 6.2.0-c9c3395d5e3d #1
[ 71.158447] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 71.158890] Workqueue: xfs-inodegc/loop3 xfs_inodegc_worker
[ 71.159117] Call Trace:
[ 71.159220] <TASK>
[ 71.159313] dump_stack_lvl+0xa7/0xdb
[ 71.159478] dump_stack+0x19/0x1f
[ 71.159621] xfs_corruption_error+0xd7/0xe0
[ 71.159805] ? xfs_ifree+0xed/0x9e0
[ 71.159957] xfs_iunlink_remove+0x32e/0x590
[ 71.160136] ? xfs_ifree+0xed/0x9e0
[ 71.160290] xfs_ifree+0xed/0x9e0
[ 71.160432] ? write_comp_data+0x2f/0x90
[ 71.160600] ? xfs_trans_ijoin+0x47/0x70
[ 71.160768] ? __sanitizer_cov_trace_pc+0x25/0x60
[ 71.160964] ? xfs_trans_add_item+0x79/0x1c0
[ 71.161151] xfs_inactive_ifree+0xf8/0x2a0
[ 71.161324] xfs_inactive+0x226/0x340
[ 71.161482] xfs_inodegc_worker+0xd3/0x430
[ 71.161657] process_one_work+0x3b1/0x960
[ 71.161837] worker_thread+0x52/0x660
[ 71.161999] ? __pfx_worker_thread+0x10/0x10
[ 71.162184] kthread+0x161/0x1a0
[ 71.162331] ? __pfx_kthread+0x10/0x10
[ 71.162498] ret_from_fork+0x29/0x50
[ 71.162670] </TASK>
[ 71.162773] XFS (loop3): Corruption detected. Unmount and run xfs_repair
[ 71.163039] XFS (loop3): xfs_inactive_ifree: xfs_ifree returned error -117
[ 71.163869] repro: attempt to access beyond end of device
[ 71.163869] loop3: rw=432129, sector=65535, nr_sectors = 16 limit=65536
[ 71.164410] XFS (loop3): log I/O error -5
[ 71.166002] XFS (loop3): Metadata I/O Error (0x1) detected at xfs_inactive_ifree+0x232/0x2a0 (fs/xfs/xfs_inode.c:1612). Shutting down filesystem.
[ 71.166541] XFS (loop3): Please unmount the filesystem and rectify the problem(s)
[ 71.167188] XFS (loop1): DAX unsupported by block device. Turning off DAX.
[ 71.167530] XFS (loop7): DAX unsupported by block device. Turning off DAX.
[ 71.167859] XFS (loop6): DAX unsupported by block device. Turning off DAX.
[ 71.168188] XFS (loop4): DAX unsupported by block device. Turning off DAX.
[ 71.168554] XFS (loop1): Mounting V4 Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
[ 71.168995] XFS (loop7): Mounting V4 Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
[ 71.169595] XFS (loop2): Unmounting Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
[ 71.169614] XFS (loop6): Mounting V4 Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
[ 71.170444] XFS (loop4): Mounting V4 Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
[ 71.172161] XFS (loop3): Quotacheck: Unsuccessful (Error -5): Disabling quotas.
[ 71.172510] xfs filesystem being mounted at /root/syzkaller.mc3H24/0/file0 supports timestamps until 2038 (0x7fffffff)
[ 71.174633] XFS (loop3): Unmounting Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
[ 71.175936] XFS (loop5): DAX unsupported by block device. Turning off DAX.
[ 71.176303] XFS (loop5): Mounting V4 Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
[ 71.178679] XFS (loop4): totally zeroed log
[ 71.179069] XFS (loop4): Ending clean mount
[ 71.179442] XFS (loop4): Quotacheck needed: Please wait.
[ 71.183738] XFS (loop6): totally zeroed log
[ 71.184537] repro: attempt to access beyond end of device
[ 71.184537] loop4: rw=432129, sector=65535, nr_sectors = 16 limit=65536
[ 71.184637] XFS (loop6): Ending clean mount
[ 71.185052] XFS (loop4): log I/O error -5
[ 71.185339] XFS (loop6): Quotacheck needed: Please wait.
[ 71.185404] XFS (loop4): Filesystem has been shut down due to log error (0x2).
[ 71.185883] XFS (loop4): Please unmount the filesystem and rectify the problem(s).
[ 71.186372] XFS (loop4): Quotacheck: Unsuccessful (Error -5): Disabling quotas.
[ 71.186705] xfs filesystem being mounted at /root/syzkaller.uq7iOt/0/file0 supports timestamps until 2038 (0x7fffffff)
[ 71.187608] repro: attempt to access beyond end of device
[ 71.187608] loop6: rw=432129, sector=65535, nr_sectors = 16 limit=65536
[ 71.188127] XFS (loop6): log I/O error -5
[ 71.188317] XFS (loop6): Filesystem has been shut down due to log error (0x2).
[ 71.188599] XFS (loop6): Please unmount the filesystem and rectify the problem(s).
[ 71.188998] XFS (loop6): Quotacheck: Unsuccessful (Error -5): Disabling quotas.
[ 71.189324] xfs filesystem being mounted at /root/syzkaller.OjSrkA/0/file0 supports timestamps until 2038 (0x7fffffff)
[ 71.191782] XFS (loop4): Unmounting Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
[ 71.193911] XFS (loop6): Unmounting Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
[ 71.207573] XFS (loop1): totally zeroed log
[ 71.207813] XFS (loop7): totally zeroed log
[ 71.208396] XFS (loop7): Ending clean mount
[ 71.208664] XFS (loop1): Ending clean mount
[ 71.209093] XFS (loop1): Quotacheck needed: Please wait.
[ 71.210841] XFS (loop7): Quotacheck needed: Please wait.
[ 71.216659] repro: attempt to access beyond end of device
[ 71.216659] loop7: rw=432129, sector=65535, nr_sectors = 16 limit=65536
[ 71.217165] XFS (loop7): log I/O error -5
[ 71.217375] XFS (loop7): Filesystem has been shut down due to log error (0x2).
[ 71.217674] XFS (loop7): Please unmount the filesystem and rectify the problem(s).
[ 71.218071] XFS (loop7): Quotacheck: Unsuccessful (Error -5): Disabling quotas.
[ 71.218402] xfs filesystem being mounted at /root/syzkaller.Q5dMMG/0/file0 supports timestamps until 2038 (0x7fffffff)
[ 71.219906] XFS (loop7): Unmounting Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
[ 71.224443] repro: attempt to access beyond end of device
[ 71.224443] loop1: rw=432129, sector=65535, nr_sectors = 16 limit=65536
[ 71.224960] XFS (loop1): log I/O error -5
[ 71.225151] XFS (loop1): Filesystem has been shut down due to log error (0x2).
[ 71.225543] XFS (loop1): Please unmount the filesystem and rectify the problem(s).
[ 71.225966] XFS (loop1): Quotacheck: Unsuccessful (Error -5): Disabling quotas.
[ 71.226310] xfs filesystem being mounted at /root/syzkaller.qCVHXV/0/file0 supports timestamps until 2038 (0x7fffffff)
[ 71.227591] BUG: kernel NULL pointer dereference, address: 00000000000002a8
[ 71.227873] #PF: supervisor read access in kernel mode
[ 71.228077] #PF: error_code(0x0000) - not-present page
[ 71.228280] PGD c313067 P4D c313067 PUD c1fe067 PMD 0
[ 71.228494] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 71.228673] CPU: 0 PID: 161 Comm: kworker/0:4 Not tainted 6.2.0-c9c3395d5e3d #1
[ 71.228961] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 71.229400] Workqueue: xfs-inodegc/loop1 xfs_inodegc_worker
[ 71.229626] RIP: 0010:xfs_dquot_alloc+0x95/0x1e0
[ 71.229820] Code: 80 15 ad 85 48 c7 c6 7c 6b 92 83 e8 75 0f 6b ff 49 8b 8d 60 01 00 00 44 89 e0 31 d2 48 c7 c6 18 ae 8f 83 48 8d bb 18 02 00 00 <f7> b1 a8 02 2
[ 71.230528] RSP: 0018:ffffc90000babc20 EFLAGS: 00010246
[ 71.230737] RAX: 0000000000000009 RBX: ffff8880093c98c0 RCX: 0000000000000000
[ 71.231014] RDX: 0000000000000000 RSI: ffffffff838fae18 RDI: ffff8880093c9ad8
[ 71.231292] RBP: ffffc90000babc48 R08: 0000000000000002 R09: 0000000000000000
[ 71.231570] R10: ffffc90000baba80 R11: ffff88800af08d98 R12: 0000000000000009
[ 71.231850] R13: ffff88800c4bc000 R14: ffff88800c4bc000 R15: 0000000000000004
[ 71.232129] FS: 0000000000000000(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
[ 71.232441] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 71.232668] CR2: 00000000000002a8 CR3: 000000000a1d2002 CR4: 0000000000770ef0
[ 71.232949] PKRU: 55555554
[ 71.233061] Call Trace:
[ 71.233162] <TASK>
[ 71.233254] xfs_qm_dqread+0x46/0x440
[ 71.233410] ? xfs_qm_dqget_inode+0x13e/0x500
[ 71.233596] xfs_qm_dqget_inode+0x154/0x500
[ 71.233774] xfs_qm_dqattach_one+0x142/0x3c0
[ 71.233961] xfs_qm_dqattach_locked+0x14a/0x170
[ 71.234149] xfs_qm_dqattach+0x52/0x80
[ 71.234307] xfs_inactive+0x186/0x340
[ 71.234461] xfs_inodegc_worker+0xd3/0x430
[ 71.234630] process_one_work+0x3b1/0x960
[ 71.234802] worker_thread+0x52/0x660
[ 71.234957] ? __pfx_worker_thread+0x10/0x10
[ 71.235136] kthread+0x161/0x1a0
[ 71.235279] ? __pfx_kthread+0x10/0x10
[ 71.235442] ret_from_fork+0x29/0x50
[ 71.235602] </TASK>
[ 71.235696] Modules linked in:
[ 71.235826] CR2: 00000000000002a8
[ 71.235964] ---[ end trace 0000000000000000 ]---
Reporting the above issue and providing a reproduced way seems valuable.
But not sure report to who in kernel community.
So used RIP: "xfs_dquot_alloc" key word to bisect between v6.2 and v5.11.
I know there is BUG also but with some other RIP info.
Related commit:
29837019d5ebb80a5f180af3107a0645c731a770
Merge tag 'io_uring-5.19-2022-07-08' of git://git.kernel.dk/linux-block
This might not be the right point of suspicion.
Could you help to take a look or add the correct developer for this issue?
Thanks a lot!
---
If you don't need an environment to reproduce the problem or if you already
have one, please ignore the following information.
How to reproduce:
git clone https://gitlab.com/xupengfe/repro_vm_env.git
cd repro_vm_env
tar -xvf repro_vm_env.tar.gz
cd repro_vm_env; ./start3.sh // it needs qemu-system-x86_64 and I used v7.1.0
// start3.sh will load bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 v6.2-rc5 kernel
// You could change the bzImage_xxx as you want
In vm and login with root, there is no password for root.
After login vm successfully, you could transfer reproduced binary to the VM by below way, and reproduce the problem:
gcc -pthread -o repro repro.c
scp -P 10023 repro root@localhost:/root/
Get the bzImage for target kernel:
Please use target kconfig and copy it to kernel_src/.config
make olddefconfig
make -jx bzImage //x should equal or less than cpu num your pc has
Fill the bzImage file into above start3.sh to load the target kernel vm.
Tips:
If you already have qemu-system-x86_64, please ignore below info.
If you want to install qemu v7.1.0 version:
git clone https://github.com/qemu/qemu.git
cd qemu
git checkout -f v7.1.0
mkdir build
cd build
yum install -y ninja-build.x86_64
../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-sdl
make
make install
Thanks!
BR.
Looping in xfs mailing list as this seems to be a XFS problem...
On 2/24/23 05:39, Pengfei Xu wrote:
> Hi Helge Deller,
>
> Greeting!
>
> Reproduced code: https://github.com/xupengfe/syzkaller_logs/blob/main/230222_152458_xfs_dquot_alloc_bug/repro.c
> Kconfig: https://github.com/xupengfe/syzkaller_logs/blob/main/230222_152458_xfs_dquot_alloc_bug/kconfig_origin
> Issue dmesg: https://github.com/xupengfe/syzkaller_logs/blob/main/230222_152458_xfs_dquot_alloc_bug/v6.2_c9c3395d5e3dcc6daee66c6908354d47bf98cb0c_dmesg.log
> Bisect info(Might not be correct this time, but just gave some clues to the problem)
> https://github.com/xupengfe/syzkaller_logs/blob/main/230222_152458_xfs_dquot_alloc_bug/bisect_info.log
>
> All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/230222_152458_xfs_dquot_alloc_bug
>
> Platform: ADL-S, and it could be reproduced on x86 platform in guest.
> There is "xfs_dquot_alloc" related BUG in v6.2:
>
> [ 71.149963] xfs filesystem being mounted at /root/syzkaller.6TPmw0/0/file0 supports timestamps until 2038 (0x7fffffff)
> [ 71.150653] 00000000: 58 41 47 49 00 00 00 01 00 00 00 00 00 00 80 00 XAGI............
> [ 71.151006] 00000010: 00 00 00 40 00 00 00 06 00 00 00 01 00 00 00 37 [email protected]
> [ 71.151321] 00000020: 00 00 00 20 ff ff ff ff ff ff ff ff ff ff ff ff ... ............
> [ 71.151633] 00000030: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
> [ 71.151946] 00000040: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
> [ 71.152259] 00000050: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
> [ 71.152570] 00000060: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
> [ 71.152881] 00000070: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
> [ 71.153193] 00000080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
> [ 71.153607] 00000090: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
> [ 71.153921] 000000a0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
> [ 71.154237] 000000b0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
> [ 71.154549] 000000c0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
> [ 71.154865] 000000d0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
> [ 71.155180] 000000e0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
> [ 71.155494] 000000f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
> [ 71.155807] 00000100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
> [ 71.156119] 00000110: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
> [ 71.156433] 00000120: ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00 ................
> [ 71.156747] 00000130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> [ 71.157059] 00000140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> [ 71.157382] 00000150: 00 00 00 00 00 00 00 00 ........
> [ 71.157671] XFS (loop3): Internal error xfs_iunlink_remove_inode at line 2013 of file fs/xfs/xfs_inode.c. Caller xfs_ifree+0xed/0x9e0
> [ 71.158154] CPU: 1 PID: 137 Comm: kworker/1:3 Not tainted 6.2.0-c9c3395d5e3d #1
> [ 71.158447] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> [ 71.158890] Workqueue: xfs-inodegc/loop3 xfs_inodegc_worker
> [ 71.159117] Call Trace:
> [ 71.159220] <TASK>
> [ 71.159313] dump_stack_lvl+0xa7/0xdb
> [ 71.159478] dump_stack+0x19/0x1f
> [ 71.159621] xfs_corruption_error+0xd7/0xe0
> [ 71.159805] ? xfs_ifree+0xed/0x9e0
> [ 71.159957] xfs_iunlink_remove+0x32e/0x590
> [ 71.160136] ? xfs_ifree+0xed/0x9e0
> [ 71.160290] xfs_ifree+0xed/0x9e0
> [ 71.160432] ? write_comp_data+0x2f/0x90
> [ 71.160600] ? xfs_trans_ijoin+0x47/0x70
> [ 71.160768] ? __sanitizer_cov_trace_pc+0x25/0x60
> [ 71.160964] ? xfs_trans_add_item+0x79/0x1c0
> [ 71.161151] xfs_inactive_ifree+0xf8/0x2a0
> [ 71.161324] xfs_inactive+0x226/0x340
> [ 71.161482] xfs_inodegc_worker+0xd3/0x430
> [ 71.161657] process_one_work+0x3b1/0x960
> [ 71.161837] worker_thread+0x52/0x660
> [ 71.161999] ? __pfx_worker_thread+0x10/0x10
> [ 71.162184] kthread+0x161/0x1a0
> [ 71.162331] ? __pfx_kthread+0x10/0x10
> [ 71.162498] ret_from_fork+0x29/0x50
> [ 71.162670] </TASK>
> [ 71.162773] XFS (loop3): Corruption detected. Unmount and run xfs_repair
> [ 71.163039] XFS (loop3): xfs_inactive_ifree: xfs_ifree returned error -117
> [ 71.163869] repro: attempt to access beyond end of device
> [ 71.163869] loop3: rw=432129, sector=65535, nr_sectors = 16 limit=65536
> [ 71.164410] XFS (loop3): log I/O error -5
> [ 71.166002] XFS (loop3): Metadata I/O Error (0x1) detected at xfs_inactive_ifree+0x232/0x2a0 (fs/xfs/xfs_inode.c:1612). Shutting down filesystem.
> [ 71.166541] XFS (loop3): Please unmount the filesystem and rectify the problem(s)
> [ 71.167188] XFS (loop1): DAX unsupported by block device. Turning off DAX.
> [ 71.167530] XFS (loop7): DAX unsupported by block device. Turning off DAX.
> [ 71.167859] XFS (loop6): DAX unsupported by block device. Turning off DAX.
> [ 71.168188] XFS (loop4): DAX unsupported by block device. Turning off DAX.
> [ 71.168554] XFS (loop1): Mounting V4 Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
> [ 71.168995] XFS (loop7): Mounting V4 Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
> [ 71.169595] XFS (loop2): Unmounting Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
> [ 71.169614] XFS (loop6): Mounting V4 Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
> [ 71.170444] XFS (loop4): Mounting V4 Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
> [ 71.172161] XFS (loop3): Quotacheck: Unsuccessful (Error -5): Disabling quotas.
> [ 71.172510] xfs filesystem being mounted at /root/syzkaller.mc3H24/0/file0 supports timestamps until 2038 (0x7fffffff)
> [ 71.174633] XFS (loop3): Unmounting Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
> [ 71.175936] XFS (loop5): DAX unsupported by block device. Turning off DAX.
> [ 71.176303] XFS (loop5): Mounting V4 Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
> [ 71.178679] XFS (loop4): totally zeroed log
> [ 71.179069] XFS (loop4): Ending clean mount
> [ 71.179442] XFS (loop4): Quotacheck needed: Please wait.
> [ 71.183738] XFS (loop6): totally zeroed log
> [ 71.184537] repro: attempt to access beyond end of device
> [ 71.184537] loop4: rw=432129, sector=65535, nr_sectors = 16 limit=65536
> [ 71.184637] XFS (loop6): Ending clean mount
> [ 71.185052] XFS (loop4): log I/O error -5
> [ 71.185339] XFS (loop6): Quotacheck needed: Please wait.
> [ 71.185404] XFS (loop4): Filesystem has been shut down due to log error (0x2).
> [ 71.185883] XFS (loop4): Please unmount the filesystem and rectify the problem(s).
> [ 71.186372] XFS (loop4): Quotacheck: Unsuccessful (Error -5): Disabling quotas.
> [ 71.186705] xfs filesystem being mounted at /root/syzkaller.uq7iOt/0/file0 supports timestamps until 2038 (0x7fffffff)
> [ 71.187608] repro: attempt to access beyond end of device
> [ 71.187608] loop6: rw=432129, sector=65535, nr_sectors = 16 limit=65536
> [ 71.188127] XFS (loop6): log I/O error -5
> [ 71.188317] XFS (loop6): Filesystem has been shut down due to log error (0x2).
> [ 71.188599] XFS (loop6): Please unmount the filesystem and rectify the problem(s).
> [ 71.188998] XFS (loop6): Quotacheck: Unsuccessful (Error -5): Disabling quotas.
> [ 71.189324] xfs filesystem being mounted at /root/syzkaller.OjSrkA/0/file0 supports timestamps until 2038 (0x7fffffff)
> [ 71.191782] XFS (loop4): Unmounting Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
> [ 71.193911] XFS (loop6): Unmounting Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
> [ 71.207573] XFS (loop1): totally zeroed log
> [ 71.207813] XFS (loop7): totally zeroed log
> [ 71.208396] XFS (loop7): Ending clean mount
> [ 71.208664] XFS (loop1): Ending clean mount
> [ 71.209093] XFS (loop1): Quotacheck needed: Please wait.
> [ 71.210841] XFS (loop7): Quotacheck needed: Please wait.
> [ 71.216659] repro: attempt to access beyond end of device
> [ 71.216659] loop7: rw=432129, sector=65535, nr_sectors = 16 limit=65536
> [ 71.217165] XFS (loop7): log I/O error -5
> [ 71.217375] XFS (loop7): Filesystem has been shut down due to log error (0x2).
> [ 71.217674] XFS (loop7): Please unmount the filesystem and rectify the problem(s).
> [ 71.218071] XFS (loop7): Quotacheck: Unsuccessful (Error -5): Disabling quotas.
> [ 71.218402] xfs filesystem being mounted at /root/syzkaller.Q5dMMG/0/file0 supports timestamps until 2038 (0x7fffffff)
> [ 71.219906] XFS (loop7): Unmounting Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
> [ 71.224443] repro: attempt to access beyond end of device
> [ 71.224443] loop1: rw=432129, sector=65535, nr_sectors = 16 limit=65536
> [ 71.224960] XFS (loop1): log I/O error -5
> [ 71.225151] XFS (loop1): Filesystem has been shut down due to log error (0x2).
> [ 71.225543] XFS (loop1): Please unmount the filesystem and rectify the problem(s).
> [ 71.225966] XFS (loop1): Quotacheck: Unsuccessful (Error -5): Disabling quotas.
> [ 71.226310] xfs filesystem being mounted at /root/syzkaller.qCVHXV/0/file0 supports timestamps until 2038 (0x7fffffff)
> [ 71.227591] BUG: kernel NULL pointer dereference, address: 00000000000002a8
> [ 71.227873] #PF: supervisor read access in kernel mode
> [ 71.228077] #PF: error_code(0x0000) - not-present page
> [ 71.228280] PGD c313067 P4D c313067 PUD c1fe067 PMD 0
> [ 71.228494] Oops: 0000 [#1] PREEMPT SMP NOPTI
> [ 71.228673] CPU: 0 PID: 161 Comm: kworker/0:4 Not tainted 6.2.0-c9c3395d5e3d #1
> [ 71.228961] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> [ 71.229400] Workqueue: xfs-inodegc/loop1 xfs_inodegc_worker
> [ 71.229626] RIP: 0010:xfs_dquot_alloc+0x95/0x1e0
> [ 71.229820] Code: 80 15 ad 85 48 c7 c6 7c 6b 92 83 e8 75 0f 6b ff 49 8b 8d 60 01 00 00 44 89 e0 31 d2 48 c7 c6 18 ae 8f 83 48 8d bb 18 02 00 00 <f7> b1 a8 02 2
> [ 71.230528] RSP: 0018:ffffc90000babc20 EFLAGS: 00010246
> [ 71.230737] RAX: 0000000000000009 RBX: ffff8880093c98c0 RCX: 0000000000000000
> [ 71.231014] RDX: 0000000000000000 RSI: ffffffff838fae18 RDI: ffff8880093c9ad8
> [ 71.231292] RBP: ffffc90000babc48 R08: 0000000000000002 R09: 0000000000000000
> [ 71.231570] R10: ffffc90000baba80 R11: ffff88800af08d98 R12: 0000000000000009
> [ 71.231850] R13: ffff88800c4bc000 R14: ffff88800c4bc000 R15: 0000000000000004
> [ 71.232129] FS: 0000000000000000(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
> [ 71.232441] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 71.232668] CR2: 00000000000002a8 CR3: 000000000a1d2002 CR4: 0000000000770ef0
> [ 71.232949] PKRU: 55555554
> [ 71.233061] Call Trace:
> [ 71.233162] <TASK>
> [ 71.233254] xfs_qm_dqread+0x46/0x440
> [ 71.233410] ? xfs_qm_dqget_inode+0x13e/0x500
> [ 71.233596] xfs_qm_dqget_inode+0x154/0x500
> [ 71.233774] xfs_qm_dqattach_one+0x142/0x3c0
> [ 71.233961] xfs_qm_dqattach_locked+0x14a/0x170
> [ 71.234149] xfs_qm_dqattach+0x52/0x80
> [ 71.234307] xfs_inactive+0x186/0x340
> [ 71.234461] xfs_inodegc_worker+0xd3/0x430
> [ 71.234630] process_one_work+0x3b1/0x960
> [ 71.234802] worker_thread+0x52/0x660
> [ 71.234957] ? __pfx_worker_thread+0x10/0x10
> [ 71.235136] kthread+0x161/0x1a0
> [ 71.235279] ? __pfx_kthread+0x10/0x10
> [ 71.235442] ret_from_fork+0x29/0x50
> [ 71.235602] </TASK>
> [ 71.235696] Modules linked in:
> [ 71.235826] CR2: 00000000000002a8
> [ 71.235964] ---[ end trace 0000000000000000 ]---
>
> Reporting the above issue and providing a reproduced way seems valuable.
>
> But not sure report to who in kernel community.
> So used RIP: "xfs_dquot_alloc" key word to bisect between v6.2 and v5.11.
> I know there is BUG also but with some other RIP info.
>
> Related commit:
> 29837019d5ebb80a5f180af3107a0645c731a770
> Merge tag 'io_uring-5.19-2022-07-08' of git://git.kernel.dk/linux-block
>
> This might not be the right point of suspicion.
> Could you help to take a look or add the correct developer for this issue?
> Thanks a lot!
>
> ---
>
> If you don't need an environment to reproduce the problem or if you already
> have one, please ignore the following information.
>
> How to reproduce:
> git clone https://gitlab.com/xupengfe/repro_vm_env.git
> cd repro_vm_env
> tar -xvf repro_vm_env.tar.gz
> cd repro_vm_env; ./start3.sh // it needs qemu-system-x86_64 and I used v7.1.0
> // start3.sh will load bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 v6.2-rc5 kernel
> // You could change the bzImage_xxx as you want
> In vm and login with root, there is no password for root.
>
> After login vm successfully, you could transfer reproduced binary to the VM by below way, and reproduce the problem:
> gcc -pthread -o repro repro.c
> scp -P 10023 repro root@localhost:/root/
>
> Get the bzImage for target kernel:
> Please use target kconfig and copy it to kernel_src/.config
> make olddefconfig
> make -jx bzImage //x should equal or less than cpu num your pc has
>
> Fill the bzImage file into above start3.sh to load the target kernel vm.
>
>
> Tips:
> If you already have qemu-system-x86_64, please ignore below info.
> If you want to install qemu v7.1.0 version:
> git clone https://github.com/qemu/qemu.git
> cd qemu
> git checkout -f v7.1.0
> mkdir build
> cd build
> yum install -y ninja-build.x86_64
> ../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-sdl
> make
> make install
>
> Thanks!
> BR.
On Sat, Feb 25, 2023 at 08:58:25PM +0100, Helge Deller wrote:
> Looping in xfs mailing list as this seems to be a XFS problem...
> On 2/24/23 05:39, Pengfei Xu wrote:
> > [ 71.225966] XFS (loop1): Quotacheck: Unsuccessful (Error -5): Disabling quotas.
> > [ 71.226310] xfs filesystem being mounted at /root/syzkaller.qCVHXV/0/file0 supports timestamps until 2038 (0x7fffffff)
> > [ 71.227591] BUG: kernel NULL pointer dereference, address: 00000000000002a8
> > [ 71.227873] #PF: supervisor read access in kernel mode
> > [ 71.228077] #PF: error_code(0x0000) - not-present page
> > [ 71.228280] PGD c313067 P4D c313067 PUD c1fe067 PMD 0
> > [ 71.228494] Oops: 0000 [#1] PREEMPT SMP NOPTI
> > [ 71.228673] CPU: 0 PID: 161 Comm: kworker/0:4 Not tainted 6.2.0-c9c3395d5e3d #1
> > [ 71.228961] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> > [ 71.229400] Workqueue: xfs-inodegc/loop1 xfs_inodegc_worker
> > [ 71.229626] RIP: 0010:xfs_dquot_alloc+0x95/0x1e0
> > [ 71.229820] Code: 80 15 ad 85 48 c7 c6 7c 6b 92 83 e8 75 0f 6b ff 49 8b 8d 60 01 00 00 44 89 e0 31 d2 48 c7 c6 18 ae 8f 83 48 8d bb 18 02 00 00 <f7> b1 a8 02 2
> > [ 71.230528] RSP: 0018:ffffc90000babc20 EFLAGS: 00010246
> > [ 71.230737] RAX: 0000000000000009 RBX: ffff8880093c98c0 RCX: 0000000000000000
> > [ 71.231014] RDX: 0000000000000000 RSI: ffffffff838fae18 RDI: ffff8880093c9ad8
> > [ 71.231292] RBP: ffffc90000babc48 R08: 0000000000000002 R09: 0000000000000000
> > [ 71.231570] R10: ffffc90000baba80 R11: ffff88800af08d98 R12: 0000000000000009
> > [ 71.231850] R13: ffff88800c4bc000 R14: ffff88800c4bc000 R15: 0000000000000004
> > [ 71.232129] FS: 0000000000000000(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
> > [ 71.232441] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 71.232668] CR2: 00000000000002a8 CR3: 000000000a1d2002 CR4: 0000000000770ef0
> > [ 71.232949] PKRU: 55555554
> > [ 71.233061] Call Trace:
> > [ 71.233162] <TASK>
> > [ 71.233254] xfs_qm_dqread+0x46/0x440
> > [ 71.233410] ? xfs_qm_dqget_inode+0x13e/0x500
> > [ 71.233596] xfs_qm_dqget_inode+0x154/0x500
> > [ 71.233774] xfs_qm_dqattach_one+0x142/0x3c0
> > [ 71.233961] xfs_qm_dqattach_locked+0x14a/0x170
> > [ 71.234149] xfs_qm_dqattach+0x52/0x80
> > [ 71.234307] xfs_inactive+0x186/0x340
> > [ 71.234461] xfs_inodegc_worker+0xd3/0x430
> > [ 71.234630] process_one_work+0x3b1/0x960
> > [ 71.234802] worker_thread+0x52/0x660
> > [ 71.234957] ? __pfx_worker_thread+0x10/0x10
> > [ 71.235136] kthread+0x161/0x1a0
> > [ 71.235279] ? __pfx_kthread+0x10/0x10
> > [ 71.235442] ret_from_fork+0x29/0x50
> > [ 71.235602] </TASK>
> > [ 71.235696] Modules linked in:
> > [ 71.235826] CR2: 00000000000002a8
> > [ 71.235964] ---[ end trace 0000000000000000 ]---
Looks like a quota disable race with background inode inactivation
reading in dquots.
Can you test the patch below?
-Dave.
--
Dave Chinner
[email protected]
xfs: quotacheck failure can race with background inode inactivation
From: Dave Chinner <[email protected]>
The background inode inactivation can attached dquots to inodes, but
this can race with a foreground quotacheck failure that leads to
disabling quotas and freeing the mp->m_quotainfo structure. The
background inode inactivation then tries to allocate a quota, tries
to dereference mp->m_quotainfo, and crashes like so:
XFS (loop1): Quotacheck: Unsuccessful (Error -5): Disabling quotas.
xfs filesystem being mounted at /root/syzkaller.qCVHXV/0/file0 supports timestamps until 2038 (0x7fffffff)
BUG: kernel NULL pointer dereference, address: 00000000000002a8
....
CPU: 0 PID: 161 Comm: kworker/0:4 Not tainted 6.2.0-c9c3395d5e3d #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
Workqueue: xfs-inodegc/loop1 xfs_inodegc_worker
RIP: 0010:xfs_dquot_alloc+0x95/0x1e0
....
Call Trace:
<TASK>
xfs_qm_dqread+0x46/0x440
xfs_qm_dqget_inode+0x154/0x500
xfs_qm_dqattach_one+0x142/0x3c0
xfs_qm_dqattach_locked+0x14a/0x170
xfs_qm_dqattach+0x52/0x80
xfs_inactive+0x186/0x340
xfs_inodegc_worker+0xd3/0x430
process_one_work+0x3b1/0x960
worker_thread+0x52/0x660
kthread+0x161/0x1a0
ret_from_fork+0x29/0x50
</TASK>
....
Prevent this race by flushing all the queued background inode
inactivations pending before purging all the cached dquots when
quotacheck fails.
Reported-by: Pengfei Xu <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>
---
fs/xfs/xfs_qm.c | 40 ++++++++++++++++++++++++++--------------
1 file changed, 26 insertions(+), 14 deletions(-)
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index e2c542f6dcd4..78ca52e55f03 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -1321,15 +1321,14 @@ xfs_qm_quotacheck(
error = xfs_iwalk_threaded(mp, 0, 0, xfs_qm_dqusage_adjust, 0, true,
NULL);
- if (error) {
- /*
- * The inode walk may have partially populated the dquot
- * caches. We must purge them before disabling quota and
- * tearing down the quotainfo, or else the dquots will leak.
- */
- xfs_qm_dqpurge_all(mp);
- goto error_return;
- }
+
+ /*
+ * On error, the inode walk may have partially populated the dquot
+ * caches. We must purge them before disabling quota and tearing down
+ * the quotainfo, or else the dquots will leak.
+ */
+ if (error)
+ goto error_purge;
/*
* We've made all the changes that we need to make incore. Flush them
@@ -1363,10 +1362,8 @@ xfs_qm_quotacheck(
* and turn quotaoff. The dquots won't be attached to any of the inodes
* at this point (because we intentionally didn't in dqget_noattach).
*/
- if (error) {
- xfs_qm_dqpurge_all(mp);
- goto error_return;
- }
+ if (error)
+ goto error_purge;
/*
* If one type of quotas is off, then it will lose its
@@ -1376,7 +1373,7 @@ xfs_qm_quotacheck(
mp->m_qflags &= ~XFS_ALL_QUOTA_CHKD;
mp->m_qflags |= flags;
- error_return:
+error_return:
xfs_buf_delwri_cancel(&buffer_list);
if (error) {
@@ -1395,6 +1392,21 @@ xfs_qm_quotacheck(
} else
xfs_notice(mp, "Quotacheck: Done.");
return error;
+
+error_purge:
+ /*
+ * On error, we may have inodes queued for inactivation. This may try
+ * to attach dquots to the inode before running cleanup operations on
+ * the inode and this can race with the xfs_qm_destroy_quotainfo() call
+ * below that frees mp->m_quotainfo. To avoid this race, flush all the
+ * pending inodegc operations before we purge the dquots from memory,
+ * ensuring that background inactivation is idle whilst we turn off
+ * quotas.
+ */
+ xfs_inodegc_flush(mp);
+ xfs_qm_dqpurge_all(mp);
+ goto error_return;
+
}
/*
Hi Dave and Helge Deller,
Thanks Helge Deller to add the xfs mailing list!
On 2023-02-27 at 09:34:03 +1100, Dave Chinner wrote:
> On Sat, Feb 25, 2023 at 08:58:25PM +0100, Helge Deller wrote:
> > Looping in xfs mailing list as this seems to be a XFS problem...
> > On 2/24/23 05:39, Pengfei Xu wrote:
> > > [ 71.225966] XFS (loop1): Quotacheck: Unsuccessful (Error -5): Disabling quotas.
> > > [ 71.226310] xfs filesystem being mounted at /root/syzkaller.qCVHXV/0/file0 supports timestamps until 2038 (0x7fffffff)
> > > [ 71.227591] BUG: kernel NULL pointer dereference, address: 00000000000002a8
> > > [ 71.227873] #PF: supervisor read access in kernel mode
> > > [ 71.228077] #PF: error_code(0x0000) - not-present page
> > > [ 71.228280] PGD c313067 P4D c313067 PUD c1fe067 PMD 0
> > > [ 71.228494] Oops: 0000 [#1] PREEMPT SMP NOPTI
> > > [ 71.228673] CPU: 0 PID: 161 Comm: kworker/0:4 Not tainted 6.2.0-c9c3395d5e3d #1
> > > [ 71.228961] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> > > [ 71.229400] Workqueue: xfs-inodegc/loop1 xfs_inodegc_worker
> > > [ 71.229626] RIP: 0010:xfs_dquot_alloc+0x95/0x1e0
> > > [ 71.229820] Code: 80 15 ad 85 48 c7 c6 7c 6b 92 83 e8 75 0f 6b ff 49 8b 8d 60 01 00 00 44 89 e0 31 d2 48 c7 c6 18 ae 8f 83 48 8d bb 18 02 00 00 <f7> b1 a8 02 2
> > > [ 71.230528] RSP: 0018:ffffc90000babc20 EFLAGS: 00010246
> > > [ 71.230737] RAX: 0000000000000009 RBX: ffff8880093c98c0 RCX: 0000000000000000
> > > [ 71.231014] RDX: 0000000000000000 RSI: ffffffff838fae18 RDI: ffff8880093c9ad8
> > > [ 71.231292] RBP: ffffc90000babc48 R08: 0000000000000002 R09: 0000000000000000
> > > [ 71.231570] R10: ffffc90000baba80 R11: ffff88800af08d98 R12: 0000000000000009
> > > [ 71.231850] R13: ffff88800c4bc000 R14: ffff88800c4bc000 R15: 0000000000000004
> > > [ 71.232129] FS: 0000000000000000(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
> > > [ 71.232441] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [ 71.232668] CR2: 00000000000002a8 CR3: 000000000a1d2002 CR4: 0000000000770ef0
> > > [ 71.232949] PKRU: 55555554
> > > [ 71.233061] Call Trace:
> > > [ 71.233162] <TASK>
> > > [ 71.233254] xfs_qm_dqread+0x46/0x440
> > > [ 71.233410] ? xfs_qm_dqget_inode+0x13e/0x500
> > > [ 71.233596] xfs_qm_dqget_inode+0x154/0x500
> > > [ 71.233774] xfs_qm_dqattach_one+0x142/0x3c0
> > > [ 71.233961] xfs_qm_dqattach_locked+0x14a/0x170
> > > [ 71.234149] xfs_qm_dqattach+0x52/0x80
> > > [ 71.234307] xfs_inactive+0x186/0x340
> > > [ 71.234461] xfs_inodegc_worker+0xd3/0x430
> > > [ 71.234630] process_one_work+0x3b1/0x960
> > > [ 71.234802] worker_thread+0x52/0x660
> > > [ 71.234957] ? __pfx_worker_thread+0x10/0x10
> > > [ 71.235136] kthread+0x161/0x1a0
> > > [ 71.235279] ? __pfx_kthread+0x10/0x10
> > > [ 71.235442] ret_from_fork+0x29/0x50
> > > [ 71.235602] </TASK>
> > > [ 71.235696] Modules linked in:
> > > [ 71.235826] CR2: 00000000000002a8
> > > [ 71.235964] ---[ end trace 0000000000000000 ]---
>
> Looks like a quota disable race with background inode inactivation
> reading in dquots.
>
> Can you test the patch below?
>
Thanks for your fixed patch in short time!
I installed below patch on top of v6.2 kernel.
And there was no any BUG dmesg anyway, so the fixed patch worked.
And left some "Internal error xfs_iunlink_remove_inode" related Call Trace.
I'm new to xfs, could you help to check if it is some other issue or we
could ignore it.
I put the dmesg in bugzilla attachment as follow:
https://bugzilla.kernel.org/show_bug.cgi?id=217078 ->
https://bugzilla.kernel.org/attachment.cgi?id=303793
Thanks!
BR.
-Pengfei.
> -Dave.
> --
> Dave Chinner
> [email protected]
>
>
> xfs: quotacheck failure can race with background inode inactivation
>
> From: Dave Chinner <[email protected]>
>
> The background inode inactivation can attached dquots to inodes, but
> this can race with a foreground quotacheck failure that leads to
> disabling quotas and freeing the mp->m_quotainfo structure. The
> background inode inactivation then tries to allocate a quota, tries
> to dereference mp->m_quotainfo, and crashes like so:
>
> XFS (loop1): Quotacheck: Unsuccessful (Error -5): Disabling quotas.
> xfs filesystem being mounted at /root/syzkaller.qCVHXV/0/file0 supports timestamps until 2038 (0x7fffffff)
> BUG: kernel NULL pointer dereference, address: 00000000000002a8
> ....
> CPU: 0 PID: 161 Comm: kworker/0:4 Not tainted 6.2.0-c9c3395d5e3d #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> Workqueue: xfs-inodegc/loop1 xfs_inodegc_worker
> RIP: 0010:xfs_dquot_alloc+0x95/0x1e0
> ....
> Call Trace:
> <TASK>
> xfs_qm_dqread+0x46/0x440
> xfs_qm_dqget_inode+0x154/0x500
> xfs_qm_dqattach_one+0x142/0x3c0
> xfs_qm_dqattach_locked+0x14a/0x170
> xfs_qm_dqattach+0x52/0x80
> xfs_inactive+0x186/0x340
> xfs_inodegc_worker+0xd3/0x430
> process_one_work+0x3b1/0x960
> worker_thread+0x52/0x660
> kthread+0x161/0x1a0
> ret_from_fork+0x29/0x50
> </TASK>
> ....
>
> Prevent this race by flushing all the queued background inode
> inactivations pending before purging all the cached dquots when
> quotacheck fails.
>
> Reported-by: Pengfei Xu <[email protected]>
> Signed-off-by: Dave Chinner <[email protected]>
> ---
> fs/xfs/xfs_qm.c | 40 ++++++++++++++++++++++++++--------------
> 1 file changed, 26 insertions(+), 14 deletions(-)
>
> diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
> index e2c542f6dcd4..78ca52e55f03 100644
> --- a/fs/xfs/xfs_qm.c
> +++ b/fs/xfs/xfs_qm.c
> @@ -1321,15 +1321,14 @@ xfs_qm_quotacheck(
>
> error = xfs_iwalk_threaded(mp, 0, 0, xfs_qm_dqusage_adjust, 0, true,
> NULL);
> - if (error) {
> - /*
> - * The inode walk may have partially populated the dquot
> - * caches. We must purge them before disabling quota and
> - * tearing down the quotainfo, or else the dquots will leak.
> - */
> - xfs_qm_dqpurge_all(mp);
> - goto error_return;
> - }
> +
> + /*
> + * On error, the inode walk may have partially populated the dquot
> + * caches. We must purge them before disabling quota and tearing down
> + * the quotainfo, or else the dquots will leak.
> + */
> + if (error)
> + goto error_purge;
>
> /*
> * We've made all the changes that we need to make incore. Flush them
> @@ -1363,10 +1362,8 @@ xfs_qm_quotacheck(
> * and turn quotaoff. The dquots won't be attached to any of the inodes
> * at this point (because we intentionally didn't in dqget_noattach).
> */
> - if (error) {
> - xfs_qm_dqpurge_all(mp);
> - goto error_return;
> - }
> + if (error)
> + goto error_purge;
>
> /*
> * If one type of quotas is off, then it will lose its
> @@ -1376,7 +1373,7 @@ xfs_qm_quotacheck(
> mp->m_qflags &= ~XFS_ALL_QUOTA_CHKD;
> mp->m_qflags |= flags;
>
> - error_return:
> +error_return:
> xfs_buf_delwri_cancel(&buffer_list);
>
> if (error) {
> @@ -1395,6 +1392,21 @@ xfs_qm_quotacheck(
> } else
> xfs_notice(mp, "Quotacheck: Done.");
> return error;
> +
> +error_purge:
> + /*
> + * On error, we may have inodes queued for inactivation. This may try
> + * to attach dquots to the inode before running cleanup operations on
> + * the inode and this can race with the xfs_qm_destroy_quotainfo() call
> + * below that frees mp->m_quotainfo. To avoid this race, flush all the
> + * pending inodegc operations before we purge the dquots from memory,
> + * ensuring that background inactivation is idle whilst we turn off
> + * quotas.
> + */
> + xfs_inodegc_flush(mp);
> + xfs_qm_dqpurge_all(mp);
> + goto error_return;
> +
> }
>
> /*
On Mon, Feb 27, 2023 at 03:29:12PM +0800, Pengfei Xu wrote:
> Hi Dave and Helge Deller,
>
> Thanks Helge Deller to add the xfs mailing list!
>
> On 2023-02-27 at 09:34:03 +1100, Dave Chinner wrote:
> > On Sat, Feb 25, 2023 at 08:58:25PM +0100, Helge Deller wrote:
> > > Looping in xfs mailing list as this seems to be a XFS problem...
> > > On 2/24/23 05:39, Pengfei Xu wrote:
> > > > [ 71.225966] XFS (loop1): Quotacheck: Unsuccessful (Error -5): Disabling quotas.
> > > > [ 71.226310] xfs filesystem being mounted at /root/syzkaller.qCVHXV/0/file0 supports timestamps until 2038 (0x7fffffff)
> > > > [ 71.227591] BUG: kernel NULL pointer dereference, address: 00000000000002a8
> > > > [ 71.227873] #PF: supervisor read access in kernel mode
> > > > [ 71.228077] #PF: error_code(0x0000) - not-present page
> > > > [ 71.228280] PGD c313067 P4D c313067 PUD c1fe067 PMD 0
> > > > [ 71.228494] Oops: 0000 [#1] PREEMPT SMP NOPTI
> > > > [ 71.228673] CPU: 0 PID: 161 Comm: kworker/0:4 Not tainted 6.2.0-c9c3395d5e3d #1
> > > > [ 71.228961] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> > > > [ 71.229400] Workqueue: xfs-inodegc/loop1 xfs_inodegc_worker
> > > > [ 71.229626] RIP: 0010:xfs_dquot_alloc+0x95/0x1e0
> > > > [ 71.229820] Code: 80 15 ad 85 48 c7 c6 7c 6b 92 83 e8 75 0f 6b ff 49 8b 8d 60 01 00 00 44 89 e0 31 d2 48 c7 c6 18 ae 8f 83 48 8d bb 18 02 00 00 <f7> b1 a8 02 2
> > > > [ 71.230528] RSP: 0018:ffffc90000babc20 EFLAGS: 00010246
> > > > [ 71.230737] RAX: 0000000000000009 RBX: ffff8880093c98c0 RCX: 0000000000000000
> > > > [ 71.231014] RDX: 0000000000000000 RSI: ffffffff838fae18 RDI: ffff8880093c9ad8
> > > > [ 71.231292] RBP: ffffc90000babc48 R08: 0000000000000002 R09: 0000000000000000
> > > > [ 71.231570] R10: ffffc90000baba80 R11: ffff88800af08d98 R12: 0000000000000009
> > > > [ 71.231850] R13: ffff88800c4bc000 R14: ffff88800c4bc000 R15: 0000000000000004
> > > > [ 71.232129] FS: 0000000000000000(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
> > > > [ 71.232441] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > [ 71.232668] CR2: 00000000000002a8 CR3: 000000000a1d2002 CR4: 0000000000770ef0
> > > > [ 71.232949] PKRU: 55555554
> > > > [ 71.233061] Call Trace:
> > > > [ 71.233162] <TASK>
> > > > [ 71.233254] xfs_qm_dqread+0x46/0x440
> > > > [ 71.233410] ? xfs_qm_dqget_inode+0x13e/0x500
> > > > [ 71.233596] xfs_qm_dqget_inode+0x154/0x500
> > > > [ 71.233774] xfs_qm_dqattach_one+0x142/0x3c0
> > > > [ 71.233961] xfs_qm_dqattach_locked+0x14a/0x170
> > > > [ 71.234149] xfs_qm_dqattach+0x52/0x80
> > > > [ 71.234307] xfs_inactive+0x186/0x340
> > > > [ 71.234461] xfs_inodegc_worker+0xd3/0x430
> > > > [ 71.234630] process_one_work+0x3b1/0x960
> > > > [ 71.234802] worker_thread+0x52/0x660
> > > > [ 71.234957] ? __pfx_worker_thread+0x10/0x10
> > > > [ 71.235136] kthread+0x161/0x1a0
> > > > [ 71.235279] ? __pfx_kthread+0x10/0x10
> > > > [ 71.235442] ret_from_fork+0x29/0x50
> > > > [ 71.235602] </TASK>
> > > > [ 71.235696] Modules linked in:
> > > > [ 71.235826] CR2: 00000000000002a8
> > > > [ 71.235964] ---[ end trace 0000000000000000 ]---
> >
> > Looks like a quota disable race with background inode inactivation
> > reading in dquots.
> >
> > Can you test the patch below?
> >
> Thanks for your fixed patch in short time!
> I installed below patch on top of v6.2 kernel.
> And there was no any BUG dmesg anyway, so the fixed patch worked.
>
> And left some "Internal error xfs_iunlink_remove_inode" related Call Trace.
> I'm new to xfs, could you help to check if it is some other issue or we
> could ignore it.
I'm guessing this the filesystem detecting a corruption and shutting
down. That's normal behaviour when tools like syzkaller through
random crap at the filesystem and expect it to like it.
> I put the dmesg in bugzilla attachment as follow:
> https://bugzilla.kernel.org/show_bug.cgi?id=217078 ->
> https://bugzilla.kernel.org/attachment.cgi?id=303793
I am not authorised to access bug 217078, so I can't read any of
this. Just cut out and attach the relevant dmesg output to the email.
-Dave.
--
Dave Chinner
[email protected]
Hi Dave,
On 2023-02-28 at 08:46:07 +1100, Dave Chinner wrote:
> On Mon, Feb 27, 2023 at 03:29:12PM +0800, Pengfei Xu wrote:
> > Hi Dave and Helge Deller,
> >
> > Thanks Helge Deller to add the xfs mailing list!
> >
> > On 2023-02-27 at 09:34:03 +1100, Dave Chinner wrote:
> > > On Sat, Feb 25, 2023 at 08:58:25PM +0100, Helge Deller wrote:
> > > > Looping in xfs mailing list as this seems to be a XFS problem...
> > > > On 2/24/23 05:39, Pengfei Xu wrote:
> > > > > [ 71.225966] XFS (loop1): Quotacheck: Unsuccessful (Error -5): Disabling quotas.
> > > > > [ 71.226310] xfs filesystem being mounted at /root/syzkaller.qCVHXV/0/file0 supports timestamps until 2038 (0x7fffffff)
> > > > > [ 71.227591] BUG: kernel NULL pointer dereference, address: 00000000000002a8
> > > > > [ 71.227873] #PF: supervisor read access in kernel mode
> > > > > [ 71.228077] #PF: error_code(0x0000) - not-present page
> > > > > [ 71.228280] PGD c313067 P4D c313067 PUD c1fe067 PMD 0
> > > > > [ 71.228494] Oops: 0000 [#1] PREEMPT SMP NOPTI
> > > > > [ 71.228673] CPU: 0 PID: 161 Comm: kworker/0:4 Not tainted 6.2.0-c9c3395d5e3d #1
> > > > > [ 71.228961] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> > > > > [ 71.229400] Workqueue: xfs-inodegc/loop1 xfs_inodegc_worker
> > > > > [ 71.229626] RIP: 0010:xfs_dquot_alloc+0x95/0x1e0
> > > > > [ 71.229820] Code: 80 15 ad 85 48 c7 c6 7c 6b 92 83 e8 75 0f 6b ff 49 8b 8d 60 01 00 00 44 89 e0 31 d2 48 c7 c6 18 ae 8f 83 48 8d bb 18 02 00 00 <f7> b1 a8 02 2
> > > > > [ 71.230528] RSP: 0018:ffffc90000babc20 EFLAGS: 00010246
> > > > > [ 71.230737] RAX: 0000000000000009 RBX: ffff8880093c98c0 RCX: 0000000000000000
> > > > > [ 71.231014] RDX: 0000000000000000 RSI: ffffffff838fae18 RDI: ffff8880093c9ad8
> > > > > [ 71.231292] RBP: ffffc90000babc48 R08: 0000000000000002 R09: 0000000000000000
> > > > > [ 71.231570] R10: ffffc90000baba80 R11: ffff88800af08d98 R12: 0000000000000009
> > > > > [ 71.231850] R13: ffff88800c4bc000 R14: ffff88800c4bc000 R15: 0000000000000004
> > > > > [ 71.232129] FS: 0000000000000000(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
> > > > > [ 71.232441] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > [ 71.232668] CR2: 00000000000002a8 CR3: 000000000a1d2002 CR4: 0000000000770ef0
> > > > > [ 71.232949] PKRU: 55555554
> > > > > [ 71.233061] Call Trace:
> > > > > [ 71.233162] <TASK>
> > > > > [ 71.233254] xfs_qm_dqread+0x46/0x440
> > > > > [ 71.233410] ? xfs_qm_dqget_inode+0x13e/0x500
> > > > > [ 71.233596] xfs_qm_dqget_inode+0x154/0x500
> > > > > [ 71.233774] xfs_qm_dqattach_one+0x142/0x3c0
> > > > > [ 71.233961] xfs_qm_dqattach_locked+0x14a/0x170
> > > > > [ 71.234149] xfs_qm_dqattach+0x52/0x80
> > > > > [ 71.234307] xfs_inactive+0x186/0x340
> > > > > [ 71.234461] xfs_inodegc_worker+0xd3/0x430
> > > > > [ 71.234630] process_one_work+0x3b1/0x960
> > > > > [ 71.234802] worker_thread+0x52/0x660
> > > > > [ 71.234957] ? __pfx_worker_thread+0x10/0x10
> > > > > [ 71.235136] kthread+0x161/0x1a0
> > > > > [ 71.235279] ? __pfx_kthread+0x10/0x10
> > > > > [ 71.235442] ret_from_fork+0x29/0x50
> > > > > [ 71.235602] </TASK>
> > > > > [ 71.235696] Modules linked in:
> > > > > [ 71.235826] CR2: 00000000000002a8
> > > > > [ 71.235964] ---[ end trace 0000000000000000 ]---
> > >
> > > Looks like a quota disable race with background inode inactivation
> > > reading in dquots.
> > >
> > > Can you test the patch below?
> > >
> > Thanks for your fixed patch in short time!
> > I installed below patch on top of v6.2 kernel.
> > And there was no any BUG dmesg anyway, so the fixed patch worked.
> >
> > And left some "Internal error xfs_iunlink_remove_inode" related Call Trace.
> > I'm new to xfs, could you help to check if it is some other issue or we
> > could ignore it.
>
> I'm guessing this the filesystem detecting a corruption and shutting
> down. That's normal behaviour when tools like syzkaller through
> random crap at the filesystem and expect it to like it.
>
> > I put the dmesg in bugzilla attachment as follow:
> > https://bugzilla.kernel.org/show_bug.cgi?id=217078 ->
> > https://bugzilla.kernel.org/attachment.cgi?id=303793
>
> I am not authorised to access bug 217078, so I can't read any of
> this. Just cut out and attach the relevant dmesg output to the email.
>
Thanks for your check! Yes, there was "XFS (loop6): Corruption detected.
Unmount and run xfs_repair" found at the end of Call Trace info.
Sorry for unaccess link, I put the related dmesg output as follow:
And I put the whole dmesg in github:
https://raw.githubusercontent.com/xupengfe/syzkaller_logs/main/230222_152458_xfs_dquot_alloc_bug/xfs_fixed_BUG_issue_in_v6.2.log
[ 198.601079] XFS (loop2): Ending clean mount
[ 198.601338] XFS (loop6): Quotacheck needed: Please wait.
[ 198.602067] XFS (loop2): Quotacheck needed: Please wait.
[ 198.607537] 00000000: 58 41 47 49 00 00 00 01 00 00 00 00 00 00 80 00 XAGI............
[ 198.607902] 00000010: 00 00 00 40 00 00 00 06 00 00 00 01 00 00 00 37 [email protected]
[ 198.608223] 00000020: 00 00 00 20 ff ff ff ff ff ff ff ff ff ff ff ff ... ............
[ 198.608553] 00000030: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 198.608875] 00000040: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 198.609195] 00000050: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 198.609514] 00000060: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 198.609833] 00000070: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 198.610153] 00000080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 198.610473] 00000090: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 198.610793] 000000a0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 198.611123] 000000b0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 198.611443] 000000c0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 198.611763] 000000d0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 198.612082] 000000e0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 198.612402] 000000f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 198.612733] 00000100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 198.613052] 00000110: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
[ 198.613371] 00000120: ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00 ................
[ 198.613690] 00000130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[ 198.614009] 00000140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[ 198.614328] 00000150: 00 00 00 00 00 00 00 00 ........
[ 198.614621] XFS (loop6): Internal error xfs_iunlink_remove_inode at line 2013 of file fs/xfs/xfs_inode.c. Caller xfs_ifree+0xed/0x9e0
[ 198.615112] CPU: 0 PID: 291 Comm: kworker/0:6 Not tainted 6.2.0-kvmxfs+ #19
[ 198.615394] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 198.615844] Workqueue: xfs-inodegc/loop6 xfs_inodegc_worker
[ 198.616074] Call Trace:
[ 198.616179] <TASK>
[ 198.616274] dump_stack_lvl+0xa7/0xdb
[ 198.616441] dump_stack+0x19/0x1f
[ 198.616587] xfs_corruption_error+0xd7/0xe0
[ 198.616771] ? xfs_ifree+0xed/0x9e0
[ 198.616925] xfs_iunlink_remove+0x32e/0x590
[ 198.617108] ? xfs_ifree+0xed/0x9e0
[ 198.617265] xfs_ifree+0xed/0x9e0
[ 198.617410] ? write_comp_data+0x2f/0x90
[ 198.617580] ? xfs_trans_ijoin+0x47/0x70
[ 198.617750] ? __sanitizer_cov_trace_pc+0x25/0x60
[ 198.617950] ? xfs_trans_add_item+0x79/0x1c0
[ 198.618139] xfs_inactive_ifree+0xf8/0x2a0
[ 198.618316] xfs_inactive+0x226/0x340
[ 198.618475] xfs_inodegc_worker+0xd3/0x430
[ 198.618651] process_one_work+0x3b1/0x960
[ 198.618832] worker_thread+0x52/0x660
[ 198.618993] ? __pfx_worker_thread+0x10/0x10
[ 198.619179] kthread+0x161/0x1a0
[ 198.619326] ? __pfx_kthread+0x10/0x10
[ 198.619495] ret_from_fork+0x29/0x50
[ 198.619666] </TASK>
[ 198.619768] XFS (loop6): Corruption detected. Unmount and run xfs_repair
[ 198.620038] XFS (loop6): xfs_inactive_ifree: xfs_ifree returned error -117
[ 198.657300] kworker/0:6: attempt to access beyond end of device
[ 198.657300] loop6: rw=432129, sector=65535, nr_sectors = 16 limit=65536
[ 198.657940] XFS (loop6): log I/O error -5
[ 198.658295] loop4: detected capacity change from 0 to 65536
[ 198.658706] XFS: ikeep mount option is deprecated.
[ 198.659850] XFS (loop6): Metadata I/O Error (0x1) detected at xfs_inactive_ifree+0x232/0x2a0 (fs/xfs/xfs_inode.c:1612). Shutting down filesystem.
[ 198.660387] XFS (loop6): Please unmount the filesystem and rectify the problem(s)
[ 198.660739] XFS (loop6): Quotacheck: Unsuccessful (Error -5): Disabling quotas.
Thanks!
BR.
-Pengfei.
> -Dave.
> --
> Dave Chinner
> [email protected]