Subject: kvm-vm boot fail: ttm_bo_cleanup_memtype_use?

Hey,

I am getting a bug around:

"Workqueue: events ttm_device_delayed_workqueue [ttm]"

when using Linus' kernel with HEAD:

commit a180bd1d7e16173d965b263c5a536aa40afa2a2a (HEAD -> master, origin/master, origin/HEAD)
Author: Linus Torvalds <[email protected]>
Date: Sun Jul 4 16:12:42 2021 -0700

while booting a kvm vm.

The config is attached, and the bug messages are:

[ 3.044604] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-3.fc34 04/01/2014
[ 3.044607] Workqueue: events ttm_device_delayed_workqueue [ttm]
[ 3.044616] RIP: 0010:kfence_unprotect+0x2c/0x90
[ 3.044620] Code: 44 00 00 55 48 81 e7 00 f0 ff ff 48 89 fd 48 83 ec 08 48 8d 74 24 04 e8 f2 42 d2 ff 48 85 c0 74 07 83 7c 24 04 01 74 13 0f 0b <0f> 0b c6 05 4b 50 cc 01 00 48 83 c4 08 31 c0 5d c3 48 8b 38 48 89
[ 3.044623] RSP: 0018:ffffbcbe80633c50 EFLAGS: 00010046
[ 3.044626] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff98c26000
[ 3.044629] RDX: ffffbcbe80633c54 RSI: 0000000000000000 RDI: ffffffff98c26000
[ 3.044631] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 3.044633] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 3.044635] R13: ffffbcbe80633ce8 R14: 0000000000000000 R15: 0000000000000000
[ 3.044639] FS: 0000000000000000(0000) GS:ffff9e3ff7b00000(0000) knlGS:0000000000000000
[ 3.044641] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3.044644] CR2: 0000000000000010 CR3: 0000000244c26000 CR4: 0000000000750ee0
[ 3.044648] PKRU: 55555554
[ 3.044650] Call Trace:
[ 3.044653] page_fault_oops+0x89/0x270
[ 3.044666] exc_page_fault+0x79/0x260
[ 3.044673] asm_exc_page_fault+0x1e/0x30
[ 3.044676] RIP: 0010:qxl_bo_delete_mem_notify+0x19/0x40 [qxl]
[ 3.044683] Code: 89 e7 45 31 e4 e8 27 cb fd d6 eb ea 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 fd e8 a2 02 00 00 84 c0 74 0d 48 8b 85 48 02 00 00 <83> 78 10 03 74 02 5d c3 8b 85 44 03 00 00 85 c0 74 f4 48 8b 7d 08
[ 3.044685] RSP: 0018:ffffbcbe80633d98 EFLAGS: 00010202
[ 3.044688] RAX: 0000000000000000 RBX: ffff9e3e94d30d80 RCX: 0000000000000000
[ 3.044690] RDX: 0000000000000003 RSI: ffff9e3e94d31d18 RDI: ffff9e3e84612400
[ 3.044692] RBP: ffff9e3e84612400 R08: 0000000000000000 R09: 0000000000000001
[ 3.044694] R10: ffffbcbe80633c90 R11: 000000000004cc80 R12: ffff9e3e94d31d00
[ 3.044696] R13: ffff9e3e94d30d80 R14: ffff9e3e84612670 R15: ffff9e3e84612670
[ 3.044715] ? qxl_bo_delete_mem_notify+0xe/0x40 [qxl]
[ 3.044722] ttm_bo_cleanup_memtype_use+0x22/0x60 [ttm]
[ 3.044731] ttm_bo_release+0x28e/0x540 [ttm]
[ 3.044745] ttm_bo_delayed_delete+0x1be/0x220 [ttm]
[ 3.044761] ttm_device_delayed_workqueue+0x18/0x40 [ttm]
[ 3.044774] process_one_work+0x26e/0x550
[ 3.044788] worker_thread+0x52/0x3b0
[ 3.044792] ? process_one_work+0x550/0x550
[ 3.044799] kthread+0x138/0x160
[ 3.044803] ? set_kthread_struct+0x40/0x40
[ 3.044811] ret_from_fork+0x22/0x30
[ 3.044832] irq event stamp: 2552
[ 3.044833] hardirqs last enabled at (2551): [<ffffffff978c990d>] seqcount_lockdep_reader_access+0x7d/0x90
[ 3.044838] hardirqs last disabled at (2552): [<ffffffff97cf3d98>] exc_page_fault+0x38/0x260
[ 3.044841] softirqs last enabled at (2496): [<ffffffff971c30d4>] css_free_rwork_fn+0x74/0x590
[ 3.044844] softirqs last disabled at (2494): [<ffffffff971c30b9>] css_free_rwork_fn+0x59/0x590
[ 3.044847] ---[ end trace 3fbaaa179830c812 ]---
[ 3.044850] BUG: kernel NULL pointer dereference, address: 0000000000000010
[ 3.044851] #PF: supervisor read access in kernel mode
[ 3.044853] #PF: error_code(0x0000) - not-present page
[ 3.044854] PGD 0 P4D 0
[ 3.044857] Oops: 0000 [#1] SMP NOPTI
[ 3.044859] CPU: 12 PID: 162 Comm: kworker/12:1 Tainted: G W 5.13.0+ #31
[ 3.044861] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-3.fc34 04/01/2014
[ 3.044863] Workqueue: events ttm_device_delayed_workqueue [ttm]
[ 3.044870] RIP: 0010:qxl_bo_delete_mem_notify+0x19/0x40 [qxl]
[ 3.044875] Code: 89 e7 45 31 e4 e8 27 cb fd d6 eb ea 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 fd e8 a2 02 00 00 84 c0 74 0d 48 8b 85 48 02 00 00 <83> 78 10 03 74 02 5d c3 8b 85 44 03 00 00 85 c0 74 f4 48 8b 7d 08
[ 3.044877] RSP: 0018:ffffbcbe80633d98 EFLAGS: 00010202
[ 3.044879] RAX: 0000000000000000 RBX: ffff9e3e94d30d80 RCX: 0000000000000000
[ 3.044881] RDX: 0000000000000003 RSI: ffff9e3e94d31d18 RDI: ffff9e3e84612400
[ 3.044882] RBP: ffff9e3e84612400 R08: 0000000000000000 R09: 0000000000000001
[ 3.044883] R10: ffffbcbe80633c90 R11: 000000000004cc80 R12: ffff9e3e94d31d00
[ 3.044885] R13: ffff9e3e94d30d80 R14: ffff9e3e84612670 R15: ffff9e3e84612670
[ 3.044889] FS: 0000000000000000(0000) GS:ffff9e3ff7b00000(0000) knlGS:0000000000000000
[ 3.044891] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3.044893] CR2: 0000000000000010 CR3: 0000000244c26000 CR4: 0000000000750ee0
[ 3.044897] PKRU: 55555554
[ 3.044897] Call Trace:
[ 3.044899] ttm_bo_cleanup_memtype_use+0x22/0x60 [ttm]
[ 3.044906] ttm_bo_release+0x28e/0x540 [ttm]
[ 3.044914] ttm_bo_delayed_delete+0x1be/0x220 [ttm]
[ 3.044923] ttm_device_delayed_workqueue+0x18/0x40 [ttm]
[ 3.044929] process_one_work+0x26e/0x550
[ 3.044934] worker_thread+0x52/0x3b0
[ 3.044936] ? process_one_work+0x550/0x550
[ 3.044939] kthread+0x138/0x160
[ 3.044942] ? set_kthread_struct+0x40/0x40
[ 3.044946] ret_from_fork+0x22/0x30
[ 3.044952] Modules linked in: xfs qxl drm_ttm_helper ttm drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel cec drm virtio_net ghash_clmulni_intel serio_raw virtio_console net_failover failover virtio_blk qemu_fw_cfg fuse
[ 3.044967] CR2: 0000000000000010
[ 3.044969] ---[ end trace 3fbaaa179830c813 ]---
[ 3.044970] RIP: 0010:qxl_bo_delete_mem_notify+0x19/0x40 [qxl]
[ 3.044975] Code: 89 e7 45 31 e4 e8 27 cb fd d6 eb ea 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 fd e8 a2 02 00 00 84 c0 74 0d 48 8b 85 48 02 00 00 <83> 78 10 03 74 02 5d c3 8b 85 44 03 00 00 85 c0 74 f4 48 8b 7d 08
[ 3.044977] RSP: 0018:ffffbcbe80633d98 EFLAGS: 00010202
[ 3.044979] RAX: 0000000000000000 RBX: ffff9e3e94d30d80 RCX: 0000000000000000
[ 3.044980] RDX: 0000000000000003 RSI: ffff9e3e94d31d18 RDI: ffff9e3e84612400
[ 3.044982] RBP: ffff9e3e84612400 R08: 0000000000000000 R09: 0000000000000001
[ 3.044983] R10: ffffbcbe80633c90 R11: 000000000004cc80 R12: ffff9e3e94d31d00
[ 3.044985] R13: ffff9e3e94d30d80 R14: ffff9e3e84612670 R15: ffff9e3e84612670
[ 3.044989] FS: 0000000000000000(0000) GS:ffff9e3ff7b00000(0000) knlGS:0000000000000000
[ 3.044991] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3.044992] CR2: 0000000000000010 CR3: 0000000244c26000 CR4: 0000000000750ee0
[ 3.044996] PKRU: 55555554
[ 3.061648] ------------[ cut here ]------------
[ 3.117815] WARNING: CPU: 10 PID: 142 at arch/x86/include/asm/kfence.h:44 kfence_unprotect+0x2a/0x90
[ 3.117821] Modules linked in: xfs qxl drm_ttm_helper ttm drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel cec drm virtio_net ghash_clmulni_intel serio_raw virtio_console net_failover failover virtio_blk qemu_fw_cfg fuse
[ 3.119709] CPU: 10 PID: 142 Comm: kworker/10:1 Tainted: G D W 5.13.0+ #31
[ 3.119711] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-3.fc34 04/01/2014
[ 3.119712] Workqueue: events drm_fb_helper_damage_work [drm_kms_helper]
[ 3.123843] RIP: 0010:kfence_unprotect+0x2a/0x90
[ 3.123845] Code: 0f 1f 44 00 00 55 48 81 e7 00 f0 ff ff 48 89 fd 48 83 ec 08 48 8d 74 24 04 e8 f2 42 d2 ff 48 85 c0 74 07 83 7c 24 04 01 74 13 <0f> 0b 0f 0b c6 05 4b 50 cc 01 00 48 83 c4 08 31 c0 5d c3 48 8b 38
[ 3.123846] RSP: 0018:ffffbcbe80593810 EFLAGS: 00010046
[ 3.123847] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff98c26000
[ 3.123848] RDX: ffffbcbe80593814 RSI: 0000000000000000 RDI: ffffffff98c26000
[ 3.123848] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 3.123849] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 3.123849] R13: ffffbcbe805938a8 R14: 0000000000000000 R15: 0000000000000000
[ 3.123851] FS: 0000000000000000(0000) GS:ffff9e3ff7a80000(0000) knlGS:0000000000000000
[ 3.123852] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3.123852] CR2: 0000000000000010 CR3: 0000000113040000 CR4: 0000000000750ee0
[ 3.123854] PKRU: 55555554
[ 3.134797] Call Trace:
[ 3.134800] page_fault_oops+0x89/0x270
[ OK ] Stopped Create list of sta… nodes for the current kernel.
[ 3.134803] ? _raw_spin_unlock_irqrestore+0x37/0x40
[ OK ] Stopped Create System Users.
[ 3.136749] exc_page_fault+0x79/0x260
[ 3.136752] asm_exc_page_fault+0x1e/0x30
[ 3.136753] RIP: 0010:qxl_bo_delete_mem_notify+0x19/0x40 [qxl]
[ 3.136756] Code: 89 e7 45 31 e4 e8 27 cb fd d6 eb ea 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 fd e8 a2 02 00 00 84 c0 74 0d 48 8b 85 48 02 00 00 <83> 78 10 03 74 02 5d c3 8b 85 44 03 00 00 85 c0 74 f4 48 8b 7d 08
[ 3.136757] RSP: 0018:ffffbcbe80593958 EFLAGS: 00010202
[ 3.136757] RAX: 0000000000000000 RBX: ffff9e3e957a22a8 RCX: 0000000000000000
[ 3.136758] RDX: ffff9e3e84617a70 RSI: ffffffffc034d705 RDI: ffff9e3e84617800
[ 3.136758] RBP: ffff9e3e84617800 R08: ffff9e3e84617a70 R09: 0000000000000000
[ 3.136759] R10: 0000000000000000 R11: 0000000000000001 R12: ffffbcbe80593ba0
[ 3.136760] R13: ffff9e3e94d30d80 R14: ffff9e3e84617a70 R15: ffff9e3e84617800
[ 3.136761] ? ttm_bo_release+0x285/0x540 [ttm]
[ 3.146192] ? qxl_bo_delete_mem_notify+0xe/0x40 [qxl]
[ 3.146194] ttm_bo_cleanup_memtype_use+0x22/0x60 [ttm]
[ 3.146196] ttm_bo_release+0x28e/0x540 [ttm]
[ 3.146199] ttm_mem_evict_first+0x306/0x480 [ttm]
[ 3.146202] ? ttm_range_man_alloc+0xe1/0xf0 [ttm]
[ 3.146204] ttm_bo_mem_space+0x24d/0x2b0 [ttm]
[ 3.146207] ttm_bo_validate+0xa9/0x1c0 [ttm]
[ 3.146209] ? mutex_trylock+0x116/0x130
[ 3.146210] ? ttm_bo_init_reserved+0x289/0x2b0 [ttm]
[ 3.146212] ttm_bo_init_reserved+0x213/0x2b0 [ttm]
[ 3.146214] qxl_bo_create+0x13b/0x240 [qxl]
[ 3.152840] ? qxl_ttm_debugfs_init+0xc0/0xc0 [qxl]
[ 3.152843] qxl_alloc_bo_reserved+0x2e/0x90 [qxl]
[ 3.152845] qxl_image_alloc_objects+0xab/0x120 [qxl]
[ 3.152847] qxl_draw_dirty_fb+0x14c/0x420 [qxl]
[ 3.152850] ? ww_mutex_lock_interruptible+0x30/0x90
[ 3.156624] ? modeset_lock+0x90/0x1c0 [drm]
[ 3.156637] qxl_framebuffer_surface_dirty+0xeb/0x1b0 [qxl]
[ 3.156640] drm_fb_helper_damage_work+0x18e/0x2d0 [drm_kms_helper]
[ 3.156647] process_one_work+0x26e/0x550
[ 3.156650] worker_thread+0x52/0x3b0
[ 3.156651] ? process_one_work+0x550/0x550
[ 3.156652] kthread+0x138/0x160
[ 3.156653] ? set_kthread_struct+0x40/0x40
[ 3.156655] ret_from_fork+0x22/0x30
[ 3.162750] irq event stamp: 1673
[ 3.162751] hardirqs last enabled at (1673): [<ffffffff97e00d82>] asm_sysvec_apic_timer_interrupt+0x12/0x20
[ 3.162752] hardirqs last disabled at (1671): [<ffffffff980003ee>] __do_softirq+0x3ee/0x485
[ 3.162754] softirqs last enabled at (1672): [<ffffffff970e4834>] __irq_exit_rcu+0xe4/0x110
[ 3.162756] softirqs last disabled at (1665): [<ffffffff970e4834>] __irq_exit_rcu+0xe4/0x110
[ 3.162757] ---[ end trace 3fbaaa179830c814 ]---
[ 3.162763] ------------[ cut here ]------------
[ 3.162764] WARNING: CPU: 10 PID: 142 at mm/kfence/core.c:135 kfence_unprotect+0x2c/0x90
[ 3.162765] Modules linked in: xfs qxl drm_ttm_helper ttm drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel cec drm virtio_net ghash_clmulni_intel serio_raw virtio_console net_failover failover virtio_blk qemu_fw_cfg fuse
[ 3.162769] CPU: 10 PID: 142 Comm: kworker/10:1 Tainted: G D W 5.13.0+ #31
[ 3.162770] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-3.fc34 04/01/2014
[ 3.162771] Workqueue: events drm_fb_helper_damage_work [drm_kms_helper]
[ 3.162776] RIP: 0010:kfence_unprotect+0x2c/0x90
[ 3.162778] Code: 44 00 00 55 48 81 e7 00 f0 ff ff 48 89 fd 48 83 ec 08 48 8d 74 24 04 e8 f2 42 d2 ff 48 85 c0 74 07 83 7c 24 04 01 74 13 0f 0b <0f> 0b c6 05 4b 50 cc 01 00 48 83 c4 08 31 c0 5d c3 48 8b 38 48 89
[ 3.162778] RSP: 0018:ffffbcbe80593810 EFLAGS: 00010046
[ 3.162779] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff98c26000
[ 3.162780] RDX: ffffbcbe80593814 RSI: 0000000000000000 RDI: ffffffff98c26000
[ 3.162780] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 3.162780] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 3.162781] R13: ffffbcbe805938a8 R14: 0000000000000000 R15: 0000000000000000
[ 3.162782] FS: 0000000000000000(0000) GS:ffff9e3ff7a80000(0000) knlGS:0000000000000000
[ 3.162783] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3.162783] CR2: 0000000000000010 CR3: 0000000113040000 CR4: 0000000000750ee0
[ 3.162785] PKRU: 55555554
[ 3.162785] Call Trace:
[ 3.162786] page_fault_oops+0x89/0x270
[ 3.162787] ? _raw_spin_unlock_irqrestore+0x37/0x40
[ 3.162789] exc_page_fault+0x79/0x260
[ 3.162790] asm_exc_page_fault+0x1e/0x30
[ 3.162791] RIP: 0010:qxl_bo_delete_mem_notify+0x19/0x40 [qxl]
[ 3.162793] Code: 89 e7 45 31 e4 e8 27 cb fd d6 eb ea 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 fd e8 a2 02 00 00 84 c0 74 0d 48 8b 85 48 02 00 00 <83> 78 10 03 74 02 5d c3 8b 85 44 03 00 00 85 c0 74 f4 48 8b 7d 08
[ 3.162793] RSP: 0018:ffffbcbe80593958 EFLAGS: 00010202
[ 3.162794] RAX: 0000000000000000 RBX: ffff9e3e957a22a8 RCX: 0000000000000000
[ 3.162794] RDX: ffff9e3e84617a70 RSI: ffffffffc034d705 RDI: ffff9e3e84617800
[ 3.162795] RBP: ffff9e3e84617800 R08: ffff9e3e84617a70 R09: 0000000000000000
[ 3.162795] R10: 0000000000000000 R11: 0000000000000001 R12: ffffbcbe80593ba0
[ 3.162795] R13: ffff9e3e94d30d80 R14: ffff9e3e84617a70 R15: ffff9e3e84617800
[ 3.162797] ? ttm_bo_release+0x285/0x540 [ttm]
[ 3.162799] ? qxl_bo_delete_mem_notify+0xe/0x40 [qxl]
[ 3.162801] ttm_bo_cleanup_memtype_use+0x22/0x60 [ttm]
[ 3.162803] ttm_bo_release+0x28e/0x540 [ttm]
[ 3.162806] ttm_mem_evict_first+0x306/0x480 [ttm]
[ 3.162808] ? ttm_range_man_alloc+0xe1/0xf0 [ttm]
[ 3.162811] ttm_bo_mem_space+0x24d/0x2b0 [ttm]
[ 3.162813] ttm_bo_validate+0xa9/0x1c0 [ttm]
[ 3.162815] ? mutex_trylock+0x116/0x130
[ 3.162816] ? ttm_bo_init_reserved+0x289/0x2b0 [ttm]
[ 3.162818] ttm_bo_init_reserved+0x213/0x2b0 [ttm]
[ 3.162820] qxl_bo_create+0x13b/0x240 [qxl]
[ 3.162822] ? qxl_ttm_debugfs_init+0xc0/0xc0 [qxl]
[ 3.162824] qxl_alloc_bo_reserved+0x2e/0x90 [qxl]
[ 3.162826] qxl_image_alloc_objects+0xab/0x120 [qxl]
[ 3.162828] qxl_draw_dirty_fb+0x14c/0x420 [qxl]
[ 3.162831] ? ww_mutex_lock_interruptible+0x30/0x90
[ 3.162831] ? modeset_lock+0x90/0x1c0 [drm]
[ 3.162842] qxl_framebuffer_surface_dirty+0xeb/0x1b0 [qxl]
[ 3.162845] drm_fb_helper_damage_work+0x18e/0x2d0 [drm_kms_helper]
[ 3.162851] process_one_work+0x26e/0x550
[ 3.162852] worker_thread+0x52/0x3b0
[ 3.162853] ? process_one_work+0x550/0x550
[ 3.162854] kthread+0x138/0x160
[ 3.162855] ? set_kthread_struct+0x40/0x40
[ 3.162857] ret_from_fork+0x22/0x30
[ 3.162859] irq event stamp: 1673
[ 3.162860] hardirqs last enabled at (1673): [<ffffffff97e00d82>] asm_sysvec_apic_timer_interrupt+0x12/0x20
[ 3.162861] hardirqs last disabled at (1671): [<ffffffff980003ee>] __do_softirq+0x3ee/0x485
[ 3.162862] softirqs last enabled at (1672): [<ffffffff970e4834>] __irq_exit_rcu+0xe4/0x110
[ 3.162864] softirqs last disabled at (1665): [<ffffffff970e4834>] __irq_exit_rcu+0xe4/0x110
[ 3.162865] ---[ end trace 3fbaaa179830c815 ]---
[ 3.162866] BUG: kernel NULL pointer dereference, address: 0000000000000010
[ 3.162866] #PF: supervisor read access in kernel mode
[ 3.162867] #PF: error_code(0x0000) - not-present page
[ 3.162867] PGD 0 P4D 0
[ 3.162868] Oops: 0000 [#2] SMP NOPTI
[ 3.162869] CPU: 10 PID: 142 Comm: kworker/10:1 Tainted: G D W 5.13.0+ #31
[ 3.162870] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-3.fc34 04/01/2014
[ 3.162871] Workqueue: events drm_fb_helper_damage_work [drm_kms_helper]
[ 3.162876] RIP: 0010:qxl_bo_delete_mem_notify+0x19/0x40 [qxl]
[ 3.162877] Code: 89 e7 45 31 e4 e8 27 cb fd d6 eb ea 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 fd e8 a2 02 00 00 84 c0 74 0d 48 8b 85 48 02 00 00 <83> 78 10 03 74 02 5d c3 8b 85 44 03 00 00 85 c0 74 f4 48 8b 7d 08
[ 3.162878] RSP: 0018:ffffbcbe80593958 EFLAGS: 00010202
[ 3.162878] RAX: 0000000000000000 RBX: ffff9e3e957a22a8 RCX: 0000000000000000
[ 3.162879] RDX: ffff9e3e84617a70 RSI: ffffffffc034d705 RDI: ffff9e3e84617800
[ 3.162879] RBP: ffff9e3e84617800 R08: ffff9e3e84617a70 R09: 0000000000000000
[ 3.162880] R10: 0000000000000000 R11: 0000000000000001 R12: ffffbcbe80593ba0
[ 3.162880] R13: ffff9e3e94d30d80 R14: ffff9e3e84617a70 R15: ffff9e3e84617800
[ 3.162881] FS: 0000000000000000(0000) GS:ffff9e3ff7a80000(0000) knlGS:0000000000000000
[ 3.162882] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3.162882] CR2: 0000000000000010 CR3: 0000000113040000 CR4: 0000000000750ee0
[ 3.162884] PKRU: 55555554
[ 3.162884] Call Trace:
[ 3.162884] ttm_bo_cleanup_memtype_use+0x22/0x60 [ttm]
[ 3.162887] ttm_bo_release+0x28e/0x540 [ttm]
[ 3.162889] ttm_mem_evict_first+0x306/0x480 [ttm]
[ 3.162891] ? ttm_range_man_alloc+0xe1/0xf0 [ttm]
[ 3.162894] ttm_bo_mem_space+0x24d/0x2b0 [ttm]
[ 3.162896] ttm_bo_validate+0xa9/0x1c0 [ttm]
[ 3.162898] ? mutex_trylock+0x116/0x130
[ 3.162899] ? ttm_bo_init_reserved+0x289/0x2b0 [ttm]
[ 3.162901] ttm_bo_init_reserved+0x213/0x2b0 [ttm]
[ 3.162903] qxl_bo_create+0x13b/0x240 [qxl]
[ 3.162905] ? qxl_ttm_debugfs_init+0xc0/0xc0 [qxl]
[ 3.162907] qxl_alloc_bo_reserved+0x2e/0x90 [qxl]
[ 3.162909] qxl_image_alloc_objects+0xab/0x120 [qxl]
[ 3.162911] qxl_draw_dirty_fb+0x14c/0x420 [qxl]
[ 3.162913] ? ww_mutex_lock_interruptible+0x30/0x90
[ 3.162913] ? modeset_lock+0x90/0x1c0 [drm]
[ 3.162922] qxl_framebuffer_surface_dirty+0xeb/0x1b0 [qxl]
[ 3.162925] drm_fb_helper_damage_work+0x18e/0x2d0 [drm_kms_helper]
[ 3.162931] process_one_work+0x26e/0x550
[ 3.162932] worker_thread+0x52/0x3b0
[ 3.162933] ? process_one_work+0x550/0x550
[ 3.162934] kthread+0x138/0x160
[ 3.162935] ? set_kthread_struct+0x40/0x40
[ 3.162936] ret_from_fork+0x22/0x30
[ 3.162938] Modules linked in: xfs qxl drm_ttm_helper ttm drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel cec drm virtio_net ghash_clmulni_intel serio_raw virtio_console net_failover failover virtio_blk qemu_fw_cfg fuse
[ 3.162943] CR2: 0000000000000010
[ 3.162944] ---[ end trace 3fbaaa179830c816 ]---
[ 3.162945] RIP: 0010:qxl_bo_delete_mem_notify+0x19/0x40 [qxl]
[ 3.162946] Code: 89 e7 45 31 e4 e8 27 cb fd d6 eb ea 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 fd e8 a2 02 00 00 84 c0 74 0d 48 8b 85 48 02 00 00 <83> 78 10 03 74 02 5d c3 8b 85 44 03 00 00 85 c0 74 f4 48 8b 7d 08
[ 3.162947] RSP: 0018:ffffbcbe80633d98 EFLAGS: 00010202
[ 3.162947] RAX: 0000000000000000 RBX: ffff9e3e94d30d80 RCX: 0000000000000000
[ 3.162948] RDX: 0000000000000003 RSI: ffff9e3e94d31d18 RDI: ffff9e3e84612400
[ 3.162948] RBP: ffff9e3e84612400 R08: 0000000000000000 R09: 0000000000000001
[ 3.162948] R10: ffffbcbe80633c90 R11: 000000000004cc80 R12: ffff9e3e94d31d00
[ 3.162949] R13: ffff9e3e94d30d80 R14: ffff9e3e84612670 R15: ffff9e3e84612670
[ 3.162950] FS: 0000000000000000(0000) GS:ffff9e3ff7a80000(0000) knlGS:0000000000000000
[ 3.162951] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3.162951] CR2: 0000000000000010 CR3: 0000000113040000 CR4: 0000000000750ee0
[ 3.162952] PKRU: 55555554


Attachments:
config.txt (155.39 kB)

2021-07-06 06:55:08

by Christian König

Subject: Re: kvm-vm boot fail: ttm_bo_cleanup_memtype_use?

Hi Daniel,

This looks like a simple missing NULL check to me. Please test the attached
patch.

We recently changed the structure of the resources, so this was probably
introduced while doing that.

Christian.
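
The attached patch is not reproduced in this thread. As a rough
illustration only, the kind of check described above can be sketched in
plain C. The oops shows a read at address 0x10 with RAX = 0 inside
qxl_bo_delete_mem_notify, which is consistent with a field being read
through a NULL pointer. The struct and function names below
(sketch_resource, sketch_bo, sketch_delete_mem_notify, SKETCH_PL_PRIV)
are hypothetical stand-ins, not the real TTM or qxl definitions, and the
actual fix may look different.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical placement value, chosen only for this sketch. */
#define SKETCH_PL_PRIV 3u

/* Simplified stand-in for a resource descriptor; not the kernel type. */
struct sketch_resource {
	unsigned long start;
	unsigned long num_pages;
	uint32_t mem_type;
};

/* Simplified stand-in for a buffer object; not the kernel type. */
struct sketch_bo {
	struct sketch_resource *resource; /* assumed to be NULL once released */
	uint32_t surface_id;
};

/*
 * Returns true when the caller would go on to evict the surface.
 * The NULL test runs before bo->resource->mem_type is read, so a
 * buffer object without a resource is skipped instead of faulting.
 */
static bool sketch_delete_mem_notify(const struct sketch_bo *bo)
{
	if (!bo || !bo->resource)
		return false;

	return bo->resource->mem_type == SKETCH_PL_PRIV && bo->surface_id != 0;
}
```

With this guard, a buffer object whose resource pointer is already NULL
is ignored on the delete path rather than dereferenced.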

On 05.07.21 at 13:57, Daniel Bristot de Oliveira wrote:
> Hey,
>
> I am getting a bug around:
>
> "Workqueue: events ttm_device_delayed_workqueue [ttm]"
>
> when using Linus' kernel with HEAD:
>
> commit a180bd1d7e16173d965b263c5a536aa40afa2a2a (HEAD -> master, origin/master, origin/HEAD)
> Author: Linus Torvalds <[email protected]>
> Date: Sun Jul 4 16:12:42 2021 -0700
>
> while booting a kvm vm.
>
> The config is attached, and the bug messages are:
>
> [ 3.044604] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-3.fc34 04/01/2014
> [ 3.044607] Workqueue: events ttm_device_delayed_workqueue [ttm]
> [ 3.044616] RIP: 0010:kfence_unprotect+0x2c/0x90
> [ 3.044620] Code: 44 00 00 55 48 81 e7 00 f0 ff ff 48 89 fd 48 83 ec 08 48 8d 74 24 04 e8 f2 42 d2 ff 48 85 c0 74 07 83 7c 24 04 01 74 13 0f 0b <0f> 0b c6 05 4b 50 cc 01 00 48 83 c4 08 31 c0 5d c3 48 8b 38 48 89
> [ 3.044623] RSP: 0018:ffffbcbe80633c50 EFLAGS: 00010046
> [ 3.044626] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff98c26000
> [ 3.044629] RDX: ffffbcbe80633c54 RSI: 0000000000000000 RDI: ffffffff98c26000
> [ 3.044631] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> [ 3.044633] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> [ 3.044635] R13: ffffbcbe80633ce8 R14: 0000000000000000 R15: 0000000000000000
> [ 3.044639] FS: 0000000000000000(0000) GS:ffff9e3ff7b00000(0000) knlGS:0000000000000000
> [ 3.044641] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 3.044644] CR2: 0000000000000010 CR3: 0000000244c26000 CR4: 0000000000750ee0
> [ 3.044648] PKRU: 55555554
> [ 3.044650] Call Trace:
> [ 3.044653] page_fault_oops+0x89/0x270
> [ 3.044666] exc_page_fault+0x79/0x260
> [ 3.044673] asm_exc_page_fault+0x1e/0x30
> [ 3.044676] RIP: 0010:qxl_bo_delete_mem_notify+0x19/0x40 [qxl]
> [ 3.044683] Code: 89 e7 45 31 e4 e8 27 cb fd d6 eb ea 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 fd e8 a2 02 00 00 84 c0 74 0d 48 8b 85 48 02 00 00 <83> 78 10 03 74 02 5d c3 8b 85 44 03 00 00 85 c0 74 f4 48 8b 7d 08
> [ 3.044685] RSP: 0018:ffffbcbe80633d98 EFLAGS: 00010202
> [ 3.044688] RAX: 0000000000000000 RBX: ffff9e3e94d30d80 RCX: 0000000000000000
> [ 3.044690] RDX: 0000000000000003 RSI: ffff9e3e94d31d18 RDI: ffff9e3e84612400
> [ 3.044692] RBP: ffff9e3e84612400 R08: 0000000000000000 R09: 0000000000000001
> [ 3.044694] R10: ffffbcbe80633c90 R11: 000000000004cc80 R12: ffff9e3e94d31d00
> [ 3.044696] R13: ffff9e3e94d30d80 R14: ffff9e3e84612670 R15: ffff9e3e84612670
> [ 3.044715] ? qxl_bo_delete_mem_notify+0xe/0x40 [qxl]
> [ 3.044722] ttm_bo_cleanup_memtype_use+0x22/0x60 [ttm]
> [ 3.044731] ttm_bo_release+0x28e/0x540 [ttm]
> [ 3.044745] ttm_bo_delayed_delete+0x1be/0x220 [ttm]
> [ 3.044761] ttm_device_delayed_workqueue+0x18/0x40 [ttm]
> [ 3.044774] process_one_work+0x26e/0x550
> [ 3.044788] worker_thread+0x52/0x3b0
> [ 3.044792] ? process_one_work+0x550/0x550
> [ 3.044799] kthread+0x138/0x160
> [ 3.044803] ? set_kthread_struct+0x40/0x40
> [ 3.044811] ret_from_fork+0x22/0x30
> [ 3.044832] irq event stamp: 2552
> [ 3.044833] hardirqs last enabled at (2551): [<ffffffff978c990d>] seqcount_lockdep_reader_access+0x7d/0x90
> [ 3.044838] hardirqs last disabled at (2552): [<ffffffff97cf3d98>] exc_page_fault+0x38/0x260
> [ 3.044841] softirqs last enabled at (2496): [<ffffffff971c30d4>] css_free_rwork_fn+0x74/0x590
> [ 3.044844] softirqs last disabled at (2494): [<ffffffff971c30b9>] css_free_rwork_fn+0x59/0x590
> [ 3.044847] ---[ end trace 3fbaaa179830c812 ]---
> [ 3.044850] BUG: kernel NULL pointer dereference, address: 0000000000000010
> [ 3.044851] #PF: supervisor read access in kernel mode
> [ 3.044853] #PF: error_code(0x0000) - not-present page
> [ 3.044854] PGD 0 P4D 0
> [ 3.044857] Oops: 0000 [#1] SMP NOPTI
> [ 3.044859] CPU: 12 PID: 162 Comm: kworker/12:1 Tainted: G W 5.13.0+ #31
> [ 3.044861] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-3.fc34 04/01/2014
> [ 3.044863] Workqueue: events ttm_device_delayed_workqueue [ttm]
> [ 3.044870] RIP: 0010:qxl_bo_delete_mem_notify+0x19/0x40 [qxl]
> [ 3.044875] Code: 89 e7 45 31 e4 e8 27 cb fd d6 eb ea 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 fd e8 a2 02 00 00 84 c0 74 0d 48 8b 85 48 02 00 00 <83> 78 10 03 74 02 5d c3 8b 85 44 03 00 00 85 c0 74 f4 48 8b 7d 08
> [ 3.044877] RSP: 0018:ffffbcbe80633d98 EFLAGS: 00010202
> [ 3.044879] RAX: 0000000000000000 RBX: ffff9e3e94d30d80 RCX: 0000000000000000
> [ 3.044881] RDX: 0000000000000003 RSI: ffff9e3e94d31d18 RDI: ffff9e3e84612400
> [ 3.044882] RBP: ffff9e3e84612400 R08: 0000000000000000 R09: 0000000000000001
> [ 3.044883] R10: ffffbcbe80633c90 R11: 000000000004cc80 R12: ffff9e3e94d31d00
> [ 3.044885] R13: ffff9e3e94d30d80 R14: ffff9e3e84612670 R15: ffff9e3e84612670
> [ 3.044889] FS: 0000000000000000(0000) GS:ffff9e3ff7b00000(0000) knlGS:0000000000000000
> [ 3.044891] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 3.044893] CR2: 0000000000000010 CR3: 0000000244c26000 CR4: 0000000000750ee0
> [ 3.044897] PKRU: 55555554
> [ 3.044897] Call Trace:
> [ 3.044899] ttm_bo_cleanup_memtype_use+0x22/0x60 [ttm]
> [ 3.044906] ttm_bo_release+0x28e/0x540 [ttm]
> [ 3.044914] ttm_bo_delayed_delete+0x1be/0x220 [ttm]
> [ 3.044923] ttm_device_delayed_workqueue+0x18/0x40 [ttm]
> [ 3.044929] process_one_work+0x26e/0x550
> [ 3.044934] worker_thread+0x52/0x3b0
> [ 3.044936] ? process_one_work+0x550/0x550
> [ 3.044939] kthread+0x138/0x160
> [ 3.044942] ? set_kthread_struct+0x40/0x40
> [ 3.044946] ret_from_fork+0x22/0x30
> [ 3.044952] Modules linked in: xfs qxl drm_ttm_helper ttm drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel cec drm virtio_net ghash_clmulni_intel serio_raw virtio_console net_failover failover virtio_blk qemu_fw_cfg fuse
> [ 3.044967] CR2: 0000000000000010
> [ 3.044969] ---[ end trace 3fbaaa179830c813 ]---
> [ 3.044970] RIP: 0010:qxl_bo_delete_mem_notify+0x19/0x40 [qxl]
> [ 3.044975] Code: 89 e7 45 31 e4 e8 27 cb fd d6 eb ea 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 fd e8 a2 02 00 00 84 c0 74 0d 48 8b 85 48 02 00 00 <83> 78 10 03 74 02 5d c3 8b 85 44 03 00 00 85 c0 74 f4 48 8b 7d 08
> [ 3.044977] RSP: 0018:ffffbcbe80633d98 EFLAGS: 00010202
> [ 3.044979] RAX: 0000000000000000 RBX: ffff9e3e94d30d80 RCX: 0000000000000000
> [ 3.044980] RDX: 0000000000000003 RSI: ffff9e3e94d31d18 RDI: ffff9e3e84612400
> [ 3.044982] RBP: ffff9e3e84612400 R08: 0000000000000000 R09: 0000000000000001
> [ 3.044983] R10: ffffbcbe80633c90 R11: 000000000004cc80 R12: ffff9e3e94d31d00
> [ 3.044985] R13: ffff9e3e94d30d80 R14: ffff9e3e84612670 R15: ffff9e3e84612670
> [ 3.044989] FS: 0000000000000000(0000) GS:ffff9e3ff7b00000(0000) knlGS:0000000000000000
> [ 3.044991] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 3.044992] CR2: 0000000000000010 CR3: 0000000244c26000 CR4: 0000000000750ee0
> [ 3.044996] PKRU: 55555554
> [ 3.061648] ------------[ cut here ]------------
> [ 3.117815] WARNING: CPU: 10 PID: 142 at arch/x86/include/asm/kfence.h:44 kfence_unprotect+0x2a/0x90
> [ 3.117821] Modules linked in: xfs qxl drm_ttm_helper ttm drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel cec drm virtio_net ghash_clmulni_intel serio_raw virtio_console net_failover failover virtio_blk qemu_fw_cfg fuse
> [ 3.119709] CPU: 10 PID: 142 Comm: kworker/10:1 Tainted: G D W 5.13.0+ #31
> [ 3.119711] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-3.fc34 04/01/2014
> [ 3.119712] Workqueue: events drm_fb_helper_damage_work [drm_kms_helper]
> [ 3.123843] RIP: 0010:kfence_unprotect+0x2a/0x90
> [ 3.123845] Code: 0f 1f 44 00 00 55 48 81 e7 00 f0 ff ff 48 89 fd 48 83 ec 08 48 8d 74 24 04 e8 f2 42 d2 ff 48 85 c0 74 07 83 7c 24 04 01 74 13 <0f> 0b 0f 0b c6 05 4b 50 cc 01 00 48 83 c4 08 31 c0 5d c3 48 8b 38
> [ 3.123846] RSP: 0018:ffffbcbe80593810 EFLAGS: 00010046
> [ 3.123847] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff98c26000
> [ 3.123848] RDX: ffffbcbe80593814 RSI: 0000000000000000 RDI: ffffffff98c26000
> [ 3.123848] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> [ 3.123849] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> [ 3.123849] R13: ffffbcbe805938a8 R14: 0000000000000000 R15: 0000000000000000
> [ 3.123851] FS: 0000000000000000(0000) GS:ffff9e3ff7a80000(0000) knlGS:0000000000000000
> [ 3.123852] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 3.123852] CR2: 0000000000000010 CR3: 0000000113040000 CR4: 0000000000750ee0
> [ 3.123854] PKRU: 55555554
> [ 3.134797] Call Trace:
> [ 3.134800] page_fault_oops+0x89/0x270
> [ OK ] Stopped Create list of sta… nodes for the current kernel.
> [ 3.134803] ? _raw_spin_unlock_irqrestore+0x37/0x40
> [ OK ] Stopped Create System Users.
> [ 3.136749] exc_page_fault+0x79/0x260
> [ 3.136752] asm_exc_page_fault+0x1e/0x30
> [ 3.136753] RIP: 0010:qxl_bo_delete_mem_notify+0x19/0x40 [qxl]
> [ 3.136756] Code: 89 e7 45 31 e4 e8 27 cb fd d6 eb ea 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 fd e8 a2 02 00 00 84 c0 74 0d 48 8b 85 48 02 00 00 <83> 78 10 03 74 02 5d c3 8b 85 44 03 00 00 85 c0 74 f4 48 8b 7d 08
> [ 3.136757] RSP: 0018:ffffbcbe80593958 EFLAGS: 00010202
> [ 3.136757] RAX: 0000000000000000 RBX: ffff9e3e957a22a8 RCX: 0000000000000000
> [ 3.136758] RDX: ffff9e3e84617a70 RSI: ffffffffc034d705 RDI: ffff9e3e84617800
> [ 3.136758] RBP: ffff9e3e84617800 R08: ffff9e3e84617a70 R09: 0000000000000000
> [ 3.136759] R10: 0000000000000000 R11: 0000000000000001 R12: ffffbcbe80593ba0
> [ 3.136760] R13: ffff9e3e94d30d80 R14: ffff9e3e84617a70 R15: ffff9e3e84617800
> [ 3.136761] ? ttm_bo_release+0x285/0x540 [ttm]
> [ 3.146192] ? qxl_bo_delete_mem_notify+0xe/0x40 [qxl]
> [ 3.146194] ttm_bo_cleanup_memtype_use+0x22/0x60 [ttm]
> [ 3.146196] ttm_bo_release+0x28e/0x540 [ttm]
> [ 3.146199] ttm_mem_evict_first+0x306/0x480 [ttm]
> [ 3.146202] ? ttm_range_man_alloc+0xe1/0xf0 [ttm]
> [ 3.146204] ttm_bo_mem_space+0x24d/0x2b0 [ttm]
> [ 3.146207] ttm_bo_validate+0xa9/0x1c0 [ttm]
> [ 3.146209] ? mutex_trylock+0x116/0x130
> [ 3.146210] ? ttm_bo_init_reserved+0x289/0x2b0 [ttm]
> [ 3.146212] ttm_bo_init_reserved+0x213/0x2b0 [ttm]
> [ 3.146214] qxl_bo_create+0x13b/0x240 [qxl]
> [ 3.152840] ? qxl_ttm_debugfs_init+0xc0/0xc0 [qxl]
> [ 3.152843] qxl_alloc_bo_reserved+0x2e/0x90 [qxl]
> [ 3.152845] qxl_image_alloc_objects+0xab/0x120 [qxl]
> [ 3.152847] qxl_draw_dirty_fb+0x14c/0x420 [qxl]
> [ 3.152850] ? ww_mutex_lock_interruptible+0x30/0x90
> [ 3.156624] ? modeset_lock+0x90/0x1c0 [drm]
> [ 3.156637] qxl_framebuffer_surface_dirty+0xeb/0x1b0 [qxl]
> [ 3.156640] drm_fb_helper_damage_work+0x18e/0x2d0 [drm_kms_helper]
> [ 3.156647] process_one_work+0x26e/0x550
> [ 3.156650] worker_thread+0x52/0x3b0
> [ 3.156651] ? process_one_work+0x550/0x550
> [ 3.156652] kthread+0x138/0x160
> [ 3.156653] ? set_kthread_struct+0x40/0x40
> [ 3.156655] ret_from_fork+0x22/0x30
> [ 3.162750] irq event stamp: 1673
> [ 3.162751] hardirqs last enabled at (1673): [<ffffffff97e00d82>] asm_sysvec_apic_timer_interrupt+0x12/0x20
> [ 3.162752] hardirqs last disabled at (1671): [<ffffffff980003ee>] __do_softirq+0x3ee/0x485
> [ 3.162754] softirqs last enabled at (1672): [<ffffffff970e4834>] __irq_exit_rcu+0xe4/0x110
> [ 3.162756] softirqs last disabled at (1665): [<ffffffff970e4834>] __irq_exit_rcu+0xe4/0x110
> [ 3.162757] ---[ end trace 3fbaaa179830c814 ]---
> [ 3.162763] ------------[ cut here ]------------
> [ 3.162764] WARNING: CPU: 10 PID: 142 at mm/kfence/core.c:135 kfence_unprotect+0x2c/0x90
> [ 3.162765] Modules linked in: xfs qxl drm_ttm_helper ttm drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel cec drm virtio_net ghash_clmulni_intel serio_raw virtio_console net_failover failover virtio_blk qemu_fw_cfg fuse
> [ 3.162769] CPU: 10 PID: 142 Comm: kworker/10:1 Tainted: G D W 5.13.0+ #31
> [ 3.162770] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-3.fc34 04/01/2014
> [ 3.162771] Workqueue: events drm_fb_helper_damage_work [drm_kms_helper]
> [ 3.162776] RIP: 0010:kfence_unprotect+0x2c/0x90
> [ 3.162778] Code: 44 00 00 55 48 81 e7 00 f0 ff ff 48 89 fd 48 83 ec 08 48 8d 74 24 04 e8 f2 42 d2 ff 48 85 c0 74 07 83 7c 24 04 01 74 13 0f 0b <0f> 0b c6 05 4b 50 cc 01 00 48 83 c4 08 31 c0 5d c3 48 8b 38 48 89
> [ 3.162778] RSP: 0018:ffffbcbe80593810 EFLAGS: 00010046
> [ 3.162779] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff98c26000
> [ 3.162780] RDX: ffffbcbe80593814 RSI: 0000000000000000 RDI: ffffffff98c26000
> [ 3.162780] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> [ 3.162780] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> [ 3.162781] R13: ffffbcbe805938a8 R14: 0000000000000000 R15: 0000000000000000
> [ 3.162782] FS: 0000000000000000(0000) GS:ffff9e3ff7a80000(0000) knlGS:0000000000000000
> [ 3.162783] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 3.162783] CR2: 0000000000000010 CR3: 0000000113040000 CR4: 0000000000750ee0
> [ 3.162785] PKRU: 55555554
> [ 3.162785] Call Trace:
> [ 3.162786] page_fault_oops+0x89/0x270
> [ 3.162787] ? _raw_spin_unlock_irqrestore+0x37/0x40
> [ 3.162789] exc_page_fault+0x79/0x260
> [ 3.162790] asm_exc_page_fault+0x1e/0x30
> [ 3.162791] RIP: 0010:qxl_bo_delete_mem_notify+0x19/0x40 [qxl]
> [ 3.162793] Code: 89 e7 45 31 e4 e8 27 cb fd d6 eb ea 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 fd e8 a2 02 00 00 84 c0 74 0d 48 8b 85 48 02 00 00 <83> 78 10 03 74 02 5d c3 8b 85 44 03 00 00 85 c0 74 f4 48 8b 7d 08
> [ 3.162793] RSP: 0018:ffffbcbe80593958 EFLAGS: 00010202
> [ 3.162794] RAX: 0000000000000000 RBX: ffff9e3e957a22a8 RCX: 0000000000000000
> [ 3.162794] RDX: ffff9e3e84617a70 RSI: ffffffffc034d705 RDI: ffff9e3e84617800
> [ 3.162795] RBP: ffff9e3e84617800 R08: ffff9e3e84617a70 R09: 0000000000000000
> [ 3.162795] R10: 0000000000000000 R11: 0000000000000001 R12: ffffbcbe80593ba0
> [ 3.162795] R13: ffff9e3e94d30d80 R14: ffff9e3e84617a70 R15: ffff9e3e84617800
> [ 3.162797] ? ttm_bo_release+0x285/0x540 [ttm]
> [ 3.162799] ? qxl_bo_delete_mem_notify+0xe/0x40 [qxl]
> [ 3.162801] ttm_bo_cleanup_memtype_use+0x22/0x60 [ttm]
> [ 3.162803] ttm_bo_release+0x28e/0x540 [ttm]
> [ 3.162806] ttm_mem_evict_first+0x306/0x480 [ttm]
> [ 3.162808] ? ttm_range_man_alloc+0xe1/0xf0 [ttm]
> [ 3.162811] ttm_bo_mem_space+0x24d/0x2b0 [ttm]
> [ 3.162813] ttm_bo_validate+0xa9/0x1c0 [ttm]
> [ 3.162815] ? mutex_trylock+0x116/0x130
> [ 3.162816] ? ttm_bo_init_reserved+0x289/0x2b0 [ttm]
> [ 3.162818] ttm_bo_init_reserved+0x213/0x2b0 [ttm]
> [ 3.162820] qxl_bo_create+0x13b/0x240 [qxl]
> [ 3.162822] ? qxl_ttm_debugfs_init+0xc0/0xc0 [qxl]
> [ 3.162824] qxl_alloc_bo_reserved+0x2e/0x90 [qxl]
> [ 3.162826] qxl_image_alloc_objects+0xab/0x120 [qxl]
> [ 3.162828] qxl_draw_dirty_fb+0x14c/0x420 [qxl]
> [ 3.162831] ? ww_mutex_lock_interruptible+0x30/0x90
> [ 3.162831] ? modeset_lock+0x90/0x1c0 [drm]
> [ 3.162842] qxl_framebuffer_surface_dirty+0xeb/0x1b0 [qxl]
> [ 3.162845] drm_fb_helper_damage_work+0x18e/0x2d0 [drm_kms_helper]
> [ 3.162851] process_one_work+0x26e/0x550
> [ 3.162852] worker_thread+0x52/0x3b0
> [ 3.162853] ? process_one_work+0x550/0x550
> [ 3.162854] kthread+0x138/0x160
> [ 3.162855] ? set_kthread_struct+0x40/0x40
> [ 3.162857] ret_from_fork+0x22/0x30
> [ 3.162859] irq event stamp: 1673
> [ 3.162860] hardirqs last enabled at (1673): [<ffffffff97e00d82>] asm_sysvec_apic_timer_interrupt+0x12/0x20
> [ 3.162861] hardirqs last disabled at (1671): [<ffffffff980003ee>] __do_softirq+0x3ee/0x485
> [ 3.162862] softirqs last enabled at (1672): [<ffffffff970e4834>] __irq_exit_rcu+0xe4/0x110
> [ 3.162864] softirqs last disabled at (1665): [<ffffffff970e4834>] __irq_exit_rcu+0xe4/0x110
> [ 3.162865] ---[ end trace 3fbaaa179830c815 ]---
> [ 3.162866] BUG: kernel NULL pointer dereference, address: 0000000000000010
> [ 3.162866] #PF: supervisor read access in kernel mode
> [ 3.162867] #PF: error_code(0x0000) - not-present page
> [ 3.162867] PGD 0 P4D 0
> [ 3.162868] Oops: 0000 [#2] SMP NOPTI
> [ 3.162869] CPU: 10 PID: 142 Comm: kworker/10:1 Tainted: G D W 5.13.0+ #31
> [ 3.162870] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-3.fc34 04/01/2014
> [ 3.162871] Workqueue: events drm_fb_helper_damage_work [drm_kms_helper]
> [ 3.162876] RIP: 0010:qxl_bo_delete_mem_notify+0x19/0x40 [qxl]
> [ 3.162877] Code: 89 e7 45 31 e4 e8 27 cb fd d6 eb ea 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 fd e8 a2 02 00 00 84 c0 74 0d 48 8b 85 48 02 00 00 <83> 78 10 03 74 02 5d c3 8b 85 44 03 00 00 85 c0 74 f4 48 8b 7d 08
> [ 3.162878] RSP: 0018:ffffbcbe80593958 EFLAGS: 00010202
> [ 3.162878] RAX: 0000000000000000 RBX: ffff9e3e957a22a8 RCX: 0000000000000000
> [ 3.162879] RDX: ffff9e3e84617a70 RSI: ffffffffc034d705 RDI: ffff9e3e84617800
> [ 3.162879] RBP: ffff9e3e84617800 R08: ffff9e3e84617a70 R09: 0000000000000000
> [ 3.162880] R10: 0000000000000000 R11: 0000000000000001 R12: ffffbcbe80593ba0
> [ 3.162880] R13: ffff9e3e94d30d80 R14: ffff9e3e84617a70 R15: ffff9e3e84617800
> [ 3.162881] FS: 0000000000000000(0000) GS:ffff9e3ff7a80000(0000) knlGS:0000000000000000
> [ 3.162882] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 3.162882] CR2: 0000000000000010 CR3: 0000000113040000 CR4: 0000000000750ee0
> [ 3.162884] PKRU: 55555554
> [ 3.162884] Call Trace:
> [ 3.162884] ttm_bo_cleanup_memtype_use+0x22/0x60 [ttm]
> [ 3.162887] ttm_bo_release+0x28e/0x540 [ttm]
> [ 3.162889] ttm_mem_evict_first+0x306/0x480 [ttm]
> [ 3.162891] ? ttm_range_man_alloc+0xe1/0xf0 [ttm]
> [ 3.162894] ttm_bo_mem_space+0x24d/0x2b0 [ttm]
> [ 3.162896] ttm_bo_validate+0xa9/0x1c0 [ttm]
> [ 3.162898] ? mutex_trylock+0x116/0x130
> [ 3.162899] ? ttm_bo_init_reserved+0x289/0x2b0 [ttm]
> [ 3.162901] ttm_bo_init_reserved+0x213/0x2b0 [ttm]
> [ 3.162903] qxl_bo_create+0x13b/0x240 [qxl]
> [ 3.162905] ? qxl_ttm_debugfs_init+0xc0/0xc0 [qxl]
> [ 3.162907] qxl_alloc_bo_reserved+0x2e/0x90 [qxl]
> [ 3.162909] qxl_image_alloc_objects+0xab/0x120 [qxl]
> [ 3.162911] qxl_draw_dirty_fb+0x14c/0x420 [qxl]
> [ 3.162913] ? ww_mutex_lock_interruptible+0x30/0x90
> [ 3.162913] ? modeset_lock+0x90/0x1c0 [drm]
> [ 3.162922] qxl_framebuffer_surface_dirty+0xeb/0x1b0 [qxl]
> [ 3.162925] drm_fb_helper_damage_work+0x18e/0x2d0 [drm_kms_helper]
> [ 3.162931] process_one_work+0x26e/0x550
> [ 3.162932] worker_thread+0x52/0x3b0
> [ 3.162933] ? process_one_work+0x550/0x550
> [ 3.162934] kthread+0x138/0x160
> [ 3.162935] ? set_kthread_struct+0x40/0x40
> [ 3.162936] ret_from_fork+0x22/0x30
> [ 3.162938] Modules linked in: xfs qxl drm_ttm_helper ttm drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel cec drm virtio_net ghash_clmulni_intel serio_raw virtio_console net_failover failover virtio_blk qemu_fw_cfg fuse
> [ 3.162943] CR2: 0000000000000010
> [ 3.162944] ---[ end trace 3fbaaa179830c816 ]---
> [ 3.162945] RIP: 0010:qxl_bo_delete_mem_notify+0x19/0x40 [qxl]
> [ 3.162946] Code: 89 e7 45 31 e4 e8 27 cb fd d6 eb ea 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 fd e8 a2 02 00 00 84 c0 74 0d 48 8b 85 48 02 00 00 <83> 78 10 03 74 02 5d c3 8b 85 44 03 00 00 85 c0 74 f4 48 8b 7d 08
> [ 3.162947] RSP: 0018:ffffbcbe80633d98 EFLAGS: 00010202
> [ 3.162947] RAX: 0000000000000000 RBX: ffff9e3e94d30d80 RCX: 0000000000000000
> [ 3.162948] RDX: 0000000000000003 RSI: ffff9e3e94d31d18 RDI: ffff9e3e84612400
> [ 3.162948] RBP: ffff9e3e84612400 R08: 0000000000000000 R09: 0000000000000001
> [ 3.162948] R10: ffffbcbe80633c90 R11: 000000000004cc80 R12: ffff9e3e94d31d00
> [ 3.162949] R13: ffff9e3e94d30d80 R14: ffff9e3e84612670 R15: ffff9e3e84612670
> [ 3.162950] FS: 0000000000000000(0000) GS:ffff9e3ff7a80000(0000) knlGS:0000000000000000
> [ 3.162951] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 3.162951] CR2: 0000000000000010 CR3: 0000000113040000 CR4: 0000000000750ee0
> [ 3.162952] PKRU: 55555554


Attachments:
0001-drm-qxl-add-NULL-check-for-bo-resource.patch (1.01 kB)

Subject: Re: kvm-vm boot fail: ttm_bo_cleanup_memtype_use?

On 7/6/21 8:53 AM, Christian König wrote:
> Hi Daniel,
>
> looks like a simple missing NULL check to me. Please test the attached patch.
>

It works!

Feel free to add:

Reported-by: Daniel Bristot de Oliveira <[email protected]>
Tested-by: Daniel Bristot de Oliveira <[email protected]>

> We recently changed the structure of the resources, so this was probably
> introduced while doing that.
>
> Christian.
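
For reference, the oops is consistent with the missing-NULL-check diagnosis:
the faulting bytes "<83> 78 10 03" in qxl_bo_delete_mem_notify decode to a
compare of the 32-bit value at RAX+0x10 against 3, RAX is 0, and CR2 is 0x10,
so the code read a field at offset 0x10 through a NULL pointer. The sketch
below shows the general shape of such a guard in plain C. It is not the
attached patch: the struct and function names are hypothetical stand-ins for
the kernel types, and the value 3 is taken only from the compare in the trace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real TTM definitions. */
#define SKETCH_PL_PRIV 3 /* value the faulting compare checks against */

struct sketch_resource {
	unsigned int mem_type;
};

struct sketch_bo {
	struct sketch_resource *resource; /* may be NULL, as in the trace */
	unsigned int surface_id;
};

/*
 * Decide whether the surface backing this object would need evicting.
 * An unguarded bo->resource->mem_type read faults when resource is NULL,
 * so the pointer is tested before it is dereferenced.
 */
static bool sketch_needs_surface_evict(const struct sketch_bo *bo)
{
	assert(bo != NULL);

	if (!bo->resource)
		return false;

	return bo->resource->mem_type == SKETCH_PL_PRIV && bo->surface_id != 0;
}
```

With the guard in place, an object whose resource pointer is NULL is simply
skipped instead of faulting at offset 0x10.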

Thanks!
-- Daniel