2024-05-05 19:08:02

by Tranton Baddy

Subject: Error in amd driver?

I have been seeing this in my dmesg since at least version 6.8.6; I am not sure when it first appeared. Is there a bug in the amdgpu driver?
[ 64.253144] ==================================================================
[ 64.253162] BUG: KFENCE: use-after-free read in amdgpu_bo_move+0x51f/0x7a0

[ 64.253183] Use-after-free read at 0x00000000671c48dd (in kfence-#111):
[ 64.253192] amdgpu_bo_move+0x51f/0x7a0
[ 64.253202] ttm_bo_handle_move_mem+0xcf/0x180
[ 64.253211] ttm_mem_evict_first+0x1c5/0x500
[ 64.253218] ttm_resource_manager_evict_all+0xa3/0x1e0
[ 64.253228] amdgpu_device_prepare+0x66/0x110
[ 64.253237] amdgpu_pmops_runtime_suspend+0xbe/0x1c0
[ 64.253248] pci_pm_runtime_suspend+0x74/0x200
[ 64.253259] vga_switcheroo_runtime_suspend+0x21/0xb0
[ 64.253268] __rpm_callback+0x5f/0x190
[ 64.253277] rpm_callback+0x7f/0x90
[ 64.253283] rpm_suspend+0x120/0x6a0
[ 64.253290] pm_runtime_work+0x9c/0xa0
[ 64.253297] process_one_work+0x164/0x330
[ 64.253310] worker_thread+0x302/0x430
[ 64.253320] kthread+0xe4/0x110
[ 64.253329] ret_from_fork+0x4c/0x60
[ 64.253341] ret_from_fork_asm+0x1b/0x30

[ 64.253353] kfence-#111: 0x00000000d018cf03-0x0000000034e821d1, size=96, cache=kmalloc-96

[ 64.253363] allocated by task 152 on cpu 3 at 64.248952s:
[ 64.253418] kmalloc_trace+0x283/0x340
[ 64.253427] amdgpu_vram_mgr_new+0x8f/0x3f0
[ 64.253435] ttm_resource_alloc+0x39/0x90
[ 64.253444] ttm_bo_mem_space+0xa4/0x260
[ 64.253450] ttm_mem_evict_first+0x18a/0x500
[ 64.253456] ttm_resource_manager_evict_all+0xa3/0x1e0
[ 64.253465] amdgpu_device_prepare+0x66/0x110
[ 64.253472] amdgpu_pmops_runtime_suspend+0xbe/0x1c0
[ 64.253481] pci_pm_runtime_suspend+0x74/0x200
[ 64.253489] vga_switcheroo_runtime_suspend+0x21/0xb0
[ 64.253496] __rpm_callback+0x5f/0x190
[ 64.253503] rpm_callback+0x7f/0x90
[ 64.253509] rpm_suspend+0x120/0x6a0
[ 64.253516] pm_runtime_work+0x9c/0xa0
[ 64.253523] process_one_work+0x164/0x330
[ 64.253532] worker_thread+0x302/0x430
[ 64.253542] kthread+0xe4/0x110
[ 64.253550] ret_from_fork+0x4c/0x60
[ 64.253559] ret_from_fork_asm+0x1b/0x30

[ 64.253570] freed by task 152 on cpu 3 at 64.253117s:
[ 64.253582] ttm_resource_free+0x67/0x90
[ 64.253591] ttm_bo_move_accel_cleanup+0x247/0x2e0
[ 64.253598] amdgpu_bo_move+0x1bd/0x7a0
[ 64.253605] ttm_bo_handle_move_mem+0xcf/0x180
[ 64.253612] ttm_mem_evict_first+0x1c5/0x500
[ 64.253618] ttm_resource_manager_evict_all+0xa3/0x1e0
[ 64.253626] amdgpu_device_prepare+0x66/0x110
[ 64.253634] amdgpu_pmops_runtime_suspend+0xbe/0x1c0
[ 64.253642] pci_pm_runtime_suspend+0x74/0x200
[ 64.253650] vga_switcheroo_runtime_suspend+0x21/0xb0
[ 64.253658] __rpm_callback+0x5f/0x190
[ 64.253664] rpm_callback+0x7f/0x90
[ 64.253671] rpm_suspend+0x120/0x6a0
[ 64.253677] pm_runtime_work+0x9c/0xa0
[ 64.253684] process_one_work+0x164/0x330
[ 64.253693] worker_thread+0x302/0x430
[ 64.253703] kthread+0xe4/0x110
[ 64.253711] ret_from_fork+0x4c/0x60
[ 64.253723] ret_from_fork_asm+0x1b/0x30

[ 64.253735] CPU: 3 PID: 152 Comm: kworker/3:2 Tainted: P OE 6.8.9 #3 e7323d0d25f89e853881fc823e59523bdcc577c6
[ 64.253756] Hardware name: Hewlett-Packard HP Pavilion Notebook /80B9, BIOS F.54 05/27/2019
[ 64.253761] Workqueue: pm pm_runtime_work
[ 64.253771] ==================================================================


2024-05-05 22:45:40

by Borislav Petkov

Subject: Re: Error in amd driver?

+ [email protected]

On Sun, May 05, 2024 at 09:59:22PM +0300, Tranton Baddy wrote:
> I have been seeing this in my dmesg since at least version 6.8.6; I am not sure when it first appeared. Is there a bug in the amdgpu driver?

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2024-05-06 15:07:50

by Alex Deucher

Subject: Re: Error in amd driver?

On Mon, May 6, 2024 at 6:00 AM Borislav Petkov <[email protected]> wrote:
>
> + [email protected]
>
> On Sun, May 05, 2024 at 09:59:22PM +0300, Tranton Baddy wrote:
> > I have been seeing this in my dmesg since at least version 6.8.6; I am not sure when it first appeared. Is there a bug in the amdgpu driver?

Should be fixed in:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d3a9331a6591e9df64791e076f6591f440af51c3

Alex


2024-05-19 13:00:38

by Tranton Baddy

Subject: Re: Error in amd driver?

Yes, the problem is gone.
Thanks.

> On Mon, May 6, 2024 at 6:00 AM Borislav Petkov <[email protected]> wrote:
>
>> + [email protected]
>>
>> On Sun, May 05, 2024 at 09:59:22PM +0300, Tranton Baddy wrote:
>>> I have been seeing this in my dmesg since at least version 6.8.6; I am not sure when it first appeared. Is there a bug in the amdgpu driver?
>
> Should be fixed in:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d3a9331a6591e9df64791e076f6591f440af51c3
>
> Alex