2022-09-30 22:04:36

by Zhang Boyang

Subject: [RFC PATCH 0/1] drm/amdgpu: Fix NULL-deref in amdgpu_device_fini_sw()

Hi,

There are several reports that a "Fatal error during GPU init" causes a
NULL-deref in amdgpu_device_fini_sw(). Although the NULL-deref is a
consequence of the init failure rather than its cause, it confuses users.

https://lore.kernel.org/lkml/[email protected]/
https://lore.kernel.org/lkml/[email protected]/

This is probably because "adev" is not fully initialized when
amdgpu_device_init() fails. The subsequent amdgpu_device_fini_sw() then
tries to release "adev->reset_domain", which causes the NULL-deref.

This patch fixes the problem by guarding the release code with an "if".
However, I'm new to this module and don't fully understand the code,
so please review the patch carefully.
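
To illustrate the idea, here is a minimal userspace sketch of such a
guard. It is not the actual patch or driver code: struct device_ctx,
struct reset_domain, device_init(), device_fini_sw() and
reset_domain_put() are hypothetical stand-ins, and it assumes the
pointer is still NULL when init fails early.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's reset domain object. */
struct reset_domain {
	int refcount;
};

/* Hypothetical stand-in for the device structure ("adev"). */
struct device_ctx {
	struct reset_domain *reset_domain; /* NULL until init gets that far */
};

/* Drop one reference. This dereferences "domain", so it must not be NULL. */
static void reset_domain_put(struct reset_domain *domain)
{
	assert(domain->refcount > 0);
	if (--domain->refcount == 0)
		free(domain);
}

/*
 * Simulated init. With fail_early set, it returns an error before the
 * reset domain is allocated, leaving the pointer NULL.
 */
static int device_init(struct device_ctx *dev, int fail_early)
{
	dev->reset_domain = NULL;
	if (fail_early)
		return -1;

	dev->reset_domain = malloc(sizeof(*dev->reset_domain));
	if (!dev->reset_domain)
		return -1;
	dev->reset_domain->refcount = 1;
	return 0;
}

/* Teardown that tolerates a partially initialized device. */
static void device_fini_sw(struct device_ctx *dev)
{
	/* The guard: only release the reset domain if it was ever set up. */
	if (dev->reset_domain) {
		reset_domain_put(dev->reset_domain);
		dev->reset_domain = NULL;
	}
}
```

With the guard in place, calling device_fini_sw() after a failed
device_init() is a no-op for the reset domain instead of a NULL-deref.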

Best Regards,
Zhang Boyang