Hello,
We are updating the kernel from 6.1 to 6.6 and we observe an amdgpu
regression with a Radeon RX580 8GB and a SiFive Unmatched:
“workqueue: Failed to create a rescuer kthread for wq 'amdgpu-reset-dev': -EINTR
[drm:amdgpu_reset_create_reset_domain [amdgpu]] *ERROR* Failed to allocate wq for amdgpu_reset_domain!
amdgpu 0000:07:00.0: amdgpu: Fatal error during GPU init
amdgpu 0000:07:00.0: amdgpu: amdgpu: finishing device.
amdgpu: probe of 0000:07:00.0 failed with error -12”
We have not been able to figure it out so far. Do you have any advice
on how to identify the root cause and fix it?
Kind regards,
Thomas Perrot
--
Thomas Perrot, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com
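The log above carries two different error codes: the workqueue failure is reported symbolically as -EINTR, while the probe failure is reported numerically as -12. As a minimal sketch (not part of the original report), the following Python snippet decodes both with the standard errno module. The function name decode_errors is hypothetical, and the log text is copied from the message above with the wrapped lines rejoined.

```python
import errno
import re

# Log excerpt copied from the report above (wrapped lines rejoined).
LOG = """\
workqueue: Failed to create a rescuer kthread for wq 'amdgpu-reset-dev': -EINTR
[drm:amdgpu_reset_create_reset_domain [amdgpu]] *ERROR* Failed to allocate wq for amdgpu_reset_domain!
amdgpu 0000:07:00.0: amdgpu: Fatal error during GPU init
amdgpu 0000:07:00.0: amdgpu: amdgpu: finishing device.
amdgpu: probe of 0000:07:00.0 failed with error -12
"""


def decode_errors(log):
    """Return (line_number, symbolic_errno_name) pairs found in a kernel log.

    Hypothetical helper for illustration only.
    """
    found = []
    for lineno, line in enumerate(log.splitlines(), start=1):
        # Symbolic form, e.g. "-EINTR"; keep only names the errno module knows.
        for name in re.findall(r"-(E[A-Z]+)\b", line):
            if hasattr(errno, name):
                found.append((lineno, name))
        # Numeric form, e.g. "failed with error -12".
        for num in re.findall(r"error -(\d+)", line):
            found.append((lineno, errno.errorcode.get(int(num), "unknown")))
    return found


for lineno, name in decode_errors(LOG):
    print(f"log line {lineno}: {name}")
```

The numeric -12 corresponds to ENOMEM, but the first failure in the sequence is EINTR. That ordering suggests the final "out of memory" code may be a consequence of the interrupted workqueue creation rather than a real memory shortage, although the log alone does not prove it.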
Well, the driver load is being interrupted for some reason.
Have you set a timeout for modprobe?
Regards,
Christian.
On 12.01.24 at 09:11, Thomas Perrot wrote:
> Hello,
>
> We are updating the kernel from 6.1 to 6.6 and we observe an amdgpu
> regression with a Radeon RX580 8GB and a SiFive Unmatched:
> “workqueue: Failed to create a rescuer kthread for wq 'amdgpu-reset-dev': -EINTR
> [drm:amdgpu_reset_create_reset_domain [amdgpu]] *ERROR* Failed to allocate wq for amdgpu_reset_domain!
> amdgpu 0000:07:00.0: amdgpu: Fatal error during GPU init
> amdgpu 0000:07:00.0: amdgpu: amdgpu: finishing device.
> amdgpu: probe of 0000:07:00.0 failed with error -12”
>
> We have not been able to figure it out so far. Do you have any advice
> on how to identify the root cause and fix it?
>
> Kind regards,
> Thomas Perrot
>
On 15.01.24 at 11:17, Thomas Perrot wrote:
> Hello Christian,
>
> On Fri, 2024-01-12 at 09:17 +0100, Christian König wrote:
>> Well the driver load is interrupted for some reason.
>>
>> Have you set any timeout for modprobe?
>>
> We don't set a modprobe timeout.
Well, the driver probe is somehow being aborted.
This looks like an external event, not something the driver can
influence.
Regards,
Christian.
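One possible place to look for such an external event, assuming the module is loaded by systemd-udevd (which this thread does not state), is the udev event timeout. The sketch below only checks whether a timeout was set explicitly on the kernel command line through udev.event_timeout= or rd.udev.event_timeout=. The function name udev_timeout_params is hypothetical, and an empty result does not rule out a built-in default timeout.

```python
def udev_timeout_params(cmdline):
    """Return udev event-timeout settings found in a kernel command line.

    Hypothetical helper for illustration; it only inspects the string it
    is given and knows nothing about defaults or udev.conf.
    """
    wanted = ("udev.event_timeout", "rd.udev.event_timeout")
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in wanted:
            params[key] = value
    return params


# Example with a made-up command line; on a real system the string
# would come from reading /proc/cmdline.
print(udev_timeout_params("root=/dev/nvme0n1p2 udev.event_timeout=30 quiet"))
```

If a short timeout shows up here, raising it and retrying the probe would be one way to test whether the interruption comes from outside the driver.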
Hello Christian,
On Fri, 2024-01-12 at 09:17 +0100, Christian König wrote:
> Well the driver load is interrupted for some reason.
>
> Have you set any timeout for modprobe?
>
We don't set a modprobe timeout.
Kind regards,
Thomas