2021-10-25 10:30:44

by Paul Menzel

Subject: I got an IOMMU IO page fault. What to do now?

Dear Linux folks,


On a Dell OptiPlex 5055, Linux 5.10.24 logged the IOMMU messages below.
(GPU hang in amdgpu issue #1762 [1] might be related.)

$ lspci -nn -s 05:00.0
05:00.0 VGA compatible controller [0300]: Advanced Micro Devices,
Inc. [AMD/ATI] Oland [Radeon HD 8570 / R7 240/340 OEM] [1002:6611] (rev 87)
$ dmesg
[…]
[6318399.745242] amdgpu 0000:05:00.0: AMD-Vi: Event logged
[IO_PAGE_FAULT domain=0x000c address=0xfffffff0c0 flags=0x0020]
[6318399.757283] amdgpu 0000:05:00.0: AMD-Vi: Event logged
[IO_PAGE_FAULT domain=0x000c address=0xfffffff7c0 flags=0x0020]
[6318399.769154] amdgpu 0000:05:00.0: AMD-Vi: Event logged
[IO_PAGE_FAULT domain=0x000c address=0xffffffe0c0 flags=0x0020]
[6318399.780913] amdgpu 0000:05:00.0: AMD-Vi: Event logged
[IO_PAGE_FAULT domain=0x000c address=0xfffffffec0 flags=0x0020]
[6318399.792734] amdgpu 0000:05:00.0: AMD-Vi: Event logged
[IO_PAGE_FAULT domain=0x000c address=0xffffffe5c0 flags=0x0020]
[6318399.804309] amdgpu 0000:05:00.0: AMD-Vi: Event logged
[IO_PAGE_FAULT domain=0x000c address=0xffffffd0c0 flags=0x0020]
[6318399.816091] amdgpu 0000:05:00.0: AMD-Vi: Event logged
[IO_PAGE_FAULT domain=0x000c address=0xffffffecc0 flags=0x0020]
[6318399.827407] amdgpu 0000:05:00.0: AMD-Vi: Event logged
[IO_PAGE_FAULT domain=0x000c address=0xffffffd3c0 flags=0x0020]
[6318399.838708] amdgpu 0000:05:00.0: AMD-Vi: Event logged
[IO_PAGE_FAULT domain=0x000c address=0xffffffc0c0 flags=0x0020]
[6318399.850029] amdgpu 0000:05:00.0: AMD-Vi: Event logged
[IO_PAGE_FAULT domain=0x000c address=0xffffffdac0 flags=0x0020]
[6318399.861311] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0
domain=0x000c address=0xffffffc1c0 flags=0x0020]
[6318399.872044] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0
domain=0x000c address=0xffffffc8c0 flags=0x0020]
[6318399.882797] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0
domain=0x000c address=0xffffffb0c0 flags=0x0020]
[6318399.893655] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0
domain=0x000c address=0xffffffcfc0 flags=0x0020]
[6318399.904445] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0
domain=0x000c address=0xffffffb6c0 flags=0x0020]
[6318399.915222] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0
domain=0x000c address=0xffffffa0c0 flags=0x0020]
[6318399.925931] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0
domain=0x000c address=0xffffffbdc0 flags=0x0020]
[6318399.936691] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0
domain=0x000c address=0xffffffa4c0 flags=0x0020]
[6318399.947479] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0
domain=0x000c address=0xffffff90c0 flags=0x0020]
[6318399.958270] AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0
domain=0x000c address=0xffffffabc0 flags=0x0020]

As this is not reproducible, how should debugging proceed? (The system
was rebooted in the meantime.) What options should be enabled so that
the required information is logged next time, or what commands should I
execute while the system is still in that state, so that the bug (driver,
userspace, …) can be pinpointed and fixed?


Kind regards,

Paul


[1]: https://gitlab.freedesktop.org/drm/amd/-/issues/1762
"Oland [Radeon HD 8570 / R7 240/340 OEM]: GPU hang"


2021-10-25 13:28:18

by Christian König

Subject: Re: I got an IOMMU IO page fault. What to do now?

Hi Paul,

I am not sure how the IOMMU gives out addresses, but the printed ones
look suspicious to me. It looks as if we were using an invalid address
like -1 or similar.

Can you try that on an up-to-date kernel as well? Ideally the
bleeding-edge amd-staging-drm-next branch from Alex's repository.

Regards,
Christian.

On 25.10.21 at 12:25, Paul Menzel wrote:
> Dear Linux folks,
>
>
> On a Dell OptiPlex 5055, Linux 5.10.24 logged the IOMMU messages
> below. (GPU hang in amdgpu issue #1762 [1] might be related.)
>
>     $ lspci -nn -s 05:00.0
>     05:00.0 VGA compatible controller [0300]: Advanced Micro Devices,
> Inc. [AMD/ATI] Oland [Radeon HD 8570 / R7 240/340 OEM] [1002:6611]
> (rev 87)
>     $ dmesg
>     […]
>     [6318399.745242] amdgpu 0000:05:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x000c address=0xfffffff0c0 flags=0x0020]
>     [6318399.757283] amdgpu 0000:05:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x000c address=0xfffffff7c0 flags=0x0020]
>     [6318399.769154] amdgpu 0000:05:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x000c address=0xffffffe0c0 flags=0x0020]
>     [6318399.780913] amdgpu 0000:05:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x000c address=0xfffffffec0 flags=0x0020]
>     [6318399.792734] amdgpu 0000:05:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x000c address=0xffffffe5c0 flags=0x0020]
>     [6318399.804309] amdgpu 0000:05:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x000c address=0xffffffd0c0 flags=0x0020]
>     [6318399.816091] amdgpu 0000:05:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x000c address=0xffffffecc0 flags=0x0020]
>     [6318399.827407] amdgpu 0000:05:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x000c address=0xffffffd3c0 flags=0x0020]
>     [6318399.838708] amdgpu 0000:05:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x000c address=0xffffffc0c0 flags=0x0020]
>     [6318399.850029] amdgpu 0000:05:00.0: AMD-Vi: Event logged
> [IO_PAGE_FAULT domain=0x000c address=0xffffffdac0 flags=0x0020]
>     [6318399.861311] AMD-Vi: Event logged [IO_PAGE_FAULT
> device=05:00.0 domain=0x000c address=0xffffffc1c0 flags=0x0020]
>     [6318399.872044] AMD-Vi: Event logged [IO_PAGE_FAULT
> device=05:00.0 domain=0x000c address=0xffffffc8c0 flags=0x0020]
>     [6318399.882797] AMD-Vi: Event logged [IO_PAGE_FAULT
> device=05:00.0 domain=0x000c address=0xffffffb0c0 flags=0x0020]
>     [6318399.893655] AMD-Vi: Event logged [IO_PAGE_FAULT
> device=05:00.0 domain=0x000c address=0xffffffcfc0 flags=0x0020]
>     [6318399.904445] AMD-Vi: Event logged [IO_PAGE_FAULT
> device=05:00.0 domain=0x000c address=0xffffffb6c0 flags=0x0020]
>     [6318399.915222] AMD-Vi: Event logged [IO_PAGE_FAULT
> device=05:00.0 domain=0x000c address=0xffffffa0c0 flags=0x0020]
>     [6318399.925931] AMD-Vi: Event logged [IO_PAGE_FAULT
> device=05:00.0 domain=0x000c address=0xffffffbdc0 flags=0x0020]
>     [6318399.936691] AMD-Vi: Event logged [IO_PAGE_FAULT
> device=05:00.0 domain=0x000c address=0xffffffa4c0 flags=0x0020]
>     [6318399.947479] AMD-Vi: Event logged [IO_PAGE_FAULT
> device=05:00.0 domain=0x000c address=0xffffff90c0 flags=0x0020]
>     [6318399.958270] AMD-Vi: Event logged [IO_PAGE_FAULT
> device=05:00.0 domain=0x000c address=0xffffffabc0 flags=0x0020]
>
> As this is not reproducible, how would debugging go? (The system was
> rebooted in the meantime.) What options should be enabled, that next
> time the required information is logged, or what commands should I
> execute when the system is still in that state, so the bug (driver,
> userspace, …) can be pinpointed and fixed?
>
>
> Kind regards,
>
> Paul
>
>
> [1]: https://gitlab.freedesktop.org/drm/amd/-/issues/1762
>      "Oland [Radeon HD 8570 / R7 240/340 OEM]: GPU hang"

2021-10-25 13:28:49

by Paul Menzel

Subject: Re: I got an IOMMU IO page fault. What to do now?

Dear Christian,


Thank you for your reply.


On 25.10.21 13:23, Christian König wrote:

> not sure how the IOMMU gives out addresses, but the printed ones look
> suspicious to me. Something like we are using an invalid address like -1
> or similar.
>
> Can you try that on an up to date kernel as well? E.g. ideally bleeding
> edge amd-staging-drm-next from Alex repository.

These are production desktops, so I’d need to talk to the user.
Currently, Linux 5.10.70 is running.


Kind regards,

Paul

2021-10-25 16:06:06

by Robin Murphy

Subject: Re: I got an IOMMU IO page fault. What to do now?

On 2021-10-25 12:23, Christian König wrote:
> Hi Paul,
>
> not sure how the IOMMU gives out addresses, but the printed ones look
> suspicious to me. Something like we are using an invalid address like -1
> or similar.

FWIW those look like believable DMA addresses to me, assuming that the
DMA mapping APIs are being backed by iommu_dma_ops and the device has a
40-bit DMA mask, since the IOVA allocator works top-down.

Likely causes are either a race where the dma_unmap_*() call happens
before the hardware has really stopped accessing the relevant addresses,
or the device's DMA mask has been set larger than it should be, and thus
the upper bits have been truncated in the round-trip through the hardware.

Given the addresses involved, my suspicions would initially lean towards
the latter case - the faults are in the very topmost pages which imply
they're the first things mapped in that range. The other contributing
factor being the trick that the IOVA allocator plays for PCI devices,
where it tries to prefer 32-bit addresses. Thus you're only likely to
see this happen once you already have ~3.5-4GB of live DMA-mapped memory
to exhaust the 32-bit IOVA space (minus some reserved areas) and start
allocating from the full DMA mask. You should be able to check that with
a 5.13 or newer kernel by booting with "iommu.forcedac=1" and seeing if
it breaks immediately (unfortunately with an older kernel you'd have to
manually hack iommu_dma_alloc_iova() to the same effect).
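
Just to make that concrete, here is a toy model of the allocation
policy described above (purely illustrative user-space code, not the
real IOVA allocator):

#include <stdio.h>
#include <stdint.h>

/* Toy model: PCI devices are first served top-down from the 32-bit
 * space; only once that is exhausted does allocation fall back to the
 * top of the full DMA mask. */
static uint64_t toy_alloc_iova(uint64_t *next32, uint64_t *next_full,
			       uint64_t size)
{
	if (*next32 >= size) {		/* still room below 4 GiB */
		*next32 -= size;
		return *next32;
	}
	*next_full -= size;		/* 32-bit space exhausted */
	return *next_full;
}

int main(void)
{
	uint64_t next32 = 1ULL << 32;	 /* top of the 32-bit space */
	uint64_t next_full = 1ULL << 40; /* top of a 40-bit DMA mask */
	uint64_t iova = 0;
	int i;

	/* map 1 MiB buffers until ~4 GiB of IOVA space is in use */
	for (i = 0; i < 4097; i++)
		iova = toy_alloc_iova(&next32, &next_full, 1 << 20);

	printf("first IOVA above 32 bits: %#llx\n",
	       (unsigned long long)iova);
	/* prints 0xfffff00000, i.e. the very top of the 40-bit range,
	 * which is where the faulting addresses in the log live */
	return 0;
}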

Robin.

> Can you try that on an up to date kernel as well? E.g. ideally bleeding
> edge amd-staging-drm-next from Alex repository.
>
> Regards,
> Christian.
>
> On 25.10.21 at 12:25, Paul Menzel wrote:
>> Dear Linux folks,
>>
>>
>> On a Dell OptiPlex 5055, Linux 5.10.24 logged the IOMMU messages
>> below. (GPU hang in amdgpu issue #1762 [1] might be related.)
>>
>>     $ lspci -nn -s 05:00.0
>>     05:00.0 VGA compatible controller [0300]: Advanced Micro Devices,
>> Inc. [AMD/ATI] Oland [Radeon HD 8570 / R7 240/340 OEM] [1002:6611]
>> (rev 87)
>>     $ dmesg
>>     […]
>>     [6318399.745242] amdgpu 0000:05:00.0: AMD-Vi: Event logged
>> [IO_PAGE_FAULT domain=0x000c address=0xfffffff0c0 flags=0x0020]
>>     [6318399.757283] amdgpu 0000:05:00.0: AMD-Vi: Event logged
>> [IO_PAGE_FAULT domain=0x000c address=0xfffffff7c0 flags=0x0020]
>>     [6318399.769154] amdgpu 0000:05:00.0: AMD-Vi: Event logged
>> [IO_PAGE_FAULT domain=0x000c address=0xffffffe0c0 flags=0x0020]
>>     [6318399.780913] amdgpu 0000:05:00.0: AMD-Vi: Event logged
>> [IO_PAGE_FAULT domain=0x000c address=0xfffffffec0 flags=0x0020]
>>     [6318399.792734] amdgpu 0000:05:00.0: AMD-Vi: Event logged
>> [IO_PAGE_FAULT domain=0x000c address=0xffffffe5c0 flags=0x0020]
>>     [6318399.804309] amdgpu 0000:05:00.0: AMD-Vi: Event logged
>> [IO_PAGE_FAULT domain=0x000c address=0xffffffd0c0 flags=0x0020]
>>     [6318399.816091] amdgpu 0000:05:00.0: AMD-Vi: Event logged
>> [IO_PAGE_FAULT domain=0x000c address=0xffffffecc0 flags=0x0020]
>>     [6318399.827407] amdgpu 0000:05:00.0: AMD-Vi: Event logged
>> [IO_PAGE_FAULT domain=0x000c address=0xffffffd3c0 flags=0x0020]
>>     [6318399.838708] amdgpu 0000:05:00.0: AMD-Vi: Event logged
>> [IO_PAGE_FAULT domain=0x000c address=0xffffffc0c0 flags=0x0020]
>>     [6318399.850029] amdgpu 0000:05:00.0: AMD-Vi: Event logged
>> [IO_PAGE_FAULT domain=0x000c address=0xffffffdac0 flags=0x0020]
>>     [6318399.861311] AMD-Vi: Event logged [IO_PAGE_FAULT
>> device=05:00.0 domain=0x000c address=0xffffffc1c0 flags=0x0020]
>>     [6318399.872044] AMD-Vi: Event logged [IO_PAGE_FAULT
>> device=05:00.0 domain=0x000c address=0xffffffc8c0 flags=0x0020]
>>     [6318399.882797] AMD-Vi: Event logged [IO_PAGE_FAULT
>> device=05:00.0 domain=0x000c address=0xffffffb0c0 flags=0x0020]
>>     [6318399.893655] AMD-Vi: Event logged [IO_PAGE_FAULT
>> device=05:00.0 domain=0x000c address=0xffffffcfc0 flags=0x0020]
>>     [6318399.904445] AMD-Vi: Event logged [IO_PAGE_FAULT
>> device=05:00.0 domain=0x000c address=0xffffffb6c0 flags=0x0020]
>>     [6318399.915222] AMD-Vi: Event logged [IO_PAGE_FAULT
>> device=05:00.0 domain=0x000c address=0xffffffa0c0 flags=0x0020]
>>     [6318399.925931] AMD-Vi: Event logged [IO_PAGE_FAULT
>> device=05:00.0 domain=0x000c address=0xffffffbdc0 flags=0x0020]
>>     [6318399.936691] AMD-Vi: Event logged [IO_PAGE_FAULT
>> device=05:00.0 domain=0x000c address=0xffffffa4c0 flags=0x0020]
>>     [6318399.947479] AMD-Vi: Event logged [IO_PAGE_FAULT
>> device=05:00.0 domain=0x000c address=0xffffff90c0 flags=0x0020]
>>     [6318399.958270] AMD-Vi: Event logged [IO_PAGE_FAULT
>> device=05:00.0 domain=0x000c address=0xffffffabc0 flags=0x0020]
>>
>> As this is not reproducible, how would debugging go? (The system was
>> rebooted in the meantime.) What options should be enabled, that next
>> time the required information is logged, or what commands should I
>> execute when the system is still in that state, so the bug (driver,
>> userspace, …) can be pinpointed and fixed?
>>
>>
>> Kind regards,
>>
>> Paul
>>
>>
>> [1]: https://gitlab.freedesktop.org/drm/amd/-/issues/1762
>>      "Oland [Radeon HD 8570 / R7 240/340 OEM]: GPU hang"
>

2021-10-25 16:44:56

by Christian König

Subject: Re: I got an IOMMU IO page fault. What to do now?

Hi Robin,

On 25.10.21 at 18:01, Robin Murphy wrote:
> On 2021-10-25 12:23, Christian König wrote:
>> Hi Paul,
>>
>> not sure how the IOMMU gives out addresses, but the printed ones look
>> suspicious to me. Something like we are using an invalid address like
>> -1 or similar.
>
> FWIW those look like believable DMA addresses to me, assuming that the
> DMA mapping APIs are being backed iommu_dma_ops and the device has a
> 40-bit DMA mask, since the IOVA allocator works top-down.

Thanks for that information. In that case the addresses look valid to me.

> Likely causes are either a race where the dma_unmap_*() call happens
> before the hardware has really stopped accessing the relevant
> addresses, or the device's DMA mask has been set larger than it should
> be, and thus the upper bits have been truncated in the round-trip
> through the hardware.

That actually looks correct to me. The device indeed has a 40-bit DMA mask.

There is a third possibility, which is actually quite likely: stale
reads in the pipeline.

For some use cases the device can queue reads into an internal
pipeline, but when it later finds that a read isn't needed, it doesn't
flush the pipeline.

The next operation pushes more read requests into the pipeline and
eventually the stale read requests are executed as well.

Without an IOMMU the results of those reads are simply discarded, so no
harm is done. But with the IOMMU enabled it is perfectly possible that a
stale read now accesses unmapped memory -> BAM.

That's one of the reasons why we almost always have GPUs in passthrough
mode on x86 and for example don't use system memory for GPU page tables
on APUs.

Regards,
Christian.

>
> Given the addresses involved, my suspicions would initially lean
> towards the latter case - the faults are in the very topmost pages
> which imply they're the first things mapped in that range. The other
> contributing factor being the trick that the IOVA allocator plays for
> PCI devices, where it tries to prefer 32-bit addresses. Thus you're
> only likely to see this happen once you already have ~3.5-4GB of live
> DMA-mapped memory to exhaust the 32-bit IOVA space (minus some
> reserved areas) and start allocating from the full DMA mask. You
> should be able to check that with a 5.13 or newer kernel by booting
> with "iommu.forcedac=1" and seeing if it breaks immediately
> (unfortunately with an older kernel you'd have to manually hack
> iommu_dma_alloc_iova() to the same effect).
>
> Robin.
>
>> Can you try that on an up to date kernel as well? E.g. ideally
>> bleeding edge amd-staging-drm-next from Alex repository.
>>
>> Regards,
>> Christian.
>>
>> On 25.10.21 at 12:25, Paul Menzel wrote:
>>> Dear Linux folks,
>>>
>>>
>>> On a Dell OptiPlex 5055, Linux 5.10.24 logged the IOMMU messages
>>> below. (GPU hang in amdgpu issue #1762 [1] might be related.)
>>>
>>>     $ lspci -nn -s 05:00.0
>>>     05:00.0 VGA compatible controller [0300]: Advanced Micro
>>> Devices, Inc. [AMD/ATI] Oland [Radeon HD 8570 / R7 240/340 OEM]
>>> [1002:6611] (rev 87)
>>>     $ dmesg
>>>     […]
>>>     [6318399.745242] amdgpu 0000:05:00.0: AMD-Vi: Event logged
>>> [IO_PAGE_FAULT domain=0x000c address=0xfffffff0c0 flags=0x0020]
>>>     [6318399.757283] amdgpu 0000:05:00.0: AMD-Vi: Event logged
>>> [IO_PAGE_FAULT domain=0x000c address=0xfffffff7c0 flags=0x0020]
>>>     [6318399.769154] amdgpu 0000:05:00.0: AMD-Vi: Event logged
>>> [IO_PAGE_FAULT domain=0x000c address=0xffffffe0c0 flags=0x0020]
>>>     [6318399.780913] amdgpu 0000:05:00.0: AMD-Vi: Event logged
>>> [IO_PAGE_FAULT domain=0x000c address=0xfffffffec0 flags=0x0020]
>>>     [6318399.792734] amdgpu 0000:05:00.0: AMD-Vi: Event logged
>>> [IO_PAGE_FAULT domain=0x000c address=0xffffffe5c0 flags=0x0020]
>>>     [6318399.804309] amdgpu 0000:05:00.0: AMD-Vi: Event logged
>>> [IO_PAGE_FAULT domain=0x000c address=0xffffffd0c0 flags=0x0020]
>>>     [6318399.816091] amdgpu 0000:05:00.0: AMD-Vi: Event logged
>>> [IO_PAGE_FAULT domain=0x000c address=0xffffffecc0 flags=0x0020]
>>>     [6318399.827407] amdgpu 0000:05:00.0: AMD-Vi: Event logged
>>> [IO_PAGE_FAULT domain=0x000c address=0xffffffd3c0 flags=0x0020]
>>>     [6318399.838708] amdgpu 0000:05:00.0: AMD-Vi: Event logged
>>> [IO_PAGE_FAULT domain=0x000c address=0xffffffc0c0 flags=0x0020]
>>>     [6318399.850029] amdgpu 0000:05:00.0: AMD-Vi: Event logged
>>> [IO_PAGE_FAULT domain=0x000c address=0xffffffdac0 flags=0x0020]
>>>     [6318399.861311] AMD-Vi: Event logged [IO_PAGE_FAULT
>>> device=05:00.0 domain=0x000c address=0xffffffc1c0 flags=0x0020]
>>>     [6318399.872044] AMD-Vi: Event logged [IO_PAGE_FAULT
>>> device=05:00.0 domain=0x000c address=0xffffffc8c0 flags=0x0020]
>>>     [6318399.882797] AMD-Vi: Event logged [IO_PAGE_FAULT
>>> device=05:00.0 domain=0x000c address=0xffffffb0c0 flags=0x0020]
>>>     [6318399.893655] AMD-Vi: Event logged [IO_PAGE_FAULT
>>> device=05:00.0 domain=0x000c address=0xffffffcfc0 flags=0x0020]
>>>     [6318399.904445] AMD-Vi: Event logged [IO_PAGE_FAULT
>>> device=05:00.0 domain=0x000c address=0xffffffb6c0 flags=0x0020]
>>>     [6318399.915222] AMD-Vi: Event logged [IO_PAGE_FAULT
>>> device=05:00.0 domain=0x000c address=0xffffffa0c0 flags=0x0020]
>>>     [6318399.925931] AMD-Vi: Event logged [IO_PAGE_FAULT
>>> device=05:00.0 domain=0x000c address=0xffffffbdc0 flags=0x0020]
>>>     [6318399.936691] AMD-Vi: Event logged [IO_PAGE_FAULT
>>> device=05:00.0 domain=0x000c address=0xffffffa4c0 flags=0x0020]
>>>     [6318399.947479] AMD-Vi: Event logged [IO_PAGE_FAULT
>>> device=05:00.0 domain=0x000c address=0xffffff90c0 flags=0x0020]
>>>     [6318399.958270] AMD-Vi: Event logged [IO_PAGE_FAULT
>>> device=05:00.0 domain=0x000c address=0xffffffabc0 flags=0x0020]
>>>
>>> As this is not reproducible, how would debugging go? (The system was
>>> rebooted in the meantime.) What options should be enabled, that next
>>> time the required information is logged, or what commands should I
>>> execute when the system is still in that state, so the bug (driver,
>>> userspace, …) can be pinpointed and fixed?
>>>
>>>
>>> Kind regards,
>>>
>>> Paul
>>>
>>>
>>> [1]:
>>> https://gitlab.freedesktop.org/drm/amd/-/issues/1762
>>>      "Oland [Radeon HD 8570 / R7 240/340 OEM]: GPU hang"
>>

2021-10-27 21:30:23

by Alex Deucher

Subject: Re: I got an IOMMU IO page fault. What to do now?

On Wed, Oct 27, 2021 at 1:24 PM Alex Deucher <[email protected]> wrote:
>
> On Wed, Oct 27, 2021 at 1:20 PM Robin Murphy <[email protected]> wrote:
> >
> > On 27/10/2021 5:45 pm, Paul Menzel wrote:
> > > Dear Robin,
> > >
> > >
> > > On 25.10.21 18:01, Robin Murphy wrote:
> > >> On 2021-10-25 12:23, Christian König wrote:
> > >
> > >>> not sure how the IOMMU gives out addresses, but the printed ones look
> > >>> suspicious to me. Something like we are using an invalid address like
> > >>> -1 or similar.
> > >>
> > >> FWIW those look like believable DMA addresses to me, assuming that the
> > >> DMA mapping APIs are being backed iommu_dma_ops and the device has a
> > >> 40-bit DMA mask, since the IOVA allocator works top-down.
> > >>
> > >> Likely causes are either a race where the dma_unmap_*() call happens
> > >> before the hardware has really stopped accessing the relevant
> > >> addresses, or the device's DMA mask has been set larger than it should
> > >> be, and thus the upper bits have been truncated in the round-trip
> > >> through the hardware.
> > >>
> > >> Given the addresses involved, my suspicions would initially lean
> > >> towards the latter case - the faults are in the very topmost pages
> > >> which imply they're the first things mapped in that range. The other
> > >> contributing factor being the trick that the IOVA allocator plays for
> > >> PCI devices, where it tries to prefer 32-bit addresses. Thus you're
> > >> only likely to see this happen once you already have ~3.5-4GB of live
> > >> DMA-mapped memory to exhaust the 32-bit IOVA space (minus some
> > >> reserved areas) and start allocating from the full DMA mask. You
> > >> should be able to check that with a 5.13 or newer kernel by booting
> > >> with "iommu.forcedac=1" and seeing if it breaks immediately
> > >> (unfortunately with an older kernel you'd have to manually hack
> > >> iommu_dma_alloc_iova() to the same effect).
> > >
> > > I booted Linux 5.15-rc7 with `iommu.forcedac=1` and the system booted,
> > > and I could log in remotely over SSH. Please find the Linux kernel
> > > messages attached. (The system logs say lightdm failed to start, but it
> > > might be some other issue due to a change in the operating system.)
> >
> > OK, that looks like it's made the GPU blow up straight away, which is
> > what I was hoping for (and also appears to reveal another bug where it's
> > not handling probe failure very well - possibly trying to remove a
> > non-existent audio device?). Lightdm presumably fails to start because
> > it doesn't find any display devices, since amdgpu failed to probe.
> >
> > If you can boot the same kernel without "iommu.forcedac" and get a
> > successful probe and working display, that will imply that it is
> > managing to work OK with 32-bit DMA addresses, at which point I'd have
> > to leave it to Christian and Alex to figure out exactly where DMA
> > addresses are getting mangled. The only thing that stands out to me is
> > the reference to "gfx_v6_0", which makes me wonder whether it's related
> > to gmc_v6_0_sw_init() where a 44-bit DMA mask gets set. If so, that
> > would suggest that either this particular model of GPU is more limited
> > than expected, or that SoC only has 40 bits of address wired up between
> > the PCI host bridge and the IOMMU.
>
> That device only has a 40 bit DMA mask. It looks like the code is wrong there.

The attached patch should fix it.

Alex


Attachments:
0001-drm-amdgpu-gmc6-fix-DMA-mask.patch (974.00 B)
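
The attachment itself is not reproduced here; going by the discussion
above, the change is presumably a one-line adjustment in
gmc_v6_0_sw_init() along these lines (a sketch only, not the actual
patch contents):

/* drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c, gmc_v6_0_sw_init():
 * the board only handles 40 address bits, so stop claiming 44 */
r = dma_set_mask_and_coherent(adev->dev, DMA_BIT_MASK(40)); /* was 44 */
if (r)
	return r;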

2021-10-27 21:30:35

by Paul Menzel

Subject: Re: I got an IOMMU IO page fault. What to do now?

Dear Robin,


On 25.10.21 18:01, Robin Murphy wrote:
> On 2021-10-25 12:23, Christian König wrote:

>> not sure how the IOMMU gives out addresses, but the printed ones look
>> suspicious to me. Something like we are using an invalid address like
>> -1 or similar.
>
> FWIW those look like believable DMA addresses to me, assuming that the
> DMA mapping APIs are being backed iommu_dma_ops and the device has a
> 40-bit DMA mask, since the IOVA allocator works top-down.
>
> Likely causes are either a race where the dma_unmap_*() call happens
> before the hardware has really stopped accessing the relevant addresses,
> or the device's DMA mask has been set larger than it should be, and thus
> the upper bits have been truncated in the round-trip through the hardware.
>
> Given the addresses involved, my suspicions would initially lean towards
> the latter case - the faults are in the very topmost pages which imply
> they're the first things mapped in that range. The other contributing
> factor being the trick that the IOVA allocator plays for PCI devices,
> where it tries to prefer 32-bit addresses. Thus you're only likely to
> see this happen once you already have ~3.5-4GB of live DMA-mapped memory
> to exhaust the 32-bit IOVA space (minus some reserved areas) and start
> allocating from the full DMA mask. You should be able to check that with
> a 5.13 or newer kernel by booting with "iommu.forcedac=1" and seeing if
> it breaks immediately (unfortunately with an older kernel you'd have to
> manually hack iommu_dma_alloc_iova() to the same effect).

I booted Linux 5.15-rc7 with `iommu.forcedac=1`; the system came up, and
I could log in remotely over SSH. Please find the Linux kernel messages
attached. (The system logs say lightdm failed to start, but it might be
some other issue due to a change in the operating system.)

>> Can you try that on an up to date kernel as well? E.g. ideally
>> bleeding edge amd-staging-drm-next from Alex repository.


Kind regards,

Paul


Attachments:
20211027–linux-5.15-rc7–dell-optiplex-5055–iommu.forcedac.txt (64.12 kB)

2021-10-27 21:31:17

by Alex Deucher

Subject: Re: I got an IOMMU IO page fault. What to do now?

On Wed, Oct 27, 2021 at 1:20 PM Robin Murphy <[email protected]> wrote:
>
> On 27/10/2021 5:45 pm, Paul Menzel wrote:
> > Dear Robin,
> >
> >
> > On 25.10.21 18:01, Robin Murphy wrote:
> >> On 2021-10-25 12:23, Christian König wrote:
> >
> >>> not sure how the IOMMU gives out addresses, but the printed ones look
> >>> suspicious to me. Something like we are using an invalid address like
> >>> -1 or similar.
> >>
> >> FWIW those look like believable DMA addresses to me, assuming that the
> >> DMA mapping APIs are being backed iommu_dma_ops and the device has a
> >> 40-bit DMA mask, since the IOVA allocator works top-down.
> >>
> >> Likely causes are either a race where the dma_unmap_*() call happens
> >> before the hardware has really stopped accessing the relevant
> >> addresses, or the device's DMA mask has been set larger than it should
> >> be, and thus the upper bits have been truncated in the round-trip
> >> through the hardware.
> >>
> >> Given the addresses involved, my suspicions would initially lean
> >> towards the latter case - the faults are in the very topmost pages
> >> which imply they're the first things mapped in that range. The other
> >> contributing factor being the trick that the IOVA allocator plays for
> >> PCI devices, where it tries to prefer 32-bit addresses. Thus you're
> >> only likely to see this happen once you already have ~3.5-4GB of live
> >> DMA-mapped memory to exhaust the 32-bit IOVA space (minus some
> >> reserved areas) and start allocating from the full DMA mask. You
> >> should be able to check that with a 5.13 or newer kernel by booting
> >> with "iommu.forcedac=1" and seeing if it breaks immediately
> >> (unfortunately with an older kernel you'd have to manually hack
> >> iommu_dma_alloc_iova() to the same effect).
> >
> > I booted Linux 5.15-rc7 with `iommu.forcedac=1` and the system booted,
> > and I could log in remotely over SSH. Please find the Linux kernel
> > messages attached. (The system logs say lightdm failed to start, but it
> > might be some other issue due to a change in the operating system.)
>
> OK, that looks like it's made the GPU blow up straight away, which is
> what I was hoping for (and also appears to reveal another bug where it's
> not handling probe failure very well - possibly trying to remove a
> non-existent audio device?). Lightdm presumably fails to start because
> it doesn't find any display devices, since amdgpu failed to probe.
>
> If you can boot the same kernel without "iommu.forcedac" and get a
> successful probe and working display, that will imply that it is
> managing to work OK with 32-bit DMA addresses, at which point I'd have
> to leave it to Christian and Alex to figure out exactly where DMA
> addresses are getting mangled. The only thing that stands out to me is
> the reference to "gfx_v6_0", which makes me wonder whether it's related
> to gmc_v6_0_sw_init() where a 44-bit DMA mask gets set. If so, that
> would suggest that either this particular model of GPU is more limited
> than expected, or that SoC only has 40 bits of address wired up between
> the PCI host bridge and the IOMMU.

That device only has a 40-bit DMA mask. It looks like the code is wrong there.

Alex


>
> Cheers,
> Robin.

2021-10-27 21:35:34

by Robin Murphy

Subject: Re: I got an IOMMU IO page fault. What to do now?

On 27/10/2021 5:45 pm, Paul Menzel wrote:
> Dear Robin,
>
>
> On 25.10.21 18:01, Robin Murphy wrote:
>> On 2021-10-25 12:23, Christian König wrote:
>
>>> not sure how the IOMMU gives out addresses, but the printed ones look
>>> suspicious to me. Something like we are using an invalid address like
>>> -1 or similar.
>>
>> FWIW those look like believable DMA addresses to me, assuming that the
>> DMA mapping APIs are being backed iommu_dma_ops and the device has a
>> 40-bit DMA mask, since the IOVA allocator works top-down.
>>
>> Likely causes are either a race where the dma_unmap_*() call happens
>> before the hardware has really stopped accessing the relevant
>> addresses, or the device's DMA mask has been set larger than it should
>> be, and thus the upper bits have been truncated in the round-trip
>> through the hardware.
>>
>> Given the addresses involved, my suspicions would initially lean
>> towards the latter case - the faults are in the very topmost pages
>> which imply they're the first things mapped in that range. The other
>> contributing factor being the trick that the IOVA allocator plays for
>> PCI devices, where it tries to prefer 32-bit addresses. Thus you're
>> only likely to see this happen once you already have ~3.5-4GB of live
>> DMA-mapped memory to exhaust the 32-bit IOVA space (minus some
>> reserved areas) and start allocating from the full DMA mask. You
>> should be able to check that with a 5.13 or newer kernel by booting
>> with "iommu.forcedac=1" and seeing if it breaks immediately
>> (unfortunately with an older kernel you'd have to manually hack
>> iommu_dma_alloc_iova() to the same effect).
>
> I booted Linux 5.15-rc7 with `iommu.forcedac=1` and the system booted,
> and I could log in remotely over SSH. Please find the Linux kernel
> messages attached. (The system logs say lightdm failed to start, but it
> might be some other issue due to a change in the operating system.)

OK, that looks like it's made the GPU blow up straight away, which is
what I was hoping for (and also appears to reveal another bug where it's
not handling probe failure very well - possibly trying to remove a
non-existent audio device?). Lightdm presumably fails to start because
it doesn't find any display devices, since amdgpu failed to probe.

If you can boot the same kernel without "iommu.forcedac" and get a
successful probe and working display, that will imply that it is
managing to work OK with 32-bit DMA addresses, at which point I'd have
to leave it to Christian and Alex to figure out exactly where DMA
addresses are getting mangled. The only thing that stands out to me is
the reference to "gfx_v6_0", which makes me wonder whether it's related
to gmc_v6_0_sw_init() where a 44-bit DMA mask gets set. If so, that
would suggest that either this particular model of GPU is more limited
than expected, or that SoC only has 40 bits of address wired up between
the PCI host bridge and the IOMMU.
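
If a 44-bit mask is indeed being set while only 40 address bits make it
through, the logged addresses line up exactly with truncated
top-of-mask IOVAs. A quick user-space illustration of the arithmetic
(not kernel code):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	/* an IOVA near the top of a 44-bit mask, as a top-down
	 * allocator would hand out first */
	uint64_t iova = 0xffffffff0c0ULL;
	/* only the low 40 address bits survive the round-trip */
	uint64_t seen = iova & ((1ULL << 40) - 1);

	printf("allocated IOVA      : %#llx\n", (unsigned long long)iova);
	printf("truncated to 40 bits: %#llx\n", (unsigned long long)seen);
	/* prints 0xfffffff0c0, the first faulting address in the log
	 * at the top of this thread */
	return 0;
}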

Cheers,
Robin.