2020-06-11 23:07:58

by Alex Xu (Hello71)

Subject: AMD IOMMU + SME + amdgpu regression

Hi,

amdgpu + IOMMU + SME is now working for me on 5.7, yay! But, it is
broken on torvalds master, boo. On boot, depending on which exact commit
I test, it either hangs immediately (with built-in driver, before
starting initramfs), displays some errors then hangs, or spams the
screen with many amdgpu errors.

I bisected the black screen hang to:

commit dce8d6964ebdb333383bacf5e7ab8c27df151218
Author: Joerg Roedel <[email protected]>
Date: Wed Apr 29 15:36:53 2020 +0200

    iommu/amd: Convert to probe/release_device() call-backs

    Convert the AMD IOMMU Driver to use the probe_device() and
    release_device() call-backs of iommu_ops, so that the iommu core code
    does the group and sysfs setup.

    Signed-off-by: Joerg Roedel <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Joerg Roedel <[email protected]>

Testing torvalds master (623f6dc593) with the containing merge
(98bdc74b36) plus the DMA mapping merge (4e94d08734) reverted allows
amdgpu + IOMMU + SME to once again work.

I think that nobody is really working on amdgpu + SME, but it would be a
shame if it was supported and then incidentally broken by a small
change.

I am using an ASRock B450 Pro4 with Ryzen 1600 and ASUS RX 480. I don't
understand this code at all, but let me know what I can do to
troubleshoot.

Thanks,
Alex.
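
For anyone who wants to repeat the revert test described in the message above, the following is a minimal sketch and not the reporter's actual procedure. It only prints the git commands instead of running them. It assumes a clone of torvalds/linux in the current directory. The `-m 1` parent choice and the order of the two reverts are assumptions, since the message does not say how the reverts were done, and either revert may need manual conflict resolution. The function name `plan_revert` is hypothetical.

```shell
#!/bin/sh
# Sketch only: print the git commands for the revert test.
# Commit IDs are taken from the report; everything else is an assumption.

BASE=623f6dc593          # torvalds master commit tested in the report
IOMMU_MERGE=98bdc74b36   # merge containing dce8d6964ebd (per the report)
DMA_MERGE=4e94d08734     # DMA mapping merge (per the report)

# Hypothetical helper: emits one command per line, runs nothing.
plan_revert() {
    echo "git checkout --detach $BASE"
    # Assumption: parent 1 is the mainline side of each merge,
    # and the DMA mapping merge is reverted first.
    echo "git revert --no-edit -m 1 $DMA_MERGE"
    echo "git revert --no-edit -m 1 $IOMMU_MERGE"
}

plan_revert
```

Review the printed commands before running them by hand in the kernel tree; the resulting kernel then has to be built and booted with SME and the IOMMU enabled to compare against the report.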


2020-06-22 14:31:16

by Joerg Roedel

Subject: Re: AMD IOMMU + SME + amdgpu regression

Hi Alex,

On Thu, Jun 11, 2020 at 07:05:21PM -0400, Alex Xu (Hello71) wrote:
> I am using an ASRock B450 Pro4 with Ryzen 1600 and ASUS RX 480. I don't
> understand this code at all, but let me know what I can do to
> troubleshoot.

Does it boot without SME enabled?


Regards,

Joerg

2020-06-22 15:32:35

by Alex Xu (Hello71)

Subject: Re: AMD IOMMU + SME + amdgpu regression

Excerpts from Joerg Roedel's message of June 22, 2020 6:02 am:
> Hi Alex,
>
> On Thu, Jun 11, 2020 at 07:05:21PM -0400, Alex Xu (Hello71) wrote:
>> I am using an ASRock B450 Pro4 with Ryzen 1600 and ASUS RX 480. I don't
>> understand this code at all, but let me know what I can do to
>> troubleshoot.
>
> Does it boot without SME enabled?
>
>
> Regards,
>
> Joerg
>

Yes, it works with SME off with dbed452a078 ("dma-pool: decouple
DMA_REMAP from DMA_COHERENT_POOL") applied.

2020-07-15 09:34:09

by Joerg Roedel

Subject: Re: AMD IOMMU + SME + amdgpu regression

On Mon, Jun 22, 2020 at 11:30:04AM -0400, Alex Xu (Hello71) wrote:
> Yes, it works with SME off with dbed452a078 ("dma-pool: decouple
> DMA_REMAP from DMA_COHERENT_POOL") applied.

Okay, I can reproduce the problem on my Ryzen system, and the boot log
shows various warnings/bugs from the amdgpu driver. I think this should
be looked at by the AMDGPU folks first, as I didn't really get far
looking into the GPU driver's code.

Regards,

Joerg