2019-08-07 12:52:08

by Kai-Heng Feng

Subject: [Regression] "drm/amdgpu: enable gfxoff again on raven series (v2)"

Hi,

After commit 005440066f92 ("drm/amdgpu: enable gfxoff again on raven series
(v2)"), browsers on Raven Ridge systems cause serious corruption like this:
https://launchpadlibrarian.net/436319772/Screenshot%20from%202019-08-07%2004-20-34.png

The firmware for Raven Ridge is up to date.

Kai-Heng


2019-08-07 17:51:30

by Kai-Heng Feng

Subject: Re: [Regression] "drm/amdgpu: enable gfxoff again on raven series (v2)"

Hi Ray,

at 00:03, Huang, Ray <[email protected]> wrote:

> May I know the all firmware version in your system?

# cat amdgpu_firmware_info
VCE feature version: 0, firmware version: 0x00000000
UVD feature version: 0, firmware version: 0x00000000
MC feature version: 0, firmware version: 0x00000000
ME feature version: 40, firmware version: 0x00000099
PFP feature version: 40, firmware version: 0x000000ae
CE feature version: 40, firmware version: 0x0000004d
RLC feature version: 1, firmware version: 0x00000213
RLC SRLC feature version: 1, firmware version: 0x00000001
RLC SRLG feature version: 1, firmware version: 0x00000001
RLC SRLS feature version: 1, firmware version: 0x00000001
MEC feature version: 40, firmware version: 0x0000018b
MEC2 feature version: 40, firmware version: 0x0000018b
SOS feature version: 0, firmware version: 0x00000000
ASD feature version: 0, firmware version: 0x001ad4d4
TA XGMI feature version: 0, firmware version: 0x00000000
TA RAS feature version: 0, firmware version: 0x00000000
SMC feature version: 0, firmware version: 0x00001e44
SDMA0 feature version: 41, firmware version: 0x000000a9
VCN feature version: 0, firmware version: 0x0110901c
DMCU feature version: 0, firmware version: 0x00000000
VBIOS version: 113-RAVEN-103
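
A listing like the one above is easier to compare across machines once it is
parsed into a table. The following is only a minimal sketch, not part of the
original report: the function names are hypothetical, the sample text is a
subset of the lines above, and the file location is an assumption (it is
normally exposed through debugfs, for example under
/sys/kernel/debug/dri/0/ when the GPU is DRM device 0).

```python
import re

# A few lines copied from the listing above, used as sample input.
SAMPLE = """\
ME feature version: 40, firmware version: 0x00000099
RLC feature version: 1, firmware version: 0x00000213
RLC SRLC feature version: 1, firmware version: 0x00000001
SOS feature version: 0, firmware version: 0x00000000
SMC feature version: 0, firmware version: 0x00001e44
VBIOS version: 113-RAVEN-103
"""

# Matches "<name> feature version: <n>, firmware version: 0x<hex>".
LINE_RE = re.compile(
    r"^(?P<name>.+?) feature version: (?P<feature>\d+), "
    r"firmware version: (?P<fw>0x[0-9a-fA-F]+)$"
)


def parse_firmware_info(text):
    """Return {block name: (feature version, firmware version)}.

    Lines that do not follow the pattern (such as the VBIOS line)
    are skipped.
    """
    info = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            info[match.group("name")] = (
                int(match.group("feature")),
                int(match.group("fw"), 16),
            )
    return info


def loaded_blocks(info):
    """Return the sorted names of blocks with a non-zero firmware version."""
    return sorted(name for name, (_, fw) in info.items() if fw != 0)
```

Calling parse_firmware_info() on the full file contents and printing
loaded_blocks() gives a short list that is convenient to paste into a bug
report.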

Kai-Heng

>
> Thanks,
> Ray
>
> From: Kai-Heng Feng <[email protected]>
> Sent: Wednesday, August 7, 2019 8:50 PM
> To: Huang, Ray
> Cc: Deucher, Alexander; Koenig, Christian; Zhou, David(ChunMing); amd-gfx
> list; [email protected]; LKML; Anthony Wong
> Subject: [Regression] "drm/amdgpu: enable gfxoff again on raven series
> (v2)"
>
> Hi,
>
> After commit 005440066f92 ("drm/amdgpu: enable gfxoff again on raven series
> (v2)”), browsers on Raven Ridge systems cause serious corruption like this:
> https://launchpadlibrarian.net/436319772/Screenshot%20from%202019-08-07%2004-20-34.png
>
> Firmwares for Raven Ridge is up-to-date.
>
> Kai-Heng


2019-08-08 06:31:06

by Huang Rui

Subject: RE: [Regression] "drm/amdgpu: enable gfxoff again on raven series (v2)"

> -----Original Message-----
> From: Kai-Heng Feng <[email protected]>
> Sent: Thursday, August 08, 2019 1:45 AM
> To: Huang, Ray <[email protected]>
> Cc: Deucher, Alexander <[email protected]>; Koenig, Christian
> <[email protected]>; Zhou, David(ChunMing)
> <[email protected]>; amd-gfx list <[email protected]>;
> [email protected]; LKML <[email protected]>;
> Anthony Wong <[email protected]>
> Subject: Re: [Regression] "drm/amdgpu: enable gfxoff again on raven series
> (v2)"
>
> Hi Ray,
>
> at 00:03, Huang, Ray <[email protected]> wrote:
>
> > May I know the all firmware version in your system?

This seems to be the issue we encountered with the IOMMU enabled. Could you please disable the IOMMU in the SBIOS or via GRUB?

Thanks,
Ray

>
> # cat amdgpu_firmware_info
> VCE feature version: 0, firmware version: 0x00000000
> UVD feature version: 0, firmware version: 0x00000000
> MC feature version: 0, firmware version: 0x00000000
> ME feature version: 40, firmware version: 0x00000099
> PFP feature version: 40, firmware version: 0x000000ae
> CE feature version: 40, firmware version: 0x0000004d
> RLC feature version: 1, firmware version: 0x00000213
> RLC SRLC feature version: 1, firmware version: 0x00000001
> RLC SRLG feature version: 1, firmware version: 0x00000001
> RLC SRLS feature version: 1, firmware version: 0x00000001
> MEC feature version: 40, firmware version: 0x0000018b
> MEC2 feature version: 40, firmware version: 0x0000018b
> SOS feature version: 0, firmware version: 0x00000000
> ASD feature version: 0, firmware version: 0x001ad4d4
> TA XGMI feature version: 0, firmware version: 0x00000000
> TA RAS feature version: 0, firmware version: 0x00000000
> SMC feature version: 0, firmware version: 0x00001e44
> SDMA0 feature version: 41, firmware version: 0x000000a9
> VCN feature version: 0, firmware version: 0x0110901c
> DMCU feature version: 0, firmware version: 0x00000000
> VBIOS version: 113-RAVEN-103
>
> Kai-Heng
>
> >
> > Thanks,
> > Ray
> >
> > From: Kai-Heng Feng <[email protected]>
> > Sent: Wednesday, August 7, 2019 8:50 PM
> > To: Huang, Ray
> > Cc: Deucher, Alexander; Koenig, Christian; Zhou, David(ChunMing); amd-gfx
> > list; [email protected]; LKML; Anthony Wong
> > Subject: [Regression] "drm/amdgpu: enable gfxoff again on raven series
> > (v2)"
> >
> > Hi,
> >
> > After commit 005440066f92 ("drm/amdgpu: enable gfxoff again on raven series
> > (v2)"), browsers on Raven Ridge systems cause serious corruption like this:
> > https://launchpadlibrarian.net/436319772/Screenshot%20from%202019-08-07%2004-20-34.png
> >
> > Firmwares for Raven Ridge is up-to-date.
> >
> > Kai-Heng
>

2019-08-08 06:49:25

by Kai-Heng Feng

Subject: Re: [Regression] "drm/amdgpu: enable gfxoff again on raven series (v2)"

at 14:29, Huang, Ray <[email protected]> wrote:

>> -----Original Message-----
>> From: Kai-Heng Feng <[email protected]>
>> Sent: Thursday, August 08, 2019 1:45 AM
>> To: Huang, Ray <[email protected]>
>> Cc: Deucher, Alexander <[email protected]>; Koenig, Christian
>> <[email protected]>; Zhou, David(ChunMing)
>> <[email protected]>; amd-gfx list <[email protected]>;
>> [email protected]; LKML <[email protected]>;
>> Anthony Wong <[email protected]>
>> Subject: Re: [Regression] "drm/amdgpu: enable gfxoff again on raven series
>> (v2)"
>>
>> Hi Ray,
>>
>> at 00:03, Huang, Ray <[email protected]> wrote:
>>
>>> May I know the all firmware version in your system?
>
> Seems to the issue we encountered with IOMMU enabled. Could you please
> disable iommu in SBIOS or GRUB?

Yes, "amd_iommu=off" works around the issue.
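
When testing a workaround like this, it helps to confirm that the running
kernel actually booted with the parameter. The following is a minimal sketch
and not part of the original exchange: the helper names are hypothetical, and
the simple whitespace split ignores quoted values that contain spaces.

```python
def kernel_params(cmdline):
    """Split a kernel command line into {name: value}.

    Bare flags such as "quiet" map to None. Quoted values containing
    spaces are not handled by this simple split.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def amd_iommu_disabled(cmdline):
    """Return True if the command line contains amd_iommu=off."""
    return kernel_params(cmdline).get("amd_iommu") == "off"


def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

On a Linux system, amd_iommu_disabled(read_cmdline()) reports whether the
workaround is active. On GRUB-based systems the parameter is usually added to
GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub and the configuration is then
regenerated; the exact steps vary by distribution.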

Kai-Heng

>
> Thanks,
> Ray
>
>> # cat amdgpu_firmware_info
>> VCE feature version: 0, firmware version: 0x00000000
>> UVD feature version: 0, firmware version: 0x00000000
>> MC feature version: 0, firmware version: 0x00000000
>> ME feature version: 40, firmware version: 0x00000099
>> PFP feature version: 40, firmware version: 0x000000ae
>> CE feature version: 40, firmware version: 0x0000004d
>> RLC feature version: 1, firmware version: 0x00000213
>> RLC SRLC feature version: 1, firmware version: 0x00000001
>> RLC SRLG feature version: 1, firmware version: 0x00000001
>> RLC SRLS feature version: 1, firmware version: 0x00000001
>> MEC feature version: 40, firmware version: 0x0000018b
>> MEC2 feature version: 40, firmware version: 0x0000018b
>> SOS feature version: 0, firmware version: 0x00000000
>> ASD feature version: 0, firmware version: 0x001ad4d4
>> TA XGMI feature version: 0, firmware version: 0x00000000
>> TA RAS feature version: 0, firmware version: 0x00000000
>> SMC feature version: 0, firmware version: 0x00001e44
>> SDMA0 feature version: 41, firmware version: 0x000000a9
>> VCN feature version: 0, firmware version: 0x0110901c
>> DMCU feature version: 0, firmware version: 0x00000000
>> VBIOS version: 113-RAVEN-103
>>
>> Kai-Heng
>>
>>> Thanks,
>>> Ray
>>>
>>> From: Kai-Heng Feng <[email protected]>
>>> Sent: Wednesday, August 7, 2019 8:50 PM
>>> To: Huang, Ray
>>> Cc: Deucher, Alexander; Koenig, Christian; Zhou, David(ChunMing); amd-gfx
>>> list; [email protected]; LKML; Anthony Wong
>>> Subject: [Regression] "drm/amdgpu: enable gfxoff again on raven series
>>> (v2)"
>>>
>>> Hi,
>>>
>>> After commit 005440066f92 ("drm/amdgpu: enable gfxoff again on raven series
>>> (v2)"), browsers on Raven Ridge systems cause serious corruption like
>>> this:
>>> https://launchpadlibrarian.net/436319772/Screenshot%20from%202019-08-07%2004-20-34.png
>>>
>>> Firmwares for Raven Ridge is up-to-date.
>>>
>>> Kai-Heng


2019-08-08 08:16:07

by Huang Rui

Subject: RE: [Regression] "drm/amdgpu: enable gfxoff again on raven series (v2)"

> -----Original Message-----
> From: Michel Dänzer <[email protected]>
> Sent: Thursday, August 08, 2019 4:10 PM
> To: Huang, Ray <[email protected]>; Kai-Heng Feng
> <[email protected]>
> Cc: Zhou, David(ChunMing) <[email protected]>;
> LKML <linux-[email protected]>; [email protected];
> Anthony Wong <[email protected]>;
> amd-gfx list <amd-[email protected]>;
> Deucher, Alexander <[email protected]>;
> Koenig, Christian <[email protected]>
> Subject: Re: [Regression] "drm/amdgpu: enable gfxoff again on raven series
> (v2)"
>
> On 2019-08-08 8:29 a.m., Huang, Ray wrote:
> >> From: Kai-Heng Feng <[email protected]>
> >> at 00:03, Huang, Ray <[email protected]> wrote:
> >>
> >>> May I know the all firmware version in your system?
> >
> > Seems to the issue we encountered with IOMMU enabled. Could you
> > please disable iommu in SBIOS or GRUB?
>
> The driver needs to work with the IOMMU enabled. If nothing else, ROCm
> only works with IOMMU I think.
>

Yes, ROCm on APUs requires IOMMU v2. For now, I am asking Kai-Heng to run some tests to confirm that this is the same issue we encountered before. (+ Marek)

Thanks,
Ray

>
> --
> Earthling Michel Dänzer | https://www.amd.com
> Libre software enthusiast | Mesa and X developer

2019-08-08 08:17:32

by Michel Dänzer

Subject: Re: [Regression] "drm/amdgpu: enable gfxoff again on raven series (v2)"

On 2019-08-08 8:29 a.m., Huang, Ray wrote:
>> From: Kai-Heng Feng <[email protected]>
>> at 00:03, Huang, Ray <[email protected]> wrote:
>>
>>> May I know the all firmware version in your system?
>
> Seems to the issue we encountered with IOMMU enabled. Could you please disable iommu in SBIOS or GRUB?

The driver needs to work with the IOMMU enabled. If nothing else, ROCm
only works with the IOMMU, I think.


--
Earthling Michel Dänzer | https://www.amd.com
Libre software enthusiast | Mesa and X developer