2021-10-07 22:02:48

by Borislav Petkov

Subject: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")

Hi folks,

commit in $Subject breaks rebooting an HP laptop here with a Carrizo
chipset: after typing "reboot" and pressing Enter, it powers off the
machine up to a certain point but the fans remain on, screen goes black
and nothing happens anymore. No reboot. I have to power it off by
holding the power key down for 4 seconds.

Reverting the patch fixes the issue.

GPU info on that machine:

[ 1.462214] [drm] amdgpu kernel modesetting enabled.
[ 1.465150] amdgpu 0000:00:01.0: vgaarb: deactivate vga console
[ 1.466259] Console: switching to colour dummy device 80x25
[ 1.466844] [drm] initializing kernel modesetting (CARRIZO 0x1002:0x9874 0x103C:0x807E 0xC4).
[ 1.467242] amdgpu 0000:00:01.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
[ 1.467552] [drm] register mmio base: 0xD0C00000
[ 1.467750] [drm] register mmio size: 262144
[ 1.467901] [drm] add ip block number 0 <vi_common>
[ 1.468067] [drm] add ip block number 1 <gmc_v8_0>
[ 1.468266] [drm] add ip block number 2 <cz_ih>
[ 1.468436] [drm] add ip block number 3 <gfx_v8_0>
[ 1.468603] [drm] add ip block number 4 <sdma_v3_0>
[ 1.468809] [drm] add ip block number 5 <powerplay>
[ 1.468975] [drm] add ip block number 6 <dm>
[ 1.469120] [drm] add ip block number 7 <uvd_v6_0>
[ 1.469282] [drm] add ip block number 8 <vce_v3_0>
[ 1.485350] [drm] BIOS signature incorrect 20 7
[ 1.485494] resource sanity check: requesting [mem 0x000c0000-0x000dffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000c3fff window]
[ 1.485922] caller pci_map_rom+0x68/0x1b0 mapping multiple BARs
[ 1.486273] amdgpu 0000:00:01.0: amdgpu: Fetched VBIOS from ROM BAR
[ 1.486488] amdgpu: ATOM BIOS: SWBRT27354.001
[ 1.486701] [drm] UVD is enabled in physical mode
[ 1.486862] [drm] VCE enabled in physical mode
[ 1.487061] [drm] vm size is 64 GB, 2 levels, block size is 10-bit, fragment size is 9-bit
[ 1.487339] amdgpu 0000:00:01.0: amdgpu: VRAM: 512M 0x000000F400000000 - 0x000000F41FFFFFFF (512M used)
[ 1.487652] amdgpu 0000:00:01.0: amdgpu: GART: 1024M 0x000000FF00000000 - 0x000000FF3FFFFFFF
[ 1.487939] [drm] Detected VRAM RAM=512M, BAR=512M
[ 1.488101] [drm] RAM width 128bits UNKNOWN
[ 1.488309] [drm] amdgpu: 512M of VRAM memory ready
[ 1.488522] [drm] amdgpu: 3072M of GTT memory ready.
[ 1.488707] [drm] GART: num cpu pages 262144, num gpu pages 262144
[ 1.488997] [drm] PCIE GART of 1024M enabled (table at 0x000000F400900000).
[ 1.491962] amdgpu: hwmgr_sw_init smu backed is smu8_smu
[ 1.492544] [drm] Found UVD firmware Version: 1.91 Family ID: 11
[ 1.492764] [drm] UVD ENC is disabled
[ 1.494177] [drm] Found VCE firmware Version: 52.4 Binary ID: 3
[ 1.495765] amdgpu: smu version 18.62.00
[ 1.501983] [drm] DM_PPLIB: values for Engine clock
[ 1.502201] [drm] DM_PPLIB: 300000
[ 1.502321] [drm] DM_PPLIB: 360000
[ 1.502441] [drm] DM_PPLIB: 423530
[ 1.502561] [drm] DM_PPLIB: 514290
[ 1.502680] [drm] DM_PPLIB: 626090
[ 1.502799] [drm] DM_PPLIB: 720000
[ 1.502919] [drm] DM_PPLIB: Validation clocks:
[ 1.503069] [drm] DM_PPLIB: engine_max_clock: 72000
[ 1.503242] [drm] DM_PPLIB: memory_max_clock: 80000
[ 1.503415] [drm] DM_PPLIB: level : 8
[ 1.503578] [drm] DM_PPLIB: values for Display clock
[ 1.503745] [drm] DM_PPLIB: 300000
[ 1.503864] [drm] DM_PPLIB: 400000
[ 1.503984] [drm] DM_PPLIB: 496560
[ 1.504147] [drm] DM_PPLIB: 626090
[ 1.504275] [drm] DM_PPLIB: 685720
[ 1.504403] [drm] DM_PPLIB: 757900
[ 1.504526] [drm] DM_PPLIB: Validation clocks:
[ 1.504678] [drm] DM_PPLIB: engine_max_clock: 72000
[ 1.504891] [drm] DM_PPLIB: memory_max_clock: 80000
[ 1.505063] [drm] DM_PPLIB: level : 8
[ 1.505225] [drm] DM_PPLIB: values for Memory clock
[ 1.505389] [drm] DM_PPLIB: 333000
[ 1.505508] [drm] DM_PPLIB: 800000
[ 1.505628] [drm] DM_PPLIB: Validation clocks:
[ 1.505777] [drm] DM_PPLIB: engine_max_clock: 72000
[ 1.505950] [drm] DM_PPLIB: memory_max_clock: 80000
[ 1.506123] [drm] DM_PPLIB: level : 8
[ 1.506375] [drm] Display Core initialized with v3.2.149!
[ 1.584817] [drm] UVD initialized successfully.
[ 1.784234] [drm] VCE initialized successfully.
[ 1.784415] amdgpu 0000:00:01.0: amdgpu: SE 1, SH per SE 1, CU per SH 8, active_cu_number 8
[ 1.787958] [drm] fb mappable at 0xA0EE4000
[ 1.788118] [drm] vram apper at 0xA0000000
[ 1.788258] [drm] size 14745600
[ 1.788367] [drm] fb depth is 24
[ 1.788503] [drm] pitch is 10240
[ 1.789198] fbcon: amdgpu (fb0) is primary device
[ 1.880014] Console: switching to colour frame buffer device 320x90
[ 1.903779] amdgpu 0000:00:01.0: [drm] fb0: amdgpu frame buffer device
[ 1.918353] [drm] Initialized amdgpu 3.42.0 20150101 for 0000:00:01.0 on minor 0

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


2021-10-08 14:48:16

by Alex Deucher

Subject: Re: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")

On Thu, Oct 7, 2021 at 2:03 PM Borislav Petkov <[email protected]> wrote:
>
> Hi folks,
>
> commit in $Subject breaks rebooting an HP laptop here with a Carrizo
> chipset: after typing "reboot" and pressing Enter, it powers off the
> machine up to a certain point but the fans remain on, screen goes black
> and nothing happens anymore. No reboot. I have to power it off by
> holding the power key down for 4 seconds.
>
> Reverting the patch fixes the issue.

@Quan, Evan any ideas? I don't have a CZ system handy at the moment.
Worse comes to worst we could just wrap the changes in an asic_type
check or !APU check.

Alex
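
As a rough illustration of the fallback mentioned above (wrapping the new cleanup in an asic_type check or a !APU check), here is a minimal, self-contained C sketch. Everything prefixed with mock_ or MOCK_ is a hypothetical stand-in invented for this example and is not the amdgpu driver's real definition; the two predicate functions are likewise made up. The real driver is believed to expose the equivalent information through adev->asic_type and the AMD_IS_APU flag, but that mapping is an assumption here, and this is not the fix that was eventually merged.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's ASIC enum and APU flag. */
enum mock_asic_type {
	MOCK_CHIP_CARRIZO,
	MOCK_CHIP_POLARIS12,
};

#define MOCK_IS_APU (1u << 0)

/* Hypothetical, heavily reduced device structure. */
struct mock_adev {
	enum mock_asic_type asic_type;
	unsigned int flags;
};

/* Variant 1 ("!APU check"): run the extra gating only on discrete GPUs. */
static bool gate_if_not_apu(const struct mock_adev *adev)
{
	return !(adev->flags & MOCK_IS_APU);
}

/* Variant 2 ("asic_type check"): run it only on the ASIC the commit names. */
static bool gate_if_polaris12(const struct mock_adev *adev)
{
	return adev->asic_type == MOCK_CHIP_POLARIS12;
}
```

With either predicate, a Carrizo APU like the one in the report would skip the added power/clock gating calls in hw_fini, while a Polaris12 board would still run them.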


2021-10-08 15:07:42

by Borislav Petkov

Subject: Re: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")

On Fri, Oct 08, 2021 at 10:45:47AM -0400, Alex Deucher wrote:
> I don't have a CZ system handy at the moment.

I could test patches on mine here, if that would help...

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2021-10-08 15:14:33

by Alex Deucher

Subject: Re: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")

On Fri, Oct 8, 2021 at 11:04 AM Borislav Petkov <[email protected]> wrote:
>
> On Fri, Oct 08, 2021 at 10:45:47AM -0400, Alex Deucher wrote:
> > I don't have a CZ system handy at the moment.
>
> I could test patches on mine here, if that would help...

Can you try swapping the order of
amdgpu_device_ip_set_powergating_state() and
amdgpu_device_ip_set_clockgating_state() in the patch?

Alex


>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette

2021-10-08 15:37:51

by Borislav Petkov

Subject: Re: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")

On Fri, Oct 08, 2021 at 11:12:35AM -0400, Alex Deucher wrote:
> Can you try swapping the order of
> amdgpu_device_ip_set_powergating_state() and
> amdgpu_device_ip_set_clockgating_state() in the patch?

Nope, the diff below didn't change things.

Should I comment them out one by one and see whether the clockgating or
the powergating causes it?

diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
index bc571833632e..99e3d697cc24 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
@@ -561,10 +561,10 @@ static int uvd_v6_0_hw_fini(void *handle)
 	} else {
 		amdgpu_asic_set_uvd_clocks(adev, 0, 0);
 		/* shutdown the UVD block */
-		amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
-						       AMD_PG_STATE_GATE);
 		amdgpu_device_ip_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
 						       AMD_CG_STATE_GATE);
+		amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
+						       AMD_PG_STATE_GATE);
 	}
 
 	if (RREG32(mmUVD_STATUS) != 0)
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
index 9de66893ccd6..a36612357d0f 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
@@ -507,10 +507,10 @@ static int vce_v3_0_hw_fini(void *handle)
 		amdgpu_dpm_enable_vce(adev, false);
 	} else {
 		amdgpu_asic_set_vce_clocks(adev, 0, 0);
-		amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
-						       AMD_PG_STATE_GATE);
 		amdgpu_device_ip_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
 						       AMD_CG_STATE_GATE);
+		amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
+						       AMD_PG_STATE_GATE);
 	}
 
 	r = vce_v3_0_wait_for_idle(handle);

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2021-10-09 01:22:24

by Evan Quan

Subject: RE: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")

[AMD Official Use Only]

Maybe the change below can address your issue.
https://lists.freedesktop.org/archives/amd-gfx/2021-September/069006.html

BR
Evan

2021-10-09 09:06:05

by Borislav Petkov

Subject: Re: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")

On Sat, Oct 09, 2021 at 01:20:39AM +0000, Quan, Evan wrote:
> Maybe the change below can address your issue.
> https://lists.freedesktop.org/archives/amd-gfx/2021-September/069006.html

Nope, that one doesn't change anything.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2021-10-09 09:56:43

by Evan Quan

Subject: RE: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")

Oops, I just found that some necessary changes are missing from the patch at the link below.
https://lists.freedesktop.org/archives/amd-gfx/2021-September/069006.html

Could you try the patch from the link above + the attached patch?

BR
Evan


Attachments:
0001-drm-amdgpu-fix-Carrizo-uvd-crash-on-driver-unload.patch (1.56 kB)

2021-10-09 10:11:22

by Borislav Petkov

Subject: Re: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")

On Sat, Oct 09, 2021 at 09:54:13AM +0000, Quan, Evan wrote:
> Oops, I just found some necessary changes are missing from the patch of the link below.
> https://lists.freedesktop.org/archives/amd-gfx/2021-September/069006.html
>
> Could you try the patch from the link above + the attached patch?

Nope, still no joy. ;-\

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2021-10-11 12:30:39

by Evan Quan

Subject: RE: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")

OK... Then forget about previous patches. Let's try to narrow down the issue first.
Please try the attached patch1 first. If it works, please undo the changes of patch1 and try patch2 to narrow down further.

BR
Evan


Attachments:
0001-drm-amdgpu-no-UVD-VCE-dpm-disablment-on-suspend-for-.patch (1.80 kB)
0002-drm-amd-pm-no-UVD-VCE-power-off-during-early-phase-o.patch (1.46 kB)

2021-10-11 17:13:12

by Borislav Petkov

Subject: Re: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")

On Mon, Oct 11, 2021 at 08:03:51AM +0000, Quan, Evan wrote:
> OK... Then forget about previous patches. Let's try to narrow down the
> issue first. Please try the attached patch1 first. If it works,

It does.

> please undo the changes of patch1 and try patch2 to narrow down further.

It does too.

:-)

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2021-10-13 09:21:23

by Evan Quan

Subject: RE: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")

Thanks Boris.
There is one more thing I need your help with.
Commit bf756fb833cb was introduced to fix the bug below:
"some hangs observed on suspending when UVD/VCE still using".

So I need your help to confirm that the last two patches I sent you do not break the fix for the bug above.
Please follow the steps below to verify it:
1. Start playing a video.
2. Open another terminal and issue the "sudo pm-suspend" command to force a suspend.
3. Verify that the system suspends and resumes correctly, without errors or hangs.

BR
Evan

2021-10-13 09:28:53

by Borislav Petkov

Subject: Re: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")

On Wed, Oct 13, 2021 at 09:19:45AM +0000, Quan, Evan wrote:
> So, I need your help to confirm the last two patches(I sent you) do not affect the fix for the bug above.
> Please follow the steps below to verify it:
> 1. Launch a video playing
> 2. open another terminal and issue "sudo pm-suspend" command to force suspending
> 3. verify the system can suspend and resume back correctly without errors or hangs

Just to confirm: you want me to do that with the last two patches
applied?

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2021-10-14 02:05:39

by Evan Quan

Subject: RE: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")


> -----Original Message-----
> From: Borislav Petkov <[email protected]>
> Sent: Wednesday, October 13, 2021 5:26 PM
> To: Quan, Evan <[email protected]>
> Cc: Alex Deucher <[email protected]>; amd-gfx list <amd-
> [email protected]>; LKML <[email protected]>; Deucher,
> Alexander <[email protected]>; Pan, Xinhui
> <[email protected]>; Chen, Guchun <[email protected]>
> Subject: Re: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12
> UVD/VCE on suspend")
>
> On Wed, Oct 13, 2021 at 09:19:45AM +0000, Quan, Evan wrote:
> > So, I need your help to confirm the last two patches(I sent you) do not affect the fix for the bug above.
> > Please follow the steps below to verify it:
> > 1. Launch a video playing
> > 2. open another terminal and issue "sudo pm-suspend" command to force suspending
> > 3. verify the system can suspend and resume back correctly without errors or hangs
>
> Just to confirm: you want me to do that with the last two patches applied?
[Quan, Evan] Yes, but do not apply them at the same time. Test them one by one, as you did before:
- try patch1 first
- undo the changes of patch1 and try patch2

BR
Evan
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette

2021-10-14 09:02:13

by Borislav Petkov

Subject: Re: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")

On Thu, Oct 14, 2021 at 02:02:48AM +0000, Quan, Evan wrote:
> [Quan, Evan] Yes, but not(apply them) at the same time. One by one as you did before.
> - try the patch1 first

Ok, first patch worked fine.

> - undo the changes of patch1 and try patch2

Did that, worked fine too except after the first resume cycle, the video
didn't continue playing.

Then I restarted the video and did a couple more suspend cycles to see
if it would not continue again. In the subsequent tries it would resume
fine and the video would continue playing too.

So I'm going to chalk that single case of halted video with the second
patch to a resume glitch or so.

Btw, I don't have pm-suspend on that box but I did suspend to RAM
assuming this is what you wanted, which is done as root with two
commands:

# echo "suspend" > /sys/power/disk
# echo "mem" > /sys/power/state

If you want me to do more extensive testing, just shoot.

HTH.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2021-10-15 17:43:44

by Evan Quan

Subject: RE: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")


> -----Original Message-----
> From: Borislav Petkov <[email protected]>
> Sent: Thursday, October 14, 2021 5:01 PM
> To: Quan, Evan <[email protected]>
> Cc: Alex Deucher <[email protected]>; amd-gfx list <amd-
> [email protected]>; LKML <[email protected]>; Deucher,
> Alexander <[email protected]>; Pan, Xinhui
> <[email protected]>; Chen, Guchun <[email protected]>
> Subject: Re: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12
> UVD/VCE on suspend")
>
> On Thu, Oct 14, 2021 at 02:02:48AM +0000, Quan, Evan wrote:
> > [Quan, Evan] Yes, but not(apply them) at the same time. One by one as you did before.
> > - try the patch1 first
>
> Ok, first patch worked fine.
>
> > - undo the changes of patch1 and try patch2
>
> Did that, worked fine too except after the first resume cycle, the video didn't
> continue playing.
>
> Then I restarted the video and did a couple more suspend cycles to see if it
> would not continue again. In the subsequent tries it would resume fine and
> the video would continue playing too.
>
> So I'm going to chalk that single case of halted video with the second patch to
> a resume glitch or so.
>
> Btw, I don't have pm-suspend on that box but I did suspend to RAM
> assuming this is what you wanted, which is done as root with two
> commands:
>
> # echo "suspend" > /sys/power/disk
> # echo "mem" > /sys/power/state
[Quan, Evan] Yes, that also works.
>
> If you want me to do more extensive testing, just shoot.
[Quan, Evan] Thanks! That's enough for now. I will prepare an official solution for the issue.

BR
Evan
>
> HTH.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette