2016-03-03 02:38:14

by Ken Moffat

Subject: [drm:radeon_dp_link_train] *ERROR* clock recovery failed

One of my machines is an A10 Kaveri desktop, with a good old VGA
connection to the monitor. I've only just started trying to boot
any 4.5 kernel on it, but with 4.5.0-rc6 and now Linus's tree from a
few hours ago (4.5.0-rc6-00018-gf983cd3) I get a blank screen, with
no video signal, as soon as it tries to switch to a framebuffer.

Comparing the logs, the first bad attempt had a couple of new error
messages; everything else in the logs looked normal:

Mar 1 19:31:10 deluxe kernel: [ 2.543163] fbcon: radeondrmfb (fb0) is primary device
Mar 1 19:31:10 deluxe kernel: [ 2.654179] [drm:radeon_dp_link_train] *ERROR* clock recovery reached max voltage
Mar 1 19:31:10 deluxe kernel: [ 2.654181] [drm:radeon_dp_link_train] *ERROR* clock recovery failed
Mar 1 19:31:10 deluxe kernel: [ 2.677142] Console: switching to colour frame buffer device 200x56
Mar 1 19:31:10 deluxe kernel: [ 2.680435] radeon 0000:00:01.0: fb0: radeondrmfb frame buffer device
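
One way to do this kind of log comparison is to strip the syslog prefix and the kernel timestamp from each line, then list the messages that appear only in the bad boot. The following is a minimal Python sketch, not a standard tool: the helper names are hypothetical, and the regular expression assumes the "... kernel: [ seconds] message" layout shown in the excerpt above.

```python
import re

# Matches everything up to and including the "[ 2.543163]" kernel timestamp.
# Assumes the syslog layout used in the excerpt above.
_PREFIX = re.compile(r"^.*?kernel: \[\s*\d+\.\d+\]\s*")


def normalize(line):
    """Drop the date, host name and timestamp, keeping only the message."""
    return _PREFIX.sub("", line.strip())


def new_messages(good_log, bad_log):
    """Return messages from bad_log that never occur in good_log, in order."""
    seen = {normalize(line) for line in good_log}
    return [msg for msg in map(normalize, bad_log) if msg not in seen]
```

Calling new_messages() with the lines of a known-good boot and the lines of a failing boot returns only the messages that are new in the failing one, regardless of differing timestamps.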

Any suggestions on where to start looking, please?

ĸen
--
This email was written using 100% recycled letters.


2016-03-03 23:47:19

by Ken Moffat

Subject: [drm:radeon_dp_link_train] *ERROR* clock recovery failed -bisected

On Thu, Mar 03, 2016 at 02:38:11AM +0000, Ken Moffat wrote:
> One of my machines is an A10 Kaveri desktop, with a good old VGA
> connection to the monitor. I've only just started trying to boot
> any 4.5 kernel on it, but with 4.5.0-rc6 and now linus's tree from a
> few hours ago (4.5.0-rc6-00018-gf983cd3) I get a blank screen, with
> no video signal, as soon as it tries to switch to a framebuffer.
>
> Comparing the logs, the first bad attempt had a couple of new error
> messages, everything else in the logs looked normal -
>
> Mar 1 19:31:10 deluxe kernel: [ 2.543163] fbcon: radeondrmfb (fb0) is primary device
> Mar 1 19:31:10 deluxe kernel: [ 2.654179] [drm:radeon_dp_link_train] *ERROR* clock recovery reached max voltage
> Mar 1 19:31:10 deluxe kernel: [ 2.654181] [drm:radeon_dp_link_train] *ERROR* clock recovery failed
> Mar 1 19:31:10 deluxe kernel: [ 2.677142] Console: switching to colour frame buffer device 200x56
> Mar 1 19:31:10 deluxe kernel: [ 2.680435] radeon 0000:00:01.0: fb0: radeondrmfb frame buffer device
>
Bisection pointed to

092c96a8ab9d1bd60ada2ed385cc364ce084180e is the first bad commit
commit 092c96a8ab9d1bd60ada2ed385cc364ce084180e
Author: Alex Deucher <[email protected]>
Date: Thu Dec 17 10:23:34 2015 -0500

drm/radeon: fix dp link rate selection (v2)

Need to properly handle the max link rate in the dpcd.
This prevents some cases where 5.4 Ghz is selected when
it shouldn't be.

v2: simplify logic, add array bounds check

Reviewed-by: Tom St Denis <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

I have now reverted that commit from that version of Linus's tree and
on this machine everything is back to normal.
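
The link-rate selection that the commit message describes can be illustrated roughly as follows. This is a minimal Python model under stated assumptions, not the driver's C code: the function names are hypothetical, the rates are link symbol clocks in kHz (162000, 270000 and 540000, i.e. 1.62, 2.7 and 5.4 Gbps per lane), and the sink's limit is taken as the DPCD MAX_LINK_RATE code, which counts units of 0.27 Gbps.

```python
LINK_RATES_KHZ = (162000, 270000, 540000)  # 1.62, 2.7 and 5.4 Gbps per lane


def bw_code_to_link_rate(code):
    """Convert a DPCD MAX_LINK_RATE code (0x06, 0x0a, 0x14) to kHz."""
    return code * 27000


def max_pix_clock(link_rate, lanes, bpp):
    """Highest pixel clock (kHz) the link can carry.

    With 8b/10b coding each lane moves 8 data bits per link symbol clock.
    """
    return link_rate * lanes * 8 // bpp


def select_link_config(pix_clock, bpp, max_bw_code, max_lanes):
    """Pick the first (rate, lanes) pair that fits, never exceeding the sink."""
    max_rate = bw_code_to_link_rate(max_bw_code)
    for lanes in (1, 2, 4):
        if lanes > max_lanes:
            break
        for rate in LINK_RATES_KHZ:
            if rate > max_rate:
                break  # the sink does not advertise this rate
            if max_pix_clock(rate, lanes, bpp) >= pix_clock:
                return rate, lanes
    return None  # the mode does not fit on this link
```

With a sink limit of 0x0a (2.7 Gbps) the 5.4 Gbps rate is never tried, even for a mode that would otherwise need it, which is the behaviour the commit message is after.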

This mobo does not have a DP connector.

lspci reports the graphics part is

00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Kaveri [Radeon R7 Graphics] (prog-if 00 [VGA controller])
Subsystem: Gigabyte Technology Co., Ltd Kaveri [Radeon R7 Graphics]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 26
Region 0: Memory at c0000000 (64-bit, prefetchable) [size=256M]
Region 2: Memory at d0000000 (64-bit, prefetchable) [size=8M]
Region 4: I/O ports at f000 [size=256]
Region 5: Memory at feb00000 (32-bit, non-prefetchable) [size=256K]
Expansion ROM at feb40000 [disabled] [size=128K]
Capabilities: [48] Vendor Specific Information: Len=08 <?>
Capabilities: [50] Power Management version 3
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [58] Express (v2) Root Complex Integrated Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0
ExtTag+ RBE+
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Address: 00000000fee0f00c Data: 4172
Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
Capabilities: [270 v1] #19
Capabilities: [2b0 v1] Address Translation Service (ATS)
ATSCap: Invalidate Queue Depth: 00
ATSCtl: Enable-, Smallest Translation Unit: 00
Capabilities: [2c0 v1] Page Request Interface (PRI)
PRICtl: Enable- Reset-
PRISta: RF- UPRGI- Stopped+
Page Request Capacity: 00000020, Page Request Allocation: 00000000
Capabilities: [2d0 v1] Process Address Space ID (PASID)
PASIDCap: Exec- Priv-, Max PASID Width: 10
PASIDCtl: Enable- Exec- Priv-
Kernel driver in use: radeon

I've attached my config. Please let me know if there is anything I
can do to help fix this.

ĸen
--
This email was written using 100% recycled letters.


Attachments:
(No filename) (4.53 kB)
config-4.5-rc6-A10 (90.97 kB)

2016-03-04 01:11:19

by Ken Moffat

Subject: Re: [drm:radeon_dp_link_train] *ERROR* clock recovery failed -bisected

On Fri, Mar 04, 2016 at 12:44:01AM +0000, Deucher, Alexander wrote:
>
> The attached radeon patch should fix it. I accidentally dropped the special handling for NUTMEG DP to VGA bridge chips.
>
> > This mobo does not have a DP connector.
> >
>
> The VGA port uses a DP to VGA bridge chip.
>
> Alex
>

Thanks, I was not expecting such a quick response.

The radeon patch does fix it. If you wish, you can add

Tested-by: Ken Moffat <[email protected]>

ĸen
--
This email was written using 100% recycled letters.

2016-03-04 02:17:58

by Deucher, Alexander

Subject: RE: [drm:radeon_dp_link_train] *ERROR* clock recovery failed -bisected

> -----Original Message-----
> From: Ken Moffat [mailto:[email protected]]
> Sent: Thursday, March 03, 2016 6:47 PM
> To: Deucher, Alexander; StDenis, Tom
> Cc: [email protected]; [email protected]
> Subject: [drm:radeon_dp_link_train] *ERROR* clock recovery failed -bisected
>
> On Thu, Mar 03, 2016 at 02:38:11AM +0000, Ken Moffat wrote:
> > One of my machines is an A10 Kaveri desktop, with a good old VGA
> > connection to the monitor. I've only just started trying to boot
> > any 4.5 kernel on it, but with 4.5.0-rc6 and now linus's tree from a
> > few hours ago (4.5.0-rc6-00018-gf983cd3) I get a blank screen, with
> > no video signal, as soon as it tries to switch to a framebuffer.
> >
> > Comparing the logs, the first bad attempt had a couple of new error
> > messages, everything else in the logs looked normal -
> >
> > Mar 1 19:31:10 deluxe kernel: [ 2.543163] fbcon: radeondrmfb (fb0) is primary device
> > Mar 1 19:31:10 deluxe kernel: [ 2.654179] [drm:radeon_dp_link_train] *ERROR* clock recovery reached max voltage
> > Mar 1 19:31:10 deluxe kernel: [ 2.654181] [drm:radeon_dp_link_train] *ERROR* clock recovery failed
> > Mar 1 19:31:10 deluxe kernel: [ 2.677142] Console: switching to colour frame buffer device 200x56
> > Mar 1 19:31:10 deluxe kernel: [ 2.680435] radeon 0000:00:01.0: fb0: radeondrmfb frame buffer device
> >
> Bisection pointed to
>
> 092c96a8ab9d1bd60ada2ed385cc364ce084180e is the first bad commit
> commit 092c96a8ab9d1bd60ada2ed385cc364ce084180e
> Author: Alex Deucher <[email protected]>
> Date: Thu Dec 17 10:23:34 2015 -0500
>
> drm/radeon: fix dp link rate selection (v2)
>
> Need to properly handle the max link rate in the dpcd.
> This prevents some cases where 5.4 Ghz is selected when
> it shouldn't be.
>
> v2: simplify logic, add array bounds check
>
> Reviewed-by: Tom St Denis <[email protected]>
> Signed-off-by: Alex Deucher <[email protected]>
>
> I have now reverted that commit from that version of Linus's tree and
> on this machine everything is back to normal.
>

The attached radeon patch should fix it. I accidentally dropped the special handling for NUTMEG DP to VGA bridge chips.

> This mobo does not have a DP connector.
>

The VGA port uses a DP to VGA bridge chip.

Alex
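
The special handling mentioned above can be pictured with a small Python sketch. The attached patch is not reproduced in this thread, so this is illustration only: it assumes that a NUTMEG bridge is driven at a fixed 270000 kHz link clock (2.7 Gbps per lane) and that only the lane count is chosen to fit the mode. The constant and function name are hypothetical.

```python
# Assumption: the NUTMEG DP to VGA bridge is pinned to a 2.7 Gbps link rate.
NUTMEG_LINK_RATE_KHZ = 270000


def nutmeg_link_config(pix_clock, bpp, max_lanes):
    """Return (rate, lanes) for a NUTMEG bridge, or None if the mode cannot fit.

    pix_clock is in kHz. With 8b/10b coding each lane carries 8 data bits
    per link symbol clock, so capacity is rate * lanes * 8 / bpp.
    """
    lanes = 1
    while lanes <= max_lanes:
        if NUTMEG_LINK_RATE_KHZ * lanes * 8 // bpp >= pix_clock:
            return NUTMEG_LINK_RATE_KHZ, lanes
        lanes *= 2  # DP links use 1, 2 or 4 lanes
    return None
```

Under these assumptions a 108000 kHz mode at 24 bpp gets two lanes at the fixed rate, whereas a generic search that prefers the lowest workable rate could settle on 162000 kHz instead.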

> lspci reports the graphics part is
>
> 00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Kaveri [Radeon R7 Graphics] (prog-if 00 [VGA controller])
> Subsystem: Gigabyte Technology Co., Ltd Kaveri [Radeon R7 Graphics]
> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> Latency: 0, Cache Line Size: 64 bytes
> Interrupt: pin A routed to IRQ 26
> Region 0: Memory at c0000000 (64-bit, prefetchable) [size=256M]
> Region 2: Memory at d0000000 (64-bit, prefetchable) [size=8M]
> Region 4: I/O ports at f000 [size=256]
> Region 5: Memory at feb00000 (32-bit, non-prefetchable) [size=256K]
> Expansion ROM at feb40000 [disabled] [size=128K]
> Capabilities: [48] Vendor Specific Information: Len=08 <?>
> Capabilities: [50] Power Management version 3
> Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold-)
> Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
> Capabilities: [58] Express (v2) Root Complex Integrated Endpoint, MSI 00
> DevCap: MaxPayload 256 bytes, PhantFunc 0
> ExtTag+ RBE+
> DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
> RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
> MaxPayload 128 bytes, MaxReadReq 512 bytes
> DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
> DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
> DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
> Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
> Address: 00000000fee0f00c Data: 4172
> Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
> Capabilities: [270 v1] #19
> Capabilities: [2b0 v1] Address Translation Service (ATS)
> ATSCap: Invalidate Queue Depth: 00
> ATSCtl: Enable-, Smallest Translation Unit: 00
> Capabilities: [2c0 v1] Page Request Interface (PRI)
> PRICtl: Enable- Reset-
> PRISta: RF- UPRGI- Stopped+
> Page Request Capacity: 00000020, Page Request Allocation: 00000000
> Capabilities: [2d0 v1] Process Address Space ID (PASID)
> PASIDCap: Exec- Priv-, Max PASID Width: 10
> PASIDCtl: Enable- Exec- Priv-
> Kernel driver in use: radeon
>
> I've attached my config. Please let me know if there is anything I
> can do to help fix this.
>
> ĸen
> --
> This email was written using 100% recycled letters.


Attachments:
0001-drm-radeon-dp-add-back-special-handling-for-NUTMEG.patch (1.96 kB)
0002-drm-amdgpu-dp-add-back-special-handling-for-NUTMEG.patch (2.00 kB)

2016-03-04 18:36:59

by Alex Deucher

Subject: Re: [drm:radeon_dp_link_train] *ERROR* clock recovery failed -bisected

On Thu, Mar 3, 2016 at 8:11 PM, Ken Moffat <[email protected]> wrote:
> On Fri, Mar 04, 2016 at 12:44:01AM +0000, Deucher, Alexander wrote:
>>
>> The attached radeon patch should fix it. I accidentally dropped the special handling for NUTMEG DP to VGA bridge chips.
>>
>> > This mobo does not have a DP connector.
>> >
>>
>> The VGA port uses a DP to VGA bridge chip.
>>
>> Alex
>>
>
> Thanks, I was not expecting such a quick response.
>
> The radeon patch does fix it. If you wish, you can add
>
> Tested-by: Ken Moffat <[email protected]>

Great! Thanks, I'll include them in my next -fixes pull.

Alex

>
> ĸen
> --
> This email was written using 100% recycled letters.
> _______________________________________________
> dri-devel mailing list
> [email protected]
> https://lists.freedesktop.org/mailman/listinfo/dri-devel