2009-12-14 02:31:06

by Alex Chiang

Subject: radeon 4830 corruption after resume

Hi Dave, Rafael,

I can successfully suspend/resume my HP Envy, but upon resume,
the screen is corrupted.

I used the gnome screenshot utility to capture this:

http://chizang.net/alex/tmp/radeon-4830-corruption.png

But that screenshot leads you to believe the corruption was
100%, when in reality, the text in my xterms was at least
readable, but ugly.

Kernel is latest upstream pulled today. Userspace is Ubuntu Karmic.

Some info inline; xorg.log and pm-suspend.log attached.

Thanks,
/ac


dmesg snip:
[ 20.984196] [drm] Initialized drm 1.1.0 20060810
[ 21.470507] [drm] radeon defaulting to userspace modesetting.
[ 21.471372] [drm] Initialized radeon 1.31.0 20080528 for 0000:01:00.0 on minor 0
[ 22.689868] [drm] Setting GART location based on new memory map
[ 22.690118] [drm] Loading RV730 CP Microcode
[ 22.690121] platform r600_cp.0: firmware: requesting radeon/RV730_pfp.bin
[ 22.721163] platform r600_cp.0: firmware: requesting radeon/RV730_me.bin
[ 22.760768] [drm] Resetting GPU
[ 22.760827] [drm] writeback test succeeded in 2 usecs


achiang@dre:~$ sudo lspci -vv -s 01:00.0
[sudo] password for achiang:
01:00.0 VGA compatible controller: ATI Technologies Inc Mobility Radeon HD 4830 [M97]
    Subsystem: Hewlett-Packard Company Device 7009
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 16
    Region 0: Memory at c0000000 (32-bit, prefetchable) [size=256M]
    Region 1: I/O ports at 4000 [size=256]
    Region 2: Memory at d4000000 (32-bit, non-prefetchable) [size=64K]
    Expansion ROM at d4020000 [disabled] [size=128K]
    Capabilities: [50] Power Management version 3
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
        DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
            MaxPayload 128 bytes, MaxReadReq 128 bytes
        DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        LnkCap: Port #0, Speed 2.5GT/s, Width x16, ASPM L0s L1, Latency L0 <64ns, L1 <1us
            ClockPM- Suprise- LLActRep- BwNot-
        LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
    Capabilities: [a0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
        Address: 0000000000000000 Data: 0000
    Capabilities: [100] Vendor Specific Information <?>


Attachments:
(No filename) (2.81 kB)
Xorg.0.log (75.76 kB)
pm-suspend.log (16.28 kB)

2009-12-14 02:55:18

by David Airlie

Subject: Re: radeon 4830 corruption after resume

On Sun, 2009-12-13 at 19:30 -0700, Alex Chiang wrote:
> Hi Dave, Rafael,
>
> I can successfully suspend/resume my HP Envy, but upon resume,
> the screen is corrupted.
>
> I used the gnome screenshot utility to capture this:
>
> http://chizang.net/alex/tmp/radeon-4830-corruption.png
>
> But that screenshot leads you to believe the corruption was
> 100%, when in reality, the text in my xterms was at least
> readable, but ugly.
>
> Kernel is latest upstream pulled today. Userspace is Ubuntu Karmic.

Has it ever worked? Can you suspend/resume without X running at all?

It's quite possibly a userspace problem but it's hard to know, we
haven't changed the user modesetting pieces in the kernel at all in
quite a while.

Dave.

2009-12-14 03:11:29

by Alex Chiang

Subject: Re: radeon 4830 corruption after resume

* Dave Airlie <[email protected]>:
> On Sun, 2009-12-13 at 19:30 -0700, Alex Chiang wrote:
> > Hi Dave, Rafael,
> >
> > I can successfully suspend/resume my HP Envy, but upon resume,
> > the screen is corrupted.
> >
> > I used the gnome screenshot utility to capture this:
> >
> > http://chizang.net/alex/tmp/radeon-4830-corruption.png
> >
> > But that screenshot leads you to believe the corruption was
> > 100%, when in reality, the text in my xterms was at least
> > readable, but ugly.
> >
> > Kernel is latest upstream pulled today. Userspace is Ubuntu Karmic.
>
> Has it ever worked? Can you suspend/resume without X running at all?

Hm, define "worked"?

The machine responds to keyboard, mouse, network input, etc. It's
just that the screen is garbled.

Unless I'm not understanding what you're asking?

> It's quite possibly a userspace problem but it's hard to know, we
> haven't changed the user modesetting pieces in the kernel at all in
> quite a while.

Ok, I'm just looking for where/how to start debugging. Any advice
on where to look next would be fine too.

Thanks,
/ac

2009-12-14 05:28:39

by Xiaotian Feng

Subject: Re: radeon 4830 corruption after resume

On Mon, Dec 14, 2009 at 11:11 AM, Alex Chiang <[email protected]> wrote:
> * Dave Airlie <[email protected]>:
>> On Sun, 2009-12-13 at 19:30 -0700, Alex Chiang wrote:
>> > Hi Dave, Rafael,
>> >
>> > I can successfully suspend/resume my HP Envy, but upon resume,
>> > the screen is corrupted.
>> >
>> > I used the gnome screenshot utility to capture this:
>> >
>> >     http://chizang.net/alex/tmp/radeon-4830-corruption.png
>> >
>> > But that screenshot leads you to believe the corruption was
>> > 100%, when in reality, the text in my xterms was at least
>> > readable, but ugly.
>> >
>> > Kernel is latest upstream pulled today. Userspace is Ubuntu Karmic.
>>
>> Has it ever worked? Can you suspend/resume without X running at all?
>
> Hm, define "worked"?
>
> The machine responds to keyboard, mouse, network input, etc. It's
> just that the screen is garbled.
>

Then how about ssh to your machine, and get dmesg output?

> Unless I'm not understanding what you're asking?
>
>> It's quite possibly a userspace problem but it's hard to know, we
>> haven't changed the user modesetting pieces in the kernel at all in
>> quite a while.
>
> Ok, I'm just looking for where/how to start debugging. Any advice
> on where to look next would be fine too.
>
> Thanks,
> /ac
>

2009-12-15 05:40:11

by Alex Chiang

Subject: Re: radeon 4830 corruption after resume

Adding dri-devel to cc as I should have done in the first place.

Full thread for context here: http://lkml.org/lkml/2009/12/13/321

More responses below.

* Xiaotian Feng <[email protected]>:
> On Mon, Dec 14, 2009 at 11:11 AM, Alex Chiang <[email protected]> wrote:
> > * Dave Airlie <[email protected]>:
> >> On Sun, 2009-12-13 at 19:30 -0700, Alex Chiang wrote:
> >> > Hi Dave, Rafael,
> >> >
> >> > I can successfully suspend/resume my HP Envy, but upon resume,
> >> > the screen is corrupted.
> >> >
> >> > I used the gnome screenshot utility to capture this:
> >> >
> >> >     http://chizang.net/alex/tmp/radeon-4830-corruption.png
> >> >
> >> > But that screenshot leads you to believe the corruption was
> >> > 100%, when in reality, the text in my xterms was at least
> >> > readable, but ugly.
> >> >
> >> > Kernel is latest upstream pulled today. Userspace is Ubuntu Karmic.
> >>
> >> Has it ever worked? Can you suspend/resume without X running at all?
> >
> > Hm, define "worked"?
> >
> > The machine responds to keyboard, mouse, network input, etc. It's
> > just that the screen is garbled.
> >
> > Unless I'm not understanding what you're asking?
> >
> >> It's quite possibly a userspace problem but it's hard to know, we
> >> haven't changed the user modesetting pieces in the kernel at all in
> >> quite a while.
> >
> > Ok, I'm just looking for where/how to start debugging. Any advice
> > on where to look next would be fine too.
>
> Then how about ssh to your machine, and get dmesg output?

The machine is actually useable. Wireless networking even works.
It's just that the screen is garbled.

Here's a much better idea of the type of corruption that I'm
seeing. Huge file alert, it's like a 5MB jpg.

http://chizang.net/alex/tmp/corrupt-radeon2.jpg

I'm attaching full dmesg and pm-suspend.log.

The suspend happens some time around time 400 and the resume is
the huge jump in time afterwards.

Thanks,
/ac


Attachments:
(No filename) (1.88 kB)
dmesg-corrupt.txt (61.34 kB)
pm-suspend.log (5.00 kB)

2009-12-15 14:52:49

by Alex Deucher

Subject: Re: radeon 4830 corruption after resume

On Tue, Dec 15, 2009 at 12:40 AM, Alex Chiang <[email protected]> wrote:
> Adding dri-devel to cc as I should have done in the first place.
>
> Full thread for context here: http://lkml.org/lkml/2009/12/13/321
>
> More responses below.
>
> * Xiaotian Feng <[email protected]>:
>> On Mon, Dec 14, 2009 at 11:11 AM, Alex Chiang <[email protected]> wrote:
>> > * Dave Airlie <[email protected]>:
>> >> On Sun, 2009-12-13 at 19:30 -0700, Alex Chiang wrote:
>> >> > Hi Dave, Rafael,
>> >> >
>> >> > I can successfully suspend/resume my HP Envy, but upon resume,
>> >> > the screen is corrupted.
>> >> >
>> >> > I used the gnome screenshot utility to capture this:
>> >> >
>> >> >     http://chizang.net/alex/tmp/radeon-4830-corruption.png
>> >> >
>> >> > But that screenshot leads you to believe the corruption was
>> >> > 100%, when in reality, the text in my xterms was at least
>> >> > readable, but ugly.
>> >> >
>> >> > Kernel is latest upstream pulled today. Userspace is Ubuntu Karmic.
>> >>
>> >> Has it ever worked? Can you suspend/resume without X running at all?
>> >
>> > Hm, define "worked"?
>> >
>> > The machine responds to keyboard, mouse, network input, etc. It's
>> > just that the screen is garbled.
>> >
>> > Unless I'm not understanding what you're asking?
>> >
>> >> It's quite possibly a userspace problem but it's hard to know, we
>> >> haven't changed the user modesetting pieces in the kernel at all in
>> >> quite a while.
>> >
>> > Ok, I'm just looking for where/how to start debugging. Any advice
>> > on where to look next would be fine too.
>>
>> Then how about ssh to your machine, and get dmesg output?
>
> The machine is actually useable. Wireless networking even works.
> It's just that the screen is garbled.
>
> Here's a much better idea of the type of corruption that I'm
> seeing. Huge file alert, it's like a 5MB jpg.
>
>     http://chizang.net/alex/tmp/corrupt-radeon2.jpg
>
> I'm attaching full dmesg and pm-suspend.log.
>
> The suspend happens some time around time 400 and the resume is
> the huge jump in time afterwards.

As Dave asked previously, is this a regression? I.e., did s/r work at
some point in the past and if so when?

Alex

2009-12-15 14:57:40

by Alex Chiang

Subject: Re: radeon 4830 corruption after resume

* Alex Deucher <[email protected]>:
>
> As Dave asked previously, is this a regression? I.e., did s/r work at
> some point in the past and if so when?

I'm not sure, as I just received the machine a few days ago.

I can try some older kernels, but part of my issue is not knowing
how all the pieces fit together. How can I tell if I'm supposed
to be bisecting the kernel or X or something else, given my
symptoms?

Thanks,
/ac

2009-12-15 15:19:53

by Alex Deucher

Subject: Re: radeon 4830 corruption after resume

On Tue, Dec 15, 2009 at 9:57 AM, Alex Chiang <[email protected]> wrote:
> * Alex Deucher <[email protected]>:
>>
>> As Dave asked previously, is this a regression? I.e., did s/r work at
>> some point in the past and if so when?
>
> I'm not sure, as I just received the machine a few days ago.
>
> I can try some older kernels, but part of my issue is not knowing
> how all the pieces fit together. How can I tell if I'm supposed
> to be bisecting the kernel or X or something else, given my
> symptoms?

First try s/r without X running and see if it works ok. Then try s/r
from a VT other than the one X is on. Finally, try s/r with the DRI
disabled (i.e., Option "DRI" "False" in the device section of your
config). Also for non-kms s/r, you may have to fiddle with vbetool on
resume to post your card. Report back with the results and we can go
from there.

Alex
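
The test sequence suggested above could be sketched roughly as the shell
helpers below. This is only an illustration, not something posted in the
thread: the function names are made up, "service gdm stop" assumes gdm is the
display manager (likely on Ubuntu Karmic, but an assumption), "chvt 1" assumes
X is not on VT 1, and the "Identifier" value in the xorg.conf fragment is a
placeholder. The modesetting check simply looks for the dmesg line quoted in
the original report. Sourcing the file runs nothing; each step is meant to be
invoked by hand, ideally over ssh.

```shell
#!/bin/sh
# Sketch only: helpers for the suspend/resume (s/r) tests suggested in the thread.

# Read kernel log text on stdin; print "ums" if radeon reported falling back
# to userspace modesetting (the message seen in the original dmesg snip),
# otherwise "unknown".
modesetting_mode() {
    if grep -q 'radeon defaulting to userspace modesetting'; then
        echo ums
    else
        echo unknown
    fi
}

# Step 1: s/r with X not running at all.
# Assumption: gdm is the display manager in use.
step1_no_x() {
    sudo service gdm stop && sudo pm-suspend
}

# Step 2: s/r from a VT other than the one X is on.
# Assumption: VT 1 is a text console.
step2_other_vt() {
    sudo chvt 1 && sudo pm-suspend
}

# Step 3: print an xorg.conf Device section with the DRI disabled.
# Merge the Option line into the existing Device section by hand.
print_dri_off_fragment() {
    cat <<'EOF'
Section "Device"
    Identifier "Radeon"        # placeholder; keep the existing identifier
    Driver     "radeon"
    Option     "DRI" "False"
EndSection
EOF
}

# Step 4 (non-KMS only): re-post the card after resume.
step4_repost() {
    sudo vbetool post
}
```

For example, "dmesg | modesetting_mode" printing "ums" would suggest the
vbetool step is worth trying, since that advice applies to non-kms s/r.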