2012-05-01 13:23:58

by Nick Bowler

Subject: Re: Linux 3.4-rc4

On 2012-04-30 11:07 +0200, Maarten Maathuis wrote:
> On Mon, Apr 30, 2012 at 12:37 AM, Dmitry Torokhov
> <[email protected]> wrote:
> > On Sat, Apr 28, 2012 at 11:33:50AM -0400, Nick Bowler wrote:
> >> On 2012-04-28 02:19 -0400, Alex Deucher wrote:
> >> > On Fri, Apr 27, 2012 at 8:39 PM, Nick Bowler <[email protected]> wrote:
> >> > > While tracking down the black screen issue, I've been having the monitor
> >> > > directly connected to the video card the whole time, but now when I'm
> >> > > connected through my KVM switch (an IOGear GCS1804), it appears that
> >> > > something's going wrong with reading the EDID, because the available
> >> > > modes are all screwed up (both console and X decide they want to drive
> >> > > the display at 1024x768).
[...]
> >> > > Also, looking at /sys/class/drm/card0-VGA-1/edid I see that it
> >> > > is empty on 3.4-rc4+ and it is correct on 3.2.15. Things seem
> >> > > to work OK when the KVM is not involved.
> >> >
> >> > Were you ever able to fetch an EDID with the KVM involved? KVMs are
> >> > notorious for not connecting the ddc pins.
> >>
> >> Yes, it works on 3.2.15 as described above.
> >
> > I have the same (or similar) KVM (not in the office at the moment) and I
> > can confirm that with newer kernels EDID fetching is flaky. It's 50/50
> > if EDID retrieval succeeds or if it fails with:
> >
> > Apr 26 13:06:57 dtor-d630 kernel: [13464.936336] [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, remainder is 208
[...]
> > Earlier kernels were able to retrieve EDIDs reliably.

FWIW, for me EDID failure on new kernels is 100% reproducible, and there
are no such checksum errors in the log. It's just missing.
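
For anyone comparing the two failure modes in this thread (an empty EDID
versus the "checksum is invalid, remainder is N" error), here is a rough
sketch of the check involved. It assumes the standard EDID layout of
128-byte blocks whose bytes must sum to 0 modulo 256; the function names
are made up for illustration and are not kernel APIs.

```python
EDID_BLOCK_SIZE = 128  # each EDID block is 128 bytes


def edid_block_remainder(block: bytes) -> int:
    """Return the byte sum of one block modulo 256 (0 means valid)."""
    if len(block) != EDID_BLOCK_SIZE:
        raise ValueError("an EDID block must be exactly 128 bytes")
    return sum(block) % 256


def describe_edid(data: bytes) -> str:
    """Classify raw EDID data as empty, truncated, corrupt, or ok."""
    if not data:
        return "empty"  # the 'just missing' case: nothing was read at all
    if len(data) % EDID_BLOCK_SIZE != 0:
        return "truncated"
    for offset in range(0, len(data), EDID_BLOCK_SIZE):
        remainder = edid_block_remainder(data[offset:offset + EDID_BLOCK_SIZE])
        if remainder != 0:
            index = offset // EDID_BLOCK_SIZE
            return f"block {index}: checksum invalid, remainder is {remainder}"
    return "ok"


def check_sysfs_edid(path: str) -> str:
    """Read a connector's EDID from sysfs (path supplied by the caller)."""
    with open(path, "rb") as f:
        return describe_edid(f.read())
```

Calling check_sysfs_edid("/sys/class/drm/card0-VGA-1/edid") on the affected
machine would be expected to report "empty" for the symptom described
here, and a nonzero remainder for the flaky case quoted above.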

> Just a crazy thought, but didn't we change some timings related to
> EDID retrieval? To make it faster.

OK, this time bisecting started off relatively smoothly (doing the same
"backwards" bisect on the branch-o-reverts as last time), but then my
disk died halfway through... So I'll post the partial bisection results
now (11 commits left to test), but I clearly have other things to fix
before I can get back to this issue.

git bisect start 'drivers/gpu/drm'
# good: [9232969e19ae7251a93ab72e405cf71e5109ec05] drm/nv40/pm: implement first type of pwm fanspeed funcs
git bisect good 9232969e19ae7251a93ab72e405cf71e5109ec05
# bad: [dea7e0ac45fd28f90bbc38ff226d36a9f788efbf] ttm: fix agp since ttm tt rework
git bisect bad dea7e0ac45fd28f90bbc38ff226d36a9f788efbf
# good: [d2491567cdbcb87b2682e0948a69d73c4dd8987e] drm/nv50/pm: only touch 0x611200 on nv92-
git bisect good d2491567cdbcb87b2682e0948a69d73c4dd8987e
# good: [f9f9f536312d4c3ca39502ccf6a3af60cfe38ff4] drm/nouveau/bios: pass drm_device to ROMPTR, rather than nvbios
git bisect good f9f9f536312d4c3ca39502ccf6a3af60cfe38ff4
# bad: [d4c2c99bdc8385a0e51ce4ef2df124d14b6b9c9d] drm/nouveau/dp: remove broken display depth function, use the improved one
git bisect bad d4c2c99bdc8385a0e51ce4ef2df124d14b6b9c9d
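
In case it helps anyone picking this up: a log like the one above can be
saved to a file and fed back with "git bisect replay". Below is a small
sketch that summarizes where such a partial log stands. The helper name
is made up for illustration, and it takes the good/bad labels exactly as
written in the log (so for a "backwards" bisect the meanings are swapped).

```python
import re

# Matches only the command lines of a bisect log, not the "# good: [...]"
# comment lines that precede them.
BISECT_MARK = re.compile(r"^git bisect (good|bad) ([0-9a-f]{40})$")

PARTIAL_LOG = """\
git bisect start 'drivers/gpu/drm'
git bisect good 9232969e19ae7251a93ab72e405cf71e5109ec05
git bisect bad dea7e0ac45fd28f90bbc38ff226d36a9f788efbf
git bisect good d2491567cdbcb87b2682e0948a69d73c4dd8987e
git bisect good f9f9f536312d4c3ca39502ccf6a3af60cfe38ff4
git bisect bad d4c2c99bdc8385a0e51ce4ef2df124d14b6b9c9d
"""


def summarize_bisect_log(log_text: str) -> dict:
    """Collect the commits marked good and bad, in the order they appear."""
    marks = {"good": [], "bad": []}
    for line in log_text.splitlines():
        match = BISECT_MARK.match(line.strip())
        if match:
            verdict, sha = match.groups()
            marks[verdict].append(sha)
    return marks


marks = summarize_bisect_log(PARTIAL_LOG)
# The most recent "bad" mark is the current upper bound of the search.
print("latest bad: ", marks["bad"][-1][:12])
print("latest good:", marks["good"][-1][:12])
```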

Cheers,
--
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)


2012-05-01 15:06:58

by Alan

Subject: Re: Linux 3.4-rc4

> OK, this time bisecting started off relatively smoothly (doing the same
> "backwards" bisect on the branch-o-reverts as last time), but then my
> disk died halfway through... So I'll post the partial bisection results
> now (11 commits left to test), but I clearly have other things to fix
> before I can get back to this issue.

You may get stupid answers because of

commit eeefa4bea1af34207c5299f989fffe03628ea164
commit 8353e6c632aeaea1470a286b83e68ca233073068

Been there, trying to chase down a GMA500 problem that was muddled in
with the broken edid.h patch as well as a driver bug.

Alan

2012-05-01 15:31:28

by Nick Bowler

Subject: Re: Linux 3.4-rc4

On 2012-05-01 16:09 +0100, Alan Cox wrote:
> > OK, this time bisecting started off relatively smoothly (doing the same
> > "backwards" bisect on the branch-o-reverts as last time), but then my
> > disk died halfway through... So I'll post the partial bisection results
> > now (11 commits left to test), but I clearly have other things to fix
> > before I can get back to this issue.
>
> You may get stupid answers because of
>
> commit eeefa4bea1af34207c5299f989fffe03628ea164
> commit 8353e6c632aeaea1470a286b83e68ca233073068
>
> Been there, trying to chase down a GMA500 problem that was muddled in
> with the broken edid.h patch as well as a driver bug.

I'm afraid I don't understand. These commits do not appear to be in
Linus' tree?

Cheers,
--
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)

2012-05-01 15:43:03

by Alan

Subject: Re: Linux 3.4-rc4

On Tue, 1 May 2012 11:31:23 -0400
Nick Bowler <[email protected]> wrote:

> On 2012-05-01 16:09 +0100, Alan Cox wrote:
> > > OK, this time bisecting started off relatively smoothly (doing the same
> > > "backwards" bisect on the branch-o-reverts as last time), but then my
> > > disk died halfway through... So I'll post the partial bisection results
> > > now (11 commits left to test), but I clearly have other things to fix
> > > before I can get back to this issue.
> >
> > You may get stupid answers because of
> >
> > commit eeefa4bea1af34207c5299f989fffe03628ea164
> > commit 8353e6c632aeaea1470a286b83e68ca233073068
> >
> > Been there, trying to chase down a GMA500 problem that was muddled in
> > with the broken edid.h patch as well as a driver bug.
>
> I'm afraid I don't understand. These commits do not appear to be in

OK, they only got as far as the DRM tree. That's good, so you ought to get
a sane answer.

Alan

2012-05-02 01:21:23

by Nick Bowler

Subject: Re: Linux 3.4-rc4

(re-adding Ben to the Cc because he was apparently dropped somewhere in
this thread)

On 2012-05-01 09:23 -0400, Nick Bowler wrote:
> On 2012-04-30 11:07 +0200, Maarten Maathuis wrote:
> > On Mon, Apr 30, 2012 at 12:37 AM, Dmitry Torokhov
> > <[email protected]> wrote:
> > > On Sat, Apr 28, 2012 at 11:33:50AM -0400, Nick Bowler wrote:
> > >> On 2012-04-28 02:19 -0400, Alex Deucher wrote:
> > >> > On Fri, Apr 27, 2012 at 8:39 PM, Nick Bowler <[email protected]> wrote:
> > >> > > While tracking down the black screen issue, I've been having the monitor
> > >> > > directly connected to the video card the whole time, but now when I'm
> > >> > > connected through my KVM switch (an IOGear GCS1804), it appears that
> > >> > > something's going wrong with reading the EDID, because the available
> > >> > > modes are all screwed up (both console and X decide they want to drive
> > >> > > the display at 1024x768).
> [...]
> > >> > > Also, looking at /sys/class/drm/card0-VGA-1/edid I see that it
> > >> > > is empty on 3.4-rc4+ and it is correct on 3.2.15. Things seem
> > >> > > to work OK when the KVM is not involved.
> > >> >
> > >> > Were you ever able to fetch an EDID with the KVM involved? KVMs are
> > >> > notorious for not connecting the ddc pins.
> > >>
> > >> Yes, it works on 3.2.15 as described above.
> > >
> > > I have the same (or similar) KVM (not in the office at the moment) and I
> > > can confirm that with newer kernels EDID fetching is flaky. It's 50/50
> > > if EDID retrieval succeeds or if it fails with:
> > >
> > > Apr 26 13:06:57 dtor-d630 kernel: [13464.936336] [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, remainder is 208
> [...]
> > > Earlier kernels were able to retrieve EDIDs reliably.
>
> FWIW, for me EDID failure on new kernels is 100% reproducible, and there
> are no such checksum errors in the log. It's just missing.
>
> > Just a crazy thought, but didn't we change some timings related to
> > EDID retrieval? To make it faster.
>
> OK, this time bisecting started off relatively smoothly (doing the same
> "backwards" bisect on the branch-o-reverts as last time), but then my
> disk died halfway through...
[...]

OK, system is back online and I finished the bisection. The commit that
broke it for me is the following, and reverting it on top of 3.3.4 + the
"make VGA work at all" patch fixes this particular issue for me.

commit f553b79c03f0dbd52f6f03abe8233a2bef8cbd0d
Author: Ben Skeggs <[email protected]>
Date: Wed Dec 21 18:09:12 2011 +1000

    drm/nouveau/i2c: handle bit-banging ourselves

    i2c-algo-bit doesn't actually work very well on one card I have access to
    (NVS 300), random single-bit errors occur most of the time - what we're
    doing now is closer to what xf86i2c.c does.

    The original plan was to figure out why i2c-algo-bit fails on the NVS 300,
    and fix it. However, while investigating I discovered i2c-algo-bit calls
    cond_resched(), which makes it a bad idea for us to be using as we execute
    VBIOS scripts from a tasklet, and there may very well be i2c transfers as
    a result.

    So, since I already wrote this code in userspace to track down the NVS 300
    bug, and it's not really much code - lets use it.

    Signed-off-by: Ben Skeggs <[email protected]>

Cheers,
--
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)

2012-05-04 09:20:59

by Dave Airlie

Subject: Re: Linux 3.4-rc4

>>
>> FWIW, for me EDID failure on new kernels is 100% reproducible, and there
>> are no such checksum errors in the log. It's just missing.
>>
>> > Just a crazy thought, but didn't we change some timings related to
>> > EDID retrieval? To make it faster.
>>
>> OK, this time bisecting started off relatively smoothly (doing the same
>> "backwards" bisect on the branch-o-reverts as last time), but then my
>> disk died halfway through...
> [...]
>
> OK, system is back online and I finished the bisection. The commit that
> broke it for me is the following, and reverting it on top of 3.3.4 + the
> "make VGA work at all" patch fixes this particular issue for me.
>
Can you test with the attached patch? It's mostly a revert of Ben's patch, and
he says that with the i2c core change, things are working for him again.

Dave.


Attachments:
0001-drm-nouveau-i2c-resume-use-of-i2c-algo-bit-rather-th.patch (6.46 kB)

2012-05-05 15:39:49

by Nick Bowler

Subject: Re: Linux 3.4-rc4

On 2012-05-04 10:20 +0100, Dave Airlie wrote:
> >>
> >> FWIW, for me EDID failure on new kernels is 100% reproducible, and there
> >> are no such checksum errors in the log. It's just missing.
> >>
> >> > Just a crazy thought, but didn't we change some timings related to
> >> > EDID retrieval? To make it faster.
> >>
> >> OK, this time bisecting started off relatively smoothly (doing the same
> >> "backwards" bisect on the branch-o-reverts as last time), but then my
> >> disk died halfway through...
> > [...]
> >
> > OK, system is back online and I finished the bisection. The commit that
> > broke it for me is the following, and reverting it on top of 3.3.4 + the
> > "make VGA work at all" patch fixes this particular issue for me.
> >
> Can you test with the attached patch? It's mostly a revert of Ben's patch, and
> he says that with the i2c core change, things are working for him again.

Yup, this one seems to work on top of Linus' master.

Thanks,
--
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)