2011-04-25 09:28:15

by Boris Zhmurov

Subject: i915 completely unusable in 2.6.38.x

Hi!

After upgrading RHEL6 to 2.6.28.x kernel, I have spontaneous "GPU hung"
5-6 times per day.
xorg-x11-drv-intel-2.11.0-7.el6.i686 driver for Xorg.

-----------------------------8<-------------------------------
Apr 25 12:36:02 bb-work kernel: [drm:i915_hangcheck_elapsed] *ERROR*
Hangcheck timer elapsed... GPU hung
Apr 25 12:36:02 bb-work kernel: [drm:i915_do_wait_request] *ERROR*
i915_do_wait_request returns -11 (awaiting 76525 at 76524, next 76526)
Apr 25 12:36:04 bb-work kernel: [drm:i915_hangcheck_elapsed] *ERROR*
Hangcheck timer elapsed... GPU hung
Apr 25 12:36:04 bb-work kernel: [drm:i915_reset] *ERROR* GPU hanging too
fast, declaring wedged!
Apr 25 12:36:04 bb-work kernel: [drm:i915_reset] *ERROR* Failed to reset
chip.
-----------------------------8<-------------------------------

How can I help to debug this issue? Configs and logs attached. Thanks.




--
Boris B. Zhmurov
System/Network Administrator
mailto: [email protected]
"wget http://kernelpanic.ru/bb_public_key.pgp -O - | gpg --import"


Attachments:
config (97.69 kB)
dmesg (67.04 kB)
lspci_vx (13.76 kB)

2011-04-25 09:28:08

by Boris Zhmurov

Subject: Re: i915 completely unusable in 2.6.38.x

Boris B. Zhmurov, 04/25/2011 01:14 PM:

> Hi!
>
> After upgrading RHEL6 to 2.6.28.x kernel, I have spontaneous "GPU hung"
> 5-6 times per day.

I'm sorry, 2.6.*38*.x
Subject is correct.


--
Boris B. Zhmurov
System/Network Administrator
mailto: [email protected]
"wget http://kernelpanic.ru/bb_public_key.pgp -O - | gpg --import"

2011-04-25 20:17:32

by Melchior FRANZ

Subject: Re: i915 completely unusable in 2.6.38.x

* Boris B. Zhmurov -- Monday 25 April 2011:
> After upgrading RHEL6 to 2.6.28.x kernel, I have spontaneous "GPU hung"
> 5-6 times per day.
> xorg-x11-drv-intel-2.11.0-7.el6.i686 driver for Xorg.

JFTR, and although it seems to be a totally different problem with i915
(or rather gm45), the whole 2.6.38.* series is also of limited use on my
notebook, as described in this bug report:

https://bugzilla.kernel.org/show_bug.cgi?id=31522

I'm still using 2.6.38-rc8 (with SuSE 11.4/tumbleweed) on this machine
(Acer TravelMate 5735Z-452G32Mnss), because that's the last version that
supported KMS on my chipset. This older version works ok, so there's no
reason to complain, but I'm still a bit surprised about that regression.

m.

2011-04-25 21:03:39

by Greg KH

Subject: Re: i915 completely unusable in 2.6.38.x

On Mon, Apr 25, 2011 at 10:10:46PM +0200, Melchior FRANZ wrote:
> * Boris B. Zhmurov -- Monday 25 April 2011:
> > After upgrading RHEL6 to 2.6.28.x kernel, I have spontaneous "GPU hung"
> > 5-6 times per day.
> > xorg-x11-drv-intel-2.11.0-7.el6.i686 driver for Xorg.
>
> JFTR, and although it seems to be a totally different problem with i915
> (or rather gm45), the whole 2.6.38.* series is also of limited use on my
> notebook, as described in this bug report:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=31522
>
> I'm still using 2.6.38-rc8 (with SuSE 11.4/tumbleweed) on this machine
> (Acer TravelMate 5735Z-452G32Mnss), because that's the last version that
> supported KMS on my chipset. This older version works ok, so there's no
> reason to complain, but I'm still a bit surprised about that regression.

I don't understand, .38 should work as it has the above fix in it,
right?

Otherwise, what should be done for the .38-stable tree?

confused,

greg k-h

2011-04-26 07:59:59

by Melchior FRANZ

Subject: Re: i915 completely unusable in 2.6.38.x

I didn't really want to hijack this thread -- just confirm that the i915
driver is/was broken through the whole 2.6.38 series, up to 2.6.38.4.



* Greg KH -- Monday 25 April 2011:
> On Mon, Apr 25, 2011 at 10:10:46PM +0200, Melchior FRANZ wrote:
> > I'm still using 2.6.38-rc8 (with SuSE 11.4/tumbleweed) on this machine
> > (Acer TravelMate 5735Z-452G32Mnss), because that's the last version that
> > supported KMS on my chipset.

> I don't understand, .38 should work as it has the above fix in it,
> right?

[https://bugzilla.kernel.org/show_bug.cgi?id=31522]
Argh, sorry -- I hadn't noticed that someone closed this bug report.
No, this isn't fixed. Exact same symptoms: The notebook's screen
goes black early in the boot process with KMS. Closing and reopening
the lid doesn't help, contrary to what some people reported. Turning
off KMS kind of helps: the virtual console works after that, but
Xorg apparently gets contradictory info about the screen size, so
that parts of the desktop end up outside the screen area.



> Otherwise, what should be done for the .38-stable tree?

I had assumed that the commit that broke it was known, as not much
has been committed between 2.6.38-rc8 and 2.6.38, but I'll bisect
and investigate.

m.

2011-04-26 10:31:23

by Boris Zhmurov

Subject: Re: i915 completely unusable in 2.6.38.x

Melchior FRANZ, 04/26/2011 11:59 AM:

> I didn't really want to hijack this thread -- just confirm that the i915
> driver is/was broken through the whole 2.6.38 series, up to 2.6.38.4.

Yes, that is true for me too.

--
Boris B. Zhmurov
System/Network Administrator
mailto: [email protected]
"wget http://kernelpanic.ru/bb_public_key.pgp -O - | gpg --import"

2011-04-26 18:04:46

by Melchior FRANZ

Subject: Re: i915 completely unusable in 2.6.38.x

* Melchior FRANZ -- Tuesday 26 April 2011:
> [https://bugzilla.kernel.org/show_bug.cgi?id=31522]

I've replied to this bug report, but here again for this audience:
The commit that broke all 2.6.38.* releases for my machine is this:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=ba3820ade317ee36e496b9b40d2ec3987dd4aef0
(Takashi Iwai: "drm/i915: Revive combination mode for backlight
control").

In 'is_backlight_combination_mode()' INTEL_INFO(dev)->gen
equals 4, and the function returns 0x40000000 in "combination mode".
In 'intel_panel_get_backlight()' this happens:


    val = I915_READ(BLC_PWM_CTL) & BACKLIGHT_DUTY_CYCLE_MASK;  // val = 0x0b4a
    if (IS_PINEVIEW(dev))                                      // false
        val >>= 1;

    if (is_backlight_combination_mode(dev)) {
        u8 lbpc;

        val &= ~1;                                             // val = 0x0b4a
        pci_read_config_byte(dev->pdev, PCI_LBPC, &lbpc);      // lbpc = 0
        val *= lbpc;                                           // val = 0
    }


The backlight remains off and also does not react to the "brighter" key
event.
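
To make the arithmetic above concrete, here is a minimal standalone sketch
in C. It is not the driver code: the function name combined_backlight() and
its parameters are hypothetical, and the register and PCI config reads are
replaced by plain arguments. It only mirrors the mask-and-multiply step
from the excerpt, to show why an LBPC value of 0 forces the result to 0.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the combination-mode arithmetic.
 * pwm_duty:         value already masked with BACKLIGHT_DUTY_CYCLE_MASK
 * lbpc:             byte that the driver reads from PCI_LBPC
 * combination_mode: non-zero if combination mode is considered active
 */
static uint32_t combined_backlight(uint32_t pwm_duty, uint8_t lbpc,
                                   int combination_mode)
{
    uint32_t val = pwm_duty;

    if (combination_mode) {
        val &= ~1u;   /* clear the lowest bit, as in the excerpt */
        val *= lbpc;  /* scale by LBPC; lbpc == 0 always yields 0 */
    }
    return val;
}
```

With the values reported above, combined_backlight(0x0b4a, 0x00, 1)
returns 0, while with combination mode off the duty cycle 0x0b4a passes
through unchanged.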

Reverting the patch fixes my system, obviously.

m.