Hi!
After upgrading RHEL6 to 2.6.28.x kernel, I have spontaneous "GPU hung"
5-6 times per day.
xorg-x11-drv-intel-2.11.0-7.el6.i686 driver for Xorg.
-----------------------------8<-------------------------------
Apr 25 12:36:02 bb-work kernel: [drm:i915_hangcheck_elapsed] *ERROR*
Hangcheck timer elapsed... GPU hung
Apr 25 12:36:02 bb-work kernel: [drm:i915_do_wait_request] *ERROR*
i915_do_wait_request returns -11 (awaiting 76525 at 76524, next 76526)
Apr 25 12:36:04 bb-work kernel: [drm:i915_hangcheck_elapsed] *ERROR*
Hangcheck timer elapsed... GPU hung
Apr 25 12:36:04 bb-work kernel: [drm:i915_reset] *ERROR* GPU hanging too
fast, declaring wedged!
Apr 25 12:36:04 bb-work kernel: [drm:i915_reset] *ERROR* Failed to reset
chip.
-----------------------------8<-------------------------------
How can I help to debug this issue? Configs and logs attached. Thanks.
--
Boris B. Zhmurov
System/Network Administrator
mailto: [email protected]
"wget http://kernelpanic.ru/bb_public_key.pgp -O - | gpg --import"
Boris B. Zhmurov, 04/25/2011 01:14 PM:
> Hi!
>
> After upgrading RHEL6 to 2.6.28.x kernel, I have spontaneous "GPU hung"
> 5-6 times per day.
I'm sorry, 2.6.*38*.x
Subject is correct.
* Boris B. Zhmurov -- Monday 25 April 2011:
> After upgrading RHEL6 to 2.6.28.x kernel, I have spontaneous "GPU hung"
> 5-6 times per day.
> xorg-x11-drv-intel-2.11.0-7.el6.i686 driver for Xorg.
JFTR, and although it seems to be a totally different problem with i915
(or rather gm45), the whole 2.6.38.* series is also of limited use on my
notebook, as described in this bug report:
https://bugzilla.kernel.org/show_bug.cgi?id=31522
I'm still using 2.6.38-rc8 (with SuSE 11.4/tumbleweed) on this machine
(Acer TravelMate 5735Z-452G32Mnss), because that's the last version that
supported KMS on my chipset. This older version works ok, so there's no
reason to complain, but I'm still a bit surprised about that regression.
m.
On Mon, Apr 25, 2011 at 10:10:46PM +0200, Melchior FRANZ wrote:
> * Boris B. Zhmurov -- Monday 25 April 2011:
> > After upgrading RHEL6 to 2.6.28.x kernel, I have spontaneous "GPU hung"
> > 5-6 times per day.
> > xorg-x11-drv-intel-2.11.0-7.el6.i686 driver for Xorg.
>
> JFTR, and although it seems to be a totally different problem with i915
> (or rather gm45), the whole 2.6.38.* series is also of limited use on my
> notebook, as described in this bug report:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=31522
>
> I'm still using 2.6.38-rc8 (with SuSE 11.4/tumbleweed) on this machine
> (Acer TravelMate 5735Z-452G32Mnss), because that's the last version that
> supported KMS on my chipset. This older version works ok, so there's no
> reason to complain, but I'm still a bit surprised about that regression.
I don't understand, .38 should work as it has the above fix in it,
right?
Otherwise, what should be done for the .38-stable tree?
confused,
greg k-h
I didn't really want to hijack this thread -- just confirm that the i915
driver is/was broken through the whole 2.6.38 series, up to 2.6.38.4.
* Greg KH -- Monday 25 April 2011:
> On Mon, Apr 25, 2011 at 10:10:46PM +0200, Melchior FRANZ wrote:
> > I'm still using 2.6.38-rc8 (with SuSE 11.4/tumbleweed) on this machine
> > (Acer TravelMate 5735Z-452G32Mnss), because that's the last version that
> > supported KMS on my chipset.
> I don't understand, .38 should work as it has the above fix in it,
> right?
[https://bugzilla.kernel.org/show_bug.cgi?id=31522]
Argh, sorry -- I hadn't noticed that someone closed this bug report.
No, this isn't fixed. Exact same symptoms: The notebook's screen
goes black early in the boot process with KMS. Closing and reopening
the lid doesn't help, contrary to what some people reported. Turning
off KMS kind of helps: the virtual console works after that, but
Xorg apparently gets contradictory info about the screen size, so
that parts of the desktop end up outside the screen area.
> Otherwise, what should be done for the .38-stable tree?
I had assumed that the commit that broke it was known, as not much
has been committed between 2.6.38-rc8 and 2.6.38, but I'll bisect
and investigate.
m.
Melchior FRANZ, 04/26/2011 11:59 AM:
> I didn't really want to hijack this thread -- just confirm that the i915
> driver is/was broken through the whole 2.6.38 series, up to 2.6.38.4.
Yes, that is true for me too.
* Melchior FRANZ -- Tuesday 26 April 2011:
> [https://bugzilla.kernel.org/show_bug.cgi?id=31522]
I've replied to this bug report, but here it is again for this audience:
The commit that broke all 2.6.38.* releases for my machine is this:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=ba3820ade317ee36e496b9b40d2ec3987dd4aef0
(Takashi Iwai: "drm/i915: Revive combination mode for backlight
control").
On my machine INTEL_INFO(dev)->gen equals 4 in
'is_backlight_combination_mode()', and the function returns 0x40000000,
i.e. "combination mode" is active.
In 'intel_panel_get_backlight()' this happens:
    val = I915_READ(BLC_PWM_CTL) & BACKLIGHT_DUTY_CYCLE_MASK;  // val = 0x0b4a
    if (IS_PINEVIEW(dev))                                      // false
        val >>= 1;
    if (is_backlight_combination_mode(dev)) {
        u8 lbpc;
        val &= ~1;                                             // val = 0x0b4a
        pci_read_config_byte(dev->pdev, PCI_LBPC, &lbpc);      // lbpc = 0
        val *= lbpc;                                           // val = 0
    }
The backlight remains off and also does not react to the "brighter" key
event.
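
For anyone who wants to check the arithmetic without booting the affected
kernel, here is a minimal user-space sketch of the computation traced above.
It is not the driver code: backlight_from_regs() is a hypothetical helper,
the register contents are passed in as plain arguments, the Pineview branch
is left out because it is false on this machine, and the 0xffff mask value
is an assumption for illustration (the real definition is in the i915
register header).

```c
#include <stdint.h>

/* Assumed mask value, for illustration only. */
#define BACKLIGHT_DUTY_CYCLE_MASK 0xffffu

/*
 * Hypothetical stand-in for the traced part of intel_panel_get_backlight().
 * blc_pwm_ctl:      value that I915_READ(BLC_PWM_CTL) would return
 * combination_mode: non-zero if is_backlight_combination_mode() is true
 * lbpc:             byte read from the PCI_LBPC config register
 */
static uint32_t backlight_from_regs(uint32_t blc_pwm_ctl,
                                    int combination_mode,
                                    uint8_t lbpc)
{
    uint32_t val = blc_pwm_ctl & BACKLIGHT_DUTY_CYCLE_MASK;

    if (combination_mode) {
        val &= ~1u;   /* clear the lowest bit of the duty cycle */
        val *= lbpc;  /* scale by LBPC; lbpc == 0 forces val to 0 */
    }
    return val;
}
```

With the values observed here, backlight_from_regs(0x0b4a, 1, 0) yields 0,
while the same call with combination_mode = 0 yields 0x0b4a. So as long as
LBPC reads back as 0, the reported backlight level is 0 regardless of the
PWM duty cycle.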
Reverting the patch fixes my system, obviously.
m.