2009-01-12 12:57:01

by Frans Pop

Subject: [2.6.28] Kernel panic after closing lid on HP 2510p

I got the following series of errors while closing the lid of my notebook.
The end result was a frozen system that needed a hard power off.

The notebook was docked and connected to an external monitor. Closing the lid
will only power off both displays, not suspend.

I possibly tried to do a few things too quickly in succession, but
AFAIK that should still not result in the kernel crapping out on me ;-)

System is HP 2510p notebook running 2.6.28 (x86_64, Debian amd64/lenny)
with a few additional patches on top.

Cheers,
FJP

Errors in kern.log after rebooting:

BUG: unable to handle kernel paging request at ffff88007ceeb5a0
IP: [<ffffffff8034e077>] acpi_ex_field_datum_io+0xec/0x17e
PGD 202063 PUD 8067 PMD 6ff52163 PTE 800000007ceeb163
Oops: 0011 [#1] SMP
last sysfs file: /sys/class/power_supply/C23D/charge_full
CPU 1
Modules linked in: iwlagn iwlcore mac80211 cfg80211 isofs zlib_inflate usbhid hid vboxdrv tcp_diag inet_diag i915
drm ppdev parport_pc lp parport nfs lockd nfs_acl sunrpc ipv6 ext2 coretemp hp_wmi acpi_cpufreq loop joydev
snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm arc4 snd_seq_dummy ecb snd_seq_oss snd_seq_midi snd_rawmidi
snd_seq_midi_event snd_seq snd_timer rfkill snd_seq_device pcmcia snd soundcore psmouse yenta_socket
rsrc_nonstatic iTCO_wdt snd_page_alloc serio_raw pcspkr pcmcia_core intel_agp battery video output wmi
leds_hp_disk led_class container ac button evdev ext3 jbd mbcache sha256_generic aes_x86_64 aes_generic cbc
dm_crypt dm_mirror dm_region_hash dm_log dm_snapshot dm_mod sg sr_mod cdrom sd_mod piix ata_piix ide_pci_generic
ide_core pata_acpi ricoh_mmc sdhci_pci sdhci mmc_core ohci1394 ieee1394 ata_generic ehci_hcd libata uhci_hcd
e1000e scsi_mod thermal processor fan thermal_sys [last unloaded: cfg80211]
Pid: 70, comm: kacpid Not tainted 2.6.28-rjw #83
RIP: 61a0:[<ffffffff803534ed>] [<ffffffff803534ed>] acpi_ns_search_one_scope+0x1d/0x46
RSP: ffffffff8028fbf2:ffff88007e1d7b10 EFLAGS: 00000005
RAX: ffff88007e046510 RBX: ffff88007ceeb5a0 RCX: ffff88007e1d7b70
RDX: 0000000000000000 RSI: ffff88007e1d7ae0 RDI: ffffffff8028fbf2
RBP: ffff88007e1d7b80 R08: 00000003000000b2 R09: ffff88007e1d7b70
R10: 0000000000000000 R11: ffff88007e1d7c60 R12: ffff88007e1d7b10
R13: ffffffff8034df7e R14: ffff88007e1d7ad0 R15: ffff88007e046510
FS: 0000000000000000(0000) GS:ffff88007e002a80(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: ffff88007ceeb5a0 CR3: 0000000063b35000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kacpid (pid: 70, threadinfo ffff88007e1d6000, task ffff88007e1d95f0)
Stack:
BUG: unable to handle kernel paging request at 000000007e1d7b00
IP: [<ffffffff8020f140>] show_stack_log_lvl+0xb0/0x125
PGD 63b00067 PUD 73867067 PMD 0
Oops: 0000 [#2] SMP
last sysfs file: /sys/class/power_supply/C23D/charge_full
CPU 1
Modules linked in: iwlagn iwlcore mac80211 cfg80211 isofs zlib_inflate usbhid hid vboxdrv tcp_diag inet_diag i915
drm ppdev parport_pc lp parport nfs lockd nfs_acl sunrpc ipv6 ext2 coretemp hp_wmi acpi_cpufreq loop joydev
snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm arc4 snd_seq_dummy ecb snd_seq_oss snd_seq_midi snd_rawmidi
snd_seq_midi_event snd_seq snd_timer rfkill snd_seq_device pcmcia snd soundcore psmouse yenta_socket
rsrc_nonstatic iTCO_wdt snd_page_alloc serio_raw pcspkr pcmcia_core intel_agp battery video output wmi
leds_hp_disk led_class container ac button evdev ext3 jbd mbcache sha256_generic aes_x86_64 aes_generic cbc
dm_crypt dm_mirror dm_region_hash dm_log dm_snapshot dm_mod sg sr_mod cdrom sd_mod piix ata_piix ide_pci_generic
ide_core pata_acpi ricoh_mmc sdhci_pci sdhci mmc_core ohci1394 ieee1394 ata_generic ehci_hcd libata uhci_hcd
e1000e scsi_mod thermal processor fan thermal_sys [last unloaded: cfg80211]
Pid: 70, comm: kacpid Not tainted 2.6.28-rjw #83
RIP: 0010:[<ffffffff8020f140>] [<ffffffff8020f140>] show_stack_log_lvl+0xb0/0x125
RSP: 0018:ffff88007e1d7818 EFLAGS: 00010046
RAX: ffff88007e093fc0 RBX: 000000007e1d7b00 RCX: ffff88007e1d7b80
RDX: ffff880001017c00 RSI: ffff88007e1d7a58 RDI: 0000000000000000
RBP: ffff88007e1d7868 R08: ffffffff80513f7d R09: 0000000000000000
R10: ffffffff8067ee90 R11: 00000000000253d4 R12: 0000000000000000
R13: ffffffff80513f7d R14: 0000000000000000 R15: ffff88007e097fc0
FS: 0000000000000000(0000) GS:ffff88007e002a80(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 000000007e1d7b00 CR3: 0000000063b35000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kacpid (pid: 70, threadinfo ffff88007e1d6000, task ffff88007e1d95f0)
Stack:
ffff88007e1d7868 ffff88007e1d7b80 ffff88007e1d7a58 ffff88007e093fc0
000000007e1d7b00 ffff88007e1d95f0 0000000000000ac0 ffff88007e1d7a58
0000000000000040 000000007e1d7b00 ffff88007e1d78a8 ffffffff8020f26b
Call Trace:
Code: e8 8c a8 22 00 eb 08 f7 c3 ff 1f 00 00 74 42 45 85 e4 74 17 41 f6 c4 03 75 11 4c 89 ee 48 c7 c7 21 3f 51 80
31 c0 e8 66 a8 22 00 <48> 8b 33 48 c7 c7 25 3f 51 80 31 c0 48 83 c3 08 41 ff c4 e8 4e
RIP [<ffffffff8020f140>] show_stack_log_lvl+0xb0/0x125
RSP <ffff88007e1d7818>
CR2: 000000007e1d7b00
---[ end trace ddc817f6d2e9b476 ]---
Kernel panic - not syncing: Attempted to kill the idle task!
------------[ cut here ]------------
WARNING: at kernel/smp.c:333 smp_call_function_mask+0x40/0x1e9()
Modules linked in: iwlagn iwlcore mac80211 cfg80211 isofs zlib_inflate usbhid hid vboxdrv tcp_diag inet_diag i915
drm ppdev parport_pc lp parport nfs lockd nfs_acl sunrpc ipv6 ext2 coretemp hp_wmi acpi_cpufreq loop joydev
snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm arc4 snd_seq_dummy ecb snd_seq_oss snd_seq_midi snd_rawmidi
snd_seq_midi_event snd_seq snd_timer rfkill snd_seq_device pcmcia snd soundcore psmouse yenta_socket
rsrc_nonstatic iTCO_wdt snd_page_alloc serio_raw pcspkr pcmcia_core intel_agp battery video output wmi
leds_hp_disk led_class container ac button evdev ext3 jbd mbcache sha256_generic aes_x86_64 aes_generic cbc
dm_crypt dm_mirror dm_region_hash dm_log dm_snapshot dm_mod sg sr_mod cdrom sd_mod piix ata_piix ide_pci_generic
ide_core pata_acpi ricoh_mmc sdhci_pci sdhci mmc_core ohci1394 ieee1394 ata_generic ehci_hcd libata uhci_hcd
e1000e scsi_mod thermal processor fan thermal_sys [last unloaded: cfg80211]
Pid: 0, comm: swapper Tainted: G D 2.6.28-rjw #83
Call Trace:
---[ end trace ddc817f6d2e9b476 ]---
------------[ cut here ]------------
WARNING: at kernel/smp.c:220 smp_call_function_single+0x3d/0xb8()
Modules linked in: iwlagn iwlcore mac80211 cfg80211 isofs zlib_inflate usbhid hid vboxdrv tcp_diag inet_diag i915
drm ppdev parport_pc lp parport nfs lockd nfs_acl sunrpc ipv6 ext2 coretemp hp_wmi acpi_cpufreq loop joydev
snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm arc4 snd_seq_dummy ecb snd_seq_oss snd_seq_midi snd_rawmidi
snd_seq_midi_event snd_seq snd_timer rfkill snd_seq_device pcmcia snd soundcore psmouse yenta_socket
rsrc_nonstatic iTCO_wdt snd_page_alloc serio_raw pcspkr pcmcia_core intel_agp battery video output wmi
leds_hp_disk led_class container ac button evdev ext3 jbd mbcache sha256_generic aes_x86_64 aes_generic cbc
dm_crypt dm_mirror dm_region_hash dm_log dm_snapshot dm_mod sg sr_mod cdrom sd_mod piix ata_piix ide_pci_generic
ide_core pata_acpi ricoh_mmc sdhci_pci sdhci mmc_core ohci1394 ieee1394 ata_generic ehci_hcd libata uhci_hcd
e1000e scsi_mod thermal processor fan thermal_sys [last unloaded: cfg80211]
Pid: 0, comm: swapper Tainted: G D W 2.6.28-rjw #83
Call Trace:
---[ end trace ddc817f6d2e9b476 ]---


2009-01-13 12:56:55

by Martin Michlmayr

Subject: Re: [2.6.28] Kernel panic after closing lid on HP 2510p

* Frans Pop <[email protected]> [2009-01-12 13:56]:
> I got the following series of errors while closing the lid of my notebook.
> The end result was a frozen system that needed a hard power off.
...
> System is HP 2510p notebook running 2.6.28 (x86_64, Debian amd64/lenny)
> with a few additional patches on top.

This came up on lkml and elsewhere before. It's probably a BIOS bug.
You can work around it with:

echo 7 > /proc/acpi/video/C09A/DOS
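
On machines where the video bus is not named C09A, the same write could be scripted. The following is only a sketch: the helper name `set_dos` and its `base` parameter are made up for illustration, the value 7 is simply the one from the command above, and it assumes the kernel exposes `/proc/acpi/video/<bus>/DOS` files and that the caller is root.

```python
import glob
import os

def set_dos(value=7, base="/proc/acpi/video"):
    """Write `value` to every <base>/<bus>/DOS file; return the paths written.

    `base` is a parameter only so the function can be tried on a scratch
    directory; the bus directory name (C09A above) differs per machine.
    """
    written = []
    for path in sorted(glob.glob(os.path.join(base, "*", "DOS"))):
        with open(path, "w") as f:
            f.write(f"{value}\n")
        written.append(path)
    return written
```

Calling `set_dos()` as root would apply the workaround to every video bus it finds; an empty return value means no DOS file was found.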

--
Martin Michlmayr
http://www.cyrius.com/

2009-01-13 17:45:26

by David Hagood

Subject: Git as of 13-Jan-2009 build fail on OMAP3

I am attempting to build the stock kernel as pulled from the kernel GIT
server for an OMAP3 platform, and am getting errors.

The version I am working with (as displayed by "git show") is "commit
e0b325d310a6b11f1538413fd557d2eb98f2fae5".

The error is:


CC arch/arm/mach-omap2/board-omap3beagle.o
arch/arm/mach-omap2/board-omap3beagle.c: In function ‘beagle_twl_gpio_setup’:
arch/arm/mach-omap2/board-omap3beagle.c:132: error: ‘TWL4030_GPIO_MAX’
undeclared (first use in this function)
arch/arm/mach-omap2/board-omap3beagle.c:132: error: (Each undeclared
identifier is reported only once
arch/arm/mach-omap2/board-omap3beagle.c:132: error: for each function it
appears in.)
arch/arm/mach-omap2/board-omap3beagle.c: At top level:
arch/arm/mach-omap2/board-omap3beagle.c:141: error: variable
‘beagle_gpio_data’ has initializer but incomplete type
arch/arm/mach-omap2/board-omap3beagle.c:142: error: unknown field
‘gpio_base’ specified in initializer
arch/arm/mach-omap2/board-omap3beagle.c:142: warning: excess elements in
struct initializer
arch/arm/mach-omap2/board-omap3beagle.c:142: warning: (near initialization
for ‘beagle_gpio_data’)
arch/arm/mach-omap2/board-omap3beagle.c:143: error: unknown field
‘irq_base’ specified in initializer
arch/arm/mach-omap2/board-omap3beagle.c:143: warning: excess elements in
struct initializer
arch/arm/mach-omap2/board-omap3beagle.c:143: warning: (near initialization
for ‘beagle_gpio_data’)
arch/arm/mach-omap2/board-omap3beagle.c:144: error: unknown field
‘irq_end’ specified in initializer
arch/arm/mach-omap2/board-omap3beagle.c:144: warning: excess elements in
struct initializer
arch/arm/mach-omap2/board-omap3beagle.c:144: warning: (near initialization
for ‘beagle_gpio_data’)
arch/arm/mach-omap2/board-omap3beagle.c:145: error: unknown field
‘use_leds’ specified in initializer
arch/arm/mach-omap2/board-omap3beagle.c:145: warning: excess elements in
struct initializer
arch/arm/mach-omap2/board-omap3beagle.c:145: warning: (near initialization
for ‘beagle_gpio_data’)
arch/arm/mach-omap2/board-omap3beagle.c:146: error: unknown field
‘pullups’ specified in initializer
arch/arm/mach-omap2/board-omap3beagle.c:146: warning: excess elements in
struct initializer
arch/arm/mach-omap2/board-omap3beagle.c:146: warning: (near initialization
for ‘beagle_gpio_data’)
arch/arm/mach-omap2/board-omap3beagle.c:147: error: unknown field
‘pulldowns’ specified in initializer
arch/arm/mach-omap2/board-omap3beagle.c:148: warning: excess elements in
struct initializer
arch/arm/mach-omap2/board-omap3beagle.c:148: warning: (near initialization
for ‘beagle_gpio_data’)
arch/arm/mach-omap2/board-omap3beagle.c:149: error: unknown field ‘setup’
specified in initializer
arch/arm/mach-omap2/board-omap3beagle.c:149: warning: excess elements in
struct initializer
arch/arm/mach-omap2/board-omap3beagle.c:149: warning: (near initialization
for ‘beagle_gpio_data’)
arch/arm/mach-omap2/board-omap3beagle.c:152: error: variable
‘beagle_twldata’ has initializer but incomplete type
arch/arm/mach-omap2/board-omap3beagle.c:153: error: unknown field
‘irq_base’ specified in initializer
arch/arm/mach-omap2/board-omap3beagle.c:153: warning: excess elements in
struct initializer
arch/arm/mach-omap2/board-omap3beagle.c:153: warning: (near initialization
for ‘beagle_twldata’)
arch/arm/mach-omap2/board-omap3beagle.c:154: error: unknown field
‘irq_end’ specified in initializer
arch/arm/mach-omap2/board-omap3beagle.c:154: warning: excess elements in
struct initializer
arch/arm/mach-omap2/board-omap3beagle.c:154: warning: (near initialization
for ‘beagle_twldata’)
arch/arm/mach-omap2/board-omap3beagle.c:157: error: unknown field ‘gpio’
specified in initializer
arch/arm/mach-omap2/board-omap3beagle.c:157: warning: excess elements in
struct initializer
arch/arm/mach-omap2/board-omap3beagle.c:157: warning: (near initialization
for ‘beagle_twldata’)
arch/arm/mach-omap2/board-omap3beagle.c: In function ‘omap3_beagle_init’:
arch/arm/mach-omap2/board-omap3beagle.c:308: error: ‘gpio’ undeclared
(first use in this function)
make[1]: *** [arch/arm/mach-omap2/board-omap3beagle.o] Error 1
make: *** [arch/arm/mach-omap2] Error 2

I do have the TWL4030 enabled in the config.

Has anybody else tried this?

2009-01-13 19:31:29

by Frans Pop

Subject: Re: [2.6.28] Kernel panic after closing lid on HP 2510p

On Tuesday 13 January 2009, Martin Michlmayr wrote:
> * Frans Pop <[email protected]> [2009-01-12 13:56]:
> > I got the following series of errors while closing the lid of my
> > notebook. The end result was a frozen system that needed a hard power
> > off.
>
> ...
>
> > System is HP 2510p notebook running 2.6.28 (x86_64, Debian
> > amd64/lenny) with a few additional patches on top.
>
> This came up on lkml and elsewhere before. It's probably a BIOS bug.
> You can work around it with:
>
> echo 7 > /proc/acpi/video/C09A/DOS

Thanks Martin, I'll give that a try.

2009-01-14 15:55:40

by Tony Lindgren

Subject: Re: Git as of 13-Jan-2009 build fail on OMAP3

* [email protected] <[email protected]> [090114 12:18]:
> I am attempting to build the stock kernel as pulled from the kernel GIT
> server for an OMAP3 platform, and am getting errors.
>
> The version I am working with (as displayed by "git show") is "commit
> e0b325d310a6b11f1538413fd557d2eb98f2fae5".

I posted some OMAP build fixes to linux-arm-kernel a few days ago.
You can cherry-pick the build fixes from:

http://git.kernel.org/?p=linux/kernel/git/tmlind/linux-omap-2.6.git;a=shortlog;h=omap-fixes

Hopefully these will get to mainline within a few days.

Regards,

Tony




2009-01-15 00:26:25

by Andrew Morton

Subject: Re: [2.6.28] Kernel panic after closing lid on HP 2510p

(cc linux-acpi)

On Tue, 13 Jan 2009 13:30:56 +0100
Martin Michlmayr <[email protected]> wrote:

> * Frans Pop <[email protected]> [2009-01-12 13:56]:
> > I got the following series of errors while closing the lid of my notebook.
> > The end result was a frozen system that needed a hard power off.
> ...
> > System is HP 2510p notebook running 2.6.28 (x86_64, Debian amd64/lenny)
> > with a few additional patches on top.
>
> This came up on lkml and elsewhere before. It's probably a BIOS bug.
> You can work around it with:
>
> echo 7 > /proc/acpi/video/C09A/DOS
>

It'd be a very special BIOS bug if it can reach out and make the kernel
oops.


> I got the following series of errors while closing the lid of my notebook.
> The end result was a frozen system that needed a hard power off.
>
> The notebook was docked and connected to external monitor. Closing the lid
> will only power off both displays, not suspend.
>
> I possibly tried to do a few things too quickly in succession, but
> AFAIK that should still not result in the kernel crapping out on me ;-)
>
> System is HP 2510p notebook running 2.6.28 (x86_64, Debian amd64/lenny)
> with a few additional patches on top.
>
> Cheers,
> FJP
>
> Errors in kern.log after rebooting:
>
> BUG: unable to handle kernel paging request at ffff88007ceeb5a0
> IP: [<ffffffff8034e077>] acpi_ex_field_datum_io+0xec/0x17e
> PGD 202063 PUD 8067 PMD 6ff52163 PTE 800000007ceeb163
> Oops: 0011 [#1] SMP
> last sysfs file: /sys/class/power_supply/C23D/charge_full
> CPU 1
> Modules linked in: iwlagn iwlcore mac80211 cfg80211 isofs zlib_inflate usbhid hid vboxdrv tcp_diag inet_diag i915
> drm ppdev parport_pc lp parport nfs lockd nfs_acl sunrpc ipv6 ext2 coretemp hp_wmi acpi_cpufreq loop joydev
> snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm arc4 snd_seq_dummy ecb snd_seq_oss snd_seq_midi snd_rawmidi
> snd_seq_midi_event snd_seq snd_timer rfkill snd_seq_device pcmcia snd soundcore psmouse yenta_socket
> rsrc_nonstatic iTCO_wdt snd_page_alloc serio_raw pcspkr pcmcia_core intel_agp battery video output wmi
> leds_hp_disk led_class container ac button evdev ext3 jbd mbcache sha256_generic aes_x86_64 aes_generic cbc
> dm_crypt dm_mirror dm_region_hash dm_log dm_snapshot dm_mod sg sr_mod cdrom sd_mod piix ata_piix ide_pci_generic
> ide_core pata_acpi ricoh_mmc sdhci_pci sdhci mmc_core ohci1394 ieee1394 ata_generic ehci_hcd libata uhci_hcd
> e1000e scsi_mod thermal processor fan thermal_sys [last unloaded: cfg80211]
> Pid: 70, comm: kacpid Not tainted 2.6.28-rjw #83
> RIP: 61a0:[<ffffffff803534ed>] [<ffffffff803534ed>] acpi_ns_search_one_scope+0x1d/0x46
> RSP: ffffffff8028fbf2:ffff88007e1d7b10 EFLAGS: 00000005
> RAX: ffff88007e046510 RBX: ffff88007ceeb5a0 RCX: ffff88007e1d7b70
> RDX: 0000000000000000 RSI: ffff88007e1d7ae0 RDI: ffffffff8028fbf2
> RBP: ffff88007e1d7b80 R08: 00000003000000b2 R09: ffff88007e1d7b70
> R10: 0000000000000000 R11: ffff88007e1d7c60 R12: ffff88007e1d7b10
> R13: ffffffff8034df7e R14: ffff88007e1d7ad0 R15: ffff88007e046510
> FS: 0000000000000000(0000) GS:ffff88007e002a80(0000) knlGS:0000000000000000
> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: ffff88007ceeb5a0 CR3: 0000000063b35000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process kacpid (pid: 70, threadinfo ffff88007e1d6000, task ffff88007e1d95f0)
> Stack:
> BUG: unable to handle kernel paging request at 000000007e1d7b00
> IP: [<ffffffff8020f140>] show_stack_log_lvl+0xb0/0x125
> PGD 63b00067 PUD 73867067 PMD 0

If the BIOS is bad then the kernel would ideally report that fact and
then take some sort of avoiding action. It shouldn't oops!

Frans, please raise a bugzilla against acpi for this if nothing happens
in the next few days, thanks.

2009-01-15 02:03:32

by Matthew Garrett

Subject: Re: [2.6.28] Kernel panic after closing lid on HP 2510p

On Wed, Jan 14, 2009 at 04:26:03PM -0800, Andrew Morton wrote:

> It'd be a very special BIOS bug if it can reach out and make the kernel
> oops.

Lid actions typically trigger SMI code, so it's entirely capable of
destroying CPU state in such a way that the kernel falls over (and
probably even in ways that cause the kernel to turn green, emit pleasing
warbling noises or invade neighbouring pieces of hardware). In this case
it seems to be SMP specific - the system's entirely stable in UP mode.
It's greatly vexing.

--
Matthew Garrett | [email protected]

2009-01-15 02:16:23

by Andrew Morton

Subject: Re: [2.6.28] Kernel panic after closing lid on HP 2510p

On Thu, 15 Jan 2009 02:03:11 +0000 Matthew Garrett <[email protected]> wrote:

> On Wed, Jan 14, 2009 at 04:26:03PM -0800, Andrew Morton wrote:
>
> > It'd be a very special BIOS bug if it can reach out and make the kernel
> > oops.
>
> Lid actions typically trigger SMI code, so it's entirely capable of
> destroying CPU state in such a way that the kernel falls over (and
> probably even in ways that cause the kernel to turn green, emit pleasing
> warbling noises or invade neighbouring pieces of hardware). In this case
> it seems to be SMP specific - the system's entirely stable in UP mode.
> It's greatly vexing.

Does it always crash in the same way?

If so, we can put crash-avoidance code at the offending callsite and
back out gracefully?

2009-01-15 02:21:24

by Matthew Garrett

Subject: Re: [2.6.28] Kernel panic after closing lid on HP 2510p

On Wed, Jan 14, 2009 at 06:15:42PM -0800, Andrew Morton wrote:
> On Thu, 15 Jan 2009 02:03:11 +0000 Matthew Garrett <[email protected]> wrote:
> > Lid actions typically trigger SMI code, so it's entirely capable of
> > destroying CPU state in such a way that the kernel falls over (and
> > probably even in ways that cause the kernel to turn green, emit pleasing
> > warbling noises or invade neighbouring pieces of hardware). In this case
> > it seems to be SMP specific - the system's entirely stable in UP mode.
> > It's greatly vexing.
>
> Does it always crash in the same way?

Not in my experience. Sometimes it falls over in the ACPI parsing code,
but sometimes the backtrace is nonsense and the IP is in the middle of
nowhere. I've now got a working 2510p again, so maybe I'll have time to
look at this on the way to LCA.
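
As an aside for readers of the log at the top of the thread: the number printed after "Oops:" is the x86 page-fault error code in hex, and decoding it shows what kind of access faulted. The sketch below is illustrative only; the names `PF_BITS` and `decode_oops_code` are made up here. For the first oops (0011) it reports a protection violation during an instruction fetch in kernel mode, which would be consistent with the CPU ending up executing from somewhere it should not.

```python
# x86 page-fault error code bits, as printed (in hex) after "Oops:".
# Each entry: (mask, meaning if set, meaning if clear or None).
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]

def decode_oops_code(code):
    """Return human-readable flags for an x86 page-fault error code."""
    flags = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            flags.append(if_set)
        elif if_clear is not None:
            flags.append(if_clear)
    return flags

print(decode_oops_code(0x0011))  # first oops in the report
print(decode_oops_code(0x0000))  # second oops, in show_stack_log_lvl
```

The second oops (0000) decodes as a plain kernel-mode read of a page that is not present.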

--
Matthew Garrett | [email protected]

2009-01-15 02:54:10

by Zhang, Rui

Subject: Re: [2.6.28] Kernel panic after closing lid on HP 2510p

On Thu, 2009-01-15 at 10:21 +0800, Matthew Garrett wrote:
> On Wed, Jan 14, 2009 at 06:15:42PM -0800, Andrew Morton wrote:
> > On Thu, 15 Jan 2009 02:03:11 +0000 Matthew Garrett <[email protected]> wrote:
> > > Lid actions typically trigger SMI code, so it's entirely capable of
> > > destroying CPU state in such a way that the kernel falls over (and
> > > probably even in ways that cause the kernel to turn green, emit pleasing
> > > warbling noises or invade neighbouring pieces of hardware). In this case
> > > it seems to be SMP specific - the system's entirely stable in UP mode.
> > > It's greatly vexing.
> >
> > Does it always crash in the same way?
>
> Not in my experience. Sometimes it falls over in the ACPI parsing code,
> but sometimes the backtrace is nonsense and the IP is in the middle of
> nowhere. I've now got a working 2510p again, so maybe I'll have time to
> look at this on the way to LCA.
>
please check if this is a duplicate of bug #11259.
http://bugzilla.kernel.org/show_bug.cgi?id=11259


We (Wu Fengguang and I) reproduced this bug on an HP 6910p,
and we found that Windows runs a different AML code path when closing the lid
on this laptop and the SMI is not invoked...
I have also verified how to make Linux run the same code path by changing
the IGD OpRegion code.

DIDL is an IGD OpRegion field, the Supported Display Devices ID List.
It is evaluated by the _DOD method when the ACPI video driver is loaded.
According to the spec, "The graphics driver writes to this field
once during its initialization".
If DIDL is not empty, a flag is set and the SMI will not be invoked when
closing the lid.
In our tests, this field (DIDL) is set in Windows when _DOD is invoked,
while it is not in Linux.
I can work around this bug by setting the DIDL manually in AML code.

So a patch setting the DIDL in i915_opregion.c should be a proper fix
for this problem. Fengguang will cook up a patch later.
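
To make "setting the DIDL" concrete, here is a toy Python model of the idea. It is not the i915 code and not a proposed patch: the offset 0x120, the eight 32-bit slots, the buffer size and the example device IDs are all assumptions for illustration and would need checking against the OpRegion spec. The function names are hypothetical too.

```python
import struct

DIDL_OFFSET = 0x120  # ASSUMPTION: byte offset of DIDL inside the OpRegion
DIDL_SLOTS = 8       # ASSUMPTION: number of 32-bit entries in DIDL
DIDL_FORMAT = f"<{DIDL_SLOTS}I"  # little-endian unsigned 32-bit values

def write_didl(opregion, device_ids):
    """Fill the DIDL slots of a fake OpRegion buffer; unused slots are zeroed."""
    if len(device_ids) > DIDL_SLOTS:
        raise ValueError("too many display devices for DIDL")
    padded = list(device_ids) + [0] * (DIDL_SLOTS - len(device_ids))
    struct.pack_into(DIDL_FORMAT, opregion, DIDL_OFFSET, *padded)

def didl_is_empty(opregion):
    """Model of the check described above: True if no DIDL entry is set."""
    return not any(struct.unpack_from(DIDL_FORMAT, opregion, DIDL_OFFSET))

opregion = bytearray(0x2000)            # zero-filled stand-in buffer
print(didl_is_empty(opregion))          # nothing written yet
write_didl(opregion, [0x0100, 0x0400])  # hypothetical display device IDs
print(didl_is_empty(opregion))
```

In this model the field has to be written before anything checks it, which is the same ordering problem the real driver would face.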

But there is still one mystery left.
Windows doesn't break even if the SMI is invoked...

thanks,
rui



2009-01-15 03:00:05

by Matthew Garrett

Subject: Re: [2.6.28] Kernel panic after closing lid on HP 2510p

On Thu, Jan 15, 2009 at 10:55:03AM +0800, Zhang Rui wrote:

> DIDL is an IGD OpRegion field, the Supported Display Devices ID List.
> It is evaluated by the _DOD method when the ACPI video driver is loaded.
> According to the spec, "The graphics driver writes to this field
> once during its initialization".
> If DIDL is not empty, a flag is set and the SMI will not be invoked when
> closing the lid.
> In our tests, this field (DIDL) is set in Windows when _DOD is invoked,
> while it is not in Linux.
> I can work around this bug by setting the DIDL manually in AML code.

Oh, huh. Yeah, that sounds plausible. I'll give it a go here tomorrow.

--
Matthew Garrett | [email protected]

2009-01-15 04:21:23

by Fengguang Wu

Subject: Re: [2.6.28] Kernel panic after closing lid on HP 2510p

On Thu, Jan 15, 2009 at 04:55:03AM +0200, Zhang, Rui wrote:
> On Thu, 2009-01-15 at 10:21 +0800, Matthew Garrett wrote:
> > On Wed, Jan 14, 2009 at 06:15:42PM -0800, Andrew Morton wrote:
> > > On Thu, 15 Jan 2009 02:03:11 +0000 Matthew Garrett <[email protected]> wrote:
> > > > Lid actions typically trigger SMI code, so it's entirely capable of
> > > > destroying CPU state in such a way that the kernel falls over (and
> > > > probably even in ways that cause the kernel to turn green, emit pleasing
> > > > warbling noises or invade neighbouring pieces of hardware). In this case
> > > > it seems to be SMP specific - the system's entirely stable in UP mode.
> > > > It's greatly vexing.
> > >
> > > Does it always crash in the same way?
> >
> > Not in my experience. Sometimes it falls over in the ACPI parsing code,
> > but sometimes the backtrace is nonsense and the IP is in the middle of
> > nowhere. I've now got a working 2510p again, so maybe I'll have time to
> > look at this on the way to LCA.
> >
> please check if this is a duplicate of bug #11259.
> http://bugzilla.kernel.org/show_bug.cgi?id=11259
>
>
> We (Wu Fengguang and I) reproduced this bug on an HP 6910p,
> and we found that Windows runs a different AML code path when closing the lid
> on this laptop and the SMI is not invoked...
> I have also verified how to make Linux run the same code path by changing
> the IGD OpRegion code.
>
> DIDL is an IGD OpRegion field, the Supported Display Devices ID List.
> It is evaluated by the _DOD method when the ACPI video driver is loaded.
> According to the spec, "The graphics driver writes to this field
> once during its initialization".
> If DIDL is not empty, a flag is set and the SMI will not be invoked when
> closing the lid.
> In our tests, this field (DIDL) is set in Windows when _DOD is invoked,
> while it is not in Linux.
> I can work around this bug by setting the DIDL manually in AML code.
>
> So a patch setting the DIDL in i915_opregion.c should be a proper fix
> for this problem. Fengguang will cook up a patch later.

I found this not easy, given the required sequence of operations:
for this specific bug, the opregion DIDL entry must be set before the
*first* _DOD invocation, which happens when the ACPI video module is loaded.

This means
- the opregion module must be loaded before the ACPI video module
- the opregion DIDL setting code must run at module init time,
  instead of at startx time as currently implemented.
However, how can the opregion module get the right DIDL data
without the help of Xorg and the intel driver? Maybe kernel mode
setting?

So the module dependencies could be:
ACPI video => i915 opregion => kernel mode setting

The latter dependency should be a reasonable one, but does the first
one make sense in general?

Or is it possible to delay the first _DOD invocation in ACPI video?
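
The ordering constraint above can be written down as a small dependency graph. This is only an illustration in Python (3.9+ for `graphlib`): the node names are labels for this sketch, not real module names, and reading "A => B" as "A needs B to be initialised first" is an assumption about the notation.

```python
from graphlib import TopologicalSorter

# Each key maps to the set of things that must be initialised before it.
deps = {
    "acpi_video": {"i915_opregion"},
    "i915_opregion": {"kernel_mode_setting"},
    "kernel_mode_setting": set(),
}

# static_order() yields every node after all of its prerequisites.
load_order = list(TopologicalSorter(deps).static_order())
print(load_order)
```

Under that reading, kernel mode setting would have to come up first and ACPI video last, which is the ordering being questioned here.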

Thanks,
Fengguang

2009-01-15 07:41:57

by Martin Michlmayr

Subject: Re: [2.6.28] Kernel panic after closing lid on HP 2510p

* Zhang Rui <[email protected]> [2009-01-15 10:55]:
> please check if this is a duplicate of bug #11259.
> http://bugzilla.kernel.org/show_bug.cgi?id=11259

It certainly sounds so.

BTW, here is my original bug report which contains an easy recipe to
reproduce the problem: http://lkml.org/lkml/2008/6/16/157
And here's a bug report from Ubuntu that shows that the HP 6710b,
HP 6510b and HP 2510p laptops are affected:
https://bugs.launchpad.net/ubuntu/+source/acpid/+bug/157691

Thanks for looking into this issue.
--
Martin Michlmayr
http://www.cyrius.com/