2008-02-28 20:55:07

by Jonathan McDowell

Subject: 2.6.25 regression/oops on boot (ACPI related?)

I'm getting a "general protection fault" when trying to boot 2.6.25-rc3
on my AMD64 box; 2.6.24 boots fine. The machine ends up sitting idle at
the end of boot, but still responds to ctrl-alt-del and shuts down
cleanly. The GPF is as follows:

-----
general protection fault: 0000 [1] PREEMPT SMP
CPU 1
Modules linked in: thermal(+) processor fan
Pid: 598, comm: modprobe Not tainted 2.6.25-rc3 #1
RIP: 0010:[<ffffffff803590a8>] [<ffffffff803590a8>] acpi_ns_map_handle_to_node+0x19/0x23
RSP: 0018:ffff81011de5fc68 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000001001 RCX: 0000000000000000
RDX: 0000000000005067 RSI: 0000000000000001 RDI: 4d52454854584e4c
RBP: ffff81011de5fc68 R08: 0000000000000000 R09: ffff81011de5fc78
R10: ffff81011dcc0648 R11: ffffffff802d566a R12: 4d52454854584e4c
R13: ffff81011de5fcf8 R14: ffffffff80362bd3 R15: 0000000000000003
FS: 00007f04840336e0(0000) GS:ffff81011fab1bc0(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f0484032000 CR3: 000000011defc000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process modprobe (pid: 598, threadinfo ffff81011de5e000, task ffff81011de38640)
Stack: ffff81011de5fc98 ffffffff803584ab ffff81011de5fcf8 ffff81011df1a800
0000000000000000 ffff81011de5fcf8 ffff81011de5fcb8 ffffffff80362382
ffff81011de52230 0000000000000001 ffff81011de5fd28 ffffffff880113a9
Call Trace:
[<ffffffff803584ab>] acpi_get_data+0x3f/0x70
[<ffffffff80362382>] acpi_bus_get_device+0x25/0x39
[<ffffffff880113a9>] :thermal:acpi_thermal_cooling_device_cb+0x6b/0x166
[<ffffffff80405ee8>] ? thermal_zone_bind_cooling_device+0x0/0x26e
[<ffffffff880114c6>] :thermal:acpi_thermal_bind_cooling_device+0x10/0x12
[<ffffffff80405e68>] thermal_zone_device_register+0x252/0x2d2
[<ffffffff88011626>] :thermal:acpi_thermal_add+0x15e/0x42b
[<ffffffff80364138>] acpi_device_probe+0x45/0x92
[<ffffffff803aac23>] driver_probe_device+0xc0/0x147
[<ffffffff803aad55>] ? __driver_attach+0x0/0x94
[<ffffffff803aadb0>] __driver_attach+0x5b/0x94
[<ffffffff803a9f55>] bus_for_each_dev+0x4f/0x7f
[<ffffffff803aaa6e>] driver_attach+0x1c/0x1e
[<ffffffff803aa81e>] bus_add_driver+0xb7/0x201
[<ffffffff803aafdb>] driver_register+0x5e/0xd6
[<ffffffff803644ce>] acpi_bus_register_driver+0x3e/0x40
[<ffffffff88017061>] :thermal:acpi_thermal_init+0x61/0x83
[<ffffffff80257120>] sys_init_module+0x98/0x16b
[<ffffffff8020c0eb>] system_call_after_swapgs+0x7b/0x80


Code: 83 c6 04 41 ff c8 45 85 c0 75 a7 c6 06 00 31 c0 c9 c3 48 8d 47 ff 55 48 83 f8 fd 48 89 e5 76 09 48 8b 05 cc 49 3f 00 eb 0a 31 c0 <80> 7f 08 0f 48 0f 44 c7 c9 c3 55 48 89 f8 48 89 e5 c9 c3 55 31
RIP [<ffffffff803590a8>] acpi_ns_map_handle_to_node+0x19/0x23
RSP <ffff81011de5fc68>
---[ end trace c027ad8802a9e766 ]---
Segmentation fault
-----
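
A note on the register dump above: RDI and R12 both hold
4d52454854584e4c, the value the faulting instruction dereferences. The
following is a minimal sketch, not kernel code, that decodes that value
as little-endian ASCII and checks whether it is a canonical address. The
helper names reg_to_ascii and is_canonical_48 are made up for this
illustration, and the check assumes 48-bit virtual addresses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decode a 64-bit register value as eight little-endian bytes,
 * replacing anything that is not printable ASCII with '.'. */
static void reg_to_ascii(uint64_t reg, char out[9])
{
    for (int i = 0; i < 8; i++) {
        unsigned char c = (unsigned char)(reg >> (8 * i));
        out[i] = (c >= 0x20 && c < 0x7f) ? (char)c : '.';
    }
    out[8] = '\0';
}

/* Assumption: 48-bit virtual addresses. An address is canonical when
 * bits 63..47 are all zeros or all ones. */
static bool is_canonical_48(uint64_t addr)
{
    uint64_t top = addr >> 47; /* the upper 17 bits */
    return top == 0 || top == 0x1ffff;
}
```

Calling reg_to_ascii(0x4d52454854584e4cULL, buf) yields "LNXTHERM", and
is_canonical_48() returns false for the same value. So the "handle" looks
like string data rather than a pointer, which would be consistent with
something having overwritten it, and a non-canonical address explains why
this shows up as a general protection fault rather than a page fault.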

Full boot log is at:

http://the.earth.li/~noodles/2.6.25-breakage/meepok.console-2.6.25-rc3.log

Config at:

http://the.earth.li/~noodles/2.6.25-breakage/config-2.6.25-rc3

2.6.25-rc1 and 2.6.25-rc2-git6 both show the same issue.

J.

--
jid: [email protected]
In God we trust; all else we walk through.


2008-02-29 02:08:51

by Zhang, Rui

Subject: Re: 2.6.25 regression/oops on boot (ACPI related?)

Hi, Jonathan,

Please attach the acpidump output using the latest pmtools at:
http://www.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/
Please attach the result of "cat /proc/acpi/thermal_zone/*/*" as well.

thanks,
rui

On Fri, 2008-02-29 at 04:08 +0800, Jonathan McDowell wrote:
> I'm getting a "general protection fault" when trying to boot 2.6.25-rc3
> on my AMD64 box; 2.6.24 boots fine. The machine just seems to end up
> sitting there at the end, but still responds to a ctrl-alt-del to
> cleanly shutdown. The GPF is as follows:
>
> -----
> general protection fault: 0000 [1] PREEMPT SMP
> CPU 1
> Modules linked in: thermal(+) processor fan
> Pid: 598, comm: modprobe Not tainted 2.6.25-rc3 #1
> RIP: 0010:[<ffffffff803590a8>] [<ffffffff803590a8>] acpi_ns_map_handle_to_node+0x19/0x23
> RSP: 0018:ffff81011de5fc68 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: 0000000000001001 RCX: 0000000000000000
> RDX: 0000000000005067 RSI: 0000000000000001 RDI: 4d52454854584e4c
> RBP: ffff81011de5fc68 R08: 0000000000000000 R09: ffff81011de5fc78
> R10: ffff81011dcc0648 R11: ffffffff802d566a R12: 4d52454854584e4c
> R13: ffff81011de5fcf8 R14: ffffffff80362bd3 R15: 0000000000000003
> FS: 00007f04840336e0(0000) GS:ffff81011fab1bc0(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00007f0484032000 CR3: 000000011defc000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process modprobe (pid: 598, threadinfo ffff81011de5e000, task ffff81011de38640)
> Stack: ffff81011de5fc98 ffffffff803584ab ffff81011de5fcf8 ffff81011df1a800
> 0000000000000000 ffff81011de5fcf8 ffff81011de5fcb8 ffffffff80362382
> ffff81011de52230 0000000000000001 ffff81011de5fd28 ffffffff880113a9
> Call Trace:
> [<ffffffff803584ab>] acpi_get_data+0x3f/0x70
> [<ffffffff80362382>] acpi_bus_get_device+0x25/0x39
> [<ffffffff880113a9>] :thermal:acpi_thermal_cooling_device_cb+0x6b/0x166
> [<ffffffff80405ee8>] ? thermal_zone_bind_cooling_device+0x0/0x26e
> [<ffffffff880114c6>] :thermal:acpi_thermal_bind_cooling_device+0x10/0x12
> [<ffffffff80405e68>] thermal_zone_device_register+0x252/0x2d2
> [<ffffffff88011626>] :thermal:acpi_thermal_add+0x15e/0x42b
> [<ffffffff80364138>] acpi_device_probe+0x45/0x92
> [<ffffffff803aac23>] driver_probe_device+0xc0/0x147
> [<ffffffff803aad55>] ? __driver_attach+0x0/0x94
> [<ffffffff803aadb0>] __driver_attach+0x5b/0x94
> [<ffffffff803a9f55>] bus_for_each_dev+0x4f/0x7f
> [<ffffffff803aaa6e>] driver_attach+0x1c/0x1e
> [<ffffffff803aa81e>] bus_add_driver+0xb7/0x201
> [<ffffffff803aafdb>] driver_register+0x5e/0xd6
> [<ffffffff803644ce>] acpi_bus_register_driver+0x3e/0x40
> [<ffffffff88017061>] :thermal:acpi_thermal_init+0x61/0x83
> [<ffffffff80257120>] sys_init_module+0x98/0x16b
> [<ffffffff8020c0eb>] system_call_after_swapgs+0x7b/0x80
>
>
> Code: 83 c6 04 41 ff c8 45 85 c0 75 a7 c6 06 00 31 c0 c9 c3 48 8d 47 ff 55 48 83 f8 fd 48 89 e5 76 09 48 8b 05 cc 49 3f 00 eb 0a 31 c0 <80> 7f 08 0f 48 0f 44 c7 c9 c3 55 48 89 f8 48 89 e5 c9 c3 55 31
> RIP [<ffffffff803590a8>] acpi_ns_map_handle_to_node+0x19/0x23
> RSP <ffff81011de5fc68>
> ---[ end trace c027ad8802a9e766 ]---
> Segmentation fault
> -----
>
> Full boot log is at:
>
> http://the.earth.li/~noodles/2.6.25-breakage/meepok.console-2.6.25-rc3.log
>
> Config at:
>
> http://the.earth.li/~noodles/2.6.25-breakage/config-2.6.25-rc3
>
> 2.6.25-rc1 and 2.6.25-rc2-git6 both show the same issue.
>
> J.
>
> --
> jid: [email protected]
> In God we trust; all else we walk through.

2008-02-29 07:55:14

by Jonathan McDowell

Subject: Re: 2.6.25 regression/oops on boot (ACPI related?)

On Fri, Feb 29, 2008 at 01:20:27AM +0800, Zhang, Rui wrote:
> Please attach the acpidump output using the latest pmtools at :
> http://www.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/
> Please attach the result of "cat /proc/acpi/thermal_zone/*/*" as well.

I've attached the output of acpidump. The cat results in this output:

[noodles@meepok /proc/acpi/thermal_zone/THRM]$ cat *
0 - Active; 1 - Passive
<polling disabled>
state: ok
temperature: 40 C
Segmentation fault

It also causes a general protection fault, which I've attached as well.

This is a stock Debian kernel:

Linux meepok 2.6.24-1-amd64 #1 SMP Mon Feb 11 13:47:43 UTC 2008 x86_64 GNU/Linux

I have a patch from Ming Lin to try out but it'll have to wait until
tomorrow before I can do so.

J.

--
Are you out of my mind?
This .sig brought to you by the letter R and the number 6
Product of the Republic of HuggieTag


Attachments:
(No filename) (925.00 B)
acpidump (101.86 kB)
gpf (2.41 kB)

2008-02-29 08:27:23

by Zhang, Rui

Subject: Re: 2.6.25 regression/oops on boot (ACPI related?)


On Fri, 2008-02-29 at 15:54 +0800, Jonathan McDowell wrote:
> On Fri, Feb 29, 2008 at 01:20:27AM +0800, Zhang, Rui wrote:
> > Please attach the acpidump output using the latest pmtools at :
> > http://www.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/
> > Please attach the result of "cat /proc/acpi/thermal_zone/*/*" as well.
>
> I've attached the output of acpidump. The cat results in this output:
>
> [noodles@meepok /proc/acpi/thermal_zone/THRM]$ cat *
> 0 - Active; 1 - Passive
> <polling disabled>
> state: ok
> temperature: 40 C
> Segmentation fault
>
> It also causes a general protection fault, which I've attached as
> well.
>
> This is a stock Debian kernel:
>
> Linux meepok 2.6.24-1-amd64 #1 SMP Mon Feb 11 13:47:43 UTC 2008 x86_64
> GNU/Linux
>
> I have a patch from Ming Lin to try out but it'll have to wait until
> tomorrow before I can do so.
>
We've found the root cause of the problem, and Lin Ming's patch should
work for you. Please give it a try. :)

From: Lin Ming <[email protected]>

Fix a buffer overflow when copying a NULL internal package
element object to an external object.

Signed-off-by: Lin Ming <[email protected]>
Signed-off-by: Zhang Rui <[email protected]>
---
drivers/acpi/utilities/utobject.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6/drivers/acpi/utilities/utobject.c
===================================================================
--- linux-2.6.orig/drivers/acpi/utilities/utobject.c
+++ linux-2.6/drivers/acpi/utilities/utobject.c
@@ -432,7 +432,7 @@ acpi_ut_get_simple_object_size(union acp
 	 * element -- which is legal)
 	 */
 	if (!internal_object) {
-		*obj_length = 0;
+		*obj_length = sizeof(union acpi_object);
 		return_ACPI_STATUS(AE_OK);
 	}
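
To illustrate why a length of 0 for a NULL element can overflow the
output buffer, here is a simplified userspace model. It is not the ACPICA
code. It assumes the export runs in two passes (a size pass that sums
per-element lengths to size the buffer, then a copy pass that fills one
slot per element), and struct ext_obj, size_pass and copy_pass_bytes are
hypothetical names standing in for the real types and functions.

```c
#include <stddef.h>

/* Hypothetical stand-in for union acpi_object; the real union differs. */
struct ext_obj {
    int type;
    long value;
};

/*
 * Size pass (model): bytes charged for `count` package elements.
 * elems[i] == NULL models an uninitialized, but legal, package element.
 * null_len is what the size pass charges for such an element:
 * 0 before the patch, sizeof(struct ext_obj) after it.
 */
static size_t size_pass(const struct ext_obj *const elems[], size_t count,
                        size_t null_len)
{
    size_t total = 0;

    for (size_t i = 0; i < count; i++)
        total += elems[i] ? sizeof(struct ext_obj) : null_len;
    return total;
}

/* Copy pass (model): every element, NULL or not, fills one output slot. */
static size_t copy_pass_bytes(size_t count)
{
    return count * sizeof(struct ext_obj);
}
```

With one NULL element among three, size_pass(elems, 3, 0) comes out one
slot smaller than copy_pass_bytes(3), so a buffer sized from the first
pass is overrun by the second. Charging sizeof(struct ext_obj) for the
NULL element makes the two agree, which is the effect of the one-line
change above under these assumptions.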


2008-03-01 15:27:20

by Jonathan McDowell

Subject: Re: 2.6.25 regression/oops on boot (ACPI related?)

On Fri, Feb 29, 2008 at 07:38:54AM +0800, Zhang, Rui wrote:
> On Fri, 2008-02-29 at 15:54 +0800, Jonathan McDowell wrote:
> > On Fri, Feb 29, 2008 at 01:20:27AM +0800, Zhang, Rui wrote:
> > > Please attach the acpidump output using the latest pmtools at :
> > > http://www.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/
> > > Please attach the result of "cat /proc/acpi/thermal_zone/*/*" as well.
> >
> > I've attached the output of acpidump. The cat results in this output:
> >
> > [noodles@meepok /proc/acpi/thermal_zone/THRM]$ cat *
> > 0 - Active; 1 - Passive
> > <polling disabled>
> > state: ok
> > temperature: 40 C
> > Segmentation fault
> >
> > It also causes a general protection fault, which I've attached as
> > well.
> >
> > This is a stock Debian kernel:
> >
> > Linux meepok 2.6.24-1-amd64 #1 SMP Mon Feb 11 13:47:43 UTC 2008 x86_64
> > GNU/Linux
> >
> > I have a patch from Ming Lin to try out but it'll have to wait until
> > tomorrow before I can do so.
> >
> We've root caused the problem and Lin Ming's patch should work for you.
> Please give it a try. :)

I've now done so; it fixes my problem and 2.6.25-rc3 boots fine with it
applied. Thanks.

J.

--
Programmer, | 101 things you can't have too | Tel/SMS:
sysadmin & | much of : 51 - News. | +423-663-212343
BHMF. | | Made by HuggieTag

2008-03-10 14:37:36

by Jonathan McDowell

Subject: Re: 2.6.25 regression/oops on boot (ACPI related?)

On Sat, Mar 01, 2008 at 03:27:06PM +0000, Jonathan McDowell wrote:
> On Fri, Feb 29, 2008 at 07:38:54AM +0800, Zhang, Rui wrote:
> > On Fri, 2008-02-29 at 15:54 +0800, Jonathan McDowell wrote:
> > > On Fri, Feb 29, 2008 at 01:20:27AM +0800, Zhang, Rui wrote:
> > > > Please attach the acpidump output using the latest pmtools at :
> > > > http://www.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/
> > > > Please attach the result of "cat /proc/acpi/thermal_zone/*/*" as well.
> > >
> > > I've attached the output of acpidump. The cat results in this output:
> > >
> > > [noodles@meepok /proc/acpi/thermal_zone/THRM]$ cat *
> > > 0 - Active; 1 - Passive
> > > <polling disabled>
> > > state: ok
> > > temperature: 40 C
> > > Segmentation fault
> > >
> > > It also causes a general protection fault, which I've attached as
> > > well.
> > >
> > > This is a stock Debian kernel:
> > >
> > > Linux meepok 2.6.24-1-amd64 #1 SMP Mon Feb 11 13:47:43 UTC 2008 x86_64
> > > GNU/Linux
> > >
> > > I have a patch from Ming Lin to try out but it'll have to wait until
> > > tomorrow before I can do so.
> > >
> > We've root caused the problem and Lin Ming's patch should work for you.
> > Please give it a try. :)
>
> I've now done so; it fixes my problem and 2.6.25-rc3 boots fine with it
> applied. Thanks.

Is there a reason this still hasn't made it into 2.6.25-rc5?

J.

--
Covered in paint and high as a kite.
This .sig brought to you by the letter I and the number 13
Product of the Republic of HuggieTag

2008-03-10 15:03:05

by Rafael J. Wysocki

Subject: Re: 2.6.25 regression/oops on boot (ACPI related?)

On Monday, 10 of March 2008, Jonathan McDowell wrote:
> On Sat, Mar 01, 2008 at 03:27:06PM +0000, Jonathan McDowell wrote:
> > On Fri, Feb 29, 2008 at 07:38:54AM +0800, Zhang, Rui wrote:
> > > On Fri, 2008-02-29 at 15:54 +0800, Jonathan McDowell wrote:
> > > > On Fri, Feb 29, 2008 at 01:20:27AM +0800, Zhang, Rui wrote:
> > > > > Please attach the acpidump output using the latest pmtools at :
> > > > > http://www.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/
> > > > > Please attach the result of "cat /proc/acpi/thermal_zone/*/*" as well.
> > > >
> > > > I've attached the output of acpidump. The cat results in this output:
> > > >
> > > > [noodles@meepok /proc/acpi/thermal_zone/THRM]$ cat *
> > > > 0 - Active; 1 - Passive
> > > > <polling disabled>
> > > > state: ok
> > > > temperature: 40 C
> > > > Segmentation fault
> > > >
> > > > It also causes a general protection fault, which I've attached as
> > > > well.
> > > >
> > > > This is a stock Debian kernel:
> > > >
> > > > Linux meepok 2.6.24-1-amd64 #1 SMP Mon Feb 11 13:47:43 UTC 2008 x86_64
> > > > GNU/Linux
> > > >
> > > > I have a patch from Ming Lin to try out but it'll have to wait until
> > > > tomorrow before I can do so.
> > > >
> > > We've root caused the problem and Lin Ming's patch should work for you.
> > > Please give it a try. :)
> >
> > I've now done so; it fixes my problem and 2.6.25-rc3 boots fine with it
> > applied. Thanks.
>
> Is there a reason this still hasn't made it into 2.6.25-rc5?

Well, I guess Len was unaware of it (CCed now).

Thanks,
Rafael