2011-06-04 20:05:02

by [email protected]

Subject: Random kernel panics when booting 2.6.39.1

Hi,

When booting my machine (CPU and lspci details follow):
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Pentium(R) 4 CPU 2.80GHz
stepping : 9
cpu MHz : 2793.178
cache size : 512 KB

Output of lspci:
00:00.0 Host bridge: Intel Corporation 82865G/PE/P DRAM
Controller/Host-Hub Interface (rev 02)
00:02.0 VGA compatible controller: Intel Corporation 82865G Integrated
Graphics Controller (rev 02)
00:03.0 PCI bridge: Intel Corporation 82865G/PE/P PCI to CSA Bridge (rev 02)
00:06.0 System peripheral: Intel Corporation 82865G/PE/P Processor to
I/O Memory Interface (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB
UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB
UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB
UHCI Controller #3 (rev 02)
00:1d.3 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB
UHCI Controller #4 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB2
EHCI Controller (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev c2)
00:1f.0 ISA bridge: Intel Corporation 82801EB/ER (ICH5/ICH5R) LPC
Interface Bridge (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801EB/ER (ICH5/ICH5R) IDE
Controller (rev 02)
00:1f.2 IDE interface: Intel Corporation 82801EB (ICH5) SATA Controller
(rev 02)
00:1f.3 SMBus: Intel Corporation 82801EB/ER (ICH5/ICH5R) SMBus
Controller (rev 02)
00:1f.5 Multimedia audio controller: Intel Corporation 82801EB/ER
(ICH5/ICH5R) AC'97 Audio Controller (rev 02)
01:01.0 Ethernet controller: Intel Corporation 82547EI Gigabit Ethernet
Controller
02:00.0 Multimedia video controller: Brooktree Corporation Bt878 Video
Capture (rev 11)
02:00.1 Multimedia controller: Brooktree Corporation Bt878 Audio Capture
(rev 11)
02:03.0 Communication controller: NetMos Technology PCI 9835 Multi-I/O
Controller (rev 01)
02:04.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
RTL-8139/8139C/8139C+ (rev 10)
02:05.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
RTL-8139/8139C/8139C+ (rev 10)

with the newest 2.6.39.1 kernel, I experience random kernel panics.
I was able to record the following messages over the serial console
(although the messages following the kernel panic itself were probably
not sent, as interrupts were blocked?).
The machine is stable with the 2.6.37.3 kernel.


Below are logs from four boots of the system:
====================== First boot =================================
udevd[1023]: SYSFS{}= will be removed in a future udev version, please
use ATTR{}= to match the event device, or ATTRS{}= to match a parent
device, in /etc/udev/rules.d/zydas_qemu.rules:3

i801_smbus 0000:00:1f.3: PCI INT B -> GSI 17 (level, low) -> IRQ 17
shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
Floppy drive(s):
BUG: unable to handle kernel paging request at 2c0000a2
IP: [<c05bbcf4>] 0xc05bbcf3
*pde = 00000000
Oops: 0000 [#1] PREEMPT SMP
last sysfs file: /sys/devices/virtual/tty/ttyb9/uevent
Modules linked in: processor(+) 8250_pci floppy(+) 8250_pnp(+) parport
snd shpchp i2c_i801 soundcore pci_hotplug snd_page_alloc unix

Pid: 1074, comm: modprobe Not tainted 2.6.39.1 #1
/D865GBF
EIP: 0060:[<c05bbcf4>] EFLAGS: 00010086 CPU: 1
EIP is at 0xc05bbcf4
EAX: 0000001b EBX: 00000400 ECX: 00000000 EDX: c049db50
ESI: c0134bc5 EDI: c05bfdc0 EBP: 0000000f ESP: f5b19be0
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process modprobe (pid: 1074, ti=f5b18000 task=f5a0ae00 task.ti=f5b18000)
Stack:
c04124bd 00000000 c0135450 f5b19c74 00000000 13a677f7 c025007d 00000000
f6605f80 e13bb98c c0108ac6 00000000 00000293 00000001 f5b19c6cNULL
pointer dereference
invalid opcode: 0000 [#2] PREEMPT SMP
last sysfs file: /sys/devices/virtual/tty/ttyb9/uevent
Modules linked in: processor(+) 8250_pci floppy(+) 8250_pnp(+) parport
snd shpchp i2c_i801 soundcore pci_hotplug snd_page_alloc unix

Pid: 1074, comm: modprobe Not tainted 2.6.39.1 #1
/D865GBF
EIP: 0060:[<f6405f80>] EFLAGS: 00010086 CPU: 1
EIP is at 0xf6405f80
EAX: 0000000c EBX: 00000000 ECX: 00000001 EDX: 00010002
ESI: f5b19ba4 EDI: 0000000f EBP: f5b19c1f ESP: f5b19a7c
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process modprobe (pid: 1074, ti=f5b18000 task=f5a0ae00 task.ti=f5b18000)
Stack:
f8cff8b0 c0108ac6 f6405f80 f6405f80 00000000 f5b19ae4 c0152bd3 1627fbeb
00000002 1627f839 00000002 00000002 8a130af4 00000000 048def99 00000000
00000086 00000000 00000002 c0416d80 f6405f80 00000082 f8d02e30 0000001b
Call Trace:
[<c0108ac6>] ? native_sched_clock+0x26/0x90
[<c0152bd3>] ? sched_clock_local+0xd3/0x1c0
[<c0155e3a>] ? ktime_get+0x6a/0x120
[<c011a243>] ? lapic_next_event+0x13/0x20
[<c015ae87>] ? clockevents_program_event+0x87/0x120
[<c015bf86>] ? tick_dev_program_event+0x46/0x150
[<c015c0a9>] ? tick_program_event+0x19/0x20
[<c0151155>] ? hrtimer_interrupt+0x155/0x280
[<c0104536>] ? do_IRQ+0x46/0xb0
[<c011a7c2>] ? smp_apic_timer_interrupt+0x52/0x90
[<c0413329>] ? common_interrupt+0x29/0x30
[<c041007b>] ? schedule+0x17b/0x8a0
[<c01500d8>] ? hrtimer_forward+0x68/0x2c0
[<c04124bd>] ? _raw_spin_lock+0xd/0x30
[<c0135450>] ? vprintk+0x50/0x3f0
[<c025007d>] ?
[<c025007d>] ?
generic_make_request+0x19d/0x390

======== Second boot =============================

udevd[1023]: segfault at dac81 ip 00005681 sp f82f0724 error 4
BUG: unable to handle kernel paging request at 00010296
IP: [<00010296>] 0x10295
*pde = 00000000
Oops: 0000 [#1] PREEMPT SMP
last sysfs file: /sys/module/intel_agp/initstate
Modules linked in: floppy(+) processor(+) intel_agp(+) pcspkr intel_gtt
pci_hotplug ehci_hcd(+) evdev i2c_i801 8250_pnp usbcore unix

Pid: 1070, comm: modprobe Not tainted 2.6.39.1 #1
/D865GBF
EIP: 0060:[<00010296>] EFLAGS: 00010296 CPU: 1
EIP is at 0x10296
EAX: f5591be1 EBX: c0121a30 ECX: ffffffff EDX: 0000ffeb
ESI: ffffffff EDI: c0266a97 EBP: 00000060 ESP: f5b05d74
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process modprobe (pid: 1070, ti=f5b04000 task=f5966040 task.ti=f5b04000)
Stack:
f6178400 ffffffff ffffff02 f61aa30a 0000fffe c05d0010 f5591bd7 f5591bc0
00000017 ff0a0210 ffffffff f5591bc0 f5b05de4 c04c2db2 00000017 c0269ae5
f5b05df4 f61aa3a0 00000000 f83ec3d6 f61aa398 c02608f2 f61aa380 f83ecfd4
Call Trace:
[<c0269ae5>] ? kvasprintf+0x45/0x60
[<c02608f2>] ? kobject_set_name_vargs+0x32/0x70
[<c0300d74>] ? dev_set_name+0x14/0x20
[<c035e3c1>] ? thermal_cooling_device_register+0xc1/0x220
[<f83ec082>] ? acpi_processor_add+0x3b6/0x442 [processor]
[<c0299fef>] ? acpi_device_probe+0x34/0xe7
[<c0303d25>] ? driver_probe_device+0x65/0x160
[<c029a3bf>] ? acpi_match_device_ids+0x27/0x4d
[<c0303e99>] ? __driver_attach+0x79/0x80
[<c0303e20>] ? driver_probe_device+0x160/0x160
[<c030308b>] ? bus_for_each_dev+0x4b/0x70
[<c0299f2f>] ? acpi_device_hid+0x13/0x13
[<c0303a86>] ? driver_attach+0x16/0x20
[<c0303e20>] ? driver_probe_device+0x160/0x160
[<c03036ec>] ? bus_add_driver+0x9c/0x230
[<c0299f2f>] ? acpi_device_hid+0x13/0x13
[<c030432f>] ? driver_register+0x5f/0x100
[<c040fc67>] ? printk+0x17/0x20
[<f841305a>] ? acpi_processor_init+0x5a/0xb7 [processor]
[<c01011f3>] ? do_one_initcall+0x33/0x170
[<c019da08>] ? __vunmap+0xa8/0xe0
[<f8413000>] ? 0xf8412fff
[<c0163843>] ? sys_init_module+0x123/0x1960
[<c0412e10>] ? sysenter_do_call+0x12/0x26
Code: Bad EIP value.
EIP: [<00010296>] 0x10296 SS:ESP 0068:f5b05d74
CR2: 0000000000010296
---[ end trace a8bea5a64b02f72c ]---
note: modprobe[1070] exited with preempt_count 2

======= Third boot ==============================

pci_hotplug: PCI Hot Plug PCI Core version: 0.5
input: PC Speaker as /devices/platform/pcspkr/input/input3
i801_smbus 0000:00:1f.3: PCI INT B -> GSI 17 (level, low) -> IRQ 17
general protection fault: 0000 [#1] PREEMPT SMP
last sysfs file: /sys/devices/virtual/tty/ptyy3/uevent
Modules linked in: processor(+) i2c_i801 pcspkr evdev pci_hotplug
soundcore snd_page_alloc unix

Pid: 1076, comm: modprobe Not tainted 2.6.39.1 #1
/D865GBF
EIP: 00d8:[<c052007b>] EFLAGS: c0102eb0 CPU: 1
EIP is at 0xc052007b
EAX: 00000002 EBX: c014007b ECX: f5aeda58 EDX: c012230d
ESI: c0412b7e EDI: f5aeda58 EBP: 00000000 ESP: f5aed998
DS: 230d ES: 1a30 FS: 246c GS: 0033 SS: 0068
Process modprobe (pid: 1076, ti=f5aec000 task=f619bbc0 task.ti=f5aec000)
Stack:
ffffffff f5aeda58 00000060 00010082 c01214ee f5aeda60 c01214ee f60fc0a4
00000000 f5aeda58 00000009 c0121a30 8314246b c012176f 00030001 c0121c86
00030001 c0121c86 f59054c8 f592c5c8 00000086 00000002 f5913b80 f5aeda58
Call Trace:
[<c01214ee>] ? no_context+0x1e/0x150
[<c01214ee>] ? no_context+0x1e/0x150
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c012176f>] ? bad_area_nosemaphore+0xf/0x20
[<c0121c86>] ? do_page_fault+0x256/0x3f0
[<c0121c86>] ? do_page_fault+0x256/0x3f0
[<c021bbc5>] ? search_for_position_by_key+0x115/0x2c0
[<c014007b>] ? alloc_uid+0x5b/0x100
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c014007b>] ? alloc_uid+0x5b/0x100
[<c0412b7e>] ? error_code+0x5a/0x60
[<c014007b>] ? alloc_uid+0x5b/0x100
[<c014007b>] ? alloc_uid+0x5b/0x100
[<c01300d8>] ? thread_group_sched_runtime+0x58/0xe0
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c014007b>] ? alloc_uid+0x5b/0x100
[<c012230d>] ? fixup_exception+0x1d/0x90
[<c0165662>] ? search_module_extables+0x12/0xa0
[<c01214ee>] ? no_context+0x1e/0x150
[<c012230d>] ? fixup_exception+0x1d/0x90
[<c01217bb>] ? bad_area+0x3b/0x50
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c0121d50>] ? do_page_fault+0x320/0x3f0
[<c01217bb>] ? bad_area+0x3b/0x50
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c01308f8>] ? get_parent_ip+0x8/0x20
[<c013097f>] ? sub_preempt_count+0x6f/0xa0
[<c0108ac6>] ? native_sched_clock+0x26/0x90
[<c0152bd3>] ? sched_clock_local+0xd3/0x1c0
[<c02a007b>] ? acpi_ds_store_object_to_local+0x6a/0x12f
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c0412b7e>] ? error_code+0x5a/0x60
[<c02a007b>] ? acpi_ds_store_object_to_local+0x6a/0x12f
[<c02ac7c8>] ? acpi_ns_lookup+0xc8/0x2f9
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c014007b>] ? alloc_uid+0x5b/0x100
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c0196571>] ? find_vma+0x31/0x70
[<c0121af3>] ? do_page_fault+0xc3/0x3f0
[<c01217bb>] ? bad_area+0x3b/0x50
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c0121d50>] ? do_page_fault+0x320/0x3f0
[<c041218d>] ? _raw_spin_unlock_irq+0xd/0x30
[<c0128e51>] ? finish_task_switch+0x31/0xb0
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c0412b7e>] ? error_code+0x5a/0x60
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c012007b>] ? hpet_work+0x1b/0x250
[<c04100d8>] ? schedule+0x1d8/0x8a0
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c02ac7c8>] ? acpi_ns_lookup+0xc8/0x2f9
[<c04121be>] ? _raw_spin_unlock_irqrestore+0xe/0x30
[<c0151d11>] ? down_timeout+0x31/0x50
[<c02adee0>] ? acpi_ns_get_node+0x5f/0x83
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c0412b7e>] ? error_code+0x5a/0x60
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c02b5634>] ? acpi_ut_delete_internal_object_list+0x10/0x20
[<c02ad319>] ? acpi_evaluate_object+0x1db/0x1f5
[<f82f9295>] ? acpi_processor_get_power_info+0x4e/0x53f [processor]
[<c0130a17>] ? add_preempt_count+0x67/0xa0
[<c04124bd>] ? _raw_spin_lock+0xd/0x30
[<c013097f>] ? sub_preempt_count+0x6f/0xa0
[<c04121be>] ? _raw_spin_unlock_irqrestore+0xe/0x30
[<c0128cd3>] ? set_cpus_allowed_ptr+0x83/0x150
[<f82f85c7>] ? acpi_processor_get_throttling+0x5d/0x66 [processor]
[<f82f91c8>] ? acpi_processor_get_throttling_info+0x49c/0x4cc [processor]
[<f82fb1db>] ? acpi_processor_power_init+0xcd/0x111 [processor]
[<f82fb071>] ? acpi_processor_add+0x3a5/0x442 [processor]
[<c0299fef>] ? acpi_device_probe+0x34/0xe7
[<c0303d25>] ? driver_probe_device+0x65/0x160
[<c029a3bf>] ? acpi_match_device_ids+0x27/0x4d
[<c0303e99>] ? __driver_attach+0x79/0x80
[<c0303e20>] ? driver_probe_device+0x160/0x160
[<c030308b>] ? bus_for_each_dev+0x4b/0x70
[<c0299f2f>] ? acpi_device_hid+0x13/0x13
[<c0303a86>] ? driver_attach+0x16/0x20
[<c0303e20>] ? driver_probe_device+0x160/0x160
[<c03036ec>] ? bus_add_driver+0x9c/0x230
[<c0299f2f>] ? acpi_device_hid+0x13/0x13
[<c030432f>] ? driver_register+0x5f/0x100
[<c040fc67>] ? printk+0x17/0x20
[<f835405a>] ? acpi_processor_init+0x5a/0xb7 [processor]
[<c01011f3>] ? do_one_initcall+0x33/0x170
[<c019da08>] ? __vunmap+0xa8/0xe0
[<f8354000>] ? 0xf8353fff
[<c0163843>] ? sys_init_module+0x123/0x1960
[<c0412e10>] ? sysenter_do_call+0x12/0x26
Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
EIP: [<c052007b>] 0xc052007b SS:ESP 0068:f5aed998


======= Fourth boot ==============================

BUG: unable to handle kernel paging request at 00010202
IP: [<c0266a19>] vsnprintf+0x49/0x3d0
*pde = 00000000
Oops: 0000 [#2] PREEMPT SMP
last sysfs file: /sys/devices/virtual/tty/ttyad/uevent
Modules linked in: processor(+) intel_gtt parport_pc(+) i2c_i801(+)
ehci_hcd(+) soundcore parport shpchp floppy pci_hotplug 8250_pnp
snd_page_alloc evdev usbcore pcspkr unix

Pid: 1083, comm: modprobe Not tainted 2.6.39.1 #1
/D865GBF
EIP: 0060:[<c0266a19>] EFLAGS: 00010083 CPU: 1
EIP is at vsnprintf+0x49/0x3d0
EAX: b60e1804 EBX: 00000400 ECX: 00010202 EDX: 3fa4023f
ESI: 00010202 EDI: c05bfdc0 EBP: 00000046 ESP: f5b21970
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process modprobe (pid: 1083, ti=f5b20000 task=f6152040 task.ti=f5b20000)
Stack:
c012b212 f5b21a24 00000000 c0121380 c014a0d4 f5b21a44 c012230d c05bfdc0
3fa4023f c01214ee 00000000 00000400 f5b21b0c c05bfdc0 f5b21a44 00000008
c0121a30 00010202 c012176f 00030001 c0121c86 ffffffff 00000000 00000000
Call Trace:
[<c012b212>] ? find_busiest_group+0x132/0xc70
[<c0121380>] ? is_prefetch.clone.22+0x70/0x1c0
[<c014a0d4>] ? search_exception_tables+0x14/0x30
[<c012230d>] ? fixup_exception+0x1d/0x90
[<c01214ee>] ? no_context+0x1e/0x150
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c012176f>] ? bad_area_nosemaphore+0xf/0x20
[<c0121c86>] ? do_page_fault+0x256/0x3f0
[<c01532e0>] ? release_tgcred.clone.7+0x20/0x20
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c0412b7e>] ? error_code+0x5a/0x60
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c04135a3>] ? iret_exc+0x1d7/0x98f
[<c0121392>] ? is_prefetch.clone.22+0x82/0x1c0
[<c01058fe>] ? oops_begin+0xe/0x90
[<c0121522>] ? no_context+0x52/0x150
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c012176f>] ? bad_area_nosemaphore+0xf/0x20
[<c0121c86>] ? do_page_fault+0x256/0x3f0
[<c0412b7e>] ? error_code+0x5a/0x60
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c0102eb0>] ? do_bounds+0x80/0x80
[<c0122313>] ? fixup_exception+0x23/0x90
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c0412b7e>] ? error_code+0x5a/0x60
[<c0122313>] ? fixup_exception+0x23/0x90
[<c01600d8>] ? rt_mutex_adjust_prio_chain+0x98/0x330
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c01214ee>] ? no_context+0x1e/0x150
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c012176f>] ? bad_area_nosemaphore+0xf/0x20
[<c0121c86>] ? do_page_fault+0x256/0x3f0
[<c0181ebf>] ? __alloc_pages_nodemask+0xff/0x6b0
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c0412b7e>] ? error_code+0x5a/0x60
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c04135a3>] ? iret_exc+0x1d7/0x98f
[<c0121651>] ? __bad_area_nosemaphore+0x31/0x140
[<c02b470f>] ? acpi_ut_valid_acpi_name+0x16/0x2d
[<c02acd71>] ? acpi_ns_search_and_enter+0x94/0x13f
[<c01217bb>] ? bad_area+0x3b/0x50
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c0412b7e>] ? error_code+0x5a/0x60
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c0265de1>] ? number.clone.1+0x211/0x350
[<c0125e4a>] ? resched_task+0x3a/0x60
[<c01308f8>] ? get_parent_ip+0x8/0x20
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c0412b7e>] ? error_code+0x5a/0x60
[<c0266a95>] ? vsnprintf+0xc5/0x3d0
[<c0269ae5>] ? kvasprintf+0x45/0x60
[<c02608f2>] ? kobject_set_name_vargs+0x32/0x70
[<c0300d74>] ? dev_set_name+0x14/0x20
[<c035e3c1>] ? thermal_cooling_device_register+0xc1/0x220
[<f8419082>] ? acpi_processor_add+0x3b6/0x442 [processor]
[<c0299fef>] ? acpi_device_probe+0x34/0xe7
[<c0303d25>] ? driver_probe_device+0x65/0x160
[<c029a3bf>] ? acpi_match_device_ids+0x27/0x4d
[<c0303e99>] ? __driver_attach+0x79/0x80
[<c0303e20>] ? driver_probe_device+0x160/0x160
[<c030308b>] ? bus_for_each_dev+0x4b/0x70
[<c0299f2f>] ? acpi_device_hid+0x13/0x13
[<c0303a86>] ? driver_attach+0x16/0x20
[<c0303e20>] ? driver_probe_device+0x160/0x160
[<c03036ec>] ? bus_add_driver+0x9c/0x230
[<c0299f2f>] ? acpi_device_hid+0x13/0x13
[<c030432f>] ? driver_register+0x5f/0x100
[<c040fc67>] ? printk+0x17/0x20
[<f847a05a>] ? acpi_processor_init+0x5a/0xb7 [processor]
[<c01011f3>] ? do_one_initcall+0x33/0x170
[<c019da08>] ? __vunmap+0xa8/0xe0
[<f847a000>] ? 0xf8479fff
[<c0163843>] ? sys_init_module+0x123/0x1960
[<c0412e10>] ? sysenter_do_call+0x12/0x26
Code: 28 00 00 00 00 0f 88 7b 03 00 00 8b 44 24 1c 03 44 24 20 89 44 24
18 73 12 8b 54 24 1c c7 44 24 18 ff ff ff ff f7 d2 89 54 24 20 <0f> b6
06 8b 5c 24 1c 84 c0 74 7c 8d 54 24 24 89 f0 e8 f1 eb ff
EIP: [<c0266a19>] vsnprintf+0x49/0x3d0 SS:ESP 0068:f5b21970
CR2: 0000000000010202
---[ end trace 34c5a12794a4228b ]---
note: modprobe[1083] exited with preempt_count 3
BUG: scheduling while atomic: modprobe/1083/0x10000004
Modules linked in: processor(+) intel_gtt parport_pc(+) i2c_i801(+)
ehci_hcd(+) soundcore parport shpchp floppy pci_hotplug 8250_pnp
snd_page_alloc evdev usbcore pcspkr unix
Pid: 1083, comm: modprobe Tainted: G D 2.6.39.1 #1
Call Trace:
[<c04103f1>] ? schedule+0x4f1/0x8a0
[<c0152bd3>] ? sched_clock_local+0xd3/0x1c0
[<c0155e3a>] ? ktime_get+0x6a/0x120
[<c0130a5f>] ? __cond_resched+0xf/0x20
[<c0410980>] ? _cond_resched+0x20/0x30
[<c01938f0>] ? unmap_vmas+0x530/0x6b0
[<c0198c19>] ? exit_mmap+0xc9/0x180
[<c0132a7e>] ? mmput+0x1e/0xa0
[<c013697c>] ? exit_mm+0xec/0x120
[<c041007b>] ? schedule+0x17b/0x8a0
[<c01381dd>] ? do_exit+0x53d/0x6d0
[<c04121be>] ? _raw_spin_unlock_irqrestore+0xe/0x30
[<c0136388>] ? kmsg_dump+0x58/0xd0
[<c0105c27>] ? oops_end+0x67/0x90
[<c040fc67>] ? printk+0x17/0x20
[<c0121590>] ? no_context+0xc0/0x150
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c012176f>] ? bad_area_nosemaphore+0xf/0x20
[<c0121c86>] ? do_page_fault+0x256/0x3f0
[<c01532e0>] ? release_tgcred.clone.7+0x20/0x20
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c0412b7e>] ? error_code+0x5a/0x60
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c0266a19>] ? vsnprintf+0x49/0x3d0
[<c012b212>] ? find_busiest_group+0x132/0xc70
[<c0121380>] ? is_prefetch.clone.22+0x70/0x1c0
[<c014a0d4>] ? search_exception_tables+0x14/0x30
[<c012230d>] ? fixup_exception+0x1d/0x90
[<c01214ee>] ? no_context+0x1e/0x150
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c012176f>] ? bad_area_nosemaphore+0xf/0x20
[<c0121c86>] ? do_page_fault+0x256/0x3f0
[<c01532e0>] ? release_tgcred.clone.7+0x20/0x20
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c0412b7e>] ? error_code+0x5a/0x60
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c04135a3>] ? iret_exc+0x1d7/0x98f
[<c0121392>] ? is_prefetch.clone.22+0x82/0x1c0
[<c01058fe>] ? oops_begin+0xe/0x90
[<c0121522>] ? no_context+0x52/0x150
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c012176f>] ? bad_area_nosemaphore+0xf/0x20
[<c0121c86>] ? do_page_fault+0x256/0x3f0
[<c0412b7e>] ? error_code+0x5a/0x60
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c0102eb0>] ? do_bounds+0x80/0x80
[<c0122313>] ? fixup_exception+0x23/0x90
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c0412b7e>] ? error_code+0x5a/0x60
[<c0122313>] ? fixup_exception+0x23/0x90
[<c01600d8>] ? rt_mutex_adjust_prio_chain+0x98/0x330
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c01214ee>] ? no_context+0x1e/0x150
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c012176f>] ? bad_area_nosemaphore+0xf/0x20
[<c0121c86>] ? do_page_fault+0x256/0x3f0
[<c0181ebf>] ? __alloc_pages_nodemask+0xff/0x6b0
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c0412b7e>] ? error_code+0x5a/0x60
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c04135a3>] ? iret_exc+0x1d7/0x98f
[<c0121651>] ? __bad_area_nosemaphore+0x31/0x140
[<c02b470f>] ? acpi_ut_valid_acpi_name+0x16/0x2d
[<c02acd71>] ? acpi_ns_search_and_enter+0x94/0x13f
[<c01217bb>] ? bad_area+0x3b/0x50
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c0412b7e>] ? error_code+0x5a/0x60
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c0265de1>] ? number.clone.1+0x211/0x350
[<c0125e4a>] ? resched_task+0x3a/0x60
[<c01308f8>] ? get_parent_ip+0x8/0x20
[<c0121a30>] ? vmalloc_sync_all+0xf0/0xf0
[<c0412b7e>] ? error_code+0x5a/0x60
[<c0266a95>] ? vsnprintf+0xc5/0x3d0
[<c0269ae5>] ? kvasprintf+0x45/0x60
[<c02608f2>] ? kobject_set_name_vargs+0x32/0x70
[<c0300d74>] ? dev_set_name+0x14/0x20
[<c035e3c1>] ? thermal_cooling_device_register+0xc1/0x220
[<f8419082>] ? acpi_processor_add+0x3b6/0x442 [processor]
[<c0299fef>] ? acpi_device_probe+0x34/0xe7
[<c0303d25>] ? driver_probe_device+0x65/0x160
[<c029a3bf>] ? acpi_match_device_ids+0x27/0x4d
[<c0303e99>] ? __driver_attach+0x79/0x80
[<c0303e20>] ? driver_probe_device+0x160/0x160
[<c030308b>] ? bus_for_each_dev+0x4b/0x70
[<c0299f2f>] ? acpi_device_hid+0x13/0x13
[<c0303a86>] ? driver_attach+0x16/0x20
[<c0303e20>] ? driver_probe_device+0x160/0x160
[<c03036ec>] ? bus_add_driver+0x9c/0x230
[<c0299f2f>] ? acpi_device_hid+0x13/0x13
[<c030432f>] ? driver_register+0x5f/0x100
[<c040fc67>] ? printk+0x17/0x20
[<f847a05a>] ? acpi_processor_init+0x5a/0xb7 [processor]
[<c01011f3>] ? do_one_initcall+0x33/0x170
[<c019da08>] ? __vunmap+0xa8/0xe0
[<f847a000>] ? 0xf8479fff
[<c0163843>] ? sys_init_module+0x123/0x1960
[<c0412e10>] ? sysenter_do_call+0x12/0x26
parport0: PC-style at 0x378, irq 7 [PCSPP,TRISTATE]
ehci_hcd 0000:00:1d.7: PCI INT D -> GSI 23 (level, low) -> IRQ 23
ehci_hcd 0000:00:1d.7: setting latency timer to 64
ehci_hcd 0000:00:1d.7: EHCI Host Controller
ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 1
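
Several of the traces above pair a "BUG: unable to handle kernel paging request" line with "Oops: 0000", and the udevd segfault reports "error 4". For page faults on x86, that number is the hardware page-fault error code printed in hex. The following is a minimal sketch of how it can be decoded; the function and table names are made up for illustration, and the decoding does not apply to the "invalid opcode" or "general protection fault" entries, which are not page faults.

```python
# x86 page-fault error code bits: (mask, meaning if set, meaning if clear).
# The names below are illustrative, not kernel identifiers.
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set", None),
    (0x10, "instruction fetch", None),
]


def decode_pf_error(code):
    """Return a list of human-readable properties for a page-fault error code."""
    properties = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            properties.append(if_set)
        elif if_clear is not None:
            properties.append(if_clear)
    return properties


if __name__ == "__main__":
    # "Oops: 0000" from the kernel paging-request reports
    print("Oops: 0000 ->", ", ".join(decode_pf_error(0x0000)))
    # "error 4" from the udevd segfault line
    print("error 4    ->", ", ".join(decode_pf_error(0x4)))
```

Read this way, the kernel oopses are kernel-mode reads of unmapped addresses, which fits the wild pointer values (2c0000a2, 00010296, 00010202) shown in the logs.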


2011-06-04 20:08:00

by [email protected]

Subject: Re: Random kernel panics when booting 2.6.39.1

I am attaching the configuration of the 2.6.39.1 kernel from the
affected machine.


--
Regards,
Wojtek


Attachments:
.config.gz (23.59 kB)

2011-06-04 21:41:24

by [email protected]

Subject: Machine panicking with 2.6.39.1 boots correctly with acpi=off

The same machine boots and works correctly with 2.6.39.1 when I add
the "acpi=off" boot parameter. Is the problem located somewhere in the
ACPI-related code?
--
Regards,
Wojtek

2011-06-04 22:01:49

by [email protected]

Subject: acpi=off switches off also HT support, maybe it is something SMP related?

On 04.06.2011 23:41, [email protected] wrote:
> The same machine boots and works correctly with 2.6.39.1 when I add
> "acpi=off" boot parameter. Is the problem located somewhere in the acpi
> related code?

I've just noticed that acpi=off also switches off HT support.
However, 2.6.39.1 works correctly on my other, dual-core machine.
Could it be something that fails on an HT-enabled CPU but works
correctly on a true SMP machine?
--
Regards,
Wojtek

2011-06-05 09:40:18

by [email protected]

Subject: Kernel panic with 2.6.39.1 - switching off HT in BIOS helps

Hi,
I've just found that the machine on which I experienced the problem
also boots correctly without "acpi=off" when I switch off
HyperThreading (HT) in the BIOS. As the kernel works correctly on a
machine with "true SMP" (Intel Core2 Duo), it seems that the problem
must be related to an HT-specific part of the kernel.

(first report and thread: http://thread.gmane.org/gmane.linux.kernel/1150275
gzipped .config:
http://cache.gmane.org//gmane/linux/kernel/1150276-001.bin )

Full output of /proc/cpuinfo (under the 2.6.37.3 kernel):
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Pentium(R) 4 CPU 2.80GHz
stepping : 9
cpu MHz : 2793.414
cache size : 512 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
apicid : 0
initial apicid : 0
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe pebs bts cid xtpr
bogomips : 5589.61
clflush size : 64
cache_alignment : 128
address sizes : 36 bits physical, 32 bits virtual
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Pentium(R) 4 CPU 2.80GHz
stepping : 9
cpu MHz : 2793.414
cache size : 512 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
apicid : 1
initial apicid : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe pebs bts cid xtpr
bogomips : 5588.73
clflush size : 64
cache_alignment : 128
address sizes : 36 bits physical, 32 bits virtual
power management:
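
In the dump above, both logical processors share "physical id : 0" and report "siblings : 2" but "cpu cores : 1", which is how Hyper-Threading shows up in /proc/cpuinfo. A minimal sketch of checking this programmatically follows; the function names are illustrative, and the embedded sample is an abbreviated copy of the fields shown above rather than a full cpuinfo file.

```python
def parse_cpuinfo(text):
    """Split /proc/cpuinfo-style text into one dict per logical processor."""
    cpus = []
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:  # skip wrapped continuation lines that have no "key :" part
                fields[key.strip()] = value.strip()
        if fields:
            cpus.append(fields)
    return cpus


def hyperthreading_active(cpus):
    """True if any entry reports more logical siblings than physical cores."""
    return any(
        int(cpu["siblings"]) > int(cpu["cpu cores"])
        for cpu in cpus
        if "siblings" in cpu and "cpu cores" in cpu
    )


# Abbreviated sample built from the fields in the dump above.
SAMPLE = """\
processor : 0
physical id : 0
siblings : 2
core id : 0
cpu cores : 1

processor : 1
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
"""

if __name__ == "__main__":
    cpus = parse_cpuinfo(SAMPLE)
    print("logical processors:", len(cpus))
    print("hyper-threading active:", hyperthreading_active(cpus))
```

On a live system the same functions could be fed the contents of /proc/cpuinfo instead of the sample string.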

2011-06-05 20:22:32

by Wojciech Zabołotny

Subject: Kernel panic on HT machine - full logs with debug from a few boots (some successful)

Hi,
I tried to find the source of the problem I am experiencing, supposing
that it may also affect other users of machines with Hyper-Threaded CPUs.

I've recompiled the kernel with debugging support.
If I remember correctly, the machine was booted 5 times. Two of the
boots were successful, and even GDM started.

Attached (gzipped to save bandwidth) are logs from the serial console
(with the NICs' MAC addresses overwritten) and the kernel
configuration.

The crash seems to happen in random places:
1. [ 9.589453] BUG: unable to handle kernel paging request at
0045e22f
[ 9.590022] IP: [<c02e3fb2>] acpi_evaluate_object+0xf1/0x1f2
[ 9.590022] *pde = 00000000
[ 9.590022] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[ 9.590022] last sysfs file:
/sys/devices/platform/serial8250/tty/ttyS0/uevent
[ 9.590022] Modules linked in: snd_page_alloc processor(+) unix

2. [ 15.995196] pci_hotplug: PCI Hot Plug PCI Core version: 0.5
[ 16.064443] agpgart-intel 0000:00:00.0: Intel 865 Chipset
[ 16.133686] BUG: sleeping function called from invalid context
at kernel/mutex.c:278
[ 16.133693] BUG: unable to handle kernel NULL pointer
dereference at (null)
[ 16.133702] IP: [<001d9726>] 0x1d9725
[ 16.133711] *pde = 00000000
[ 16.133716] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[ 16.133724] last sysfs file:
/sys/devices/LNXSYSTM:00/device:00/PNP0A03:00/device:12/PNP0303:00/uevent

As the crash occurs only with HT on and doesn't happen on another
machine with 2 cores, the problem may be associated with incorrect
resource allocation or locking for HT-enabled CPUs...
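
The two crash sites quoted above can be pulled out of a serial-console log mechanically, which makes it easier to compare many boots. The sketch below is only an illustration: the function names are invented, and the kernel-space check assumes the default 3G/1G split (kernel addresses at 0xC0000000 and above) on 32-bit x86, which is not confirmed by the attached configuration.

```python
import re

# Matches e.g. "IP: [<c02e3fb2>] acpi_evaluate_object+0xf1/0x1f2".
# The lookbehind skips the trailing "EIP: [<...>]" summary lines.
IP_RE = re.compile(r"(?<!E)IP: \[<([0-9a-f]+)>\]\s+(\S+)")

# Assumption: default 3G/1G user/kernel split on 32-bit x86.
KERNEL_BASE = 0xC0000000


def crash_sites(log_text):
    """Return (address, symbol) for every oops 'IP:' line in the log text."""
    return [(m.group(1), m.group(2)) for m in IP_RE.finditer(log_text)]


def in_kernel_space(addr_hex):
    """True if the address lies at or above the assumed kernel base."""
    return int(addr_hex, 16) >= KERNEL_BASE


# Lines copied from the excerpts quoted in this thread.
SAMPLE = """\
[ 9.590022] IP: [<c02e3fb2>] acpi_evaluate_object+0xf1/0x1f2
[ 16.133702] IP: [<001d9726>] 0x1d9725
EIP: [<00010296>] 0x10296 SS:ESP 0068:f5b05d74
"""

if __name__ == "__main__":
    for addr, symbol in crash_sites(SAMPLE):
        where = "kernel space" if in_kernel_space(addr) else "outside kernel space"
        print(addr, symbol, "(" + where + ")")
```

An instruction pointer outside kernel space, as in the second excerpt, suggests a jump through a corrupted pointer or return address rather than a fault inside one particular function.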

--
Regards,
Wojtek


Attachments:
panic5.log.gz (54.01 kB)
.config.gz (23.75 kB)

2011-06-11 15:23:29

by Wojciech Zabołotny

Subject: kmemcheck error and panic when booting 2.6.39.1 , however acpi=off allows to boot

Hi,

Today I tried to investigate more thoroughly why one of my machines
doesn't boot with 2.6.39.1.
I performed four reboots with different parameters, recording the
serial console output to files:

crash6.txt - booting with HT in BIOS on, parameters: kmemleak=on
crash7.txt - booting with HT in BIOS on, parameters: slub_debug kmemleak=on
crash8.txt - booting with HT in BIOS off, parameters: slub_debug kmemleak=on
crash9.txt - booting with HT in BIOS on, parameters: slub_debug
kmemleak=on acpi=off

In all cases a problem was detected in kmemcheck:

crash6.txt:
[ 69.689737] ------------[ cut here ]------------
[ 69.690605] WARNING: at arch/x86/mm/kmemcheck/kmemcheck.c:634
kmemcheck_fault+0xa5/0xc0()
[ 69.690605] Hardware name:
[ 69.690605] Modules linked in:
[ 69.690605] Pid: 1, comm: swapper Not tainted 2.6.39.1 #3
[ 69.690605] Call Trace:
[ 69.690605] [<c013de0d>] warn_slowpath_common+0x6d/0xa0
[ 69.690605] [<c0128ce5>] ? kmemcheck_fault+0xa5/0xc0
[ 69.690605] [<c0128ce5>] ? kmemcheck_fault+0xa5/0xc0
[ 69.690605] [<c013de5d>] warn_slowpath_null+0x1d/0x20
[ 69.690605] [<c0128ce5>] kmemcheck_fault+0xa5/0xc0
[ 69.690605] [<c0124480>] do_page_fault+0x270/0x440
[ 69.690605] [<c0128cad>] ? kmemcheck_fault+0x6d/0xc0
[ 69.690605] [<c0128e2e>] ? kmemcheck_pte_lookup+0xe/0x40
[ 69.690605] [<c0124210>] ? vmalloc_sync_all+0x100/0x100
[ 69.690605] [<c04663f3>] error_code+0x5f/0x64
[ 69.690605] [<c012007b>] ? io_apic_set_pci_routing+0x4b/0x60
[ 69.690605] [<c0124210>] ? vmalloc_sync_all+0x100/0x100
[ 69.690605] [<c0111dbf>] ? p4_pmu_handle_irq+0x7f/0x1b0
[ 69.690605] [<c0128e2e>] ? kmemcheck_pte_lookup+0xe/0x40
[ 69.690605] [<c0128469>] ? kmemcheck_save_addr+0x19/0x40
[ 69.690605] [<c012880b>] ? kmemcheck_show_addr+0xb/0x20
[ 69.690605] [<c012896e>] ? kmemcheck_show_all+0x2e/0x40
[ 69.690605] [<c010f098>] perf_event_nmi_handler+0x28/0xa0
[ 69.690605] [<c015f1c5>] notifier_call_chain+0x75/0xe0
[ 69.690605] [<c015f700>] __atomic_notifier_call_chain+0x60/0x90
[ 69.690605] [<c015f6a0>] ? register_reboot_notifier+0x20/0x20
[ 69.690605] [<c015f74a>] atomic_notifier_call_chain+0x1a/0x20
[ 69.690605] [<c015f86d>] notify_die+0x2d/0x30
[ 69.690605] [<c0102d72>] default_do_nmi+0x32/0x280
[ 69.690605] [<c010369f>] do_nmi+0x7f/0x90
[ 69.690605] [<c04664a5>] nmi_stack_correct+0x28/0x2d
[ 69.690605] [<c010378c>] ? do_debug+0xc/0x190
[ 69.690605] [<c0466446>] debug_stack_correct+0x2e/0x34
[ 69.690605] [<c028f6ef>] ? prio_tree_insert+0xdf/0x190
[ 69.690605] [<c01c68e0>] create_object+0x140/0x230
[ 69.690605] [<c04543a7>] kmemleak_alloc+0x27/0x50
[ 69.690605] [<c01c3b79>] kmem_cache_alloc+0xc9/0x110
[ 69.690605] [<c02ac306>] dma_debug_init+0x96/0x140
[ 69.690605] [<c061f75d>] ? start_kernel+0x322/0x322
[ 69.690605] [<c061f75d>] ? start_kernel+0x322/0x322
[ 69.690605] [<c0624d50>] pci_iommu_init+0x13/0x48
[ 69.690605] [<c01011f0>] do_one_initcall+0x30/0x170
[ 69.690605] [<c061f75d>] ? start_kernel+0x322/0x322
[ 69.690605] [<c0624d3d>] ? iommu_setup+0x1fd/0x1fd
[ 69.690605] [<c061f75d>] ? start_kernel+0x322/0x322
[ 69.690605] [<c061f7f8>] kernel_init+0x9b/0x12f
[ 69.690605] [<c0466bba>] kernel_thread_helper+0x6/0xd
[ 69.690605] ---[ end trace 93d72a36b9146f22 ]---

crash7.txt:
[ 69.687496] ------------[ cut here ]------------
[ 69.690015] WARNING: at arch/x86/mm/kmemcheck/kmemcheck.c:634
kmemcheck_fault+0xa5/0xc0()
[ 69.690015] Hardware name:
[ 69.690015] Modules linked in:
[ 69.690015] Pid: 1, comm: swapper Not tainted 2.6.39.1 #3
[ 69.690015] Call Trace:
[ 69.690015] [<c013de0d>] warn_slowpath_common+0x6d/0xa0
[ 69.690015] [<c0128ce5>] ? kmemcheck_fault+0xa5/0xc0
[ 69.690015] [<c0128ce5>] ? kmemcheck_fault+0xa5/0xc0
[ 69.690015] [<c013de5d>] warn_slowpath_null+0x1d/0x20
[ 69.690015] [<c0128ce5>] kmemcheck_fault+0xa5/0xc0
[ 69.690015] [<c0124480>] do_page_fault+0x270/0x440
[ 69.690015] [<c0128e2e>] ? kmemcheck_pte_lookup+0xe/0x40
[ 69.690015] [<c0128469>] ? kmemcheck_save_addr+0x19/0x40
[ 69.690015] [<c0128e2e>] ? kmemcheck_pte_lookup+0xe/0x40
[ 69.690015] [<c0128469>] ? kmemcheck_save_addr+0x19/0x40
[ 69.690015] [<c0128713>] ? kmemcheck_read_strict+0x33/0x80
[ 69.690015] [<c0124210>] ? vmalloc_sync_all+0x100/0x100
[ 69.690015] [<c04663f3>] error_code+0x5f/0x64
[ 69.690015] [<c0124210>] ? vmalloc_sync_all+0x100/0x100
[ 69.690015] [<c0111dbf>] ? p4_pmu_handle_irq+0x7f/0x1b0
[ 69.690015] [<c0128e2e>] ? kmemcheck_pte_lookup+0xe/0x40
[ 69.690015] [<c0128e2e>] ? kmemcheck_pte_lookup+0xe/0x40
[ 69.690015] [<c010f098>] perf_event_nmi_handler+0x28/0xa0
[ 69.690015] [<c015f1c5>] notifier_call_chain+0x75/0xe0
[ 69.690015] [<c015f700>] __atomic_notifier_call_chain+0x60/0x90
[ 69.690015] [<c015f6a0>] ? register_reboot_notifier+0x20/0x20
[ 69.690015] [<c015f74a>] atomic_notifier_call_chain+0x1a/0x20
[ 69.690015] [<c015f86d>] notify_die+0x2d/0x30
[ 69.690015] [<c0102d72>] default_do_nmi+0x32/0x280
[ 69.690015] [<c010369f>] do_nmi+0x7f/0x90
[ 69.690015] [<c04664a5>] nmi_stack_correct+0x28/0x2d
[ 69.690015] [<c012007b>] ? io_apic_set_pci_routing+0x4b/0x60
[ 69.690015] [<c016ce46>] ? trace_hardirqs_off_caller+0xa6/0xf0
[ 69.690015] [<c0295b64>] trace_hardirqs_off_thunk+0xc/0x18
[ 69.690015] [<c0465de6>] ? ret_from_exception+0x6/0x6
[ 69.690015] [<c01c007b>] ? unuse_pte+0xfb/0x120
[ 69.690015] [<c01c67f2>] ? create_object+0x52/0x230
[ 69.690015] [<c04543a7>] kmemleak_alloc+0x27/0x50
[ 69.690015] [<c01c3b79>] kmem_cache_alloc+0xc9/0x110
[ 69.690015] [<c02ac306>] dma_debug_init+0x96/0x140
[ 69.690015] [<c061f75d>] ? start_kernel+0x322/0x322
[ 69.690015] [<c061f75d>] ? start_kernel+0x322/0x322
[ 69.690015] [<c0624d50>] pci_iommu_init+0x13/0x48
[ 69.690015] [<c01011f0>] do_one_initcall+0x30/0x170
[ 69.690015] [<c061f75d>] ? start_kernel+0x322/0x322
[ 69.690015] [<c0624d3d>] ? iommu_setup+0x1fd/0x1fd
[ 69.690015] [<c061f75d>] ? start_kernel+0x322/0x322
[ 69.690015] [<c061f7f8>] kernel_init+0x9b/0x12f
[ 69.690015] [<c0466bba>] kernel_thread_helper+0x6/0xd
[ 69.690015] ---[ end trace 93d72a36b9146f22 ]---

crash8.txt:
[ 69.663411] ------------[ cut here ]------------
[ 69.663411] WARNING: at arch/x86/mm/kmemcheck/kmemcheck.c:634 kmemcheck_fault+0xa5/0xc0()
[ 69.663411] Hardware name:
[ 69.663411] Modules linked in:
[ 69.663411] Pid: 1, comm: swapper Not tainted 2.6.39.1 #3
[ 69.663411] Call Trace:
[ 69.663411] [<c013de0d>] warn_slowpath_common+0x6d/0xa0
[ 69.663411] [<c0128ce5>] ? kmemcheck_fault+0xa5/0xc0
[ 69.663411] [<c0128ce5>] ? kmemcheck_fault+0xa5/0xc0
[ 69.663411] [<c013de5d>] warn_slowpath_null+0x1d/0x20
[ 69.663411] [<c0128ce5>] kmemcheck_fault+0xa5/0xc0
[ 69.663411] [<c0124480>] do_page_fault+0x270/0x440
[ 69.663411] [<c0128e2e>] ? kmemcheck_pte_lookup+0xe/0x40
[ 69.663411] [<c01093f6>] ? native_sched_clock+0x26/0x90
[ 69.663411] [<c015ffc3>] ? sched_clock_local+0xd3/0x1c0
[ 69.663411] [<c0124210>] ? vmalloc_sync_all+0x100/0x100
[ 69.663411] [<c04663f3>] error_code+0x5f/0x64
[ 69.663411] [<c0124210>] ? vmalloc_sync_all+0x100/0x100
[ 69.663411] [<c0111dbf>] ? p4_pmu_handle_irq+0x7f/0x1b0
[ 69.663411] [<c012896e>] ? kmemcheck_show_all+0x2e/0x40
[ 69.663411] [<c0128469>] ? kmemcheck_save_addr+0x19/0x40
[ 69.663411] [<c0128713>] ? kmemcheck_read_strict+0x33/0x80
[ 69.663411] [<c010f098>] perf_event_nmi_handler+0x28/0xa0
[ 69.663411] [<c015f1c5>] notifier_call_chain+0x75/0xe0
[ 69.663411] [<c012896e>] ? kmemcheck_show_all+0x2e/0x40
[ 69.663411] [<c015f700>] __atomic_notifier_call_chain+0x60/0x90
[ 69.663411] [<c015f6a0>] ? register_reboot_notifier+0x20/0x20
[ 69.663411] [<c015f74a>] atomic_notifier_call_chain+0x1a/0x20
[ 69.663411] [<c015f86d>] notify_die+0x2d/0x30
[ 69.663411] [<c0102d72>] default_do_nmi+0x32/0x280
[ 69.663411] [<c010369f>] do_nmi+0x7f/0x90
[ 69.663411] [<c04664a5>] nmi_stack_correct+0x28/0x2d
[ 69.663411] [<c046638c>] ? spurious_interrupt_bug+0xc/0xc
[ 69.663411] [<c028f42c>] ? prio_tree_replace+0x4c/0x60
[ 69.663411] [<c028f739>] prio_tree_insert+0x129/0x190
[ 69.663411] [<c01c68e0>] create_object+0x140/0x230
[ 69.663411] [<c04543a7>] kmemleak_alloc+0x27/0x50
[ 69.663411] [<c01c3b79>] kmem_cache_alloc+0xc9/0x110
[ 69.663411] [<c02ac306>] dma_debug_init+0x96/0x140
[ 69.663411] [<c061f75d>] ? start_kernel+0x322/0x322
[ 69.663411] [<c061f75d>] ? start_kernel+0x322/0x322
[ 69.663411] [<c0624d50>] pci_iommu_init+0x13/0x48
[ 69.663411] [<c01011f0>] do_one_initcall+0x30/0x170
[ 69.663411] [<c061f75d>] ? start_kernel+0x322/0x322
[ 69.663411] [<c0624d3d>] ? iommu_setup+0x1fd/0x1fd
[ 69.663411] [<c061f75d>] ? start_kernel+0x322/0x322
[ 69.663411] [<c061f7f8>] kernel_init+0x9b/0x12f
[ 69.663411] [<c0466bba>] kernel_thread_helper+0x6/0xd
[ 69.663411] ---[ end trace 93d72a36b9146f22 ]---

crash9.txt:
[ 61.373342] ------------[ cut here ]------------
[ 61.373348] WARNING: at arch/x86/mm/kmemcheck/kmemcheck.c:634 kmemcheck_fault+0xa5/0xc0()
[ 61.373348] Hardware name:
[ 61.373348] Modules linked in:
[ 61.373348] Pid: 1, comm: swapper Not tainted 2.6.39.1 #3
[ 61.373348] Call Trace:
[ 61.373348] [<c013de0d>] warn_slowpath_common+0x6d/0xa0
[ 61.373348] [<c0128ce5>] ? kmemcheck_fault+0xa5/0xc0
[ 61.373348] [<c0128ce5>] ? kmemcheck_fault+0xa5/0xc0
[ 61.373348] [<c013de5d>] warn_slowpath_null+0x1d/0x20
[ 61.373348] [<c0128ce5>] kmemcheck_fault+0xa5/0xc0
[ 61.373348] [<c0124480>] do_page_fault+0x270/0x440
[ 61.373348] [<c0128713>] ? kmemcheck_read_strict+0x33/0x80
[ 61.373348] [<c0128469>] ? kmemcheck_save_addr+0x19/0x40
[ 61.373348] [<c0128713>] ? kmemcheck_read_strict+0x33/0x80
[ 61.373348] [<c0128e2e>] ? kmemcheck_pte_lookup+0xe/0x40
[ 61.373348] [<c0124210>] ? vmalloc_sync_all+0x100/0x100
[ 61.373348] [<c04663f3>] error_code+0x5f/0x64
[ 61.373348] [<c0124210>] ? vmalloc_sync_all+0x100/0x100
[ 61.373348] [<c0111dbf>] ? p4_pmu_handle_irq+0x7f/0x1b0
[ 61.373348] [<c0128713>] ? kmemcheck_read_strict+0x33/0x80
[ 61.373348] [<c0128e2e>] ? kmemcheck_pte_lookup+0xe/0x40
[ 61.373348] [<c0128e2e>] ? kmemcheck_pte_lookup+0xe/0x40
[ 61.373348] [<c010f098>] perf_event_nmi_handler+0x28/0xa0
[ 61.373348] [<c015f1c5>] notifier_call_chain+0x75/0xe0
[ 61.373348] [<c015f700>] __atomic_notifier_call_chain+0x60/0x90
[ 61.373348] [<c015f6a0>] ? register_reboot_notifier+0x20/0x20
[ 61.373348] [<c015f74a>] atomic_notifier_call_chain+0x1a/0x20
[ 61.373348] [<c015f86d>] notify_die+0x2d/0x30
[ 61.373348] [<c0102d72>] default_do_nmi+0x32/0x280
[ 61.373348] [<c0128cad>] ? kmemcheck_fault+0x6d/0xc0
[ 61.373348] [<c010369f>] do_nmi+0x7f/0x90
[ 61.373348] [<c04664a5>] nmi_stack_correct+0x28/0x2d
[ 61.373348] [<c012007b>] ? io_apic_set_pci_routing+0x4b/0x60
[ 61.373348] [<c01093d0>] ? time_cpufreq_notifier+0x140/0x140
[ 61.373348] [<c015ffc3>] ? sched_clock_local+0xd3/0x1c0
[ 61.373348] [<c012880b>] ? kmemcheck_show_addr+0xb/0x20
[ 61.373348] [<c012896e>] ? kmemcheck_show_all+0x2e/0x40
[ 61.373348] [<c0128cad>] ? kmemcheck_fault+0x6d/0xc0
[ 61.373348] [<c0160259>] sched_clock_cpu+0xf9/0x190
[ 61.373348] [<c0128e2e>] ? kmemcheck_pte_lookup+0xe/0x40
[ 61.373348] [<c0160309>] local_clock+0x19/0x60
[ 61.373348] [<c016d33d>] lock_release_holdtime+0x2d/0x160
[ 61.373348] [<c0128d1a>] ? kmemcheck_trap+0x1a/0x30
[ 61.373348] [<c01728dc>] lock_release_nested+0x8c/0x110
[ 61.373348] [<c02a60da>] ? debug_object_deactivate+0x8a/0xf0
[ 61.373348] [<c01729ab>] __lock_release+0x4b/0xe0
[ 61.373348] [<c0172a89>] lock_release+0x49/0x70
[ 61.373348] [<c02a60da>] ? debug_object_deactivate+0x8a/0xf0
[ 61.373348] [<c0465cb9>] _raw_spin_unlock_irqrestore+0x19/0x70
[ 61.373348] [<c02a60da>] debug_object_deactivate+0x8a/0xf0
[ 61.373348] [<c015d4ed>] __run_hrtimer.clone.22+0x2d/0x120
[ 61.373348] [<c0465116>] ? _raw_spin_lock+0x66/0x70
[ 61.373348] [<c015dffd>] hrtimer_interrupt+0x17d/0x260
[ 61.373348] [<c012898b>] ? kmemcheck_hide_addr+0xb/0x20
[ 61.373348] [<c011c370>] smp_apic_timer_interrupt+0x50/0x90
[ 61.373348] [<c0295b64>] ? trace_hardirqs_off_thunk+0xc/0x18
[ 61.373348] [<c04661b7>] apic_timer_interrupt+0x2f/0x34
[ 61.373348] [<c029007b>] ? radix_tree_callback+0x4b/0x60
[ 61.373348] [<c0225eda>] ? sysfs_find_dirent+0x2a/0x50
[ 61.373348] [<c0226077>] __sysfs_add_one+0x27/0x90
[ 61.373348] [<c02260f8>] sysfs_add_one+0x18/0xb0
[ 61.373348] [<c022695a>] sysfs_do_create_link+0xea/0x1f0
[ 61.373348] [<c0226a72>] sysfs_create_link+0x12/0x20
[ 61.373348] [<c0341a44>] device_add+0x154/0x350
[ 61.373348] [<c0341c52>] device_register+0x12/0x20
[ 61.373348] [<c0341d01>] device_create_vargs+0xa1/0xc0
[ 61.373348] [<c0341d48>] device_create+0x28/0x30
[ 61.373348] [<c030386f>] tty_register_device+0x7f/0x100
[ 61.373348] [<c0290035>] ? radix_tree_callback+0x5/0x60
[ 61.373348] [<c0465ec8>] ? restore_all+0xf/0xf
[ 61.373348] [<c028da00>] ? kobject_cleanup+0x100/0x110
[ 61.373348] [<c0303d73>] tty_register_driver+0xf3/0x240
[ 61.373348] [<c0640a11>] legacy_pty_init+0x159/0x188
[ 61.373348] [<c061f75d>] ? start_kernel+0x322/0x322
[ 61.373348] [<c0640c6c>] pty_init+0x8/0x11
[ 61.373348] [<c01011f0>] do_one_initcall+0x30/0x170
[ 61.373348] [<c061f75d>] ? start_kernel+0x322/0x322
[ 61.373348] [<c0640c64>] ? unix98_pty_init+0x224/0x224
[ 61.373348] [<c061f75d>] ? start_kernel+0x322/0x322
[ 61.373348] [<c061f7f8>] kernel_init+0x9b/0x12f
[ 61.373348] [<c0466bba>] kernel_thread_helper+0x6/0xd
[ 61.373348] ---[ end trace 93d72a36b9146f22 ]---

All of the above errors look very similar; however, how far the kernel gets
afterwards depends on the boot parameters.
Only with "acpi=off" did the system start completely: I was able to log in
to gdm and later shut it down in the normal way (crash8.txt).
With other parameters the kernel panicked during boot.

I attach the crashes.tar.bz2 file containing the logs (crash?.txt) and the
configuration of my kernel (config). Hardware details of my machine were
already provided in previous messages in this thread.
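
To make "look very similar" easier to check, here is a minimal sketch (not
something I ran on the attached logs) that pulls the reliable frames out of a
pasted call trace and compares two traces from the innermost frame outwards.
The function names reliable_frames and shared_leading_frames are made up for
this example, and the regular expression only assumes the
"[<addr>] symbol+0xoff/0xsize" layout seen in the traces above, where a "?"
marks a frame the unwinder could not confirm.

```python
import re

# One call-trace entry, e.g.
#   "[ 69.690015] [<c0128ce5>] ? kmemcheck_fault+0xa5/0xc0"
# The optional "?" marks a frame the stack unwinder considers unreliable.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<guess>\?\s+)?"
    r"(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def reliable_frames(log_text):
    """Return the symbol names of all frames NOT marked with '?', in order."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match and not match.group("guess"):
            frames.append(match.group("sym"))
    return frames


def shared_leading_frames(frames_a, frames_b):
    """Return the frames two traces have in common, starting at the top."""
    shared = []
    for a, b in zip(frames_a, frames_b):
        if a != b:
            break
        shared.append(a)
    return shared


if __name__ == "__main__":
    # Hypothetical file names; adjust to wherever the logs were unpacked.
    with open("crash7.txt") as f7, open("crash8.txt") as f8:
        print(shared_leading_frames(reliable_frames(f7.read()),
                                    reliable_frames(f8.read())))
```

Comparing only the frames without "?" is a deliberate simplification: the
"?" entries are leftovers found on the stack and differ from run to run even
when the real call path is the same.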
--
Regards,
Wojtek

2011-06-11 15:28:44

by Wojciech Zabołotny

[permalink] [raw]
Subject: Forgotten attachment Re: kmemcheck error and panic when booting 2.6.39.1 , however acpi=off allows to boot

Oops, I forgot to attach the logs and the kernel configuration.
Here they are.
--
Regards,
Wojtek


Attachments:
crashes.tar.bz2 (79.67 kB)

2011-06-11 16:15:37

by Wojciech Zabołotny

[permalink] [raw]
Subject: Re: kmemcheck error and panic when booting 2.6.39.1 , however acpi=off allows to boot

On 11.06.2011 17:23, wzab wrote:
>
> Only with "acpi=off" the system started completely, and I was able to
> log in
> into gdm, and later switch it off in normal way (crash8.txt).
> With other parameters kernel panicked during the boot.
>
Of course, I was mistaken.
The log from the run that booted and shut down correctly is crash9.txt, not
crash8.txt.

However, even in this run there is yet another kmemcheck error:

[ 369.213324] ERROR: kmemcheck: Fatal error
[ 369.217356]
[ 369.218847] Pid: 1125, comm: udevd Tainted: G W 2.6.39.1 #3 /D865GBF
[ 369.229624] EIP: 0060:[<c0295950>] EFLAGS: 00010146 CPU: 0
[ 369.235130] EIP is at strncpy+0x20/0x40
[ 369.238983] EAX: eeff9a00 EBX: eeff9ae8 ECX: 00000005 EDX: eef79254
[ 369.245257] ESI: eef7925a EDI: eeff9af3 EBP: eef7bcec ESP: c083528c
[ 369.251533] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 369.256941] CR0: 8005003b CR2: f5cb98d8 CR3: 2eff3000 CR4: 000006d0
[ 369.263217] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 369.269493] DR6: ffff4ff0 DR7: 00000400
[ 369.273340] [<c0128417>] kmemcheck_error_save_bug+0x57/0x90
[ 369.279035] [<c0128a34>] kmemcheck_show+0x64/0x70
[ 369.283869] [<c0128cad>] kmemcheck_fault+0x6d/0xc0
[ 369.288776] [<c0124480>] do_page_fault+0x270/0x440
[ 369.293702] [<c04663f3>] error_code+0x5f/0x64
[ 369.298182] [<c01c687a>] create_object+0xda/0x230
[ 369.303001] [<c04543a7>] kmemleak_alloc+0x27/0x50
[ 369.307838] [<c01c3b79>] kmem_cache_alloc+0xc9/0x110
[ 369.312924] [<c02a5ed8>] __debug_object_init+0x318/0x330
[ 369.318378] [<c02a5f07>] debug_object_init+0x17/0x20
[ 369.323466] [<c01567c2>] rcuhead_fixup_activate+0x82/0xa0
[ 369.328987] [<c02a6037>] debug_object_activate+0x107/0x120
[ 369.334603] [<c01883a9>] __call_rcu+0x19/0x140
[ 369.339173] [<c01884dd>] call_rcu+0xd/0x10
[ 369.343401] [<c01c52c4>] put_object+0x24/0x40
[ 369.347875] [<c01c554d>] delete_object_full+0x1d/0x30
[ 369.353040] [<c0454280>] kmemleak_free+0x20/0x50
[ 369.357789] [<c01c3457>] kmem_cache_free+0x97/0xc0
[ 369.362694] [<c01b959f>] anon_vma_chain_free+0xf/0x20
[ 369.367877] [<c01ba8b6>] unlink_anon_vmas+0x36/0xa0
[ 369.372871] [<c01b1414>] free_pgtables+0x64/0xc0
[ 369.377620] [<c01b7df5>] exit_mmap+0xf5/0x180
[ 369.382093] [<c013be6c>] mmput+0x4c/0xc0
[ 369.386128] [<c01cf346>] exec_mmap+0x126/0x360
[ 369.390708] [<c01cffd5>] flush_old_exec+0x55/0x90
[ 369.395526] [<c020888c>] load_elf_binary+0x20c/0x9f0
[ 369.400621] [<c01cecd1>] search_binary_handler+0xe1/0x2b0
[ 369.406134] [<c01d08c7>] do_execve+0x1d7/0x250
[ 369.410710] [<c010ab22>] sys_execve+0x32/0x70
[ 369.415189] [<c0466706>] ptregs_execve+0x12/0x18
[ 369.419922] [<ffffffff>] 0xffffffff
[ 369.423556] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (eeff9a70)
[ 369.431562] 5c9affee649affee649affee349bffeeb076ffeec776ffee0101010101010101
[ 369.439186] i i i i i i i i i i i i i i i i u u u u u u u u u u u u u u u u
[ 369.446832] ^
[ 369.451287]
[ 369.452777] Pid: 1125, comm: udevd Tainted: G W 2.6.39.1 #3 /D865GBF
[ 369.463563] EIP: 0060:[<c028f6e9>] EFLAGS: 00010046 CPU: 0
[ 369.469061] EIP is at prio_tree_insert+0xd9/0x190
[ 369.473783] EAX: eeff7698 EBX: 00000000 ECX: eeff9a64 EDX: eeff9a64
[ 369.480057] ESI: eeff76af EDI: c0d7b1fc EBP: eef7bcec ESP: c08353b0
[ 369.486333] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 369.491743] CR0: 8005003b CR2: f5cb98d8 CR3: 2eff3000 CR4: 000006d0
[ 369.498019] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 369.504295] DR6: ffff4ff0 DR7: 00000400
[ 369.508141] [<c01c68e0>] create_object+0x140/0x230
[ 369.513046] [<c04543a7>] kmemleak_alloc+0x27/0x50
[ 369.517885] [<c01c3b79>] kmem_cache_alloc+0xc9/0x110
[ 369.522963] [<c02a5ed8>] __debug_object_init+0x318/0x330
[ 369.528408] [<c02a5f07>] debug_object_init+0x17/0x20
[ 369.533487] [<c01567c2>] rcuhead_fixup_activate+0x82/0xa0
[ 369.539000] [<c02a6037>] debug_object_activate+0x107/0x120
[ 369.544617] [<c01883a9>] __call_rcu+0x19/0x140
[ 369.549177] [<c01884dd>] call_rcu+0xd/0x10
[ 369.553408] [<c01c52c4>] put_object+0x24/0x40
[ 369.557879] [<c01c554d>] delete_object_full+0x1d/0x30
[ 369.563047] [<c0454280>] kmemleak_free+0x20/0x50
[ 369.567796] [<c01c3457>] kmem_cache_free+0x97/0xc0
[ 369.572709] [<c01b959f>] anon_vma_chain_free+0xf/0x20
[ 369.577893] [<c01ba8b6>] unlink_anon_vmas+0x36/0xa0
[ 369.582884] [<c01b1414>] free_pgtables+0x64/0xc0
[ 369.587633] [<c01b7df5>] exit_mmap+0xf5/0x180
[ 369.592104] [<c013be6c>] mmput+0x4c/0xc0
[ 369.596141] [<c01cf346>] exec_mmap+0x126/0x360
[ 369.600719] [<c01cffd5>] flush_old_exec+0x55/0x90
[ 369.605537] [<c020888c>] load_elf_binary+0x20c/0x9f0
[ 369.610634] [<c01cecd1>] search_binary_handler+0xe1/0x2b0
[ 369.616144] [<c01d08c7>] do_execve+0x1d7/0x250
[ 369.620720] [<c010ab22>] sys_execve+0x32/0x70
[ 369.625202] [<c0466706>] ptregs_execve+0x12/0x18
[ 369.629933] [<ffffffff>] 0xffffffff
done.
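
For reading the kmemcheck report above: the hex line is a dump of the bytes
around the faulting address, and the line below it is the shadow state of
each byte ("i" means initialized, "u" means uninitialized). The sketch below
pairs the two up as 32-bit little-endian words. It is only an illustration:
annotate_shadow is a name made up for this example, and it assumes that the
dump covers a 32-byte window aligned down from the reported address, which
fits this report (32 bytes dumped, first "u" at offset 16) but is not stated
in the log itself.

```python
def annotate_shadow(hex_dump, shadow, fault_addr, window=32):
    """Pair each 32-bit word of a kmemcheck dump with its shadow flags.

    hex_dump   -- the hex string from the report (two hex digits per byte)
    shadow     -- the flag line, one letter per byte, separated by spaces
    fault_addr -- the address printed in the "Caught ... read" line
    window     -- ASSUMPTION: size and alignment of the dumped region
    """
    data = bytes.fromhex(hex_dump)
    flags = shadow.split()
    if len(data) != len(flags):
        raise ValueError("dump and shadow line describe different lengths")

    base = fault_addr & ~(window - 1)  # assumed start address of the dump
    rows = []
    for off in range(0, len(data), 4):
        # x86 is little-endian, so the first byte is the lowest-order one.
        word = int.from_bytes(data[off:off + 4], "little")
        rows.append((base + off, word, "".join(flags[off:off + 4])))
    return rows


if __name__ == "__main__":
    dump = "5c9affee649affee649affee349bffeeb076ffeec776ffee0101010101010101"
    shadow = "i i i i i i i i i i i i i i i i u u u u u u u u u u u u u u u u"
    for addr, word, state in annotate_shadow(dump, shadow, 0xEEFF9A70):
        print(f"{addr:08x}: {word:08x} {state}")
```

Under that alignment assumption the reported address eeff9a70 is the first
word marked "uuuu", which matches the "32-bit read from uninitialized
memory" message.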