2006-10-12 21:50:38

by Steven Truong

Subject: kdump/kexec/crash on vmcore file

Hi, all. This is my first attempt to troubleshoot a kernel panic, so I
am quite new to this area. I was able to obtain a kdump when my box
had a kernel panic.

I set up kdump and kexec, and configured the capture (crash) kernel to
boot into runlevel 1 so that I could copy the /proc/vmcore file to disk
for later analysis. However, after the server booted back into runlevel 3
and I used the crash command to analyze the vmcore file, I got these
error messages:

./crash /boot/vmlinux ../vmcore.test


crash: read error: kernel virtual address: ffffffff8123d1e0 type:
"kernel_config_data"
WARNING: cannot read kernel_config_data
crash: read error: kernel virtual address: ffffffff813b5180 type: "xtime"

I installed the latest crash software, 4.0-3.6. I googled left and
right, but nothing useful has turned up.

Thank you for any help and hints.

My box is a dual Xeon 3.2 GHz with 4 GB of PC2700 DDR2, and it runs
CentOS 4.3 with this kernel:

Linux node30 2.6.18 #1 SMP Mon Oct 9 14:42:15 PDT 2006 x86_64 x86_64
x86_64 GNU/Linux

lspci
00:00.0 Host bridge: Intel Corporation E7320 Memory Controller Hub (rev 0c)
00:02.0 PCI bridge: Intel Corporation E7525/E7520/E7320 PCI Express
Port A (rev 0c)
00:03.0 PCI bridge: Intel Corporation E7525/E7520/E7320 PCI Express
Port A1 (rev 0c)
00:1c.0 PCI bridge: Intel Corporation 6300ESB 64-bit PCI-X Bridge (rev 02)
00:1d.0 USB Controller: Intel Corporation 6300ESB USB Universal Host
Controller (rev 02)
00:1d.1 USB Controller: Intel Corporation 6300ESB USB Universal Host
Controller (rev 02)
00:1d.4 System peripheral: Intel Corporation 6300ESB Watchdog Timer (rev 02)
00:1d.5 PIC: Intel Corporation 6300ESB I/O Advanced Programmable
Interrupt Controller (rev 02)
00:1d.7 USB Controller: Intel Corporation 6300ESB USB2 Enhanced Host
Controller (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 0a)
00:1f.0 ISA bridge: Intel Corporation 6300ESB LPC Interface Controller (rev 02)
00:1f.2 IDE interface: Intel Corporation 6300ESB SATA Storage
Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 6300ESB SMBus Controller (rev 02)
01:00.0 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI
Bridge A (rev 09)
01:00.2 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI
Bridge B (rev 09)
05:01.0 Ethernet controller: Intel Corporation 82541GI/PI Gigabit
Ethernet Controller (rev 05)
05:02.0 Ethernet controller: Intel Corporation 82541GI/PI Gigabit
Ethernet Controller (rev 05)
06:02.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)


cat /proc/meminfo
MemTotal: 4042508 kB
MemFree: 3533972 kB
Buffers: 50236 kB
Cached: 374388 kB
SwapCached: 0 kB
Active: 154168 kB
Inactive: 276496 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 4042508 kB
LowFree: 3533972 kB
SwapTotal: 5742364 kB
SwapFree: 5742364 kB
Dirty: 80 kB
Writeback: 0 kB
AnonPages: 5868 kB
Mapped: 4020 kB
Slab: 57212 kB
PageTables: 904 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
CommitLimit: 7763616 kB
Committed_AS: 19128 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 3208 kB
VmallocChunk: 34359734915 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
Hugepagesize: 2048 kB

cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Xeon(TM) CPU 3.20GHz
stepping : 3
cpu MHz : 3200.198
cache size : 2048 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall
lm constant_tsc pni monitor ds_cpl cid cx16 xtpr
bogomips : 6405.21
clflush size : 64
cache_alignment : 128
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Xeon(TM) CPU 3.20GHz
stepping : 3
cpu MHz : 3200.198
cache size : 2048 KB
physical id : 3
siblings : 2
core id : 0
cpu cores : 1
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall
lm constant_tsc pni monitor ds_cpl cid cx16 xtpr
bogomips : 6400.80
clflush size : 64
cache_alignment : 128
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 2
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Xeon(TM) CPU 3.20GHz
stepping : 3
cpu MHz : 3200.198
cache size : 2048 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall
lm constant_tsc pni monitor ds_cpl cid cx16 xtpr
bogomips : 6400.98
clflush size : 64
cache_alignment : 128
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 3
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Xeon(TM) CPU 3.20GHz
stepping : 3
cpu MHz : 3200.198
cache size : 2048 KB
physical id : 3
siblings : 2
core id : 0
cpu cores : 1
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall
lm constant_tsc pni monitor ds_cpl cid cx16 xtpr
bogomips : 6400.80
clflush size : 64
cache_alignment : 128
address sizes : 36 bits physical, 48 bits virtual
power management:


2006-10-12 21:56:10

by Steven Truong

Subject: Re: kdump/kexec/crash on vmcore file

Some odd things show up when this node boots. Here is the dmesg
output for this node:

Linux version 2.6.18 (root@node30) (gcc version 3.4.5 20051201 (Red
Hat 3.4.5-2)) #1 SMP Mon Oct 9 14:42:15 PDT 2006
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009cc00 (usable)
BIOS-e820: 000000000009cc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000ea070 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 00000000dffe0000 (usable)
BIOS-e820: 00000000dffe0000 - 00000000dffef000 (ACPI data)
BIOS-e820: 00000000dffef000 - 00000000dfff0000 (ACPI NVS)
BIOS-e820: 00000000dfff0000 - 00000000e0000000 (reserved)
BIOS-e820: 00000000fec00000 - 00000000fec86000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)
BIOS-e820: 0000000100000000 - 0000000120000000 (usable)
DMI 2.3 present.
ACPI: RSDP (v000 ACPIAM ) @ 0x00000000000f7740
ACPI: RSDT (v001 A M I OEMRSDT 0x03000529 MSFT 0x00000097) @
0x00000000dffe0000
ACPI: FADT (v002 A M I OEMFACP 0x03000529 MSFT 0x00000097) @
0x00000000dffe0200
ACPI: MADT (v001 A M I OEMAPIC 0x03000529 MSFT 0x00000097) @
0x00000000dffe0390
ACPI: OEMB (v001 A M I AMI_OEM 0x03000529 MSFT 0x00000097) @
0x00000000dffef040
ACPI: DSDT (v001 DVA4G DVA4G007 0x00000007 INTL 0x02002026) @
0x0000000000000000
No NUMA configuration found
Faking a node at 0000000000000000-0000000120000000
Bootmem setup node 0 0000000000000000-0000000120000000
On node 0 totalpages: 1028721
DMA zone: 2641 pages, LIFO batch:0
DMA32 zone: 897056 pages, LIFO batch:31
Normal zone: 129024 pages, LIFO batch:31
ACPI: PM-Timer IO Port: 0x408
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:4 APIC version 20
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x06] enabled)
Processor #6 15:4 APIC version 20
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x01] enabled)
Processor #1 15:4 APIC version 20
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x07] enabled)
Processor #7 15:4 APIC version 20
ACPI: IOAPIC (id[0x08] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 8, version 32, address 0xfec00000, GSI 0-23
ACPI: IOAPIC (id[0x0b] address[0xfec10000] gsi_base[72])
IOAPIC[1]: apic_id 11, version 32, address 0xfec10000, GSI 72-95
ACPI: IOAPIC (id[0x09] address[0xfec80000] gsi_base[24])
IOAPIC[2]: apic_id 9, version 32, address 0xfec80000, GSI 24-47
ACPI: IOAPIC (id[0x0a] address[0xfec80400] gsi_base[48])
IOAPIC[3]: apic_id 10, version 32, address 0xfec80400, GSI 48-71
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Setting APIC routing to physical flat
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at e2000000 (gap: e0000000:1ec00000)
SMP: Allowing 4 CPUs, 0 hotplug CPUs
Built 1 zonelists. Total pages: 1028721
Kernel command line: ro root=/dev/sda3 rhgb quiet
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
time.c: Using 3.579545 MHz WALL PM GTOD PIT/TSC timer.
time.c: Detected 3200.198 MHz processor.
Console: colour VGA+ 80x25
------------------------
| Locking API testsuite:
----------------------------------------------------------------------------
| spin |wlock |rlock |mutex | wsem | rsem |
--------------------------------------------------------------------------
A-A deadlock:failed|failed| ok |failed|failed|failed|
A-B-B-A deadlock:failed|failed| ok |failed|failed|failed|
A-B-B-C-C-A deadlock:failed|failed| ok |failed|failed|failed|
A-B-C-A-B-C deadlock:failed|failed| ok |failed|failed|failed|
A-B-B-C-C-D-D-A deadlock:failed|failed| ok |failed|failed|failed|
A-B-C-D-B-D-D-A deadlock:failed|failed| ok |failed|failed|failed|
A-B-C-D-B-C-D-A deadlock:failed|failed| ok |failed|failed|failed|
double unlock: ok | ok |failed|failed|failed|failed|
initialize held:failed|failed|failed|failed|failed|failed|
bad unlock order: ok | ok | ok | ok | ok | ok |
--------------------------------------------------------------------------
recursive read-lock: | ok | |failed|
recursive read-lock #2: | ok | |failed|
mixed read-write-lock: |failed| |failed|
mixed write-read-lock: |failed| |failed|
--------------------------------------------------------------------------
hard-irqs-on + irq-safe-A/12:failed|failed| ok |
soft-irqs-on + irq-safe-A/12:failed|failed| ok |
hard-irqs-on + irq-safe-A/21:failed|failed| ok |
soft-irqs-on + irq-safe-A/21:failed|failed| ok |
sirq-safe-A => hirqs-on/12:failed|failed| ok |
sirq-safe-A => hirqs-on/21:failed|failed| ok |
hard-safe-A + irqs-on/12:failed|failed| ok |
soft-safe-A + irqs-on/12:failed|failed| ok |
hard-safe-A + irqs-on/21:failed|failed| ok |
soft-safe-A + irqs-on/21:failed|failed| ok |
hard-safe-A + unsafe-B #1/123:failed|failed| ok |
soft-safe-A + unsafe-B #1/123:failed|failed| ok |
hard-safe-A + unsafe-B #1/132:failed|failed| ok |
soft-safe-A + unsafe-B #1/132:failed|failed| ok |
hard-safe-A + unsafe-B #1/213:failed|failed| ok |
soft-safe-A + unsafe-B #1/213:failed|failed| ok |
hard-safe-A + unsafe-B #1/231:failed|failed| ok |
soft-safe-A + unsafe-B #1/231:failed|failed| ok |
hard-safe-A + unsafe-B #1/312:failed|failed| ok |
soft-safe-A + unsafe-B #1/312:failed|failed| ok |
hard-safe-A + unsafe-B #1/321:failed|failed| ok |
soft-safe-A + unsafe-B #1/321:failed|failed| ok |
hard-safe-A + unsafe-B #2/123:failed|failed| ok |
soft-safe-A + unsafe-B #2/123:failed|failed| ok |
hard-safe-A + unsafe-B #2/132:failed|failed| ok |
soft-safe-A + unsafe-B #2/132:failed|failed| ok |
hard-safe-A + unsafe-B #2/213:failed|failed| ok |
soft-safe-A + unsafe-B #2/213:failed|failed| ok |
hard-safe-A + unsafe-B #2/231:failed|failed| ok |
soft-safe-A + unsafe-B #2/231:failed|failed| ok |
hard-safe-A + unsafe-B #2/312:failed|failed| ok |
soft-safe-A + unsafe-B #2/312:failed|failed| ok |
hard-safe-A + unsafe-B #2/321:failed|failed| ok |
soft-safe-A + unsafe-B #2/321:failed|failed| ok |
hard-irq lock-inversion/123:failed|failed| ok |
soft-irq lock-inversion/123:failed|failed| ok |
hard-irq lock-inversion/132:failed|failed| ok |
soft-irq lock-inversion/132:failed|failed| ok |
hard-irq lock-inversion/213:failed|failed| ok |
soft-irq lock-inversion/213:failed|failed| ok |
hard-irq lock-inversion/231:failed|failed| ok |
soft-irq lock-inversion/231:failed|failed| ok |
hard-irq lock-inversion/312:failed|failed| ok |
soft-irq lock-inversion/312:failed|failed| ok |
hard-irq lock-inversion/321:failed|failed| ok |
soft-irq lock-inversion/321:failed|failed| ok |
hard-irq read-recursion/123: ok |
soft-irq read-recursion/123: ok |
hard-irq read-recursion/132: ok |
soft-irq read-recursion/132: ok |
hard-irq read-recursion/213: ok |
soft-irq read-recursion/213: ok |
hard-irq read-recursion/231: ok |
soft-irq read-recursion/231: ok |
hard-irq read-recursion/312: ok |
soft-irq read-recursion/312: ok |
hard-irq read-recursion/321: ok |
soft-irq read-recursion/321: ok |
--------------------------------------------------------
143 out of 218 testcases failed, as expected. |
Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
Checking aperture...
PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
Placing software IO TLB between 0x1669000 - 0x5669000
Memory: 4041340k/4718592k available (2376k kernel code, 152436k reserved, 1547k
data, 204k init)
Calibrating delay using timer specific routine.. 6405.21 BogoMIPS (lpj=12810429)
Security Framework v1.0.0 initialized
SELinux: Initializing.
SELinux: Starting in permissive mode
selinux_register_security: Registering secondary module capability
Capability LSM initialized as secondary
Mount-cache hash table entries: 256
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 2048K
using mwait in idle threads.
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
CPU0: Thermal monitoring enabled (TM1)
SMP alternatives: switching to UP code
ACPI: Core revision 20060707
Using local APIC timer interrupts.
result 12500695
Detected 12.500 MHz APIC timer.
SMP alternatives: switching to SMP code
Booting processor 1/4 APIC 0x6
Initializing CPU#1
Calibrating delay using timer specific routine.. 6400.80 BogoMIPS (lpj=12801613)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 2048K
CPU: Physical Processor ID: 3
CPU: Processor Core ID: 0
CPU1: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) CPU 3.20GHz stepping 03
SMP alternatives: switching to SMP code
Booting processor 2/4 APIC 0x1
Initializing CPU#2
Calibrating delay using timer specific routine.. 6400.98 BogoMIPS (lpj=12801963)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 2048K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
CPU2: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) CPU 3.20GHz stepping 03
SMP alternatives: switching to SMP code
Booting processor 3/4 APIC 0x7
Initializing CPU#3
Calibrating delay using timer specific routine.. 6400.80 BogoMIPS (lpj=12801609)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 2048K
CPU: Physical Processor ID: 3
CPU: Processor Core ID: 0
CPU3: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) CPU 3.20GHz stepping 03
Brought up 4 CPUs
testing NMI watchdog ... OK.
migration_cost=4,1290
checking if image is initramfs... it is
Freeing initrd memory: 963k freed
PM: Adding info for No Bus:platform
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: Using configuration type 1
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
PM: Adding info for acpi:acpi
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
PM: Adding info for No Bus:pci0000:00
PCI quirk: region 0400-047f claimed by ICH4 ACPI/GPIO/TCO
PCI quirk: region 0500-053f claimed by ICH4 GPIO
PCI: Ignoring BAR0-3 of IDE controller 0000:00:1f.2
PCI: PXH quirk detected, disabling MSI for SHPC device
PCI: PXH quirk detected, disabling MSI for SHPC device
Boot video device is 0000:06:02.0
PCI: Transparent bridge - 0000:00:1e.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
PM: Adding info for pci:0000:00:00.0
PM: Adding info for pci:0000:00:02.0
PM: Adding info for pci:0000:00:03.0
PM: Adding info for pci:0000:00:1c.0
PM: Adding info for pci:0000:00:1d.0
PM: Adding info for pci:0000:00:1d.1
PM: Adding info for pci:0000:00:1d.4
PM: Adding info for pci:0000:00:1d.5
PM: Adding info for pci:0000:00:1d.7
PM: Adding info for pci:0000:00:1e.0
PM: Adding info for pci:0000:00:1f.0
PM: Adding info for pci:0000:00:1f.2
PM: Adding info for pci:0000:00:1f.3
PM: Adding info for pci:0000:01:00.0
PM: Adding info for pci:0000:01:00.2
PM: Adding info for pci:0000:05:01.0
PM: Adding info for pci:0000:05:02.0
PM: Adding info for pci:0000:06:02.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.EPA0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.EPA0.PXHA._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.EPA0.PXHB._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.EPA1._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P1._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0PC._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 *5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 *5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 *5 6 7 10 11 12 14 15)
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
PM: Adding info for No Bus:pnp0
PM: Adding info for pnp:00:00
PM: Adding info for pnp:00:01
PM: Adding info for pnp:00:02
PM: Adding info for pnp:00:03
PM: Adding info for pnp:00:04
PM: Adding info for pnp:00:05
PM: Adding info for pnp:00:06
PM: Adding info for pnp:00:07
PM: Adding info for pnp:00:08
PM: Adding info for pnp:00:09
PM: Adding info for pnp:00:0a
PM: Adding info for pnp:00:0b
PM: Adding info for pnp:00:0c
PM: Adding info for pnp:00:0d
PM: Adding info for pnp:00:0e
PM: Adding info for pnp:00:0f
pnp: PnP ACPI: found 16 devices
usbcore: registered new driver usbfs
usbcore: registered new driver hub
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
PCI-GART: No AMD northbridge found.
pnp: 00:0a: ioport range 0x295-0x296 has been reserved
pnp: 00:0a: ioport range 0xb78-0xb7f has been reserved
pnp: 00:0d: ioport range 0x680-0x6ff has been reserved
PCI: Bridge: 0000:01:00.0
IO window: disabled.
MEM window: disabled.
PREFETCH window: disabled.
PCI: Bridge: 0000:01:00.2
IO window: disabled.
MEM window: disabled.
PREFETCH window: disabled.
PCI: Bridge: 0000:00:02.0
IO window: disabled.
MEM window: disabled.
PREFETCH window: disabled.
PCI: Bridge: 0000:00:03.0
IO window: disabled.
MEM window: disabled.
PREFETCH window: disabled.
PCI: Bridge: 0000:00:03.0
IO window: disabled.
MEM window: disabled.
PREFETCH window: disabled.
PCI: Bridge: 0000:00:1c.0
IO window: c000-cfff
MEM window: fc900000-fc9fffff
PREFETCH window: fc700000-fc7fffff
PCI: Bridge: 0000:00:1e.0
IO window: d000-dfff
MEM window: fca00000-feafffff
PREFETCH window: e2000000-e20fffff
GSI 16 sharing vector 0xA9 and IRQ 16
ACPI: PCI Interrupt 0000:00:02.0[A] -> GSI 16 (level, low) -> IRQ 16
PCI: Setting latency timer of device 0000:00:02.0 to 64
PCI: Setting latency timer of device 0000:01:00.0 to 64
PCI: Setting latency timer of device 0000:01:00.2 to 64
ACPI: PCI Interrupt 0000:00:03.0[A] -> GSI 16 (level, low) -> IRQ 16
PCI: Setting latency timer of device 0000:00:03.0 to 64
PCI: Setting latency timer of device 0000:00:1e.0 to 64
NET: Registered protocol family 2
IP route cache hash table entries: 131072 (order: 8, 1048576 bytes)
TCP established hash table entries: 131072 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 9, 2097152 bytes)
TCP: Hash tables configured (established 131072 bind 65536)
TCP reno registered
PM: Adding info for platform:pcspkr
audit: initializing netlink socket (disabled)
audit(1160507947.196:1): initialized
Total HugeTLB memory allocated, 0
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
SELinux: Registering netfilter hooks
Initializing Cryptographic API
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
PCI: Setting latency timer of device 0000:00:02.0 to 64
Allocate Port Service[0000:00:02.0:pcie00]
PM: Adding info for pci_express:0000:00:02.0:pcie00
PCI: Setting latency timer of device 0000:00:03.0 to 64
Allocate Port Service[0000:00:03.0:pcie00]
PM: Adding info for pci_express:0000:00:03.0:pcie00
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
PM: Adding info for platform:vesafb.0
ACPI: Processor [CPU1] (supports 8 throttling states)
Real Time Clock Driver v1.12ac
Linux agpgart interface v0.101 (c) Dave Jones
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
PM: Adding info for platform:serial8250
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
00:05: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:06: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
RAMDISK driver initialized: 16 RAM disks of 16384K size 1024 blocksize
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
Probing IDE interface ide0...
ide1: I/O resource 0x170-0x177 not free.
ide1: ports already in use, skipping probe
ide-floppy driver 0.99.newide
usbcore: registered new driver libusual
usbcore: registered new driver hiddev
usbcore: registered new driver usbhid
drivers/usb/input/hid-core.c: v2.6:USB HID core driver
PNP: PS/2 Controller [PNP0303:PS2K,PNP0f03:PS2M] at 0x60,0x64 irq 1,12
PM: Adding info for platform:i8042
serio: i8042 AUX port at 0x60,0x64 irq 12
PM: Adding info for serio:serio0
serio: i8042 KBD port at 0x60,0x64 irq 1
PM: Adding info for serio:serio1
mice: PS/2 mouse device common for all mice
md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: bitmap version 4.39
TCP bic registered
Initializing IPsec netlink socket
NET: Registered protocol family 1
NET: Registered protocol family 17
ACPI: (supports S0 S1 S4 S5)
Freeing unused kernel memory: 204k freed
Write protecting the kernel read-only data: 454k
SCSI subsystem initialized
input: AT Translated Set 2 keyboard as /class/input/input0
libata version 2.00 loaded.
ata_piix 0000:00:1f.2: version 2.00
ata_piix 0000:00:1f.2: MAP [ IDE IDE P0 P1 ]
GSI 17 sharing vector 0xB1 and IRQ 17
ACPI: PCI Interrupt 0000:00:1f.2[A] -> GSI 18 (level, low) -> IRQ 17
PCI: Setting latency timer of device 0000:00:1f.2 to 64
ata1: PATA max UDMA/100 cmd 0x1F0 ctl 0x3F6 bmdma 0xFC00 irq 14
scsi0 : ata_piix
PM: Adding info for No Bus:host0
ata2: SATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xFC08 irq 15
scsi1 : ata_piix
PM: Adding info for No Bus:host1
ata2.00: ATA-7, max UDMA/133, 156312576 sectors: LBA48
ata2.00: ata2: dev 0 multi count 16
ata2.00: configured for UDMA/133
PM: Adding info for No Bus:target1:0:0
Vendor: ATA Model: WDC WD800JD-08LS Rev: 07.0
Type: Direct-Access ANSI SCSI revision: 05
PM: Adding info for scsi:1:0:0:0
SCSI device sda: 156312576 512-byte hdwr sectors (80032 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
SCSI device sda: 156312576 512-byte hdwr sectors (80032 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 >
sd 1:0:0:0: Attached scsi disk sda
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
SELinux: Disabled at runtime.
SELinux: Unregistering netfilter hooks
audit(1160507949.120:2): selinux=0 auid=4294967295
input: ImExPS/2 Generic Explorer Mouse as /class/input/input1
FDC 0 is a post-1991 82077
Intel(R) PRO/1000 Network Driver - version 7.1.9-k4-NAPI
Copyright (c) 1999-2006 Intel Corporation.
GSI 18 sharing vector 0xB9 and IRQ 18
ACPI: PCI Interrupt 0000:05:01.0[A] -> GSI 74 (level, low) -> IRQ 18
e1000: 0000:05:01.0: e1000_validate_option: Receive Interrupt Delay set to 0
e1000: 0000:05:01.0: e1000_check_options: Interrupt Throttling Rate
(ints/sec) turned off
e1000: 0000:05:01.0: e1000_probe: (PCI:66MHz:32-bit) 00:30:48:83:12:66
e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
GSI 19 sharing vector 0xC1 and IRQ 19
ACPI: PCI Interrupt 0000:05:02.0[A] -> GSI 75 (level, low) -> IRQ 19
e1000: 0000:05:02.0: e1000_probe: (PCI:66MHz:32-bit) 00:30:48:83:12:67
e1000: eth1: e1000_probe: Intel(R) PRO/1000 Network Connection
GSI 20 sharing vector 0xC9 and IRQ 20
ACPI: PCI Interrupt 0000:00:1f.3[B] -> GSI 17 (level, low) -> IRQ 20
PM: Adding info for No Bus:i2c-0
i6300ESB timer: initialized (0xffffc20000004800). heartbeat=30 sec (nowayout=0)
GSI 21 sharing vector 0xD1 and IRQ 21
ACPI: PCI Interrupt 0000:00:1d.7[D] -> GSI 23 (level, low) -> IRQ 21
PCI: Setting latency timer of device 0000:00:1d.7 to 64
ehci_hcd 0000:00:1d.7: EHCI Host Controller
ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 1
ehci_hcd 0000:00:1d.7: debug port 1
PCI: cache line size of 128 is not supported by device 0000:00:1d.7
ehci_hcd 0000:00:1d.7: irq 21, io mem 0xfebffc00
ehci_hcd 0000:00:1d.7: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
PM: Adding info for usb:usb1
PM: Adding info for No Bus:usbdev1.1_ep00
usb usb1: configuration #1 chosen from 1 choice
PM: Adding info for usb:1-0:1.0
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 4 ports detected
PM: Adding info for No Bus:usbdev1.1_ep81
USB Universal Host Controller Interface driver v3.0
ACPI: PCI Interrupt 0000:00:1d.0[A] -> GSI 16 (level, low) -> IRQ 16
PCI: Setting latency timer of device 0000:00:1d.0 to 64
uhci_hcd 0000:00:1d.0: UHCI Host Controller
uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 2
uhci_hcd 0000:00:1d.0: irq 16, io base 0x0000e800
PM: Adding info for usb:usb2
PM: Adding info for No Bus:usbdev2.1_ep00
usb usb2: configuration #1 chosen from 1 choice
PM: Adding info for usb:2-0:1.0
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
PM: Adding info for No Bus:usbdev2.1_ep81
GSI 22 sharing vector 0xD9 and IRQ 22
ACPI: PCI Interrupt 0000:00:1d.1[B] -> GSI 19 (level, low) -> IRQ 22
PCI: Setting latency timer of device 0000:00:1d.1 to 64
uhci_hcd 0000:00:1d.1: UHCI Host Controller
uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 3
uhci_hcd 0000:00:1d.1: irq 22, io base 0x0000ec00
PM: Adding info for usb:usb3
PM: Adding info for No Bus:usbdev3.1_ep00
usb usb3: configuration #1 chosen from 1 choice
PM: Adding info for usb:3-0:1.0
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
PM: Adding info for No Bus:usbdev3.1_ep81
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
ACPI: Power Button (FF) [PWRF]
ACPI: Power Button (CM) [PWRB]
ACPI: Sleep Button (CM) [SLPB]
EXT3 FS on sda3, internal journal
device-mapper: ioctl: 4.7.0-ioctl (2006-06-24) initialised: dm-devel@redhat.com
kjournald starting. Commit interval 5 seconds
EXT3 FS on sda1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on sda8, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on sda7, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on sda5, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on sda6, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
Adding 5742364k swap on /dev/sda2. Priority:-1 extents:1 across:5742364k
ip_tables: (C) 2000-2006 Netfilter Core Team
ip_tables: (C) 2000-2006 Netfilter Core Team
e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex
ip_tables: (C) 2000-2006 Netfilter Core Team
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
IPv6 over IPv4 tunneling driver
e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex
eth0: no IPv6 routers present

2006-10-12 22:01:10

by Steven Truong

Subject: Re: kdump/kexec/crash on vmcore file

I also tried running gdb on the vmcore file, but I got errors too.
gdb vmcore.test
GNU gdb Red Hat Linux (6.3.0.0-1.96rh)

This GDB was configured as
"x86_64-redhat-linux-gnu"..."/scratch/vmcore.test": not in executable
format: File format not recognized

Could someone here point me in the right direction, or to links on
troubleshooting kernel panics?

Thank you very much.

2006-10-13 14:15:18

by Vivek Goyal

Subject: Re: kdump/kexec/crash on vmcore file

On Thu, Oct 12, 2006 at 02:50:33PM -0700, Steven Truong wrote:
> Hi, all. This is my first attempt to troubleshoot a kernel panic so I
> am quite newbie in this area. I have been able to obtain a kdump when
> my box had kernel panic.
>
> I set up Kdump and Kexec and then the captured/crash kernel to boot
> into Level 1 and then copy /proc/vmcore file to the disk for later
> analysis. However, after the server booted back to Level 3 and I
> utilized the crash command to analyzed the vmcore file. I got error
> message:
>
> ./crash /boot/vmlinux ../vmcore.test
>
>
> crash: read error: kernel virtual address: ffffffff8123d1e0 type:
> "kernel_config_data"
> WARNING: cannot read kernel_config_data
> crash: read error: kernel virtual address: ffffffff813b5180 type: "xtime"
>

Hi Steven,

Which vmlinux are you using for analysis: the first kernel's vmlinux or
the second kernel's? You should be using the first kernel's vmlinux.

crash is trying to read some symbols from the core file, and it thinks
that the virtual address of kernel_config_data is ffffffff8123d1e0. I think
this is too high an address. I guess this would be the address if you
compiled your kernel for physical address 16MB. So my first guess is that
you are using the second kernel's vmlinux for analysis.
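
The reasoning above can be checked with a little arithmetic. The sketch
below is only an illustration, and it assumes the usual x86_64 layout in
which kernel text is mapped at the fixed virtual base 0xffffffff80000000
(__START_KERNEL_map). The helper name kernel_text_phys is made up for this
example and is not part of crash or the kernel.

```python
# Assumption: x86_64 kernel text is mapped at the fixed virtual base
# __START_KERNEL_map = 0xffffffff80000000, so subtracting it gives the
# physical address a symbol was linked for.
START_KERNEL_MAP = 0xFFFFFFFF80000000
MIB = 1024 * 1024


def kernel_text_phys(vaddr: int) -> int:
    """Translate a kernel-text virtual address to a physical address."""
    if vaddr < START_KERNEL_MAP:
        raise ValueError("address is below the kernel text mapping")
    return vaddr - START_KERNEL_MAP


# The two addresses reported in the crash read errors.
symbols = {
    "kernel_config_data": 0xFFFFFFFF8123D1E0,
    "xtime": 0xFFFFFFFF813B5180,
}

for name, vaddr in symbols.items():
    phys = kernel_text_phys(vaddr)
    print(f"{name}: phys {phys:#x} = {phys / MIB:.2f} MiB")
```

Both symbols land a little above 16 MiB (about 18.24 MiB and 19.71 MiB),
which is what one would expect from a vmlinux built to load at 16MB, as a
capture kernel typically is.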

Which kernel version and kexec-tools version are you using?

I am also copying this mail to the crash-utility mailing list, where folks
keep watch on crash-related issues.

Thanks
Vivek

2006-10-13 14:19:21

by Vivek Goyal

Subject: Re: kdump/kexec/crash on vmcore file

On Thu, Oct 12, 2006 at 03:01:06PM -0700, Steven Truong wrote:
> I also tried to gdb the vmcore file but I got errors too.
> gdb vmcore.test
> GNU gdb Red Hat Linux (6.3.0.0-1.96rh)
>

You should be running gdb as

gdb vmlinux vmcore

Here vmlinux is the first kernel's executable, and it should be compiled
with CONFIG_DEBUG_INFO.
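
The "not in executable format" error presumably came from handing gdb the
core file where it expects the executable. A quick way to tell the two
files apart is the e_type field in the ELF header. The sketch below is a
minimal illustration using only the standard ELF header layout; the
function name elf_kind is hypothetical and not part of gdb or crash.

```python
import struct

# e_type values defined by the ELF specification.
ELF_TYPES = {1: "relocatable", 2: "executable", 3: "shared object", 4: "core"}


def elf_kind(path: str) -> str:
    """Return a short description of a file's ELF type, or 'not ELF'."""
    with open(path, "rb") as f:
        header = f.read(18)  # e_ident (16 bytes) followed by e_type (2 bytes)
    if len(header) < 18 or header[:4] != b"\x7fELF":
        return "not ELF"
    # e_ident[5] (EI_DATA) gives the byte order: 1 = little, 2 = big endian.
    byte_order = "<" if header[5] == 1 else ">"
    (e_type,) = struct.unpack(byte_order + "H", header[16:18])
    return ELF_TYPES.get(e_type, f"unknown ({e_type})")
```

Calling elf_kind on each file before starting gdb shows which one belongs
in which position: a vmlinux of this era would be expected to report
"executable", and a file copied from /proc/vmcore to report "core".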

Thanks
Vivek

2006-10-13 15:42:52

by Jean-Marc Saffroy

Subject: Re: [Crash-utility] Re: kdump/kexec/crash on vmcore file

Steven,

I see you tried using gdb; maybe a tool I wrote could help you:
http://jeanmarc.saffroy.free.fr/kdump2gdb/

Basically it will convert the kdump core to a slightly different core that
is suitable for gdb, as well as a gdb script that loads kernel modules at
the right offsets. Then maybe you can grab a backtrace of the faulting
process ("bt full" can be nice) and post it to l-k.


Cheers,

--
[email protected]

2006-10-13 17:50:12

by Steven Truong

Subject: Re: kdump/kexec/crash on vmcore file

Hi, Vivek. Thank you very much for the tips. I went back to check the
command I used to load the crash/capture kernel and found that I had
loaded the wrong kernel. I then tried again with the correct kernel, and
now I am able to use crash to analyze the vmcore kdump file.

Thank you once again.


2006-10-13 17:51:16

by Steven Truong

Subject: Re: [Crash-utility] Re: kdump/kexec/crash on vmcore file

Thanks, Jean-Marc. I am going to try your tool too.
