2006-11-29 17:03:09

by Bart Trojanowski

Subject: 2.6.18.3 SMP PREEMPT crashes (x86-64)

Hi all,

I've been getting spurious hangs in 2.6.18 lately... first I thought it
was hardware but tried different replacement parts and memtest. Nothing
showed any problems.

I finally hooked up a serial console to the box and I see the following.
I include the initial dmesg output to show what's in the machine.

- Nforce4 based Shuttle XPC (PCIe, forcedeth, etc)
- Opteron 170 (dual core)
- ATI X300 video card (OSS drivers)
- 2 software raided SATA disks

The last crash occurred while I was building 2.6.19-rc5 to see if it
would make a difference. I've had it hang this way while typing into an
xterm... I don't think it required any particular load.

My .config is attached.

Any help would be appreciated.

Thanks in advance.

-Bart

[ 0.000000] Bootdata ok (command line is BOOT_IMAGE=v2.6.18.3 ro root=900 console=ttyS0,115200n8 console=tty0)
[ 0.000000] Linux version 2.6.18.3-xenon64-smp.9 (root@xenon) (gcc version 4.1.2 20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu5)) #1 SMP PREEMPT Wed Nov 29 10:20:42 EST 2006
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: 0000000000000000 - 000000000009f400 (usable)
[ 0.000000] BIOS-e820: 000000000009f400 - 00000000000a0000 (reserved)
[ 0.000000] BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
[ 0.000000] BIOS-e820: 0000000000100000 - 000000003fff0000 (usable)
[ 0.000000] BIOS-e820: 000000003fff0000 - 000000003fff3000 (ACPI NVS)
[ 0.000000] BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data)
[ 0.000000] BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
[ 0.000000] BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
[ 0.000000] DMI 2.2 present.
[ 0.000000] Nvidia board detected. Ignoring ACPI timer override.
[ 0.000000] ACPI: PM-Timer IO Port: 0x4008
[ 0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
[ 0.000000] Processor #0 15:3 APIC version 16
[ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
[ 0.000000] Processor #1 15:3 APIC version 16
[ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
[ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
[ 0.000000] ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
[ 0.000000] IOAPIC[0]: apic_id 2, version 17, address 0xfec00000, GSI 0-23
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[ 0.000000] ACPI: BIOS IRQ0 pin2 override ignored.
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 high edge)
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 15 global_irq 15 high edge)
[ 0.000000] Setting APIC routing to flat
[ 0.000000] Using ACPI (MADT) for SMP configuration information
[ 0.000000] Allocating PCI resources starting at 50000000 (gap: 40000000:a0000000)
[ 0.000000] Built 1 zonelists. Total pages: 257927
[ 0.000000] Kernel command line: BOOT_IMAGE=v2.6.18.3 ro root=900 console=ttyS0,115200n8 console=tty0
[ 0.000000] Initializing CPU#0
[ 0.000000] PID hash table entries: 4096 (order: 12, 32768 bytes)
[ 0.000000] Disabling vsyscall due to use of PM timer
[ 0.000000] time.c: Using 3.579545 MHz WALL PM GTOD PM timer.
[ 0.000000] time.c: Detected 2010.319 MHz processor.
[ 42.677836] Console: colour VGA+ 80x50
[ 42.912332] Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
[ 42.920112] Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
[ 42.927135] Checking aperture...
[ 42.930408] CPU 0: aperture @ 4000000 size 32 MB
[ 42.935065] Aperture too small (32 MB)
[ 42.944006] No AGP bridge found
[ 42.947189] Your BIOS doesn't leave a aperture memory hole
[ 42.952715] Please enable the IOMMU option in the BIOS setup
[ 42.958413] This costs you 64 MB of RAM
[ 43.009987] Mapping aperture over 65536 KB of RAM @ 4000000
[ 43.028343] Memory: 958304k/1048512k available (3486k kernel code, 89540k reserved, 1521k data, 216k init)
[ 43.097928] Calibrating delay using timer specific routine.. 4022.42 BogoMIPS (lpj=2011214)
[ 43.106529] Security Framework v1.0.0 initialized
[ 43.111280] SELinux: Disabled at boot.
[ 43.115222] Mount-cache hash table entries: 256
[ 43.120158] CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
[ 43.127334] CPU: L2 Cache: 1024K (64 bytes/line)
[ 43.131992] CPU: Physical Processor ID: 0
[ 43.136043] CPU: Processor Core ID: 0
[ 43.139769] Freeing SMP alternatives: 28k freed
[ 43.144363] ACPI: Core revision 20060707
[ 43.167580] Using local APIC timer interrupts.
[ 43.221777] result 12564513
[ 43.224616] Detected 12.564 MHz APIC timer.
[ 43.230362] Booting processor 1/2 APIC 0x1
[ 43.244505] Initializing CPU#1
[ 43.305454] Calibrating delay using timer specific routine.. 4020.05 BogoMIPS (lpj=2010027)
[ 43.305461] CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
[ 43.305463] CPU: L2 Cache: 1024K (64 bytes/line)
[ 43.305465] CPU: Physical Processor ID: 0
[ 43.305467] CPU: Processor Core ID: 1
[ 43.305582] Dual Core AMD Opteron(tm) Processor 170 stepping 02
[ 43.306458] CPU 1: Syncing TSC to CPU 0.
[ 43.306830] CPU 1: synchronized TSC with CPU 0 (last diff 21 cycles, maxerr 434 cycles)
[ 43.306836] Brought up 2 CPUs
[ 43.359153] testing NMI watchdog ... OK.
[ 44.505548] migration_cost=457
[ 44.509877] NET: Registered protocol family 16
[ 44.514532] ACPI: bus type pci registered
[ 44.521996] PCI: Using MMCONFIG at e0000000
[ 44.526247] PCI: No mmconfig possible on device 0:18
[ 44.542156] ACPI: Interpreter enabled
[ 44.545861] ACPI: Using IOAPIC for interrupt routing
[ 44.551990] ACPI: PCI Root Bridge [PCI0] (0000:00)
[ 44.562340] PCI: Transparent bridge - 0000:00:09.0
[ 44.681143] ACPI: PCI Interrupt Link [LNK1] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
[ 44.690490] ACPI: PCI Interrupt Link [LNK2] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
[ 44.699847] ACPI: PCI Interrupt Link [LNK3] (IRQs 3 4 5 7 9 10 11 *12 14 15)
[ 44.707955] ACPI: PCI Interrupt Link [LNK4] (IRQs 3 4 5 *7 9 10 11 12 14 15)
[ 44.716065] ACPI: PCI Interrupt Link [LNK5] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
[ 44.725379] ACPI: PCI Interrupt Link [LUBA] (IRQs *3 4 5 7 9 10 11 12 14 15)
[ 44.733466] ACPI: PCI Interrupt Link [LUBB] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
[ 44.742815] ACPI: PCI Interrupt Link [LMAC] (IRQs 3 4 *5 7 9 10 11 12 14 15)
[ 44.750920] ACPI: PCI Interrupt Link [LACI] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
[ 44.760263] ACPI: PCI Interrupt Link [LMCI] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
[ 44.769593] ACPI: PCI Interrupt Link [LSMB] (IRQs 3 4 5 7 9 *10 11 12 14 15)
[ 44.777698] ACPI: PCI Interrupt Link [LUB2] (IRQs 3 4 5 7 9 10 *11 12 14 15)
[ 44.785806] ACPI: PCI Interrupt Link [LIDE] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
[ 44.795156] ACPI: PCI Interrupt Link [LSID] (IRQs 3 4 5 7 9 10 *11 12 14 15)
[ 44.803257] ACPI: PCI Interrupt Link [LFID] (IRQs 3 4 5 7 9 *10 11 12 14 15)
[ 44.811370] ACPI: PCI Interrupt Link [LPCA] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
[ 44.820793] ACPI: PCI Interrupt Link [APC1] (IRQs 16) *0, disabled.
[ 44.827879] ACPI: PCI Interrupt Link [APC2] (IRQs 17) *0, disabled.
[ 44.834961] ACPI: PCI Interrupt Link [APC3] (IRQs 18) *0, disabled.
[ 44.842048] ACPI: PCI Interrupt Link [APC4] (IRQs 19) *0, disabled.
[ 44.849001] ACPI: PCI Interrupt Link [APC5] (IRQs *16), disabled.
[ 44.855911] ACPI: PCI Interrupt Link [APCF] (IRQs 20 21 22 23) *0, disabled.
[ 44.863953] ACPI: PCI Interrupt Link [APCG] (IRQs 20 21 22 23) *0, disabled.
[ 44.871965] ACPI: PCI Interrupt Link [APCH] (IRQs 20 21 22 23) *0, disabled.
[ 44.879984] ACPI: PCI Interrupt Link [APCJ] (IRQs 20 21 22 23) *0, disabled.
[ 44.888007] ACPI: PCI Interrupt Link [APCK] (IRQs 20 21 22 23) *0, disabled.
[ 44.896024] ACPI: PCI Interrupt Link [APCS] (IRQs 20 21 22 23) *0, disabled.
[ 44.904045] ACPI: PCI Interrupt Link [APCL] (IRQs 20 21 22 23) *0, disabled.
[ 44.912066] ACPI: PCI Interrupt Link [APCZ] (IRQs 20 21 22 23) *0, disabled.
[ 44.920094] ACPI: PCI Interrupt Link [APSI] (IRQs 20 21 22 23) *0, disabled.
[ 44.928112] ACPI: PCI Interrupt Link [APSJ] (IRQs 20 21 22 23) *0, disabled.
[ 44.936144] ACPI: PCI Interrupt Link [APCP] (IRQs 20 21 22 23) *0, disabled.
[ 44.948033] Linux Plug and Play Support v0.97 (c) Adam Belay
[ 44.953756] pnp: PnP ACPI init
[ 44.964277] pnp: PnP ACPI: found 11 devices
[ 44.968612] Generic PHY: Registered new driver
[ 44.974234] SCSI subsystem initialized
[ 44.978129] usbcore: registered new driver usbfs
[ 44.982879] usbcore: registered new driver hub
[ 44.988747] PCI: Using ACPI for IRQ routing
[ 44.992978] PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
[ 45.001487] PCI-DMA: Disabling AGP.
[ 45.005070] PCI-DMA: aperture base @ 4000000 size 65536 KB
[ 45.010599] PCI-DMA: using GART IOMMU.
[ 45.014394] PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture
[ 45.021692] pnp: 00:01: ioport range 0x4000-0x407f could not be reserved
[ 45.028431] pnp: 00:01: ioport range 0x4080-0x40ff has been reserved
[ 45.034824] pnp: 00:01: ioport range 0x4400-0x447f has been reserved
[ 45.041214] pnp: 00:01: ioport range 0x4480-0x44ff could not be reserved
[ 45.047952] pnp: 00:01: ioport range 0x4800-0x487f has been reserved
[ 45.054341] pnp: 00:01: ioport range 0x4880-0x48ff has been reserved
[ 45.061527] PCI: Bridge: 0000:00:09.0
[ 45.065239] IO window: a000-afff
[ 45.068684] MEM window: d8100000-d81fffff
[ 45.072909] PREFETCH window: disabled.
[ 45.076876] PCI: Bridge: 0000:00:0b.0
[ 45.080580] IO window: disabled.
[ 45.084027] MEM window: disabled.
[ 45.087562] PREFETCH window: disabled.
[ 45.091528] PCI: Bridge: 0000:00:0c.0
[ 45.095234] IO window: disabled.
[ 45.098682] MEM window: disabled.
[ 45.102215] PREFETCH window: disabled.
[ 45.106182] PCI: Bridge: 0000:00:0d.0
[ 45.109888] IO window: disabled.
[ 45.113335] MEM window: disabled.
[ 45.116868] PREFETCH window: disabled.
[ 45.120837] PCI: Bridge: 0000:00:0e.0
[ 45.124540] IO window: 9000-9fff
[ 45.127989] MEM window: d8000000-d80fffff
[ 45.132215] PREFETCH window: d0000000-d7ffffff
[ 45.136980] NET: Registered protocol family 2
[ 45.151451] IP route cache hash table entries: 32768 (order: 6, 262144 bytes)
[ 45.159110] TCP established hash table entries: 131072 (order: 10, 4194304 bytes)
[ 45.169673] TCP bind hash table entries: 65536 (order: 9, 2097152 bytes)
[ 45.177929] TCP: Hash tables configured (established 131072 bind 65536)
[ 45.184584] TCP reno registered
[ 45.193267] audit: initializing netlink socket (disabled)
[ 45.198735] audit(1160191405.089:1): initialized
[ 45.203720] VFS: Disk quotas dquot_6.5.1
[ 45.207722] Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[ 45.214341] SGI XFS with ACLs, security attributes, realtime, large block/inode numbers, no debug enabled
[ 45.224362] SGI XFS Quota Management subsystem
[ 45.228922] Initializing Cryptographic API
[ 45.233070] io scheduler noop registered
[ 45.237101] io scheduler anticipatory registered (default)
[ 45.242726] io scheduler deadline registered
[ 45.247134] io scheduler cfq registered
[ 45.265842] PCI: Linking AER extended capability on 0000:00:0b.0
[ 45.271910] PCI: Linking AER extended capability on 0000:00:0c.0
[ 45.277953] PCI: Linking AER extended capability on 0000:00:0d.0
[ 45.283997] PCI: Linking AER extended capability on 0000:00:0e.0
[ 45.290391] pcie_portdrv_probe->Dev[005d:10de] has invalid IRQ. Check vendor BIOS
[ 45.297947] assign_interrupt_mode Found MSI capability
[ 45.303373] pcie_portdrv_probe->Dev[005d:10de] has invalid IRQ. Check vendor BIOS
[ 45.310925] assign_interrupt_mode Found MSI capability
[ 45.316307] pcie_portdrv_probe->Dev[005d:10de] has invalid IRQ. Check vendor BIOS
[ 45.323868] assign_interrupt_mode Found MSI capability
[ 45.329257] pcie_portdrv_probe->Dev[005d:10de] has invalid IRQ. Check vendor BIOS
[ 45.336810] assign_interrupt_mode Found MSI capability
[ 45.342220] ACPI: Power Button (FF) [PWRF]
[ 45.346370] ACPI: Power Button (CM) [PWRB]
[ 45.350530] ACPI: Fan [FAN] (on)
[ 45.353930] Using specific hotkey driver
[ 45.359101] ACPI: Thermal Zone [THRM] (40 C)
[ 45.416089] DoubleTalk PC - not found
[ 45.419902] Linux agpgart interface v0.101 (c) Dave Jones
[ 45.425355] [drm] Initialized drm 1.0.1 20051102
[ 45.430475] ACPI: PCI Interrupt Link [APC3] enabled at IRQ 18
[ 45.436263] GSI 16 sharing vector 0xD9 and IRQ 16
[ 45.441006] ACPI: PCI Interrupt 0000:01:00.0[A] -> Link [APC3] -> GSI 18 (level, low) -> IRQ 217
[ 45.450089] [drm] Initialized radeon 1.25.0 20060524 on minor 0
[ 45.456053] Hangcheck: starting hangcheck timer 0.9.0 (tick is 180 seconds, mar8126] hda: PLEXTOR DVDR PX-716AL, ATAPI CD/DVD-ROM drive
[ 47.682452] ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
[ 47.688236] ACPI: PCI Interrupt Link [APSI] enabled at IRQ 22
[ 47.694025] GSI 18 sharing vector 0xE9 and IRQ 18
[ 47.698771] ACPI: PCI Interrupt 0000:00:07.0[A] -> Link [APSI] -> GSI 22 (level, low) -> IRQ 233
[ 47.708344] ata1: SATA max UDMA/133 cmd 0x9F0 ctl 0xBF2 bmdma 0xD800 irq 233
[ 47.715516] ata2: SATA max UDMA/133 cmd 0x970 ctl 0xB72 bmdma 0xD808 irq 233
[ 47.722622] scsi0 : sata_nv
[ 48.179012] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 48.185786] ata1.00: ATA-7, max UDMA/133, 320170943 sectors: LBA48
[ 48.192092] ata1.00: ata1: dev 0 multi count 16
[ 48.197350] ata1.00: configured for UDMA/133
[ 48.201666] scsi1 : sata_nv
[ 48.658642] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 48.665426] ata2.00: ATA-7, max UDMA/133, 320173056 sectors: LBA48
[ 48.671729] ata2.00: ata2: dev 0 multi count 16
[ 48.677011] ata2.00: configured for UDMA/133
[ 48.681442] Vendor: ATA Model: Maxtor 6Y160M0 Rev: YAR5
[ 48.688962] Type: Direct-Access ANSI SCSI revision: 05
[ 48.696766] Vendor: ATA Model: Maxtor 6Y160M0 Rev: YAR5
[ 48.704266] Type: Direct-Access ANSI SCSI revision: 05
[ 48.712743] ACPI: PCI Interrupt Link [APSJ] enabled at IRQ 21
[ 48.718535] GSI 19 sharing vector 0x32 and IRQ 19
[ 48.723277] ACPI: PCI Interrupt 0000:00:08.0[A] -> Link [APSJ] -> GSI 21 (level, low) -> IRQ 50
[ 48.732468] ata3: SATA max UDMA/133 cmd 0x9E0 ctl 0xBE2 bmdma 0xC400 irq 50
[ 48.739559] ata4: SATA max UDMA/133 cmd 0x960 ctl 0xB62 bmdma 0xC408 irq 50
[ 48.746569] scsi2 : sata_nv
[ 49.052326] ata3: SATA link down (SStatus 0 SControl 300)
[ 49.057767] scsi3 : sata_nv
[ 49.364085] ata4: SATA link down (SStatus 0 SControl 300)
[ 49.369818] SCSI device sda: 320170943 512-byte hdwr sectors (163928 MB)
[ 49.376571] sda: Write Protect is off
[ 49.380346] SCSI device sda: drive cache: write back
[ 49.385456] SCSI device sda: 320170943 512-byte h04] usbmon: debugfs is not available
[ 49.580828] ACPI: PCI Interrupt Link [APCL] enabled at IRQ 20
[ 49.586619] GSI 20 sharing vector 0x3A and IRQ 20
[ 49.591362] ACPI: PCI Interrupt 0000:00:02.1[B] -> Link [APCL] -> GSI 20 (level, low) -> IRQ 58
[ 49.600488] ehci_hcd 0000:00:02.1: EHCI Host Controller
[ 49.605980] ehci_hcd 0000:00:02.1: new USB bus registered, assigned bus number 1
[ 49.613499] ehci_hcd 0000:00:02.1: debug port 1
[ 49.618083] ehci_hcd 0000:00:02.1: irq 58, io mem 0xd8205000
[ 49.623781] ehci_hcd 0000:00:02.1: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
[ 49.631593] usb usb1: configuration #1 chosen from 1 choice
[ 49.637309] hub 1-0:1.0: USB hub found
[ 49.641112] hub 1-0:1.0: 10 ports detected
[ 49.747731] ACPI: PCI Interrupt Link [APCF] enabled at IRQ 23
[ 49.753517] ACPI: PCI Interrupt 0000:00:02.0[A] -> Link [APCF] -> GSI 23 (level, low) -> IRQ 225
[ 49.762686] ohci_hcd 0000:00:02.0: OHCI Host Controller
[ 49.768136] ohci_hcd 0000:00:02.0: new USB bus registered, assigned bus number 2
[ 49.775629] ohci_hcd 0000:00:02.0: irq 225, io mem 0xd8204000
[ 49.834982] usb usb2: configuration #1 chosen from 1 choice
[ 49.840699] hub 2-0:1.0: USB hub found
[ 49.844503] hub 2-0:1.0: 10 ports detected
[ 49.949873] USB Universal Host Controller Interface driver v3.0
[ 49.955832] usb 1-2: new high speed USB device using ehci_hcd and address 2
[ 50.081412] usb 1-2: configuration #1 chosen from 1 choice
[ 50.605132] usb 1-6: new high speed USB device using ehci_hcd and address 5
[ 50.741923] usb 1-6: configuration #1 chosen from 1 choice
[ 50.965854] usb 2-3: new low speed USB device using ohci_hcd and address 2
[ 51.118991] usb 2-3: configuration #1 chosen from 1 choice
[ 51.343563] usb 2-4: new full speed USB device using ohci_hcd and address 3
[ 51.494714] usb 2-4: configuration #1 chosen from 1 choice
[ 51.501587] hub 2-4:1.0: USB hub found
[ 51.506489] hub 2-4:1.0: 3 ports detected
[ 51.808264] usb 2-4.1: new full speed USB device using ohci_hcd and address 4
[ 51.912394] usb 2-4.1: configuration #1_MD_DEVS=256, MD_SB_DISKS=27
[ 52.040132] md: bitmap version 4.39
[ 52.044006] device-mapper: ioctl: 4.7.0-ioctl (2006-06-24) initialised: [email protected]
[ 52.052486] Advanced Linux Sound Architecture Driver Version 1.0.12rc1 (Thu Jun 22 13:55:50 2006 UTC).
[ 52.062048] ALSA device list:
[ 52.065065] No soundcards found.
[ 52.068580] TCP bic registered
[ 52.071700] NET: Registered protocol family 1
[ 52.076497] md: Autodetecting RAID arrays.
[ 52.133378] md: autorun ...
[ 52.136222] md: considering sdb3 ...
[ 52.139871] md: adding sdb3 ...
[ 52.143140] md: sdb1 has different UUID to sdb3
[ 52.147737] md: adding sda3 ...
[ 52.151014] md: sda1 has different UUID to sdb3
[ 52.155722] md: created md1
[ 52.158567] md: bind<sda3>
[ 52.161333] md: bind<sdb3>
[ 52.164111] md: running: <sdb3><sda3>
[ 52.168257] raid1: raid set md1 active with 2 out of 2 mirrors
[ 52.174372] md: considering sdb1 ...
[ 52.178015] md: adding sdb1 ...
[ 52.181313] md: adding sda1 ...
[ 52.184588] md: created md0
[ 52.187431] md: bind<sda1>
[ 52.190206] md: bind<sdb1>
[ 52.192977] md: running: <sdb1><sda1>
[ 52.196832] md: md0: raid array is not clean -- starting background reconstruction
[ 52.204791] raid1: raid set md0 active with 2 out of 2 mirrors
[ 52.210707] md: syncing RAID array md0
[ 52.214504] md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
[ 52.214911] md: ... autorun DONE.
[ 52.224941] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reconstruction.
[ 52.235105] md: using 128k window, over a total of 29294400 blocks.
[ 52.298365] Filesystem "md0": Disabling barriers, not supported by the underlying device
[ 52.340686] XFS mounting filesystem md0
[ 52.454886] Starting XFS recovery on filesystem: md0 (logdev: internal)
[ 54.704519] Ending XFS recovery on filesystem: md0 (logdev: internal)
[ 54.711153] VFS: Mounted root (xfs filesystem) readonly.
[ 54.716593] Freeing unused kernel memory: 216k freed
[ 68.321317] lp: driver loaded but no devices found

[ 1460.922839] general protection fault: 0000 [1] PREEMPT SMP
[ 1460.928466] CPU 1
[ 1460.930492] Modules linked in: raid456 xor binfmt_misc ipv6 container battery asus_acpi ac nfs lockd sunrpc af_packet rtc parport_pc lp parport
[ 1460.943621] Pid: 27424, comm: cc1 Not tainted 2.6.18.3-xenon64-smp.9 #1
[ 1460.950228] RIP: 0010:[<ffffffff810359d0>] [<ffffffff810359d0>] cache_free_debugcheck+0x190/0x280
[ 1460.959193] RSP: 0018:ffff810022163ee8 EFLAGS: 00010046
[ 1460.964503] RAX: ffffff557fffffff RBX: ffff81003ffe9700 RCX: 0000000000000003
[ 1460.971629] RDX: ffffffff8101ae5f RSI: 0000000080042800 RDI: ffff810021921000
[ 1460.978757] RBP: ffff810021921000 R08: 0000000000000000 R09: ffff81002192100e
[ 1460.985883] R10: 000000000000000f R11: ffffffff81030160 R12: ffff81003ffe4980
[ 1460.993003] R13: 0000000000000246 R14: 89e8ae0f0054f325 R15: ffffffff8101ae5f
[ 1461.000130] FS: 00002b6778ac26d0(0000) GS:ffff81003ffac6d0(0000) knlGS:0000000000000000
[ 1461.008209] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1461.013951] CR2: 00002b6779117018 CR3: 000000001d984000 CR4: 00000000000006e0
[ 1461.021079] Process cc1 (pid: 27424, threadinfo ffff810022162000, task ffff810029d847d0)
[ 1461.029157] Stack: ffff810028be73e8 ffff81003ffe9700 ffff81003ffe4980 ffff810021921000
[ 1461.037228] 0000000000000246 ffff810022699530 0000000000000000 ffffffff81007745
[ 1461.044677] ffff810021921000 0000000000000020 ffff810021921000 ffff8100276a34b8
[ 1461.051933] Call Trace:
[ 1461.054572] [<ffffffff81007745>] kmem_cache_free+0xb5/0x110
[ 1461.060225] [<ffffffff8101ae5f>] do_sys_open+0xcf/0xf0
[ 1461.065448] [<ffffffff8106911e>] system_call+0x7e/0x83
[ 1461.070668]
[ 1461.072155]
[ 1461.072156] Code: 49 8b 7e 18 41 8b 4c 24 4c 89 e8 31 d2 29 f8 f7 f1 41 3b 44
[ 1461.081214] RIP [<ffffffff810359d0>] cache_free_debugcheck+0x190/0x280
[ 1461.087839] RSP <ffff810022163ee8>
[ 1461.091321] <3>BUG: sleeping function called from invalid context at kernel/rwsem.c:20
[ 1461.099333] in_atomic():0, irqs_disabled():1
[ 1461.103602]
[ 1461.103603] Call Trace:
[ 1461.107545] [<ffffffff810a69a5>] down_read+0x15/0x20
[ 1461.112595] [<ffffffff8109dd40>] blocking_notifier_call_chain+0x20/0x50
[ 1461.119289] [<ffffffff81015fb2>] do_exit+0x22/0x980
[ 1461.124254] [<ffffffff81204c85>] do_unblank_screen+0x85/0x140
[ 1461.130080] [<ffffffff8101ae5f>] do_sys_open+0xcf/0xf0
[ 1461.135303] [<ffffffff81077111>] die+0x51/0x60
[ 1461.139831] [<ffffffff8106ff1d>] do_general_protection+0x10d/0x130
[ 1461.146092] [<ffffffff81069e1d>] error_exit+0x0/0x84
[ 1461.151140] [<ffffffff8101ae5f>] do_sys_open+0xcf/0xf0
[ 1461.156363] [<ffffffff81030160>] dummy_inode_permission+0x0/0x10
[ 1461.162451] [<ffffffff8101ae5f>] do_sys_open+0xcf/0xf0
[ 1461.167673] [<ffffffff810359d0>] cache_free_debugcheck+0x190/0x280
[ 1461.173935] [<ffffffff8103587d>] cache_free_debugcheck+0x3d/0x280
[ 1461.180110] [<ffffffff81007745>] kmem_cache_free+0xb5/0x110
[ 1461.185765] [<ffffffff8101ae5f>] do_sys_open+0xcf/0xf0
[ 1461.190988] [<ffffffff8106911e>] system_call+0x7e/0x83
[ 1461.196208]
[ 1461.197744] general protection fault: 0000 [2] PREEMPT SMP
[ 1461.203359] CPU 1
[ 1461.205385] Modules linked in: raid456 xor binfmt_misc ipv6 container battery asus_acpi ac nfs lockd sunrpc af_packet rtc parport_pc lp parport
[ 1461.218516] Pid: 27424, comm: cc1 Not tainted 2.6.18.3-xenon64-smp.9 #1
[ 1461.225123] RIP: 0010:[<ffffffff810359d0>] [<ffffffff810359d0>] cache_free_debugcheck+0x190/0x280
[ 1461.234086] RSP: 0018:ffff81003ff1fe90 EFLAGS: 00010002
[ 1461.239387] RAX: ffff81003f1d3ea0 RBX: 00000000170fc2a5 RCX: ffff81003ff1ff10
[ 1461.246515] RDX: ffffffff810a1e01 RSI: 0000000000052c00 RDI: ffff81003ffe4840
[ 1461.253642] RBP: ffff81003f1d3d78 R08: 0000000000000102 R09: ffff8100025cb680
[ 1461.260770] R10: 0000000000000001 R11: ffffffff81081a20 R12: ffff81003ffe4840
[ 1461.267897] R13: 00000000170fc2a5 R14: 480039bc3c056348 R15: ffffffff810a1e05
[ 1461.275023] FS: 00002b6778ac26d0(0000) GS:ffff81003ffac6d0(0000) knlGS:0000000000000000
[ 1461.283102] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1461.288838] CR2: 00002b6779117018 CR3: 000000001d984000 CR4: 00000000000006e0
[ 1461.295966] Process cc1 (pid: 27424, threadinfo ffff810022162000, task ffff810029d847d0)
[ 1461.304043] Stack: ffff81003ff1ff00 ffff81003ffa80d8 ffff81003ffe4840 ffff81003f1d3d80
[ 1461.312114] 0000000000000246 000000000000000a ffffffff8101ae5f ffffffff81007745
[ 1461.319563] 0000000000000282 ffff81001d1538c0 ffff8100025cbb60 0000000000000001
[ 1461.326819] Call Trace:
[ 1461.329451] <IRQ> [<ffffffff8101ae5f>] do_sys_open+0xcf/0xf0
[ 1461.335214] [<ffffffff81007745>] kmem_cache_free+0xb5/0x110
[ 1461.340869] [<ffffffff810a1e05>] __rcu_process_callbacks+0x145/0x1e0
[ 1461.347303] [<ffffffff810a1ec3>] rcu_process_callbacks+0x23/0x50
[ 1461.353393] [<ffffffff81096226>] tasklet_action+0x66/0xb0
[ 1461.358873] [<ffffffff810128cb>] __do_softirq+0x5b/0xd0
[ 1461.364185] [<ffffffff8106a324>] call_softirq+0x1c/0x28
[ 1461.369490] [<ffffffff81077edc>] do_softirq+0x2c/0x90
[ 1461.374626] [<ffffffff810961af>] irq_exit+0x3f/0x50
[ 1461.379588] [<ffffffff81069cc2>] apic_timer_interrupt+0x66/0x6c
[ 1461.385589] <EOI> [<ffffffff81081a20>] flat_send_IPI_mask+0x0/0x50
[ 1461.391870] [<ffffffff8106f24f>] _spin_unlock_irq+0xf/0x40
[ 1461.397439] [<ffffffff8106f249>] _spin_unlock_irq+0x9/0x40
[ 1461.403007] [<ffffffff8106e701>] __down_read+0x31/0xa3
[ 1461.408227] [<ffffffff8101ae5f>] do_sys_open+0xcf/0xf0
[ 1461.413449] [<ffffffff81076ea2>] dump_stack+0x12/0x20
[ 1461.418586] [<ffffffff8100b830>] __might_sleep+0xb0/0xc0
[ 1461.423982] [<ffffffff8109dd40>] blocking_notifier_call_chain+0x20/0x50
[ 1461.430677] [<ffffffff81015fb2>] do_exit+0x22/0x980
[ 1461.435638] [<ffffffff81204c85>] do_unblank_screen+0x85/0x140
[ 1461.441467] [<ffffffff8101ae5f>] do_sys_open+0xcf/0xf0
[ 1461.446688] [<ffffffff81077111>] die+0x51/0x60
[ 1461.451219] [<ffffffff8106ff1d>] do_general_protection+0x10d/0x130
[ 1461.457479] [<ffffffff81069e1d>] error_exit+0x0/0x84
[ 1461.462528] [<ffffffff8101ae5f>] do_sys_open+0xcf/0xf0
[...] dummy_inode_permission+0x0/0x10
[ 1461.473838] [<ffffffff8101ae5f>] do_sys_open+0xcf/0xf0
[ 1461.479062] [<ffffffff810359d0>] cache_free_debugcheck+0x190/0x280
[ 1461.485323] [<ffffffff8103587d>] cache_free_debugcheck+0x3d/0x280
[ 1461.491496] [<ffffffff81007745>] kmem_cache_free+0xb5/0x110
[ 1461.497152] [<ffffffff8101ae5f>] do_sys_open+0xcf/0xf0
[ 1461.502373] [<ffffffff8106911e>] system_call+0x7e/0x83
[ 1461.507597]
[ 1461.509093]
[ 1461.509093] Code: 49 8b 7e 18 41 8b 4c 24 4c 89 e8 31 d2 29 f8 f7 f1 41 3b 44
[ 1461.518151] RIP [<ffffffff810359d0>] cache_free_debugcheck+0x190/0x280
[ 1461.524777] RSP <ffff81003ff1fe90>
[ 1461.528267] <0>Kernel panic - not syncing: Aiee, killing interrupt handler!
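
A note on reading the two faults above. The "Code:" bytes begin with 49 8b 7e 18, which decodes as "mov 0x18(%r14),%rdi", assuming the bytes start at the faulting RIP (as this kernel's x86-64 oops output appears to print them). R14 holds 89e8ae0f0054f325 in the first fault and 480039bc3c056348 in the second. Neither is a canonical x86-64 address, which would explain a general protection fault rather than a page fault. The sketch below shows the check, assuming 48-bit virtual addresses; the helper name is hypothetical and this is an observation about the dump, not a diagnosis.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Return True if bits 63..(va_bits - 1) of a 64-bit address are all equal.

    With 48-bit virtual addresses (an assumption about this CPU), bits 63..47
    must be all zeros (user half) or all ones (kernel half).
    """
    top = addr >> (va_bits - 1)                 # keep only the sign-extension bits
    all_ones = (1 << (64 - va_bits + 1)) - 1    # 17 one-bits when va_bits is 48
    return top == 0 or top == all_ones


# Register values copied from the two faults in the log above.
registers = {
    "R14 (first fault)": 0x89E8AE0F0054F325,
    "R14 (second fault)": 0x480039BC3C056348,
    "RDI (first fault)": 0xFFFF810021921000,
    "CR2 (first fault)": 0x00002B6779117018,
}

for name, value in registers.items():
    status = "canonical" if is_canonical(value) else "NOT canonical"
    print(f"{name}: {value:#018x} {status}")
```

Only the two R14 values fail the check. Both look like random data rather than pointers, so the slab debug code may be following a pointer that was overwritten.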

--
WebSig: http://www.jukie.net/~bart/sig/


Attachments:
(No filename) (25.65 kB)
config-2.6.18.3 (52.72 kB)

2006-11-29 17:13:45

by Prakash Punnoor

Subject: Re: 2.6.18.3 SMP PREEMPT crashes (x86-64)

On Wednesday 29 November 2006 18:02, Bart Trojanowski wrote:
> Hi all,
>
> I've been getting spurious hangs in 2.6.18 lately... first I thought it
> was hardware but tried different replacement parts and memtest. Nothing
> showed any problems.
>
> I finally hooked up a serial console to the box and I see the following.
> I include the initial dmesg output to show what's in the machine.
>
> - Nforce4 based Shuttle XPC (PCIe, forcedeth, etc)

> [ 0.000000] Nvidia board detected. Ignoring ACPI timer override.

Does your BIOS have an option to enable the HPET? (Maybe after a BIOS
update?)

If not:

Try booting with noapic, compile the latest git kernel and boot it (w/o noapic).
The above message should then no longer appear, if I am not mistaken. Otherwise
you have to hack the kernel to not ignore the timer override.


But I could be mistaken...
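
To tell whether the message still appears after trying another kernel, the saved console output can be searched for it. A minimal sketch, assuming the boot log has been captured as text (for example from the serial console); the function name is hypothetical and the message string is copied from the log quoted above:

```python
# Message printed by the 2.6.18.3 boot log quoted above.
OVERRIDE_MSG = "Ignoring ACPI timer override"


def timer_override_ignored(boot_log: str) -> bool:
    """Return True if any line of the boot log reports the ignored override."""
    return any(OVERRIDE_MSG in line for line in boot_log.splitlines())


# Two lines taken from the boot log in the original report.
sample = (
    "[ 0.000000] Nvidia board detected. Ignoring ACPI timer override.\n"
    "[ 0.000000] ACPI: PM-Timer IO Port: 0x4008\n"
)
print(timer_override_ignored(sample))
```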
--
(?= =?)
//\ Prakash Punnoor /\\
V_/ \_V


Attachments:
(No filename) (924.00 B)
(No filename) (189.00 B)

2006-11-29 17:31:32

by Bart Trojanowski

Subject: Re: 2.6.18.3 SMP PREEMPT crashes (x86-64)

* Bart Trojanowski <[email protected]> [061129 12:02]:
> I finally hooked up a serial console to the box and I see the following.
> I include the initial dmesg output to show what's in the machine.

This time I booted with "maxcpus=1" and proceeded to build 2.6.19-rc again.
Again a crash... as it's different, I include the console output:

[ 2496.851726] Unable to handle kernel paging request at ffff8100aa5b69b0 RIP:
[ 2496.856340] [<ffffffff8108dcd8>] __wake_up_common+0x28/0x80
[ 2496.864454] PGD 8063 PUD 0
[ 2496.867278] Oops: 0000 [1] PREEMPT SMP
[ 2496.871158] CPU 0
[ 2496.873192] Modules linked in: ide_cd cdrom binfmt_misc ipv6 container battery asus_acpi ac nfs lockd sunrpc af_packet rtc parport_pc lp parport
[ 2496.886410] Pid: 31767, comm: ruby Not tainted 2.6.18.3-xenon64-smp.9 #1
[ 2496.893103] RIP: 0010:[<ffffffff8108dcd8>] [<ffffffff8108dcd8>] __wake_up_common+0x28/0x80
[ 2496.901469] RSP: 0000:ffff81002992dcc8 EFLAGS: 00010006
[ 2496.906777] RAX: ffff8100aa5b69b0 RBX: ffff8100025b6998 RCX: 0000000000000000
[ 2496.913904] RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff8100025b6998
[ 2496.921024] RBP: ffff81002992dcf8 R08: ffff81002992dd48 R09: ffff8100022cd210
[ 2496.928151] R10: 00002b2a49319d20 R11: 0000000000000001 R12: ffff81002992dd48
[ 2496.935278] R13: 0000000000000001 R14: ffff8100025b69b0 R15: ffff81002992dd48
[ 2496.942398] FS: 00002b2a49319c90(0000) GS:ffffffff81560000(0000) knlGS:0000000000000000
[ 2496.950477] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 2496.956220] CR2: ffff8100aa5b69b0 CR3: 0000000024cb7000 CR4: 00000000000006e0
[ 2496.963347] Process ruby (pid: 31767, threadinfo ffff81002992c000, task ffff81001b33e790)
[ 2496.971512] Stack: 0000000300000000 ffff8100025b6998 ffff81002992dd48 0000000000000001
[ 2496.979584] 0000000000000213 0000000000000003 ffff81002992dd38 ffffffff81031ed3
[ 2496.987031] ffff810001b39540 0000000000000000 ffff8100022cd210 ffff8100346c8cb0
[ 2496.994288] Call Trace:
[ 2496.996926] [<ffffffff81031ed3>] __wake_up+0x43/0x70
[ 2497.001973] [<ffffffff8100c438>] __wake_up_bit+0x28/0x30
[ 2497.007368] [<ffffffff81011644>] do_wp_page+0x154/0x450
[ 2497.012677] [<ffffffff81009178>] __handle_mm_fault+0x9d8/0xaa0
[ 2497.018593] [<ffffffff8106ec46>] _spin_lock_irqsave+0x26/0x90
[ 2497.024420] [<ffffffff81071617>] do_page_fault+0x4a7/0x8c0
[ 2497.029993] [<ffffffff81069e1d>] error_exit+0x0/0x84
[ 2497.035038]
[ 2497.036532]
[ 2497.036533] Code: 48 8b 18 74 36 66 66 90 48 8d 78 e8 44 8b 60 e8 4c 89 f9 8b
[ 2497.045591] RIP [<ffffffff8108dcd8>] __wake_up_common+0x28/0x80
[ 2497.051612] RSP <ffff81002992dcc8>
[ 2497.055101] CR2: ffff8100aa5b69b0
[ 2497.058408] <3>BUG: sleeping function called from invalid context at kernel/rwsem.c:20
[ 2497.066420] in_atomic():1, irqs_disabled():1
[ 2497.070680]
[ 2497.070680] Call Trace:
[ 2497.074614] [<ffffffff810a69a5>] down_read+0x15/0x20
[ 2497.079663] [<ffffffff8109dd40>] blocking_notifier_call_chain+0x20/0x50
[ 2497.086358] [<ffffffff81015fb2>] do_exit+0x22/0x980
[ 2497.091323] [<ffffffff81204c85>] do_unblank_screen+0x85/0x140
[ 2497.097149] [<ffffffff81071929>] do_page_fault+0x7b9/0x8c0
[ 2497.102719] [<ffffffff8100a5fb>] get_page_from_freelist+0x27b/0x470
[ 2497.109067] [<ffffffff81069e1d>] error_exit+0x0/0x84
[ 2497.114117] [<ffffffff8108dcd8>] __wake_up_common+0x28/0x80
[ 2497.119771] [<ffffffff81031ed3>] __wake_up+0x43/0x70
[ 2497.124819] [<ffffffff8100c438>] __wake_up_bit+0x28/0x30
[ 2497.130214] [<ffffffff81011644>] do_wp_page+0x154/0x450
[ 2497.135522] [<ffffffff81009178>] __handle_mm_fault+0x9d8/0xaa0
[ 2497.141439] [<ffffffff8106ec46>] _spin_lock_irqsave+0x26/0x90
[ 2497.147266] [<ffffffff81071617>] do_page_fault+0x4a7/0x8c0
[ 2497.152838] [<ffffffff81069e1d>] error_exit+0x0/0x84
[ 2497.157883]
[ 2497.159424] note: ruby[31767] exited with preempt_count 2
[ 2507.151629] BUG: soft lockup detected on CPU#0!
[ 2507.156149]
[ 2507.156150] Call Trace:
[ 2507.160081] <IRQ> [<ffffffff810b6d69>] softlockup_tick+0xf9/0x140
[ 2507.166276] [<ffffffff8109a797>] update_process_times+0x57/0x90
[ 2507.172279] [<ffffffff8107f403>] smp_local_timer_interrupt+0x23/0x50
[ 2507.178712] [<ffffffff8107f968>] smp_apic_timer_interrupt+0x38/0x40
[ 2507.185059] [<ffffffff81069cc2>] apic_timer_interrupt+0x66/0x6c
[ 2507.191059] <EOI> [<ffffffff8106f040>] _spin_lock+0x50/0x70
[ 2507.196734] [<ffffffff8106f02c>] _spin_lock+0x3c/0x70
[ 2507.201869] [<ffffffff810080f3>] unmap_vmas+0x763/0x7c0
[ 2507.207181] [<ffffffff8103f991>] exit_mmap+0x81/0x130
[ 2507.212313] [<ffffffff81042368>] mmput+0x48/0xe0
[ 2507.217016] [<ffffffff810161bc>] do_exit+0x22c/0x980
[ 2507.222065] [<ffffffff81204c85>] do_unblank_screen+0x85/0x140
[ 2507.227894] [<ffffffff81071929>] do_page_fault+0x7b9/0x8c0
[ 2507.233463] [<ffffffff8100a5fb>] get_page_from_freelist+0x27b/0x470
[ 2507.239811] [<ffffffff81069e1d>] error_exit+0x0/0x84
[ 2507.244861] [<ffffffff8108dcd8>] __wake_up_common+0x28/0x80
[ 2507.250516] [<ffffffff81031ed3>] __wake_up+0x43/0x70
[ 2507.255563] [<ffffffff8100c438>] __wake_up_bit+0x28/0x30
[ 2507.260958] [<ffffffff81011644>] do_wp_page+0x154/0x450
[ 2507.266268] [<ffffffff81009178>] __handle_mm_fault+0x9d8/0xaa0
[ 2507.272183] [<ffffffff8106ec46>] _spin_lock_irqsave+0x26/0x90
[ 2507.278010] [<ffffffff81071617>] do_page_fault+0x4a7/0x8c0
[ 2507.283584] [<ffffffff81069e1d>] error_exit+0x0/0x84
[ 2507.288628]
[ 2517.143985] BUG: soft lockup detected on CPU#0!
[ 2517.148512]
[ 2517.148513] Call Trace:
[ 2517.152444] <IRQ> [<ffffffff810b6d69>] softlockup_tick+0xf9/0x140
[ 2517.158630] [<ffffffff8109a797>] update_process_times+0x57/0x90
[ 2517.164632] [<ffffffff8107f403>] smp_local_timer_interrupt+0x23/0x50
[ 2517.171066] [<ffffffff8107f968>] smp_apic_timer_interrupt+0x38/0x40
[ 2517.177414] [<ffffffff81069cc2>] apic_timer_interrupt+0x66/0x6c
[ 2517.183412] <EOI> [<ffffffff8106f042>] _spin_lock+0x52/0x70
[ 2517.189087] [<ffffffff8106f02c>] _spin_lock+0x3c/0x70
[ 2517.194222] [<ffffffff810080f3>] unmap_vmas+0x763/0x7c0
[ 2517.199536] [<ffffffff8103f991>] exit_mmap+0x81/0x130
[ 2517.204667] [<ffffffff81042368>] mmput+0x48/0xe0
[ 2517.209370] [<ffffffff810161bc>] do_exit+0x22c/0x980
[ 2517.214419] [<ffffffff81204c85>] do_unblank_screen+0x85/0x140
[ 2517.220248] [<ffffffff81071929>] do_page_fault+0x7b9/0x8c0
[ 2517.225817] [<ffffffff8100a5fb>] get_page_from_freelist+0x27b/0x470
[ 2517.232166] [<ffffffff81069e1d>] error_exit+0x0/0x84
[ 2517.237215] [<ffffffff8108dcd8>] __wake_up_common+0x28/0x80
[ 2517.242869] [<ffffffff81031ed3>] __wake_up+0x43/0x70
[ 2517.247917] [<ffffffff8100c438>] __wake_up_bit+0x28/0x30
[ 2517.253311] [<ffffffff81011644>] do_wp_page+0x154/0x450
[ 2517.258620] [<ffffffff81009178>] __handle_mm_fault+0x9d8/0xaa0
[ 2517.264537] [<ffffffff8106ec46>] _spin_lock_irqsave+0x26/0x90
[ 2517.270366] [<ffffffff81071617>] do_page_fault+0x4a7/0x8c0
[ 2517.275937] [<ffffffff81069e1d>] error_exit+0x0/0x84
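
The reports above recur on CPU#0 roughly every ten seconds with the same trace (unmap_vmas spinning in _spin_lock). A minimal sketch for checking that spacing in a captured console log follows; the helper names are hypothetical, and the two sample lines are copied from the capture above.

```python
import re

# Two soft-lockup header lines copied from the console capture above.
LOG = """\
[ 2507.151629] BUG: soft lockup detected on CPU#0!
[ 2517.143985] BUG: soft lockup detected on CPU#0!
"""

# Matches the printk timestamp and the CPU number of each report header.
LOCKUP = re.compile(
    r"^\[\s*(\d+\.\d+)\] BUG: soft lockup detected on CPU#(\d+)!", re.M
)

def lockup_reports(text):
    """Return (timestamp_seconds, cpu) for every soft-lockup header in text."""
    return [(float(ts), int(cpu)) for ts, cpu in LOCKUP.findall(text)]

def intervals(reports):
    """Return the seconds elapsed between consecutive reports."""
    times = [ts for ts, _ in reports]
    return [round(b - a, 6) for a, b in zip(times, times[1:])]

reports = lockup_reports(LOG)
print(reports)
print(intervals(reports))
```

Feeding the whole serial capture through lockup_reports() instead of the two sample lines would show whether every report names the same CPU.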

--
WebSig: http://www.jukie.net/~bart/sig/

2006-11-29 19:08:18

by Bart Trojanowski

[permalink] [raw]
Subject: Re: 2.6.18.3 soft lockups and 2.6.19-rc6 rwlock OOPs on SMP x86-64

One more update. I rebuilt 2.6.19-rc6 and turned off CONFIG_PREEMPT_BKL.
It didn't help much.

I am still getting BUG dumps from the kernel; however, they don't cause
freezes immediately. It takes a few minutes of console output before
the computer stops responding to my input.

As I was writing the above I tried building again under 2.6.19-rc6. The
kernel OOPSed with a NULL pointer dereference.

-Bart

[ 1273.206537] Unable to handle kernel NULL pointer dereference at 0000000000000005 RIP:
[ 1273.212025] [<ffffffff8100784f>] _raw_spin_lock+0x1f/0x140
[ 1273.220130] PGD 15465067 PUD 15427067 PMD 0
[ 1273.224477] Oops: 0000 [1] SMP
[ 1273.227681] CPU 0
[ 1273.229733] Modules linked in: ide_cd cdrom binfmt_misc ipv6 container battery asus_acpi ac nfs lockd sunrpc af_packet rtc parport_pc lp parport
[ 1273.243012] Pid: 13785, comm: make Not tainted 2.6.19-rc6-xenon64-smp.10 #1
[ 1273.249991] RIP: 0010:[<ffffffff8100784f>] [<ffffffff8100784f>] _raw_spin_lock+0x1f/0x140
[ 1273.258313] RSP: 0018:ffffffff8155af38 EFLAGS: 00010092
[ 1273.263649] RAX: 0000000000009e40 RBX: 0000000000000001 RCX: 0000000000000000
[ 1273.270802] RDX: 0000000000000001 RSI: ffffffff814f7d80 RDI: 0000000000000001
[ 1273.277955] RBP: ffffffff81447060 R08: 0000000000000001 R09: 0000000000000000
[ 1273.285107] R10: 0000000000004008 R11: 0000000000000000 R12: 0000000000000000
[ 1273.292253] R13: 0000000000000001 R14: 00000000ffffffff R15: 0000000000000001
[ 1273.299407] FS: 00002b6c23c6cae0(0000) GS:ffffffff814f7000(0000) knlGS:0000000000000000
[ 1273.307530] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1273.313296] CR2: 0000000000000005 CR3: 0000000014066000 CR4: 00000000000006e0
[ 1273.320442] Process make (pid: 13785, threadinfo ffff8100132d2000, task ffff81001845f0c0)
[ 1273.328651] Stack: ffffffff814f7d80 ffffffff81447060 0000000000000000 0000000000000001
[ 1273.336784] 00000000ffffffff ffffffff810b6f09 ffffffff8106624c ffffffff81554c30
[ 1273.344292] 0000000000000020 0000000000000000 ffff8100185cbda0 ffffffff81073f0a
[ 1273.351584] Call Trace:
[ 1273.354269] <IRQ> [<ffffffff810b6f09>] handle_edge_irq+0x119/0x150
[ 1273.360672] [<ffffffff8106624c>] call_softirq+0x1c/0x28
[ 1273.366007] [<ffffffff81073f0a>] do_IRQ+0x8a/0xe0
[ 1273.370819] [<ffffffff81065641>] ret_from_intr+0x0/0xa
[ 1273.376066] <EOI> [<ffffffff81069f67>] __mutex_lock_slowpath+0x1d7/0x1f0
[ 1273.382990] [<ffffffff81023409>] dbg_redzone1+0x19/0x30
[ 1273.388323] [<ffffffff810cfa45>] pipe_read+0x75/0x400
[ 1273.393485] [<ffffffff8106b129>] _spin_unlock_irq+0x9/0x10
[ 1273.399079] [<ffffffff8100ca1f>] do_sync_read+0xcf/0x120
[ 1273.404503] [<ffffffff810a06a0>] autoremove_wake_function+0x0/0x30
[ 1273.410787] [<ffffffff8103f739>] remove_wait_queue+0x19/0x60
[ 1273.416554] [<ffffffff8106b129>] _spin_unlock_irq+0x9/0x10
[ 1273.422151] [<ffffffff8102b33a>] do_sigaction+0x1aa/0x1d0
[ 1273.427656] [<ffffffff8100b08b>] vfs_read+0xdb/0x1a0
[ 1273.432733] [<ffffffff81011db3>] sys_read+0x53/0x90
[ 1273.437720] [<ffffffff8106511e>] system_call+0x7e/0x83
[ 1273.442967]
[ 1273.444481]
[ 1273.444482] Code: 81 7f 04 ad 4e ad de 74 0c 48 c7 c6 dc 30 3c 81 e8 1c c2 00
[ 1273.453619] RIP [<ffffffff8100784f>] _raw_spin_lock+0x1f/0x140
[ 1273.459577] RSP <ffffffff8155af38>
[ 1273.463085] CR2: 0000000000000005
[ 1273.466569] <3>BUG: sleeping function called from invalid context at kernel/rwsem.c:20
[ 1273.474672] in_atomic():1, irqs_disabled():1
[ 1273.478984]
[ 1273.478985] Call Trace:
[ 1273.483002] <IRQ> [<ffffffff810a2e85>] down_read+0x15/0x30
[ 1273.488747] [<ffffffff81099b80>] blocking_notifier_call_chain+0x20/0x50
[ 1273.495484] [<ffffffff81015c52>] do_exit+0x22/0x8c0
[ 1273.500491] [<ffffffff8106b169>] _spin_lock_irqsave+0x9/0x10
[ 1273.506277] [<ffffffff811d17b6>] vgacon_set_cursor_size+0x36/0xf0
[ 1273.512494] [<ffffffff8106d809>] do_page_fault+0x7a9/0x8b0
[ 1273.518107] [<ffffffff81063908>] blk_run_queue+0x28/0x80
[ 1273.523546] [<ffffffff812561bb>] scsi_next_command+0x3b/0x60
[ 1273.529330] [<ffffffff8106b58d>] error_exit+0x0/0x84
[ 1273.534422] [<ffffffff8100784f>] _raw_spin_lock+0x1f/0x140
[ 1273.540035] [<ffffffff810b6f09>] handle_edge_irq+0x119/0x150
[ 1273.545817] [<ffffffff8106624c>] call_softirq+0x1c/0x28
[ 1273.551171] [<ffffffff81073f0a>] do_IRQ+0x8a/0xe0
[ 1273.556002] [<ffffffff81065641>] ret_from_intr+0x0/0xa
[ 1273.561267] <EOI> [<ffffffff81069f67>] __mutex_lock_slowpath+0x1d7/0x1f0
[ 1273.568224] [<ffffffff81023409>] dbg_redzone1+0x19/0x30
[ 1273.573574] [<ffffffff810cfa45>] pipe_read+0x75/0x400
[ 1273.578754] [<ffffffff8106b129>] _spin_unlock_irq+0x9/0x10
[ 1273.584365] [<ffffffff8100ca1f>] do_sync_read+0xcf/0x120
[ 1273.589804] [<ffffffff810a06a0>] autoremove_wake_function+0x0/0x30
[ 1273.596108] [<ffffffff8103f739>] remove_wait_queue+0x19/0x60
[ 1273.601894] [<ffffffff8106b129>] _spin_unlock_irq+0x9/0x10
[ 1273.607505] [<ffffffff8102b33a>] do_sigaction+0x1aa/0x1d0
[ 1273.613030] [<ffffffff8100b08b>] vfs_read+0xdb/0x1a0
[ 1273.618124] [<ffffffff81011db3>] sys_read+0x53/0x90
[ 1273.623128] [<ffffffff8106511e>] system_call+0x7e/0x83
[ 1273.628393]
[ 1273.629960] Kernel panic - not syncing: Aiee, killing interrupt handler!
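
A note on reading the oops above: the fault address can be cross-checked against the register dump. The sketch below hand-decodes only the first instruction of the Code: line, assuming those bytes start at RIP. It is not a general disassembler and the function name is hypothetical. The bytes 81 7f 04 ad 4e ad de decode as a compare of the 32-bit word at [rdi+4] with 0xdead4ead, which matches the kernel's SPINLOCK_MAGIC debug value. With RDI = 1 that touches address 5, which is the value in CR2, so the lock pointer itself appears to have been bogus rather than NULL.

```python
# Values copied from the oops above.
CODE = "81 7f 04 ad 4e ad de 74 0c 48 c7 c6 dc 30 3c 81 e8 1c c2 00"
RDI = 0x0000000000000001
CR2 = 0x0000000000000005

def decode_cmp_rdi_disp8_imm32(code_bytes):
    """Hand-decode `cmp dword ptr [rdi+disp8], imm32` (opcode 81 /7).

    Handles only this one encoding and raises ValueError for anything else.
    """
    opcode, modrm, disp8 = code_bytes[0], code_bytes[1], code_bytes[2]
    if opcode != 0x81:
        raise ValueError("not an 81-group instruction")
    # ModR/M layout: mod (2 bits), reg/opcode extension (3 bits), r/m (3 bits).
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if (mod, reg, rm) != (1, 7, 7):
        raise ValueError("not cmp with a [rdi+disp8] operand")
    if disp8 >= 0x80:
        disp8 -= 0x100  # disp8 is sign-extended; here it is simply 4
    imm32 = int.from_bytes(code_bytes[3:7], "little")
    return disp8, imm32

raw = bytes.fromhex(CODE.replace(" ", ""))
disp, imm = decode_cmp_rdi_disp8_imm32(raw)
fault_addr = RDI + disp

print(f"cmp dword ptr [rdi+{disp:#x}], {imm:#x}")
print(f"effective address {fault_addr:#x}, CR2 {CR2:#x}, match={fault_addr == CR2}")
```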

--
WebSig: http://www.jukie.net/~bart/sig/

2006-11-29 22:40:10

by Bart Trojanowski

[permalink] [raw]
Subject: Re: 2.6.18.3 SMP PREEMPT crashes (x86-64)

Prakash Punnoor <prakash <at> punnoor.de> writes:
> Does your bios have the option to enable the hpet? (Maybe after a bios
> update?)

It does not.

> If not:
>
> Try booting with noapic, compile the latest git kernel and boot it (w/o noapic).
> The above message should now not appear, if I am not mistaken. Otherwise you have
> to hack the kernel to not ignore the timer override.

Worked as advertised. I booted with noapic maxcpus=1 and rebuilt the git repo.
Rebooted and ran through the same kind of stress tests. Seems to work.

Thanks a lot.



2006-11-30 19:02:29

by Bart Trojanowski

[permalink] [raw]
Subject: Re: 2.6.19 SMP x86-64 crashes continue

Hello,

I've been getting odd crashes in 2.6.18 and .19-rc's with a dual core
Opteron on an nForce chipset. I've been running with a serial console
capturing things.

Prakash suggested I try the git tree yesterday afternoon, and I've been
running v2.6.19-rc6-g1275361 ...that ended up being very close to the
2.6.19 release, which is perfect :)

2.6.19 seems to be a lot more stable with this configuration. 2.6.18.y
used to die after 5 minutes (make -j3 of the kernel); my new kernel ran
for about 24 hours.

Below I marked a few spots where I remember what I did. See EVENT1,
EVENT2, EVENT3, and EVENT4.

I am building the "real" 2.6.19 as I speak and will report if it
happens again.

-Bart

[ 0.000000] Command line: BOOT_IMAGE=dbg-v2.6.19-rc6 ro root=900 console=ttyS0,115200n8 console=tty0
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: 0000000000000000 - 000000000009f400 (usable)
[ 0.000000] BIOS-e820: 000000000009f400 - 00000000000a0000 (reserved)
[ 0.000000] BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
[ 0.000000] BIOS-e820: 0000000000100000 - 000000003fff0000 (usable)
[ 0.000000] BIOS-e820: 000000003fff0000 - 000000003fff3000 (ACPI NVS)
[ 0.000000] BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data)
[ 0.000000] BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
[ 0.000000] BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
[ 0.000000] end_pfn_map = 1048576
[ 0.000000] DMI 2.2 present.
[ 0.000000] Zone PFN ranges:
[ 0.000000] DMA 0 -> 4096
[ 0.000000] DMA32 4096 -> 1048576
[ 0.000000] Normal 1048576 -> 1048576
[ 0.000000] early_node_map[2] active PFN ranges
[ 0.000000] 0: 0 -> 159
[ 0.000000] 0: 256 -> 262128
[ 0.000000] Nvidia board detected. Ignoring ACPI timer override.
[ 0.000000] If you got timer trouble try acpi_use_timer_override
[ 0.000000] ACPI: PM-Timer IO Port: 0x4008
[ 0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
[ 0.000000] Processor #0 (Bootup-CPU)
[ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
[ 0.000000] Processor #1
[ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
[ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
[ 0.000000] ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
[ 0.000000] IOAPIC[0]: apic_id 2, address 0xfec00000, GSI 0-23
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[ 0.000000] ACPI: BIOS IRQ0 pin2 override ignored.
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 high edge)
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 15 global_irq 15 high edge)
[ 0.000000] Setting APIC routing to flat
[ 0.000000] Using ACPI (MADT) for SMP configuration information
[ 0.000000] Nosave address range: 000000000009f000 - 00000000000a0000
[ 0.000000] Nosave address range: 00000000000a0000 - 00000000000f0000
[ 0.000000] Nosave address range: 00000000000f0000 - 0000000000100000
[ 0.000000] Allocating PCI resources starting at 50000000 (gap: 40000000:a0000000)
[ 0.000000] PERCPU: Allocating 33472 bytes of per cpu data
[ 0.000000] Built 1 zonelists. Total pages: 258439
[ 0.000000] Kernel command line: BOOT_IMAGE=dbg-v2.6.19-rc6 ro root=900 console=ttyS0,115200n8 console=tty0
[ 0.000000] Initializing CPU#0
[ 0.000000] PID hash table entries: 4096 (order: 12, 32768 bytes)
[ 25.933238] Console: colour VGA+ 80x50
[ 26.210190] Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
[ 26.217960] Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
[ 26.224976] Checking aperture...
[ 26.228249] CPU 0: aperture @ 4000000 size 32 MB
[ 26.232906] Aperture too small (32 MB)
[ 26.241859] No AGP bridge found
[ 26.256938] Memory: 1025888k/1048512k available (3496k kernel code, 21972k reserved, 1517k data, 256k init)
[ 26.326412] Calibrating delay using timer specific routine.. 4021.81 BogoMIPS (lpj=2010905)
[ 26.334896] Security Framework v1.0.0 initialized
[ 26.339648] SELinux: Disabled at boot.
[ 26.343538] Mount-cache hash table entries: 256
[ 26.348246] CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
[ 26.355416] CPU: L2 Cache: 1024K (64 bytes/line)
[ 26.360075] CPU: Physical Processor ID: 0
[ 26.364127] CPU: Processor Core ID: 0
[ 26.367850] Freeing SMP alternatives: 32k freed
[ 26.372450] ACPI: Core revision 20060707
[ 26.392499] Using local APIC timer interrupts.
[ 26.446689] result 12564482
[ 26.449523] Detected 12.564 MHz APIC timer.
[ 26.454403] Booting processor 1/2 APIC 0x1
[ 26.468543] Initializing CPU#1
[ 26.528945] Calibrating delay using timer specific routine.. 4020.06 BogoMIPS (lpj=2010033)
[ 26.528951] CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
[ 26.528953] CPU: L2 Cache: 1024K (64 bytes/line)
[ 26.528956] CPU: Physical Processor ID: 0
[ 26.528957] CPU: Processor Core ID: 1
[ 26.529074] Dual Core AMD Opteron(tm) Processor 170 stepping 02
[ 26.529951] CPU 1: Syncing TSC to CPU 0.
[ 26.530321] CPU 1: synchronized TSC with CPU 0 (last diff -4 cycles, maxerr 466 cycles)
[ 26.530330] Brought up 2 CPUs
[ 26.582655] testing NMI watchdog ... OK.
[ 26.596663] Disabling vsyscall due to use of PM timer
[ 26.601756] time.c: Using 3.579545 MHz WALL PM GTOD PM timer.
[ 26.607540] time.c: Detected 2010.314 MHz processor.
[ 27.745837] migration_cost=452
[ 27.749382] NET: Registered protocol family 16
[ 27.753937] ACPI: bus type pci registered
[ 27.761377] PCI: Using MMCONFIG at e0000000
[ 27.765623] PCI: No mmconfig possible on device 00:18
[ 27.777974] ACPI: Interpreter enabled
[ 27.781680] ACPI: Using IOAPIC for interrupt routing
[ 27.787133] ACPI: PCI Root Bridge [PCI0] (0000:00)
[ 27.795708] PCI: Transparent bridge - 0000:00:09.0
[ 27.867353] ACPI: PCI Interrupt Link [LNK1] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
[ 27.876473] ACPI: PCI Interrupt Link [LNK2] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
[ 27.885575] ACPI: PCI Interrupt Link [LNK3] (IRQs 3 4 5 7 9 10 11 *12 14 15)
[ 27.893447] ACPI: PCI Interrupt Link [LNK4] (IRQs 3 4 5 *7 9 10 11 12 14 15)
[ 27.901295] ACPI: PCI Interrupt Link [LNK5] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
[ 27.910385] ACPI: PCI Interrupt Link [LUBA] (IRQs *3 4 5 7 9 10 11 12 14 15)
[ 27.918265] ACPI: PCI Interrupt Link [LUBB] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
[ 27.927350] ACPI: PCI Interrupt Link [LMAC] (IRQs 3 4 *5 7 9 10 11 12 14 15)
[ 27.935207] ACPI: PCI Interrupt Link [LACI] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
[ 27.944300] ACPI: PCI Interrupt Link [LMCI] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
[ 27.953403] ACPI: PCI Interrupt Link [LSMB] (IRQs 3 4 5 7 9 *10 11 12 14 15)
[ 27.961257] ACPI: PCI Interrupt Link [LUB2] (IRQs 3 4 5 7 9 10 *11 12 14 15)
[ 27.969119] ACPI: PCI Interrupt Link [LIDE] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
[ 27.978223] ACPI: PCI Interrupt Link [LSID] (IRQs 3 4 5 7 9 10 *11 12 14 15)
[ 27.986087] ACPI: PCI Interrupt Link [LFID] (IRQs 3 4 5 7 9 *10 11 12 14 15)
[ 27.993954] ACPI: PCI Interrupt Link [LPCA] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
[ 28.003092] ACPI: PCI Interrupt Link [APC1] (IRQs 16) *0, disabled.
[ 28.009930] ACPI: PCI Interrupt Link [APC2] (IRQs 17) *0, disabled.
[ 28.016753] ACPI: PCI Interrupt Link [APC3] (IRQs 18) *0, disabled.
[ 28.023569] ACPI: PCI Interrupt Link [APC4] (IRQs 19) *0, disabled.
[ 28.031516] ACPI: PCI Interrupt Link [APC5] (IRQs *16), disabled.
[ 28.038154] ACPI: PCI Interrupt Link [APCF] (IRQs 20 21 22 23) *0, disabled.
[ 28.045907] ACPI: PCI Interrupt Link [APCG] (IRQs 20 21 22 23) *0, disabled.
[ 28.053640] ACPI: PCI Interrupt Link [APCH] (IRQs 20 21 22 23) *0, disabled.
[ 28.061395] ACPI: PCI Interrupt Link [APCJ] (IRQs 20 21 22 23) *0, disabled.
[ 28.069144] ACPI: PCI Interrupt Link [APCK] (IRQs 20 21 22 23) *0, disabled.
[ 28.076886] ACPI: PCI Interrupt Link [APCS] (IRQs 20 21 22 23) *0, disabled.
[ 28.084627] ACPI: PCI Interrupt Link [APCL] (IRQs 20 21 22 23) *0, disabled.
[ 28.092370] ACPI: PCI Interrupt Link [APCZ] (IRQs 20 21 22 23) *0, disabled.
[ 28.100124] ACPI: PCI Interrupt Link [APSI] (IRQs 20 21 22 23) *0, disabled.
[ 28.107880] ACPI: PCI Interrupt Link [APSJ] (IRQs 20 21 22 23) *0, disabled.
[ 28.115624] ACPI: PCI Interrupt Link [APCP] (IRQs 20 21 22 23) *0, disabled.
[ 28.125412] Linux Plug and Play Support v0.97 (c) Adam Belay
[ 28.131118] pnp: PnP ACPI init
[ 28.138667] pnp: PnP ACPI: found 11 devices
[ 28.142926] Generic PHY: Registered new driver
[ 28.147524] SCSI subsystem initialized
[ 28.151418] usbcore: registered new interface driver usbfs
[ 28.156974] usbcore: registered new interface driver hub
[ 28.162359] usbcore: registered new device driver usb
[ 28.167485] PCI: Using ACPI for IRQ routing
[ 28.171716] PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
[ 28.180090] NetLabel: Initializing
[ 28.183533] NetLabel: domain hash size = 128
[ 28.187932] NetLabel: protocols = UNLABELED CIPSOv4
[ 28.192952] NetLabel: unlabeled traffic allowed by default
[ 28.198596] PCI-DMA: Disabling IOMMU.
[ 28.202705] pnp: 00:01: ioport range 0x4000-0x407f could not be reserved
[ 28.209447] pnp: 00:01: ioport range 0x4080-0x40ff has been reserved
[ 28.215836] pnp: 00:01: ioport range 0x4400-0x447f has been reserved
[ 28.222229] pnp: 00:01: ioport range 0x4480-0x44ff could not be reserved
[ 28.228966] pnp: 00:01: ioport range 0x4800-0x487f has been reserved
[ 28.235357] pnp: 00:01: ioport range 0x4880-0x48ff has been reserved
[ 28.242013] PCI: Bridge: 0000:00:09.0
[ 28.245716] IO window: a000-afff
[ 28.249162] MEM window: d8100000-d81fffff
[ 28.253386] PREFETCH window: disabled.
[ 28.257354] PCI: Bridge: 0000:00:0b.0
[ 28.261059] IO window: disabled.
[ 28.264506] MEM window: disabled.
[ 28.268041] PREFETCH window: disabled.
[ 28.272007] PCI: Bridge: 0000:00:0c.0
[ 28.275713] IO window: disabled.
[ 28.279160] MEM window: disabled.
[ 28.282693] PREFETCH window: disabled.
[ 28.286659] PCI: Bridge: 0000:00:0d.0
[ 28.290366] IO window: disabled.
[ 28.293814] MEM window: disabled.
[ 28.297347] PREFETCH window: disabled.
[ 28.301316] PCI: Bridge: 0000:00:0e.0
[ 28.305019] IO window: 9000-9fff
[ 28.308467] MEM window: d8000000-d80fffff
[ 28.312692] PREFETCH window: d0000000-d7ffffff
[ 28.317396] NET: Registered protocol family 2
[ 28.330931] IP route cache hash table entries: 32768 (order: 6, 262144 bytes)
[ 28.338372] TCP established hash table entries: 131072 (order: 9, 2097152 bytes)
[ 28.347355] TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
[ 28.354863] TCP: Hash tables configured (established 131072 bind 65536)
[ 28.361512] TCP reno registered
[ 28.365379] audit: initializing netlink socket (disabled)
[ 28.370837] audit(1164831161.976:1): initialized
[ 28.375626] VFS: Disk quotas dquot_6.5.1
[ 28.379602] Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[ 28.386123] SGI XFS with ACLs, security attributes, realtime, large block/inode numbers, no debug enabled
[ 28.395898] SGI XFS Quota Management subsystem
[ 28.400410] io scheduler noop registered
[ 28.404416] io scheduler anticipatory registered (default)
[ 28.410019] io scheduler deadline registered
[ 28.414385] io scheduler cfq registered
[ 28.432167] PCI: Linking AER extended capability on 0000:00:0b.0
[ 28.438223] PCI: Found HT MSI mapping on 0000:00:0b.0 with capability disabled
[ 28.445511] PCI: Found HT MSI mapping on 0000:00:00.0 with capability enabled
[ 28.452687] PCI: Linking AER extended capability on 0000:00:0c.0
[ 28.458734] PCI: Found HT MSI mapping on 0000:00:0c.0 with capability disabled
[ 28.466017] PCI: Found HT MSI mapping on 0000:00:00.0 with capability enabled
[ 28.473186] PCI: Linking AER extended capability on 0000:00:0d.0
[ 28.479230] PCI: Found HT MSI mapping on 0000:00:0d.0 with capability disabled
[ 28.486516] PCI: Found HT MSI mapping on 0000:00:00.0 with capability enabled
[ 28.493684] PCI: Linking AER extended capability on 0000:00:0e.0
[ 28.499729] PCI: Found HT MSI mapping on 0000:00:0e.0 with capability disabled
[ 28.507016] PCI: Found HT MSI mapping on 0000:00:00.0 with capability enabled
[ 28.514369] pcie_portdrv_probe->Dev[005d:10de] has invalid IRQ. Check vendor BIOS
[ 28.521920] assign_interrupt_mode Found MSI capability
[ 28.527207] pcie_portdrv_probe->Dev[005d:10de] has invalid IRQ. Check vendor BIOS
[ 28.534761] assign_interrupt_mode Found MSI capability
[ 28.540022] pcie_portdrv_probe->Dev[005d:10de] has invalid IRQ. Check vendor BIOS
[ 28.547568] assign_interrupt_mode Found MSI capability
[ 28.552832] pcie_portdrv_probe->Dev[005d:10de] has invalid IRQ. Check vendor BIOS
[ 28.560384] assign_interrupt_mode Found MSI capability
[ 28.565685] ACPI: Power Button (FF) [PWRF]
[ 28.569832] ACPI: Power Button (CM) [PWRB]
[ 28.573990] ACPI: Fan [FAN] (on)
[ 28.577322] Using specific hotkey driver
[ 28.581947] ACPI: Thermal Zone [THRM] (40 C)
[ 28.606679] DoubleTalk PC - not found
[ 28.610457] Linux agpgart interface v0.101 (c) Dave Jones
[ 28.615936] [drm] Initialized drm 1.0.1 20051102
[ 28.620946] ACPI: PCI Interrupt Link [APC3] enabled at IRQ 18
[ 28.626762] ACPI: PCI Interrupt 0000:01:00.0[A] -> Link [APC3] -> GSI 18 (level, low) -> IRQ 18
[ 28.635756] [drm] Initialized radeon 1.25.0 20060524 on minor 0
[ 28.641740] Hangcheck: starting hangcheck timer 0.9.0 (tick is 180 seconds, margin is 60 seconds).
[ 28.650774] Hangcheck: Using monotonic_clock().
[ 28.655364] Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
[ 28.663274] serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
[ 28.669852] 00:08: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
[ 28.675984] RAMDISK driver initialized: 16 RAM disks of 65536K size 1024 blocksize
[ 28.683694] forcedeth.c: Reverse Engineered nForce ethernet driver. Version 0.57.
[ 28.691609] ACPI: PCI Interrupt Link [APCH] enabled at IRQ 23
[ 28.697404] ACPI: PCI Interrupt 0000:00:0a.0[A] -> Link [APCH] -> GSI 23 (level, low) -> IRQ 23
[ 28.706249] forcedeth: using HIGHDMA
[ 29.222393] eth0: forcedeth.c: subsystem: 01297:5036 bound to 0000:00:0a.0
[ 29.229308] Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
[ 29.235702] ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
[ 30.037669] hda: PLEXTOR DVDR PX-716AL, ATAPI CD/DVD-ROM drive
[ 30.861935] ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
[ 30.867073] ACPI: PCI Interrupt Link [APSI] enabled at IRQ 22
[ 30.872860] ACPI: PCI Interrupt 0000:00:07.0[A] -> Link [APSI] -> GSI 22 (level, low) -> IRQ 22
[ 30.882137] ata1: SATA max UDMA/133 cmd 0x9F0 ctl 0xBF2 bmdma 0xD800 irq 22
[ 30.889162] ata2: SATA max UDMA/133 cmd 0x970 ctl 0xB72 bmdma 0xD808 irq 22
[ 30.896174] scsi0 : sata_nv
[ 31.352558] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 31.359346] ata1.00: ATA-7, max UDMA/133, 320170943 sectors: LBA48
[ 31.365653] ata1.00: ata1: dev 0 multi count 16
[ 31.370916] ata1.00: configured for UDMA/133
[ 31.375247] scsi1 : sata_nv
[ 31.832189] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 31.838970] ata2.00: ATA-7, max UDMA/133, 320173056 sectors: LBA48
[ 31.845275] ata2.00: ata2: dev 0 multi count 16
[ 31.850540] ata2.00: configured for UDMA/133
[ 31.854922] scsi 0:0:0:0: Direct-Access ATA Maxtor 6Y160M0 YAR5 PQ: 0 ANSI: 5
[ 31.863151] SCSI device sda: 320170943 512-byte hdwr sectors (163928 MB)
[ 31.869891] sda: Write Protect is off
[ 31.873605] SCSI device sda: drive cache: write back
[ 31.878641] SCSI device sda: 320170943 512-byte hdwr sectors (163928 MB)
[ 31.885384] sda: Write Protect is off
[ 31.890300] SCSI device sda: drive cache: write back
[ 31.895305] sda: sda1 sda2 sda3
[ 31.936730] sd 0:0:0:0: Attached scsi disk sda
[ 31.941282] sd 0:0:0:0: Attached scsi generic sg0 type 0
[ 31.946704] scsi 1:0:0:0: Direct-Access ATA Maxtor 6Y160M0 YAR5 PQ: 0 ANSI: 5
[ 31.954926] SCSI device sdb: 320173056 512-byte hdwr sectors (163929 MB)
[ 31.961667] sdb: Write Protect is off
[ 31.965386] SCSI device sdb: drive cache: write back
[ 31.970423] SCSI device sdb: 320173056 512-byte hdwr sectors (163929 MB)
[ 31.977167] sdb: Write Protect is off
[ 31.980882] SCSI device sdb: drive cache: write back
[ 31.985884] sdb: sdb1 sdb2 sdb3
[ 32.009037] sd 1:0:0:0: Attached scsi disk sdb
[ 32.013571] sd 1:0:0:0: Attached scsi generic sg1 type 0
[ 32.019399] ACPI: PCI Interrupt Link [APSJ] enabled at IRQ 21
[ 32.025188] ACPI: PCI Interrupt 0000:00:08.0[A] -> Link [APSJ] -> GSI 21 (level, low) -> IRQ 21
[ 32.034342] ata3: SATA max UDMA/133 cmd 0x9E0 ctl 0xBE2 bmdma 0xC400 irq 21
[ 32.041377] ata4: SATA max UDMA/133 cmd 0x960 ctl 0xB62 bmdma 0xC408 irq 21
[ 32.048385] scsi2 : sata_nv
[ 32.354775] ata3: SATA link down (SStatus 0 SControl 300)
[ 32.360220] scsi3 : sata_nv
[ 32.666533] ata4: SATA link down (SStatus 0 SControl 300)
[ 32.672009] ieee1394: raw1394: /dev/raw1394 device initialized
[ 32.677905] ieee1394: sbp2: Driver forced to serialize I/O (serialize_io=1)
[ 32.684903] ieee1394: sbp2: Try serialize_io=0 for better performance
[ 32.691441] usbmon: debugfs is not available
[ 32.696138] ACPI: PCI Interrupt Link [APCL] enabled at IRQ 20
[ 32.701925] ACPI: PCI Interrupt 0000:00:02.1[B] -> Link [APCL] -> GSI 20 (level, low) -> IRQ 20
[ 32.710981] ehci_hcd 0000:00:02.1: EHCI Host Controller
[ 32.716378] ehci_hcd 0000:00:02.1: new USB bus registered, assigned bus number 1
[ 32.723854] ehci_hcd 0000:00:02.1: debug port 1
[ 32.728431] ehci_hcd 0000:00:02.1: irq 20, io mem 0xd8205000
[ 32.734131] ehci_hcd 0000:00:02.1: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
[ 32.741784] usb usb1: configuration #1 chosen from 1 choice
[ 32.747433] hub 1-0:1.0: USB hub found
[ 32.751228] hub 1-0:1.0: 10 ports detected
[ 32.856838] ACPI: PCI Interrupt Link [APCF] enabled at IRQ 23
[ 32.862623] ACPI: PCI Interrupt 0000:00:02.0[A] -> Link [APCF] -> GSI 23 (level, low) -> IRQ 23
[ 32.871672] ohci_hcd 0000:00:02.0: OHCI Host Controller
[ 32.877061] ohci_hcd 0000:00:02.0: new USB bus registered, assigned bus number 2
[ 32.884527] ohci_hcd 0000:00:02.0: irq 23, io mem 0xd8204000
[ 32.943408] usb usb2: configuration #1 chosen from 1 choice
[ 32.949051] hub 2-0:1.0: USB hub found
[ 32.952847] hub 2-0:1.0: 10 ports detected
[ 33.058307] USB Universal Host Controller Interface driver v3.0
[ 33.072209] usb 1-2: new high speed USB device using ehci_hcd and address 2
[ 33.197865] usb 1-2: configuration #1 chosen from 1 choice
[ 33.721708] usb 1-6: new high speed USB device using ehci_hcd and address 5
[ 33.858249] usb 1-6: configuration #1 chosen from 1 choice
[ 34.081430] usb 2-3: new low speed USB device using ohci_hcd and address 2
[ 34.234419] usb 2-3: configuration #1 chosen from 1 choice
[ 34.459139] usb 2-4: new full speed USB device using ohci_hcd and address 3
[ 34.610142] usb 2-4: configuration #1 chosen from 1 choice
[ 34.617098] hub 2-4:1.0: USB hub found
[ 34.622065] hub 2-4:1.0: 3 ports detected
[ 34.923841] usb 2-4.1: new full speed USB device using ohci_hcd and address 4
[ 35.027835] usb 2-4.1: configuration #1 chosen from 1 choice
[ 35.044754] usbcore: registered new interface driver libusual
[ 35.050582] usbcore: registered new interface driver hiddev
[ 35.063825] input: Logitech Trackball as /class/input/input0
[ 35.069526] input: USB HID v1.10 Mouse [Logitech Trackball] on usb-0000:00:02.0-3
[ 35.081819] input: Chicony PFU-65 USB Keyboard as /class/input/input1
[ 35.088391] input: USB HID v1.00 Keyboard [Chicony PFU-65 USB Keyboard] on usb-0000:00:02.0-4.1
[ 35.097326] usbcore: registered new interface driver usbhid
[ 35.102936] drivers/usb/input/hid-core.c: v2.6:USB HID core driver
[ 35.109260] PNP: No PS/2 controller found. Probing ports directly.
[ 35.366820] serio: i8042 KBD port at 0x60,0x64 irq 1
[ 35.371945] mice: PS/2 mouse device common for all mice
[ 35.377275] md: raid0 personality registered for level 0
[ 35.382626] md: raid1 personality registered for level 1
[ 35.388021] device-mapper: ioctl: 4.10.0-ioctl (2006-09-14) initialised: [email protected]
[ 35.396550] Advanced Linux Sound Architecture Driver Version 1.0.13 (Tue Nov 28 14:07:24 2006 UTC).
[ 35.405711] ALSA device list:
[ 35.408728] No soundcards found.
[ 35.412223] TCP cubic registered
[ 35.415505] NET: Registered protocol family 1
[ 35.722234] md: Autodetecting RAID arrays.
[ 35.769138] md: autorun ...
[ 35.771981] md: considering sdb3 ...
[ 35.775604] md: adding sdb3 ...
[ 35.778874] md: sdb1 has different UUID to sdb3
[ 35.783448] md: adding sda3 ...
[ 35.786720] md: sda1 has different UUID to sdb3
[ 35.791326] md: created md1
[ 35.794170] md: bind<sda3>
[ 35.796931] md: bind<sdb3>
[ 35.799687] md: running: <sdb3><sda3>
[ 35.803596] raid1: raid set md1 active with 2 out of 2 mirrors
[ 35.809508] md: considering sdb1 ...
[ 35.813127] md: adding sdb1 ...
[ 35.816400] md: adding sda1 ...
[ 35.819672] md: created md0
[ 35.822512] md: bind<sda1>
[ 35.825274] md: bind<sdb1>
[ 35.828034] md: running: <sdb1><sda1>
[ 35.831938] raid1: raid set md0 active with 2 out of 2 mirrors
[ 35.837836] md: ... autorun DONE.
[ 35.861366] Filesystem "md0": Disabling barriers, not supported by the underlying device
[ 35.869723] XFS mounting filesystem md0
[ 35.974203] VFS: Mounted root (xfs filesystem) readonly.
[ 35.979614] Freeing unused kernel memory: 256k freed
[ 46.500699] lp: driver loaded but no devices found
[ 46.523974] Real Time Clock Driver v1.12ac

[ 3038.048201] sdc: assuming drive cache: write through
[ 3038.056198] sdc: assuming drive cache: write through
[10222.410786] sdc: assuming drive cache: write through
[10222.421776] sdc: assuming drive cache: write through

EVENT1... I think at this time I was plugging in my USB key trying to
figure out why /dev/disk was gone.

[10938.054484] NMI Watchdog detected LOCKUP on CPU 1
[10938.059184] CPU 1
[10938.061210] Modules linked in: usb_storage snd_via82xx snd_ens1371 snd_ens1370 gameport snd_ak4531_codec snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_seq_dummy snd_seq_oss snd_seq_midi snd_seq_midi_event snd_seq snd_seq_device snd_emu10k1x snd_ice1712 snd_ice17xx_ak4xxx snd_ak4xxx_adda snd_cs8427 snd_i2c snd_mpu401_uart binfmt_misc ipv6 container battery asus_acpi ac nfs lockd sunrpc af_packet rtc parport_pc lp parport
[10938.101118] Pid: 233, comm: kswapd0 Not tainted 2.6.19-rc6-xenon64-smp.10 #1
[10938.108157] RIP: 0010:[<ffffffff810650c9>] [<ffffffff810650c9>] __write_lock_failed+0x9/0x20
[10938.116688] RSP: 0018:ffff81003f501b80 EFLAGS: 00000083
[10938.121997] RAX: 0000000000000000 RBX: ffff810001eb5870 RCX: 0000000000000036
[10938.129125] RDX: ffff8100050cf938 RSI: ffff810001eb5870 RDI: ffff8100050cf950
[10938.136252] RBP: ffff810001eb5870 R08: 0000000000000001 R09: ffffffff81444d80
[10938.143379] R10: 0000000000000006 R11: 0000000000000001 R12: ffff8100050cf938
[10938.150506] R13: ffff81003f501e90 R14: ffff81003f501d80 R15: ffff8100050cf938
[10938.157634] FS: 00002b385de2c6d0(0000) GS:ffff8100025795c0(0000) knlGS:00000000f7324bb0
[10938.165713] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[10938.171446] CR2: 00002afb375beff0 CR3: 0000000017ad3000 CR4: 00000000000006e0
[10938.178565] Process kswapd0 (pid: 233, threadinfo ffff81003f500000, task ffff81003f494040)
[10938.186816] Stack: ffffffff81067bbf ffffffff810123ea ffff810001eb5870 ffffffff81444d80
[10938.194890] ffff810001eb5898 ffffffff810b94d7 0000000000000000 ffff81003f501d70
[10938.202336] 0000000000000001 fffffffffffffffe ffffffff81445308 ffffffff815457e0
[10938.209594] Call Trace:
[10938.212230] [<ffffffff81067bbf>] _write_lock_irq+0xf/0x10
[10938.217712] [<ffffffff810123ea>] remove_mapping+0x5a/0xf0
[10938.223194] [<ffffffff810b94d7>] shrink_inactive_list+0x557/0x850
[10938.229381] [<ffffffff81012ad4>] shrink_zone+0xe4/0x110
[10938.234687] [<ffffffff8105b7fe>] kswapd+0x31e/0x470
[10938.239650] [<ffffffff8109cc30>] autoremove_wake_function+0x0/0x30
[10938.245909] [<ffffffff8109c9f0>] keventd_create_kthread+0x0/0x80
[10938.251997] [<ffffffff8105b4e0>] kswapd+0x0/0x470
[10938.256785] [<ffffffff8109c9f0>] keventd_create_kthread+0x0/0x80
[10938.262874] [<ffffffff81033859>] kthread+0xd9/0x120
[10938.267839] [<ffffffff81062ed8>] child_rip+0xa/0x12
[10938.272799] [<ffffffff8109c9f0>] keventd_create_kthread+0x0/0x80
[10938.278890] [<ffffffff81033780>] kthread+0x0/0x120
[10938.283762] [<ffffffff81062ece>] child_rip+0x0/0x12
[10938.288724]
[10938.290211]
[10938.290212] Code: 81 3f 00 00 00 01 75 f6 f0 81 2f 00 00 00 01 0f 85 e2 ff ff
[10938.299273] <5>statd: server localhost not responding, timed out
[11978.316564] lockd: cannot monitor meson
[11978.320402] lockd: failed to monitor meson
[11982.620868] statd: server localhost not responding, timed out
[11982.626629] lockd: cannot monitor meson
[11982.630462] lockd: failed to monitor meson
[12048.136656] statd: server localhost not responding, timed out
[12048.142419] lockd: cannot monitor meson
[12048.146259] lockd: failed to monitor meson
[12059.702760] statd: server localhost not responding, timed out
[12059.708522] lockd: cannot monitor meson
[12059.712358] lockd: failed to monitor meson
[12096.141632] statd: server localhost not responding, timed out
[12096.147396] lockd: cannot monitor meson
[12096.151231] lockd: failed to monitor meson
[12109.631149] statd: server localhost not responding, timed out
[12109.636910] lockd: cannot monitor meson
[12109.640744] lockd: failed to monitor meson
[12111.292901] statd: server localhost not responding, timed out
[12111.298664] lockd: cannot monitor meson
[12111.302495] lockd: failed to monitor meson
[12112.827687] statd: server localhost not responding, timed out
[12112.833444] lockd: cannot monitor meson
[12112.837283] lockd: failed to monitor meson
[14751.454577] apt-index-watch[18388]: segfault at 00000000a06c7d78 rip 0000000000459cbd rsp 00007fffb6d31fd0 error 4
[19152.300080] statd: server localhost not responding, timed out
[19152.305841] lockd: cannot monitor meson
[19152.309683] lockd: failed to monitor meson
[20284.049066] statd: server localhost not responding, timed out
[20284.054824] lockd: cannot monitor meson
[20284.058658] lockd: failed to monitor meson
[20314.039915] statd: server localhost not responding, timed out
[20314.045675] lockd: cannot monitor meson
[20314.049510] lockd: failed to monitor meson
[20344.098400] statd: server localhost not responding, timed out
[20344.104165] lockd: cannot monitor meson
[20344.108001] lockd: failed to monitor meson
[20345.881462] hda: ATAPI 40X DVD-ROM DVD-R CD-R/RW drive, 8192kB Cache
[20345.887988] Uniform CD-ROM driver Revision: 3.20
[20402.667944] usb 1-7: USB disconnect, address 7

[23856.264428] statd: server localhost not responding, timed out
[23856.270184] lockd: cannot monitor meson
[23856.274029] lockd: failed to monitor meson
[27246.948121] statd: server localhost not responding, timed out
[27246.953876] lockd: cannot monitor meson
[27246.957708] lockd: failed to monitor meson
[27256.123056] statd: server localhost not responding, timed out
[27256.128819] lockd: cannot monitor meson
[27256.132649] lockd: failed to monitor meson
[27277.053697] statd: server localhost not responding, timed out
[27277.059463] lockd: cannot monitor meson
[27277.063298] lockd: failed to monitor meson
[27307.100407] statd: server localhost not responding, timed out
[27307.106205] lockd: cannot monitor meson
[27307.110043] lockd: failed to monitor meson
[64466.900781] usb 1-2: USB disconnect, address 2
[64493.084575] usb 1-2: new high speed USB device using ehci_hcd and address 8
[64493.208324] usb 1-2: configuration #1 chosen from 1 choice
[65315.654790] usb 1-2: USB disconnect, address 8
[65341.340831] usb 1-2: new high speed USB device using ehci_hcd and address 9
[65341.464578] usb 1-2: configuration #1 chosen from 1 choice

... this is where I noticed that I had no swap, so I enabled it.

[72670.632135] Adding 1951888k swap on /dev/sda2. Priority:-1 extents:1 across:1951888k
[72673.798467] Adding 1951888k swap on /dev/sdb2. Priority:-2 extents:1 across:1951888k

EVENT2... while switching to a gitk window.

[79869.398641] wish[469]: segfault at 000000000fb1303f rip 00000000f7e3dc49 rsp 00000000ffa93c50 error 4
[79869.708547] Bad page state in process 'ruby'
[79869.708549] page:ffff810001e73f18 flags:0x4000000000080010 mapping:0000000000000000 mapcount:0 count:0
[79869.708551] Trying to fix it up, but a reboot is needed
[79869.708553] Backtrace:
[79869.729680]
[79869.729681] Call Trace:
[79869.733637] [<ffffffff810b6ec7>] bad_page+0x57/0x90
[79869.738600] [<ffffffff8100ad32>] free_hot_cold_page+0x82/0x130
[79869.744516] [<ffffffff81024093>] __pagevec_free+0x23/0x30
[79869.749997] [<ffffffff8100a766>] release_pages+0x156/0x170
[79869.755567] [<ffffffff810226c1>] __up_read+0x21/0xb0
[79869.760615] [<ffffffff8115427a>] xfs_iunlock+0x3a/0xa0
[79869.765838] [<ffffffff810bf825>] swap_info_get+0x75/0xf0
[79869.771232] [<ffffffff8100d8ff>] free_pages_and_swap_cache+0x7f/0xa0
[79869.777668] [<ffffffff81007b8e>] unmap_vmas+0x57e/0x790
[79869.782983] [<ffffffff8103b8a7>] exit_mmap+0x77/0x100
[79869.788119] [<ffffffff8103dfbc>] mmput+0x3c/0xd0
[79869.792822] [<ffffffff8102d0b4>] flush_old_exec+0x6e4/0xa00
[79869.798478] [<ffffffff8100ac5f>] vfs_read+0x14f/0x1a0
[79869.803614] [<ffffffff81018003>] load_elf_binary+0x503/0x1cf0
[79869.809442] [<ffffffff8100c15f>] do_sync_read+0xcf/0x120
[79869.814840] [<ffffffff81057b8c>] strrchr+0xc/0x30
[79869.819627] [<ffffffff881077f5>] :binfmt_misc:load_misc_binary+0x75/0x420
[79869.826492] [<ffffffff8100eb95>] __alloc_pages+0x65/0x2f0
[79869.831976] [<ffffffff8100eb95>] __alloc_pages+0x65/0x2f0
[79869.837458] [<ffffffff81017639>] copy_strings+0x1b9/0x230
[79869.842940] [<ffffffff810424aa>] search_binary_handler+0xaa/0x270
[79869.849113] [<ffffffff81041980>] do_execve+0x1a0/0x270
[79869.854338] [<ffffffff81057a84>] sys_execve+0x44/0xb0
[79869.859473] [<ffffffff810624f7>] stub_execve+0x67/0xb0
[79869.864695]
[79870.772381] Eeek! page_mapcount(page) went negative! (-1)
[79870.777784] page->flags = 4000000000000000
[79870.782055] page->count = 1
[79870.785023] page->mapping = 0000000000000000

[79870.789473] ----------- [cut here ] --------- [please bite here ] ---------
[79870.796423] Kernel BUG at mm/rmap.c:578
[79870.800252] invalid opcode: 0000 [1] SMP
[79870.804295] CPU 1
[79870.806321] Modules linked in: ide_cd cdrom usb_storage snd_via82xx snd_ens1371 snd_ens1370 gameport snd_ak4531_codec snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_seq_dummy snd_seq_oss snd_seq_midi snd_seq_midi_event snd_seq snd_seq_device snd_emu10k1x snd_ice1712 snd_ice17xx_ak4xxx snd_ak4xxx_adda snd_cs8427 snd_i2c snd_mpu401_uart binfmt_misc ipv6 container battery asus_acpi ac nfs lockd sunrpc af_packet rtc parport_pc lp parport
[79870.847390] Pid: 3059, comm: ruby Tainted: G B 2.6.19-rc6-xenon64-smp.10 #1
[79870.854604] RIP: 0010:[<ffffffff8100a5ec>] [<ffffffff8100a5ec>] page_remove_rmap+0x6c/0x90
[79870.862968] RSP: 0000:ffff81002a4b1d48 EFLAGS: 00010292
[79870.868270] RAX: 0000000000000035 RBX: ffff810001e73f18 RCX: ffffffff8143fba8
[79870.875396] RDX: ffffffff8143fba8 RSI: 0000000000000082 RDI: ffffffff8143fba0
[79870.882516] RBP: ffff810001e73f18 R08: 0000000000000000 R09: 0000000000000001
[79870.889644] R10: ffffffff815438a0 R11: ffffffff8107ad70 R12: ffff81002ca87c60
[79870.896772] R13: ffff810001b05a60 R14: ffff81002e46ff00 R15: ffff810001f7ed98
[79870.903898] FS: 00002b111d12dc90(0000) GS:ffff8100025795c0(0000) knlGS:00000000f74716c0
[79870.911978] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[79870.917719] CR2: 00002b111d18cf78 CR3: 000000002a188000 CR4: 00000000000006e0
[79870.924847] Process ruby (pid: 3059, threadinfo ffff81002a4b0000, task ffff81003b7ae750)
[79870.932926] Stack: 0000000000000002 ffffffff81010ae8 0000000000000000 ffff81002a897740
[79870.940998] 00002b111d18cf78 ffff81003fd66bc0 00002b111d129520 ffff81002d7d7220
[79870.948445] 8000000027e45065 ffff810001f7ed98 ffffffff815457e0 ffff81002ca87c60
[79870.955702] Call Trace:
[79870.958338] [<ffffffff81010ae8>] do_wp_page+0x3a8/0x4d0
[79870.963648] [<ffffffff81008e78>] __handle_mm_fault+0xa98/0xb50
[79870.969566] [<ffffffff8106a007>] do_page_fault+0x4b7/0x8d0
[79870.975134] [<ffffffff81023f86>] sys_newstat+0x36/0x50
[79870.980353] [<ffffffff8106807d>] error_exit+0x0/0x84
[79870.985402]
[79870.986896]
[79870.986896] Code: 0f 0b 68 38 8c 3a 81 c2 42 02 8b 73 18 48 89 df 5b 83 f6 01
[79870.995955] RIP [<ffffffff8100a5ec>] page_remove_rmap+0x6c/0x90
[79871.001965] RSP <ffff81002a4b1d48>
[79871.005447] <0>Bad page state in process 'modprobe'
[79872.097306] page:ffff810001e73f18 flags:0x4000000000000000 mapping:0000000000000000 mapcount:-1 count:1
[79872.097308] Trying to fix it up, but a reboot is needed
[79872.097309] Backtrace:
[79872.117823]
[79872.117824] Call Trace:
[79872.121779] [<ffffffff810b6ec7>] bad_page+0x57/0x90
[79872.126745] [<ffffffff8100a0d8>] get_page_from_freelist+0x268/0x360
[79872.133095] [<ffffffff8100eb95>] __alloc_pages+0x65/0x2f0
[79872.138575] [<ffffffff810226c1>] __up_read+0x21/0xb0
[79872.143624] [<ffffffff81008903>] __handle_mm_fault+0x523/0xb50
[79872.149540] [<ffffffff81011c5d>] may_open+0x6d/0x270
[79872.154589] [<ffffffff8106a007>] do_page_fault+0x4b7/0x8d0
[79872.160154] [<ffffffff8101ec85>] __dentry_open+0x115/0x220
[79872.165725] [<ffffffff8102849d>] do_filp_open+0x2d/0x40
[79872.171037] [<ffffffff8106807d>] error_exit+0x0/0x84
[79872.176094]

EVENT3... I tried to restart gitk, unaware of the errors above.

[79915.285166] Bad page state in process 'wish'
[79915.285168] page:ffffffff81444da8 flags:0x0000000000000000 mapping:000000ba0000002d mapcount:1 count:0
[79915.285170] Trying to fix it up, but a reboot is needed
[79915.285171] Backtrace:
[79915.306299]
[79915.306300] Call Trace:
[79915.310254] [<ffffffff810b6ec7>] bad_page+0x57/0x90
[79915.315216] [<ffffffff8100a0d8>] get_page_from_freelist+0x268/0x360
[79915.321567] [<ffffffff8100eb95>] __alloc_pages+0x65/0x2f0
[79915.327048] [<ffffffff81008903>] __handle_mm_fault+0x523/0xb50
[79915.332964] [<ffffffff8106a007>] do_page_fault+0x4b7/0x8d0
[79915.338537] [<ffffffff8106807d>] error_exit+0x0/0x84
[79915.343587] [<ffffffff8100c480>] file_read_actor+0x0/0x150
[79915.349154]
[79915.350655] general protection fault: 0000 [2] SMP
[79915.355558] CPU 0
[79915.357584] Modules linked in: ide_cd cdrom usb_storage snd_via82xx snd_ens1371 snd_ens1370 gameport snd_ak4531_codec snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_seq_dummy snd_seq_oss snd_seq_midi snd_seq_midi_event snd_seq snd_seq_device snd_emu10k1x snd_ice1712 snd_ice17xx_ak4xxx snd_ak4xxx_adda snd_cs8427 snd_i2c snd_mpu401_uart binfmt_misc ipv6 container battery asus_acpi ac nfs lockd sunrpc af_packet rtc parport_pc lp parport
[79915.398655] Pid: 29139, comm: wish Tainted: G B 2.6.19-rc6-xenon64-smp.10 #1
[79915.405953] RIP: 0010:[<ffffffff81064a47>] [<ffffffff81064a47>] clear_page_c+0x7/0x10
[79915.413878] RSP: 0000:ffff81003df67cc0 EFLAGS: 00010246
[79915.419187] RAX: 0000000000000000 RBX: ffffffff81444da8 RCX: 0000000000000200
[79915.426315] RDX: ffffffff8143fba8 RSI: 0000000000000086 RDI: 0023c9fff9563000
[79915.433440] RBP: ffffffff81444de0 R08: 0000000000000000 R09: 0000000000000000
[79915.440561] R10: ffffffff815438a0 R11: ffffffff8107ad70 R12: 0000000000000001
[79915.447689] R13: ffff810000000000 R14: 6db6db6db6db6db7 R15: ffff81003df67f58
[79915.454815] FS: 00002b4256b3b040(0000) GS:ffffffff814e6000(0063) knlGS:00000000f7b0b6c0
[79915.462894] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
[79915.468627] CR2: 0000000008256a24 CR3: 0000000029ef1000 CR4: 00000000000006e0
[79915.475746] Process wish (pid: 29139, threadinfo ffff81003df66000, task ffff81001d4520c0)
[79915.483912] Stack: ffffffff8100a148 000000013df67ea8 0000004400000000 000280d200000000
[79915.491984] ffffffff81445ac0 0000000100000292 0000000000000256 ffffffff815457e0
[79915.499431] 00000040289f86c0 0000000000000001 ffffffff81445ac0 ffff81001d4520c0
[79915.506689] Call Trace:
[79915.509323] [<ffffffff8100a148>] get_page_from_freelist+0x2d8/0x360
[79915.515673] [<ffffffff8100eb95>] __alloc_pages+0x65/0x2f0
[79915.521156] [<ffffffff81008903>] __handle_mm_fault+0x523/0xb50
[79915.527073] [<ffffffff8106a007>] do_page_fault+0x4b7/0x8d0
[79915.532642] [<ffffffff8106807d>] error_exit+0x0/0x84
[79915.537687] [<ffffffff8100c480>] file_read_actor+0x0/0x150
[79915.543254]
[79915.544743]
[79915.544744] Code: f3 48 ab c3 66 66 90 66 90 eb ee 66 66 66 90 66 66 66 90 66
[79915.553800] RIP [<ffffffff81064a47>] clear_page_c+0x7/0x10
[79915.559386] RSP <ffff81003df67cc0>
[79915.562868] <1>Unable to handle kernel paging request at 0000000000200200 RIP:
[79915.568811] [<ffffffff81009fde>] get_page_from_freelist+0x16e/0x360
[79915.577625] PGD 3a973067 PUD 3ab5d067 PMD 0
[79915.581938] Oops: 0002 [3] SMP
[79915.585108] CPU 0
[79915.587134] Modules linked in: ide_cd cdrom usb_storage snd_via82xx snd_ens1371 snd_ens1370 gameport snd_ak4531_codec snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_seq_dummy snd_seq_oss snd_seq_midi snd_seq_midi_event snd_seq snd_seq_device snd_emu10k1x snd_ice1712 snd_ice17xx_ak4xxx snd_ak4xxx_adda snd_cs8427 snd_i2c snd_mpu401_uart binfmt_misc ipv6 container battery asus_acpi ac nfs lockd sunrpc af_packet rtc parport_pc lp parport
[79915.628203] Pid: 2094, comm: klogd Tainted: G B 2.6.19-rc6-xenon64-smp.10 #1
[79915.635502] RIP: 0010:[<ffffffff81009fde>] [<ffffffff81009fde>] get_page_from_freelist+0x16e/0x360
[79915.644544] RSP: 0018:ffff81003ab4dd28 EFLAGS: 00010097
[79915.649844] RAX: ffff810001b23f10 RBX: ffffffff81444dd0 RCX: ffff810001b23ee8
[79915.656970] RDX: 0000000000200200 RSI: 0000000000000000 RDI: ffffffff81444d80
[79915.664090] RBP: 0000000000000001 R08: ffffffff814451c8 R09: ffff810001b5a5e8
[79915.671209] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
[79915.678337] R13: ffffffff81444dc0 R14: ffffffff81444d80 R15: 000000000000001f
[79915.685464] FS: 00002b22c4b4f6d0(0000) GS:ffffffff814e6000(0000) knlGS:00000000f7b0b6c0
[79915.693542] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[79915.699277] CR2: 0000000000200200 CR3: 000000003ab8c000 CR4: 00000000000006e0
[79915.706406] Process klogd (pid: 2094, threadinfo ffff81003ab4c000, task ffff81000268c7d0)
[79915.714570] Stack: 0000000100000000 0000004400000000 000200d000000000 ffffffff81445ac0
[79915.722641] 000000013bf81820 0000000000000256 ffffffff815457e0 000000403ab4ded9
[79915.730088] 0000000000000001 ffffffff81445ac0 ffff81000268c7d0 00000000000000d0
[79915.737347] Call Trace:
[79915.739983] [<ffffffff8100eb95>] __alloc_pages+0x65/0x2f0
[79915.745467] [<ffffffff81060f90>] cache_alloc_refill+0x2b0/0x530
[79915.751467] [<ffffffff8100a229>] kmem_cache_alloc+0x59/0x80
[79915.757120] [<ffffffff81022dd7>] d_alloc+0x37/0x200
[79915.762085] [<ffffffff81309964>] sock_attach_fd+0x64/0x120
[79915.767652] [<ffffffff810120bd>] get_empty_filp+0x8d/0x160
[79915.773221] [<ffffffff8104f145>] sock_map_fd+0x35/0x80
[79915.778441] [<ffffffff8130a10f>] sys_socket+0x1f/0x40
[79915.783577] [<ffffffff8106211e>] system_call+0x7e/0x83
[79915.788800]
[79915.790295]
[79915.790296] Code: 48 89 02 48 89 50 08 75 c9 41 c7 86 40 04 00 00 01 00 00 00
[79915.799354] RIP [<ffffffff81009fde>] get_page_from_freelist+0x16e/0x360
[79915.806056] RSP <ffff81003ab4dd28>
[79915.809539] CR2: 0000000000200200
[79915.812845] <1>Unable to handle kernel paging request at 0000000000100108 RIP:
[79915.817913] [<ffffffff8100ad82>] free_hot_cold_page+0xd2/0x130
[79915.826287] PGD 0
[79915.828314] Oops: 0002 [4] SMP
[79915.831483] CPU 0
[79915.833510] Modules linked in: ide_cd cdrom usb_storage snd_via82xx snd_ens1371 snd_ens1370 gameport snd_ak4531_codec snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_seq_dummy snd_seq_oss snd_seq_midi snd_seq_midi_event snd_seq snd_seq_device snd_emu10k1x snd_ice1712 snd_ice17xx_ak4xxx snd_ak4xxx_adda snd_cs8427 snd_i2c snd_mpu401_uart binfmt_misc ipv6 container battery asus_acpi ac nfs lockd sunrpc af_packet rtc parport_pc lp parport
[79915.874577] Pid: 2094, comm: klogd Tainted: G B 2.6.19-rc6-xenon64-smp.10 #1
[79915.881870] RIP: 0010:[<ffffffff8100ad82>] [<ffffffff8100ad82>] free_hot_cold_page+0xd2/0x130
[79915.890488] RSP: 0018:ffff81003ab4d868 EFLAGS: 00010002
[79915.895797] RAX: ffff81000227da10 RBX: ffff81000227d9e8 RCX: 0000000000000000
[79915.902923] RDX: 0000000000100100 RSI: 0000000000000000 RDI: ffffffff81444dd0
[79915.910052] RBP: ffffffff81444dc0 R08: 000000003fc95025 R09: ffffffff81444d80
[79915.917178] R10: 0000000000000001 R11: ffffffff81031090 R12: 0000000000000256
[79915.924305] R13: ffffffff81444d80 R14: 000000000000000e R15: ffff8100023c57f8
[79915.931433] FS: 00002b22c4b4f6d0(0000) GS:ffffffff814e6000(0000) knlGS:00000000f7b0b6c0
[79915.939512] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[79915.945245] CR2: 0000000000100108 CR3: 0000000001001000 CR4: 00000000000006e0
[79915.952366] Process klogd (pid: 2094, threadinfo ffff81003ab4c000, task ffff81000268c7d0)
[79915.960529] Stack: 0000000000000034 0000000000000008 ffff81003ab4d8b8 ffff8100023c5868
[79915.968601] 000000000000000e ffffffff81024093 0000000000000001 ffff81000227d9e8
[79915.976050] ffffffff81444d80 ffffffff8100a766 000000000000000a 0000000000000000
[79915.983306] Call Trace:
[79915.985942] [<ffffffff81024093>] __pagevec_free+0x23/0x30
[79915.991422] [<ffffffff8100a766>] release_pages+0x156/0x170
[79915.996996] [<ffffffff8122b520>] serial8250_console_putchar+0x0/0xa0
[79916.003428] [<ffffffff8100d8ff>] free_pages_and_swap_cache+0x7f/0xa0
[79916.009862] [<ffffffff81007a91>] unmap_vmas+0x481/0x790
[79916.015176] [<ffffffff8103b8a7>] exit_mmap+0x77/0x100
[79916.020306] [<ffffffff8103dfbc>] mmput+0x3c/0xd0
[79916.025009] [<ffffffff81015144>] do_exit+0x214/0x8e0
[79916.030061] [<ffffffff8106a319>] do_page_fault+0x7c9/0x8d0
[79916.035624] [<ffffffff81008fb5>] __d_lookup+0x85/0x120
[79916.040840] [<ffffffff8100b941>] _atomic_dec_and_lock+0x41/0x80
[79916.046842] [<ffffffff8102d9b4>] mntput_no_expire+0x24/0xb0
[79916.052498] [<ffffffff8106807d>] error_exit+0x0/0x84
[79916.057547] [<ffffffff81009fde>] get_page_from_freelist+0x16e/0x360
[79916.063893] [<ffffffff81009fbd>] get_page_from_freelist+0x14d/0x360
[79916.070243] [<ffffffff8100eb95>] __alloc_pages+0x65/0x2f0
[79916.075724] [<ffffffff81060f90>] cache_alloc_refill+0x2b0/0x530
[79916.081725] [<ffffffff8100a229>] kmem_cache_alloc+0x59/0x80
[79916.087379] [<ffffffff81022dd7>] d_alloc+0x37/0x200
[79916.092342] [<ffffffff81309964>] sock_attach_fd+0x64/0x120
[79916.097911] [<ffffffff810120bd>] get_empty_filp+0x8d/0x160
[79916.103480] [<ffffffff8104f145>] sock_map_fd+0x35/0x80
[79916.108702] [<ffffffff8130a10f>] sys_socket+0x1f/0x40
[79916.113836] [<ffffffff8106211e>] system_call+0x7e/0x83
[79916.119059]
[79916.120554]
[79916.120555] Code: 48 89 42 08 48 89 53 28 48 89 78 08 48 89 45 10 8b 45 00 ff
[79916.129614] RIP [<ffffffff8100ad82>] free_hot_cold_page+0xd2/0x130
[79916.135883] RSP <ffff81003ab4d868>
[79916.139364] CR2: 0000000000100108
[79916.142672] <1>Fixing recursive fault but reboot is needed!

EVENT4... everything dies.

[79921.294125] NMI Watchdog detected LOCKUP on CPU 0
[79921.298824] CPU 0
[79921.300849] Modules linked in: ide_cd cdrom usb_storage snd_via82xx snd_ens1371 snd_ens1370 gameport snd_ak4531_codec snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_seq_dummy snd_seq_oss snd_seq_midi snd_seq_midi_event snd_seq snd_seq_device snd_emu10k1x snd_ice1712 snd_ice17xx_ak4xxx snd_ak4xxx_adda snd_cs8427 snd_i2c snd_mpu401_uart binfmt_misc ipv6 container battery asus_acpi ac nfs lockd sunrpc af_packet rtc parport_pc lp parport
[79921.341918] Pid: 8, comm: events/0 Tainted: G B 2.6.19-rc6-xenon64-smp.10 #1
[79921.349210] RIP: 0010:[<ffffffff81067a37>] [<ffffffff81067a37>] _spin_lock+0x7/0x10
[79921.356962] RSP: 0018:ffff81003ff43d58 EFLAGS: 00000082
[79921.362270] RAX: 0000000000000001 RBX: 0000000000000001 RCX: ffff810080e82f40
[79921.369398] RDX: 0000000000000002 RSI: 0000000000000000 RDI: ffffffff814451c0
[79921.376525] RBP: 0000000000000001 R08: ffffffff81444e05 R09: 0000000000000000
[79921.383652] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
[79921.390771] R13: ffff810001dc27f0 R14: ffffffff81444d80 R15: 0000000000000001
[79921.397900] FS: 00002b22c4b4f6d0(0000) GS:ffffffff814e6000(0000) knlGS:00000000f7b0b6c0
[79921.405978] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[79921.411720] CR2: 0000000000100108 CR3: 0000000001001000 CR4: 00000000000006e0
[79921.418848] Process events/0 (pid: 8, threadinfo ffff81003ff42000, task ffff81003f43c850)
[79921.427013] Stack: ffffffff810b7054 0000000000000246 ffff81003ac639c0 ffff81003d545ae0
[79921.435085] ffff810024b92140 ffff810024b92000 ffff81003d545b00 0000000000000002
[79921.442534] ffffffff810c52d3 ffff81003a96b200 ffff81003d545ac0 ffff81003d545ae0
[79921.449789] Call Trace:
[79921.452424] [<ffffffff810b7054>] __free_pages_ok+0xe4/0x2f0
[79921.458081] [<ffffffff810c52d3>] slab_destroy+0x83/0xb0
[79921.463388] [<ffffffff810c55b9>] drain_freelist+0x79/0xb0
[79921.468871] [<ffffffff810c6010>] cache_reap+0x0/0x130
[79921.474006] [<ffffffff810c60f2>] cache_reap+0xe2/0x130
[79921.479227] [<ffffffff8104fb22>] run_workqueue+0xb2/0x110
[79921.484709] [<ffffffff8104be10>] worker_thread+0x0/0x160
[79921.490104] [<ffffffff8104bf31>] worker_thread+0x121/0x160
[79921.495674] [<ffffffff81088750>] default_wake_function+0x0/0x10
[79921.501677] [<ffffffff8104be10>] worker_thread+0x0/0x160
[79921.507070] [<ffffffff81033859>] kthread+0xd9/0x120
[79921.512034] [<ffffffff81062ed8>] child_rip+0xa/0x12
[79921.516998] [<ffffffff81033780>] kthread+0x0/0x120
[79921.521870] [<ffffffff81062ece>] child_rip+0x0/0x12
[79921.526823]
[79921.528312]

--
WebSig: http://www.jukie.net/~bart/sig/