2003-03-03 08:01:24

by Christopher Huhn

Subject: Kernel Bug at spinlock.h ?!

Hi folks,

we have some problems with the 2.4.20 kernel on our Linux farm, mainly - but
not exclusively - on our batch processing 2x2GHz Xeon (with hyperthreading),
Intel e7500 Supermicro P4-DPE boxes running - almost plain - Debian woody.

On these boxes (ca. 60) we sometimes get a kernel panic after an uptime of 2
days or more. The process causing the panic is sometimes a user's
number-crunching process, but more often ksoftirqd_CPU0 or swapper.

Today we got this profound error msg:
Kernel Bug at /usr/src/kernel-source-2.4.20/include/asm/spinlock.h:105!
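
With around 60 boxes, one way to see whether every panic hits the same source location is to scan the collected logs for BUG messages and count the file and line they name. The following is only a minimal Python sketch under that assumption: the names `BUG_RE` and `find_bug_sites` are hypothetical, and the sample input is just the message quoted above plus one ordinary line from the kern.log excerpt.

```python
import re
from collections import Counter

# Matches messages such as "Kernel Bug at <path>:<line>!".
# Case-insensitive, because capitalisation differs between reports.
BUG_RE = re.compile(r"kernel bug at (?P<path>\S+):(?P<line>\d+)!", re.IGNORECASE)


def find_bug_sites(lines):
    """Count how often each (path, line) pair appears in BUG messages."""
    sites = Counter()
    for text in lines:
        match = BUG_RE.search(text)
        if match:
            sites[(match.group("path"), int(match.group("line")))] += 1
    return sites


sample = [
    "Kernel Bug at /usr/src/kernel-source-2.4.20/include/asm/spinlock.h:105!",
    "Feb 24 14:44:56 lxb006 kernel: Intel MultiProcessor Specification v1.4",
]
sites = find_bug_sites(sample)
for (path, line), count in sites.items():
    print(f"{count}x {path}:{line}")
```

If all boxes report the same file and line, that would hint at a common code path rather than at one faulty machine.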

As I didn't find anything specifically related to this on Google, I'm mailing
this list directly.
If you find any solutions, please cc: me as I'm not subscribed to the list.

With kind regards,

Christopher Huhn

Below is an excerpt of the kern.log:

Feb 24 14:44:56 lxb006 kernel: Linux version 2.4.20clone (root@lxdv10) (gcc
version 2.95.4 20011002 (Debian prerelease))
#1 SMP Mon Feb 24 13:45:27 CET 2003
Feb 24 14:44:56 lxb006 kernel: BIOS-provided physical RAM map:
Feb 24 14:44:56 lxb006 kernel: BIOS-e820: 0000000000000000 - 000000000009f800
(usable)
Feb 24 14:44:56 lxb006 kernel: BIOS-e820: 000000000009f800 - 00000000000a0000
(reserved)
Feb 24 14:44:56 lxb006 kernel: BIOS-e820: 00000000000d8000 - 00000000000e0000
(reserved)
Feb 24 14:44:56 lxb006 kernel: BIOS-e820: 00000000000e4000 - 0000000000100000
(reserved)
Feb 24 14:44:56 lxb006 kernel: BIOS-e820: 0000000000100000 - 000000003fefd000
(usable)
Feb 24 14:44:56 lxb006 kernel: BIOS-e820: 000000003fefd000 - 000000003ff00000
(ACPI NVS)
Feb 24 14:44:56 lxb006 kernel: BIOS-e820: 000000003ff00000 - 000000003ff80000
(usable)
Feb 24 14:44:56 lxb006 kernel: BIOS-e820: 000000003ff80000 - 0000000040000000
(reserved)
Feb 24 14:44:56 lxb006 kernel: BIOS-e820: 00000000fec00000 - 00000000fec10000
(reserved)
Feb 24 14:44:56 lxb006 kernel: BIOS-e820: 00000000fee00000 - 00000000fee01000
(reserved)
Feb 24 14:44:56 lxb006 kernel: BIOS-e820: 00000000ff800000 - 00000000ffc00000
(reserved)
Feb 24 14:44:56 lxb006 kernel: BIOS-e820: 00000000fff00000 - 0000000100000000
(reserved)
Feb 24 14:44:56 lxb006 kernel: 127MB HIGHMEM available.
Feb 24 14:44:56 lxb006 kernel: 896MB LOWMEM available.
Feb 24 14:44:56 lxb006 kernel: found SMP MP-table at 000f66c0
Feb 24 14:44:56 lxb006 kernel: hm, page 000f6000 reserved twice.
Feb 24 14:44:56 lxb006 kernel: hm, page 000f7000 reserved twice.
Feb 24 14:44:56 lxb006 kernel: hm, page 0009f000 reserved twice.
Feb 24 14:44:56 lxb006 kernel: hm, page 000a0000 reserved twice.
Feb 24 14:44:56 lxb006 kernel: On node 0 totalpages: 262016
Feb 24 14:44:56 lxb006 kernel: zone(0): 4096 pages.
Feb 24 14:44:56 lxb006 kernel: zone(1): 225280 pages.
Feb 24 14:44:56 lxb006 kernel: zone(2): 32640 pages.
Feb 24 14:44:56 lxb006 kernel: ACPI: Searched entire block, no RSDP was found.
Feb 24 14:44:56 lxb006 kernel: ACPI: Searched entire block, no RSDP was found.
Feb 24 14:44:56 lxb006 kernel: ACPI: System description tables not found
Feb 24 14:44:56 lxb006 kernel: Intel MultiProcessor Specification v1.4
Feb 24 14:44:56 lxb006 kernel: Virtual Wire compatibility mode.
Feb 24 14:44:56 lxb006 kernel: OEM ID: Product ID: Kings Canyon APIC at:
0xFEE00000
Feb 24 14:44:56 lxb006 kernel: Processor #0 Pentium 4(tm) XEON(tm) APIC
version 20
Feb 24 14:44:56 lxb006 kernel: Processor #6 Pentium 4(tm) XEON(tm) APIC
version 20
Feb 24 14:44:56 lxb006 kernel: Processor #1 Pentium 4(tm) XEON(tm) APIC
version 20
Feb 24 14:44:56 lxb006 kernel: Processor #7 Pentium 4(tm) XEON(tm) APIC
version 20
Feb 24 14:44:56 lxb006 kernel: I/O APIC #2 Version 32 at 0xFEC00000.
Feb 24 14:44:56 lxb006 kernel: I/O APIC #3 Version 32 at 0xFEC80000.
Feb 24 14:44:56 lxb006 kernel: I/O APIC #4 Version 32 at 0xFEC80400.
Feb 24 14:44:56 lxb006 kernel: I/O APIC #5 Version 32 at 0xFEC81000.
Feb 24 14:44:56 lxb006 kernel: I/O APIC #8 Version 32 at 0xFEC81400.
Feb 24 14:44:56 lxb006 kernel: Processors: 4
Feb 24 14:44:56 lxb006 kernel: Kernel command line: rw root=/dev/nfs
nfsroot=/SystemBoot/lxb006 ip=140.181.97.13:140.181.97.209:140.181.96.1:255.255.192.0:
Feb 24 14:44:56 lxb006 kernel: Initializing CPU#0
Feb 24 14:44:56 lxb006 kernel: Detected 1996.646 MHz processor.
Feb 24 14:44:56 lxb006 kernel: Console: colour VGA+ 80x25
Feb 24 14:44:56 lxb006 kernel: Calibrating delay loop... 3984.58 BogoMIPS
Feb 24 14:44:56 lxb006 kernel: Memory: 1032124k/1048064k available (1763k
kernel code, 15540k reserved, 746k data, 168k init, 130548k highmem)
Feb 24 14:44:56 lxb006 kernel: Dentry cache hash table entries: 131072 (order:
8, 1048576 bytes)
Feb 24 14:44:56 lxb006 kernel: Inode cache hash table entries: 65536 (order:
7, 524288 bytes)
Feb 24 14:44:56 lxb006 kernel: Mount-cache hash table entries: 16384 (order:
5, 131072 bytes)
Feb 24 14:44:56 lxb006 kernel: Buffer-cache hash table entries: 65536 (order:
6, 262144 bytes)
Feb 24 14:44:56 lxb006 kernel: Page-cache hash table entries: 262144 (order:
8, 1048576 bytes)
Feb 24 14:44:56 lxb006 kernel: CPU: L1 I cache: 0K, L1 D cache: 8K
Feb 24 14:44:56 lxb006 kernel: CPU: L2 cache: 512K
Feb 24 14:44:56 lxb006 kernel: CPU: Physical Processor ID: 0
Feb 24 14:44:56 lxb006 kernel: Intel machine check architecture supported.
Feb 24 14:44:56 lxb006 kernel: Intel machine check reporting enabled on CPU#0.
Feb 24 14:44:56 lxb006 kernel: CPU: After generic, caps: 3febfbff 00000000
00000000 00000000
Feb 24 14:44:56 lxb006 kernel: CPU: Common caps: 3febfbff 00000000
00000000 00000000
Feb 24 14:44:56 lxb006 kernel: Enabling fast FPU save and restore... done.
Feb 24 14:45:00 lxb006 kernel: Enabling unmasked SIMD FPU exception support...
done.
Feb 24 14:45:00 lxb006 kernel: Checking 'hlt' instruction... OK.
Feb 24 14:45:00 lxb006 kernel: POSIX conformance testing by UNIFIX
Feb 24 14:45:00 lxb006 kernel: mtrr: v1.40 (20010327) Richard Gooch
([email protected])
Feb 24 14:45:00 lxb006 kernel: mtrr: detected mtrr type: Intel
Feb 24 14:45:00 lxb006 kernel: CPU: L1 I cache: 0K, L1 D cache: 8K
Feb 24 14:45:00 lxb006 kernel: CPU: L2 cache: 512K
Feb 24 14:45:00 lxb006 kernel: CPU: Physical Processor ID: 0
Feb 24 14:45:00 lxb006 kernel: Intel machine check reporting enabled on CPU#0.
Feb 24 14:45:00 lxb006 kernel: CPU: After generic, caps: 3febfbff 00000000
00000000 00000000
Feb 24 14:45:00 lxb006 kernel: CPU: Common caps: 3febfbff 00000000
00000000 00000000
Feb 24 14:45:00 lxb006 kernel: CPU0: Intel(R) XEON(TM) CPU 2.00GHz stepping 04
Feb 24 14:45:00 lxb006 kernel: per-CPU timeslice cutoff: 1462.69 usecs.
Feb 24 14:45:00 lxb006 kernel: enabled ExtINT on CPU#0
Feb 24 14:45:00 lxb006 kernel: ESR value before enabling vector: 00000000
Feb 24 14:45:00 lxb006 kernel: ESR value after enabling vector: 00000000
Feb 24 14:45:00 lxb006 kernel: Booting processor 1/1 eip 2000
Feb 24 14:45:00 lxb006 kernel: Initializing CPU#1
Feb 24 14:45:00 lxb006 kernel: masked ExtINT on CPU#1
Feb 24 14:45:00 lxb006 kernel: ESR value before enabling vector: 00000000
Feb 24 14:45:00 lxb006 kernel: ESR value after enabling vector: 00000000
Feb 24 14:45:00 lxb006 kernel: Calibrating delay loop... 3984.58 BogoMIPS
Feb 24 14:45:00 lxb006 kernel: CPU: L1 I cache: 0K, L1 D cache: 8K
Feb 24 14:45:00 lxb006 kernel: CPU: L2 cache: 512K
Feb 24 14:45:00 lxb006 kernel: CPU: Physical Processor ID: 0
Feb 24 14:45:00 lxb006 kernel: Intel machine check reporting enabled on CPU#1.
Feb 24 14:45:00 lxb006 kernel: CPU: After generic, caps: 3febfbff 00000000
00000000 00000000
Feb 24 14:45:00 lxb006 kernel: CPU: Common caps: 3febfbff 00000000
00000000 00000000
Feb 24 14:45:00 lxb006 kernel: CPU1: Intel(R) XEON(TM) CPU 2.00GHz stepping 04
Feb 24 14:45:00 lxb006 kernel: Booting processor 2/6 eip 2000
Feb 24 14:45:00 lxb006 kernel: Initializing CPU#2
Feb 24 14:45:00 lxb006 kernel: masked ExtINT on CPU#2
Feb 24 14:45:00 lxb006 kernel: ESR value before enabling vector: 00000000
Feb 24 14:45:00 lxb006 kernel: ESR value after enabling vector: 00000000
Feb 24 14:45:00 lxb006 kernel: Calibrating delay loop... 3984.58 BogoMIPS
Feb 24 14:45:00 lxb006 kernel: CPU: L1 I cache: 0K, L1 D cache: 8K
Feb 24 14:45:00 lxb006 kernel: CPU: L2 cache: 512K
Feb 24 14:45:00 lxb006 kernel: CPU: Physical Processor ID: 3
Feb 24 14:45:00 lxb006 kernel: Intel machine check reporting enabled on CPU#2.
Feb 24 14:45:00 lxb006 kernel: CPU: After generic, caps: 3febfbff 00000000
00000000 00000000
Feb 24 14:45:00 lxb006 kernel: CPU: Common caps: 3febfbff 00000000
00000000 00000000
Feb 24 14:45:00 lxb006 kernel: CPU2: Intel(R) XEON(TM) CPU 2.00GHz stepping 04
Feb 24 14:45:00 lxb006 kernel: Booting processor 3/7 eip 2000
Feb 24 14:45:00 lxb006 kernel: Initializing CPU#3
Feb 24 14:45:00 lxb006 kernel: masked ExtINT on CPU#3
Feb 24 14:45:00 lxb006 kernel: ESR value before enabling vector: 00000000
Feb 24 14:45:00 lxb006 kernel: ESR value after enabling vector: 00000000
Feb 24 14:45:00 lxb006 kernel: Calibrating delay loop... 3984.58 BogoMIPS
Feb 24 14:45:00 lxb006 kernel: CPU: L1 I cache: 0K, L1 D cache: 8K
Feb 24 14:45:00 lxb006 kernel: CPU: L2 cache: 512K
Feb 24 14:45:00 lxb006 kernel: CPU: Physical Processor ID: 3
Feb 24 14:45:04 lxb006 kernel: Intel machine check reporting enabled on CPU#3.
Feb 24 14:45:04 lxb006 kernel: CPU: After generic, caps: 3febfbff 00000000
00000000 00000000
Feb 24 14:45:04 lxb006 kernel: CPU: Common caps: 3febfbff 00000000
00000000 00000000
Feb 24 14:45:04 lxb006 kernel: CPU3: Intel(R) XEON(TM) CPU 2.00GHz stepping 04
Feb 24 14:45:04 lxb006 kernel: Total of 4 processors activated (15938.35
BogoMIPS).
Feb 24 14:45:04 lxb006 kernel: cpu_sibling_map[0] = 1
Feb 24 14:45:04 lxb006 kernel: cpu_sibling_map[1] = 0
Feb 24 14:45:04 lxb006 kernel: cpu_sibling_map[2] = 3
Feb 24 14:45:04 lxb006 kernel: cpu_sibling_map[3] = 2
Feb 24 14:45:04 lxb006 kernel: ENABLING IO-APIC IRQs
Feb 24 14:45:04 lxb006 kernel: Setting 2 in the phys_id_present_map
Feb 24 14:45:08 lxb006 kernel: ...changing IO-APIC physical APIC ID to 2 ...
ok.
Feb 24 14:45:08 lxb006 kernel: Setting 3 in the phys_id_present_map
Feb 24 14:45:08 lxb006 kernel: ...changing IO-APIC physical APIC ID to 3 ...
ok.
Feb 24 14:45:08 lxb006 kernel: Setting 4 in the phys_id_present_map
Feb 24 14:45:08 lxb006 kernel: ...changing IO-APIC physical APIC ID to 4 ...
ok.
Feb 24 14:45:08 lxb006 kernel: Setting 5 in the phys_id_present_map
Feb 24 14:45:08 lxb006 kernel: ...changing IO-APIC physical APIC ID to 5 ...
ok.
Feb 24 14:45:08 lxb006 kernel: Setting 8 in the phys_id_present_map
Feb 24 14:45:08 lxb006 kernel: ...changing IO-APIC physical APIC ID to 8 ...
ok.
Feb 24 14:45:08 lxb006 kernel: init IO_APIC IRQs
Feb 24 14:45:08 lxb006 kernel: IO-APIC (apicid-pin) 2-0, 2-10, 2-11, 2-20,
2-21, 2-22, 2-23, 3-0, 3-1, 3-2, 3-3, 3-4, 3-5, 3-6, 3-7, 3-8, 3-9, 3-10,
3-11, 3-12, 3-13, 3-14, 3-15, 3-16, 3-17, 3-18, 3-19, 3-20, 3-21, 3-22, 3-23,
4-0, 4-1, 4-2, 4-3, 4-4, 4-5, 4-6, 4-7, 4-8, 4-9, 4-10, 4-11, 4-12, 4-13,
4-14, 4-15, 4-16, 4-17, 4-18, 4-19, 4-20, 4-21, 4-22, 4-23, 5-0, 5-1, 5-2,
5-3, 5-4, 5-5, 5-6, 5-7, 5-8, 5-9, 5-10, 5-11, 5-12, 5-13, 5-14, 5-15, 5-16,
5-17, 5-18, 5-19, 5-20, 5-21, 5-22, 5-23, 8-0, 8-1, 8-2, 8-3, 8-4, 8-5, 8-6,
8-7, 8-8, 8-9, 8-10, 8-11, 8-12, 8-13, 8-14, 8-15, 8-16, 8-17, 8-18, 8-19,
8-20, 8-21, 8-22, 8-23 not connected.
Feb 24 14:45:08 lxb006 kernel: ..TIMER: vector=0x31 pin1=2 pin2=0
Feb 24 14:45:08 lxb006 kernel: number of MP IRQ sources: 19.
Feb 24 14:45:08 lxb006 kernel: number of IO-APIC #2 registers: 24.
Feb 24 14:45:12 lxb006 kernel: number of IO-APIC #3 registers: 24.
Feb 24 14:45:12 lxb006 kernel: number of IO-APIC #4 registers: 24.
Feb 24 14:45:12 lxb006 kernel: number of IO-APIC #5 registers: 24.
Feb 24 14:45:12 lxb006 kernel: number of IO-APIC #8 registers: 24.
Feb 24 14:45:12 lxb006 kernel: testing the IO APIC.......................
Feb 24 14:45:12 lxb006 kernel:
Feb 24 14:45:12 lxb006 kernel: IO APIC #2......
Feb 24 14:45:12 lxb006 kernel: .... register #00: 02008000
Feb 24 14:45:12 lxb006 kernel: ....... : physical APIC id: 02
Feb 24 14:45:12 lxb006 kernel: WARNING: unexpected IO-APIC, please mail
Feb 24 14:45:12 lxb006 kernel: to [email protected]
Feb 24 14:45:12 lxb006 kernel: .... register #01: 00178020
Feb 24 14:45:12 lxb006 kernel: ....... : max redirection entries: 0017
Feb 24 14:45:12 lxb006 kernel: ....... : PRQ implemented: 1
Feb 24 14:45:12 lxb006 kernel: ....... : IO APIC version: 0020
Feb 24 14:45:12 lxb006 kernel: .... register #02: 00000000
Feb 24 14:45:12 lxb006 kernel: ....... : arbitration: 00
Feb 24 14:45:12 lxb006 kernel: .... IRQ redirection table:
Feb 24 14:45:12 lxb006 kernel: NR Log Phy Mask Trig IRR Pol Stat Dest Deli
Vect:
Feb 24 14:45:16 lxb006 kernel: 00 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:16 lxb006 kernel: 01 00F 0F 0 0 0 0 0 1 1 39
Feb 24 14:45:16 lxb006 kernel: 02 00F 0F 0 0 0 0 0 1 1 31
Feb 24 14:45:16 lxb006 kernel: 03 00F 0F 0 0 0 0 0 1 1 41
Feb 24 14:45:16 lxb006 kernel: 04 00F 0F 0 0 0 0 0 1 1 49
Feb 24 14:45:16 lxb006 kernel: 05 00F 0F 0 0 0 0 0 1 1 51
Feb 24 14:45:16 lxb006 kernel: 06 00F 0F 0 0 0 0 0 1 1 59
Feb 24 14:45:16 lxb006 kernel: 07 00F 0F 0 0 0 0 0 1 1 61
Feb 24 14:45:16 lxb006 kernel: 08 00F 0F 0 0 0 0 0 1 1 69
Feb 24 14:45:16 lxb006 kernel: 09 00F 0F 0 0 0 0 0 1 1 71
Feb 24 14:45:16 lxb006 kernel: 0a 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:16 lxb006 kernel: 0b 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:16 lxb006 kernel: 0c 00F 0F 0 0 0 0 0 1 1 79
Feb 24 14:45:16 lxb006 kernel: 0d 00F 0F 0 0 0 0 0 1 1 81
Feb 24 14:45:16 lxb006 kernel: 0e 00F 0F 0 0 0 0 0 1 1 89
Feb 24 14:45:16 lxb006 kernel: 0f 00F 0F 0 0 0 0 0 1 1 91
Feb 24 14:45:16 lxb006 kernel: 10 00F 0F 1 1 0 1 0 1 1 99
Feb 24 14:45:16 lxb006 kernel: 11 00F 0F 1 1 0 1 0 1 1 A1
Feb 24 14:45:16 lxb006 kernel: 12 00F 0F 1 1 0 1 0 1 1 A9
Feb 24 14:45:16 lxb006 kernel: 13 00F 0F 1 1 0 1 0 1 1 B1
Feb 24 14:45:16 lxb006 kernel: 14 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:16 lxb006 kernel: 15 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:16 lxb006 kernel: 16 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:16 lxb006 kernel: 17 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:16 lxb006 kernel:
Feb 24 14:45:16 lxb006 kernel: IO APIC #3......
Feb 24 14:45:16 lxb006 kernel: .... register #00: 03000000
Feb 24 14:45:16 lxb006 kernel: ....... : physical APIC id: 03
Feb 24 14:45:16 lxb006 kernel: .... register #01: 00178020
Feb 24 14:45:16 lxb006 kernel: ....... : max redirection entries: 0017
Feb 24 14:45:16 lxb006 kernel: ....... : PRQ implemented: 1
Feb 24 14:45:16 lxb006 kernel: ....... : IO APIC version: 0020
Feb 24 14:45:16 lxb006 kernel: .... register #02: 03000000
Feb 24 14:45:16 lxb006 kernel: ....... : arbitration: 03
Feb 24 14:45:16 lxb006 kernel: .... IRQ redirection table:
Feb 24 14:45:16 lxb006 kernel: NR Log Phy Mask Trig IRR Pol Stat Dest Deli
Vect:
Feb 24 14:45:16 lxb006 kernel: 00 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 01 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 02 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 03 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 04 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 05 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 06 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 07 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 08 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 09 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 0a 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 0b 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 0c 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 0d 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 0e 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 0f 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 10 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 11 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 12 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 13 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 14 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 15 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 16 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 17 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel:
Feb 24 14:45:20 lxb006 kernel: IO APIC #4......
Feb 24 14:45:20 lxb006 kernel: .... register #00: 04000000
Feb 24 14:45:20 lxb006 kernel: ....... : physical APIC id: 04
Feb 24 14:45:20 lxb006 kernel: .... register #01: 00178020
Feb 24 14:45:20 lxb006 kernel: ....... : max redirection entries: 0017
Feb 24 14:45:20 lxb006 kernel: ....... : PRQ implemented: 1
Feb 24 14:45:20 lxb006 kernel: ....... : IO APIC version: 0020
Feb 24 14:45:20 lxb006 kernel: .... register #02: 04000000
Feb 24 14:45:20 lxb006 kernel: ....... : arbitration: 04
Feb 24 14:45:20 lxb006 kernel: .... IRQ redirection table:
Feb 24 14:45:20 lxb006 kernel: NR Log Phy Mask Trig IRR Pol Stat Dest Deli
Vect:
Feb 24 14:45:20 lxb006 kernel: 00 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 01 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 02 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 03 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 04 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 05 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 06 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 07 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 08 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 09 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 0a 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 0b 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 0c 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:20 lxb006 kernel: 0d 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 0e 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 0f 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 10 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 11 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 12 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 13 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 14 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 15 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 16 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 17 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel:
Feb 24 14:45:24 lxb006 kernel: IO APIC #5......
Feb 24 14:45:24 lxb006 kernel: .... register #00: 05000000
Feb 24 14:45:24 lxb006 kernel: ....... : physical APIC id: 05
Feb 24 14:45:24 lxb006 kernel: .... register #01: 00178020
Feb 24 14:45:24 lxb006 kernel: ....... : max redirection entries: 0017
Feb 24 14:45:24 lxb006 kernel: ....... : PRQ implemented: 1
Feb 24 14:45:24 lxb006 kernel: ....... : IO APIC version: 0020
Feb 24 14:45:24 lxb006 kernel: .... register #02: 05000000
Feb 24 14:45:24 lxb006 kernel: ....... : arbitration: 05
Feb 24 14:45:24 lxb006 kernel: .... IRQ redirection table:
Feb 24 14:45:24 lxb006 kernel: NR Log Phy Mask Trig IRR Pol Stat Dest Deli
Vect:
Feb 24 14:45:24 lxb006 kernel: 00 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 01 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 02 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 03 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 04 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 05 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 06 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 07 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 08 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 09 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 0a 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 0b 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 0c 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 0d 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 0e 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 0f 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 10 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 11 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 12 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 13 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 14 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 15 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 16 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel: 17 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:24 lxb006 kernel:
Feb 24 14:45:24 lxb006 kernel: IO APIC #8......
Feb 24 14:45:24 lxb006 kernel: .... register #00: 08000000
Feb 24 14:45:26 lxb006 kernel: ....... : physical APIC id: 08
Feb 24 14:45:26 lxb006 kernel: .... register #01: 00178020
Feb 24 14:45:26 lxb006 kernel: ....... : max redirection entries: 0017
Feb 24 14:45:26 lxb006 kernel: ....... : PRQ implemented: 1
Feb 24 14:45:26 lxb006 kernel: ....... : IO APIC version: 0020
Feb 24 14:45:26 lxb006 kernel: .... register #02: 08000000
Feb 24 14:45:26 lxb006 kernel: ....... : arbitration: 08
Feb 24 14:45:26 lxb006 kernel: .... IRQ redirection table:
Feb 24 14:45:26 lxb006 kernel: NR Log Phy Mask Trig IRR Pol Stat Dest Deli
Vect:
Feb 24 14:45:26 lxb006 kernel: 00 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:26 lxb006 kernel: 01 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:26 lxb006 kernel: 02 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:26 lxb006 kernel: 03 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:26 lxb006 kernel: 04 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:26 lxb006 kernel: 05 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:26 lxb006 kernel: 06 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:26 lxb006 kernel: 07 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:26 lxb006 kernel: 08 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:26 lxb006 kernel: 09 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:26 lxb006 kernel: 0a 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:26 lxb006 kernel: 0b 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:30 lxb006 kernel: 0c 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:30 lxb006 kernel: 0d 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:30 lxb006 kernel: 0e 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:30 lxb006 kernel: 0f 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:30 lxb006 kernel: 10 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:30 lxb006 kernel: 11 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:30 lxb006 kernel: 12 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:30 lxb006 kernel: 13 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:30 lxb006 kernel: 14 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:30 lxb006 kernel: 15 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:30 lxb006 kernel: 16 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:30 lxb006 kernel: 17 000 00 1 0 0 0 0 0 0 00
Feb 24 14:45:30 lxb006 kernel: IRQ to pin mappings:
Feb 24 14:45:30 lxb006 kernel: IRQ0 -> 0:2
Feb 24 14:45:30 lxb006 kernel: IRQ1 -> 0:1
Feb 24 14:45:30 lxb006 kernel: IRQ3 -> 0:3
Feb 24 14:45:30 lxb006 kernel: IRQ4 -> 0:4
Feb 24 14:45:30 lxb006 kernel: IRQ5 -> 0:5
Feb 24 14:45:30 lxb006 kernel: IRQ6 -> 0:6
Feb 24 14:45:30 lxb006 kernel: IRQ7 -> 0:7
Feb 24 14:45:30 lxb006 kernel: IRQ8 -> 0:8
Feb 24 14:45:30 lxb006 kernel: IRQ9 -> 0:9
Feb 24 14:45:30 lxb006 kernel: IRQ12 -> 0:12
Feb 24 14:45:30 lxb006 kernel: IRQ13 -> 0:13
Feb 24 14:45:30 lxb006 kernel: IRQ14 -> 0:14
Feb 24 14:45:30 lxb006 kernel: IRQ15 -> 0:15
Feb 24 14:45:30 lxb006 kernel: IRQ16 -> 0:16
Feb 24 14:45:30 lxb006 kernel: IRQ17 -> 0:17
Feb 24 14:45:30 lxb006 kernel: IRQ18 -> 0:18
Feb 24 14:45:30 lxb006 kernel: IRQ19 -> 0:19
Feb 24 14:45:30 lxb006 kernel: .................................... done.
Feb 24 14:45:30 lxb006 kernel: Using local APIC timer interrupts.
Feb 24 14:45:30 lxb006 kernel: calibrating APIC timer ...
Feb 24 14:45:30 lxb006 kernel: ..... CPU clock speed is 1996.6181 MHz.
Feb 24 14:45:30 lxb006 kernel: ..... host bus clock speed is 99.8308 MHz.
Feb 24 14:45:30 lxb006 kernel: cpu: 0, clocks: 998308, slice: 199661
Feb 24 14:45:30 lxb006 kernel: CPU0<T0:998304,T1:798640,D:3,S:199661,C:998308>
Feb 24 14:45:30 lxb006 kernel: cpu: 1, clocks: 998308, slice: 199661
Feb 24 14:45:30 lxb006 kernel: cpu: 3, clocks: 998308, slice: 199661
Feb 24 14:45:30 lxb006 kernel: cpu: 2, clocks: 998308, slice: 199661
Feb 24 14:45:30 lxb006 kernel: CPU1<T0:998304,T1:598976,D:6,S:199661,C:998308>
Feb 24 14:45:30 lxb006 kernel: CPU2<T0:998304,T1:399312,D:9,S:199661,C:998308>
Feb 24 14:45:30 lxb006 kernel: CPU3<T0:998304,T1:199648,D:12,S:199661,C:998308>
Feb 24 14:45:30 lxb006 kernel: checking TSC synchronization across CPUs:
passed.
Feb 24 14:45:30 lxb006 kernel: Waiting on wait_init_idle (map = 0xe)
Feb 24 14:45:30 lxb006 kernel: All processors have done init_idle
Feb 24 14:45:30 lxb006 kernel: mtrr: your CPUs had inconsistent fixed MTRR
settings
Feb 24 14:45:30 lxb006 kernel: mtrr: probably your BIOS does not setup all
CPUs
Feb 24 14:45:30 lxb006 kernel: PCI: PCI BIOS revision 2.10 entry at 0xfd875,
last bus=7
Feb 24 14:45:30 lxb006 kernel: PCI: Using configuration type 1
Feb 24 14:45:30 lxb006 kernel: PCI: Probing PCI hardware
Feb 24 14:45:30 lxb006 kernel: Transparent bridge - Intel Corp. 82801BA/CA/DB
PCI Bridge
Feb 24 14:45:30 lxb006 kernel: PCI: Discovered primary peer bus 10 [IRQ]
Feb 24 14:45:30 lxb006 kernel: PCI: Discovered primary peer bus 11 [IRQ]
Feb 24 14:45:30 lxb006 kernel: PCI: Discovered primary peer bus 12 [IRQ]
Feb 24 14:45:34 lxb006 kernel: PCI: Using IRQ router PIIX [8086/2480] at
00:1f.0
Feb 24 14:45:34 lxb006 kernel: PCI->APIC IRQ transform: (B0,I29,P0) -> 16
Feb 24 14:45:34 lxb006 kernel: PCI->APIC IRQ transform: (B0,I29,P1) -> 19
Feb 24 14:45:34 lxb006 kernel: PCI->APIC IRQ transform: (B0,I29,P2) -> 18
Feb 24 14:45:34 lxb006 kernel: PCI->APIC IRQ transform: (B7,I1,P0) -> 16
Feb 24 14:45:34 lxb006 kernel: PCI->APIC IRQ transform: (B7,I2,P0) -> 17
Feb 24 14:45:34 lxb006 kernel: isapnp: Scanning for PnP cards...
Feb 24 14:45:34 lxb006 kernel: isapnp: No Plug & Play device found
Feb 24 14:45:34 lxb006 kernel: Linux NET4.0 for Linux 2.4
Feb 24 14:45:34 lxb006 kernel: Based upon Swansea University Computer Society
NET3.039
Feb 24 14:45:34 lxb006 kernel: Initializing RT netlink socket
Feb 24 14:45:34 lxb006 kernel: Starting kswapd
Feb 24 14:45:34 lxb006 kernel: allocated 32 pages and 32 bhs reserved for the
highmem bounces
Feb 24 14:45:34 lxb006 kernel: VFS: Diskquotas version dquot_6.4.0 initialized
Feb 24 14:45:34 lxb006 kernel: Journalled Block Device driver loaded
Feb 24 14:45:34 lxb006 kernel: i2c-core.o: i2c core module
Feb 24 14:45:34 lxb006 kernel: i2c-proc.o version 2.6.1 (20010825)
Feb 24 14:45:34 lxb006 kernel: pty: 2048 Unix98 ptys configured
Feb 24 14:45:34 lxb006 kernel: Serial driver version 5.05c (2001-07-08) with
MANY_PORTS SHARE_IRQ SERIAL_PCI ISAPNP enabled
Feb 24 14:45:34 lxb006 kernel: ttyS00 at 0x03f8 (irq = 4) is a 16550A
Feb 24 14:45:34 lxb006 kernel: ttyS01 at 0x02f8 (irq = 3) is a 16550A
Feb 24 14:45:34 lxb006 kernel: Real Time Clock Driver v1.10e
Feb 24 14:45:34 lxb006 kernel: Uniform Multi-Platform E-IDE driver Revision:
6.31
Feb 24 14:45:34 lxb006 kernel: ide: Assuming 33MHz system bus speed for PIO
modes; override with idebus=xx
Feb 24 14:45:34 lxb006 kernel: ICH3: IDE controller on PCI bus 00 dev f9
Feb 24 14:45:34 lxb006 kernel: PCI: Device 00:1f.1 not available because of
resource collisions
Feb 24 14:45:34 lxb006 kernel: PCI: No IRQ known for interrupt pin A of device
00:1f.1. Probably buggy MP table.
Feb 24 14:45:34 lxb006 kernel: ICH3: BIOS setup was incomplete.
Feb 24 14:45:34 lxb006 kernel: ICH3: chipset revision 2
Feb 24 14:45:34 lxb006 kernel: ICH3: not 100%% native mode: will probe irqs
later
Feb 24 14:45:34 lxb006 kernel: ide0: BM-DMA at 0x1460-0x1467, BIOS
settings: hda:pio, hdb:pio
Feb 24 14:45:34 lxb006 kernel: ide1: BM-DMA at 0x1468-0x146f, BIOS
settings: hdc:pio, hdd:pio
Feb 24 14:45:34 lxb006 kernel: hda: ST340016A, ATA DISK drive
Feb 24 14:45:34 lxb006 kernel: ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Feb 24 14:45:34 lxb006 kernel: hda: 78165360 sectors (40021 MB) w/2048KiB
Cache, CHS=4865/255/63
Feb 24 14:45:34 lxb006 kernel: Partition check:
Feb 24 14:45:34 lxb006 kernel: hda: hda1 hda4 < hda5 hda6 >
Feb 24 14:45:34 lxb006 kernel: Floppy drive(s): fd0 is 1.44M
Feb 24 14:45:34 lxb006 kernel: FDC 0 is a post-1991 82077
Feb 24 14:45:34 lxb006 kernel: RAMDISK driver initialized: 16 RAM disks of
4096K size 1024 blocksize
Feb 24 14:45:34 lxb006 kernel: loop: loaded (max 8 devices)
Feb 24 14:45:34 lxb006 kernel: eepro100.c:v1.09j-t 9/29/99 Donald Becker
http://www.scyld.com/network/eepro100.html
Feb 24 14:45:34 lxb006 kernel: eepro100.c: $Revision: 1.36 $ 2000/11/17
Modified by Andrey V. Savochkin <[email protected]
g> and others
Feb 24 14:45:34 lxb006 kernel: eth0: OEM i82557/i82558 10/100 Ethernet,
00:30:48:23:4C:E9, IRQ 17.
Feb 24 14:45:34 lxb006 kernel: Board assembly 000000-000, Physical
connectors present: RJ45
Feb 24 14:45:34 lxb006 kernel: Primary interface chip i82555 PHY #1.
Feb 24 14:45:34 lxb006 kernel: General self-test: passed.
Feb 24 14:45:34 lxb006 kernel: Serial sub-system self-test: passed.
Feb 24 14:45:38 lxb006 kernel: Internal registers self-test: passed.
Feb 24 14:45:38 lxb006 kernel: ROM checksum self-test: passed (0xb874c1d3).
Feb 24 14:45:38 lxb006 kernel: SCSI subsystem driver Revision: 1.00
Feb 24 14:45:38 lxb006 kernel: kmod: failed to exec /sbin/modprobe -s -k
scsi_hostadapter, errno = 2
Feb 24 14:45:38 lxb006 kernel: kmod: failed to exec /sbin/modprobe -s -k
scsi_hostadapter, errno = 2
Feb 24 14:45:38 lxb006 kernel: Linux Kernel Card Services 3.1.22
Feb 24 14:45:38 lxb006 kernel: options: [pci] [cardbus]
Feb 24 14:45:38 lxb006 kernel: I2O Core - (C) Copyright 1999 Red Hat Software
Feb 24 14:45:38 lxb006 kernel: I2O: Event thread created as pid 13
Feb 24 14:45:38 lxb006 kernel: I2O configuration manager v 0.04.
Feb 24 14:45:38 lxb006 kernel: (C) Copyright 1999 Red Hat Software
Feb 24 14:45:38 lxb006 kernel: NET4: Linux TCP/IP 1.0 for NET4.0
Feb 24 14:45:38 lxb006 kernel: IP Protocols: ICMP, UDP, TCP
Feb 24 14:45:38 lxb006 kernel: IP: routing cache hash table of 4096 buckets,
64Kbytes
Feb 24 14:45:38 lxb006 kernel: TCP: Hash tables configured (established 131072
bind 43690)
Feb 24 14:45:38 lxb006 kernel: IP-Config: Complete:
Feb 24 14:45:38 lxb006 kernel: device=eth0, addr=140.181.97.13,
mask=255.255.192.0, gw=140.181.96.1,
Feb 24 14:45:38 lxb006 kernel: host=140.181.97.13, domain=,
nis-domain=(none),
Feb 24 14:45:38 lxb006 kernel: bootserver=140.181.97.209,
rootserver=140.181.97.209, rootpath=
Feb 24 14:45:38 lxb006 kernel: NET4: Unix domain sockets 1.0/SMP for Linux
NET4.0.
Feb 24 14:45:38 lxb006 kernel: ds: no socket drivers loaded!
Feb 24 14:45:38 lxb006 kernel: Looking up port of RPC 100003/2 on
140.181.97.209
Feb 24 14:45:38 lxb006 kernel: Looking up port of RPC 100005/1 on
140.181.97.209
Feb 24 14:45:38 lxb006 kernel: VFS: Mounted root (nfs filesystem).
Feb 24 14:45:38 lxb006 kernel: Freeing unused kernel memory: 168k freed
Feb 24 14:45:38 lxb006 kernel: Adding Swap: 2097136k swap-space (priority -1)
Feb 24 14:45:38 lxb006 kernel: sk98lin: Network Device Driver v4.06
Feb 24 14:45:38 lxb006 kernel: Copyright (C) 2000-2001 SysKonnect GmbH.
Feb 24 14:45:38 lxb006 kernel: No adapter found
Feb 24 14:45:38 lxb006 kernel: i2c-piix4.o version 2.6.3 (20020322)
Feb 24 14:45:38 lxb006 kernel: i2c-piix4.o: Error: Can't detect PIIX4 or
compatible device!
Feb 24 14:45:38 lxb006 kernel: i2c-piix4.o: Device not detected, module not
inserted.
Feb 24 14:45:38 lxb006 kernel: i2c-isa.o version 2.6.3 (20020322)
Feb 24 14:45:38 lxb006 kernel: i2c-core.o: adapter ISA main adapter registered
as adapter 0.
Feb 24 14:45:38 lxb006 kernel: i2c-isa.o: ISA bus access for i2c modules
initialized.
Feb 24 14:45:38 lxb006 kernel: w83781d.o version 2.6.3 (20020322)
Feb 24 14:45:38 lxb006 kernel: i2c-core.o: driver W83781D sensor driver
registered.
Feb 24 14:45:38 lxb006 kernel: i2c-core.o: client [W83627HF chip] registered
to adapter [ISA main adapter](pos. 0).
Feb 24 14:45:38 lxb006 kernel: eeprom.o version 2.6.3 (20020322)
Feb 24 14:45:38 lxb006 kernel: i2c-core.o: driver EEPROM READER registered.
Feb 24 14:45:38 lxb006 kernel: scsi0 : SCSI host adapter emulation for IDE
ATAPI devices
Feb 24 14:45:38 lxb006 kernel: md: md driver 0.90.0 MAX_MD_DEVS=256,
MD_SB_DISKS=27
Feb 24 14:45:38 lxb006 kernel: kjournald starting. Commit interval 5 seconds
Feb 24 14:45:38 lxb006 kernel: EXT3 FS 2.4-0.9.19, 19 August 2002 on
ide0(3,6), internal journal
Feb 24 14:45:38 lxb006 kernel: EXT3-fs: mounted filesystem with ordered data
mode.
Feb 24 14:45:38 lxb006 kernel: Installing knfsd (copyright (C) 1996
[email protected]



2003-03-03 08:43:20

by Zwane Mwaikambo

Subject: Re: Kernel Bug at spinlock.h ?!

On Mon, 3 Mar 2003, ChristopherHuhn wrote:

> Hi folks,
>
> we have some problems with the 2.4.20 kernel on our linux farm, mainly - but
> not exclusivly - on our batch processing 2x2GHz Xeon (with hyperthreading),
> Intel e7500 supermicro P4-DPE boxes running - almost plain - Debian woody.
>
> On these boxes (ca. 60) we sometimes get a kernel panic after an uptime of 2
> days or more. The process causing the panic is sometimes a users number
> crunching process but more often ksoftirqd_CPU0 or swapper.
>
> Today we got this profound error msg:
> Kernel Bug at /usr/src/kernel-source-2.4.20/include/asm/spinlock.h:105!

Sounds like possible memory corruption (can you vouch for the reliability
of your RAM?). It might be worthwhile posting the oops in its entirety. Is
EIP normally in __run_timers? Do you run a heavy networking load?

Zwane
--
function.linuxpower.ca

2003-03-03 10:02:09

by Christopher Huhn

Subject: Re: Kernel Bug at spinlock.h ?!

Zwane Mwaikambo wrote:

>Sounds like possible memory corruption (can you vouch for the reliability
>of your RAM?)
>
The same problem occurs on many of our machines, so I would rule out
memory corruption.

>Might be worthwhile posting the oops in its entirety. Is
>EIP normally in __run_timers? Do you run a heavy networking load?
>
The numbers crunched by these machines are loaded from and written to
the net, so I would assume so.

We had these machines running potato with 2.2.21 since last summer and
the kernel never oopsed.

Because of this, I suspect a bug in the SMP code of 2.4.20 or
a kernel misconfiguration on our side.

I'll send the next oops as soon as it occurs.

Kind regards,

Christopher


2003-03-03 15:03:36

by Christopher Huhn

Subject: Re: Kernel Bug at spinlock.h ?!

Hi again,

>>Sounds like possible memory corruption (can you vouch for the reliability
>>of your RAM?) Might be worthwhile posting the oops in its entirety. Is
>>EIP normally in __run_timers? Do you run a heavy networking load?
>>
As apparently every machine in our farm is affected, I cannot believe
this is corrupted memory. I have nevertheless started memtest86 on a
machine that just oopsed, but it has not found any errors (yet).

>Feb 24 14:45:34 lxb006 kernel: ICH3: BIOS setup was incomplete.
>
Does this mean we should upgrade to 2.5?

Kind regards,

Christopher


Here comes a complete oops that just occurred:

Unable to handle kernel NULL pointer dereference at virtual address 00000002
printing eip:
e40e5cfc
*pde: 00000000
Oops: 0002
Cpu: 0
EIP: 0010:[<e40e5cfc>] Not tainted
EFLAGS: 00010246
eax: 00000002 ebx: e40e5cfc ecx: c03f9208 edx: 00000000
esi: e40e5cb0 edi: 00000001 ebp: d5d15cd0 esp: d5d15cbc
ds: 0018 es: 0018 ss: 0018
Process adsmcli (pid: 13223, stackpage=d5d15000)
Stack: c02c6783 e40e5cb0 e40e4cb0 c02c66a0 0ac9682a d5d15d08 c012564b
e40e5cb0
00000000 00000000 00000000 00000001 00000000 c03f9600 c041c30c c041c30c
...
Call Trace: [<c02c6783>] [<c02c66a0>] [<c0125646>] [<c012139a>] [<c0121263>]
[<c0120fdd>] [<c02a50dc>] [<c02a3c68>] [<c02abc50>] [<c027eec2>]
[<c029c877>]
...

Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2b 68 c9 0a
<0> Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing
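
A note on reading the "Oops:" line above: on i386 this number is the page-fault error code, so it can be decoded bit by bit. The sketch below assumes the usual i386 meaning of the low three bits (bit 0: protection fault vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode). The function name decode_oops_code is made up for illustration and is not part of any kernel tool.

```python
def decode_oops_code(code_hex: str) -> dict:
    """Decode the i386 page-fault error code printed on the 'Oops:' line.

    Assumption: only the three low bits are interpreted, with their
    usual i386 meaning.
    """
    code = int(code_hex, 16)
    return {
        "cause": "protection violation" if code & 1 else "page not present",
        "access": "write" if code & 2 else "read",
        "mode": "user" if code & 4 else "kernel",
    }


# The two values that appear in this thread.
for sample in ("0000", "0002"):
    print(sample, decode_oops_code(sample))
```

Read this way, "Oops: 0002" is a kernel-mode write to a page that is not present, which fits the reported fault address 00000002 and eax: 00000002.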


2003-03-03 15:25:53

by Alan

Subject: Re: Kernel Bug at spinlock.h ?!

On Mon, 2003-03-03 at 15:13, ChristopherHuhn wrote:
> >Feb 24 14:45:34 lxb006 kernel: ICH3: BIOS setup was incomplete.
> >
> Does this mean we should upgrade to 2.5?

No, the kernel cleaned that one up for you.

2003-03-03 15:29:41

by Richard B. Johnson

Subject: Re: Kernel Bug at spinlock.h ?!

On Mon, 3 Mar 2003, ChristopherHuhn wrote:

> Hi again,
>
> >>Sounds like possible memory corruption (can you vouch for the reliability
> >>of your RAM?) Might be worthwhile posting the oops in its entirety. Is
> >>EIP normally in __run_timers? Do you run a heavy networking load?
> >>
> as apparently every machine in our farm is affected, I cannot believe in
> a corrupted memory. I've started to run memtest86 on a machine that just
> oopsed though, but it didn't find any errors (yet).
>
> >Feb 24 14:45:34 lxb006 kernel: ICH3: BIOS setup was incomplete.
> >
> Does this mean we should upgrade to 2.5?
>
> Kind regards,
>
> Christopher
>
>
> Here comes a complete oops that just occurred:
>
> Unable to handle kernel NULL pointer dereference at virtual address 00000002
> printing eip:
> e40e5cfc
> *pde: 00000000
> Oops: 0002
> Cpu: 0
> EIP: 0010:[<e40e5cfc>] Not tainted
> EFLAGS: 00010246
> eax: 00000002 ebx: e40e5cfc ecx: c03f9208 edx: 00000000
> esi: e40e5cb0 edi: 00000001 ebp: d5d15cd0 esp: d5d15cbc
> ds: 0018 es: 0018 ss: 0018
> Process adsmcli (pid: 13223, stackpage=d5d15000)
> Stack: c02c6783 e40e5cb0 e40e4cb0 c02c66a0 0ac9682a d5d15d08 c012564b
> e40e5cb0
> 00000000 00000000 00000000 00000001 00000000 c03f9600 c041c30c c041c30c
> ...
> Call Trace: [<c02c6783>] [<c02c66a0>] [<c0125646>] [<c012139a>] [<c0121263>]
> [<c0120fdd>] [<c02a50dc>] [<c02a3c68>] [<c02abc50>] [<c027eec2>]
> [<c029c877>]
> ...
>
> Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2b 68 c9 0a
> <0> Kernel panic: Aiee, killing interrupt handler!
> In interrupt handler - not syncing
>

What does "Process adsmcli" do? Does it make any special system calls,
or does it interface with a particular driver? Whatever it's doing
may have triggered the event.

> Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2b 68 c9 0a
This code is not valid. Either some hardware burped or a pointer to
a function got corrupted, both quite likely RAM related.

The "Re: Kernel Bug at spinlock.h ?!" is an eye-catcher because this
inline code cannot have any bugs or you wouldn't even have booted.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.


2003-03-03 15:20:40

by Tomas Szepe

Subject: re: Kernel Bug at spinlock.h ?!

> [[email protected]]
>
> Here comes a complete oops that just occurred:

The oops is no use here if you don't decode it first.
Please install ksymoops, read its docs, then repost.

--
Tomas Szepe <[email protected]>

2003-03-03 15:53:19

by Christopher Huhn

Subject: Re: Kernel Bug at spinlock.h ?!

Hi,

I'm sorry, I didn't know about ksymoops, as I'm not experienced with
kernel bugs yet.

Richard B. Johnson wrote:

>The "Re: Kernel Bug at spinlock.h ?!" is an eye-catcher because this
>inline code cannot have any bugs or you wouldn't even have booted.
>
I think this is the code that produced the BUG message:

static inline void spin_unlock(spinlock_t *lock)
{
        char oldval = 1;
#if SPINLOCK_DEBUG
        if (lock->magic != SPINLOCK_MAGIC)
                BUG();
...
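
To illustrate what that check does, and why it can fire without the inline code itself being buggy, here is a small user-space simulation in Python. It is only a sketch: the magic value 0xdead4ead and the two-word layout (lock word followed by magic word) are assumptions about the 2.4 i386 debug spinlock, and make_lock / magic_ok are hypothetical helper names.

```python
import struct

SPINLOCK_MAGIC = 0xDEAD4EAD  # assumed value of the 2.4 i386 debug magic
UNLOCKED = 1                 # assumed "unlocked" value of the lock word


def make_lock() -> bytearray:
    """Pack a fake spinlock_t: lock word followed by the magic word."""
    return bytearray(struct.pack("<II", UNLOCKED, SPINLOCK_MAGIC))


def magic_ok(raw: bytearray) -> bool:
    """Mimic the debug test: does the magic field still hold SPINLOCK_MAGIC?"""
    _lock_word, magic = struct.unpack("<II", raw)
    return magic == SPINLOCK_MAGIC


lock = make_lock()
print("intact lock passes:", magic_ok(lock))

# Simulate a stray write over the memory the lock lives in.
lock[:] = bytes(len(lock))
print("overwritten lock passes:", magic_ok(lock))
```

The check passes for an intact lock and fails as soon as something else has overwritten the memory the lock lives in, which is the situation BUG() is meant to catch.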

The oops occurred after an uptime of about 50 hours.

I just discovered the following messages in the syslog, right before
that oops (I never found any kernel oops logs in the syslog until now ...):

Feb 27 20:51:37 lxb039 kernel: Unable to handle kernel NULL pointer
dereference at virtual address 00000000
Feb 27 20:51:37 lxb039 kernel: printing eip:
Feb 27 20:51:37 lxb039 kernel: 00000000
Feb 27 20:51:37 lxb039 kernel: *pde = 00000000
Feb 27 20:51:37 lxb039 kernel: Oops: 0000
Feb 27 20:51:37 lxb039 kernel: CPU: 3
Feb 27 20:51:37 lxb039 kernel: EIP: 0010:[msr_exit+0/24] Not tainted
Feb 27 20:51:37 lxb039 kernel: EFLAGS: 00010246
Feb 27 20:51:37 lxb039 kernel: eax: fffffffe ebx: f1857cb0 ecx:
00000002 edx: 00000008
Feb 27 20:51:37 lxb039 kernel: esi: fffffff5 edi: f1857cb0 ebp:
f1857c90 esp: f1857c84
Feb 27 20:51:37 lxb039 kernel: ds: 0018 es: 0018 ss: 0018
Feb 27 20:51:37 lxb039 kernel: Process sh (pid: 29359, stackpage=f1857000)
Feb 27 20:51:37 lxb039 kernel: Stack: f1857cb0 f1857ca8 00000000
f1857d38 c02a8ff1 f1857cb0 f1857d58 f1856000
Feb 27 20:51:37 lxb039 kernel: 00000001 fffffefd ffffffff
00000000 00000000 00000000 00000000 00000000
Feb 27 20:51:37 lxb039 kernel: 00000000 00000000 fffffffe
00000000 00000003 f1857da8 f1857d9c 00000000
Feb 27 20:51:38 lxb039 kernel: Call Trace: [rpc_call_sync+121/164]
[rpc_run_timer+0/240] [nfs3_rpc_wrapper+54/124] [nfs3_proc_lookup+194/340]
[nfs_lookup+122/204]
Feb 27 20:51:38 lxb039 kernel: [dput+27/464]
[link_path_walk+2940/3200] [in_group_p+32/40] [vfs_permission+121/248]
[d_alloc+25/476] [real_lookup+169/360]
Feb 27 20:51:38 lxb039 kernel: [link_path_walk+2425/3200]
[path_walk+29/36] [path_lookup+30/44] [__user_walk+45/72]
[sys_stat64+26/112] [sys_close+115/140]
Feb 27 20:51:38 lxb039 kernel: [system_call+51/56]
Feb 27 20:51:38 lxb039 kernel:
Feb 27 20:51:38 lxb039 kernel: Code: Bad EIP value.


Looks like a NFS problem, huh?
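
For reading the symbolic call trace above: each klogd entry has the form [symbol+offset/size], with offset and size in decimal. A minimal Python sketch for pulling the entries apart follows. parse_klogd_trace is a hypothetical helper, and dropping all whitespace before matching is an assumption made only to undo line wrapping introduced by mail clients.

```python
import re

# One trace entry: [symbol+offset/size], numbers in decimal.
ENTRY_RE = re.compile(r"\[([^\]+]+)\+(\d+)/(\d+)\]")


def parse_klogd_trace(text):
    """Return (symbol, offset, size) tuples from klogd-style trace text."""
    compact = re.sub(r"\s+", "", text)  # rejoin entries split by wrapping
    return [(sym, int(off), int(size))
            for sym, off, size in ENTRY_RE.findall(compact)]


# First part of the call trace from the message above, including an
# entry that is split across two lines.
SAMPLE = """[rpc_call_sync+121/164]
[rpc_run_timer+0/240] [nfs3_rpc_wrapper+54/124] [nfs3_proc_lookup
+194/340] [nfs_lookup+122/204]"""

for symbol, offset, size in parse_klogd_trace(SAMPLE):
    print(f"{symbol:20s} offset {offset:4d} of {size:4d} (size 0x{size:x})")
```

The innermost entries are rpc_call_sync and rpc_run_timer. The size of 240 printed for rpc_run_timer is 0xf0 in hex, the notation ksymoops uses.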

2003-03-03 16:29:38

by Richard B. Johnson

Subject: Re: Kernel Bug at spinlock.h ?!

On Mon, 3 Mar 2003, ChristopherHuhn wrote:

> Hi,
>
> I'm sorry I didn't know about ksymoops as I'm not experienced with
> kernel bugs yet.
>
> Richard B. Johnson wrote:
>
> >The "Re: Kernel Bug at spinlock.h ?!" is an eye-catcher because this
> >inline code cannot have any bugs or you wouldn't even have booted.
> >
> I think this is the code, that produced the BUG message:
>
> static inline void spin_unlock(spinlock_t *lock)
> {
> char oldval = 1;
> #if SPINLOCK_DEBUG
> if (lock->magic != SPINLOCK_MAGIC)
> BUG();
> ...
>
> The oops occurred after an uptime of about 50 hours.
>
> I just discovered the following messages in the syslog, right before
> that oops (I never found any kernel oops logs in the syslog until now ...):
>
> Feb 27 20:51:37 lxb039 kernel: Unable to handle kernel NULL pointer
> dereference at virtual address 00000000
> Feb 27 20:51:37 lxb039 kernel: printing eip:
> Feb 27 20:51:37 lxb039 kernel: 00000000
> Feb 27 20:51:37 lxb039 kernel: *pde = 00000000
> Feb 27 20:51:37 lxb039 kernel: Oops: 0000
> Feb 27 20:51:37 lxb039 kernel: CPU: 3
> Feb 27 20:51:37 lxb039 kernel: EIP: 0010:[msr_exit+0/24] Not tainted
> Feb 27 20:51:37 lxb039 kernel: EFLAGS: 00010246
> Feb 27 20:51:37 lxb039 kernel: eax: fffffffe ebx: f1857cb0 ecx:
> 00000002 edx: 00000008
> Feb 27 20:51:37 lxb039 kernel: esi: fffffff5 edi: f1857cb0 ebp:
> f1857c90 esp: f1857c84
> Feb 27 20:51:37 lxb039 kernel: ds: 0018 es: 0018 ss: 0018
> Feb 27 20:51:37 lxb039 kernel: Process sh (pid: 29359, stackpage=f1857000)
> Feb 27 20:51:37 lxb039 kernel: Stack: f1857cb0 f1857ca8 00000000
> f1857d38 c02a8ff1 f1857cb0 f1857d58 f1856000
> Feb 27 20:51:37 lxb039 kernel: 00000001 fffffefd ffffffff
> 00000000 00000000 00000000 00000000 00000000
> Feb 27 20:51:37 lxb039 kernel: 00000000 00000000 fffffffe
> 00000000 00000003 f1857da8 f1857d9c 00000000
> Feb 27 20:51:38 lxb039 kernel: Call Trace: [rpc_call_sync+121/164]
> [rpc_run_timer+0/240] [nfs3_rpc_wrapper+54/124] [nfs3_proc_lookup
> +194/340] [nfs_lookup+122/204]
> Feb 27 20:51:38 lxb039 kernel: [dput+27/464]
> [link_path_walk+2940/3200] [in_group_p+32/40] [vfs_permission+121/248]
> [d_alloc+25/476]
> [real_lookup+169/360]
> Feb 27 20:51:38 lxb039 kernel: [link_path_walk+2425/3200]
> [path_walk+29/36] [path_lookup+30/44] [__user_walk+45/72] [sys_stat64+26/11
> 2] [sys_close+115/140]
> Feb 27 20:51:38 lxb039 kernel: [system_call+51/56]
> Feb 27 20:51:38 lxb039 kernel:
> Feb 27 20:51:38 lxb039 kernel: Code: Bad EIP value.
>
>
> Looks like a NFS problem, huh?

No. It looks like something got corrupted so a call occurred to
nonexistent code.

If you have two RAM sticks, swap them. Otherwise, do something to
change your RAM configuration. There is something wrong in some
RAM area that is corrupting code or pointers to code. It could also
be that 100 MHz RAM is being used at 130 MHz, etc.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.


2003-03-03 17:41:25

by Zwane Mwaikambo

Subject: Re: Kernel Bug at spinlock.h ?!

On Mon, 3 Mar 2003, ChristopherHuhn wrote:

> Feb 27 20:51:37 lxb039 kernel: Unable to handle kernel NULL pointer
> dereference at virtual address 00000000
> Feb 27 20:51:37 lxb039 kernel: printing eip:
> Feb 27 20:51:37 lxb039 kernel: 00000000
> Feb 27 20:51:38 lxb039 kernel: Code: Bad EIP value.
>
>
> Looks like a NFS problem, huh?

Is that absolutely the first oops? Looks valid. Could you, if possible,
try running a newer 2.4 so we can debug from there?

Cheers,
Zwane
--
function.linuxpower.ca

2003-03-04 13:47:37

by Christopher Huhn

Subject: Re: Kernel Bug at spinlock.h ?!

Zwane Mwaikambo wrote:

>On Mon, 3 Mar 2003, ChristopherHuhn wrote:
>
>
>
>>Feb 27 20:51:37 lxb039 kernel: Unable to handle kernel NULL pointer
>>dereference at virtual address 00000000
>>Feb 27 20:51:37 lxb039 kernel: printing eip:
>>Feb 27 20:51:37 lxb039 kernel: 00000000
>>Feb 27 20:51:38 lxb039 kernel: Code: Bad EIP value.
>>
>>
>>Looks like a NFS problem, huh?
>>
>>
>
>Is that absolutely the first oops?
>
It's the only one in the log file of that machine. I think it didn't die
right after that.

>Looks valid, could you, if possible,
>try running a newer 2.4 and we can debug from there.
>
>Cheers,
> Zwane
>
>
Newer means 2.4.21pre, since we are running 2.4.20?
I assume that we will not upgrade the kernel before a new stable
release, since it is - should be - a production environment.

We have some indications that our whole problem might be related to
kernel NFS and mixing 2.2.21 and 2.4.20 in both directions.

I'll compile some more oopses and give you a report tomorrow.

Have a nice day,

Christopher

2003-03-05 05:41:35

by Zwane Mwaikambo

Subject: Re: Kernel Bug at spinlock.h ?!

On Tue, 4 Mar 2003, ChristopherHuhn wrote:

> Newer means 2.4.21pre, since we are running 2.4.20?
> I assume that we will not upgrade the kernel before a new stable
> release, since it is - should be - a production environment.
>
> We have some indications, that our whole problem might be related to
> kernel NFS and mixing between 2.2.21 and 2.4.20 in both directions.
>
> I'll compile some more oopses and give you a report tomorrow.

Ok, don't worry about upgrading kernels for now. (Disclaimer: I'm no NFS
expert.) It looks like there might have been a race here.

Code: Bad EIP value
[rpc_call_sync+121/164]
[rpc_run_timer+0/240]

It looks like a possible race with rpc_execute and possibly the timer,
although I can't be certain where the other CPUs are. Do the other oopses
look somewhat similar? Could you supply them?

Zwane
--
function.linuxpower.ca

2003-03-06 13:06:07

by Christopher Huhn

Subject: Re: Kernel Bug at spinlock.h ?!

Hi again,

>It looks like a possible race with rpc_execute and possibly the timer,
>although I can't be certain where the other CPUs are. Do the other oopses
>look somewhat similar? Could you supply them?
>
>
Below are some oopses I gathered yesterday and today, all on different
machines.
I'd like to remark that we are experiencing massive NFS problems at the
moment that seem to be caused by our mixed potato 2.2 / woody 2.4
environment, i.e. linking apps on a woody system with the sources mounted
via NFS from a potato box leads to obscure I/O failures like "no space
left on device" (this never happens with woody only). So this might be a
clue here as well.

The oopses are all written down from the screen; hopefully I made few
"transmission" errors.


Unable to handle kernel paging request at virtual address 5a5a5a5a
5a5a5a5a
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<5a5a5a5a>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax: 00000002 ebx: 5a5a5a5a ecx: c03d0208 edx: 00000000
esi: c4c8f124 edi: 00000001 ebp: c8b7be14 esp: c8b7be00
ds: 0018 es: 0018 ss: 0018
Process cp (pid: 15914, stackpage=c8b7b000)
Stack: c02aca63 c4c8f124 c4c8f124 c02ac908 00000000 c8b7be4c c012566b
c4c8f124
00000000 00000000 00000000 00000001 00000000 c03d0600 c03f2c4c
c03f2c4c
c8b7be4c daf70d08 f7ed0eec c8b7be58 c01213ba c03d0600 c8b7be70
c0121283
Call Trace: [<c02aca63>] [<c02ac980>] [<c012566b>] [<c01213ba>]
[<c0121283>]
[<c0120fdd>] [<c010ac4c>] [<c0130da3>] [<c0130889>] [<c0130e8b>]
[<c0130d38>]
[<c018a3ff>] [<c0141a08>] [<c01090cf>]
Code: Bad EIP value.


>>EIP; 5a5a5a5a Before first symbol <=====

>>ebx; 5a5a5a5a Before first symbol
>>ecx; c03d0208 <irq_stat+8/400>
>>esi; c4c8f124 <_end+4877160/384eb09c>
>>ebp; c8b7be14 <_end+8763e50/384eb09c>
>>esp; c8b7be00 <_end+8763e3c/384eb09c>

Trace; c02aca63 <rpc_run_timer+e3/f0>
Trace; c02ac980 <rpc_run_timer+0/f0>
Trace; c012566b <timer_bh+2ff/488>
Trace; c01213ba <bh_action+52/e0>
Trace; c0121283 <tasklet_hi_action+63/a0>
Trace; c0120fdd <do_softirq+5d/e0>
Trace; c010ac4c <do_IRQ+198/1a8>
Trace; c0130da3 <file_read_actor+6b/d8>
Trace; c0130889 <do_generic_file_read+255/504>
Trace; c0130e8b <generic_file_read+7b/10c>
Trace; c0130d38 <file_read_actor+0/d8>
Trace; c018a3ff <nfs_file_read+9b/ac>
Trace; c0141a08 <sys_read+98/188>
Trace; c01090cf <system_call+33/38>

<0>Kernel panic: Aiee, killing interrupt handler!
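
Whether several decoded oopses "look similar" can be checked mechanically by comparing the symbol names on the ksymoops "Trace;" lines. The following Python sketch does that for two traces from this message (the first eight frames of each). trace_symbols and common_prefix are hypothetical helper names, and the regular expression assumes the "Trace; address <symbol+offset/size>" layout shown here.

```python
import re

# ksymoops line: Trace; c02aca63 <rpc_run_timer+e3/f0>  (numbers in hex)
TRACE_RE = re.compile(r"^Trace; ([0-9a-f]{8}) <([^+>]+)\+([0-9a-f]+)/([0-9a-f]+)>")


def trace_symbols(ksymoops_text):
    """Return the symbol names of all 'Trace;' lines, innermost frame first."""
    symbols = []
    for line in ksymoops_text.splitlines():
        match = TRACE_RE.match(line.strip())
        if match:
            symbols.append(match.group(2))
    return symbols


def common_prefix(traces):
    """Longest run of leading frames shared by every trace."""
    prefix = []
    for frames in zip(*traces):
        if len(set(frames)) != 1:
            break
        prefix.append(frames[0])
    return prefix


OOPS_CP = """
Trace; c02aca63 <rpc_run_timer+e3/f0>
Trace; c02ac980 <rpc_run_timer+0/f0>
Trace; c012566b <timer_bh+2ff/488>
Trace; c01213ba <bh_action+52/e0>
Trace; c0121283 <tasklet_hi_action+63/a0>
Trace; c0120fdd <do_softirq+5d/e0>
Trace; c010ac4c <do_IRQ+198/1a8>
Trace; c0130da3 <file_read_actor+6b/d8>
"""

OOPS_KSOFTIRQD = """
Trace; c02c6783 <rpc_run_timer+e3/f0>
Trace; c02c66a0 <rpc_run_timer+0/f0>
Trace; c012564b <timer_bh+2ff/488>
Trace; c012139a <bh_action+52/e0>
Trace; c0121263 <tasklet_hi_action+63/a0>
Trace; c0120fdd <do_softirq+7d/e0>
Trace; c010ac2c <do_IRQ+198/1a8>
Trace; c0117e93 <schedule+2e3/710>
"""

shared = common_prefix([trace_symbols(OOPS_CP), trace_symbols(OOPS_KSOFTIRQD)])
print(" <- ".join(shared))
```

For these two samples the shared part runs from rpc_run_timer out to do_IRQ; only the interrupted code underneath (file_read_actor in one case, schedule in the other) differs.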



Call Trace: [<c02c6783>][<c02c66a0>][<c012564b>][<c012139a>][<c0121263>]
Code: 02 00 00 00 40 01 00 00 00 00 00 00 00 00 00 c0 80 3d 35 c0
Using defaults from ksymoops -t elf32-i386 -a i386


Trace; c02c6783 <rpc_run_timer+e3/f0>
Trace; c02c66a0 <rpc_run_timer+0/f0>
Trace; c012564b <timer_bh+2ff/488>
Trace; c012139a <bh_action+52/e0>
Trace; c0121263 <tasklet_hi_action+63/a0>

Code; 00000000 Before first symbol
00000000 <_EIP>:
Code; 00000000 Before first symbol
0: 02 00 add (%eax),%al
Code; 00000002 Before first symbol
2: 00 00 add %al,(%eax)
Code; 00000004 Before first symbol
4: 40 inc %eax
Code; 00000005 Before first symbol
5: 01 00 add %eax,(%eax)
Code; 0000000f Before first symbol
f: c0 80 3d 35 c0 00 00 rolb $0x0,0xc0353d(%eax)




Unable to handle kernel NULL pointer dereference at virtual address 00000002
f6a9dd50
*pde = 00000000
Oops: 0002
CPU: 0
EIP: 0010:[<f6a9dd50>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax: 00000002 ebx: f6a9dd50 ecx: c03f9208 edx: 00000000
esi: f6a9dce4 edi: 00000001 ebp: f7ee7ecc esp: f7ee7eb8
ds: 0018 es: 0018 ss: 0018
Process ksoftirqd_CPU0 (pid: 3, stackpage=f7ee7000)
Stack: c02c6783 f6a9dce4 f6a9dce4 c02c66a0 00000000 f7ee7f04 c012564b
f6a9dce4
00000000 00000000 00000000 00000001 00000000 c03f9600 c041c30c
c041c30c
f7ee7f04 f64e9f28 f7ed0eec f7ee7f10 c012139a c03f9600 f7ee7f28
c0121263
Call Trace: [<c02c6783>] [<c02c66a0>] [<c012564b>] [<c012139a>]
[<c0121263>]
[<c0120fdd>] [<c010ac2c>] [<c0117e93>] [<c01215ce>] [<c0107448>]
Code: 00 10 e9 f7 78 dd a9 f6 18 af 28 c0 c4 1c e5 f7 00 00 00 00


>>EIP; f6a9dd50 <_end+3665c5ec/384c18fc> <=====

>>ebx; f6a9dd50 <_end+3665c5ec/384c18fc>
>>ecx; c03f9208 <irq_stat+8/400>
>>esi; f6a9dce4 <_end+3665c580/384c18fc>
>>ebp; f7ee7ecc <_end+37aa6768/384c18fc>
>>esp; f7ee7eb8 <_end+37aa6754/384c18fc>

Trace; c02c6783 <rpc_run_timer+e3/f0>
Trace; c02c66a0 <rpc_run_timer+0/f0>
Trace; c012564b <timer_bh+2ff/488>
Trace; c012139a <bh_action+52/e0>
Trace; c0121263 <tasklet_hi_action+63/a0>
Trace; c0120fdd <do_softirq+7d/e0>
Trace; c010ac2c <do_IRQ+198/1a8>
Trace; c0117e93 <schedule+2e3/710>
Trace; c01215ce <ksoftirqd+92/cc>
Trace; c0107448 <kernel_thread+28/38>

Code; f6a9dd50 <_end+3665c5ec/384c18fc>
00000000 <_EIP>:
Code; f6a9dd50 <_end+3665c5ec/384c18fc> <=====
0: 00 10 add %dl,(%eax) <=====
Code; f6a9dd52 <_end+3665c5ee/384c18fc>
2: e9 f7 78 dd a9 jmp a9dd78fe <_EIP+0xa9dd78fe>
a087564e Before first symbol
Code; f6a9dd57 <_end+3665c5f3/384c18fc>
7: f6 18 negb (%eax)
Code; f6a9dd59 <_end+3665c5f5/384c18fc>
9: af scas %es:(%edi),%eax
Code; f6a9dd5a <_end+3665c5f6/384c18fc>
a: 28 c0 sub %al,%al
Code; f6a9dd5c <_end+3665c5f8/384c18fc>
c: c4 1c e5 f7 00 00 00 les 0xf7(,8),%ebx



Unable to handle kernel paging request at virtual address 5a5a5a5a
5a5a5a5a
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<5a5a5a5a>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax: 00000002 ebx: 5a5a5a5a ecx: c03d0208 edx: 00000000
esi: f28b23e4 edi: 00000001 ebp: ed151f1c esp: ed151f08
ds: 0018 es: 0018 ss: 0018
Process ncsh.exe (pid: 8328, stackpage=ed151000)
Stack: c02c6783 f28b23e4 f28b23e4 c02c66a0 00000000 ed151f54 c012564b
f28b23e4
00000000 00000000 00000000 00000001 00000000 c03f9600 c041c30c
c041c30c
ed151f54 f28b2588 f7ed0eec ed151f60 c012139a c03f9600 ed151f78
c0121263
Call Trace: [<c02c6783>] [<c02c66a0>] [<c012564b>] [<c012139a>]
[<c0121263>]
[<c0120fdd>] [<c010ac2c>]
Code: Bad EIP value.


>>EIP; 5a5a5a5a Before first symbol <=====

>>ebx; 5a5a5a5a Before first symbol
>>ecx; c03d0208 <softnet_data+6e8/3400>
>>esi; f28b23e4 <_end+32470c80/384c18fc>
>>ebp; ed151f1c <_end+2cd107b8/384c18fc>
>>esp; ed151f08 <_end+2cd107a4/384c18fc>

Trace; c02c6783 <rpc_run_timer+e3/f0>
Trace; c02c66a0 <rpc_run_timer+0/f0>
Trace; c012564b <timer_bh+2ff/488>
Trace; c012139a <bh_action+52/e0>
Trace; c0121263 <tasklet_hi_action+63/a0>
Trace; c0120fdd <do_softirq+7d/e0>
Trace; c010ac2c <do_IRQ+198/1a8>

<0>Kernel panic: Aiee, killing interrupt handler!



Unable to handle kernel NULL pointer dereference at virtual address 00000002
f7467d5c
*pde = 00000000
Oops: 0002
CPU: 0
EIP: 0010:[<f7467d5c>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax: 00000002 ebx: f7467d5c ecx: c03f9208 edx: 00000000
esi: f7467cf0 edi: 00000001 ebp: f7ee7ecc esp: f7ee7eb8
ds: 0018 es: 0018 ss: 0018
Process ksoftirqd_CPU0 (pid: 3, stackpage=f7ee7000)
Stack: c02c6783 f7467cf0 f7467cf0 c02c66a0 1320dcce f7ee7f04 c012564b
f7467cf0
00000000 00000000 c03f9614 00000000 c042e864 f7ed0ed4 c0361da8
f7ee7f1c
c01f59f4 f7467f28 f7ed0eec f7ee7f10 c012139a c03f9600 f7ee7f28
c0121263
Call Trace: [<c02c6783>] [<c02c66a0>] [<c012564b>] [<c01f59f4>]
[<c012139a>]
[<c0121263>] [<c0120fdd>] [<c010ac2c>] [<c0117eb7>] [<c01215ce>]
[<c0107448>]
Code: 00 10 e9 f7 84 7d 46 f7 18 af 28 c0 c4 1c e5 f7 00 00 00 00


>>EIP; f7467d5c <_end+370265f8/384c18fc> <=====

>>ebx; f7467d5c <_end+370265f8/384c18fc>
>>ecx; c03f9208 <irq_stat+8/400>
>>esi; f7467cf0 <_end+3702658c/384c18fc>
>>ebp; f7ee7ecc <_end+37aa6768/384c18fc>
>>esp; f7ee7eb8 <_end+37aa6754/384c18fc>

Trace; c02c6783 <rpc_run_timer+e3/f0>
Trace; c02c66a0 <rpc_run_timer+0/f0>
Trace; c012564b <timer_bh+2ff/488>
Trace; c01f59f4 <ide_intr+1c0/284>
Trace; c012139a <bh_action+52/e0>
Trace; c0121263 <tasklet_hi_action+63/a0>
Trace; c0120fdd <do_softirq+7d/e0>
Trace; c010ac2c <do_IRQ+198/1a8>
Trace; c0117eb7 <schedule+307/710>
Trace; c01215ce <ksoftirqd+92/cc>
Trace; c0107448 <kernel_thread+28/38>

Code; f7467d5c <_end+370265f8/384c18fc>
00000000 <_EIP>:
Code; f7467d5c <_end+370265f8/384c18fc> <=====
0: 00 10 add %dl,(%eax) <=====
Code; f7467d5e <_end+370265fa/384c18fc>
2: e9 f7 84 7d 46 jmp 467d84fe <_EIP+0x467d84fe>
3dc4025a Before first symbol
Code; f7467d63 <_end+370265ff/384c18fc>
7: f7 18 negl (%eax)
Code; f7467d65 <_end+37026601/384c18fc>
9: af scas %es:(%edi),%eax
Code; f7467d66 <_end+37026602/384c18fc>
a: 28 c0 sub %al,%al
Code; f7467d68 <_end+37026604/384c18fc>
c: c4 1c e5 f7 00 00 00 les 0xf7(,8),%ebx



Unable to handle kernel NULL pointer dereference at virtual address 00000002
f6df1d50
*pde = 00000000
Oops: 0002
CPU: 0
EIP: 0010:[<f6df1d50>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax: 00000002 ebx: f6df1d50 ecx: c03f9208 edx: 00000000
esi: f6df1ce4 edi: 00000001 ebp: ce1bdf1c esp: ce1bdf08
ds: 0018 es: 0018 ss: 0018
Process ncsh.exe (pid: 18791, stackpage=ce1bd000)
Stack: c02c6783 f6df1ce4 f6df1ce4 c02c66a0 00000000 ce1bdf54 c012564b
f6df1ce4
00000000 00000000 00000000 00000001 00000000 c03f9600 c041c30c
c041c30c
ce1bdf54 f6df00e8 f7ed0eec ce1bdf60 c012139a c03f9600 ce1bdf78
c0121263
Call Trace: [<c02c6783>] [<c02c66a0>] [<c012564b>] [<c012139a>]
[<c0121263>]
[<c0120fdd>] [<c010ac2c>]
Code: 00 10 e9 f7 78 1d df f6 18 af 28 c0 c4 1c e5 f7 00 00 00 00


>>EIP; f6df1d50 <_end+369b05ec/384c18fc> <=====

>>ebx; f6df1d50 <_end+369b05ec/384c18fc>
>>ecx; c03f9208 <irq_stat+8/400>
>>esi; f6df1ce4 <_end+369b0580/384c18fc>
>>ebp; ce1bdf1c <_end+dd7c7b8/384c18fc>
>>esp; ce1bdf08 <_end+dd7c7a4/384c18fc>

Trace; c02c6783 <rpc_run_timer+e3/f0>
Trace; c02c66a0 <rpc_run_timer+0/f0>
Trace; c012564b <timer_bh+2ff/488>
Trace; c012139a <bh_action+52/e0>
Trace; c0121263 <tasklet_hi_action+63/a0>
Trace; c0120fdd <do_softirq+7d/e0>
Trace; c010ac2c <do_IRQ+198/1a8>

Code; f6df1d50 <_end+369b05ec/384c18fc>
00000000 <_EIP>:
Code; f6df1d50 <_end+369b05ec/384c18fc> <=====
0: 00 10 add %dl,(%eax) <=====
Code; f6df1d52 <_end+369b05ee/384c18fc>
2: e9 f7 78 1d df jmp df1d78fe <_EIP+0xdf1d78fe>
d5fc964e <_end+15b87eea/384c18fc>
Code; f6df1d57 <_end+369b05f3/384c18fc>
7: f6 18 negb (%eax)
Code; f6df1d59 <_end+369b05f5/384c18fc>
9: af scas %es:(%edi),%eax
Code; f6df1d5a <_end+369b05f6/384c18fc>
a: 28 c0 sub %al,%al
Code; f6df1d5c <_end+369b05f8/384c18fc>
c: c4 1c e5 f7 00 00 00 les 0xf7(,8),%ebx

<0>Kernel panic: Aiee, killing interrupt handler!



Unable to handle kernel paging request at virtual address 00010000
00010000
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<00010000>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax: 00000002 ebx: 00010000 ecx: c03f9208 edx: 00000000
esi: d44a7cf0 edi: 00000001 ebp: e2c69d84 esp: e2c69d70
ds: 0018 es: 0018 ss: 0018
Process rhomesu2b.x (pid: 29390, stackpage=e2c69000)
Stack: c02c6783 d44a7cf0 d44a7cf0 c02c66a0 c03f9614 e2c69dbc c012564b
d44a7cf0
00000000 00000000 c03f9614 c03cfb50 00000000 12286799 00000040
00000001
e2c69de0 d44a7d04 f7ed0eec e2c69dc8 c012139a c03f9600 e2c69de0
c0121263
Call Trace: [<c02c6783>] [<c02c66a0>] [<c012564b>] [<c012139a>]
[<c0121263>]
[<c0120fdd>] [<c010ac2c>] [<c010f223>] [<c01086cc>] [<c01087f9>]
[<c0108bb1>]
[<c0108f38>] [<c0120fdd>] [<c010ac2c>] [<c01090f0>]
Code: Bad EIP value.


>>EIP; 00010000 Before first symbol <=====

>>ebx; 00010000 Before first symbol
>>ecx; c03f9208 <irq_stat+8/400>
>>esi; d44a7cf0 <_end+1406658c/384c18fc>
>>ebp; e2c69d84 <_end+22828620/384c18fc>
>>esp; e2c69d70 <_end+2282860c/384c18fc>

Trace; c02c6783 <rpc_run_timer+e3/f0>
Trace; c02c66a0 <rpc_run_timer+0/f0>
Trace; c012564b <timer_bh+2ff/488>
Trace; c012139a <bh_action+52/e0>
Trace; c0121263 <tasklet_hi_action+63/a0>
Trace; c0120fdd <do_softirq+7d/e0>
Trace; c010ac2c <do_IRQ+198/1a8>
Trace; c010f223 <save_i387+37/230>
Trace; c01086cc <setup_sigcontext+dc/12c>
Trace; c01087f9 <setup_frame+dd/1b4>
Trace; c0108bb1 <handle_signal+71/154>
Trace; c0108f38 <do_signal+2a4/2ed>
Trace; c0120fdd <do_softirq+7d/e0>
Trace; c010ac2c <do_IRQ+198/1a8>
Trace; c01090f0 <signal_return+14/18>

<0>Kernel panic: Aiee, killing interrupt handler!



Unable to handle kernel paging request at virtual address 5a5a5a5a
5a5a5a5a
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<5a5a5a5a>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax: 00000002 ebx: 5a5a5a5a ecx: c03d0208 edx: 00000000
esi: e27cb7d4 edi: 00000001 ebp: f7ee7f5c esp: f7ee7f48
ds: 0018 es: 0018 ss: 0018
Process ksoftirqd_CPU0 (pid: 3, stackpage=f7ee7000)
Stack: c02aca63 e27cb7d4 e27cb7d4 c02ac908 00000000 f7ee7f94 c012566b
e27cb7d4
00000000 00000000 00000000 00000001 00000000 c03d0600 c03f2c4c
c03f2c4c
f7ee7f94 e1902c18 f7ed0eec f7ee7fa0 c01213ba c03d0600 f7ee7fb8
c0121283
Call Trace: [<c02aca63>] [<c02ac980>] [<c012566b>] [<c01213ba>]
[<c0121283>]
[<c0120fdd>] [<c0121609>] [<c0107448>]
Code: Bad EIP value.


>>EIP; 5a5a5a5a Before first symbol <=====

>>ebx; 5a5a5a5a Before first symbol
>>ecx; c03d0208 <irq_stat+8/400>
>>esi; e27cb7d4 <_end+223b3810/384eb09c>
>>ebp; f7ee7f5c <_end+37acff98/384eb09c>
>>esp; f7ee7f48 <_end+37acff84/384eb09c>

Trace; c02aca63 <rpc_run_timer+e3/f0>
Trace; c02ac980 <rpc_run_timer+0/f0>
Trace; c012566b <timer_bh+2ff/488>
Trace; c01213ba <bh_action+52/e0>
Trace; c0121283 <tasklet_hi_action+63/a0>
Trace; c0120fdd <do_softirq+5d/e0>
Trace; c0121609 <ksoftirqd+ad/cc>
Trace; c0107448 <kernel_thread+28/38>

<0>Kernel panic: Aiee, killing interrupt handler!



Unable to handle kernel NULL pointer dereference at virtual address ffffffff
ffffffff
*pde = 00003063
Oops: 0000
CPU: 0
EIP: 0010:[<ffffffff>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax: 00000002 ebx: ffffffff ecx: c03f9208 edx: 00000000
esi: f70b1cb0 edi: 00000001 ebp: f77dddf4 esp: f77ddde0
ds: 0018 es: 0018 ss: 0018
Process timeoutd (pid: 556, stackpage=f77dd000)
Stack: c02c6783 f70b1cb0 f70b1cb0 c02c66a0 00000000 f77dde2c c012564b
f70b1cb0
00000000 00000000 00000000 00000001 00000000 c03f9600 c041c30c
c041c30c
f77dde2c f70b1d44 f7ed0eec f77dde38 c012139a c03f9600 f77dde50
c0121263
Call Trace: [<c02c6783>] [<c02c66a0>] [<c012564b>] [<c012139a>]
[<c0121263>]
[<c0120fdd>] [<c010ac2c>] [<c0152056>] [<c014e6d8>] [<c014e719>]
[<c014e8e6>]
[<c014ef4b>] [<c0140eee>] [<c014124e>] [<c01090b7>]
Code: Bad EIP value.


>>EIP; ffffffff <END_OF_CODE+76e343f/????> <=====

>>ebx; ffffffff <END_OF_CODE+76e343f/????>
>>ecx; c03f9208 <irq_stat+8/400>
>>esi; f70b1cb0 <_end+36c7054c/384c18fc>
>>ebp; f77dddf4 <_end+3739c690/384c18fc>
>>esp; f77ddde0 <_end+3739c67c/384c18fc>

Trace; c02c6783 <rpc_run_timer+e3/f0>
Trace; c02c66a0 <rpc_run_timer+0/f0>
Trace; c012564b <timer_bh+2ff/488>
Trace; c012139a <bh_action+52/e0>
Trace; c0121263 <tasklet_hi_action+63/a0>
Trace; c0120fdd <do_softirq+7d/e0>
Trace; c010ac2c <do_IRQ+198/1a8>
Trace; c0152056 <.text.lock.namei+9/4b3>
Trace; c014e6d8 <link_path_walk+c5c/c80>
Trace; c014e719 <path_walk+1d/24>
Trace; c014e8e6 <path_lookup+1e/2c>
Trace; c014ef4b <open_namei+6b/75c>
Trace; c0140eee <filp_open+3a/5c>
Trace; c014124e <sys_open+36/b4>
Trace; c01090b7 <system_call+2f/34>

<0>Kernel panic: Aiee, killing interrupt handler!



Unable to handle kernel NULL pointer dereference at virtual address 00000002
f1cddcfc
*pde = 00000000
Oops: 0002
CPU: 0
EIP: 0010:[<f1cddcfc>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax: 00000002 ebx: f1cddcfc ecx: c03d0208 edx: 00000000
esi: f1cddcb0 edi: 00000001 ebp: c0375ee0 esp: c0375ecc
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 0, stackpage=c0375000)
Stack: c02aca63 f1cddcb0 f1cddcb0 c02ac980 c03d063c c0375f18 c012566b
f1cddcb0
00000000 00000000 c03d063c 00000001 00000000 c03d0600 c03f2c4c
c03f2c4c
c0375f18 f68abd18 f7ed0eec c0375f24 c01213ba c03d0600 c0375f3c
c0121283
Call Trace: [<c02aca63>] [<c02ac980>] [<c012566b>] [<c01213ba>]
[<c0121283>]
[<c0120fdd>] [<c010ac4c>] [<c0106ff0>] [<c0106ff0>] [<c0106ff0>]
[<c0106ff0>]
[<c010701f>] [<c0107092>] [<c0105000>] [<c010507c>]
Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a8 8d b6 00


>>EIP; f1cddcfc <_end+318c5d38/384eb09c> <=====

>>ebx; f1cddcfc <_end+318c5d38/384eb09c>
>>ecx; c03d0208 <irq_stat+8/400>
>>esi; f1cddcb0 <_end+318c5cec/384eb09c>
>>ebp; c0375ee0 <init_task_union+1ee0/2000>
>>esp; c0375ecc <init_task_union+1ecc/2000>

Trace; c02aca63 <rpc_run_timer+e3/f0>
Trace; c02ac980 <rpc_run_timer+0/f0>
Trace; c012566b <timer_bh+2ff/488>
Trace; c01213ba <bh_action+52/e0>
Trace; c0121283 <tasklet_hi_action+63/a0>
Trace; c0120fdd <do_softirq+5d/e0>
Trace; c010ac4c <do_IRQ+198/1a8>
Trace; c0106ff0 <default_idle+0/38>
Trace; c0106ff0 <default_idle+0/38>
Trace; c0106ff0 <default_idle+0/38>
Trace; c0106ff0 <default_idle+0/38>
Trace; c010701f <default_idle+2f/38>
Trace; c0107092 <cpu_idle+42/58>
Trace; c0105000 <_stext+0/0>
Trace; c010507c <rest_init+7c/80>

Code; f1cddcfc <_end+318c5d38/384eb09c> <=====
00000000 <_EIP>: <=====
Code; f1cddd0c <_end+318c5d48/384eb09c>
10: a8 8d test $0x8d,%al
Code; f1cddd0e <_end+318c5d4a/384eb09c>
12: b6 00 mov $0x0,%dh

<0>Kernel panic: Aiee, killing interrupt handler!



Unable to handle kernel NULL pointer dereference at virtual address ffffffff
ffffffff
*pde = 00003063
Oops: 0000
CPU: 0
EIP: 0010:[<ffffffff>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax: 00000002 ebx: ffffffff ecx: c03f9208 edx: 00000000
esi: f7a3fcb0 edi: 00000001 ebp: c039dee0 esp: c039decc
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 0, stackpage=c039d000)
Stack: c02c6783 f7a3fcb0 f7a3fcb0 c02c66a0 00000000 c039df18 c012564b
f7a3cfb0
00000000 00000000 00000000 00000001 00000000 c03f9600 c041c30c
c041c30c
c039df18 f7a3fd44 f7ed0eec c039df24 c012139a c03f9600 c039fd3c
c0121263
Call Trace: [<c02c6783>] [<c02c66a0>] [<c012564b>] [<c012139a>]
[<c0121263>]
[<c0120fdd>] [<c010ac2c>] [<c0106ff0>] [<c0106ff0>] [<c0106ff0>]
[<c0106ff0>]
[<c010701f>] [<c0107092>] [<c0105000>] [<c010507c>]
Code: Bad EIP value.


>>EIP; ffffffff <END_OF_CODE+76e343f/????> <=====

>>ebx; ffffffff <END_OF_CODE+76e343f/????>
>>ecx; c03f9208 <irq_stat+8/400>
>>esi; f7a3fcb0 <_end+375fe54c/384c18fc>
>>ebp; c039dee0 <init_task_union+1ee0/2000>
>>esp; c039decc <init_task_union+1ecc/2000>

Trace; c02c6783 <rpc_run_timer+e3/f0>
Trace; c02c66a0 <rpc_run_timer+0/f0>
Trace; c012564b <timer_bh+2ff/488>
Trace; c012139a <bh_action+52/e0>
Trace; c0121263 <tasklet_hi_action+63/a0>
Trace; c0120fdd <do_softirq+7d/e0>
Trace; c010ac2c <do_IRQ+198/1a8>
Trace; c0106ff0 <default_idle+0/38>
Trace; c0106ff0 <default_idle+0/38>
Trace; c0106ff0 <default_idle+0/38>
Trace; c0106ff0 <default_idle+0/38>
Trace; c010701f <default_idle+2f/38>
Trace; c0107092 <cpu_idle+42/58>
Trace; c0105000 <_stext+0/0>
Trace; c010507c <rest_init+7c/80>

<0>Kernel panic: Aiee, killing interrupt handler!



Unable to handle kernel NULL pointer dereference at virtual address 00000058
c02c5247
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c02c5247>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010202
eax: 00000002 ebx: 00000000 ecx: c366e624 edx: 00000001
esi: c73c88e8 edi: 00000001 ebp: d553ff04 esp: d553feec
ds: 0018 es: 0018 ss: 0018
Process flukahp (pid: 15178, stackpage=d553f000)
Stack: c02c51cc c366e624 00000001 00000800 00000000 c73c8000 d553ff1c
c02c6783
c366e624 c366e624 c02c66a0 00000000 d553ff54 c012564b c366e624
00000000
00000000 00000000 c03cfb50 00000000 0f68a4a4 00000040 00000001
00000086
Call Trace: [<c02c51cc>] [<c02c6783>] [<c02c66a0>] [<c012564b>]
[<c012139a>]
[<c0121263>] [<c0120fdd>] [<c010ac2c>]
Code: Bad EIP value.


>>EIP; c02c5247 <xprt_timer+7b/158> <=====

>>ecx; c366e624 <_end+322cec0/384c18fc>
>>esi; c73c88e8 <_end+6f87184/384c18fc>
>>ebp; d553ff04 <_end+150fe7a0/384c18fc>
>>esp; d553feec <_end+150fe788/384c18fc>

Trace; c02c51cc <xprt_timer+0/158>
Trace; c02c6783 <rpc_run_timer+e3/f0>
Trace; c02c66a0 <rpc_run_timer+0/f0>
Trace; c012564b <timer_bh+2ff/488>
Trace; c012139a <bh_action+52/e0>
Trace; c0121263 <tasklet_hi_action+63/a0>
Trace; c0120fdd <do_softirq+7d/e0>
Trace; c010ac2c <do_IRQ+198/1a8>

<0>Kernel panic: Aiee, killing interrupt handler!


That's it.

Have a nice day,

Christopher
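
For readers following the dumps above: the number after "Oops:" is the i386 page-fault error code. As a rough sketch only (Python is used purely for illustration, and the function name is hypothetical, not part of ksymoops or the kernel), its three low bits can be decoded like this:

```python
def decode_oops_error_code(code):
    """Decode the low three bits of an i386 page-fault error code.

    bit 0: 0 = page not present, 1 = protection fault
    bit 1: 0 = read access,      1 = write access
    bit 2: 0 = kernel mode,      1 = user mode
    """
    return {
        "cause": "protection fault" if code & 0x1 else "page not present",
        "access": "write" if code & 0x2 else "read",
        "mode": "user" if code & 0x4 else "kernel",
    }


if __name__ == "__main__":
    # The two error codes that appear in the dumps above.
    for raw in ("0002", "0000"):
        print(raw, decode_oops_error_code(int(raw, 16)))
```

Read this way, the dump reporting "Oops: 0002" was a kernel-mode write to a page that was not present, and the dumps reporting "Oops: 0000" were kernel-mode reads of a page that was not present.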

2003-03-07 15:39:15

by Zwane Mwaikambo

Subject: Re: Kernel Bug at spinlock.h ?!

On Thu, 6 Mar 2003, ChristopherHuhn wrote:

> Hi again,
>
> >It looks like a possible race with rpc_execute and possibly the timer,
> >although i can't be certain where the other cpus are. Do the other oopses
> >look somewhat similar? Could you supply them?
> >
> >
> below are some oopses I gathered yesterday and today, all on different
> machines.
> I'd like to remark that we experience massive NFS problems at the moment
> that seem to be caused by our mixed potato 2.2/ woody 2.4 environment,
> i.e. linking apps on a woody system with the sources mounted via NFS
> from a potato box leads to obscure I/O failures like "no space left on
> device" (this never happens with woody only). So this might be a clue
> here as well.
>
> The oopses are all written down from the screen; I hopefully made few
> "transmission" errors.

Some of these are a bit worrying, seeing as they are bit flips. Also, they
all appear to come from a UP machine(?), which would change things with
respect to my previous comment about races. Regarding the weird I/O
failures, are you mounting with the 'soft' option?

Zwane
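
One way to check whether two values in a hand-copied oops differ by a single bit is to XOR them and list the set bits. The following is a minimal sketch, assuming Python; the helper name is made up for illustration, and the sample values are the esi register and a stack word taken from one of the swapper oopses above:

```python
def differing_bits(a, b):
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


if __name__ == "__main__":
    esi = 0xF7A3FCB0         # esi as reported in the register dump
    stack_word = 0xF7A3CFB0  # similar-looking word from the stack dump
    print(differing_bits(esi, stack_word))
```

A result with exactly one position means the two values differ in a single bit. For the pair above, four positions (8, 9, 12, 13) differ.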

2003-03-10 08:43:27

by Christopher Huhn

Subject: Re: Kernel Bug at spinlock.h ?!

Zwane Mwaikambo wrote:

>On Thu, 6 Mar 2003, ChristopherHuhn wrote:
>
>
>
>>Hi again,
>>
>>
>>
>>>It looks like a possible race with rpc_execute and possibly the timer,
>>>although i can't be certain where the other cpus are. Do the other oopses
>>>look somewhat similar? Could you supply them?
>>>
>>>
>>>
>>>
>>below are some oopses I gathered yesterday and today, all on different
>>machines.
>>I'd like to remark that we experience massive NFS problems at the moment
>>that seem to be caused by our mixed potato 2.2/ woody 2.4 environment,
>>i.e. linking apps on a woody system with the sources mounted via NFS
>>from a potato box leads to obscure I/O failures like "no space left on
>>device" (this never happens with woody only). So this might be a clue
>>here as well.
>>
>>The oopses are all written down from the screen; I hopefully made few
>>"transmission" errors.
>>
>>
>
>Some of these are a bit worrying, seeing as they are bit flips. Also, they
>all appear to come from a UP machine(?), which would change things with
>respect to my previous comment about races. Regarding the weird I/O
>failures, are you mounting with the 'soft' option?
>
> Zwane
>
>
The machines are all DP Xeons. Our SP machines run the same kernel, but
these oopses only occur on the DP machines under heavy load.
The machines are recognized as SMP:
# uname -a
Linux lxb000 2.4.20 #2 SMP Tue Dec 17 10:43:29 CET 2002 i686 unknown

but the e7500 chipset seems not to be supported 100%:

Jan 27 15:26:34 lxb000 kernel: found SMP MP-table at 000f6710
Jan 27 15:26:34 lxb000 kernel: hm, page 000f6000 reserved twice.
Jan 27 15:26:34 lxb000 kernel: hm, page 000f7000 reserved twice.
Jan 27 15:26:34 lxb000 kernel: hm, page 0009f000 reserved twice.
Jan 27 15:26:34 lxb000 kernel: hm, page 000a0000 reserved twice.
Jan 27 15:26:34 lxb000 kernel: On node 0 totalpages: 262016
Jan 27 15:26:34 lxb000 kernel: zone(0): 4096 pages.
Jan 27 15:26:34 lxb000 kernel: zone(1): 225280 pages.
Jan 27 15:26:34 lxb000 kernel: zone(2): 32640 pages.
Jan 27 15:26:34 lxb000 kernel: ACPI: Searched entire block, no RSDP was found.
Jan 27 15:26:34 lxb000 kernel: ACPI: Searched entire block, no RSDP was found.
Jan 27 15:26:34 lxb000 kernel: ACPI: System description tables not found
Jan 27 15:26:34 lxb000 kernel: Intel MultiProcessor Specification v1.4
Jan 27 15:26:34 lxb000 kernel: Virtual Wire compatibility mode.
Jan 27 15:26:34 lxb000 kernel: OEM ID: Product ID: Kings Canyon APIC at: 0xFEE00000
Jan 27 15:26:34 lxb000 kernel: Processor #0 Pentium 4(tm) XEON(tm) APIC version 20
Jan 27 15:26:34 lxb000 kernel: Processor #6 Pentium 4(tm) XEON(tm) APIC version 20
Jan 27 15:26:34 lxb000 kernel: Processor #1 Pentium 4(tm) XEON(tm) APIC version 20
Jan 27 15:26:34 lxb000 kernel: Processor #7 Pentium 4(tm) XEON(tm) APIC version 20
Jan 27 15:26:34 lxb000 kernel: I/O APIC #2 Version 32 at 0xFEC00000.
Jan 27 15:26:34 lxb000 kernel: I/O APIC #3 Version 32 at 0xFEC80000.
Jan 27 15:26:34 lxb000 kernel: I/O APIC #4 Version 32 at 0xFEC80400.
Jan 27 15:26:34 lxb000 kernel: I/O APIC #5 Version 32 at 0xFEC81000.
Jan 27 15:26:34 lxb000 kernel: I/O APIC #8 Version 32 at 0xFEC81400.
Jan 27 15:26:34 lxb000 kernel: Processors: 4
...

There might be (are) severe flaws in our NFS configuration and network
performance, but that should not crash the box, should it?

BTW: I just received a link to a bug report (including a fix) that sounds
similar to our problem:
http://marc.theaimsgroup.com/?l=linux-nfs&m=104716581307294&w=2

With kind regards,

Christopher