This is the first time I've seen this; I don't think it is reproducible...
I was emerging (Gentoo) something and working on a reiserfs partition
when this happened:
grep[21235]: segfault at 00000079ecf16039 rip 0000003cf6701274 rsp
00007fffffd95e80 error 4 slab: Internal list corruption detected in
cache 'anon_vma'(92), slabp ffff810000a07000(16). Hexdump:
000: 00 01 10 00 00 00 00 00 00 02 20 00 00 00 00 00
010: a0 01 00 00 00 00 00 00 a0 71 a0 00 00 81 ff ff
020: 10 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00
030: fe ff ff ff fe ff ff ff fe ff ff ff fe ff ff ff
040: fe ff ff ff fe ff ff ff fe ff ff ff fe ff ff ff
050: fe ff ff ff fe ff ff ff fe ff ff ff fe ff ff ff
060: fe ff ff ff fe ff ff ff fe ff ff ff fe ff ff ff
070: 11 00 00 00 12 00 00 00 13 00 00 00 14 00 00 00
080: 15 00 00 00 16 00 00 00 17 00 00 00 18 00 00 00
090: 19 00 00 00 1a 00 00 00 1b 00 00 00 1c 00 00 00
0a0: 1d 00 00 00 1e 00 00 00 7f 00 00 00 20 00 00 00
0b0: 21 00 00 00 22 00 00 00 23 00 00 00 24 00 00 00
0c0: 25 00 00 00 26 00 00 00 27 00 00 00 28 00 00 00
0d0: 29 00 00 00 2a 00 00 00 2b 00 00 00 2c 00 00 00
0e0: 2d 00 00 00 2e 00 00 00 2f 00 00 00 30 00 00 00
0f0: 31 00 00 00 32 00 00 00 33 00 00 00 34 00 00 00
100: 35 00 00 00 36 00 00 00 37 00 00 00 38 00 00 00
110: 39 00 00 00 3a 00 00 00 3b 00 00 00 3c 00 00 00
120: 3d 00 00 00 3e 00 00 00 3f 00 00 00 40 00 00 00
130: 41 00 00 00 42 00 00 00 43 00 00 00 44 00 00 00
140: 45 00 00 00 46 00 00 00 47 00 00 00 48 00 00 00
150: 49 00 00 00 4a 00 00 00 4b 00 00 00 4c 00 00 00
160: 4d 00 00 00 4e 00 00 00 4f 00 00 00 50 00 00 00
170: 51 00 00 00 52 00 00 00
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at mm/slab.c:2564
invalid opcode: 0000 [1]
CPU 0
Modules linked in: xt_state ip_queue ip_conntrack iptable_filter
ip_tables Pid: 21232, comm: as Not tainted 2.6.16-rc5-g7b14e3b5 #7
RIP: 0010:[<ffffffff80156c5e>] <ffffffff80156c5e>{check_slabp+188}
RSP: 0018:ffff81000a2cbdc8 EFLAGS: 00010096
RAX: 0000000000000001 RBX: 0000000000000178 RCX: 0000000000003ee7
RDX: 00000000ffffff01 RSI: 0000000000003ee7 RDI: ffffffff803f24c0
RBP: ffff810000a07000 R08: 00000000fffffffe R09: ffff810000a07000
R10: 0000000000000046 R11: 0000000000000000 R12: ffff81001ff2cec0
R13: ffff810000a071a0 R14: 0000000000000000 R15: ffff81000a2cbee0
FS: 00002af4db709dc0(0000) GS:ffffffff804cb000(0000)
knlGS:00000000f6cd7bb0 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000003cf698c600 CR3: 000000000c1a3000 CR4: 00000000000006e0
Process as (pid: 21232, threadinfo ffff81000a2ca000, task
ffff810017e7a100) Stack: ffff810000a07000 ffff81001ff2bf20
ffff81001ff2cec0 ffffffff80157763 ffff81001ff280a8 ffff810005f24000
0000000000000010 0000000000000010 ffff81001ff28098 ffff81001ff2bf20
Call Trace: <ffffffff80157763>{free_block+154}
<ffffffff8015796c>{cache_flusharray+111}
<ffffffff80157585>{kmem_cache_free+78}
<ffffffff80148cac>{free_pgtables+45} <ffffffff8014e1fc>{exit_mmap+119}
<ffffffff8012210c>{mmput+27} <ffffffff801263d6>{do_exit+519}
<ffffffff801269d7>{sys_exit_group+0} <ffffffff8010a5b2>{system_call+126}
Code: 0f 0b 68 4f e6 38 80 c2 04 0a 5b 5d 41 5c c3 41 55 31 c0 48
RIP <ffffffff80156c5e>{check_slabp+188} RSP <ffff81000a2cbdc8>
<1>Fixing recursive fault but reboot is needed!
Full dmesg & config attached.
PS: the machine is still running and other messages are appearing:
slab: double free detected in cache 'anon_vma', objp ffff81000511ca38
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at mm/slab.c:2329
invalid opcode: 0000 [3]
CPU 0
Modules linked in: xt_state ip_queue ip_conntrack iptable_filter
ip_tables Pid: 21233, comm: bash Not tainted 2.6.16-rc5-g7b14e3b5 #7
RIP: 0010:[<ffffffff801577ca>] <ffffffff801577ca>{free_block+257}
RSP: 0018:ffff810000dc5c78 EFLAGS: 00010092
RAX: 0000000000000049 RBX: ffff81000511c000 RCX: ffffffff803f2490
RDX: ffffffff803f2490 RSI: 0000000000000001 RDI: 0000000100000000
RBP: ffff81001ff2bf20 R08: 000000003b9aca00 R09: 000000000000000f
R10: 0000000000000000 R11: ffff81001ff2bf20 R12: ffff81001ff2cec0
R13: ffff81000511ca38 R14: 0000000000000037 R15: ffff81000511c030
FS: 00002b9e31461e60(0000) GS:ffffffff804cb000(0000)
knlGS:00000000f77378e0 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000071359f CR3: 0000000000614000 CR4: 00000000000006e0
Process bash (pid: 21233, threadinfo ffff810000dc4000, task
ffff81001d018e60) Stack: ffff81001ff280a8 0000003700615065
0000000000000010 0000000000000010 ffff81001ff28098 ffff81001ff2bf20
ffff81001ff2cec0 0000000000000000 ffff810000dc5d70 ffffffff8015796c
Call Trace: <ffffffff8015796c>{cache_flusharray+111}
<ffffffff80157585>{kmem_cache_free+78}
<ffffffff80148cac>{free_pgtables+45} <ffffffff8014e1fc>{exit_mmap+119}
<ffffffff8012210c>{mmput+27} <ffffffff801263d6>{do_exit+519}
<ffffffff801269d7>{sys_exit_group+0}
<ffffffff8012deb3>{get_signal_to_deliver+1216}
<ffffffff80109aaa>{do_signal+110} <ffffffff8012d591>{kill_proc_info+41}
<ffffffff8012e601>{sys_kill+263}
<ffffffff8010a476>{sys_rt_sigreturn+654}
<ffffffff8010a82e>{int_signal+18}
Code: 0f 0b 68 4f e6 38 80 c2 19 09 8b 43 24 48 89 de 4c 89 e7 43
RIP <ffffffff801577ca>{free_block+257} RSP <ffff810000dc5c78>
<1>Fixing recursive fault but reboot is needed!
Unable to handle kernel paging request at 0000000000100108 RIP:
<ffffffff80157748>{free_block+127}
PGD 0
Oops: 0002 [4]
CPU 0
Modules linked in: xt_state ip_queue ip_conntrack iptable_filter
ip_tables Pid: 18669, comm: konsole Not tainted 2.6.16-rc5-g7b14e3b5 #7
RIP: 0010:[<ffffffff80157748>] <ffffffff80157748>{free_block+127}
RSP: 0018:ffff81001d60bde8 EFLAGS: 00010012
RAX: 0000000000100100 RBX: ffff81000511c000 RCX: ffff810001000000
RDX: 0000000000200200 RSI: ffff81000511c000 RDI: ffff81001ff2cec0
RBP: ffff81001ff2bf20 R08: ffff81001ff2eae8 R09: ffff8100169f1000
R10: 0000000000000212 R11: 00000000000000aa R12: ffff81001ff2cec0
R13: ffff81000511c5b0 R14: 0000000000000006 R15: ffff8100098ef030
FS: 00002b6843b30ce0(0000) GS:ffffffff804cb000(0000)
knlGS:00000000f77378e0 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000100108 CR3: 000000000f55a000 CR4: 00000000000006e0
Process konsole (pid: 18669, threadinfo ffff81001d60a000, task
ffff81001e17b750) Stack: ffff81001ff28100 000000060513e000
0000000000000005 0000000000000010 ffff81001ff28098 ffff81001ff2bf20
ffff81001ff2cec0 0000000000000000 ffff81001d60bee0 ffffffff8015796c
Call Trace: <ffffffff8015796c>{cache_flusharray+111}
<ffffffff80157585>{kmem_cache_free+78}
<ffffffff80148cac>{free_pgtables+45} <ffffffff8014e1fc>{exit_mmap+119}
<ffffffff8012210c>{mmput+27} <ffffffff801263d6>{do_exit+519}
<ffffffff801269d7>{sys_exit_group+0} <ffffffff8010a5b2>{system_call+126}
Code: 48 89 50 08 48 89 02 48 c7 43 08 00 02 20 00 48 c7 03 00 01
RIP <ffffffff80157748>{free_block+127} RSP <ffff81001d60bde8>
CR2: 0000000000100108
<1>Fixing recursive fault but reboot is needed!
slab error in cache_alloc_debugcheck_after(): cache `anon_vma': double
free, or memory outside object was overwritten
Call Trace: <ffffffff80156d06>{cache_alloc_debugcheck_after+153}
<ffffffff8014f939>{anon_vma_prepare+73}
<ffffffff80156e27>{kmem_cache_alloc+140}
<ffffffff8014f939>{anon_vma_prepare+73}
<ffffffff8014af5f>{__handle_mm_fault+899}
<ffffffff80117ee8>{do_page_fault+937}
<ffffffff801ddede>{prio_tree_insert+432}
<ffffffff801566a0>{check_poison_obj+48}
<ffffffff8010aee1>{error_exit+0} <ffffffff801e1896>{__clear_user+60}
<ffffffff801e187a>{__clear_user+32} <ffffffff80182b0d>{padzero+24}
<ffffffff80183b4c>{load_elf_binary+2866}
<ffffffff80163dd4>{search_binary_handler+110}
<ffffffff801640c7>{do_execve+375} <ffffffff8010a5b2>{system_call+126}
<ffffffff8010929c>{sys_execve+51} <ffffffff8010a9d6>{stub_execve+106}
ffff8100098efb78: redzone 1: 0x170fc2a5, redzone 2: 0x170fc2a5. slab
error in cache_alloc_debugcheck_after(): cache `anon_vma': double free,
or memory outside object was overwritten
Call Trace: <ffffffff80156d06>{cache_alloc_debugcheck_after+153}
<ffffffff8014f939>{anon_vma_prepare+73}
<ffffffff80156e27>{kmem_cache_alloc+140}
<ffffffff8014f939>{anon_vma_prepare+73}
<ffffffff8014af5f>{__handle_mm_fault+899}
<ffffffff80117ee8>{do_page_fault+937}
<ffffffff80141857>{generic_file_aio_read+52}
<ffffffff801ddede>{prio_tree_insert+432}
<ffffffff8010aee1>{error_exit+0} <ffffffff801e1896>{__clear_user+60}
<ffffffff801e187a>{__clear_user+32} <ffffffff80182b0d>{padzero+24}
<ffffffff80182f2b>{load_elf_interp+803}
<ffffffff80183d87>{load_elf_binary+3437}
<ffffffff80163dd4>{search_binary_handler+110}
<ffffffff801640c7>{do_execve+375} <ffffffff8010a5b2>{system_call+126}
<ffffffff8010929c>{sys_execve+51} <ffffffff8010a9d6>{stub_execve+106}
ffff8100098ef970: redzone 1: 0x170fc2a5, redzone 2: 0x170fc2a5. slab
error in cache_alloc_debugcheck_after(): cache `anon_vma': double free,
or memory outside object was overwritten
Call Trace: <ffffffff80156d06>{cache_alloc_debugcheck_after+153}
<ffffffff8014f939>{anon_vma_prepare+73}
<ffffffff80156e27>{kmem_cache_alloc+140}
<ffffffff8014f939>{anon_vma_prepare+73}
<ffffffff8014ad64>{__handle_mm_fault+392}
<ffffffff801566a0>{check_poison_obj+48}
<ffffffff80117ee8>{do_page_fault+937}
<ffffffff8014dc68>{do_mmap_pgoff+1507} <ffffffff8010aee1>{error_exit+0}
ffff8100098ef3d0: redzone 1: 0x170fc2a5, redzone 2: 0x170fc2a5. slab
error in cache_alloc_debugcheck_after(): cache `anon_vma': double free,
or memory outside object was overwritten
Call Trace: <ffffffff80156d06>{cache_alloc_debugcheck_after+153}
<ffffffff8014f939>{anon_vma_prepare+73}
<ffffffff80156e27>{kmem_cache_alloc+140}
<ffffffff8014f939>{anon_vma_prepare+73}
<ffffffff8014af5f>{__handle_mm_fault+899}
<ffffffff80117ee8>{do_page_fault+937}
<ffffffff8014dc68>{do_mmap_pgoff+1507} <ffffffff8010aee1>{error_exit+0}
ffff8100098ef920: redzone 1: 0x170fc2a5, redzone 2: 0x170fc2a5. slab
error in cache_alloc_debugcheck_after(): cache `anon_vma': double free,
or memory outside object was overwritten
Call Trace: <ffffffff80156d06>{cache_alloc_debugcheck_after+153}
<ffffffff8014f939>{anon_vma_prepare+73}
<ffffffff80156e27>{kmem_cache_alloc+140}
<ffffffff8014f939>{anon_vma_prepare+73}
<ffffffff8014ad64>{__handle_mm_fault+392}
<ffffffff801566a0>{check_poison_obj+48}
<ffffffff80117ee8>{do_page_fault+937}
<ffffffff8014dc68>{do_mmap_pgoff+1507} <ffffffff8010aee1>{error_exit+0}
ffff8100098ef808: redzone 1: 0x170fc2a5, redzone 2: 0x170fc2a5. slab
error in cache_alloc_debugcheck_after(): cache `anon_vma': double free,
or memory outside object was overwritten
Call Trace: <ffffffff80156d06>{cache_alloc_debugcheck_after+153}
<ffffffff8014f939>{anon_vma_prepare+73}
<ffffffff80156e27>{kmem_cache_alloc+140}
<ffffffff8014f939>{anon_vma_prepare+73}
<ffffffff8014ad64>{__handle_mm_fault+392}
<ffffffff80117ee8>{do_page_fault+937} <ffffffff8014c436>{remove_vma+90}
<ffffffff8014d66a>{do_munmap+601} <ffffffff8010aee1>{error_exit+0}
ffff8100098eff10: redzone 1: 0x170fc2a5, redzone 2: 0x170fc2a5. slab
error in cache_alloc_debugcheck_after(): cache `anon_vma': double free,
or memory outside object was overwritten
Call Trace: <ffffffff80156d06>{cache_alloc_debugcheck_after+153}
<ffffffff8014f939>{anon_vma_prepare+73}
<ffffffff80156e27>{kmem_cache_alloc+140}
<ffffffff8014f939>{anon_vma_prepare+73}
<ffffffff8014ad64>{__handle_mm_fault+392}
<ffffffff80117ee8>{do_page_fault+937} <ffffffff8016d5f8>{dput+59}
<ffffffff8015b791>{__fput+302} <ffffffff8010aee1>{error_exit+0}
ffff8100098ef718: redzone 1: 0x170fc2a5, redzone 2: 0x170fc2a5. slab
error in cache_alloc_debugcheck_after(): cache `anon_vma': double free,
or memory outside object was overwritten
Call Trace: <ffffffff80156d06>{cache_alloc_debugcheck_after+153}
<ffffffff8014f939>{anon_vma_prepare+73}
<ffffffff80156e27>{kmem_cache_alloc+140}
<ffffffff8014f939>{anon_vma_prepare+73}
<ffffffff8014ad64>{__handle_mm_fault+392}
<ffffffff80117ee8>{do_page_fault+937}
<ffffffff8014dc68>{do_mmap_pgoff+1507} <ffffffff8010aee1>{error_exit+0}
ffff8100098ef740: redzone 1: 0x170fc2a5, redzone 2: 0x170fc2a5. slab
error in cache_alloc_debugcheck_after(): cache `anon_vma': double free,
or memory outside object was overwritten
Call Trace: <ffffffff80156d06>{cache_alloc_debugcheck_after+153}
<ffffffff8014f939>{anon_vma_prepare+73}
<ffffffff80156e27>{kmem_cache_alloc+140}
<ffffffff8014f939>{anon_vma_prepare+73}
<ffffffff8014af5f>{__handle_mm_fault+899}
<ffffffff80117ee8>{do_page_fault+937}
<ffffffff801ddede>{prio_tree_insert+432}
<ffffffff801566a0>{check_poison_obj+48}
<ffffffff8010aee1>{error_exit+0} <ffffffff801e1896>{__clear_user+60}
<ffffffff801e187a>{__clear_user+32} <ffffffff80182b0d>{padzero+24}
<ffffffff80183b4c>{load_elf_binary+2866}
<ffffffff80163dd4>{search_binary_handler+110}
<ffffffff801640c7>{do_execve+375} <ffffffff8010a5b2>{system_call+126}
<ffffffff8010929c>{sys_execve+51} <ffffffff8010a9d6>{stub_execve+106}
ffff8100098ef858: redzone 1: 0x170fc2a5, redzone 2: 0x170fc2a5. slab
error in cache_alloc_debugcheck_after(): cache `anon_vma': double free,
or memory outside object was overwritten
Call Trace: <ffffffff80156d06>{cache_alloc_debugcheck_after+153}
<ffffffff8014f939>{anon_vma_prepare+73}
<ffffffff80156e27>{kmem_cache_alloc+140}
<ffffffff8014f939>{anon_vma_prepare+73}
<ffffffff8014af5f>{__handle_mm_fault+899}
<ffffffff80117ee8>{do_page_fault+937}
<ffffffff80141857>{generic_file_aio_read+52}
<ffffffff801ddede>{prio_tree_insert+432}
<ffffffff8010aee1>{error_exit+0} <ffffffff801e1896>{__clear_user+60}
<ffffffff801e187a>{__clear_user+32} <ffffffff80182b0d>{padzero+24}
<ffffffff80182f2b>{load_elf_interp+803}
<ffffffff80183d87>{load_elf_binary+3437}
<ffffffff80163dd4>{search_binary_handler+110}
<ffffffff801640c7>{do_execve+375} <ffffffff8010a5b2>{system_call+126}
<ffffffff8010929c>{sys_execve+51} <ffffffff8010a9d6>{stub_execve+106}
ffff8100098ef768: redzone 1: 0x170fc2a5, redzone 2: 0x170fc2a5. slab
error in cache_alloc_debugcheck_after(): cache `anon_vma': double free,
or memory outside object was overwritten
Call Trace: <ffffffff80156d06>{cache_alloc_debugcheck_after+153}
<ffffffff8014f939>{anon_vma_prepare+73}
<ffffffff80156e27>{kmem_cache_alloc+140}
<ffffffff8014f939>{anon_vma_prepare+73}
<ffffffff8014ad64>{__handle_mm_fault+392}
<ffffffff801566a0>{check_poison_obj+48}
<ffffffff80117ee8>{do_page_fault+937}
<ffffffff8014dc68>{do_mmap_pgoff+1507} <ffffffff8010aee1>{error_exit+0}
ffff8100098ef290: redzone 1: 0x170fc2a5, redzone 2: 0x170fc2a5.
I'm going to reboot ;)
--
Paolo Ornati
Linux 2.6.16-rc5-g7b14e3b5 on x86_64
On Wed, 1 Mar 2006 16:06:56 +0100
Paolo Ornati wrote:
> I'm going to reboot ;)
Some more info...
1) System
AMD Athlon64 3200+
2 x 256MB DDR400 (Corsair Value)
Asus A8VSE Deluxe
0000:00:00.0 Host bridge: VIA Technologies, Inc. VT8385 [K8T800 AGP] Host Bridge (rev 01)
0000:00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI bridge [K8T800/K8T890 South]
0000:00:0a.0 Ethernet controller: Marvell Technology Group Ltd. 88E8001 Gigabit Ethernet Controller (rev 13)
0000:00:0e.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
0000:00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA RAID Controller (rev 80)
0000:00:0f.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
0000:00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
0000:00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
0000:00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
0000:00:10.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
0000:00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86)
0000:00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge [KT600/K8T800/K8T890 South]
0000:00:11.5 Multimedia audio controller: VIA Technologies, Inc. VT8233/A/8235/8237 AC97 Audio Controller (rev 60)
0000:00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
0000:00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
0000:00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
0000:00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
0000:01:00.0 VGA compatible controller: ATI Technologies Inc RV280 [Radeon 9200 SE] (rev 01)
0000:01:00.1 Display controller: ATI Technologies Inc RV280 [Radeon 9200 SE] (Secondary) (rev 01)
2) partitions:
/ ext3
/var reiserfs
After the reboot I checked the reiserfs partition and found errors
fixable only with "reiserfsck --rebuild-tree".
3) memory corruption?
With this PC I sporadically experienced memory corruption many months
ago... but it went away after disabling Memory Interleaving in the BIOS
(tested with memtest86+). Maybe it only fails less often...
4) Frequency scaling: the only thing I've recently enabled is frequency
scaling (with the "ondemand" governor).
--
Paolo Ornati
Linux 2.6.16-rc5-g7b14e3b5 on x86_64
On Wed, Mar 01, 2006 at 04:06:56PM +0100, Paolo Ornati wrote:
> It's the first time I see this, I don't think it is reproducible...
>
> Kernel BUG at mm/slab.c:2564
> invalid opcode: 0000 [1]
> CPU 0
> Modules linked in: xt_state ip_queue ip_conntrack iptable_filter
> ip_tables Pid: 21232, comm: as Not tainted 2.6.16-rc5-g7b14e3b5 #7
> RIP: 0010:[<ffffffff80156c5e>] <ffffffff80156c5e>{check_slabp+188}
> RSP: 0018:ffff81000a2cbdc8 EFLAGS: 00010096
> RAX: 0000000000000001 RBX: 0000000000000178 RCX: 0000000000003ee7
> RDX: 00000000ffffff01 RSI: 0000000000003ee7 RDI: ffffffff803f24c0
> RBP: ffff810000a07000 R08: 00000000fffffffe R09: ffff810000a07000
> R10: 0000000000000046 R11: 0000000000000000 R12: ffff81001ff2cec0
> R13: ffff810000a071a0 R14: 0000000000000000 R15: ffff81000a2cbee0
> FS: 00002af4db709dc0(0000) GS:ffffffff804cb000(0000)
> knlGS:00000000f6cd7bb0 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000003cf698c600 CR3: 000000000c1a3000 CR4: 00000000000006e0
> Process as (pid: 21232, threadinfo ffff81000a2ca000, task
> ffff810017e7a100) Stack: ffff810000a07000 ffff81001ff2bf20
> ffff81001ff2cec0 ffffffff80157763 ffff81001ff280a8 ffff810005f24000
> 0000000000000010 0000000000000010 ffff81001ff28098 ffff81001ff2bf20
> Call Trace: <ffffffff80157763>{free_block+154}
> <ffffffff8015796c>{cache_flusharray+111}
> <ffffffff80157585>{kmem_cache_free+78}
> <ffffffff80148cac>{free_pgtables+45} <ffffffff8014e1fc>{exit_mmap+119}
> <ffffffff8012210c>{mmput+27} <ffffffff801263d6>{do_exit+519}
> <ffffffff801269d7>{sys_exit_group+0} <ffffffff8010a5b2>{system_call+126}
>
> Code: 0f 0b 68 4f e6 38 80 c2 04 0a 5b 5d 41 5c c3 41 55 31 c0 48
> RIP <ffffffff80156c5e>{check_slabp+188} RSP <ffff81000a2cbdc8>
> <1>Fixing recursive fault but reboot is needed!
I might have hit something similar about a month ago running 2.6.16-rc1.
At the time I had written this off as a hardware problem since I was using
a questionable system, but maybe there is a hard-to-hit bug in the anon_vma
or slab code?
Pid: 10865, CPU 3, comm: sleep
psr : 00001010081a2018 ifs : 800000000000058d ip : [<a00000010012f7c0>] Tainted: G U
ip is at free_block+0x1c0/0x280
unat: 0000000000000000 pfs : 000000000000058d rsc : 0000000000000003
rnat: 0000000000000000 bsps: 0000000000000000 pr : 0000000001aa9555
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8270033f
csd : 0000000000000000 ssd : 0000000000000000
b0 : a00000010012f700 b6 : a000000100010640 b7 : a000000100010610
f6 : 0fffaaaaaaaaaaa000000 f7 : 0ffe4b040000000000000
f8 : 1000cb040000000000000 f9 : 10003c000000000000000
f10 : 10007eaffffffff150000 f11 : 1003e00000000000001d6
r1 : a000000100f81b50 r2 : e002806003045ea0 r3 : e002806077210020
r8 : 0000000000000000 r9 : 0000000000000246 r10 : 0000000000000247
r11 : 0000000000002814 r12 : e00280606e9c7d40 r13 : e00280606e9c0000
r14 : 0000000a0181dc84 r15 : 0000000000000001 r16 : e00280607a7ab594
r17 : 00000000ffffffff r18 : e002806003045eb0 r19 : e002806003045e88
r20 : 0000000000000000 r21 : 0000000000000758 r22 : e00280607a7aad8c
r23 : 0000000000200200 r24 : 0000000000100100 r25 : e00280600c0b0008
r26 : e002806072b8c000 r27 : e00280600c0b0000 r28 : a0007ffffffbbd10
r29 : a0007ffffffbbce0 r30 : 000000460a8d079c r31 : a0007dcfab938000
Call Trace:
[<a000000100014c20>] show_stack+0x40/0xa0
sp=e00280606e9c78d0 bsp=e00280606e9c14f8
[<a000000100015450>] show_regs+0x7d0/0x800
sp=e00280606e9c7aa0 bsp=e00280606e9c14b0
[<a000000100037560>] die+0x1c0/0x2a0
sp=e00280606e9c7aa0 bsp=e00280606e9c1468
[<a0000001008b5d50>] ia64_do_page_fault+0x890/0x9e0
sp=e00280606e9c7ac0 bsp=e00280606e9c1410
[<a00000010000cae0>] ia64_leave_kernel+0x0/0x280
sp=e00280606e9c7b70 bsp=e00280606e9c1410
[<a00000010012f7c0>] free_block+0x1c0/0x280
sp=e00280606e9c7d40 bsp=e00280606e9c13a0
[<a00000010012eb00>] cache_flusharray+0x140/0x1a0
sp=e00280606e9c7d40 bsp=e00280606e9c1358
[<a00000010012f350>] kmem_cache_free+0x310/0x3a0
sp=e00280606e9c7d40 bsp=e00280606e9c1310
[<a0000001001157a0>] anon_vma_unlink+0xe0/0x100
sp=e00280606e9c7d50 bsp=e00280606e9c12e8
[<a000000100109de0>] free_pgtables+0x160/0x2a0
sp=e00280606e9c7d50 bsp=e00280606e9c12a0
[<a00000010010c1b0>] exit_mmap+0x130/0x440
sp=e00280606e9c7d50 bsp=e00280606e9c1250
[<a0000001000830d0>] mmput+0x50/0x180
sp=e00280606e9c7e20 bsp=e00280606e9c1220
[<a00000010008cb90>] exit_mm+0x330/0x360
sp=e00280606e9c7e20 bsp=e00280606e9c11d8
[<a000000100090880>] do_exit+0x400/0x1340
sp=e00280606e9c7e20 bsp=e00280606e9c1178
[<a000000100091940>] do_group_exit+0x180/0x1a0
sp=e00280606e9c7e30 bsp=e00280606e9c1140
[<a000000100091980>] sys_exit_group+0x20/0x40
sp=e00280606e9c7e30 bsp=e00280606e9c10e8
[<a00000010000c940>] ia64_ret_from_syscall+0x0/0x20
sp=e00280606e9c7e30 bsp=e00280606e9c10e8
[<a000000000010640>] __kernel_syscall_via_break+0x0/0x20
sp=e00280606e9c8000 bsp=e00280606e9c10e8
<1>Unable to handle kernel NULL pointer dereference (address 0000000000000768)
sleep[10865]: Oops 11012296146944 [2]
Dean
--
Dean Roe
Silicon Graphics, Inc.
On Wed, 1 Mar 2006 11:36:36 -0600
Dean Roe wrote:
> I might have hit something similar about a month ago running 2.6.16-rc1.
> At the time I had written this off as a hardware problem since I was using
> a questionable system, but maybe there is a hard-to-hit bug in the anon_vma
> or slab code?
Something happened again here!
Slab corruption: start=ffff81000d0ffb30, len=104
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<ffffffff8015caac>](end_bio_bh_io_sync+0x35/0x39)
000: 6b 6b 6b 2b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Prev obj: start=ffff81000d0ffab0, len=104
Redzone: 0x170fc2a5/0x170fc2a5.
Last user: [<ffffffff80141b05>](mempool_alloc+0x44/0xdf)
000: 3e db d8 05 00 00 00 00 00 00 00 00 00 00 00 00
010: 58 a6 f1 1f 00 81 ff ff 09 00 00 00 00 00 00 00
Next obj: start=ffff81000d0ffbb0, len=104
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<ffffffff8015caac>](end_bio_bh_io_sync+0x35/0x39)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Slab corruption: start=ffff81000d0ffb30, len=104
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<ffffffff8015caac>](end_bio_bh_io_sync+0x35/0x39)
000: 6b 6b 6b 2b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Prev obj: start=ffff81000d0ffab0, len=104
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<ffffffff8015caac>](end_bio_bh_io_sync+0x35/0x39)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Next obj: start=ffff81000d0ffbb0, len=104
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<ffffffff8015caac>](end_bio_bh_io_sync+0x35/0x39)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Hmm... I'm going to disable CPU freq scaling. It's the only thing I've
recently enabled, so maybe it's causing some kind of instability?!
--
Paolo Ornati
Linux 2.6.16-rc5-g800d1142 on x86_64
On Wed, 1 Mar 2006 17:07:18 +0100
Paolo Ornati wrote:
> Asus A8VSE Deluxe
small correction "A8V" --> "K8V"
--
Paolo Ornati
Linux 2.6.16-rc5-g800d1142 on x86_64
On Thu, 2 Mar 2006 09:07:28 +0100
Paolo Ornati wrote:
> Mmmm... I'm going to disable CPU freq scaling, it's the only thing I've
> recently enabled, maybe it's causing some kind of instability ?!
After a clean re-compilation (usually I use ccache and I don't do
"make clean") I'm unable to reproduce the problem even with Freq.
Scaling enabled, so maybe it was just a miscompiled kernel or
something...
--
Paolo Ornati
Linux 2.6.16-rc5-gc499ec24 on x86_64
On Thu, 2 Mar 2006, Paolo Ornati wrote:
>
> Something is happened again here!
I think you have bad ram.
> Slab corruption: start=ffff81000d0ffb30, len=104
> Redzone: 0x5a2cf071/0x5a2cf071.
> Last user: [<ffffffff8015caac>](end_bio_bh_io_sync+0x35/0x39)
> 000: 6b 6b 6b 2b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> Slab corruption: start=ffff81000d0ffb30, len=104
> Redzone: 0x5a2cf071/0x5a2cf071.
> Last user: [<ffffffff8015caac>](end_bio_bh_io_sync+0x35/0x39)
> 000: 6b 6b 6b 2b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
It's the same corruption both times, and the exact same slab entry.
And it's a single-bit error: the "2b" should be a "6b".
Now, it could have been a software error, clearing that one bit, but the
thing is, that is the first word in a "struct bio", which should be a
"sector_t bi_sector". The entries around it are also "struct bio"s, and we
don't do any bit-operations on anything in that area (on "bi_flags", yes).
The fact that it was the very same bit both times (not just the same
offset: the same physical address) makes me suspect bad RAM.
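The single-bit claim can be checked with a short sketch. The helper name
flipped_bits is made up for illustration; 0x6b is the byte expected at
that offset according to the message above, and 0x2b is the byte that
appears in the dump:

```python
def flipped_bits(expected, observed):
    """Return the positions (0 = least significant) of the bits that differ."""
    diff = expected ^ observed
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


# Byte values taken from the hexdump: "6b" expected, "2b" found.
bits = flipped_bits(0x6B, 0x2B)
print("xor = %#04x, differing bit positions = %s" % (0x6B ^ 0x2B, bits))
```

A result with exactly one position means the two bytes differ in a single
bit, which is what a flipped memory cell would look like. A stray software
write would more typically change a whole byte or word.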
Linus
On Mon, 6 Mar 2006 11:16:13 -0800 (PST)
Linus Torvalds wrote:
> On Thu, 2 Mar 2006, Paolo Ornati wrote:
> >
> > Something is happened again here!
>
> I think you have bad ram.
>
> > Slab corruption: start=ffff81000d0ffb30, len=104
> > Redzone: 0x5a2cf071/0x5a2cf071.
> > Last user: [<ffffffff8015caac>](end_bio_bh_io_sync+0x35/0x39)
> > 000: 6b 6b 6b 2b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
My suspicion was that the failing address was the same one I had already
seen some time ago with memtest86+, which was (apparently?) fixed by
disabling "bank interleaving" in the BIOS.
But now that I've rechecked... it was a different address:
76.1 MB -- 04c0 37fc
TEST 6
good FF FF FF FD
bad F7 FF FF FD
The one detected with DEBUG_SLAB is at 208.99 MB (so both problems are
in my first 256MB memory module) but I'm unable to reproduce it with
memtest86+...
I wonder if these two are related in some way... or maybe it's just a
weak memory module ;)
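For reference, a rough sketch of how the 208.99 MB figure can be derived
from the slab address, and of how the memtest86+ mismatch compares. It
assumes that this kernel maps physical memory linearly starting at
0xffff810000000000 (the constant ASSUMED_PAGE_OFFSET below is that
assumption, not something stated in this thread), and that the memtest86+
"good"/"bad" values can be read as one 32-bit word:

```python
# Assumption: direct mapping of physical memory starts at this virtual
# address on this x86_64 kernel; adjust if the kernel differs.
ASSUMED_PAGE_OFFSET = 0xFFFF810000000000
MIB = 1 << 20


def phys_offset_mib(vaddr, page_offset=ASSUMED_PAGE_OFFSET):
    """Convert a direct-mapped kernel virtual address to an offset in MiB."""
    return (vaddr - page_offset) / MIB


slab_obj = 0xFFFF81000D0FFB30  # "start=" value from the slab corruption report
bad_byte = slab_obj + 3        # the "2b" is the fourth byte of the dump line

mib = phys_offset_mib(bad_byte)
print("corrupted byte at about %.4f MiB" % mib)
print("inside the first 256MB module:", mib < 256)

# memtest86+ result quoted above, read as a 32-bit word (assumption).
good, bad = 0xFFFFFFFD, 0xF7FFFFFD
memtest_diff = good ^ bad
print("memtest86+ mismatch differs in %d bit(s)" % bin(memtest_diff).count("1"))
```

Under those assumptions both failures are single-bit differences where a 1
was read back as 0, and the slab one lands just below 209 MiB.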
--
Paolo Ornati
Linux 2.6.16-rc5-g501f74f2 on x86_64