Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S261362AbVEMU5f (ORCPT ); Fri, 13 May 2005 16:57:35 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
        id S261486AbVEMUua (ORCPT ); Fri, 13 May 2005 16:50:30 -0400
Received: from mail.wildbrain.com ([209.130.193.228]:57504 "EHLO hermes.wildbrain.com")
        by vger.kernel.org with ESMTP id S261362AbVEMUpx (ORCPT );
        Fri, 13 May 2005 16:45:53 -0400
Message-ID: <428511F8.6020303@wildbrain.com>
Date: Fri, 13 May 2005 13:45:44 -0700
From: Gregory Brauer
User-Agent: Mozilla Thunderbird 1.0.2 (X11/20050317)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: linux-kernel@vger.kernel.org
Subject: kernel OOPS for XFS in xfs_iget_core (using NFS+SMP+MD)
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-WB-MailScanner: Found to be clean
X-WB-MailScanner-SpamCheck: not spam (whitelisted), SpamAssassin (score=-2.06,
        required 5, BAYES_00 -2.60, TW_CL 0.08, TW_FX 0.08, TW_JB 0.08,
        TW_UH 0.08, TW_XF 0.08, TW_XS 0.08, TW_XT 0.08)
X-MailScanner-From: greg@wildbrain.com
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 14257
Lines: 327

I have seen some references to a similar bug in other kernel list posts
from October 2004 and am trying to figure out if this is the same problem,
or something new related to the xfs_iget_core patch in 2.6.11. This seems
to be a very hard to reproduce bug, but we've seen this problem twice in a
week of testing under Fedora Core 3 on the following kernel:

Linux violet 2.6.10-1.741_FC3smp #1 SMP Thu Jan 13 16:53:16 EST 2005 i686 i686 i386 GNU/Linux

This is on a system with 4 3ware 8506-4 cards in a software-RAID-0
configuration to make one large volume. The first traceback was triggered
after 1 week of a production NFS load and the second traceback was
triggered after two hours of a load generated by five NFS clients running
nfs_fsstress.sh.
The two tracebacks follow:

May 10 15:49:56 violet kernel: xfs_iget_core: ambiguous vns: vp/0xee02202c, invp/0xe13dddd0
May 10 15:49:56 violet kernel: ------------[ cut here ]------------
May 10 15:49:56 violet kernel: kernel BUG at fs/xfs/support/debug.c:106!
May 10 15:49:56 violet kernel: invalid operand: 0000 [#1]
May 10 15:49:56 violet kernel: SMP
May 10 15:49:56 violet kernel: Modules linked in: nfs vfat fat nfsd lockd md5 ipv6 parport_pc lp parport sunrpc xfs exportfs dm_mod video button battery ac uhci_hcd hw_random i2c_i801 i2c_core e1000 bonding floppy ext3 jbd raid0 3w_xxxx sd_mod scsi_mod
May 10 15:49:56 violet kernel: CPU: 1
May 10 15:49:56 violet kernel: EIP: 0060:[] Not tainted VLI
May 10 15:49:56 violet kernel: EFLAGS: 00010246 (2.6.11-1.14_FC3smp)
May 10 15:49:56 violet kernel: EIP is at cmn_err+0x98/0xa8 [xfs]
May 10 15:49:56 violet kernel: eax: 00000000 ebx: f8aa9a84 ecx: f8abd364 edx: 00000000
May 10 15:49:56 violet kernel: esi: f8aa9304 edi: f8abe91e ebp: 00000000 esp: d77f4af0
May 10 15:49:56 violet kernel: ds: 007b es: 007b ss: 0068
May 10 15:49:56 violet kernel: Process nfsd (pid: 6303, threadinfo=d77f4000 task=e7412aa0)
May 10 15:49:56 violet kernel: Stack: f8aa9304 f8aa92cb f8abe8e0 00000292 d77f4000 a1e0b035 00000000 f53891ac
May 10 15:49:56 violet kernel: f8a7932f 00000000 f8aa9a84 ee02202c e13dddd0 f7005200 f8aa4fe1 d82d0354
May 10 15:49:56 violet kernel: 000001e0 e13dddf4 d82d0350 00000000 cabd6400 e13dddd0 f53891ac e13dddf4
May 10 15:49:56 violet kernel: Call Trace:
May 10 15:49:56 violet kernel: [] xfs_iget_core+0x4f9/0x5e2 [xfs]
May 10 15:49:56 violet kernel: [] linvfs_alloc_inode+0x20/0x2d [xfs]
May 10 15:49:56 violet kernel: [] xfs_iget+0xd4/0x16a [xfs]
May 10 15:49:56 violet kernel: [] xfs_vget+0x61/0xce [xfs]
May 10 15:49:56 violet kernel: [] vfs_vget+0x2d/0x30 [xfs]
May 10 15:49:56 violet kernel: [] linvfs_get_dentry+0x3b/0x6f [xfs]
May 10 15:49:56 violet kernel: [] find_exported_dentry+0x2d/0x601 [exportfs]
May 10 15:49:56 violet kernel: [] e1000_xmit_frame+0x554/0x8b3 [e1000]
May 10 15:49:56 violet kernel: [] __kfree_skb+0xab/0x155
May 10 15:49:56 violet kernel: [] xfs_log_release_iclog+0x11/0x3e [xfs]
May 10 15:49:56 violet kernel: [] copy_to_user+0x4d/0x5f
May 10 15:49:56 violet kernel: [] memcpy_toiovec+0x27/0x4c
May 10 15:49:56 violet kernel: [] skb_copy_datagram_iovec+0x4d/0x1f7
May 10 15:49:56 violet kernel: [] release_sock+0x12/0x79
May 10 15:49:56 violet kernel: [] tcp_recvmsg+0x2d5/0x738
May 10 15:49:56 violet kernel: [] sock_common_recvmsg+0x41/0x57
May 10 15:49:56 violet kernel: [] sock_recvmsg+0x10c/0x127
May 10 15:49:56 violet kernel: [] d_instantiate+0x54/0x58
May 10 15:49:56 violet kernel: [] recalc_task_prio+0x8a/0x150
May 10 15:49:56 violet kernel: [] activate_task+0x8a/0x99
May 10 15:49:56 violet kernel: [] try_to_wake_up+0x238/0x270
May 10 15:49:56 violet kernel: [] svc_expkey_lookup+0x404/0x425 [nfsd]
May 10 15:49:56 violet kernel: [] exp_find_key+0x69/0x7e [nfsd]
May 10 15:49:56 violet kernel: [] linvfs_decode_fh+0x5e/0xb6 [xfs]
May 10 15:49:56 violet kernel: [] nfsd_acceptable+0x0/0xdd [nfsd]
May 10 15:49:56 violet kernel: [] linvfs_decode_fh+0x0/0xb6 [xfs]
May 10 15:49:56 violet kernel: [] fh_verify+0x1cb/0x578 [nfsd]
May 10 15:49:56 violet kernel: [] nfsd_acceptable+0x0/0xdd [nfsd]
May 10 15:49:56 violet kernel: [] schedule+0x33d/0xb11
May 10 15:49:56 violet kernel: [] nfsd_open+0x22/0x1bd [nfsd]
May 10 15:49:56 violet kernel: [] nfsd_write+0x31/0x316 [nfsd]
May 10 15:49:56 violet kernel: [] ip_map_lookup+0x34c/0x397 [sunrpc]
May 10 15:49:57 violet kernel: [] svcauth_unix_set_client+0xab/0xb2 [sunrpc]
May 10 15:49:57 violet kernel: [] del_singleshot_timer_sync+0x8/0x27
May 10 15:49:57 violet kernel: [] schedule_timeout+0x91/0xfd
May 10 15:49:57 violet kernel: [] nfsd3_proc_write+0xab/0x10a [nfsd]
May 10 15:49:57 violet kernel: [] nfs3svc_decode_writeargs+0x0/0x175 [nfsd]
May 10 15:49:57 violet kernel: [] nfsd_dispatch+0x80/0x1db [nfsd]
May 10 15:49:57 violet kernel: [] svc_authenticate+0x79/0xb1 [sunrpc]
May 10 15:49:57 violet kernel: [] svc_process+0x52e/0x60e [sunrpc]
May 10 15:49:57 violet kernel: [] nfsd+0x185/0x300 [nfsd]
May 10 15:49:57 violet kernel: [] nfsd+0x0/0x300 [nfsd]
May 10 15:49:57 violet kernel: [] kernel_thread_helper+0x5/0xb
May 10 15:49:57 violet kernel: Code: f8 c7 44 24 08 e0 e8 ab f8 c7 04 24 04 93 aa f8 89 44 24 04 e8 a5 b2 67 c7 b8 64 d3 ab f8 8b 54 24 0c e8 9f a8 85 c7 85 ed 75 08 <0f> 0b 6a 00 eb 92 aa f8 83 c4 10 5b 5e 5f 5d c3 55 89 c5 b8 07

May 12 19:42:01 violet kernel: xfs_iget_core: ambiguous vns: vp/0xd710bc4c, invp/0xc400a7c0
May 12 19:42:01 violet kernel: ------------[ cut here ]------------
May 12 19:42:01 violet kernel: kernel BUG at fs/xfs/support/debug.c:106!
May 12 19:42:01 violet kernel: invalid operand: 0000 [#1]
May 12 19:42:01 violet kernel: SMP
May 12 19:42:01 violet kernel: Modules linked in: loop nfs nfsd lockd md5 ipv6 parport_pc lp parport sunrpc xfs exportfs dm_mod video button battery ac uhci_hcd hw_random i2c_i801 i2c_core e1000 bonding floppy ext3 jbd raid0 3w_xxxx sd_mod scsi_mod
May 12 19:42:01 violet kernel: CPU: 1
May 12 19:42:01 violet kernel: EIP: 0060:[] Not tainted VLI
May 12 19:42:01 violet kernel: EFLAGS: 00010246 (2.6.11-1.14_FC3smp)
May 12 19:42:01 violet kernel: EIP is at cmn_err+0x98/0xa8 [xfs]
May 12 19:42:01 violet kernel: eax: 00000000 ebx: f8aa9a84 ecx: f8abd364 edx: 00000000
May 12 19:42:01 violet kernel: esi: f8aa9304 edi: f8abe91e ebp: 00000000 esp: f77adb5c
May 12 19:42:01 violet kernel: ds: 007b es: 007b ss: 0068
May 12 19:42:01 violet kernel: Process nfsd (pid: 4186, threadinfo=f77ad000 task=f779f560)
May 12 19:42:01 violet kernel: Stack: f8aa9304 f8aa92cb f8abe8e0 00000296 f77ad000 20819027 00000000 e29c5aac
May 12 19:42:01 violet kernel: f8a7932f 00000000 f8aa9a84 d710bc4c c400a7c0 f67f7400 f8aa4fe1 f76d0274
May 12 19:42:01 violet kernel: 00000000 c400a7e4 f76d0270 00000000 f67ef000 c400a7c0 e29c5aac c400a7e4
May 12 19:42:01 violet kernel: Call Trace:
May 12 19:42:01 violet kernel: [] xfs_iget_core+0x4f9/0x5e2 [xfs]
May 12 19:42:01 violet kernel: [] linvfs_alloc_inode+0x20/0x2d [xfs]
May 12 19:42:01 violet kernel: [] xfs_iget+0xd4/0x16a [xfs]
May 12 19:42:01 violet kernel: [] xfs_vget+0x61/0xce [xfs]
May 12 19:42:01 violet kernel: [] __kfree_skb+0xab/0x155
May 12 19:42:01 violet kernel: [] vfs_vget+0x2d/0x30 [xfs]
May 12 19:42:01 violet kernel: [] linvfs_get_dentry+0x3b/0x6f [xfs]
May 12 19:42:01 violet kernel: [] find_exported_dentry+0x2d/0x601 [exportfs]
May 12 19:42:01 violet kernel: [] __kfree_skb+0xab/0x155
May 12 19:42:01 violet kernel: [] release_sock+0x12/0x79
May 12 19:42:01 violet kernel: [] tcp_recvmsg+0x2d5/0x738
May 12 19:42:01 violet kernel: [] sock_common_recvmsg+0x41/0x57

processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Xeon(TM) CPU 2.66GHz
stepping : 7
cpu MHz : 2666.116
cache size : 512 KB
physical id : 0
siblings : 2
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid xtpr
bogomips : 5275.64

processor : 1
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Xeon(TM) CPU 2.66GHz
stepping : 7
cpu MHz : 2666.116
cache size : 512 KB
physical id : 3
siblings : 2
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid xtpr
bogomips : 5324.80

PCI devices found:
  Bus 0, device 0, function 0:
    Class 0600: PCI device 8086:254c (rev 1).
  Bus 0, device 0, function 1:
    Class ff00: PCI device 8086:2541 (rev 1).
  Bus 0, device 2, function 0:
    Class 0604: PCI device 8086:2543 (rev 1).
      Master Capable. Latency=64. Min Gnt=4.
  Bus 0, device 3, function 0:
    Class 0604: PCI device 8086:2545 (rev 1).
      Master Capable. Latency=64. Min Gnt=4.
  Bus 0, device 29, function 0:
    Class 0c03: PCI device 8086:2482 (rev 2).
      IRQ 225.
      I/O at 0x2000 [0x201f].
  Bus 0, device 29, function 1:
    Class 0c03: PCI device 8086:2484 (rev 2).
      IRQ 233.
      I/O at 0x2020 [0x203f].
  Bus 0, device 29, function 2:
    Class 0c03: PCI device 8086:2487 (rev 2).
      IRQ 169.
      I/O at 0x2040 [0x205f].
  Bus 0, device 30, function 0:
    Class 0604: PCI device 8086:244e (rev 66).
      Master Capable. No bursts. Min Gnt=12.
  Bus 0, device 31, function 0:
    Class 0601: PCI device 8086:2480 (rev 2).
  Bus 0, device 31, function 1:
    Class 0101: PCI device 8086:248b (rev 2).
      IRQ 169.
      I/O at 0x2060 [0x206f].
      Non-prefetchable 32 bit memory at 0xf8000000 [0xf80003ff].
  Bus 0, device 31, function 3:
    Class 0c05: PCI device 8086:2483 (rev 2).
      I/O at 0x1100 [0x111f].
  Bus 1, device 28, function 0:
    Class 0800: PCI device 8086:1461 (rev 4).
      Non-prefetchable 32 bit memory at 0xf9100000 [0xf9100fff].
  Bus 1, device 30, function 0:
    Class 0800: PCI device 8086:1461 (rev 4).
      Non-prefetchable 32 bit memory at 0xf9101000 [0xf9101fff].
  Bus 4, device 28, function 0:
    Class 0800: PCI device 8086:1461 (rev 4).
      Non-prefetchable 32 bit memory at 0xfa000000 [0xfa000fff].
  Bus 4, device 30, function 0:
    Class 0800: PCI device 8086:1461 (rev 4).
      Non-prefetchable 32 bit memory at 0xfa001000 [0xfa001fff].
  Bus 1, device 29, function 0:
    Class 0604: PCI device 8086:1460 (rev 4).
      Master Capable. Latency=64. Min Gnt=4.
  Bus 1, device 31, function 0:
    Class 0604: PCI device 8086:1460 (rev 4).
      Master Capable. Latency=64. Min Gnt=4.
  Bus 4, device 29, function 0:
    Class 0604: PCI device 8086:1460 (rev 4).
      Master Capable. Latency=64. Min Gnt=4.
  Bus 4, device 31, function 0:
    Class 0604: PCI device 8086:1460 (rev 4).
      Master Capable. Latency=64. Min Gnt=4.
  Bus 3, device 1, function 0:
    Class 0104: PCI device 13c1:1001 (rev 1).
      IRQ 177.
      Master Capable. Latency=72. Min Gnt=9.
      I/O at 0x3080 [0x308f].
      Non-prefetchable 32 bit memory at 0xf9240000 [0xf924000f].
      Non-prefetchable 32 bit memory at 0xf9800000 [0xf9ffffff].
  Bus 5, device 1, function 0:
    Class 0104: PCI device 13c1:1001 (rev 1).
      IRQ 185.
      Master Capable. Latency=72. Min Gnt=9.
      I/O at 0x4000 [0x400f].
      Non-prefetchable 32 bit memory at 0xfa100000 [0xfa10000f].
      Non-prefetchable 32 bit memory at 0xfa800000 [0xfaffffff].
  Bus 5, device 3, function 0:
    Class 0104: PCI device 13c1:1001 (rev 1).
      IRQ 193.
      Master Capable. Latency=72. Min Gnt=9.
      I/O at 0x4010 [0x401f].
      Non-prefetchable 32 bit memory at 0xfa100400 [0xfa10040f].
      Non-prefetchable 32 bit memory at 0xfb000000 [0xfb7fffff].
  Bus 6, device 1, function 0:
    Class 0104: PCI device 13c1:1001 (rev 1).
      IRQ 201.
      Master Capable. Latency=72. Min Gnt=9.
      I/O at 0x5000 [0x500f].
      Non-prefetchable 32 bit memory at 0xfc000000 [0xfc00000f].
      Non-prefetchable 32 bit memory at 0xfb800000 [0xfbffffff].
  Bus 3, device 2, function 0:
    Class 0200: PCI device 8086:1010 (rev 1).
      IRQ 209.
      Master Capable. Latency=64. Min Gnt=255.
      Non-prefetchable 64 bit memory at 0xf9200000 [0xf921ffff].
      I/O at 0x3000 [0x303f].
  Bus 3, device 2, function 1:
    Class 0200: PCI device 8086:1010 (rev 1).
      IRQ 217.
      Master Capable. Latency=64. Min Gnt=255.
      Non-prefetchable 64 bit memory at 0xf9220000 [0xf923ffff].
      I/O at 0x3040 [0x307f].
  Bus 7, device 1, function 0:
    Class 0300: PCI device 1002:4752 (rev 39).
      IRQ 11.
      Master Capable. Latency=64. Min Gnt=8.
      Non-prefetchable 32 bit memory at 0xfd000000 [0xfdffffff].
      I/O at 0x6000 [0x60ff].
      Non-prefetchable 32 bit memory at 0xfc100000 [0xfc100fff].

I would appreciate any information you can offer.

Greg Brauer
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/