From: "J. Bruce Fields"
Subject: Re: 2.6.26 stable kernel crash with NFS exporting a XFS share
Date: Mon, 4 Aug 2008 13:53:32 -0400
Message-ID: <20080804175332.GC25940@fieldses.org>
References: <4893CACA.1060605@cse.unsw.edu.au>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: linux-nfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
To: Shehjar Tikoo
Return-path:
In-Reply-To: <4893CACA.1060605@cse.unsw.edu.au>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Sat, Aug 02, 2008 at 12:47:38PM +1000, Shehjar Tikoo wrote:
> Hi All
>
> Please see the attached text file which contains
> details of a crash I observed recently while
> running some tests against Linux nfsd with
> XFS as the file system. Details are all there in
> the file.
>
>
> -Shehjar
> Kernel 2.6.26 stable on both client and server
> sunrpc.tcp_slot_table_entries=96
> NFS wsize=32k
> Server file system = XFS over software raid0 w/ 3 10k RPM SAS disks
> nfsd threads = 512
> Gbit network w/ Jumbo frames
> Workload:
> Iozone 1-20 writer processes
> write() syscall record/buffer size= 50Megs
> File size = 500Megs
> Server machine is a 2xDual Core Itanium 2 with 8Gigs of RAM, so in theory, maximum memory pressure
> will be for (500Megs * 20 Processes)
> Client machine is a 4xDual Core Itanium 2 with 16Gigs of RAM.
>
>
> Remarks:
> 1. The crash is preceded by hundreds of lines which look like this:
> Filesystem "md0": Access to block zero in inode 271 start_block: 0 start_off: 0 blkcnt: 0 extent-state: 0 lastx: fa

So that's an xfs_cmn_err() in
fs/xfs/xfs_bmap.c:xfs_bmap_search_extents(). Is there an xfs developer
that could explain what that likely means?

--b.

>
> 2. Cannot be reproduced on every run of the above workload. Disk corruption?
>
>
> The stack trace:
> ================
> Unable to handle kernel paging request at virtual address 000000000153e1ff
> nfsd[10703]: Oops 11012296146944 [1]
>
> Pid: 10703, CPU 1, comm: nfsd
> psr : 00001210085a2010 ifs : 800000000000038c ip : [] Not tainted (2.6.26)
> ip is at __kmalloc+0x131/0x220
> unat: 0000000000000000 pfs : 000000000000038c rsc : 0000000000000003
> rnat: a0000001004c0780 bsps: a0000001001a8060 pr : 666a96a6a5996665
> ldrs: 0000000000000000 ccv : 0000000000000002 fpsr: 0009804c8a70433f
> csd : 0000000000000000 ssd : 0000000000000000
> b0 : a0000001001494c0 b6 : a0000001004c0780 b7 : a0000001003b4520
> f6 : 0fff7810204080f809f84 f7 : 0ffdf9d24f78a00000000
> f8 : 10006fe00000000000000 f9 : 10006fe00000000000000
> f10 : 0fffefffffffffec33c8a f11 : 1003e0000000000000001
> r1 : a0000001011f1c80 r2 : 0000000000508040 r3 : e0000001951b8014
> r8 : a000000100f26060 r9 : e000000100015c18 r10 : a000000100f26150
> r11 : 000000000000001d r12 : e0000001951bf8b0 r13 : e0000001951b8000
> r14 : 00000000000000c0 r15 : a000000100f26068 r16 : 0000000000000001
> r17 : 0000000000000001 r18 : e0000001951b8c04 r19 : a000000100ff21f0
> r20 : 000000000153e1ff r21 : 0000000000000000 r22 : e000000100015c14
> r23 : 0000000000000000 r24 : 00000000000000ff r25 : 000000000000000e
> r26 : e000010040919a72 r27 : 0000000000000000 r28 : 0000000000000100
> r29 : 0000000000000101 r30 : e0000001551e93ec r31 : 000ffffffffe0000
>
> Call Trace:
> [] show_stack+0x50/0xa0
>     sp=e0000001951bf480 bsp=e0000001951b9bc0
> [] show_regs+0x800/0x840
>     sp=e0000001951bf650 bsp=e0000001951b9b68
> [] die+0x1a0/0x2c0
>     sp=e0000001951bf650 bsp=e0000001951b9b28
> [] ia64_do_page_fault+0x8b0/0x9e0
>     sp=e0000001951bf650 bsp=e0000001951b9ad8
> [] ia64_leave_kernel+0x0/0x270
>     sp=e0000001951bf6e0 bsp=e0000001951b9ad8
> [] __kmalloc+0x130/0x220
>     sp=e0000001951bf8b0 bsp=e0000001951b9a78
> [] kmem_alloc+0x140/0x2a0
>     sp=e0000001951bf8b0 bsp=e0000001951b9a38
> [] xfs_iext_add_indirect_multi+0xa0/0x440
>     sp=e0000001951bf8b0 bsp=e0000001951b99c0
> [] xfs_iext_add+0x3e0/0x4a0
>     sp=e0000001951bf8b0 bsp=e0000001951b9968
> [] xfs_iext_insert+0x30/0xc0
>     sp=e0000001951bf8c0 bsp=e0000001951b9930
> [] xfs_bmap_add_extent_hole_delay+0x6e0/0x7c0
>     sp=e0000001951bf8c0 bsp=e0000001951b98b0
> [] xfs_bmap_add_extent+0x2a0/0x7a0
>     sp=e0000001951bf900 bsp=e0000001951b9820
> [] xfs_bmapi+0x1020/0x1c40
>     sp=e0000001951bf950 bsp=e0000001951b95a0
> [] xfs_iomap_write_delay+0x2e0/0x400
>     sp=e0000001951bfa20 bsp=e0000001951b94a8
> [] xfs_iomap+0x380/0x520
>     sp=e0000001951bfaf0 bsp=e0000001951b9448
> [] __xfs_get_blocks+0xc0/0x560
>     sp=e0000001951bfb40 bsp=e0000001951b93e8
> [] xfs_get_blocks+0x40/0x60
>     sp=e0000001951bfb80 bsp=e0000001951b93b0
> [] __block_prepare_write+0x400/0xa80
>     sp=e0000001951bfb80 bsp=e0000001951b92d0
> [] block_write_begin+0x100/0x1e0
>     sp=e0000001951bfba0 bsp=e0000001951b9260
> [] xfs_vm_write_begin+0x50/0x80
>     sp=e0000001951bfba0 bsp=e0000001951b9210
> [] generic_file_buffered_write+0x1d0/0xe00
>     sp=e0000001951bfba0 bsp=e0000001951b9108
> [] xfs_write+0x910/0xd80
>     sp=e0000001951bfbd0 bsp=e0000001951b8fc8
> [] xfs_file_aio_write+0xf0/0x120
>     sp=e0000001951bfc20 bsp=e0000001951b8f90
> [] do_sync_readv_writev+0x140/0x1c0
>     sp=e0000001951bfc20 bsp=e0000001951b8f40
> [] do_readv_writev+0x160/0x280
>     sp=e0000001951bfd10 bsp=e0000001951b8ef0
> [] vfs_writev+0xc0/0x100
>     sp=e0000001951bfda0 bsp=e0000001951b8eb8
> [] nfsd_vfs_write+0x200/0x660
>     sp=e0000001951bfda0 bsp=e0000001951b8e48
> [] nfsd_write+0x140/0x1a0
>     sp=e0000001951bfe00 bsp=e0000001951b8de8
> [] nfsd3_proc_write+0x1b0/0x200
>     sp=e0000001951bfe10 bsp=e0000001951b8db0
> [] nfsd_dispatch+0x220/0x4c0
>     sp=e0000001951bfe10 bsp=e0000001951b8d70
> [] svc_process+0xc30/0x1b40
>     sp=e0000001951bfe10 bsp=e0000001951b8d18
> [] nfsd+0x350/0x600
>     sp=e0000001951bfe20 bsp=e0000001951b8c88
> [] kernel_thread_helper+0xd0/0x100
>     sp=e0000001951bfe30 bsp=e0000001951b8c60
> [] start_kernel_thread+0x20/0x40
>     sp=e0000001951bfe30 bsp=e0000001951b8c60
>