Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S261254AbVESVBY (ORCPT ); Thu, 19 May 2005 17:01:24 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
        id S261215AbVESVBY (ORCPT ); Thu, 19 May 2005 17:01:24 -0400
Received: from chaos.egr.duke.edu ([152.3.195.82]:32899 "EHLO chaos.egr.duke.edu")
        by vger.kernel.org with ESMTP id S261254AbVESVAh (ORCPT );
        Thu, 19 May 2005 17:00:37 -0400
Date: Thu, 19 May 2005 17:00:22 -0400 (EDT)
From: Joshua Baker-LePain
X-X-Sender: jlb@chaos.egr.duke.edu
To: Gregory Brauer
cc: linux-kernel@vger.kernel.org, linux-xfs@oss.sgi.com, Jakob Oestergaard, Chris Wedgwood
Subject: Re: kernel OOPS for XFS in xfs_iget_core (using NFS+SMP+MD)
In-Reply-To:
Message-ID:
References: <428511F8.6020303@wildbrain.com> <20050514184711.GA27565@taniwha.stupidest.org>
        <428B7D7F.9000107@wildbrain.com> <20050518175925.GA22738@taniwha.stupidest.org>
        <20050518195251.GY422@unthought.net> <428BA8E4.2040108@wildbrain.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 5362
Lines: 88

On Thu, 19 May 2005 at 3:43pm, Joshua Baker-LePain wrote

> On Wed, 18 May 2005 at 1:43pm, Gregory Brauer wrote
>
> > We first saw the problem after 5 days in production, but since then
> > we took the server out of production and used the script
> > nfs_fsstress.sh located in this package:
> >
> > http://prdownloads.sourceforge.net/ltp/ltp-full-20050505.tgz?download
> >
> > We run the script on 5 client machines that are running RedHat 9
> > with kernel-smp-2.4.20-20.9 and nfs-utils-1.0.1-3.9.1.legacy and
> > are NFS mounting our 2.6 kernel server.  The longest time to OOPS
>
> My clients are all RHEL3 (well, centos 3 actually).  I'll give the
> nfs_fsstress scripts a shot.

Hrm.  That didn't take long.
I've got 6 clients (3 per interface) going with nfs_fsstress.sh, and I
saw the following in the logs on the server after about 20 minutes.
Note, however, that I can still access the FS, both locally on the
server and via NFS.  The scripts are still going, but some have already
reported errors (obviously).

The server error:

May 19 16:47:10 norbert kernel: xfs_da_do_buf: bno 8388608
May 19 16:47:10 norbert kernel: dir: inode 2162706
May 19 16:47:10 norbert kernel: Filesystem "md0": XFS internal error xfs_da_do_buf(1) at line 2176 of file fs/xfs/xfs_da_btree.c.  Caller 0xf8c90148
May 19 16:47:10 norbert kernel:  [] xfs_da_do_buf+0x357/0x70d [xfs]
May 19 16:47:10 norbert kernel:  [] xfs_da_read_buf+0x19/0x1e [xfs]
May 19 16:47:10 norbert kernel:  [] buffered_rmqueue+0x17d/0x1a5
May 19 16:47:10 norbert kernel:  [] xfs_da_read_buf+0x19/0x1e [xfs]
May 19 16:47:10 norbert kernel:  [] xfs_da_node_lookup_int+0x9d/0x2c0 [xfs]
May 19 16:47:10 norbert kernel:  [] xfs_da_node_lookup_int+0x9d/0x2c0 [xfs]
May 19 16:47:10 norbert kernel:  [] kmem_zone_alloc+0x3b/0x70 [xfs]
May 19 16:47:10 norbert kernel:  [] xfs_dir2_node_lookup+0x34/0x96 [xfs]
May 19 16:47:10 norbert kernel:  [] xfs_dir2_lookup+0xde/0x107 [xfs]
May 19 16:47:10 norbert kernel:  [] avc_has_perm_noaudit+0x8d/0xda
May 19 16:47:10 norbert kernel:  [] avc_has_perm_noaudit+0x8d/0xda
May 19 16:47:10 norbert kernel:  [] xfs_dir_lookup_int+0x26/0xa8 [xfs]
May 19 16:47:10 norbert kernel:  [] xfs_lookup+0x40/0x69 [xfs]
May 19 16:47:10 norbert kernel:  [] wake_up_inode+0x6/0x29
May 19 16:47:10 norbert kernel:  [] vfs_init_vnode+0x1e/0x22 [xfs]
May 19 16:47:10 norbert kernel:  [] linvfs_get_parent+0x43/0x75 [xfs]
May 19 16:47:10 norbert kernel:  [] __cond_resched+0x14/0x39
May 19 16:47:10 norbert kernel:  [] __cond_resched+0x14/0x39
May 19 16:47:10 norbert kernel:  [] d_alloc+0x197/0x1a1
May 19 16:47:10 norbert kernel:  [] d_alloc_anon+0xd1/0xee
May 19 16:47:10 norbert kernel:  [] __cond_resched+0x14/0x39
May 19 16:47:10 norbert kernel:  [] find_exported_dentry+0x303/0x5e8 [exportfs]
May 19 16:47:10 norbert kernel:  [] skb_copy_datagram_iovec+0x4f/0x1e1
May 19 16:47:10 norbert kernel:  [] release_sock+0xf/0x4f
May 19 16:47:10 norbert kernel:  [] tcp_recvmsg+0x64a/0x681
May 19 16:47:10 norbert kernel:  [] sock_common_recvmsg+0x30/0x46
May 19 16:47:10 norbert kernel:  [] sock_recvmsg+0xef/0x10c
May 19 16:47:10 norbert kernel:  [] dst_output+0x0/0x1a
May 19 16:47:10 norbert kernel:  [] recalc_task_prio+0x128/0x133
May 19 16:47:10 norbert kernel:  [] activate_task+0x88/0x95
May 19 16:47:10 norbert kernel:  [] try_to_wake_up+0x222/0x22d
May 19 16:47:10 norbert kernel:  [] __wake_up_common+0x36/0x51
May 19 16:47:11 norbert kernel:  [] __wake_up+0x29/0x3c
May 19 16:47:11 norbert kernel:  [] svc_sock_enqueue+0x1d6/0x212 [sunrpc]
May 19 16:47:11 norbert kernel:  [] svc_tcp_recvfrom+0x304/0x376 [sunrpc]
May 19 16:47:11 norbert kernel:  [] svc_expkey_lookup+0x1fc/0x330 [nfsd]
May 19 16:47:11 norbert kernel:  [] export_decode_fh+0x61/0x6d [exportfs]
May 19 16:47:11 norbert kernel:  [] nfsd_acceptable+0x0/0xba [nfsd]
May 19 16:47:11 norbert kernel:  [] export_decode_fh+0x0/0x6d [exportfs]
May 19 16:47:11 norbert kernel:  [] fh_verify+0x3bc/0x5bd [nfsd]
May 19 16:47:11 norbert kernel:  [] nfsd_acceptable+0x0/0xba [nfsd]
May 19 16:47:11 norbert kernel:  [] nfsd3_proc_getattr+0x6f/0x77 [nfsd]
May 19 16:47:11 norbert kernel:  [] nfs3svc_decode_fhandle+0x0/0x8d [nfsd]
May 19 16:47:11 norbert kernel:  [] nfsd_dispatch+0xba/0x16f [nfsd]
May 19 16:47:11 norbert kernel:  [] svc_process+0x420/0x6d6 [sunrpc]
May 19 16:47:11 norbert kernel:  [] nfsd+0x1cc/0x332 [nfsd]
May 19 16:47:11 norbert kernel:  [] nfsd+0x0/0x332 [nfsd]
May 19 16:47:11 norbert kernel:  [] kernel_thread_helper+0x5/0xb
May 19 16:47:11 norbert kernel: nfsd: non-standard errno: -990

-- 
Joshua Baker-LePain
Department of Biomedical Engineering
Duke University
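For comparing a trace like the server error in this report against others in the thread, here is a minimal Python sketch that pulls the call frames out of syslog-style kernel output and counts them per module. Everything in it is an assumption made for illustration: the regular expression, the `parse_frames` and `frames_per_module` names, and the "(core kernel)" label are hypothetical and are not part of the kernel, XFS, or the LTP scripts. It expects the `symbol+0xoffset/0xsize [module]` layout seen in the log, with or without an address inside the leading brackets.

```python
import re
from collections import Counter

# Matches frames such as "[] xfs_lookup+0x40/0x69 [xfs]" or
# "[<c0123456>] d_alloc+0x197/0x1a1". The trailing "[module]" is optional;
# frames without it belong to the core kernel image.
FRAME_RE = re.compile(
    r"\[(?:<[0-9a-f]+>)?\]\s+"
    r"(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:[ \t]+\[(?P<module>\w+)\])?"
)


def parse_frames(log_text):
    """Return a list of (symbol, offset, size, module) tuples, in log order."""
    frames = []
    for m in FRAME_RE.finditer(log_text):
        frames.append((
            m.group("symbol"),
            int(m.group("offset"), 16),
            int(m.group("size"), 16),
            m.group("module") or "(core kernel)",
        ))
    return frames


def frames_per_module(log_text):
    """Count how many frames in the trace come from each module."""
    return Counter(frame[3] for frame in parse_frames(log_text))


# Three lines copied from the trace above, used as a small sample.
sample = """\
May 19 16:47:10 norbert kernel: [] xfs_da_do_buf+0x357/0x70d [xfs]
May 19 16:47:10 norbert kernel: [] buffered_rmqueue+0x17d/0x1a5
May 19 16:47:11 norbert kernel: [] fh_verify+0x3bc/0x5bd [nfsd]
"""

if __name__ == "__main__":
    for module, count in frames_per_module(sample).most_common():
        print(f"{module}: {count}")
```

Feeding it the full trace rather than the three-line sample gives a quick view of how much of the call path sits in xfs, nfsd, exportfs, and sunrpc, which may help when lining this error up against the original xfs_iget_core OOPS.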