Date: Sat, 13 Jul 2013 12:00:30 +1000
From: Dave Chinner
To: Dave Jones, Linux Kernel, xfs@oss.sgi.com
Subject: Re: XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358
Message-ID: <20130713020030.GG3438@dastard>
References: <20130712023930.GA6473@redhat.com>
In-Reply-To: <20130712023930.GA6473@redhat.com>

On Thu, Jul 11, 2013 at 10:39:30PM -0400, Dave Jones wrote:
> Just saw this during boot after an unclean shutdown. It hung afterwards.
>
> [ 97.162665] XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358
....
> [ 97.173730] [] xfs_dir2_sf_addname+0x43/0x760 [xfs]
> [ 97.173743] [] xfs_dir_createname+0x15c/0x1b0 [xfs]
> [ 97.173754] [] xfs_create+0x4cc/0x710 [xfs]
> [ 97.173764] [] xfs_vn_mknod+0x9a/0x1c0 [xfs]
> [ 97.173773] [] xfs_vn_create+0x13/0x20 [xfs]
> [ 97.173776] [] vfs_create+0x9d/0x100
> [ 97.173778] [] do_last+0x925/0xe00
> [ 97.173780] [] path_openat+0xbe/0x6f0
> [ 97.173783] [] ? local_clock+0x3f/0x50
> [ 97.173785] [] ? __alloc_fd+0xaf/0x200
> [ 97.173787] [] do_filp_open+0x3a/0x90
> [ 97.173789] [] ? __alloc_fd+0xaf/0x200
> [ 97.173790] [] do_sys_open+0x10b/0x200
> [ 97.173792] [] ? syscall_trace_enter+0x18/0x290
> [ 97.173794] [] SyS_open+0x1e/0x20
>
> This trace repeated a few times, then the same assertion was triggered from sys_renameat.

That's rather curious. What this means is that either an EIO or an
EEXIST error is being returned from xfs_dir2_sf_lookup() when we're
about to add the new entry.

There are two things here. EIO can only be returned if a shutdown has
occurred - are there any signs of a shutdown in the logs? If there is a
shutdown in progress, then it is just bad luck that the shutdown left an
inode in an inconsistent state in memory, which triggers this validity
check failure.

And EEXIST means that the initial lookup of the name during the open
failed to find the entry we are now trying to create. i.e. the initial
path walk failed to do the correct lookup on the directory, and so never
got down to xfs_dir2_sf_lookup() to find the directory entry (perhaps a
problem with a cached negative dentry?). Hence it was decided during the
open(O_CREAT) call that the directory entry needed to be created, we get
down to XFS to create it, and then get EEXIST because the name already
exists...

So, it's not clear what has caused this yet. Is it reproducible? It
would be good to get a trace of lookup vs addname events from XFS, too
(i.e. all the xfs_dir* and xfs_da* events) so we can see if the correct
lookups were done prior to the failing addname operation...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
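
To make the EIO vs EEXIST reasoning above concrete, here is a minimal toy model in C. It is only a sketch and not XFS code: struct toy_dir, toy_lookup(), toy_addname(), toy_create() and the stale_negative flag are all hypothetical names invented for illustration. The model assumes that addname repeats the lookup and expects ENOENT, which is the condition the assertion checks, and it returns the unexpected error where the real code would trip the ASSERT.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define TOY_MAX_ENTRIES 8

/* Hypothetical toy directory; this is not an XFS structure. */
struct toy_dir {
	const char *names[TOY_MAX_ENTRIES];
	int count;
	bool shut_down;		/* models a filesystem shutdown */
};

/* ENOENT if the name is absent, EEXIST if present, EIO after shutdown. */
static int toy_lookup(const struct toy_dir *d, const char *name)
{
	if (d->shut_down)
		return EIO;
	for (int i = 0; i < d->count; i++)
		if (strcmp(d->names[i], name) == 0)
			return EEXIST;
	return ENOENT;
}

/*
 * Add a name. The caller is expected to have done a lookup that missed,
 * so anything other than ENOENT here is the "assertion failed" case.
 * The toy returns the unexpected error instead of asserting.
 */
static int toy_addname(struct toy_dir *d, const char *name)
{
	int error = toy_lookup(d, name);

	if (error != ENOENT)
		return error;
	if (d->count == TOY_MAX_ENTRIES)
		return ENOSPC;
	d->names[d->count++] = name;
	return 0;
}

/*
 * Models an open-with-create. stale_negative == true models a path walk
 * that wrongly believes the name is absent (e.g. a bad cached negative
 * dentry), so the directory lookup is skipped before the create.
 */
static int toy_create(struct toy_dir *d, const char *name, bool stale_negative)
{
	if (!stale_negative && toy_lookup(d, name) == EEXIST)
		return 0;	/* name exists: open it, nothing to create */
	return toy_addname(d, name);
}
```

In this model a create of an existing name only reaches toy_addname() when stale_negative is set, and it then fails with EEXIST; after shut_down is set, every create fails with EIO. Those are the two cases distinguished above.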