From: Theodore Ts'o
Subject: Re: ext4 error
Date: Wed, 13 Apr 2016 23:22:07 -0400
Message-ID: <20160414032207.GC16656@thunk.org>
References: <0255994B402DE243B1DFC1057A00655201AC6F@ZXSHMBX02.zhaoxin.com>
In-Reply-To: <0255994B402DE243B1DFC1057A00655201AC6F@ZXSHMBX02.zhaoxin.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
To: Eric Shang
Cc: "jack@suse.cz", Toshi Kani, "dan.j.williams@intel.com",
    "linux-kernel@vger.kernel.org", "xfs@oss.sgi.com",
    "adilger.kernel@dilger.ca", "viro@zeniv.linux.org.uk",
    "linux-nvdimm@lists.01.org", "linux-fsdevel@vger.kernel.org",
    Matthew Wilcox, "akpm@linux-foundation.org",
    "linux-ext4@vger.kernel.org", "ross.zwisler@linux.intel.com",
    "kirill.shutemov@linux.intel.com"
Errors-To: xfs-bounces@oss.sgi.com
Sender: xfs-bounces@oss.sgi.com
List-Id: linux-ext4.vger.kernel.org

On Wed, Apr 13, 2016 at 01:44:55PM +0000, Eric Shang wrote:
> Hi all,
> I hit an ext4 error; the error log follows. After the panic, I
> checked the eMMC with the debugfs tool, and the i_nlink of inode 69878
> is not zero. This inode doesn't belong to parent dir 6987; it belongs
> to another file (debugfs ncheck shows this inode belonging to two
> files). I guess that this inode had been deleted in memory and was
> already reused by another file, but the parent directory's buffer_head
> had not been flushed to the eMMC. Then a lookup of this dentry couldn't
> find it in the dentry cache, so lookup_real read the directory entry
> from the eMMC and got the file inode which had already been deleted.
> Can anyone give me some help on how to check this issue? My kernel
> version is 3.18, from Android. I think something is wrong with the
> dentry cache flush and the flushing of dirty buffer_heads to the eMMC.
> Thanks all!

If I had to guess, this started with a corrupted file system, in which
the inode allocation bitmap erroneously showed an inode that was in use
by the file system as free. This allowed it to be allocated for use in
a second file (which would have wiped out the contents of the original
file stored at that inode). Later on, the file was deleted via either
the older or the newer pathname, which dropped the ref count to zero,
and then an access via the other pathname would have resulted in this
error.

After the panic, the on-disk data structures wouldn't have been updated
from whatever the in-memory data structures might have been ("Kernel
panic - not syncing"). So what you see from using debugfs after the
crash might not be representative of what you saw before the crash.

I'm not sure there's much debugging that can be done, because there are
any number of possible sources for the original corruption. It could be
caused by a hardware issue in the flash or the memory, or it could be
caused by a wild pointer corrupting a disk buffer, etc.

The panic won't result in a useful stack trace because that's when the
problem was *noticed*. But that's very different from where the file
system corruption was *introduced*.

If you can reliably reproduce this sort of failure, then it becomes
possible to try to track it down. But if it's a one-off event, there's
not much anyone can do.

Best regards,

					- Ted
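
The sequence guessed at above (bitmap wrongly shows an in-use inode as
free, the inode is handed to a second file, one pathname is unlinked,
the other pathname is then looked up) can be sketched with a toy model.
This is only an illustration and not ext4 code: the ToyFS class, its
methods, the StaleInodeError exception, and the path names are all
hypothetical, and the "corruption" is injected by hand by clearing one
bit in the in-memory bitmap.

```python
class StaleInodeError(Exception):
    """A directory entry points at an inode whose link count is zero."""


class ToyFS:
    """Hypothetical toy model: inode bitmap, link counts, directory entries."""

    def __init__(self):
        self.bitmap = set()   # inode numbers the bitmap marks as in use
        self.nlink = {}       # inode number -> link count
        self.dirents = {}     # pathname -> inode number

    def create(self, path):
        # Allocate the lowest inode number the bitmap claims is free.
        ino = 1
        while ino in self.bitmap:
            ino += 1
        self.bitmap.add(ino)
        self.nlink[ino] = 1   # re-initialising wipes whatever was there
        self.dirents[path] = ino
        return ino

    def paths_for(self, ino):
        # Loosely analogous to asking which pathnames refer to an inode.
        return sorted(p for p, i in self.dirents.items() if i == ino)

    def unlink(self, path):
        ino = self.dirents.pop(path)
        self.nlink[ino] -= 1
        if self.nlink[ino] == 0:
            self.bitmap.discard(ino)

    def lookup(self, path):
        ino = self.dirents[path]
        if self.nlink[ino] == 0:
            raise StaleInodeError(f"{path} -> inode {ino} has link count 0")
        return ino


def simulate():
    fs = ToyFS()
    old = fs.create("/a/old")
    fs.bitmap.discard(old)        # injected corruption: in use, but marked free
    new = fs.create("/b/new")     # allocator hands out the same inode again
    shared = fs.paths_for(new)    # two pathnames now share one inode
    fs.unlink("/b/new")           # link count drops to zero
    try:
        fs.lookup("/a/old")       # the other pathname now hits a dead inode
        error = None
    except StaleInodeError as exc:
        error = str(exc)
    return {"reused": old == new, "shared_paths": shared, "error": error}


if __name__ == "__main__":
    print(simulate())
```

Note that in this model the error surfaces only at the final lookup,
well after the bit was cleared, which is the same gap between where a
problem is noticed and where it was introduced.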