Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S969231AbXFHN77 (ORCPT ); Fri, 8 Jun 2007 09:59:59 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S968688AbXFHN7k (ORCPT ); Fri, 8 Jun 2007 09:59:40 -0400 Received: from bay0-omc1-s11.bay0.hotmail.com ([65.54.246.83]:12086 "EHLO bay0-omc1-s11.bay0.hotmail.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S968639AbXFHN7j (ORCPT ); Fri, 8 Jun 2007 09:59:39 -0400 Message-ID: X-Originating-IP: [85.36.106.214] X-Originating-Email: [pupilla@hotmail.com] From: "Marco Berizzi" To: "David Chinner" Cc: "David Chinner" , , , "Marco Berizzi" References: <20070316012520.GN5743@melbourne.sgi.com> <20070316195951.GB5743@melbourne.sgi.com>

<20070320064632.GO32602149@melbourne.sgi.com> <20070607130505.GE85884050@sgi.com> Subject: Re: XFS internal error xfs_da_do_buf(2) at line 2087 of file fs/xfs/xfs_da_btree.c. Caller 0xc01b00bd Date: Fri, 8 Jun 2007 15:59:39 +0200 X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1123 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1123 X-OriginalArrivalTime: 08 Jun 2007 13:59:38.0538 (UTC) FILETIME=[40E758A0:01C7A9D5] Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3823 Lines: 130 David Chinner wrote: > > Jun 6 09:47:09 Pleiadi kernel: ======================= > > Jun 6 09:47:09 Pleiadi kernel: 0x0: 28 f1 45 d4 22 53 35 11 09 80 37 5a > > 47 8a 22 ee > > Jun 6 09:47:09 Pleiadi kernel: Filesystem "sda8": XFS internal error > > xfs_da_do_buf(2) at line 2086 of file fs/xfs/xfs_da_btree.c. Caller > > 0xc01b2301 > > Jun 6 09:47:09 Pleiadi kernel: [] xfs_da_do_buf+0x70c/0x7b1 > > Jun 6 09:47:09 Pleiadi kernel: [] xfs_da_read_buf+0x30/0x35 > > Jun 6 09:47:09 Pleiadi kernel: [] xfs_da_read_buf+0x30/0x35 > > These above stack trace is the sign of a corrupted directory. > > Chopping out the rest of the top posting (please don't do that) apologies > we get down to 3 months ago: > > > > On Mon, Mar 19, 2007 at 11:32:27AM +0100, Marco Berizzi wrote: > > > > Marco Berizzi wrote: > > > > Here is the relevant results: > > > > > > > > Phase 2 - found root inode chunk > > > > Phase 3 - ... > > > > agno = 0 > > > > ... > > > > agno = 12 > > > > LEAFN node level is 1 inode 1610612918 bno = 8388608 > > > > > > Hmmm - single bit error in the bno - that reminds of this: > > > > > > http://oss.sgi.com/projects/xfs/faq.html#dir2 > > > > > > So I'd definitely make sure that is repaired.... > > Where we saw signs of on disk directory corruption. Have you run > xfs_repair successfully on the filesystem since you reported > this? yes. > If you did clean up the error, does xfs_repair report the same sort > of error again? I have run xfs_repair this morning. Here is the report: Phase 1 - find and verify superblock... Phase 2 - using internal log - zero log... - scan filesystem freespace and inode maps... - found root inode chunk Phase 3 - for each AG... - scan and clear agi unlinked lists... - process known inodes and perform inode discovery... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 - agno = 10 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - agno = 15 - process newly discovered inodes... Phase 4 - check for duplicate blocks... - setting up duplicate extent list... - clear lost+found (if it exists) ... - check for inodes claiming duplicate blocks... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 - agno = 10 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - agno = 15 Phase 5 - rebuild AG headers and trees... - reset superblock... Phase 6 - check inode connectivity... - resetting contents of realtime bitmap and summary inodes - ensuring existence of lost+found directory - traversing filesystem starting at / ... - traversal finished ... - traversing all unattached subtrees ... - traversals finished ... - moving disconnected inodes to lost+found ... Phase 7 - verify and correct link counts... done > Have you run a 2.6.16-rcX or 2.6.17.[0-6] kernel since you last > reported this problem? No. I have run only 2.6.19.x and 2.6.21.x After the xfs_repair I have remounted the file system. After few hours linux has crashed with this message: BUG: at arch/i386/kernel/smp.c:546 smp_call_function() I have also the monitor bitmap. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/