From: David Martínez Moreno
Organization: Tuenti Technologies S.L.
To: linux-ext4@vger.kernel.org, "Theodore Y. Ts'o", linux-raid@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, David Martínez Moreno
Subject: ext3 crash with tune2fs and MD RAID10 device.
Date: Tue, 15 Sep 2009 12:01:08 +0200
Message-Id: <200909151201.08482.ender@tuenti.com>

Good morning.

During maintenance about 20 hours ago, I changed the reserved block count on my servers' data partitions to 0. Within a few hours I got five ext3 errors on different servers, too many to be explained by simple bad luck. The errors are:

=========================================================
EXT3-fs error (device md3): ext3_new_block: Allocating block in system zone - blocks from 8192000, length 1
Aborting journal on device md3.
Remounting filesystem read-only
EXT3-fs error (device md3): ext3_free_blocks: Freeing blocks in system zones - Block = 8192000, count = 1
EXT3-fs error (device md3) in ext3_free_blocks_sb: Journal has aborted
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_committed_data
...
=========================================================

or

=========================================================
EXT3-fs error (device md3): ext3_free_blocks_sb: bit already cleared for block 22479363
Aborting journal on device md3.
Remounting filesystem read-only
EXT3-fs error (device md3): ext3_free_blocks_sb: bit already cleared for block 22479364
EXT3-fs error (device md3): ext3_free_blocks_sb: bit already cleared for block 22479365
EXT3-fs error (device md3): ext3_free_blocks_sb: bit already cleared for block 22479367
EXT3-fs error (device md3) in ext3_free_blocks_sb: Journal has aborted
EXT3-fs error (device md3) in ext3_free_blocks_sb: Journal has aborted
EXT3-fs error (device md3) in ext3_free_blocks_sb: Journal has aborted
...
=========================================================

I have some servers with a single SAS disc and others with a software RAID10 volume over 4 discs. So far, only the RAID10 volumes are showing errors. Coincidence?

This is Debian etch with kernel 2.6.24.2. The command line we ran was:

    tune2fs -r 0 partition

while MySQL was running. The e2fsprogs version is 1.39+1.40-WIP-2006.11.14+dfsg-2etch1.

Do you know of any problem with this setup? I've reviewed the e2fsprogs changelog looking for something like this, but the problem seems more likely to be related to a RAID10+ext3 interaction.

I suspect we have lots of servers waiting to crash. While I was writing this mail another one crashed.

I can probably provide superblock copies for analysis, or any other information.

Best regards,

Ender.
-- 
I once farted on the set of Blue Lagoon.
        -- Brooke Shields (South Park).
--
Systems manager
tuenti.com