Date: Tue, 9 Sep 2003 20:11:32 +0200
From: Jakob Oestergaard <jakob@unthought.net>
To: Chris Meadors <clubneon@hereintown.net>
Cc: linux-kernel@vger.kernel.org
Subject: Re: Panic when finishing raidreconf on 2.6.0-test4 with preempt
Message-ID: <20030909181131.GB9079@unthought.net>
Mail-Followup-To: Jakob Oestergaard <jakob@unthought.net>,
	Chris Meadors <clubneon@hereintown.net>, linux-kernel@vger.kernel.org
References: <1062883950.1341.26.camel@clubneon.clubneon.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <1062883950.1341.26.camel@clubneon.clubneon.com>
User-Agent: Mutt/1.3.28i
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 2282
Lines: 48

On Sat, Sep 06, 2003 at 05:32:30PM -0400, Chris Meadors wrote:
> I've done this twice now, I'd prefer not to do it again, but can upon
> request, if you really need the oops output.
> 
> Running raidreconf to expand a 4 disk array to 5, seems to work
> correctly until the very end.  I'm guessing it is as the RAID super
> block is being written.  A preempt error is triggered and the kernel
> panics.  Upon reboot the MD driver doesn't think the 5th disk is valid
> for consideration in the array and skips over it.  Leaving a very
> corrupted file system.

raidreconf does no "funny business" with the kernel, so I think this
points to either:
*) a bug which mkraid can trigger as well
*) an API change combined with missing error handling, which raidreconf
   now triggers (by calling the old API)
*) a more general kernel bug - there is a *massive* VM load when
   raidreconf does its magic, perhaps calling mkraid after beating
   the VM half way to death can trigger the same error?

raidreconf, upon complete reconfiguration, will set up the new
superblock for your array, mark it as "unclean", and add the disks one by
one.  Once all disks are added, the kernel should start calculating
parity information (because raidreconf does not do this during the
conversion, and hence marks the newly set up array as unclean in order
to have the kernel do this dirty work).
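
As a rough sketch of that sequence, here is a toy model in Python. It is
not raidreconf's actual code and not the kernel's MD code; every name in it
(ToyArray, finish_reconf, the disk names) is made up for illustration, and
the "resync" flag only stands in for the kernel starting its parity
calculation.

```python
# Toy model of the hand-over described above. Hypothetical names throughout;
# this is not raidreconf or kernel MD code.
from dataclasses import dataclass, field


@dataclass
class ToyArray:
    expected_disks: int
    clean: bool = True
    disks: list = field(default_factory=list)
    resync_started: bool = False

    def write_superblock(self, clean):
        # The new superblock is marked unclean because parity was not
        # computed during the conversion.
        self.clean = clean

    def add_disk(self, name):
        # Disks are added one by one.
        self.disks.append(name)
        # Once every disk is present, an unclean array gets its parity
        # recalculated (modelled here as a simple flag).
        if len(self.disks) == self.expected_disks and not self.clean:
            self.resync_started = True


def finish_reconf(disk_names):
    # Final step of a reconfiguration: new superblock, then the disks.
    array = ToyArray(expected_disks=len(disk_names))
    array.write_superblock(clean=False)
    for name in disk_names:
        array.add_disk(name)
    return array
```

In this model, calling finish_reconf() with five disk names leaves the
array unclean with the resync flag set, while an array that has only four
of its five disks added never sets the flag.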

There should be nothing special about this, compared to normal mkraid
and raidhotadd usage - except raidreconf is probably a lot more likely
to trigger races.

Ah, finger-pointing ;)

(/me sits back, confident that his code is perfect and the kernel alone
 is to blame)

-- 
................................................................
:   jakob@unthought.net   : And I see the elder races,         :
:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............: