Date: Mon, 23 Dec 2002 01:29:43 -0500
From: harik@chaos.ao.net
To: Linux Kernel <linux-kernel@vger.kernel.org>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Subject: Re: raid5 multi-drive-failure and recovery?
Message-ID: <20021223062943.GB14954@chaos.ao.net>
References: <200212200049.gBK0n5NS016297@chaos.ao.net> <15874.27493.835799.195199@notabene.cse.unsw.edu.au>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="4SFOXa2GPu3tIq4H"
Content-Disposition: inline
In-Reply-To: <15874.27493.835799.195199@notabene.cse.unsw.edu.au>
User-Agent: Mutt/1.4i
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 2435
Lines: 64


--4SFOXa2GPu3tIq4H
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Fri, 20 Dec 2002, Neil Brown wrote:

> > fail, catch it in raid5_end_read_request then tag the stripe_head
> > with the device that's failed.  If one has already failed, return
> > EIO.   This way further reads on the stripe_head will go to the parity
> > disk (until it's eventually freed.  One IO error per stripe isn't too
> > harsh a price to pay for disaster recovery)
> >
> > in 2.4.20, I'm at raid5.c:421 where we're about to call md_error.
> > What happens to the bh from that point?  Obviously, it's not up-to-date,
> > so when 1 drive fails how does it get re-issued to be pulled from a parity
> > drive to reconstruct it?
>
> Nothing much happens to the bh.  It just has Uptodate cleared.
>
> A little later (line 433) we call release_stripe.  Once all the active
> IO requests have finished the stripe will be completely released and
> handle_stripe gets called (from raid5d) to handle the stripe.
> It notices there is a block that is being read (bh_read) but that
> isn't uptodate, and so tries to schedule a read.  Line 949 notices
> that the block it wants to read is on a failed drive, so it causes all
> blocks to be read in.  Once they are read in, handle_stripe gets
> called again, and this time it gets to line 955 where it computes the
> block that you want.

Ok, I was unclear on what the retry magic was.  We stall the request on
a failure, mark the drive 'failed', and eventually notice "Hey, that
wasn't finished" and restart it, this time computing the block from the
remaining good drives.
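
To make sure I have the flow right, here is a rough user-space sketch of
that retry path.  It is only a toy model under my own assumptions, not
the raid5.c code: toy_stripe, toy_init, toy_read, toy_compute_block and
toy_stripe_read are made-up names, the stripe is 3 data blocks plus 1
parity block, and io_error[] is a made-up way to inject read errors.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define NDISKS 4   /* 3 data + 1 parity; illustrative only */
#define BLOCK  8   /* tiny block size for the demo */

/* Hypothetical stand-in for a stripe_head; not the kernel structure. */
struct toy_stripe {
    unsigned char disk[NDISKS][BLOCK]; /* block contents, last one is parity */
    int io_error[NDISKS];              /* nonzero: reads of this disk fail */
    int failed;                        /* device tagged as failed, or -1 */
};

/* Fill the data blocks with a pattern and compute the XOR parity block. */
void toy_init(struct toy_stripe *sh)
{
    memset(sh, 0, sizeof(*sh));
    sh->failed = -1;
    for (int d = 0; d < NDISKS - 1; d++)
        for (int j = 0; j < BLOCK; j++) {
            sh->disk[d][j] = (unsigned char)(d * 16 + j + 1);
            sh->disk[NDISKS - 1][j] ^= sh->disk[d][j];
        }
}

/* Simulated device read: 0 on success, -EIO on an injected error. */
int toy_read(const struct toy_stripe *sh, int d, unsigned char *out)
{
    if (sh->io_error[d])
        return -EIO;
    memcpy(out, sh->disk[d], BLOCK);
    return 0;
}

/* Rebuild block d by XORing every other block in the stripe. */
int toy_compute_block(const struct toy_stripe *sh, int d, unsigned char *out)
{
    unsigned char tmp[BLOCK];

    memset(out, 0, BLOCK);
    for (int i = 0; i < NDISKS; i++) {
        if (i == d)
            continue;
        if (toy_read(sh, i, tmp) != 0)
            return -EIO;               /* second failure in this stripe */
        for (int j = 0; j < BLOCK; j++)
            out[j] ^= tmp[j];
    }
    return 0;
}

/* Read block d; on the first error, tag the device and reconstruct. */
int toy_stripe_read(struct toy_stripe *sh, int d, unsigned char *out)
{
    assert(d >= 0 && d < NDISKS);
    if (sh->failed != d && toy_read(sh, d, out) == 0)
        return 0;                      /* normal case */
    if (sh->failed >= 0 && sh->failed != d)
        return -EIO;                   /* another device already failed */
    sh->failed = d;                    /* tag the stripe with the bad device */
    return toy_compute_block(sh, d, out);
}
```

With io_error[1] set, toy_stripe_read(&sh, 1, buf) should hand back the
original contents of block 1 and leave sh.failed == 1; a later error on
a second device in the same stripe comes back as -EIO.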

I'll try to post a preliminary patch in the near future.  Is there a
patch for loopback to simulate errors on demand?  I've already done
drive forensics to recover the disks in question, and I'd much rather
test on a set of 500 MB loop devices than on a 60 GB RAID, for obvious
reasons.

Thanks,

--Dan


--4SFOXa2GPu3tIq4H--