From: Neil Brown <neilb@suse.de>
To: Rik van Riel <riel@redhat.com>
Date: Tue, 31 Mar 2009 20:27:20 +1100
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <18897.57848.315067.51672@notabene.brown>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
       Ric Wheeler <rwheeler@redhat.com>,
       "Andreas T.Auer" <andreas.t.auer_lkml_73537@ursus.ath.cx>,
       Alan Cox <alan@lxorguk.ukuu.org.uk>, Theodore Tso <tytso@mit.edu>,
       Mark Lord <lkml@rtr.ca>, Stefan Richter <stefanr@s5r6.in-berlin.de>,
       Jeff Garzik <jeff@garzik.org>, Matthew Garrett <mjg59@srcf.ucam.org>,
       Andrew Morton <akpm@linux-foundation.org>,
       David Rees <drees76@gmail.com>, Jesper Krogh <jesper@krogh.cc>,
       Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: Linux 2.6.29
In-Reply-To: message from Rik van Riel on Monday March 30
References: <alpine.LFD.2.00.0903271511230.3994@localhost.localdomain>
	<alpine.LFD.2.00.0903271522210.3994@localhost.localdomain>
	<49CD7B10.7010601@garzik.org>
	<49CD891A.7030103@rtr.ca>
	<49CD9047.4060500@garzik.org>
	<49CE2633.2000903@s5r6.in-berlin.de>
	<49CE3186.8090903@garzik.org>
	<49CE35AE.1080702@s5r6.in-berlin.de>
	<49CE3F74.6090103@rtr.ca>
	<20090329231451.GR26138@disturbed>
	<20090330003948.GA13356@mit.edu>
	<49D0710A.1030805@ursus.ath.cx>
	<20090330100546.51907bd2@the-village.bc.nu>
	<49D0A3D6.4000300@ursus.ath.cx>
	<49D0AA4A.6020308@redhat.com>
	<alpine.LFD.2.00.0903300817400.3948@localhost.localdomain>
	<49D0EF1E.9040806@redhat.com>
	<alpine.LFD.2.00.0903300922231.3948@localhost.localdomain>
	<49D0FD4C.1010007@redhat.com>
	<alpine.LFD.2.00.0903301038100.3948@localhost.localdomain>
	<49D11BDD.70702@redhat.com>
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 1254
Lines: 28

On Monday March 30, riel@redhat.com wrote:
> Linus Torvalds wrote:
> > On Mon, 30 Mar 2009, Ric Wheeler wrote:
> 
> >> Heat is a major killer of spinning drives (as is severe cold). A lot of times,
> >> drives that have read errors only (not failed writes) might be fully
> >> recoverable if you can re-write that injured sector.
> > 
> > It's not worked for me, and yes, I've tried.
> 
> It's worked here.  It would be nice to have a device mapper module
> that can just insert itself between the disk and the higher device
> mapper layer and "scrub" the disk, fetching unreadable sectors from
> the other RAID copy where required.

You want to start using 'md' :-)
With raid1,4,5,6,10, if it gets a read error, it finds the data from
elsewhere, tries to over-write the bad sector, and then reads it back.
If that all works, then it assumes the drive is still good.
This happens during normal IO and also when you 'scrub' the array, which
e.g. Debian does on the first Sunday of the month by default.
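
Very roughly, for a simple mirror, the logic looks something like the
sketch below.  This is only an illustrative Python model, not the md
driver code: Disk, ReadError, read_elsewhere and mirror_read are
made-up names, and the "dead" set is just an assumption used to model a
sector that stays unreadable even after being rewritten.

```python
class ReadError(Exception):
    """Raised when a (simulated) sector cannot be read."""


class Disk:
    """Hypothetical in-memory stand-in for one member of a mirror."""

    def __init__(self, sectors):
        self.sectors = list(sectors)
        self.bad = set()     # sectors that currently fail to read
        self.dead = set()    # sectors that a rewrite cannot repair
        self.failed = False  # set once the disk is kicked from the array

    def read(self, n):
        if n in self.bad:
            raise ReadError(n)
        return self.sectors[n]

    def write(self, n, data):
        self.sectors[n] = data
        if n not in self.dead:
            # Assumption: rewriting lets the drive fix or remap the sector.
            self.bad.discard(n)


def read_elsewhere(disks, skip, n):
    """Fetch sector n from any working disk other than `skip`."""
    for other in disks:
        if other is skip or other.failed:
            continue
        try:
            return other.read(n)
        except ReadError:
            pass
    raise ReadError(n)


def mirror_read(disks, n):
    """Read sector n; on a read error, repair from another copy."""
    for disk in disks:
        if disk.failed:
            continue
        try:
            return disk.read(n)
        except ReadError:
            data = read_elsewhere(disks, disk, n)  # find the data elsewhere
            disk.write(n, data)                    # over-write the bad sector
            try:
                ok = disk.read(n) == data          # then read it back
            except ReadError:
                ok = False
            if not ok:
                disk.failed = True                 # rewrite did not help
            return data
    raise ReadError(n)
```

A scrub amounts to doing this for every sector rather than waiting for
normal IO to hit the bad one.  On a real array you can start one by
writing "check" to /sys/block/mdX/md/sync_action (mdX standing in for
your array's device name).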

NeilBrown