From: Jan Kara
Subject: Re: Weird I/O errors with USB hard drive not remounting filesystem readonly
Date: Wed, 25 Nov 2009 09:42:41 +0100
Message-ID: <20091125084240.GA549@quack.suse.cz>
References: <20091124195607.GC16662@quack.suse.cz> <20091124203944.GD16662@quack.suse.cz> <20091124215044.GA20245@roll> <20091124222334.GB20245@roll>
In-Reply-To: <20091124222334.GB20245@roll>
To: tmhikaru@gmail.com
Cc: Jan Kara, Alan Stern, Boaz Harrosh, Kernel development list, USB list, Jens Axboe, SCSI development list, linux-ext4@vger.kernel.org

On Tue 24-11-09 17:23:34, tmhikaru@gmail.com wrote:
> On Tue, Nov 24, 2009 at 04:50:44PM -0500, tmhikaru@gmail.com wrote:
> > On Tue, Nov 24, 2009 at 09:39:44PM +0100, Jan Kara wrote:
> > > On Tue 24-11-09 15:13:01, Alan Stern wrote:
> > > > On Tue, 24 Nov 2009, Jan Kara wrote:
> > > >
> > > > > After digging in block layer code, it's as we suspected:
> > > > > In case of host error DID_ERROR (which is our case), scsi request is
> > > > > retried iff it is not a FAILFAST request which is set if bio is doing
> > > > > readahead... So this is explained and everything behaves as it should.
> > > > > Thanks everybody involved :).
> > > >
> > > > Okay, very good. There remains the question of the disturbing error
> > > > messages in the system log. Should they be suppressed for FAILFAST
> > > > requests?
> > > I think it's useful they are there because ultimately, something really
> > > went wrong and you should better investigate. BTW, "end_request: I/O error"
> > > messages are in the log even for requests where we retried and succeeded...
> > >
> > > Honza
> >
> > While I agree it is useful information, I think that if the error messages
> > are going to be printed, you should *also* print that this is a NON FATAL
> > error and that it's going to be retried. It'd help diagnosing the path it's
> > following through the failure code IMHO as well as not making users
> > completely freak out like I did in my case. It is *not* particularly obvious
> > given the message printed to syslog what is going wrong or why.

Yeah, we might make it more obvious that the read failed and whether or not
we are going to retry. Technically it's not so simple, though, because one
layer (the generic block layer) prints the error messages and a different
one (the scsi disk driver) decides what to do (retry, don't retry, ...).

> I should have asked since I'm here at the moment - do you need any
> more information out of the buggy USB enclosure at the moment, or can I work
> on trying to fix/replace it now?

No, feel free to do anything with it :). Thanks for your help with
debugging this.

Honza
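
For readers following the thread, the retry rule described above (on a DID_ERROR host error, retry iff the request is not FAILFAST, and readahead requests are FAILFAST) can be sketched as a small userspace C model. This is only an illustration under stated assumptions, not the kernel's code: `host_status`, `request_model`, `make_request` and `should_retry` are hypothetical names invented for the example and do not correspond to real kernel symbols.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; NOT the kernel's definitions. */
enum host_status {
    HOST_OK,     /* the transfer completed without a host error */
    HOST_ERROR   /* models the DID_ERROR host status from the thread */
};

struct request_model {
    bool readahead;  /* the request was issued on behalf of readahead */
    bool failfast;   /* the request must fail immediately, with no retry */
};

/* Per the thread: a request doing readahead is marked FAILFAST. */
static struct request_model make_request(bool readahead)
{
    struct request_model rq = { .readahead = readahead,
                                .failfast  = readahead };
    return rq;
}

/*
 * Per the thread: on a host error the request is retried
 * if and only if it is not FAILFAST. With no error there is
 * nothing to retry.
 */
static bool should_retry(enum host_status status,
                         const struct request_model *rq)
{
    if (status != HOST_ERROR)
        return false;
    return !rq->failfast;
}
```

In this model, `should_retry(HOST_ERROR, &rq)` is false for a request built with `make_request(true)` (readahead) and true for one built with `make_request(false)`. The sketch deliberately leaves out logging: as noted above, the error message is printed by a different layer than the one making the retry decision.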