Message-ID: <4AB2610F.8010904@rtr.ca>
Date: Thu, 17 Sep 2009 12:17:19 -0400
From: Mark Lord <liml@rtr.ca>
Organization: Real-Time Remedies Inc.
User-Agent: Thunderbird 2.0.0.23 (X11/20090817)
MIME-Version: 1.0
To: Chris Webb <chris@arachsys.com>
Cc: Tejun Heo <teheo@suse.de>, linux-scsi@vger.kernel.org,
       Ric Wheeler <rwheeler@redhat.com>, Andrei Tanas <andrei@tanas.ca>,
       NeilBrown <neilb@suse.de>, linux-kernel@vger.kernel.org,
       IDE/ATA development list <linux-ide@vger.kernel.org>,
       Jeff Garzik <jgarzik@redhat.com>, Mark Lord <mlord@pobox.com>
Subject: Re: MD/RAID time out writing superblock
References: <4A9BBC4A.6070708@redhat.com> <4A9BC023.10903@kernel.org> <20090907114442.GG18831@arachsys.com> <20090907115927.GU8710@arachsys.com> <20090909120218.GB21829@arachsys.com> <4AADF3C4.5060004@kernel.org> <4AADF471.2020801@suse.de> <4AAE3B9A.2060306@rtr.ca> <4AAE3F86.8090804@suse.de> <4AAE524C.2030401@rtr.ca> <20090916231921.GL1924@arachsys.com> <4AB239C8.2020203@rtr.ca> <4AB25736.1060601@suse.de> <4AB260CA.8040308@rtr.ca>
In-Reply-To: <4AB260CA.8040308@rtr.ca>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 1307
Lines: 43

Mark Lord wrote:
> Tejun Heo wrote:
>> Hello,
>>
>> Mark Lord wrote:
>>> Tejun.. do we do a FLUSH CACHE before issuing a non-NCQ command ?
>>
>> Nope.
>>
>>> If not, then I think we may need to add code to do it.
>>
>> Hmm... can you explain a bit more?  That seems rather extreme to me.
> ..
> 
> You may recall that I first raised this issue about a year ago,
> when my own RAID0 array (MythTV box) started showing errors very
> similar to what Chris is reporting.
> 
> These were easily triggered by running hddtemp once every few seconds
> to log drive temperatures during Myth recording sessions.
> 
> hddtemp uses SMART commands.
> 
> The actual errors in the logs were command timeouts,
> but at this point I no longer remember which opcode was
> actually timing out.  Disabling the onboard write cache
> immediately "cured" the problem, at the expense of MUCH
> slower I/O times.
..

Speaking of which.. 

Chris:  I wonder whether the errors would also vanish in your situation
if you disable the onboard write caches in the drives?

E.g.  hdparm -W0 /dev/sd?
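
To run that test across several drives and keep a note of what each one
was set to beforehand, something like the following might do.  This is
only a rough sketch: the function name wcache_off and the DRY_RUN
variable are made up for illustration and are not part of hdparm.  The
only hdparm options used are -W (report the write-caching setting) and
-W0 (turn it off).  It defaults to a dry run that just prints the
commands; real runs need root.

```shell
#!/bin/sh
# Sketch only: wcache_off and DRY_RUN are hypothetical names for this example.
# DRY_RUN=1 (the default here) prints the hdparm commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

wcache_off() {
    for dev in "$@"; do
        if [ "$DRY_RUN" = 1 ]; then
            echo "hdparm -W0 $dev"
        else
            hdparm -W "$dev"     # report the current write-caching setting
            hdparm -W0 "$dev"    # disable the drive's onboard write cache
        fi
    done
}

# Example (dry run):   wcache_off /dev/sd?
# Example (for real):  DRY_RUN=0; wcache_off /dev/sd?
```

The same loop with -W1 instead of -W0 turns the write caches back on
once the test is done.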

