From: Rob Landley
Subject: Re: [patch] ext2/3: document conditions when reliable operation is possible
Date: Thu, 27 Aug 2009 15:51:42 -0500
Message-ID: <200908271551.43840.rob@landley.net>
References: <20090824212518.GF29763@elf.ucw.cz>
 <200908262253.17886.rob@landley.net>
 <4A967175.5070700@redhat.com>
Mime-Version: 1.0
Content-Type: Text/Plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Cc: Pavel Machek, Theodore Tso, Florian Weimer, Goswin von Brederlow,
 kernel list, Andrew Morton, mtk.manpages@gmail.com,
 rdunlap@xenotime.net, linux-doc@vger.kernel.org,
 linux-ext4@vger.kernel.org, corbet@lwn.net
To: Ric Wheeler
Return-path:
In-Reply-To: <4A967175.5070700@redhat.com>
Content-Disposition: inline
Sender: linux-doc-owner@vger.kernel.org
List-Id: linux-ext4.vger.kernel.org

On Thursday 27 August 2009 06:43:49 Ric Wheeler wrote:
> On 08/26/2009 11:53 PM, Rob Landley wrote:
> > On Tuesday 25 August 2009 18:40:50 Ric Wheeler wrote:
> >> Repeat experiment until you get up to something like google scale or the
> >> other papers on failures in national labs in the US and then we can have
> >> an informed discussion.
> >
> > On google scale anvil lightning can fry your machine out of a clear sky.
> >
> > However, there are still a few non-enterprise users out there, and
> > knowing that specific usage patterns don't behave like they expect might
> > be useful to them.
>
> You are missing the broader point of both papers.

No, I'm dismissing the papers (some of which I read when they first came out
and got slashdotted) as irrelevant to the topic at hand.

Pavel has two failure modes which he can trivially reproduce. The USB stick
one is reproducible on a laptop by jostling said stick. I myself used to have
a literal USB keychain, and the weight of keys dangling from it pulled it out
of the USB socket fairly easily if I wasn't careful. At the time nobody had
told me a journaling filesystem was not a reasonable safeguard here.
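
To make the USB stick case concrete, here is a minimal sketch in Python. It
assumes a toy flash device that can only update a whole erase block at a
time, with an erase block four times the size of a filesystem block. The
names ToyFlash, write_block and journal_replay, and all the sizes, are
hypothetical and chosen for illustration; this is not real driver or ext3
code, just a model of how a write to one block can destroy its neighbours.

```python
# Toy model only: every name and size here is an assumption for illustration.
FS_BLOCK = 4096                      # filesystem block size in bytes (assumed)
ERASE_BLOCK = 4 * FS_BLOCK           # device update granularity (assumed larger)
BLOCKS_PER_ERASE = ERASE_BLOCK // FS_BLOCK


class ToyFlash:
    """Device that can only rewrite a whole erase block at once."""

    def __init__(self, n_fs_blocks):
        self.blocks = [f"old{i}" for i in range(n_fs_blocks)]

    def write_block(self, index, data, power_fails_mid_update=False):
        start = index - index % BLOCKS_PER_ERASE
        group = range(start, start + BLOCKS_PER_ERASE)
        saved = [self.blocks[i] for i in group]   # copy erase block into RAM
        for i in group:
            self.blocks[i] = None                 # erase the whole group
        if power_fails_mid_update:
            return                                # RAM copy is gone for good
        saved[index - start] = data
        for i, content in zip(group, saved):
            self.blocks[i] = content              # program the group back


def journal_replay(dev, journal):
    """Redo every logged write; the journal knows nothing about neighbours."""
    for index, data in journal:
        dev.write_block(index, data)


def demo():
    dev = ToyFlash(8)
    journal = [(1, "new1")]          # only the block being changed is logged
    dev.write_block(1, "new1", power_fails_mid_update=True)  # stick pulled out
    journal_replay(dev, journal)     # recovery on the next mount
    damaged = [i for i, b in enumerate(dev.blocks) if b is None]
    return dev, damaged


if __name__ == "__main__":
    print(demo()[1])
```

In this model the replay restores block 1, which is the only block the
journal ever recorded, while the other blocks in the same erase group stay
lost even though the filesystem never asked to touch them.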

Presumably the degraded raid one can be reproduced under an emulator, with no
hardware directly involved at all, so talking about hardware failure rates
ignores the fact that he's actually discussing a _software_ problem. It may
happen in _response_ to hardware failures, but the damage he's attempting to
document happens entirely in software.

These failure modes can cause data loss which journaling can't help, but which
journaling might (or might not) conceivably hide so you don't immediately
notice it. They share a common underlying assumption that the storage
device's update granularity is less than or equal to the filesystem's block
size, which is not actually true of all modern storage devices. The fact he's
only _found_ two instances where this assumption bites doesn't mean there
aren't more waiting to be found, especially as more new storage media types
get introduced.

Pavel's response was to attempt to document this. Not that journaling is
_bad_, but that it doesn't protect against this class of problem.

Your response is to talk about google clusters, cloud storage, and cite
academic papers of statistical hardware failure rates. As I understand the
discussion, that's not actually the issue Pavel's talking about, merely one
potential trigger for it.

Rob
--
Latency is more important than throughput. It's that simple. - Linus Torvalds