From: "George Spelvin"
Subject: Re: raid is dangerous but that's secret (was Re: [patch] ext2/3:
Date: 1 Sep 2009 07:18:03 -0400
Message-ID: <20090901111803.11027.qmail@science.horizon.com>
References: <4a2c5faeb04cab59af9ba6ab512c9916.squirrel@neil.brown.name>
Cc: david@lang.hm, linux-doc@vger.kernel.org, linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org, pavel@ucw.cz
To: linux@horizon.com, neilb@suse.de
Return-path:
In-Reply-To: <4a2c5faeb04cab59af9ba6ab512c9916.squirrel@neil.brown.name>
Sender: linux-doc-owner@vger.kernel.org
List-Id: linux-ext4.vger.kernel.org

>> An embedded checksum, no matter how good, can't tell you if
>> the data is stale; you need a way to distinguish versions in the pointer.

> I would disagree with that.
> If the embedded checksum is a function of both the data and the address
> of the data (in whatever address space seems most appropriate) then it can
> still verify that the data found with the checksum is the data that was
> expected.
> And storing the checksum with the data (where it is practical) means
> index blocks can be more dense so on average fewer accesses to storage
> are needed.

I must not have been clear.

Originally, block 100 has contents version 1.  This includes a correctly
computed checksum.

Then you write version 2 of the data there.  But there's a bit error in
the address and the write goes to block 256+100 = 356.

So block 100 still has the version 1 contents, complete with valid checksum.
(Yes, block 356 is now corrupted, but perhaps it's not even allocated.)

Then we go to read block 100, find a valid checksum, and return incorrect
data.  Namely, version 1 data, when we expect and want version 2.

Basically, the pointer has to say which *version* of the data it points to,
not just the block address.  Otherwise, it can't detect a missing write.

If density is a big issue, then including a small version field is a
possibility.
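
To make the scenario concrete, here is a minimal Python sketch of it.
Everything in it is invented for illustration and not taken from any real
filesystem or md code: the Disk class, its flip_bit fault injector, the
read_checked() helper, and the choice of CRC-32 over address + version +
data as the embedded checksum.  It only aims to show that a checksum
stored with the block stays valid after a misdirected write, while a
version carried in the pointer catches the stale block.

```python
import zlib


def embedded_checksum(addr: int, version: int, data: bytes) -> int:
    """CRC-32 over the intended block address, the version and the data."""
    header = addr.to_bytes(8, "little") + version.to_bytes(4, "little")
    return zlib.crc32(header + data)


class Disk:
    """Toy block device; flip_bit is a hypothetical one-shot address fault."""

    def __init__(self) -> None:
        self.blocks = {}
        self.flip_bit = 0  # XORed into the address of the next write only

    def write(self, addr: int, version: int, data: bytes) -> None:
        # The checksum is computed for the address the caller intended.
        record = (version, data, embedded_checksum(addr, version, data))
        actual = addr ^ self.flip_bit  # the block the write really lands on
        self.flip_bit = 0
        self.blocks[actual] = record


def read_checked(disk: Disk, addr: int, expected_version=None) -> bytes:
    """Read a block; optionally check it against the pointer's version."""
    version, data, csum = disk.blocks[addr]
    if csum != embedded_checksum(addr, version, data):
        raise ValueError("checksum mismatch")
    if expected_version is not None and version != expected_version:
        raise ValueError("stale block: missing write")
    return data


disk = Disk()
disk.write(100, 1, b"version 1")
disk.flip_bit = 256               # bit error in the address: 100 -> 356
disk.write(100, 2, b"version 2")  # lands on block 356, block 100 untouched

# Address-only pointer: the embedded checksum passes, stale data comes back.
print(read_checked(disk, 100))

# Pointer that also records version 2: the missing write is detected.
try:
    read_checked(disk, 100, expected_version=2)
except ValueError as err:
    print("detected:", err)
```

In this sketch block 356 fails its own check if anyone reads it, because
its checksum was computed for address 100.  That is the case the
address-in-checksum idea does cover; the stale block 100 is the one it
cannot.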