2008-07-17 13:55:45

by Mike Snitzer

Subject: Re: [PATCH 0 of 7] Block/SCSI Data Integrity Support

On Tue, Jun 10, 2008 at 11:28 AM, Martin K. Petersen
<[email protected]> wrote:
> >>>>> "Jeff" == Jeff Moyer <[email protected]> writes:
>
> Jeff> Thanks for all of the great documentation. It would be good to
> Jeff> include some instructions on how one would test this, and what
> Jeff> testing you performed.
>
> modprobe scsi_debug dix=199 dif=1 guard=1 dev_size_mb=1024 num_parts=1
>
> I'm testing with XFS and btrfs. Generally doing kernel builds, etc.
> ext2/3 are still problematic because they modify pages in flight.

Have you made the ext2/3/4 developers aware of this?

Could you elaborate on the interaction between the data integrity
support in the block layer and a given filesystem? Shouldn't _any_
filesystem "just work" given that the block layer is what is
generating the checksums and then verifying them on read?

regards,
Mike


2008-07-17 15:35:46

by Martin K. Petersen

Subject: Re: [PATCH 0 of 7] Block/SCSI Data Integrity Support

>>>>> "Mike" == Mike Snitzer <[email protected]> writes:

>> I'm testing with XFS and btrfs. Generally doing kernel builds,
>> etc. ext2/3 are still problematic because they modify pages in
>> flight.

Mike> Have you made the ext2/3/4 developers aware of this?

Yep.


Mike> Shouldn't _any_ filesystem "just work" given that the block
Mike> layer is what is generating the checksums and then verifying
Mike> them on read?

Yep.

There are a couple of issues. One problem is that pages are no longer
locked down during I/O. Instead the writeback bit is being set to
indicate that I/O is in progress. Not all corners of ext* have been
adapted to that properly. ext2 in particular suffers: it often modifies
pages containing metadata while they are in flight. If I remember
correctly, ext2/dir.c hasn't been made aware of writeback at all and
assumes the page lock still works like it used to.
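
To make the locking point concrete, here is a toy model in Python (not
kernel code). The Page class, its locked and writeback flags, and both
helper functions are hypothetical names invented for illustration. The
sketch only assumes what is described above: a page under I/O is
flagged as under writeback rather than held locked.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """Toy stand-in for a page cache page (hypothetical, not a kernel structure)."""
    data: bytearray = field(default_factory=lambda: bytearray(16))
    locked: bool = False
    writeback: bool = False  # set while I/O on the page is in progress


def modify_checking_lock_only(page: Page, offset: int, value: int) -> bool:
    """Old-style assumption: an unlocked page is safe to modify."""
    if page.locked:
        return False
    page.data[offset] = value
    return True


def modify_respecting_writeback(page: Page, offset: int, value: int) -> bool:
    """Writeback-aware variant: refuse to touch a page that is in flight."""
    if page.locked or page.writeback:
        return False  # a real implementation would wait instead of failing
    page.data[offset] = value
    return True


# The page has been submitted for I/O: writeback is set, the lock is not held.
in_flight = Page(writeback=True)

changed_legacy = modify_checking_lock_only(in_flight, 0, 0xFF)
changed_aware = modify_respecting_writeback(in_flight, 1, 0xFF)

print(changed_legacy)  # the lock-only check lets the modification through
print(changed_aware)   # the writeback-aware check blocks it
```

The lock-only check is the failure mode: nothing stops the update, so
the in-flight buffer changes underneath the I/O.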

That is normally not a huge problem because the page is being
scheduled for write again shortly thereafter. So the inconsistent
block on disk gets overwritten pretty much instantly. But that kind
of sloppy behavior is a no-go with integrity checking turned on.
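
A small Python sketch of why this breaks under integrity checking. It
assumes the guard tag is computed once when the write is queued and
checked later against the bytes that are actually transferred. The
CRC-16 below uses the T10 DIF polynomial 0x8BB7 as a stand-in (the
guard type selected by the scsi_debug options quoted earlier may
differ), and submit_write / device_verify are hypothetical helpers,
not block layer functions.

```python
def crc16_t10dif(data: bytes) -> int:
    """Bitwise CRC-16 with polynomial 0x8BB7, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8BB7) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def submit_write(page: bytearray) -> int:
    """Compute the guard tag when the write is queued (hypothetical helper)."""
    return crc16_t10dif(bytes(page))


def device_verify(page: bytearray, guard: int) -> bool:
    """Recompute the guard over the bytes that actually reach the device."""
    return crc16_t10dif(bytes(page)) == guard


sector = bytearray(512)

# Case 1: nobody touches the page while it is in flight.
guard = submit_write(sector)
clean_ok = device_verify(sector, guard)

# Case 2: the filesystem updates metadata after the guard was computed.
guard = submit_write(sector)
sector[42] ^= 0x01
dirty_ok = device_verify(sector, guard)

print(clean_ok, dirty_ok)
```

Without integrity checking the second write would simply land on disk
and be overwritten shortly afterwards. With it, the mismatch between
the guard and the data is treated as an error.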

There also appear to be some quirks in the page cache in general.
There's something not quite right in clear_page_dirty() /
page_mkwrite() territory. If I sync excessively I can make any fs
keel over. peterz said that an mmapped page is supposed to be
read-only during writeback but that appears to be racy when a forced
sync is involved.

That's my recollection, anyway. I've been busy with the innards of
the integrity code stuff for a couple of months and haven't poked at
the fs/vm issues for a while.

--
Martin K. Petersen Oracle Linux Engineering