2010-01-27 12:10:54

by Daniel J Blueman

Subject: file/extent checksums for dedup/sync...

For purposes of data deduplication and data synchronisation, it would
be a powerful tool to expose file data checksums.

Since e.g. BTRFS uses the crc32c algorithm [1], it's possible to compute
the file's overall CRC by combining the CRCs of all its extents.
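
A minimal sketch of that combination, assuming the standard CRC-32C
parameters (reflected polynomial 0x82F63B78, initial value and final XOR
both 0xFFFFFFFF); the seed and checksum granularity a given filesystem
actually uses may differ. All function names are illustrative. The
zero-byte loop is linear in the extent length, which keeps the sketch
short; real code would more likely use a logarithmic-time shift.

```python
POLY = 0x82F63B78  # CRC-32C (Castagnoli) polynomial, bit-reflected form


def _make_table():
    """Build the 256-entry lookup table for byte-at-a-time CRC-32C."""
    table = []
    for i in range(256):
        c = i
        for _ in range(8):
            c = (c >> 1) ^ POLY if c & 1 else c >> 1
        table.append(c)
    return table


TABLE = _make_table()


def crc32c(data):
    """CRC-32C of a byte string, with the usual pre/post inversion."""
    state = 0xFFFFFFFF
    for b in data:
        state = TABLE[(state ^ b) & 0xFF] ^ (state >> 8)
    return state ^ 0xFFFFFFFF


def shift_zeros(state, n):
    """Advance a raw CRC register as if n zero bytes had been fed in."""
    for _ in range(n):
        state = TABLE[state & 0xFF] ^ (state >> 8)
    return state


def crc32c_combine(crc_a, crc_b, len_b):
    """CRC of A||B from crc32c(A), crc32c(B) and len(B), without the data."""
    return shift_zeros(crc_a, len_b) ^ crc_b


def combine_extents(extents):
    """Fold (crc, length) pairs, given in file order, into one CRC."""
    total = 0  # CRC-32C of the empty byte string
    for crc, length in extents:
        total = crc32c_combine(total, crc, length)
    return total
```

Note that the identity used in crc32c_combine relies on the initial
value and the final XOR being the same constant; a scheme with a
different seed would need an adjusted formula.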

For now, exposing this via an ioctl may be sufficient, but does anyone
have ideas for introducing it in a more standard way? (It's a pity that
reserved fields weren't added when stat64 was introduced.)

Thanks,
Daniel

[1] http://www.research.ibm.com/haifa/satran/ips/Vince-Luben-crc32c-01.pdf
--
Daniel J Blueman


2010-01-27 12:30:47

by Andi Kleen

Subject: Re: file/extent checksums for dedup/sync...

Daniel J Blueman <[email protected]> writes:

> For purposes of data deduplication and data synchronisation, it would
> be a powerful tool to expose file data checksums.
>
> Since eg BTRFS uses the crc32c algorithm [1], it's possible to compute
> the file's overall CRC from the accumulation of the CRCs from all its
> extents' CRCs.
>
> For now, exposing this via an IOCTL may be sufficient, though any
> ideas for introducing it in a more standard way? (it's a pity that
> when stat64 was introduced, reserved fields weren't added)

The problem with doing it in any "standard way" is that it would
hard-code the way the file system does checksums into the applications.
So the file system could never change it without breaking
user space.

-Andi
--
[email protected] -- Speaking for myself only.

2010-01-27 13:23:32

by Daniel J Blueman

Subject: Re: file/extent checksums for dedup/sync...

On Wed, Jan 27, 2010 at 12:30 PM, Andi Kleen <[email protected]> wrote:
> Daniel J Blueman <[email protected]> writes:
>
>> For purposes of data deduplication and data synchronisation, it would
>> be a powerful tool to expose file data checksums.
>>
>> Since eg BTRFS uses the crc32c algorithm [1], it's possible to compute
>> the file's overall CRC from the accumulation of the CRCs from all its
>> extents' CRCs.
>>
>> For now, exposing this via an IOCTL may be sufficient, though any
>> ideas for introducing it in a more standard way? (it's a pity that
>> when stat64 was introduced, reserved fields weren't added)
>
> The problem of doing it in any "standard way" is that it would
> hard code the way the file system does checksums in the applications.
> So the file system could never change it without breaking
> user space.

I guess the filesystem would need to express this in the resulting
data structure, e.g.:
- type 1 corresponds to using the crc32c algorithm with starting seed
N, accumulating over data extents in ascending order, and zero-padding
the modulus remainder and any sparse holes
- type 2 etc.
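
A rough sketch of such a self-describing record, to make the idea
concrete. The layout, the type code and the function names are all
hypothetical; nothing here corresponds to an existing ioctl or on-disk
format.

```python
import struct

# Hypothetical type code; not an existing kernel ABI.
CSUM_TYPE_CRC32C = 1

# Little-endian header: type (u16), checksum length in bytes (u16), seed (u32).
HEADER = struct.Struct("<HHI")


def pack_checksum(csum_type, seed, value):
    """Serialise a checksum together with the metadata needed to interpret it."""
    return HEADER.pack(csum_type, len(value), seed) + value


def unpack_checksum(buf):
    """Parse a record, refusing types this reader does not understand."""
    csum_type, length, seed = HEADER.unpack_from(buf)
    if csum_type != CSUM_TYPE_CRC32C:
        raise ValueError(f"unknown checksum type {csum_type}")
    value = buf[HEADER.size:HEADER.size + length]
    if len(value) != length:
        raise ValueError("truncated checksum record")
    return csum_type, seed, value
```

An application that sees an unknown type can then fall back to reading
the data itself, so the filesystem stays free to add new types later.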

The next question is: does filesystem (e.g. BTRFS) compression come
before or after checksumming?
--
Daniel J Blueman

2010-01-27 20:16:13

by Chris Mason

Subject: Re: file/extent checksums for dedup/sync...

On Wed, Jan 27, 2010 at 01:23:28PM +0000, Daniel J Blueman wrote:
> On Wed, Jan 27, 2010 at 12:30 PM, Andi Kleen <[email protected]> wrote:
> > Daniel J Blueman <[email protected]> writes:
> >
> >> For purposes of data deduplication and data synchronisation, it would
> >> be a powerful tool to expose file data checksums.
> >>
> >> Since eg BTRFS uses the crc32c algorithm [1], it's possible to compute
> >> the file's overall CRC from the accumulation of the CRCs from all its
> >> extents' CRCs.
> >>
> >> For now, exposing this via an IOCTL may be sufficient, though any
> >> ideas for introducing it in a more standard way? (it's a pity that
> >> when stat64 was introduced, reserved fields weren't added)
> >
> > The problem of doing it in any "standard way" is that it would
> > hard code the way the file system does checksums in the applications.
> > So the file system could never change it without breaking
> > user space.

At the end of the day the checksums are also hard coded on disk. We
can't add a new way without continuing to support the old one.

>
> I guess the filesystem would need to express this in the resulting
> data-structure, eg:
> - type 1 corresponds to using the crc32c algorithm with starting seed
> N and accumulating ascending over data extents, padding with modulus
> remainder or sparse holes with 0
> - type 2 etc

Yes, if they were exported to userland we'd need to export version info.

>
> The next question, is does filesystem (eg BTRFS) compression come
> before or after checksumming?

The checksums are based on what is on disk, so they are done on the
compressed data.
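
A small illustration of what that implies for dedup/sync, as a sketch
only: it uses Python's zlib.crc32 (plain CRC-32, not crc32c) and zlib
compression levels as stand-ins, not the filesystem's own code paths.
The same logical data stored with and without compression yields
different bytes, and so different checksums over those bytes.

```python
import zlib

data = b"the same logical file contents\n" * 64

stored = zlib.compress(data, 0)  # level 0: deflate "stored" blocks, no compression
packed = zlib.compress(data, 9)  # level 9: maximum compression

# Both representations decode to identical logical contents...
assert zlib.decompress(stored) == data
assert zlib.decompress(packed) == data

# ...but a checksum taken over the stored bytes differs in each case.
crc_logical = zlib.crc32(data)
crc_stored = zlib.crc32(stored)
crc_packed = zlib.crc32(packed)
```

This suggests that exported checksums would only match across files, or
across machines, when the compression settings match as well.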

-chris