2011-08-09 17:52:25

by H. Peter Anvin

Subject: Feature request: e2fsck -z

Hi all,

This is something I've wanted to see for a very long time, and it
finally occurred to me that perhaps I should say something about it!

It would be a very nice thing to have a flag to e2fsck, presumably -z,
to zero out any unused data blocks, inodes and so on. The goal is to
minimize the amount of space required after compressing a virtual disk
image or similar, and to make sure any non-data isn't lying around.

-hpa


2011-08-10 06:41:42

by Ric Wheeler

Subject: Re: Feature request: e2fsck -z

On 08/09/2011 06:52 PM, H. Peter Anvin wrote:
> Hi all,
>
> This is something I've wanted to see for a very long time, and it
> finally occurred to me that perhaps I should say something about it!
>
> It would be a very nice thing to have a flag to e2fsck, presumably -z,
> to zero out any unused data blocks, inodes and so on. The goal is to
> minimize the amount of space required after compressing a virtual disk
> image or similar, and to make sure any non-data isn't lying around.
>
> -hpa
>

Do you need it to be in the fsck tool?

If you have a sparsely allocated block map under your file system, zeroing
all blocks could add hours for a big, slow S-ATA drive (2-3 hours for a 1TB
drive).

An alternative for SSDs and devices that do TRIM/UNMAP would be to use one of
the batched discard tools (that would make discarded data read back as zeroed).

Thanks!

Ric


2011-08-10 08:16:16

by Ron Yorston

Subject: Re: Feature request: e2fsck -z

"H. Peter Anvin" wrote:

>It would be a very nice thing to have a flag to e2fsck, presumably -z,
>to zero out any unused data blocks, inodes and so on. The goal is to
>minimize the amount of space required after compressing a virtual disk
>image or similar, and to make sure any non-data isn't lying around.

I wrote a separate utility, zerofree[1], that zeroes out free blocks.

Ron
--
[1] http://intgat.tigress.co.uk/rmy/uml/

2011-08-10 08:16:17

by Andreas Dilger

Subject: Re: Feature request: e2fsck -z

On 2011-08-10, at 12:34 AM, Ric Wheeler wrote:
> On 08/09/2011 06:52 PM, H. Peter Anvin wrote:
>> Hi all,
>>
>> This is something I've wanted to see for a very long time, and it
>> finally occurred to me that perhaps I should say something about it!
>>
>> It would be a very nice thing to have a flag to e2fsck, presumably -z,
>> to zero out any unused data blocks, inodes and so on. The goal is to
>> minimize the amount of space required after compressing a virtual disk
>> image or similar, and to make sure any non-data isn't lying around.
>
> Do you need it to be in the fsck tool?
>
> If you have a sparsely allocated block map under your file system, zeroing all blocks could add hours for a big, slow S-ATA drive (2-3 hours for a 1TB drive).

I think Ted has a tool that does this already. It should be relatively simple to do, like "dd if=/dev/zero of=/mountpoint/temp_zero_file && rm /mountpoint/temp_zero_file".

> An alternative for SSDs and devices that do TRIM/UNMAP would be to use one of the batched discard tools (that would make discarded data read back as zeroed).

In fact, I thought Lukas has already made a tool for sending BLKDISCARD for all unused parts of the filesystem?

Cheers, Andreas
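
A minimal sketch of the fill-and-delete approach suggested above, assuming GNU dd and a POSIX shell. The function name zero_free_space and its optional size cap are hypothetical, added only for illustration. dd normally exits with an error once the filesystem is full, so a following "&&" may never run the rm; the sketch therefore removes the file unconditionally. Blocks reserved for root are only filled when this runs as root, and unused inodes are not touched.

```sh
# Sketch only: fill free space under a mounted filesystem with zeros, then
# delete the fill file. Names are illustrative, not from any existing tool.
zero_free_space() {
    dir=$1              # mount point (or any directory on the filesystem)
    limit_mb=${2:-}     # optional cap in MiB, handy for a quick trial
    fill="$dir/temp_zero_file"

    if [ -n "$limit_mb" ]; then
        dd if=/dev/zero of="$fill" bs=1M count="$limit_mb" 2>/dev/null
    else
        # Runs until the filesystem is full; running out of space is the
        # expected way for this to stop, so the error status is ignored.
        dd if=/dev/zero of="$fill" bs=1M 2>/dev/null || true
    fi

    sync                # push the zeros to the device before unlinking
    rm -f "$fill"
}
```

For example, "zero_free_space /mountpoint" fills all free space, while "zero_free_space /mountpoint 100" writes only 100 MiB as a quick trial.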






2011-08-10 08:22:46

by Ric Wheeler

Subject: Re: Feature request: e2fsck -z

On 08/10/2011 09:16 AM, Andreas Dilger wrote:
> On 2011-08-10, at 12:34 AM, Ric Wheeler wrote:
>> On 08/09/2011 06:52 PM, H. Peter Anvin wrote:
>>> Hi all,
>>>
>>> This is something I've wanted to see for a very long time, and it
>>> finally occurred to me that perhaps I should say something about it!
>>>
>>> It would be a very nice thing to have a flag to e2fsck, presumably -z,
>>> to zero out any unused data blocks, inodes and so on. The goal is to
>>> minimize the amount of space required after compressing a virtual disk
>>> image or similar, and to make sure any non-data isn't lying around.
>> Do you need it to be in the fsck tool?
>>
>> If you have a sparsely allocated block map under your file system, zeroing all blocks could add hours for a big, slow S-ATA drive (2-3 hours for a 1TB drive).
> I think Ted has a tool that does this already. It should be relatively simple to do, like "dd if=/dev/zero of=/mountpoint/temp_zero_file && rm /mountpoint/temp_zero_file".

This will work, but it is potentially a multi-hour process (and it will cause
out-of-space errors for other applications during a very brief window :))

>
>> An alternative for SSDs and devices that do TRIM/UNMAP would be to use one of the batched discard tools (that would make discarded data read back as zeroed).
> In fact, I thought Lukas has already made a tool for sending BLKDISCARD for all unused parts of the filesystem?
>
> Cheers, Andreas

Right, I think he has. That tool (not part of fsck) would do the job much
quicker for discard-enabled devices.

Ric


2011-08-10 09:35:39

by Lukas Czerner

Subject: Re: Feature request: e2fsck -z

On Wed, 10 Aug 2011, Ric Wheeler wrote:

> On 08/10/2011 09:16 AM, Andreas Dilger wrote:
> > On 2011-08-10, at 12:34 AM, Ric Wheeler wrote:
> > > On 08/09/2011 06:52 PM, H. Peter Anvin wrote:
> > > > Hi all,
> > > >
> > > > This is something I've wanted to see for a very long time, and it
> > > > finally occurred to me that perhaps I should say something about it!
> > > >
> > > > It would be a very nice thing to have a flag to e2fsck, presumably -z,
> > > > to zero out any unused data blocks, inodes and so on. The goal is to
> > > > minimize the amount of space required after compressing a virtual disk
> > > > image or similar, and to make sure any non-data isn't lying around.
> > > Do you need it to be in the fsck tool?
> > >
> > > If you have a sparsely allocated block map under your file system,
> > > zeroing all blocks could add hours for a big, slow S-ATA drive (2-3 hours
> > > for a 1TB drive).
> > I think Ted has a tool that does this already. It should be relatively
> > simple to do, like "dd if=/dev/zero of=/mountpoint/temp_zero_file && rm
> > /mountpoint/temp_zero_file".
>
> This will work, but it is potentially a multi-hour process (and it will cause
> out-of-space errors for other applications during a very brief window :))
>
> >
> > > An alternative for SSDs and devices that do TRIM/UNMAP would be to use
> > > one of the batched discard tools (that would make discarded data read back
> > > as zeroed).
> > In fact, I thought Lukas has already made a tool for sending BLKDISCARD for
> > all unused parts of the filesystem?
> >
> > Cheers, Andreas
>
> Right, I think he has. That tool (not part of fsck) would do the job much
> quicker for discard-enabled devices.

You're right. It is a part of e2fsck and you can enable it with -E
discard.

Thanks!
-Lukas

2011-08-10 11:01:17

by Lukas Czerner

Subject: Re: Feature request: e2fsck -z

On Wed, 10 Aug 2011, Florian Weimer wrote:

> * Lukas Czerner:
>
> > You're right. It is a part of e2fsck and you can enable it with -E
> > discard.
>
> If the block device does not support TRIM, does this result in zeros
> being written instead?
>

Of course not, discard works only on devices which support it. If
the device does not support discard, it will do nothing.

Thanks!
-Lukas
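
A small sketch of how one might check for discard support before relying on "-E discard", assuming a Linux sysfs layout in which /sys/block/<disk>/queue/discard_max_bytes is non-zero for devices that accept discards. The function name supports_discard and the SYSFS_ROOT override are hypothetical; the override exists only so the check can be tried against a fake directory tree.

```sh
# Sketch: succeed (exit status 0) if the named whole-disk device advertises
# discard support. SYSFS_ROOT is an illustrative hook, not a kernel feature.
supports_discard() {
    dev=$1    # whole-disk name such as "sdb", not a partition like "sdb1"
    f="${SYSFS_ROOT:-/sys}/block/$dev/queue/discard_max_bytes"

    [ -r "$f" ] || return 1          # unknown device or no such attribute
    max=$(cat "$f")
    # A value of 0 means the device does not accept discard requests.
    [ "${max:-0}" -gt 0 ] 2>/dev/null
}

# Possible use on an unmounted filesystem (device names are placeholders):
#   if supports_discard sdb; then
#       e2fsck -f -E discard /dev/sdb1
#   fi
```

On a device where the check fails, the zero-writing approaches discussed elsewhere in this thread remain the fallback.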

2011-08-10 11:03:11

by Florian Weimer

Subject: Re: Feature request: e2fsck -z

* Lukas Czerner:

> You're right. It is a part of e2fsck and you can enable it with -E
> discard.

If the block device does not support TRIM, does this result in zeros
being written instead?

--
Florian Weimer <[email protected]>
BFK edv-consulting GmbH http://www.bfk.de/
Kriegsstraße 100 tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

2011-08-10 11:04:47

by Florian Weimer

Subject: Re: Feature request: e2fsck -z

* Lukas Czerner:

> On Wed, 10 Aug 2011, Florian Weimer wrote:
>
>> * Lukas Czerner:
>>
>> > You're right. It is a part of e2fsck and you can enable it with -E
>> > discard.
>>
>> If the block device does not support TRIM, does this result in zeros
>> being written instead?

> Of course not, discard works only on devices which support it. If
> the device does not support discard, it will do nothing.

Oh, then it doesn't address hpa's use case (improving compression).

--
Florian Weimer <[email protected]>
BFK edv-consulting GmbH http://www.bfk.de/
Kriegsstraße 100 tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

2011-08-10 11:06:49

by Ric Wheeler

Subject: Re: Feature request: e2fsck -z

On 08/10/2011 12:04 PM, Florian Weimer wrote:
> * Lukas Czerner:
>
>> On Wed, 10 Aug 2011, Florian Weimer wrote:
>>
>>> * Lukas Czerner:
>>>
>>>> You're right. It is a part of e2fsck and you can enable it with -E
>>>> discard.
>>> If the block device does not support TRIM, does this result in zeros
>>> being written instead?
>> Of course not, discard works only on devices which support it. If
>> the device does not support discard, it will do nothing.
> Oh, then it doesn't address hpa's use case (improving compression).
>

I think that there are multiple ways to help compression.

One would be to extract only the used data blocks (like file system specific
dump utilities can do); the other would be to zero the unused blocks, either
with discard or by writing zeros to them.

Ric
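
To illustrate why zeroing helps compression, here is a rough sketch that compares how well 1 MiB of random bytes (a stand-in for stale data left in freed blocks) and 1 MiB of zeros compress with gzip. The function names are hypothetical, and random data is a worst case chosen for contrast, not a measurement of any real filesystem.

```sh
# Sketch: print the gzip-compressed size of a file in bytes.
compressed_size() {
    gzip -c "$1" | wc -c | tr -d ' '
}

# Sketch: build two 1 MiB sample files and report how each compresses.
compare_compression() {
    tmp=$(mktemp -d) || return 1
    # Stand-in for stale contents of freed blocks.
    dd if=/dev/urandom of="$tmp/stale" bs=1024 count=1024 2>/dev/null
    # Stand-in for the same blocks after they have been zeroed.
    dd if=/dev/zero of="$tmp/zeroed" bs=1024 count=1024 2>/dev/null

    echo "stale:  $(compressed_size "$tmp/stale") bytes after gzip"
    echo "zeroed: $(compressed_size "$tmp/zeroed") bytes after gzip"
    rm -rf "$tmp"
}
```

Running "compare_compression" prints both sizes; the zeroed file should come out far smaller, which is the effect a zeroing pass is meant to achieve for a disk image.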


2011-08-10 11:09:54

by Lukas Czerner

Subject: Re: Feature request: e2fsck -z

On Wed, 10 Aug 2011, Florian Weimer wrote:

> * Lukas Czerner:
>
> > On Wed, 10 Aug 2011, Florian Weimer wrote:
> >
> >> * Lukas Czerner:
> >>
> >> > You're right. It is a part of e2fsck and you can enable it with -E
> >> > discard.
> >>
> >> If the block device does not support TRIM, does this result in zeros
> >> being written instead?
>
> > Of course not, discard works only on devices which support it. If
> > the device does not support discard, it will do nothing.
>
> Oh, then it doesn't address hpa's use case (improving compression).
>

It does if he uses SSDs with the "discard_zeroes_data" capability. But
my original answer was simply to confirm that e2fsck has that feature.

Thanks!
-Lukas

2011-08-10 13:25:31

by Theodore Ts'o

Subject: Re: Feature request: e2fsck -z

On Wed, Aug 10, 2011 at 07:34:46AM +0100, Ric Wheeler wrote:
>
> Do you need it to be in the fsck tool?

As others have noted, (a) zerofree does this already, and (b) there's
also -E discard.

My take on it though is that it's a reasonable request. It's much
like there is "sort -u" even though you could do this via "sort |
uniq". Part of the Unix philosophy is to use tools that are
composable, yes --- but optimizing for common cases is also a good
thing, and with virtualization becoming more and more popular,
zeroing free blocks for virtualization images is a good and
useful thing.

This is also why I agitated for adding support so that e2fsprogs tools
could operate directly on qemu-img files, and not just have support
which is hacked into e2image. Yes, you can always take a qemu-img
file, decompress and expand it into a raw file, run debugfs or e2fsck
on it, and then convert it back to a qemu-img file --- but if lots of
people are doing that, it does make sense to optimize for the most
common use cases.

- Ted
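
The manual round trip described above might look roughly like the following sketch. It assumes the qcow2 image holds a bare ext2/3/4 filesystem with no partition table, and that qemu-img and e2fsck are installed. The function names, the default output file names, and the DRY_RUN switch are hypothetical and exist only for illustration.

```sh
# Sketch: run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Sketch: expand a qcow2 image to raw, check it, and convert it back.
fsck_qcow2_image() {
    src=$1                  # existing qcow2 image
    raw=${2:-$src.raw}      # scratch raw copy (placeholder name)
    out=${3:-$src.new}      # re-compressed result (placeholder name)

    run qemu-img convert -O raw "$src" "$raw" || return 1

    # e2fsck exit codes below 4 mean "clean" or "errors were corrected".
    rc=0
    run e2fsck -f "$raw" || rc=$?
    [ "$rc" -lt 4 ] || return "$rc"

    # -c asks qemu-img to write a compressed qcow2 image.
    run qemu-img convert -c -O qcow2 "$raw" "$out"
}
```

With DRY_RUN=1 set, "fsck_qcow2_image disk.qcow2" only prints the three commands, which is a safe way to review them before touching a real image.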

2011-08-10 14:34:48

by Lukas Czerner

Subject: Re: Feature request: e2fsck -z

On Wed, 10 Aug 2011, Ted Ts'o wrote:

> On Wed, Aug 10, 2011 at 07:34:46AM +0100, Ric Wheeler wrote:
> >
> > Do you need it to be in the fsck tool?
>
> As others have noted, (a) zerofree does this already, and (b) there's
> also -E discard.
>
> My take on it though is that it's a reasonable request. It's much
> like there is "sort -u" even though you could do this via "sort |
> uniq". Part of the Unix philosophy is to use tools that are
> composable, yes --- but optimizing for common cases is also a good
> thing, and with virtualization becoming more and more popular,
> zeroing free blocks for virtualization images is a good and
> useful thing.

And since we already have punch hole support in ext4 I am going to use
that to "discard" - "punch out" unused blocks from the image. It is
probably not exactly what you are talking about, but it can be useful.

>
> This is also why I agitated for adding support so that e2fsprogs tools
> could operate directly on qemu-img files, and not just have support
> which is hacked into e2image. Yes, you can always take a qemu-img
> file, decompress and expand it into a raw file, run debugfs or e2fsck
> on it, and then convert it back to a qemu-img file --- but if lots of
> people are doing that, it does make sense to optimize for the most
> common use cases.

I am in favor of this, however I do not want to do it. It seems to me
that it is just rewriting the support that qemu already has. However, if
the qemu guys (or anyone) would create a library which we can use from
both projects, that would certainly be very useful.

Thanks!
-Lukas
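
As a host-side illustration of hole punching, the sketch below uses the fallocate utility from util-linux to turn runs of zeros in a raw image file into holes. This is a different technique from punching from within e2fsck as described above: it works on the image file from outside and only helps for blocks that already contain zeros. It assumes a fallocate that supports --dig-holes and a host filesystem that supports hole punching; the function name dig_holes is hypothetical.

```sh
# Sketch: deallocate zero-filled ranges of an image file and report the
# change in allocated space. The file's apparent size stays the same.
dig_holes() {
    img=$1
    before=$(du -k "$img" | cut -f1)     # allocated size in KiB
    fallocate --dig-holes "$img" || return 1
    after=$(du -k "$img" | cut -f1)
    echo "allocated: ${before} KiB -> ${after} KiB"
}
```

For example, "dig_holes disk.raw" (a placeholder file name) prints the allocated size before and after; the gain depends entirely on how much of the image already reads back as zeros.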

>
> - Ted