2014-02-02 05:43:21

by Namjae Jeon

Subject: [PATCH RESEND 0/10] fs: Introduce new flag(FALLOC_FL_COLLAPSE_RANGE) for fallocate

From: Namjae Jeon <[email protected]>

This patch series is in response to the following post:
http://lwn.net/Articles/556136/
"ext4: introduce two new ioctls"

Dave Chinner suggested that truncate_block_range
(the name of one of those ioctls) should be an fallocate operation
rather than a fs-specific ioctl, hence we add this functionality to fallocate.

This patch series introduces new flag FALLOC_FL_COLLAPSE_RANGE for fallocate
and implements it for XFS and Ext4.

The semantics of this flag are as follows:
1) It collapses the range starting at offset and spanning len bytes by removing
any data blocks present in this range, and then updates the logical offsets of
all extents beyond "offset + len" to close the hole created by removing the
blocks. In short, it does not leave a hole.
2) It should be used exclusively, with no other fallocate flag in combination.
3) The offset and length supplied to fallocate should be fs block size aligned
in the case of xfs and ext4.
4) Collapse range does not work beyond i_size.
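
As an illustration only, the rules above could be exercised from user space
roughly as follows. This is a minimal sketch, not part of the series:
collapse_args_ok() and collapse_range() are hypothetical helper names, the
caller is assumed to supply the fs block size and the current file size, and
the 0x08 fallback is only there for headers that predate the flag.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

#ifndef FALLOC_FL_COLLAPSE_RANGE
#define FALLOC_FL_COLLAPSE_RANGE 0x08	/* fallback for older headers */
#endif

/*
 * Hypothetical pre-check modelling rules 3 and 4 as stated above:
 * offset and len must be block aligned and the range must not
 * extend beyond i_size. Returns 1 if the arguments look acceptable.
 */
static int collapse_args_ok(off_t offset, off_t len, off_t blksize, off_t isize)
{
	if (blksize <= 0 || offset < 0 || len <= 0)
		return 0;
	if (offset % blksize != 0 || len % blksize != 0)
		return 0;		/* rule 3: block size alignment */
	if (offset > isize || len > isize - offset)
		return 0;		/* rule 4: not beyond i_size */
	return 1;
}

/*
 * Hypothetical wrapper: the flag is passed on its own (rule 2).
 * Returns 0 on success or a negative errno value.
 */
static int collapse_range(int fd, off_t offset, off_t len,
			  off_t blksize, off_t isize)
{
	if (!collapse_args_ok(offset, len, blksize, isize))
		return -EINVAL;
	if (fallocate(fd, FALLOC_FL_COLLAPSE_RANGE, offset, len) < 0)
		return -errno;	/* e.g. the fs does not support the flag */
	return 0;
}
```

The pre-check only mirrors the list above; the filesystem still makes the
final decision and may reject a request that passes it.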

This new functionality of collapsing a range could be used by media editing tools
that do non-linear editing to quickly purge and edit parts of a media file.
This will greatly improve the performance of these operations.
The limitation of fs block size aligned offsets can be easily handled
by media codecs which are encapsulated in a container, as they just have to
change the offset to the next keyframe value to match the proper alignment.
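
To make the alignment constraint concrete, here is a small arithmetic sketch.
It is an assumption about how a tool might adapt a requested cut, not
something the patches provide: align_cut_inward() and struct cut are
hypothetical names, and the block size is assumed to be a power of two. The
requested byte range is shrunk inward so that only whole blocks inside the
cut are collapsed.

```c
#include <stdint.h>

struct cut {
	uint64_t offset;	/* block-aligned start */
	uint64_t len;		/* block-aligned length, 0 if nothing fits */
};

/*
 * Hypothetical helper: shrink the byte range [start, end) inward to
 * block boundaries. blksize must be a power of two.
 */
static struct cut align_cut_inward(uint64_t start, uint64_t end,
				   uint64_t blksize)
{
	struct cut c = { 0, 0 };
	uint64_t first = (start + blksize - 1) & ~(blksize - 1); /* round up */
	uint64_t last = end & ~(blksize - 1);			 /* round down */

	if (first < last) {
		c.offset = first;
		c.len = last - first;
	}
	return c;
}
```

For example, with 4096-byte blocks a requested cut of bytes 5000..20000
becomes offset 8192, length 8192. A real editor would also have to move the
cut points to keyframes, which this sketch does not attempt.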

Change log
v4:
- vfs: Move block size aligned check from VFS layer to FS specific layer.
- vfs: update comments for FALLOC_FL_COLLAPSE_RANGE in user visible header file.
- xfs: update comments for xfs_bmap_shift_extents and change variable name
to more reasonable name.
- xfs: add ASSERTs for pointers in XFS patch.
- xfs: drop all the xfs_bmbt_get*() wrappers.
- xfs and ext4: change return errno from EFSCORRUPTED to EINVAL
when hole is not large enough to shift.
- xfs: remove extents from on-disk btree also in case of merge.
- xfstest: separate shared/316 test to shared/001 ~ 004 in xfstest.
- xfstest: update multi collapse test shared/005 for block size less than page size case.
- manpage: update description.

v3:
Fix checkpatch.pl errors

v2:
Fix review points from Dave Chinner.

Namjae Jeon (10):
fs: Add new flag(FALLOC_FL_COLLAPSE_RANGE) for fallocate
xfs: Add support FALLOC_FL_COLLAPSE_RANGE for fallocate
ext4: Add support FALLOC_FL_COLLAPSE_RANGE for fallocate
xfsprog: xfsio: Add support FALLOC_FL_COLLAPSE_RANGE for fallocate
xfstest: shared/001: Standard collapse range tests
xfstest: shared/002: Delayed allocation collapse range
xfstest: shared/003: Multi collapse range tests
xfstest: shared/004: Delayed allocation multi collapse
xfstest: shared/005: Test multiple fallocate collapse
manpage: update FALLOC_FL_COLLAPSE_RANGE flag in fallocate
--
1.7.9.5


2014-02-02 15:16:29

by Matthew Wilcox

Subject: Re: [PATCH RESEND 0/10] fs: Introduce new flag(FALLOC_FL_COLLAPSE_RANGE) for fallocate

On Sun, Feb 02, 2014 at 02:41:34PM +0900, Namjae Jeon wrote:
> The semantics of this flag are as follows:
> 1) It collapses the range lying between offset and length by removing any data
> blocks which are present in this range and then updates all the logical
> offsets of extents beyond "offset + len" to nullify the hole created by
> removing blocks. In short, it does not leave a hole.
> 2) It should be used exclusively. No other fallocate flag in combination.
> 3) Offset and length supplied to fallocate should be fs block size aligned
> in case of xfs and ext4.
> 4) Collapse range does not work beyond i_size.

What if the file is mmaped at the time somebody issues this command?
Seems to me we should drop pagecache pages that overlap with the
removed blocks. If the removed range is not a multiple of PAGE_SIZE,
then we should also drop any pagecache pages after the removed range.

--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."

2014-02-02 15:21:10

by Matthew Wilcox

Subject: Re: [PATCH RESEND 0/10] fs: Introduce new flag(FALLOC_FL_COLLAPSE_RANGE) for fallocate

On Sun, Feb 02, 2014 at 08:16:24AM -0700, Matthew Wilcox wrote:
> On Sun, Feb 02, 2014 at 02:41:34PM +0900, Namjae Jeon wrote:
> > The semantics of this flag are as follows:
> > 1) It collapses the range lying between offset and length by removing any data
> > blocks which are present in this range and then updates all the logical
> > offsets of extents beyond "offset + len" to nullify the hole created by
> > removing blocks. In short, it does not leave a hole.
> > 2) It should be used exclusively. No other fallocate flag in combination.
> > 3) Offset and length supplied to fallocate should be fs block size aligned
> > in case of xfs and ext4.
> > 4) Collapse range does not work beyond i_size.
>
> What if the file is mmaped at the time somebody issues this command?
> Seems to me we should drop pagecache pages that overlap with the
> removed blocks. If the removed range is not a multiple of PAGE_SIZE,
> then we should also drop any pagecache pages after the removed range.

Oops, forgot to add "and if it is a multiple of page size, then we need
to update the offsets of any pages after the removed page". We should
probably start easy though; just drop all pages that overlap the beginning
of the affected range to the end of the file. At some later point,
if there's demand, we can add the optimisation to adjust the offsets of
pages still in the cache.

--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."

2014-02-02 22:03:52

by Dave Chinner

Subject: Re: [PATCH RESEND 0/10] fs: Introduce new flag(FALLOC_FL_COLLAPSE_RANGE) for fallocate

On Sun, Feb 02, 2014 at 08:21:06AM -0700, Matthew Wilcox wrote:
> On Sun, Feb 02, 2014 at 08:16:24AM -0700, Matthew Wilcox wrote:
> > On Sun, Feb 02, 2014 at 02:41:34PM +0900, Namjae Jeon wrote:
> > > The semantics of this flag are as follows:
> > > 1) It collapses the range lying between offset and length by removing any data
> > > blocks which are present in this range and then updates all the logical
> > > offsets of extents beyond "offset + len" to nullify the hole created by
> > > removing blocks. In short, it does not leave a hole.
> > > 2) It should be used exclusively. No other fallocate flag in combination.
> > > 3) Offset and length supplied to fallocate should be fs block size aligned
> > > in case of xfs and ext4.
> > > 4) Collapse range does not work beyond i_size.
> >
> > What if the file is mmaped at the time somebody issues this command?
> > Seems to me we should drop pagecache pages that overlap with the
> > removed blocks. If the removed range is not a multiple of PAGE_SIZE,
> > then we should also drop any pagecache pages after the removed range.
>
> Oops, forgot to add "and if it is a multiple of page size, then we need
> to update the offsets of any pages after the removed page".

Yup, that's what the XFS implementation does when it punches the
hole out of the file before shifting the extents down. Check
xfs_free_file_space():

	/* wait for the completion of any pending DIOs */
	inode_dio_wait(VFS_I(ip));

	rounding = max_t(xfs_off_t, 1 << mp->m_sb.sb_blocklog, PAGE_CACHE_SIZE);
	ioffset = offset & ~(rounding - 1);
	error = -filemap_write_and_wait_range(VFS_I(ip)->i_mapping,
					      ioffset, -1);
	if (error)
		goto out;
	truncate_pagecache_range(VFS_I(ip), ioffset, -1);

Bonus points for working out why the XFS code doesn't just use
PAGE_CACHE_SIZE for rounding here....
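
For readers who want to play with the arithmetic, the rounding step can be
modelled in user space. This is only a sketch under stated assumptions:
flush_start() is a hypothetical name, the page size is passed in as a
parameter, and both sizes are assumed to be powers of two. It shows that the
start of the flushed and truncated range is rounded down to whichever of the
fs block size and the page size is larger; it does not attempt to answer the
bonus question.

```c
#include <stdint.h>

/*
 * Hypothetical user-space model of the rounding above:
 * rounding = max(block size, page size), then round offset down.
 */
static int64_t flush_start(int64_t offset, unsigned int blocklog,
			   int64_t page_size)
{
	int64_t blksize = (int64_t)1 << blocklog;
	int64_t rounding = blksize > page_size ? blksize : page_size;

	return offset & ~(rounding - 1);
}
```

With 512-byte blocks and 4096-byte pages, offset 5000 rounds down to 4096.
With hypothetical 64k blocks and 4096-byte pages, offset 70000 rounds down to
65536 rather than 69632.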

Cheers,

Dave.
--
Dave Chinner
[email protected]

2014-02-19 00:46:08

by Namjae Jeon

Subject: Re: [PATCH RESEND 0/10] fs: Introduce new flag(FALLOC_FL_COLLAPSE_RANGE) for fallocate

2014-02-03 0:21 GMT+09:00, Matthew Wilcox <[email protected]>:
> On Sun, Feb 02, 2014 at 08:16:24AM -0700, Matthew Wilcox wrote:
>> On Sun, Feb 02, 2014 at 02:41:34PM +0900, Namjae Jeon wrote:
>> > The semantics of this flag are as follows:
>> > 1) It collapses the range lying between offset and length by removing any data
>> > blocks which are present in this range and then updates all the logical
>> > offsets of extents beyond "offset + len" to nullify the hole created by
>> > removing blocks. In short, it does not leave a hole.
>> > 2) It should be used exclusively. No other fallocate flag in combination.
>> > 3) Offset and length supplied to fallocate should be fs block size aligned
>> > in case of xfs and ext4.
>> > 4) Collapse range does not work beyond i_size.
>>
>> What if the file is mmaped at the time somebody issues this command?
>> Seems to me we should drop pagecache pages that overlap with the
>> removed blocks. If the removed range is not a multiple of PAGE_SIZE,
>> then we should also drop any pagecache pages after the removed range.
Hi Matthew.
Yes, right. Both xfs and ext4 call truncate_pagecache_range() to drop
the page cache before removing blocks:
truncate_pagecache_range(inode, offset, -1);
The end offset is -1, which means all page cache pages from the
start offset to the end of the file are dropped.
>
> Oops, forgot to add "and if it is a multiple of page size, then we need
> to update the offsets of any pages after the removed page". We should
> probably start easy though; just drop all pages that overlap the beginning
> of the affected range to the end of the file.
Yes, right. The current implementation does exactly what you describe.

> At some later point,
> if there's demand, we can add the optimisation to adjust the offsets of
> pages still in the cache.
Yes, right. But since the fs block size can be smaller than the
page cache size (512B, 1K, 2K),
I thought it was proper to drop all pages from the start offset to
the end of the file.
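
To illustrate the point with numbers, here is a sketch only: the helper names
are hypothetical and this is not how the patches handle the page cache.
Cached pages after the collapsed range could only be re-indexed when the
collapsed length is a whole number of pages; with a 1K block size and 4K
pages, a one-block collapse is not.

```c
#include <stdint.h>

/* Returns 1 if the collapsed length shifts data by whole pages. */
static int shift_is_page_multiple(uint64_t len, uint64_t page_size)
{
	return len % page_size == 0;
}

/*
 * New index of a cached page that lies entirely after the collapsed
 * range. Only meaningful when shift_is_page_multiple() returned 1.
 */
static uint64_t shifted_page_index(uint64_t index, uint64_t len,
				   uint64_t page_size)
{
	return index - len / page_size;
}
```

When the shift is not a page multiple, the contents of every later page
change, so dropping everything from the start offset onward is the simple
option.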

Thanks for your reply.