2011-03-01 03:08:09

by Allison Henderson

[permalink] [raw]
Subject: [Ext4 punch hole 1/5] Ext4 Punch Hole Support: Convert Blocks to Uninit Exts

Hi All!

This is the patch series for ext4 punch hole that I have been working on.
There is still a little more debugging that I would like to do with it,
but I've had some requests to see the patch, so I'm sending it out a bit
early. Any feed back is appreciated. Thx!

This first patch adds a function to convert a range of blocks
to an uninitialized extent. This function will
be used to first convert the blocks to extents before
punching them out.

This function also adds a routine to split an extent at
a given block. This routine is used when a range of
blocks to be converted is only partially contained in
an extent.

Signed-off-by: Allison Henderson <[email protected]>
---
:100644 100644 7516fb9... ab2e42e... M fs/ext4/extents.c
fs/ext4/extents.c | 185 +++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 185 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 7516fb9..ab2e42e 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -2169,6 +2169,105 @@ static int ext4_remove_blocks(handle_t *handle, struct inode *inode,
return 0;
}

+/*
+ * ext4_split_extents() Splits a given extent at the block "split"
+ * @handle: The journal handle
+ * @inode: The file inode
+ * @split: The block where the extent will be split
+ * @path: The path to the extent
+ * @flags: flags used to insert the new extent
+ */
+static int ext4_split_extents(handle_t *handle,
+ struct inode *inode,
+ ext4_lblk_t split,
+ struct ext4_ext_path *path,
+ int flags)
+{
+ struct ext4_extent *ex, newex, orig_ex, *i_ex;
+ struct ext4_extent *ex1 = NULL;
+ struct ext4_extent *ex2 = NULL;
+ struct ext4_extent_header *eh;
+ ext4_lblk_t ee_block;
+ unsigned int ee_len, depth;
+ ext4_fsblk_t newblock, origblock;
+ int err = 0;
+
+ ext_debug("ext4_split_extent: inode %lu, split %u\n", inode->i_ino, split);
+ ext4_ext_show_leaf(inode, path);
+
+ depth = ext_depth(inode);
+ eh = path[depth].p_hdr;
+ ex = path[depth].p_ext;
+ ee_block = le32_to_cpu(ex->ee_block);
+ ee_len = ext4_ext_get_actual_len(ex);
+
+ /* if the block is not actually in the extent, go to out */
+ if(split > ee_block+ee_len || split < ee_block)
+ goto out;
+
+ origblock = ee_block + ext4_ext_pblock(ex);
+ newblock = split - ee_block + ext4_ext_pblock(ex);
+ ext_debug("The new block is %llu, the orig block is: %llu\n", newblock, origblock);
+
+ /* save original block in case split fails */
+ orig_ex.ee_block = ex->ee_block;
+ orig_ex.ee_len = cpu_to_le16(ee_len);
+ ext4_ext_store_pblock(&orig_ex, ext4_ext_pblock(ex));
+
+ err = ext4_ext_get_access(handle, inode, path + depth);
+ if (err)
+ goto out;
+
+ ex1 = ex;
+ ex1->ee_len = cpu_to_le16(split-ee_block);
+
+ ex2 = &newex;
+ ex2->ee_len = cpu_to_le16(ee_len-(split-ee_block));
+ ex2->ee_block = split;
+ ext4_ext_store_pblock(ex2, newblock);
+
+ if (ex1->ee_block != ex2->ee_block)
+ goto insert;
+
+ /* Mark modified extent as dirty */
+ err = ext4_ext_dirty(handle, inode, path + depth);
+ ext_debug("out here\n");
+ goto out;
+insert:
+
+ /*
+ * If this leaf cannot fit in any more extents
+ * insert it into another leaf
+ */
+ if(EXT_LAST_EXTENT(eh) >= EXT_MAX_EXTENT(eh)){
+ err = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
+ if (err)
+ goto fix_extent_len;
+ }
+
+ /* otherwise just scoot all ther other extents down */
+ else{
+ for(i_ex = EXT_LAST_EXTENT(eh); i_ex > ex1; i_ex-- ){
+ memcpy(i_ex+1, i_ex, sizeof(struct ext4_extent));
+ }
+ memcpy(ex1+1, ex2, sizeof(struct ext4_extent));
+ eh->eh_entries++;
+ }
+
+out:
+ ext4_ext_show_leaf(inode, path);
+ return err;
+
+fix_extent_len:
+ ex->ee_block = orig_ex.ee_block;
+ ex->ee_len = orig_ex.ee_len;
+ ext4_ext_store_pblock(ex, ext4_ext_pblock(&orig_ex));
+ ext4_ext_mark_uninitialized(ex);
+ ext4_ext_dirty(handle, inode, path + depth);
+
+ return err;
+}
+
static int
ext4_ext_rm_leaf(handle_t *handle, struct inode *inode,
struct ext4_ext_path *path, ext4_lblk_t start)
@@ -3598,6 +3697,92 @@ out_stop:
ext4_journal_stop(handle);
}

+/*
+ * ext4_ext_convert_blocks_uninit()
+ * Converts a range of blocks to uninitialized
+ *
+ * @inode: The files inode
+ * @handle: The journal handle
+ * @lblock: The logical block where the conversion will start
+ * @len: The max number of blocks to convert
+ *
+ * Returns the number of blocks successfully converted
+ */
+static int ext4_ext_convert_blocks_uninit(struct inode *inode, handle_t *handle, ext4_lblk_t lblock, ext4_lblk_t len){
+
+ ext4_lblk_t ee_block, iblock;
+ struct ext4_ext_path *path;
+ struct ext4_extent *ex;
+ unsigned int ee_len;
+ int ret, err = 0;
+
+ iblock = lblock;
+ while(iblock < lblock+len){
+
+ path = ext4_ext_find_extent(inode, lblock, NULL);
+
+ if(path == NULL)
+ goto next;
+
+ err = ext4_ext_get_access(handle, inode, path);
+ if(err < 0)
+ goto next;
+
+ ex = path->p_ext;
+ if(ex == NULL)
+ goto next;
+
+ ee_block = le32_to_cpu(ex->ee_block);
+ ee_len = ext4_ext_get_actual_len(ex);
+
+ /*
+ * If the block is in the middle of the extent,
+ * split the extent to remove the head.
+ * Then find the new extent that contains the block
+ */
+ if(ee_block < iblock){
+ err = ext4_split_extents(handle, inode, iblock, path, 0);
+ if(err)
+ goto next;
+
+ path = ext4_ext_find_extent(inode, iblock, NULL);
+
+ if(path == NULL)
+ goto next;
+
+ ex = path->p_ext;
+ if(ex == NULL)
+ goto next;
+
+ ee_block = le32_to_cpu(ex->ee_block);
+ ee_len = ext4_ext_get_actual_len(ex);
+
+ }
+
+ /* If the extent exceeds len, split the extent to remove the tail */
+ if(ee_len > len){
+ err = ext4_split_extents(handle, inode, lblock+len, path, 0);
+ if(err)
+ goto next;
+
+ ee_len = ext4_ext_get_actual_len(ex);
+ }
+
+ /* Mark the extent uninitialized */
+ ext4_ext_mark_uninitialized(ex);
+
+ iblock+=ex->ee_len;
+ ret+=ex->ee_len;
+ continue;
+next:
+ /* If the extent for this block cannot be found, skip it */
+ iblock++;
+ }
+
+ return ret;
+}
+
+
static void ext4_falloc_update_inode(struct inode *inode,
int mode, loff_t new_size, int update_ctime)
{
--
1.7.1




2011-03-01 05:42:37

by Dave Young

[permalink] [raw]
Subject: Re: [Ext4 punch hole 1/5] Ext4 Punch Hole Support: Convert Blocks to Uninit Exts

On Tue, Mar 1, 2011 at 11:08 AM, Allison Henderson
<[email protected]> wrote:
> Hi All!
>
> This is the patch series for ext4 punch hole that I have been working on.
> There is still a little more debugging that I would like to do with it,
> but I've had some requests to see the patch, so I'm sending it out a bit
> early.  Any feed back is appreciated.  Thx!

patch 5/5 is not sent?

--
Regards
dave

2011-03-01 05:51:18

by Allison Henderson

[permalink] [raw]
Subject: Re: [Ext4 punch hole 1/5] Ext4 Punch Hole Support: Convert Blocks to Uninit Exts

On 2/28/2011 10:42 PM, Dave Young wrote:
> On Tue, Mar 1, 2011 at 11:08 AM, Allison Henderson
> <[email protected]> wrote:
>> Hi All!
>>
>> This is the patch series for ext4 punch hole that I have been working on.
>> There is still a little more debugging that I would like to do with it,
>> but I've had some requests to see the patch, so I'm sending it out a bit
>> early. Any feed back is appreciated. Thx!
>
> patch 5/5 is not sent?
>

Hi there,

Sorry, patch 5 was a debug patch. It does not add any extra function. Please ignore the 5. Thx!

Allison Henderson


2011-03-01 06:42:50

by Andreas Dilger

[permalink] [raw]
Subject: Re: [Ext4 punch hole 1/5] Ext4 Punch Hole Support: Convert Blocks to Uninit Exts

On 2011-02-28, at 8:08 PM, Allison Henderson wrote:
> This first patch adds a function to convert a range of blocks
> to an uninitialized extent. This function will
> be used to first convert the blocks to extents before
> punching them out.

This was proposed as a separate function for FALLOCATE by Dave Chinner (based on XFS_IOC_ZERO_RANGE), so this is useful as a standalone function.

> +/*
> + * ext4_split_extents() Splits a given extent at the block "split"
> + * @handle: The journal handle
> + * @inode: The file inode
> + * @split: The block where the extent will be split
> + * @path: The path to the extent
> + * @flags: flags used to insert the new extent
> + */
> +static int ext4_split_extents(handle_t *handle,
> + struct inode *inode,
> + ext4_lblk_t split,
> + struct ext4_ext_path *path,
> + int flags)

Isn't this already done when converting an uninitialized extent into a regular extent when it is being written to? Hmm, ext4_ext_convert_to_initialized() is splitting the extent, but it is doing that "by hand". It might make sense to convert ext4_ext_convert_to_initialized() to use this routine, but this depends on whether it makes the code simpler or not.

> + /* if the block is not actually in the extent, go to out */
> + if(split > ee_block+ee_len || split < ee_block)
> + goto out;

Is this actually needed?

> + ext_debug("The new block is %llu, the orig block is: %llu\n", newblock, origblock);

(style) please wrap lines at 80 columns. This patch should be run through checkpatch.pl, which should catch issues like this.

> + if(EXT_LAST_EXTENT(eh) >= EXT_MAX_EXTENT(eh)){
> +
> + /* otherwise just scoot all ther other extents down */
> + else{
> + for(i_ex = EXT_LAST_EXTENT(eh); i_ex > ex1; i_ex-- ){
> + memcpy(i_ex+1, i_ex, sizeof(struct ext4_extent));
> + }
> + memcpy(ex1+1, ex2, sizeof(struct ext4_extent));

(style) the whitespace in several other places is also not following the Linux coding style


Cheers, Andreas






2011-03-01 09:30:07

by Amir Goldstein

[permalink] [raw]
Subject: Re: [Ext4 punch hole 1/5] Ext4 Punch Hole Support: Convert Blocks to Uninit Exts

On Tue, Mar 1, 2011 at 5:08 AM, Allison Henderson
<[email protected]> wrote:
> Hi All!
>
> This is the patch series for ext4 punch hole that I have been working on.
> There is still a little more debugging that I would like to do with it,
> but I've had some requests to see the patch, so I'm sending it out a bit
> early. ?Any feed back is appreciated. ?Thx!

Hi Allison,

Very nice work!

One meta comment is related to locking considerations with direct I/O.
I am not all that familiar with all direct I/O paths (i.e. dioread_nolock),
but I smell a possible race with holes initiation via direct I/O.

The thing with direct I/O is that you have no flags to test the state
of the block (did Eric just add a hash table to cope with another issue?).
Direct I/O writes outside i_size are synchronous (i.e. i_mutex is held until
I/O completion), but inside i_size, they may be a-synchronous.
The case of initiating holes is currently handled with DIO_SKIP_HOLES
by falling back to buffered I/O, although it could be handled by allocating
an uninitialized extent.
The case of writing to falloced space in handled by converting extents to
initialized at async I/O completion.

I am not sure there a race with hole punching here and I am not even sure
that the description above is totally accurate (?), but it's a dark corner,
which should to be reviewed carefully.

>
> This first patch adds a function to convert a range of blocks
> to an uninitialized extent. ?This function will
> be used to first convert the blocks to extents before
> punching them out.
>
> This function also adds a routine to split an extent at
> a given block. ?This routine is used when a range of
> blocks to be converted is only partially contained in
> an extent.
>
> Signed-off-by: Allison Henderson <[email protected]>
> ---
> :100644 100644 7516fb9... ab2e42e... M ?fs/ext4/extents.c
> ?fs/ext4/extents.c | ?185 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> ?1 files changed, 185 insertions(+), 0 deletions(-)
>
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 7516fb9..ab2e42e 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -2169,6 +2169,105 @@ static int ext4_remove_blocks(handle_t *handle, struct inode *inode,
> ? ? ? ?return 0;
> ?}
>
> +/*
> + * ext4_split_extents() Splits a given extent at the block "split"
> + * @handle: The journal handle
> + * @inode: The file inode
> + * @split: The block where the extent will be split
> + * @path: The path to the extent
> + * @flags: flags used to insert the new extent
> + */
> +static int ext4_split_extents(handle_t *handle,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? struct inode *inode,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ext4_lblk_t split,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? struct ext4_ext_path *path,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? int flags)
> +{
> + ? ? ? struct ext4_extent *ex, newex, orig_ex, *i_ex;
> + ? ? ? struct ext4_extent *ex1 = NULL;
> + ? ? ? struct ext4_extent *ex2 = NULL;
> + ? ? ? struct ext4_extent_header *eh;
> + ? ? ? ext4_lblk_t ee_block;
> + ? ? ? unsigned int ee_len, depth;
> + ? ? ? ext4_fsblk_t newblock, origblock;
> + ? ? ? int err = 0;
> +
> + ? ? ? ext_debug("ext4_split_extent: inode %lu, split %u\n", inode->i_ino, split);
> + ? ? ? ext4_ext_show_leaf(inode, path);
> +
> + ? ? ? depth = ext_depth(inode);
> + ? ? ? eh = path[depth].p_hdr;
> + ? ? ? ex = path[depth].p_ext;
> + ? ? ? ee_block = le32_to_cpu(ex->ee_block);
> + ? ? ? ee_len = ext4_ext_get_actual_len(ex);
> +
> + ? ? ? /* if the block is not actually in the extent, go to out */
> + ? ? ? if(split > ee_block+ee_len || split < ee_block)
> + ? ? ? ? ? ? ? goto out;
> +
> + ? ? ? origblock = ee_block + ext4_ext_pblock(ex);
> + ? ? ? newblock = split - ee_block + ext4_ext_pblock(ex);
> + ? ? ? ext_debug("The new block is %llu, the orig block is: %llu\n", newblock, origblock);
> +
> + ? ? ? /* save original block in case split fails */
> + ? ? ? orig_ex.ee_block = ex->ee_block;
> + ? ? ? orig_ex.ee_len ? = cpu_to_le16(ee_len);
> + ? ? ? ext4_ext_store_pblock(&orig_ex, ext4_ext_pblock(ex));
> +
> + ? ? ? err = ext4_ext_get_access(handle, inode, path + depth);
> + ? ? ? if (err)
> + ? ? ? ? ? ? ? goto out;
> +
> + ? ? ? ex1 = ex;
> + ? ? ? ex1->ee_len = cpu_to_le16(split-ee_block);
> +
> + ? ? ? ex2 = &newex;
> + ? ? ? ex2->ee_len = cpu_to_le16(ee_len-(split-ee_block));
> + ? ? ? ex2->ee_block = split;
> + ? ? ? ext4_ext_store_pblock(ex2, newblock);
> +
> + ? ? ? if (ex1->ee_block != ex2->ee_block)
> + ? ? ? ? ? ? ? goto insert;
> +
> + ? ? ? /* Mark modified extent as dirty */
> + ? ? ? err = ext4_ext_dirty(handle, inode, path + depth);
> + ? ? ? ext_debug("out here\n");
> + ? ? ? goto out;
> +insert:
> +
> + ? ? ? /*
> + ? ? ? ?* If this leaf cannot fit in any more extents
> + ? ? ? ?* insert it into another leaf
> + ? ? ? ?*/
> + ? ? ? if(EXT_LAST_EXTENT(eh) >= ?EXT_MAX_EXTENT(eh)){
> + ? ? ? ? ? ? ? err = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
> + ? ? ? ? ? ? ? if (err)
> + ? ? ? ? ? ? ? ? ? ? ? goto fix_extent_len;
> + ? ? ? }
> +
> + ? ? ? /* otherwise just scoot all ther other extents down ?*/
> + ? ? ? else{
> + ? ? ? ? ? ? ? for(i_ex = EXT_LAST_EXTENT(eh); i_ex > ex1; i_ex-- ){
> + ? ? ? ? ? ? ? ? ? ? ? memcpy(i_ex+1, i_ex, sizeof(struct ext4_extent));
> + ? ? ? ? ? ? ? }
> + ? ? ? ? ? ? ? memcpy(ex1+1, ex2, sizeof(struct ext4_extent));
> + ? ? ? ? ? ? ? eh->eh_entries++;
> + ? ? ? }
> +
> +out:
> + ? ? ? ext4_ext_show_leaf(inode, path);
> + ? ? ? return err;
> +
> +fix_extent_len:
> + ? ? ? ex->ee_block = orig_ex.ee_block;
> + ? ? ? ex->ee_len ? = orig_ex.ee_len;
> + ? ? ? ext4_ext_store_pblock(ex, ext4_ext_pblock(&orig_ex));
> + ? ? ? ext4_ext_mark_uninitialized(ex);
> + ? ? ? ext4_ext_dirty(handle, inode, path + depth);
> +
> + ? ? ? return err;
> +}
> +
> ?static int
> ?ext4_ext_rm_leaf(handle_t *handle, struct inode *inode,
> ? ? ? ? ? ? ? ?struct ext4_ext_path *path, ext4_lblk_t start)
> @@ -3598,6 +3697,92 @@ out_stop:
> ? ? ? ?ext4_journal_stop(handle);
> ?}
>
> +/*
> + * ext4_ext_convert_blocks_uninit()
> + * Converts a range of blocks to uninitialized
> + *
> + * @inode: ?The files inode
> + * @handle: The journal handle


(style) in all other functions, @handle comes before @inode
I'm sorry, but this hurts my eyes :-)

> + * @lblock: The logical block where the conversion will start
> + * @len: ? ?The max number of blocks to convert
> + *
> + * Returns the number of blocks successfully converted
> + */
> +static int ext4_ext_convert_blocks_uninit(struct inode *inode, handle_t *handle, ext4_lblk_t lblock, ext4_lblk_t len){
> +
> + ? ? ? ext4_lblk_t ee_block, iblock;
> + ? ? ? struct ext4_ext_path *path;
> + ? ? ? struct ext4_extent *ex;
> + ? ? ? unsigned int ee_len;
> + ? ? ? int ret, err = 0;
> +
> + ? ? ? iblock = lblock;
> + ? ? ? while(iblock < lblock+len){
> +
> + ? ? ? ? ? ? ? path = ext4_ext_find_extent(inode, lblock, NULL);
> +
> + ? ? ? ? ? ? ? if(path == NULL)
> + ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? err = ext4_ext_get_access(handle, inode, path);
> + ? ? ? ? ? ? ? if(err < 0)
> + ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? ex = path->p_ext;
> + ? ? ? ? ? ? ? if(ex == NULL)
> + ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? ee_block = le32_to_cpu(ex->ee_block);
> + ? ? ? ? ? ? ? ee_len = ext4_ext_get_actual_len(ex);
> +
> + ? ? ? ? ? ? ? /*
> + ? ? ? ? ? ? ? ?* If the block is in the middle of the extent,
> + ? ? ? ? ? ? ? ?* split the extent to remove the head.
> + ? ? ? ? ? ? ? ?* Then find the new extent that contains the block
> + ? ? ? ? ? ? ? ?*/
> + ? ? ? ? ? ? ? if(ee_block < iblock){
> + ? ? ? ? ? ? ? ? ? ? ? err = ext4_split_extents(handle, inode, iblock, path, 0);
> + ? ? ? ? ? ? ? ? ? ? ? if(err)
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? ? ? ? ? path = ext4_ext_find_extent(inode, iblock, NULL);
> +
> + ? ? ? ? ? ? ? ? ? ? ? if(path == NULL)
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? ? ? ? ? ex = path->p_ext;
> + ? ? ? ? ? ? ? ? ? ? ? if(ex == NULL)
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? ? ? ? ? ee_block = le32_to_cpu(ex->ee_block);
> + ? ? ? ? ? ? ? ? ? ? ? ee_len = ext4_ext_get_actual_len(ex);
> +
> + ? ? ? ? ? ? ? }
> +
> + ? ? ? ? ? ? ? /* If the extent exceeds len, split the extent to remove the tail */
> + ? ? ? ? ? ? ? if(ee_len > len){
> + ? ? ? ? ? ? ? ? ? ? ? err = ext4_split_extents(handle, inode, lblock+len, path, 0);
> + ? ? ? ? ? ? ? ? ? ? ? if(err)
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? ? ? ? ? ee_len = ext4_ext_get_actual_len(ex);
> + ? ? ? ? ? ? ? }
> +
> + ? ? ? ? ? ? ? /* Mark the extent uninitialized */
> + ? ? ? ? ? ? ? ext4_ext_mark_uninitialized(ex);
> +
> + ? ? ? ? ? ? ? iblock+=ex->ee_len;
> + ? ? ? ? ? ? ? ret+=ex->ee_len;
> + ? ? ? ? ? ? ? continue;
> +next:
> + ? ? ? ? ? ? ? /* If the extent for this block cannot be found, skip it */
> + ? ? ? ? ? ? ? iblock++;
> + ? ? ? }
> +
> + ? ? ? return ret;
> +}
> +
> +
> ?static void ext4_falloc_update_inode(struct inode *inode,
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?int mode, loff_t new_size, int update_ctime)
> ?{
> --
> 1.7.1
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at ?http://vger.kernel.org/majordomo-info.html
>

2011-03-01 11:09:49

by Yongqiang Yang

[permalink] [raw]
Subject: Re: [Ext4 punch hole 1/5] Ext4 Punch Hole Support: Convert Blocks to Uninit Exts

On Tue, Mar 1, 2011 at 11:08 AM, Allison Henderson
<[email protected]> wrote:
> Hi All!
>
> This is the patch series for ext4 punch hole that I have been working on.
> There is still a little more debugging that I would like to do with it,
> but I've had some requests to see the patch, so I'm sending it out a bit
> early. ?Any feed back is appreciated. ?Thx!
>
> This first patch adds a function to convert a range of blocks
> to an uninitialized extent. ?This function will
> be used to first convert the blocks to extents before
> punching them out.
>
> This function also adds a routine to split an extent at
> a given block. ?This routine is used when a range of
> blocks to be converted is only partially contained in
> an extent.
>
> Signed-off-by: Allison Henderson <[email protected]>
> ---
> :100644 100644 7516fb9... ab2e42e... M ?fs/ext4/extents.c
> ?fs/ext4/extents.c | ?185 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> ?1 files changed, 185 insertions(+), 0 deletions(-)
>
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 7516fb9..ab2e42e 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -2169,6 +2169,105 @@ static int ext4_remove_blocks(handle_t *handle, struct inode *inode,
> ? ? ? ?return 0;
> ?}
>
> +/*
> + * ext4_split_extents() Splits a given extent at the block "split"
> + * @handle: The journal handle
> + * @inode: The file inode
> + * @split: The block where the extent will be split
> + * @path: The path to the extent
> + * @flags: flags used to insert the new extent
> + */
> +static int ext4_split_extents(handle_t *handle,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? struct inode *inode,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ext4_lblk_t split,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? struct ext4_ext_path *path,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? int flags)
> +{
> + ? ? ? struct ext4_extent *ex, newex, orig_ex, *i_ex;
> + ? ? ? struct ext4_extent *ex1 = NULL;
> + ? ? ? struct ext4_extent *ex2 = NULL;
> + ? ? ? struct ext4_extent_header *eh;
> + ? ? ? ext4_lblk_t ee_block;
> + ? ? ? unsigned int ee_len, depth;
> + ? ? ? ext4_fsblk_t newblock, origblock;
> + ? ? ? int err = 0;
> +
> + ? ? ? ext_debug("ext4_split_extent: inode %lu, split %u\n", inode->i_ino, split);
> + ? ? ? ext4_ext_show_leaf(inode, path);
> +
> + ? ? ? depth = ext_depth(inode);
> + ? ? ? eh = path[depth].p_hdr;
> + ? ? ? ex = path[depth].p_ext;
> + ? ? ? ee_block = le32_to_cpu(ex->ee_block);
> + ? ? ? ee_len = ext4_ext_get_actual_len(ex);
> +
> + ? ? ? /* if the block is not actually in the extent, go to out */
> + ? ? ? if(split > ee_block+ee_len || split < ee_block)
> + ? ? ? ? ? ? ? goto out;
> +
> + ? ? ? origblock = ee_block + ext4_ext_pblock(ex);
> + ? ? ? newblock = split - ee_block + ext4_ext_pblock(ex);
> + ? ? ? ext_debug("The new block is %llu, the orig block is: %llu\n", newblock, origblock);
> +
> + ? ? ? /* save original block in case split fails */
> + ? ? ? orig_ex.ee_block = ex->ee_block;
> + ? ? ? orig_ex.ee_len ? = cpu_to_le16(ee_len);
> + ? ? ? ext4_ext_store_pblock(&orig_ex, ext4_ext_pblock(ex));
> +
> + ? ? ? err = ext4_ext_get_access(handle, inode, path + depth);
> + ? ? ? if (err)
> + ? ? ? ? ? ? ? goto out;
> +
> + ? ? ? ex1 = ex;
> + ? ? ? ex1->ee_len = cpu_to_le16(split-ee_block);
> +
> + ? ? ? ex2 = &newex;
> + ? ? ? ex2->ee_len = cpu_to_le16(ee_len-(split-ee_block));
> + ? ? ? ex2->ee_block = split;
> + ? ? ? ext4_ext_store_pblock(ex2, newblock);
> +
> + ? ? ? if (ex1->ee_block != ex2->ee_block)
> + ? ? ? ? ? ? ? goto insert;
> +
> + ? ? ? /* Mark modified extent as dirty */
> + ? ? ? err = ext4_ext_dirty(handle, inode, path + depth);
> + ? ? ? ext_debug("out here\n");
> + ? ? ? goto out;



--------
> +insert:
> +
> + ? ? ? /*
> + ? ? ? ?* If this leaf cannot fit in any more extents
> + ? ? ? ?* insert it into another leaf
> + ? ? ? ?*/
> + ? ? ? if(EXT_LAST_EXTENT(eh) >= ?EXT_MAX_EXTENT(eh)){
> + ? ? ? ? ? ? ? err = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
> + ? ? ? ? ? ? ? if (err)
> + ? ? ? ? ? ? ? ? ? ? ? goto fix_extent_len;
> + ? ? ? }
> +
> + ? ? ? /* otherwise just scoot all ther other extents down ?*/
> + ? ? ? else{
> + ? ? ? ? ? ? ? for(i_ex = EXT_LAST_EXTENT(eh); i_ex > ex1; i_ex-- ){
> + ? ? ? ? ? ? ? ? ? ? ? memcpy(i_ex+1, i_ex, sizeof(struct ext4_extent));
> + ? ? ? ? ? ? ? }
for loop can be replaced by:
memcpy(ex1 + 1, ex1 + 2, (i_ex - ex1) *
sizeof(struct ext4_extent));
> + ? ? ? ? ? ? ? memcpy(ex1+1, ex2, sizeof(struct ext4_extent));
> + ? ? ? ? ? ? ? eh->eh_entries++;
(error) should be replaced by:
le16_add_cpu(&eh->eh_entries, 1);
> + ? ? ? }

insert label could be replaced by:
? ? ? ? ? ? ? err = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
? ? ? ? ? ? ? if (err)
? ? ? ? ? ? ? ? ? ? ? goto fix_extent_len;

> +out:
> + ? ? ? ext4_ext_show_leaf(inode, path);
> + ? ? ? return err;
> +
> +fix_extent_len:
> + ? ? ? ex->ee_block = orig_ex.ee_block;
> + ? ? ? ex->ee_len ? = orig_ex.ee_len;
> + ? ? ? ext4_ext_store_pblock(ex, ext4_ext_pblock(&orig_ex));
> + ? ? ? ext4_ext_mark_uninitialized(ex);
> + ? ? ? ext4_ext_dirty(handle, inode, path + depth);
> +
> + ? ? ? return err;
> +}
> +
> ?static int
> ?ext4_ext_rm_leaf(handle_t *handle, struct inode *inode,
> ? ? ? ? ? ? ? ?struct ext4_ext_path *path, ext4_lblk_t start)
> @@ -3598,6 +3697,92 @@ out_stop:
> ? ? ? ?ext4_journal_stop(handle);
> ?}
>
> +/*
> + * ext4_ext_convert_blocks_uninit()
> + * Converts a range of blocks to uninitialized
> + *
> + * @inode: ?The files inode
> + * @handle: The journal handle
> + * @lblock: The logical block where the conversion will start
> + * @len: ? ?The max number of blocks to convert
> + *
> + * Returns the number of blocks successfully converted
> + */
> +static int ext4_ext_convert_blocks_uninit(struct inode *inode, handle_t *handle, ext4_lblk_t lblock, ext4_lblk_t len){
> +
> + ? ? ? ext4_lblk_t ee_block, iblock;
> + ? ? ? struct ext4_ext_path *path;
> + ? ? ? struct ext4_extent *ex;
> + ? ? ? unsigned int ee_len;
> + ? ? ? int ret, err = 0;
> +
> + ? ? ? iblock = lblock;
> + ? ? ? while(iblock < lblock+len){
> +
> + ? ? ? ? ? ? ? path = ext4_ext_find_extent(inode, lblock, NULL);
> +
> + ? ? ? ? ? ? ? if(path == NULL)
> + ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? err = ext4_ext_get_access(handle, inode, path);
> + ? ? ? ? ? ? ? if(err < 0)
> + ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? ex = path->p_ext;
> + ? ? ? ? ? ? ? if(ex == NULL)
> + ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? ee_block = le32_to_cpu(ex->ee_block);
> + ? ? ? ? ? ? ? ee_len = ext4_ext_get_actual_len(ex);
> +
> + ? ? ? ? ? ? ? /*
> + ? ? ? ? ? ? ? ?* If the block is in the middle of the extent,
> + ? ? ? ? ? ? ? ?* split the extent to remove the head.
> + ? ? ? ? ? ? ? ?* Then find the new extent that contains the block
> + ? ? ? ? ? ? ? ?*/
> + ? ? ? ? ? ? ? if(ee_block < iblock){
> + ? ? ? ? ? ? ? ? ? ? ? err = ext4_split_extents(handle, inode, iblock, path, 0);
> + ? ? ? ? ? ? ? ? ? ? ? if(err)
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? ? ? ? ? path = ext4_ext_find_extent(inode, iblock, NULL);
> +
> + ? ? ? ? ? ? ? ? ? ? ? if(path == NULL)
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? ? ? ? ? ex = path->p_ext;
> + ? ? ? ? ? ? ? ? ? ? ? if(ex == NULL)
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? ? ? ? ? ee_block = le32_to_cpu(ex->ee_block);
> + ? ? ? ? ? ? ? ? ? ? ? ee_len = ext4_ext_get_actual_len(ex);
> +
> + ? ? ? ? ? ? ? }
> +
> + ? ? ? ? ? ? ? /* If the extent exceeds len, split the extent to remove the tail */
> + ? ? ? ? ? ? ? if(ee_len > len){
> + ? ? ? ? ? ? ? ? ? ? ? err = ext4_split_extents(handle, inode, lblock+len, path, 0);
> + ? ? ? ? ? ? ? ? ? ? ? if(err)
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? ? ? ? ? ee_len = ext4_ext_get_actual_len(ex);
> + ? ? ? ? ? ? ? }
> +
> + ? ? ? ? ? ? ? /* Mark the extent uninitialized */
> + ? ? ? ? ? ? ? ext4_ext_mark_uninitialized(ex);
> +
> + ? ? ? ? ? ? ? iblock+=ex->ee_len;
> + ? ? ? ? ? ? ? ret+=ex->ee_len;
> + ? ? ? ? ? ? ? continue;
> +next:
> + ? ? ? ? ? ? ? /* If the extent for this block cannot be found, skip it */
> + ? ? ? ? ? ? ? iblock++;
> + ? ? ? }
> +
> + ? ? ? return ret;
> +}
> +
> +
> ?static void ext4_falloc_update_inode(struct inode *inode,
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?int mode, loff_t new_size, int update_ctime)
> ?{
> --
> 1.7.1
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at ?http://vger.kernel.org/majordomo-info.html
>



--
Best Wishes
Yongqiang Yang

2011-03-01 12:25:41

by Yongqiang Yang

[permalink] [raw]
Subject: Re: [Ext4 punch hole 1/5] Ext4 Punch Hole Support: Convert Blocks to Uninit Exts

On Tue, Mar 1, 2011 at 11:08 AM, Allison Henderson
<[email protected]> wrote:
> Hi All!
>
> This is the patch series for ext4 punch hole that I have been working on.
> There is still a little more debugging that I would like to do with it,
> but I've had some requests to see the patch, so I'm sending it out a bit
> early. ?Any feed back is appreciated. ?Thx!
>
> This first patch adds a function to convert a range of blocks
> to an uninitialized extent. ?This function will
> be used to first convert the blocks to extents before
> punching them out.
>
> This function also adds a routine to split an extent at
> a given block. ?This routine is used when a range of
> blocks to be converted is only partially contained in
> an extent.
>
> Signed-off-by: Allison Henderson <[email protected]>
> ---
> :100644 100644 7516fb9... ab2e42e... M ?fs/ext4/extents.c
> ?fs/ext4/extents.c | ?185 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> ?1 files changed, 185 insertions(+), 0 deletions(-)
>
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 7516fb9..ab2e42e 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -2169,6 +2169,105 @@ static int ext4_remove_blocks(handle_t *handle, struct inode *inode,
> ? ? ? ?return 0;
> ?}
>
> +/*
> + * ext4_split_extents() Splits a given extent at the block "split"
> + * @handle: The journal handle
> + * @inode: The file inode
> + * @split: The block where the extent will be split
> + * @path: The path to the extent
> + * @flags: flags used to insert the new extent
> + */
> +static int ext4_split_extents(handle_t *handle,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? struct inode *inode,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ext4_lblk_t split,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? struct ext4_ext_path *path,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? int flags)
> +{
> + ? ? ? struct ext4_extent *ex, newex, orig_ex, *i_ex;
> + ? ? ? struct ext4_extent *ex1 = NULL;
> + ? ? ? struct ext4_extent *ex2 = NULL;
> + ? ? ? struct ext4_extent_header *eh;
> + ? ? ? ext4_lblk_t ee_block;
> + ? ? ? unsigned int ee_len, depth;
> + ? ? ? ext4_fsblk_t newblock, origblock;
> + ? ? ? int err = 0;
> +
> + ? ? ? ext_debug("ext4_split_extent: inode %lu, split %u\n", inode->i_ino, split);
> + ? ? ? ext4_ext_show_leaf(inode, path);
> +
> + ? ? ? depth = ext_depth(inode);
> + ? ? ? eh = path[depth].p_hdr;
> + ? ? ? ex = path[depth].p_ext;
> + ? ? ? ee_block = le32_to_cpu(ex->ee_block);
> + ? ? ? ee_len = ext4_ext_get_actual_len(ex);
> +
> + ? ? ? /* if the block is not actually in the extent, go to out */
> + ? ? ? if(split > ee_block+ee_len || split < ee_block)
> + ? ? ? ? ? ? ? goto out;
> +
> + ? ? ? origblock = ee_block + ext4_ext_pblock(ex);
> + ? ? ? newblock = split - ee_block + ext4_ext_pblock(ex);
> + ? ? ? ext_debug("The new block is %llu, the orig block is: %llu\n", newblock, origblock);
> +
> + ? ? ? /* save original block in case split fails */
> + ? ? ? orig_ex.ee_block = ex->ee_block;
> + ? ? ? orig_ex.ee_len ? = cpu_to_le16(ee_len);
> + ? ? ? ext4_ext_store_pblock(&orig_ex, ext4_ext_pblock(ex));
> +
> + ? ? ? err = ext4_ext_get_access(handle, inode, path + depth);
> + ? ? ? if (err)
> + ? ? ? ? ? ? ? goto out;
> +
> + ? ? ? ex1 = ex;
> + ? ? ? ex1->ee_len = cpu_to_le16(split-ee_block);
> +
> + ? ? ? ex2 = &newex;
> + ? ? ? ex2->ee_len = cpu_to_le16(ee_len-(split-ee_block));
> + ? ? ? ex2->ee_block = split;
> + ? ? ? ext4_ext_store_pblock(ex2, newblock);
> +
> + ? ? ? if (ex1->ee_block != ex2->ee_block)
> + ? ? ? ? ? ? ? goto insert;
> +
> + ? ? ? /* Mark modified extent as dirty */
> + ? ? ? err = ext4_ext_dirty(handle, inode, path + depth);
> + ? ? ? ext_debug("out here\n");
> + ? ? ? goto out;
> +insert:
> +
> + ? ? ? /*
> + ? ? ? ?* If this leaf cannot fit in any more extents
> + ? ? ? ?* insert it into another leaf
> + ? ? ? ?*/
> + ? ? ? if(EXT_LAST_EXTENT(eh) >= ?EXT_MAX_EXTENT(eh)){
> + ? ? ? ? ? ? ? err = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
> + ? ? ? ? ? ? ? if (err)
> + ? ? ? ? ? ? ? ? ? ? ? goto fix_extent_len;
> + ? ? ? }
> +
> + ? ? ? /* otherwise just scoot all ther other extents down ?*/
> + ? ? ? else{
> + ? ? ? ? ? ? ? for(i_ex = EXT_LAST_EXTENT(eh); i_ex > ex1; i_ex-- ){
> + ? ? ? ? ? ? ? ? ? ? ? memcpy(i_ex+1, i_ex, sizeof(struct ext4_extent));
> + ? ? ? ? ? ? ? }
> + ? ? ? ? ? ? ? memcpy(ex1+1, ex2, sizeof(struct ext4_extent));
> + ? ? ? ? ? ? ? eh->eh_entries++;
> + ? ? ? }
> +
> +out:
> + ? ? ? ext4_ext_show_leaf(inode, path);
> + ? ? ? return err;
> +
> +fix_extent_len:
> + ? ? ? ex->ee_block = orig_ex.ee_block;
> + ? ? ? ex->ee_len ? = orig_ex.ee_len;
> + ? ? ? ext4_ext_store_pblock(ex, ext4_ext_pblock(&orig_ex));
> + ? ? ? ext4_ext_mark_uninitialized(ex);
> + ? ? ? ext4_ext_dirty(handle, inode, path + depth);
> +
> + ? ? ? return err;
> +}
> +
> ?static int
> ?ext4_ext_rm_leaf(handle_t *handle, struct inode *inode,
> ? ? ? ? ? ? ? ?struct ext4_ext_path *path, ext4_lblk_t start)
> @@ -3598,6 +3697,92 @@ out_stop:
> ? ? ? ?ext4_journal_stop(handle);
> ?}
>
> +/*
> + * ext4_ext_convert_blocks_uninit()
> + * Converts a range of blocks to uninitialized
> + *
> + * @inode: ?The files inode
> + * @handle: The journal handle
> + * @lblock: The logical block where the conversion will start
> + * @len: ? ?The max number of blocks to convert
> + *
> + * Returns the number of blocks successfully converted
> + */
> +static int ext4_ext_convert_blocks_uninit(struct inode *inode, handle_t *handle, ext4_lblk_t lblock, ext4_lblk_t len){
> +
> + ? ? ? ext4_lblk_t ee_block, iblock;
> + ? ? ? struct ext4_ext_path *path;
> + ? ? ? struct ext4_extent *ex;
> + ? ? ? unsigned int ee_len;
> + ? ? ? int ret, err = 0;
> +
> + ? ? ? iblock = lblock;
> + ? ? ? while(iblock < lblock+len){
> +
> + ? ? ? ? ? ? ? path = ext4_ext_find_extent(inode, lblock, NULL);
> +
> + ? ? ? ? ? ? ? if(path == NULL)
> + ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? err = ext4_ext_get_access(handle, inode, path);
> + ? ? ? ? ? ? ? if(err < 0)
> + ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? ex = path->p_ext;
> + ? ? ? ? ? ? ? if(ex == NULL)
> + ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? ee_block = le32_to_cpu(ex->ee_block);
> + ? ? ? ? ? ? ? ee_len = ext4_ext_get_actual_len(ex);
> +
> + ? ? ? ? ? ? ? /*
> + ? ? ? ? ? ? ? ?* If the block is in the middle of the extent,
> + ? ? ? ? ? ? ? ?* split the extent to remove the head.
> + ? ? ? ? ? ? ? ?* Then find the new extent that contains the block
> + ? ? ? ? ? ? ? ?*/
> + ? ? ? ? ? ? ? if(ee_block < iblock){
> + ? ? ? ? ? ? ? ? ? ? ? err = ext4_split_extents(handle, inode, iblock, path, 0);
> + ? ? ? ? ? ? ? ? ? ? ? if(err)
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? ? ? ? ? path = ext4_ext_find_extent(inode, iblock, NULL);
(error) path needs to be kfree() and to release buffer_heads like
ext4_ext_map_block() does.
> +
> + ? ? ? ? ? ? ? ? ? ? ? if(path == NULL)
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? ? ? ? ? ex = path->p_ext;
> + ? ? ? ? ? ? ? ? ? ? ? if(ex == NULL)
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? ? ? ? ? ee_block = le32_to_cpu(ex->ee_block);
> + ? ? ? ? ? ? ? ? ? ? ? ee_len = ext4_ext_get_actual_len(ex);
> +
> + ? ? ? ? ? ? ? }
> +
> + ? ? ? ? ? ? ? /* If the extent exceeds len, split the extent to remove the tail */
> + ? ? ? ? ? ? ? if(ee_len > len){
> + ? ? ? ? ? ? ? ? ? ? ? err = ext4_split_extents(handle, inode, lblock+len, path, 0);
> + ? ? ? ? ? ? ? ? ? ? ? if(err)
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? ? ? ? ? ee_len = ext4_ext_get_actual_len(ex);
> + ? ? ? ? ? ? ? }
> +
> + ? ? ? ? ? ? ? /* Mark the extent uninitialized */
> + ? ? ? ? ? ? ? ext4_ext_mark_uninitialized(ex);
> +
> + ? ? ? ? ? ? ? iblock+=ex->ee_len;
> + ? ? ? ? ? ? ? ret+=ex->ee_len;
> + ? ? ? ? ? ? ? continue;
> +next:
> + ? ? ? ? ? ? ? /* If the extent for this block cannot be found, skip it */
> + ? ? ? ? ? ? ? iblock++;

> + ? ? ? }
> +
> + ? ? ? return ret;
> +}
> +
> +
> ?static void ext4_falloc_update_inode(struct inode *inode,
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?int mode, loff_t new_size, int update_ctime)
> ?{
> --
> 1.7.1
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at ?http://vger.kernel.org/majordomo-info.html
>



--
Best Wishes
Yongqiang Yang

2011-03-01 13:22:02

by Yongqiang Yang

[permalink] [raw]
Subject: Re: [Ext4 punch hole 1/5] Ext4 Punch Hole Support: Convert Blocks to Uninit Exts

On Tue, Mar 1, 2011 at 11:08 AM, Allison Henderson
<[email protected]> wrote:
> Hi All!
>
> This is the patch series for ext4 punch hole that I have been working on.
> There is still a little more debugging that I would like to do with it,
> but I've had some requests to see the patch, so I'm sending it out a bit
> early. ?Any feed back is appreciated. ?Thx!
>
> This first patch adds a function to convert a range of blocks
> to an uninitialized extent. ?This function will
> be used to first convert the blocks to extents before
> punching them out.
>
> This function also adds a routine to split an extent at
> a given block. ?This routine is used when a range of
> blocks to be converted is only partially contained in
> an extent.
>
> Signed-off-by: Allison Henderson <[email protected]>
> ---
> :100644 100644 7516fb9... ab2e42e... M ?fs/ext4/extents.c
> ?fs/ext4/extents.c | ?185 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> ?1 files changed, 185 insertions(+), 0 deletions(-)
>
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 7516fb9..ab2e42e 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -2169,6 +2169,105 @@ static int ext4_remove_blocks(handle_t *handle, struct inode *inode,
> ? ? ? ?return 0;
> ?}
>
> +/*
> + * ext4_split_extents() Splits a given extent at the block "split"
> + * @handle: The journal handle
> + * @inode: The file inode
> + * @split: The block where the extent will be split
> + * @path: The path to the extent
> + * @flags: flags used to insert the new extent
> + */
> +static int ext4_split_extents(handle_t *handle,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? struct inode *inode,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ext4_lblk_t split,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? struct ext4_ext_path *path,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? int flags)
> +{
> + ? ? ? struct ext4_extent *ex, newex, orig_ex, *i_ex;
> + ? ? ? struct ext4_extent *ex1 = NULL;
> + ? ? ? struct ext4_extent *ex2 = NULL;
> + ? ? ? struct ext4_extent_header *eh;
> + ? ? ? ext4_lblk_t ee_block;
> + ? ? ? unsigned int ee_len, depth;
> + ? ? ? ext4_fsblk_t newblock, origblock;
> + ? ? ? int err = 0;
> +
> + ? ? ? ext_debug("ext4_split_extent: inode %lu, split %u\n", inode->i_ino, split);
> + ? ? ? ext4_ext_show_leaf(inode, path);
> +
> + ? ? ? depth = ext_depth(inode);
> + ? ? ? eh = path[depth].p_hdr;
> + ? ? ? ex = path[depth].p_ext;
> + ? ? ? ee_block = le32_to_cpu(ex->ee_block);
> + ? ? ? ee_len = ext4_ext_get_actual_len(ex);
> +
> + ? ? ? /* if the block is not actually in the extent, go to out */
> + ? ? ? if(split > ee_block+ee_len || split < ee_block)
> + ? ? ? ? ? ? ? goto out;
> +
> + ? ? ? origblock = ee_block + ext4_ext_pblock(ex);
> + ? ? ? newblock = split - ee_block + ext4_ext_pblock(ex);
> + ? ? ? ext_debug("The new block is %llu, the orig block is: %llu\n", newblock, origblock);
> +
> + ? ? ? /* save original block in case split fails */
> + ? ? ? orig_ex.ee_block = ex->ee_block;
> + ? ? ? orig_ex.ee_len ? = cpu_to_le16(ee_len);
> + ? ? ? ext4_ext_store_pblock(&orig_ex, ext4_ext_pblock(ex));
> +
> + ? ? ? err = ext4_ext_get_access(handle, inode, path + depth);
> + ? ? ? if (err)
> + ? ? ? ? ? ? ? goto out;
> +
> + ? ? ? ex1 = ex;
> + ? ? ? ex1->ee_len = cpu_to_le16(split-ee_block);
> +
> + ? ? ? ex2 = &newex;
> + ? ? ? ex2->ee_len = cpu_to_le16(ee_len-(split-ee_block));
> + ? ? ? ex2->ee_block = split;
> + ? ? ? ext4_ext_store_pblock(ex2, newblock);
> +
> + ? ? ? if (ex1->ee_block != ex2->ee_block)
> + ? ? ? ? ? ? ? goto insert;
> +
> + ? ? ? /* Mark modified extent as dirty */
> + ? ? ? err = ext4_ext_dirty(handle, inode, path + depth);
> + ? ? ? ext_debug("out here\n");
> + ? ? ? goto out;
> +insert:
> +
> + ? ? ? /*
> + ? ? ? ?* If this leaf cannot fit in any more extents
> + ? ? ? ?* insert it into another leaf
> + ? ? ? ?*/
> + ? ? ? if(EXT_LAST_EXTENT(eh) >= ?EXT_MAX_EXTENT(eh)){
> + ? ? ? ? ? ? ? err = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
> + ? ? ? ? ? ? ? if (err)
> + ? ? ? ? ? ? ? ? ? ? ? goto fix_extent_len;
> + ? ? ? }
> +
> + ? ? ? /* otherwise just scoot all ther other extents down ?*/
> + ? ? ? else{
> + ? ? ? ? ? ? ? for(i_ex = EXT_LAST_EXTENT(eh); i_ex > ex1; i_ex-- ){
> + ? ? ? ? ? ? ? ? ? ? ? memcpy(i_ex+1, i_ex, sizeof(struct ext4_extent));
> + ? ? ? ? ? ? ? }
> + ? ? ? ? ? ? ? memcpy(ex1+1, ex2, sizeof(struct ext4_extent));
> + ? ? ? ? ? ? ? eh->eh_entries++;
> + ? ? ? }
> +
> +out:
> + ? ? ? ext4_ext_show_leaf(inode, path);
> + ? ? ? return err;
> +
> +fix_extent_len:
> + ? ? ? ex->ee_block = orig_ex.ee_block;
> + ? ? ? ex->ee_len ? = orig_ex.ee_len;
> + ? ? ? ext4_ext_store_pblock(ex, ext4_ext_pblock(&orig_ex));
> + ? ? ? ext4_ext_mark_uninitialized(ex);
> + ? ? ? ext4_ext_dirty(handle, inode, path + depth);
path may be changed in ext4_ext_insert_extent(), as well as depth.
> +
> + ? ? ? return err;
> +}
> +
> ?static int
> ?ext4_ext_rm_leaf(handle_t *handle, struct inode *inode,
> ? ? ? ? ? ? ? ?struct ext4_ext_path *path, ext4_lblk_t start)
> @@ -3598,6 +3697,92 @@ out_stop:
> ? ? ? ?ext4_journal_stop(handle);
> ?}
>
> +/*
> + * ext4_ext_convert_blocks_uninit()
> + * Converts a range of blocks to uninitialized
> + *
> + * @inode: ?The files inode
> + * @handle: The journal handle
> + * @lblock: The logical block where the conversion will start
> + * @len: ? ?The max number of blocks to convert
> + *
> + * Returns the number of blocks successfully converted
> + */
> +static int ext4_ext_convert_blocks_uninit(struct inode *inode, handle_t *handle, ext4_lblk_t lblock, ext4_lblk_t len){
> +
> + ? ? ? ext4_lblk_t ee_block, iblock;
> + ? ? ? struct ext4_ext_path *path;
> + ? ? ? struct ext4_extent *ex;
> + ? ? ? unsigned int ee_len;
> + ? ? ? int ret, err = 0;
> +
> + ? ? ? iblock = lblock;
> + ? ? ? while(iblock < lblock+len){
> +
> + ? ? ? ? ? ? ? path = ext4_ext_find_extent(inode, lblock, NULL);
> +
> + ? ? ? ? ? ? ? if(path == NULL)
> + ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? err = ext4_ext_get_access(handle, inode, path);
> + ? ? ? ? ? ? ? if(err < 0)
> + ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? ex = path->p_ext;
> + ? ? ? ? ? ? ? if(ex == NULL)
> + ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? ee_block = le32_to_cpu(ex->ee_block);
> + ? ? ? ? ? ? ? ee_len = ext4_ext_get_actual_len(ex);
> +
> + ? ? ? ? ? ? ? /*
> + ? ? ? ? ? ? ? ?* If the block is in the middle of the extent,
> + ? ? ? ? ? ? ? ?* split the extent to remove the head.
> + ? ? ? ? ? ? ? ?* Then find the new extent that contains the block
> + ? ? ? ? ? ? ? ?*/
> + ? ? ? ? ? ? ? if(ee_block < iblock){
> + ? ? ? ? ? ? ? ? ? ? ? err = ext4_split_extents(handle, inode, iblock, path, 0);
> + ? ? ? ? ? ? ? ? ? ? ? if(err)
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? ? ? ? ? path = ext4_ext_find_extent(inode, iblock, NULL);
> +
> + ? ? ? ? ? ? ? ? ? ? ? if(path == NULL)
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? ? ? ? ? ex = path->p_ext;
> + ? ? ? ? ? ? ? ? ? ? ? if(ex == NULL)
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? ? ? ? ? ee_block = le32_to_cpu(ex->ee_block);
> + ? ? ? ? ? ? ? ? ? ? ? ee_len = ext4_ext_get_actual_len(ex);
> +
> + ? ? ? ? ? ? ? }
> +
> + ? ? ? ? ? ? ? /* If the extent exceeds len, split the extent to remove the tail */
> + ? ? ? ? ? ? ? if(ee_len > len){
> + ? ? ? ? ? ? ? ? ? ? ? err = ext4_split_extents(handle, inode, lblock+len, path, 0);
> + ? ? ? ? ? ? ? ? ? ? ? if(err)
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? goto next;
> +
> + ? ? ? ? ? ? ? ? ? ? ? ee_len = ext4_ext_get_actual_len(ex);
> + ? ? ? ? ? ? ? }
> +
> + ? ? ? ? ? ? ? /* Mark the extent uninitialized */
> + ? ? ? ? ? ? ? ext4_ext_mark_uninitialized(ex);
> +
> + ? ? ? ? ? ? ? iblock+=ex->ee_len;
> + ? ? ? ? ? ? ? ret+=ex->ee_len;
> + ? ? ? ? ? ? ? continue;
> +next:
> + ? ? ? ? ? ? ? /* If the extent for this block cannot be found, skip it */
> + ? ? ? ? ? ? ? iblock++;
> + ? ? ? }
> +
> + ? ? ? return ret;
> +}
> +
> +
> ?static void ext4_falloc_update_inode(struct inode *inode,
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?int mode, loff_t new_size, int update_ctime)
> ?{
> --
> 1.7.1
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at ?http://vger.kernel.org/majordomo-info.html
>



--
Best Wishes
Yongqiang Yang

2011-03-01 20:47:41

by Allison Henderson

[permalink] [raw]
Subject: Re: [Ext4 punch hole 1/5] Ext4 Punch Hole Support: Convert Blocks to Uninit Exts

On 2/28/2011 11:42 PM, Andreas Dilger wrote:
> On 2011-02-28, at 8:08 PM, Allison Henderson wrote:
>> This first patch adds a function to convert a range of blocks
>> to an uninitialized extent. This function will
>> be used to first convert the blocks to extents before
>> punching them out.
>
> This was proposed as a separate function for FALLOCATE by Dave Chinner (based on XFS_IOC_ZERO_RANGE), so this is useful as a standalone function.
>
>> +/*
>> + * ext4_split_extents() Splits a given extent at the block "split"
>> + * @handle: The journal handle
>> + * @inode: The file inode
>> + * @split: The block where the extent will be split
>> + * @path: The path to the extent
>> + * @flags: flags used to insert the new extent
>> + */
>> +static int ext4_split_extents(handle_t *handle,
>> + struct inode *inode,
>> + ext4_lblk_t split,
>> + struct ext4_ext_path *path,
>> + int flags)
>
> Isn't this already done when converting an uninitialized extent into a regular extent when it is being written to? Hmm, ext4_ext_convert_to_initialized() is splitting the extent, but it is doing that "by hand". It might make sense to convert ext4_ext_convert_to_initialized() to use this routine, but this depends on whether it makes the code simpler or not.

Alrighty, I will look into getting ext4_ext_convert_to_initialized() to use this routine. If it looks like it helps make the code simpler, maybe I can add an extra code clean up patch on top of what I already have. That way if people like it we can keep it, or we can drop the last patch if they don't. :)

>
>> + /* if the block is not actually in the extent, go to out */
>> + if(split> ee_block+ee_len || split< ee_block)
>> + goto out;
>
> Is this actually needed?

Ideally no. All the code that calls this routine does this math before hand to make sure this never happens, but I thought I might add this check here anyway just in case someone in the future tries to use this routine and doesnt do the math right. If for some reason we feel this code should be removed, the rest of the patch should function just fine without it though.

>
>> + ext_debug("The new block is %llu, the orig block is: %llu\n", newblock, origblock);
>
> (style) please wrap lines at 80 columns. This patch should be run through checkpatch.pl, which should catch issues like this.
>
>> + if(EXT_LAST_EXTENT(eh)>= EXT_MAX_EXTENT(eh)){
>> +
>> + /* otherwise just scoot all ther other extents down */
>> + else{
>> + for(i_ex = EXT_LAST_EXTENT(eh); i_ex> ex1; i_ex-- ){
>> + memcpy(i_ex+1, i_ex, sizeof(struct ext4_extent));
>> + }
>> + memcpy(ex1+1, ex2, sizeof(struct ext4_extent));
>
> (style) the whitespace in several other places is also not following the Linux coding style
>
>
> Cheers, Andreas
>
>
>
>
>

Thx for the feed back! I will get the rest of the style comments integrated into the patch.

Allison Henderson


2011-03-01 20:48:21

by Allison Henderson

[permalink] [raw]
Subject: Re: [Ext4 punch hole 1/5] Ext4 Punch Hole Support: Convert Blocks to Uninit Exts

On 3/1/2011 2:30 AM, Amir Goldstein wrote:
> On Tue, Mar 1, 2011 at 5:08 AM, Allison Henderson
> <[email protected]> wrote:
>> Hi All!
>>
>> This is the patch series for ext4 punch hole that I have been working on.
>> There is still a little more debugging that I would like to do with it,
>> but I've had some requests to see the patch, so I'm sending it out a bit
>> early. Any feed back is appreciated. Thx!
>
> Hi Allison,
>
> Very nice work!
>
> One meta comment is related to locking considerations with direct I/O.
> I am not all that familiar with all direct I/O paths (i.e. dioread_nolock),
> but I smell a possible race with holes initiation via direct I/O.
>
> The thing with direct I/O is that you have no flags to test the state
> of the block (did Eric just add a hash table to cope with another issue?).
> Direct I/O writes outside i_size are synchronous (i.e. i_mutex is held until
> I/O completion), but inside i_size, they may be a-synchronous.
> The case of initiating holes is currently handled with DIO_SKIP_HOLES
> by falling back to buffered I/O, although it could be handled by allocating
> an uninitialized extent.
> The case of writing to falloced space in handled by converting extents to
> initialized at async I/O completion.
>
> I am not sure there a race with hole punching here and I am not even sure
> that the description above is totally accurate (?), but it's a dark corner,
> which should to be reviewed carefully.
>

Alrighty, thx for pointing that out though, I will keep an eye out for
it. I'll see if I can set up some test cases that do what you describe
to see how the code handles them.

>>
>> This first patch adds a function to convert a range of blocks
>> to an uninitialized extent. This function will
>> be used to first convert the blocks to extents before
>> punching them out.
>>
>> This function also adds a routine to split an extent at
>> a given block. This routine is used when a range of
>> blocks to be converted is only partially contained in
>> an extent.
>>
>> Signed-off-by: Allison Henderson<[email protected]>
>> ---
>> :100644 100644 7516fb9... ab2e42e... M fs/ext4/extents.c
>> fs/ext4/extents.c | 185 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>> 1 files changed, 185 insertions(+), 0 deletions(-)
>>
>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
>> index 7516fb9..ab2e42e 100644
>> --- a/fs/ext4/extents.c
>> +++ b/fs/ext4/extents.c
>> @@ -2169,6 +2169,105 @@ static int ext4_remove_blocks(handle_t *handle, struct inode *inode,
>> return 0;
>> }
>>
>> +/*
>> + * ext4_split_extents() Splits a given extent at the block "split"
>> + * @handle: The journal handle
>> + * @inode: The file inode
>> + * @split: The block where the extent will be split
>> + * @path: The path to the extent
>> + * @flags: flags used to insert the new extent
>> + */
>> +static int ext4_split_extents(handle_t *handle,
>> + struct inode *inode,
>> + ext4_lblk_t split,
>> + struct ext4_ext_path *path,
>> + int flags)
>> +{
>> + struct ext4_extent *ex, newex, orig_ex, *i_ex;
>> + struct ext4_extent *ex1 = NULL;
>> + struct ext4_extent *ex2 = NULL;
>> + struct ext4_extent_header *eh;
>> + ext4_lblk_t ee_block;
>> + unsigned int ee_len, depth;
>> + ext4_fsblk_t newblock, origblock;
>> + int err = 0;
>> +
>> + ext_debug("ext4_split_extent: inode %lu, split %u\n", inode->i_ino, split);
>> + ext4_ext_show_leaf(inode, path);
>> +
>> + depth = ext_depth(inode);
>> + eh = path[depth].p_hdr;
>> + ex = path[depth].p_ext;
>> + ee_block = le32_to_cpu(ex->ee_block);
>> + ee_len = ext4_ext_get_actual_len(ex);
>> +
>> + /* if the block is not actually in the extent, go to out */
>> + if(split> ee_block+ee_len || split< ee_block)
>> + goto out;
>> +
>> + origblock = ee_block + ext4_ext_pblock(ex);
>> + newblock = split - ee_block + ext4_ext_pblock(ex);
>> + ext_debug("The new block is %llu, the orig block is: %llu\n", newblock, origblock);
>> +
>> + /* save original block in case split fails */
>> + orig_ex.ee_block = ex->ee_block;
>> + orig_ex.ee_len = cpu_to_le16(ee_len);
>> + ext4_ext_store_pblock(&orig_ex, ext4_ext_pblock(ex));
>> +
>> + err = ext4_ext_get_access(handle, inode, path + depth);
>> + if (err)
>> + goto out;
>> +
>> + ex1 = ex;
>> + ex1->ee_len = cpu_to_le16(split-ee_block);
>> +
>> + ex2 =&newex;
>> + ex2->ee_len = cpu_to_le16(ee_len-(split-ee_block));
>> + ex2->ee_block = split;
>> + ext4_ext_store_pblock(ex2, newblock);
>> +
>> + if (ex1->ee_block != ex2->ee_block)
>> + goto insert;
>> +
>> + /* Mark modified extent as dirty */
>> + err = ext4_ext_dirty(handle, inode, path + depth);
>> + ext_debug("out here\n");
>> + goto out;
>> +insert:
>> +
>> + /*
>> + * If this leaf cannot fit in any more extents
>> + * insert it into another leaf
>> + */
>> + if(EXT_LAST_EXTENT(eh)>= EXT_MAX_EXTENT(eh)){
>> + err = ext4_ext_insert_extent(handle, inode, path,&newex, flags);
>> + if (err)
>> + goto fix_extent_len;
>> + }
>> +
>> + /* otherwise just scoot all ther other extents down */
>> + else{
>> + for(i_ex = EXT_LAST_EXTENT(eh); i_ex> ex1; i_ex-- ){
>> + memcpy(i_ex+1, i_ex, sizeof(struct ext4_extent));
>> + }
>> + memcpy(ex1+1, ex2, sizeof(struct ext4_extent));
>> + eh->eh_entries++;
>> + }
>> +
>> +out:
>> + ext4_ext_show_leaf(inode, path);
>> + return err;
>> +
>> +fix_extent_len:
>> + ex->ee_block = orig_ex.ee_block;
>> + ex->ee_len = orig_ex.ee_len;
>> + ext4_ext_store_pblock(ex, ext4_ext_pblock(&orig_ex));
>> + ext4_ext_mark_uninitialized(ex);
>> + ext4_ext_dirty(handle, inode, path + depth);
>> +
>> + return err;
>> +}
>> +
>> static int
>> ext4_ext_rm_leaf(handle_t *handle, struct inode *inode,
>> struct ext4_ext_path *path, ext4_lblk_t start)
>> @@ -3598,6 +3697,92 @@ out_stop:
>> ext4_journal_stop(handle);
>> }
>>
>> +/*
>> + * ext4_ext_convert_blocks_uninit()
>> + * Converts a range of blocks to uninitialized
>> + *
>> + * @inode: The files inode
>> + * @handle: The journal handle
>
>
> (style) in all other functions, @handle comes before @inode
> I'm sorry, but this hurts my eyes :-)
>
>> + * @lblock: The logical block where the conversion will start
>> + * @len: The max number of blocks to convert
>> + *
>> + * Returns the number of blocks successfully converted
>> + */
>> +static int ext4_ext_convert_blocks_uninit(struct inode *inode, handle_t *handle, ext4_lblk_t lblock, ext4_lblk_t len){
>> +
>> + ext4_lblk_t ee_block, iblock;
>> + struct ext4_ext_path *path;
>> + struct ext4_extent *ex;
>> + unsigned int ee_len;
>> + int ret, err = 0;
>> +
>> + iblock = lblock;
>> + while(iblock< lblock+len){
>> +
>> + path = ext4_ext_find_extent(inode, lblock, NULL);
>> +
>> + if(path == NULL)
>> + goto next;
>> +
>> + err = ext4_ext_get_access(handle, inode, path);
>> + if(err< 0)
>> + goto next;
>> +
>> + ex = path->p_ext;
>> + if(ex == NULL)
>> + goto next;
>> +
>> + ee_block = le32_to_cpu(ex->ee_block);
>> + ee_len = ext4_ext_get_actual_len(ex);
>> +
>> + /*
>> + * If the block is in the middle of the extent,
>> + * split the extent to remove the head.
>> + * Then find the new extent that contains the block
>> + */
>> + if(ee_block< iblock){
>> + err = ext4_split_extents(handle, inode, iblock, path, 0);
>> + if(err)
>> + goto next;
>> +
>> + path = ext4_ext_find_extent(inode, iblock, NULL);
>> +
>> + if(path == NULL)
>> + goto next;
>> +
>> + ex = path->p_ext;
>> + if(ex == NULL)
>> + goto next;
>> +
>> + ee_block = le32_to_cpu(ex->ee_block);
>> + ee_len = ext4_ext_get_actual_len(ex);
>> +
>> + }
>> +
>> + /* If the extent exceeds len, split the extent to remove the tail */
>> + if(ee_len> len){
>> + err = ext4_split_extents(handle, inode, lblock+len, path, 0);
>> + if(err)
>> + goto next;
>> +
>> + ee_len = ext4_ext_get_actual_len(ex);
>> + }
>> +
>> + /* Mark the extent uninitialized */
>> + ext4_ext_mark_uninitialized(ex);
>> +
>> + iblock+=ex->ee_len;
>> + ret+=ex->ee_len;
>> + continue;
>> +next:
>> + /* If the extent for this block cannot be found, skip it */
>> + iblock++;
>> + }
>> +
>> + return ret;
>> +}
>> +
>> +
>> static void ext4_falloc_update_inode(struct inode *inode,
>> int mode, loff_t new_size, int update_ctime)
>> {
>> --
>> 1.7.1
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

Thx for the feed back, I will get the style comments integrated too.

Allison Henderson


2011-03-01 20:52:43

by Allison Henderson

[permalink] [raw]
Subject: Re: [Ext4 punch hole 1/5] Ext4 Punch Hole Support: Convert Blocks to Uninit Exts

On 3/1/2011 5:25 AM, Yongqiang Yang wrote:
> On Tue, Mar 1, 2011 at 11:08 AM, Allison Henderson
> <[email protected]> wrote:
>> Hi All!
>>
>> This is the patch series for ext4 punch hole that I have been working on.
>> There is still a little more debugging that I would like to do with it,
>> but I've had some requests to see the patch, so I'm sending it out a bit
>> early. Any feed back is appreciated. Thx!
>>
>> This first patch adds a function to convert a range of blocks
>> to an uninitialized extent. This function will
>> be used to first convert the blocks to extents before
>> punching them out.
>>
>> This function also adds a routine to split an extent at
>> a given block. This routine is used when a range of
>> blocks to be converted is only partially contained in
>> an extent.
>>
>> Signed-off-by: Allison Henderson<[email protected]>
>> ---
>> :100644 100644 7516fb9... ab2e42e... M fs/ext4/extents.c
>> fs/ext4/extents.c | 185 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>> 1 files changed, 185 insertions(+), 0 deletions(-)
>>
>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
>> index 7516fb9..ab2e42e 100644
>> --- a/fs/ext4/extents.c
>> +++ b/fs/ext4/extents.c
>> @@ -2169,6 +2169,105 @@ static int ext4_remove_blocks(handle_t *handle, struct inode *inode,
>> return 0;
>> }
>>
>> +/*
>> + * ext4_split_extents() Splits a given extent at the block "split"
>> + * @handle: The journal handle
>> + * @inode: The file inode
>> + * @split: The block where the extent will be split
>> + * @path: The path to the extent
>> + * @flags: flags used to insert the new extent
>> + */
>> +static int ext4_split_extents(handle_t *handle,
>> + struct inode *inode,
>> + ext4_lblk_t split,
>> + struct ext4_ext_path *path,
>> + int flags)
>> +{
>> + struct ext4_extent *ex, newex, orig_ex, *i_ex;
>> + struct ext4_extent *ex1 = NULL;
>> + struct ext4_extent *ex2 = NULL;
>> + struct ext4_extent_header *eh;
>> + ext4_lblk_t ee_block;
>> + unsigned int ee_len, depth;
>> + ext4_fsblk_t newblock, origblock;
>> + int err = 0;
>> +
>> + ext_debug("ext4_split_extent: inode %lu, split %u\n", inode->i_ino, split);
>> + ext4_ext_show_leaf(inode, path);
>> +
>> + depth = ext_depth(inode);
>> + eh = path[depth].p_hdr;
>> + ex = path[depth].p_ext;
>> + ee_block = le32_to_cpu(ex->ee_block);
>> + ee_len = ext4_ext_get_actual_len(ex);
>> +
>> + /* if the block is not actually in the extent, go to out */
>> + if(split> ee_block+ee_len || split< ee_block)
>> + goto out;
>> +
>> + origblock = ee_block + ext4_ext_pblock(ex);
>> + newblock = split - ee_block + ext4_ext_pblock(ex);
>> + ext_debug("The new block is %llu, the orig block is: %llu\n", newblock, origblock);
>> +
>> + /* save original block in case split fails */
>> + orig_ex.ee_block = ex->ee_block;
>> + orig_ex.ee_len = cpu_to_le16(ee_len);
>> + ext4_ext_store_pblock(&orig_ex, ext4_ext_pblock(ex));
>> +
>> + err = ext4_ext_get_access(handle, inode, path + depth);
>> + if (err)
>> + goto out;
>> +
>> + ex1 = ex;
>> + ex1->ee_len = cpu_to_le16(split-ee_block);
>> +
>> + ex2 =&newex;
>> + ex2->ee_len = cpu_to_le16(ee_len-(split-ee_block));
>> + ex2->ee_block = split;
>> + ext4_ext_store_pblock(ex2, newblock);
>> +
>> + if (ex1->ee_block != ex2->ee_block)
>> + goto insert;
>> +
>> + /* Mark modified extent as dirty */
>> + err = ext4_ext_dirty(handle, inode, path + depth);
>> + ext_debug("out here\n");
>> + goto out;
>> +insert:
>> +
>> + /*
>> + * If this leaf cannot fit in any more extents
>> + * insert it into another leaf
>> + */
>> + if(EXT_LAST_EXTENT(eh)>= EXT_MAX_EXTENT(eh)){
>> + err = ext4_ext_insert_extent(handle, inode, path,&newex, flags);
>> + if (err)
>> + goto fix_extent_len;
>> + }
>> +
>> + /* otherwise just scoot all ther other extents down */
>> + else{
>> + for(i_ex = EXT_LAST_EXTENT(eh); i_ex> ex1; i_ex-- ){
>> + memcpy(i_ex+1, i_ex, sizeof(struct ext4_extent));
>> + }
>> + memcpy(ex1+1, ex2, sizeof(struct ext4_extent));
>> + eh->eh_entries++;
>> + }
>> +
>> +out:
>> + ext4_ext_show_leaf(inode, path);
>> + return err;
>> +
>> +fix_extent_len:
>> + ex->ee_block = orig_ex.ee_block;
>> + ex->ee_len = orig_ex.ee_len;
>> + ext4_ext_store_pblock(ex, ext4_ext_pblock(&orig_ex));
>> + ext4_ext_mark_uninitialized(ex);
>> + ext4_ext_dirty(handle, inode, path + depth);
>> +
>> + return err;
>> +}
>> +
>> static int
>> ext4_ext_rm_leaf(handle_t *handle, struct inode *inode,
>> struct ext4_ext_path *path, ext4_lblk_t start)
>> @@ -3598,6 +3697,92 @@ out_stop:
>> ext4_journal_stop(handle);
>> }
>>
>> +/*
>> + * ext4_ext_convert_blocks_uninit()
>> + * Converts a range of blocks to uninitialized
>> + *
>> + * @inode: The files inode
>> + * @handle: The journal handle
>> + * @lblock: The logical block where the conversion will start
>> + * @len: The max number of blocks to convert
>> + *
>> + * Returns the number of blocks successfully converted
>> + */
>> +static int ext4_ext_convert_blocks_uninit(struct inode *inode, handle_t *handle, ext4_lblk_t lblock, ext4_lblk_t len){
>> +
>> + ext4_lblk_t ee_block, iblock;
>> + struct ext4_ext_path *path;
>> + struct ext4_extent *ex;
>> + unsigned int ee_len;
>> + int ret, err = 0;
>> +
>> + iblock = lblock;
>> + while(iblock< lblock+len){
>> +
>> + path = ext4_ext_find_extent(inode, lblock, NULL);
>> +
>> + if(path == NULL)
>> + goto next;
>> +
>> + err = ext4_ext_get_access(handle, inode, path);
>> + if(err< 0)
>> + goto next;
>> +
>> + ex = path->p_ext;
>> + if(ex == NULL)
>> + goto next;
>> +
>> + ee_block = le32_to_cpu(ex->ee_block);
>> + ee_len = ext4_ext_get_actual_len(ex);
>> +
>> + /*
>> + * If the block is in the middle of the extent,
>> + * split the extent to remove the head.
>> + * Then find the new extent that contains the block
>> + */
>> + if(ee_block< iblock){
>> + err = ext4_split_extents(handle, inode, iblock, path, 0);
>> + if(err)
>> + goto next;
>> +
>> + path = ext4_ext_find_extent(inode, iblock, NULL);
> (error) path needs to be kfree() and to release buffer_heads like
> ext4_ext_map_block() does.
>> +
>> + if(path == NULL)
>> + goto next;
>> +
>> + ex = path->p_ext;
>> + if(ex == NULL)
>> + goto next;
>> +
>> + ee_block = le32_to_cpu(ex->ee_block);
>> + ee_len = ext4_ext_get_actual_len(ex);
>> +
>> + }
>> +
>> + /* If the extent exceeds len, split the extent to remove the tail */
>> + if(ee_len> len){
>> + err = ext4_split_extents(handle, inode, lblock+len, path, 0);
>> + if(err)
>> + goto next;
>> +
>> + ee_len = ext4_ext_get_actual_len(ex);
>> + }
>> +
>> + /* Mark the extent uninitialized */
>> + ext4_ext_mark_uninitialized(ex);
>> +
>> + iblock+=ex->ee_len;
>> + ret+=ex->ee_len;
>> + continue;
>> +next:
>> + /* If the extent for this block cannot be found, skip it */
>> + iblock++;
>
>> + }
>> +
>> + return ret;
>> +}
>> +
>> +
>> static void ext4_falloc_update_inode(struct inode *inode,
>> int mode, loff_t new_size, int update_ctime)
>> {
>> --
>> 1.7.1
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>
>
>
Alrighty, thx to everyone for all the feedback. I really appreciate everyone taking the time to look at it. I still have some testing I need to do with it, but when I'm done I will post an updated patch here. Thx again all!

Allison Henderson


2011-03-01 23:16:18

by Mingming Cao

[permalink] [raw]
Subject: Re: [Ext4 punch hole 1/5] Ext4 Punch Hole Support: Convert Blocks to Uninit Exts

On Mon, 2011-02-28 at 23:42 -0700, Andreas Dilger wrote:
> On 2011-02-28, at 8:08 PM, Allison Henderson wrote:
> > This first patch adds a function to convert a range of blocks
> > to an uninitialized extent. This function will
> > be used to first convert the blocks to extents before
> > punching them out.
>
> This was proposed as a separate function for FALLOCATE by Dave Chinner (based on XFS_IOC_ZERO_RANGE), so this is useful as a standalone function.
>
> > +/*
> > + * ext4_split_extents() Splits a given extent at the block "split"
> > + * @handle: The journal handle
> > + * @inode: The file inode
> > + * @split: The block where the extent will be split
> > + * @path: The path to the extent
> > + * @flags: flags used to insert the new extent
> > + */
> > +static int ext4_split_extents(handle_t *handle,
> > + struct inode *inode,
> > + ext4_lblk_t split,
> > + struct ext4_ext_path *path,
> > + int flags)
>
> Isn't this already done when converting an uninitialized extent into a regular extent when it is being written to? Hmm, ext4_ext_convert_to_initialized() is splitting the extent, but it is doing that "by hand". It might make sense to convert ext4_ext_convert_to_initialized() to use this routine, but this depends on whether it makes the code simpler or not.
>

Yes, the ext4_ext_convert_to_initialized() and
ext4_split_unwritten_extents() both does similar thing, split the
extents to prepare for filling unwritten extents...which is opposite
than punching holes. these two functions should be merged/factor out
the same code first.

ext4_ext_convert_to_initialized() handles write to unwritten extents for
buffered IO case, which does the zerout optimization for filling holes
to small area handling(avoid frequent convert to initizatlized). And it
does the optimization to try to merge extents with neighbors if
possible. Those two are not needed for punch holes.

ext4_split_unwritten_extents() is more straightforward to reuse for
punch hole(no need for merge with neighbors, no need to zero out for
optimization) -- for punch hole, just opposite the logic in
ext4_split_unwritten_extents().


> > + /* if the block is not actually in the extent, go to out */
> > + if(split > ee_block+ee_len || split < ee_block)
> > + goto out;
>
> Is this actually needed?
>
> > + ext_debug("The new block is %llu, the orig block is: %llu\n", newblock, origblock);
>
> (style) please wrap lines at 80 columns. This patch should be run through checkpatch.pl, which should catch issues like this.
>
> > + if(EXT_LAST_EXTENT(eh) >= EXT_MAX_EXTENT(eh)){
> > +
> > + /* otherwise just scoot all ther other extents down */
> > + else{
> > + for(i_ex = EXT_LAST_EXTENT(eh); i_ex > ex1; i_ex-- ){
> > + memcpy(i_ex+1, i_ex, sizeof(struct ext4_extent));
> > + }
> > + memcpy(ex1+1, ex2, sizeof(struct ext4_extent));
>
> (style) the whitespace in several other places is also not following the Linux coding style
>
>
> Cheers, Andreas
>
>
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html



2011-03-02 01:32:14

by Mingming Cao

[permalink] [raw]
Subject: Re: [Ext4 punch hole 1/5] Ext4 Punch Hole Support: Convert Blocks to Uninit Exts

On Tue, 2011-03-01 at 11:30 +0200, Amir Goldstein wrote:
> On Tue, Mar 1, 2011 at 5:08 AM, Allison Henderson
> <[email protected]> wrote:
> > Hi All!
> >
> > This is the patch series for ext4 punch hole that I have been working on.
> > There is still a little more debugging that I would like to do with it,
> > but I've had some requests to see the patch, so I'm sending it out a bit
> > early. Any feed back is appreciated. Thx!
>
> Hi Allison,
>
> Very nice work!
>
> One meta comment is related to locking considerations with direct I/O.
> I am not all that familiar with all direct I/O paths (i.e. dioread_nolock),
> but I smell a possible race with holes initiation via direct I/O.
>
> The thing with direct I/O is that you have no flags to test the state
> of the block (did Eric just add a hash table to cope with another issue?).
> Direct I/O writes outside i_size are synchronous (i.e. i_mutex is held until
> I/O completion), but inside i_size, they may be a-synchronous.
> The case of initiating holes is currently handled with DIO_SKIP_HOLES
> by falling back to buffered I/O, although it could be handled by allocating
> an uninitialized extent.

This has been changed. We now have real direct IO filling holes. Blocks
allocated for holes are marked as uninitialized extents first, after DIO
is completed, those uninitialized extents are converted to initialized.

> The case of writing to falloced space in handled by converting extents to
> initialized at async I/O completion.
>
> I am not sure there a race with hole punching here and I am not even sure
> that the description above is totally accurate (?), but it's a dark corner,
> which should to be reviewed carefully.
>

The exsiting fallocate call which does allocation takes care of truncate
&regular allocation (DIO or buffered IO) by holding i_mutex lock. The
get_blocks() routines should takes care of concurrent modification of
the allocation tree with i_data_sem. I haven't check carefully yet, but
we should make sure allocation tree modification via punch hole also
hold the i_data_sem as well, to prevent race with truncate/writepages.

> >
> > This first patch adds a function to convert a range of blocks
> > to an uninitialized extent. This function will
> > be used to first convert the blocks to extents before
> > punching them out.
> >
> > This function also adds a routine to split an extent at
> > a given block. This routine is used when a range of
> > blocks to be converted is only partially contained in
> > an extent.
> >
> > Signed-off-by: Allison Henderson <[email protected]>
> > ---
> > :100644 100644 7516fb9... ab2e42e... M fs/ext4/extents.c
> > fs/ext4/extents.c | 185 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> > 1 files changed, 185 insertions(+), 0 deletions(-)
> >
> > diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> > index 7516fb9..ab2e42e 100644
> > --- a/fs/ext4/extents.c
> > +++ b/fs/ext4/extents.c
> > @@ -2169,6 +2169,105 @@ static int ext4_remove_blocks(handle_t *handle, struct inode *inode,
> > return 0;
> > }
> >
> > +/*
> > + * ext4_split_extents() Splits a given extent at the block "split"
> > + * @handle: The journal handle
> > + * @inode: The file inode
> > + * @split: The block where the extent will be split
> > + * @path: The path to the extent
> > + * @flags: flags used to insert the new extent
> > + */
> > +static int ext4_split_extents(handle_t *handle,
> > + struct inode *inode,
> > + ext4_lblk_t split,
> > + struct ext4_ext_path *path,
> > + int flags)
> > +{
> > + struct ext4_extent *ex, newex, orig_ex, *i_ex;
> > + struct ext4_extent *ex1 = NULL;
> > + struct ext4_extent *ex2 = NULL;
> > + struct ext4_extent_header *eh;
> > + ext4_lblk_t ee_block;
> > + unsigned int ee_len, depth;
> > + ext4_fsblk_t newblock, origblock;
> > + int err = 0;
> > +
> > + ext_debug("ext4_split_extent: inode %lu, split %u\n", inode->i_ino, split);
> > + ext4_ext_show_leaf(inode, path);
> > +
> > + depth = ext_depth(inode);
> > + eh = path[depth].p_hdr;
> > + ex = path[depth].p_ext;
> > + ee_block = le32_to_cpu(ex->ee_block);
> > + ee_len = ext4_ext_get_actual_len(ex);
> > +
> > + /* if the block is not actually in the extent, go to out */
> > + if(split > ee_block+ee_len || split < ee_block)
> > + goto out;
> > +
> > + origblock = ee_block + ext4_ext_pblock(ex);
> > + newblock = split - ee_block + ext4_ext_pblock(ex);
> > + ext_debug("The new block is %llu, the orig block is: %llu\n", newblock, origblock);
> > +
> > + /* save original block in case split fails */
> > + orig_ex.ee_block = ex->ee_block;
> > + orig_ex.ee_len = cpu_to_le16(ee_len);
> > + ext4_ext_store_pblock(&orig_ex, ext4_ext_pblock(ex));
> > +
> > + err = ext4_ext_get_access(handle, inode, path + depth);
> > + if (err)
> > + goto out;
> > +
> > + ex1 = ex;
> > + ex1->ee_len = cpu_to_le16(split-ee_block);
> > +
> > + ex2 = &newex;
> > + ex2->ee_len = cpu_to_le16(ee_len-(split-ee_block));
> > + ex2->ee_block = split;
> > + ext4_ext_store_pblock(ex2, newblock);
> > +
> > + if (ex1->ee_block != ex2->ee_block)
> > + goto insert;
> > +
> > + /* Mark modified extent as dirty */
> > + err = ext4_ext_dirty(handle, inode, path + depth);
> > + ext_debug("out here\n");
> > + goto out;
> > +insert:
> > +
> > + /*
> > + * If this leaf cannot fit in any more extents
> > + * insert it into another leaf
> > + */
> > + if(EXT_LAST_EXTENT(eh) >= EXT_MAX_EXTENT(eh)){
> > + err = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
> > + if (err)
> > + goto fix_extent_len;
> > + }
> > +
> > + /* otherwise just scoot all ther other extents down */
> > + else{
> > + for(i_ex = EXT_LAST_EXTENT(eh); i_ex > ex1; i_ex-- ){
> > + memcpy(i_ex+1, i_ex, sizeof(struct ext4_extent));
> > + }
> > + memcpy(ex1+1, ex2, sizeof(struct ext4_extent));
> > + eh->eh_entries++;
> > + }
> > +
> > +out:
> > + ext4_ext_show_leaf(inode, path);
> > + return err;
> > +
> > +fix_extent_len:
> > + ex->ee_block = orig_ex.ee_block;
> > + ex->ee_len = orig_ex.ee_len;
> > + ext4_ext_store_pblock(ex, ext4_ext_pblock(&orig_ex));
> > + ext4_ext_mark_uninitialized(ex);
> > + ext4_ext_dirty(handle, inode, path + depth);
> > +
> > + return err;
> > +}
> > +
> > static int
> > ext4_ext_rm_leaf(handle_t *handle, struct inode *inode,
> > struct ext4_ext_path *path, ext4_lblk_t start)
> > @@ -3598,6 +3697,92 @@ out_stop:
> > ext4_journal_stop(handle);
> > }
> >
> > +/*
> > + * ext4_ext_convert_blocks_uninit()
> > + * Converts a range of blocks to uninitialized
> > + *
> > + * @inode: The files inode
> > + * @handle: The journal handle
>
>
> (style) in all other functions, @handle comes before @inode
> I'm sorry, but this hurts my eyes :-)
>
> > + * @lblock: The logical block where the conversion will start
> > + * @len: The max number of blocks to convert
> > + *
> > + * Returns the number of blocks successfully converted
> > + */
> > +static int ext4_ext_convert_blocks_uninit(struct inode *inode, handle_t *handle, ext4_lblk_t lblock, ext4_lblk_t len){
> > +
> > + ext4_lblk_t ee_block, iblock;
> > + struct ext4_ext_path *path;
> > + struct ext4_extent *ex;
> > + unsigned int ee_len;
> > + int ret, err = 0;
> > +
> > + iblock = lblock;
> > + while(iblock < lblock+len){
> > +
> > + path = ext4_ext_find_extent(inode, lblock, NULL);
> > +
> > + if(path == NULL)
> > + goto next;
> > +
> > + err = ext4_ext_get_access(handle, inode, path);
> > + if(err < 0)
> > + goto next;
> > +
> > + ex = path->p_ext;
> > + if(ex == NULL)
> > + goto next;
> > +
> > + ee_block = le32_to_cpu(ex->ee_block);
> > + ee_len = ext4_ext_get_actual_len(ex);
> > +
> > + /*
> > + * If the block is in the middle of the extent,
> > + * split the extent to remove the head.
> > + * Then find the new extent that contains the block
> > + */
> > + if(ee_block < iblock){
> > + err = ext4_split_extents(handle, inode, iblock, path, 0);
> > + if(err)
> > + goto next;
> > +
> > + path = ext4_ext_find_extent(inode, iblock, NULL);
> > +
> > + if(path == NULL)
> > + goto next;
> > +
> > + ex = path->p_ext;
> > + if(ex == NULL)
> > + goto next;
> > +
> > + ee_block = le32_to_cpu(ex->ee_block);
> > + ee_len = ext4_ext_get_actual_len(ex);
> > +
> > + }
> > +
> > + /* If the extent exceeds len, split the extent to remove the tail */
> > + if(ee_len > len){
> > + err = ext4_split_extents(handle, inode, lblock+len, path, 0);
> > + if(err)
> > + goto next;
> > +
> > + ee_len = ext4_ext_get_actual_len(ex);
> > + }
> > +
> > + /* Mark the extent uninitialized */
> > + ext4_ext_mark_uninitialized(ex);
> > +
> > + iblock+=ex->ee_len;
> > + ret+=ex->ee_len;
> > + continue;
> > +next:
> > + /* If the extent for this block cannot be found, skip it */
> > + iblock++;
> > + }
> > +
> > + return ret;
> > +}
> > +
> > +
> > static void ext4_falloc_update_inode(struct inode *inode,
> > int mode, loff_t new_size, int update_ctime)
> > {
> > --
> > 1.7.1
> >
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > the body of a message to [email protected]
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> >
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html



2011-03-02 03:32:56

by Yongqiang Yang

[permalink] [raw]
Subject: Re: [Ext4 punch hole 1/5] Ext4 Punch Hole Support: Convert Blocks to Uninit Exts

On Tue, Mar 1, 2011 at 7:09 PM, Yongqiang Yang <[email protected]> wrote:
> On Tue, Mar 1, 2011 at 11:08 AM, Allison Henderson
> <[email protected]> wrote:
>> Hi All!
>>
>> This is the patch series for ext4 punch hole that I have been working on.
>> There is still a little more debugging that I would like to do with it,
>> but I've had some requests to see the patch, so I'm sending it out a bit
>> early. ?Any feed back is appreciated. ?Thx!
>>
>> This first patch adds a function to convert a range of blocks
>> to an uninitialized extent. ?This function will
>> be used to first convert the blocks to extents before
>> punching them out.
>>
>> This function also adds a routine to split an extent at
>> a given block. ?This routine is used when a range of
>> blocks to be converted is only partially contained in
>> an extent.
>>
>> Signed-off-by: Allison Henderson <[email protected]>
>> ---
>> :100644 100644 7516fb9... ab2e42e... M ?fs/ext4/extents.c
>> ?fs/ext4/extents.c | ?185 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>> ?1 files changed, 185 insertions(+), 0 deletions(-)
>>
>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
>> index 7516fb9..ab2e42e 100644
>> --- a/fs/ext4/extents.c
>> +++ b/fs/ext4/extents.c
>> @@ -2169,6 +2169,105 @@ static int ext4_remove_blocks(handle_t *handle, struct inode *inode,
>> ? ? ? ?return 0;
>> ?}
>>
>> +/*
>> + * ext4_split_extents() Splits a given extent at the block "split"
>> + * @handle: The journal handle
>> + * @inode: The file inode
>> + * @split: The block where the extent will be split
>> + * @path: The path to the extent
>> + * @flags: flags used to insert the new extent
>> + */
>> +static int ext4_split_extents(handle_t *handle,
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? struct inode *inode,
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ext4_lblk_t split,
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? struct ext4_ext_path *path,
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? int flags)
>> +{
>> + ? ? ? struct ext4_extent *ex, newex, orig_ex, *i_ex;
>> + ? ? ? struct ext4_extent *ex1 = NULL;
>> + ? ? ? struct ext4_extent *ex2 = NULL;
>> + ? ? ? struct ext4_extent_header *eh;
>> + ? ? ? ext4_lblk_t ee_block;
>> + ? ? ? unsigned int ee_len, depth;
>> + ? ? ? ext4_fsblk_t newblock, origblock;
>> + ? ? ? int err = 0;
>> +
>> + ? ? ? ext_debug("ext4_split_extent: inode %lu, split %u\n", inode->i_ino, split);
>> + ? ? ? ext4_ext_show_leaf(inode, path);
>> +
>> + ? ? ? depth = ext_depth(inode);
>> + ? ? ? eh = path[depth].p_hdr;
>> + ? ? ? ex = path[depth].p_ext;
>> + ? ? ? ee_block = le32_to_cpu(ex->ee_block);
>> + ? ? ? ee_len = ext4_ext_get_actual_len(ex);
>> +
>> + ? ? ? /* if the block is not actually in the extent, go to out */
>> + ? ? ? if(split > ee_block+ee_len || split < ee_block)
>> + ? ? ? ? ? ? ? goto out;
>> +
>> + ? ? ? origblock = ee_block + ext4_ext_pblock(ex);
>> + ? ? ? newblock = split - ee_block + ext4_ext_pblock(ex);
>> + ? ? ? ext_debug("The new block is %llu, the orig block is: %llu\n", newblock, origblock);
>> +
>> + ? ? ? /* save original block in case split fails */
>> + ? ? ? orig_ex.ee_block = ex->ee_block;
>> + ? ? ? orig_ex.ee_len ? = cpu_to_le16(ee_len);
>> + ? ? ? ext4_ext_store_pblock(&orig_ex, ext4_ext_pblock(ex));
>> +
>> + ? ? ? err = ext4_ext_get_access(handle, inode, path + depth);
>> + ? ? ? if (err)
>> + ? ? ? ? ? ? ? goto out;
>> +
>> + ? ? ? ex1 = ex;
>> + ? ? ? ex1->ee_len = cpu_to_le16(split-ee_block);
>> +
>> + ? ? ? ex2 = &newex;
>> + ? ? ? ex2->ee_len = cpu_to_le16(ee_len-(split-ee_block));
>> + ? ? ? ex2->ee_block = split;
>> + ? ? ? ext4_ext_store_pblock(ex2, newblock);
>> +
>> + ? ? ? if (ex1->ee_block != ex2->ee_block)
>> + ? ? ? ? ? ? ? goto insert;
>> +
>> + ? ? ? /* Mark modified extent as dirty */
>> + ? ? ? err = ext4_ext_dirty(handle, inode, path + depth);
>> + ? ? ? ext_debug("out here\n");
>> + ? ? ? goto out;
>
>
>
> --------
>> +insert:
>> +
>> + ? ? ? /*
>> + ? ? ? ?* If this leaf cannot fit in any more extents
>> + ? ? ? ?* insert it into another leaf
>> + ? ? ? ?*/
>> + ? ? ? if(EXT_LAST_EXTENT(eh) >= ?EXT_MAX_EXTENT(eh)){
>> + ? ? ? ? ? ? ? err = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
>> + ? ? ? ? ? ? ? if (err)
>> + ? ? ? ? ? ? ? ? ? ? ? goto fix_extent_len;
>> + ? ? ? }
>> +
>> + ? ? ? /* otherwise just scoot all ther other extents down ?*/
>> + ? ? ? else{
>> + ? ? ? ? ? ? ? for(i_ex = EXT_LAST_EXTENT(eh); i_ex > ex1; i_ex-- ){
>> + ? ? ? ? ? ? ? ? ? ? ? memcpy(i_ex+1, i_ex, sizeof(struct ext4_extent));
>> + ? ? ? ? ? ? ? }
> ?for loop can be replaced by:
> ? ? ? ? ? ? ? ? ? ?memcpy(ex1 + 1, ex1 + 2, (i_ex - ex1) *
> sizeof(struct ext4_extent));
sorry, memmove() should be used.

>> + ? ? ? ? ? ? ? memcpy(ex1+1, ex2, sizeof(struct ext4_extent));
>> + ? ? ? ? ? ? ? eh->eh_entries++;
> (error) should be replaced by:
> ? ? ? ? ? ? ? ? ? ?le16_add_cpu(&eh->eh_entries, 1);
>> + ? ? ? }
>
> insert label could be replaced by:
> ?? ? ? ? ? ? ? err = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
> ?? ? ? ? ? ? ? if (err)
> ?? ? ? ? ? ? ? ? ? ? ? goto fix_extent_len;
>
>> +out:
>> + ? ? ? ext4_ext_show_leaf(inode, path);
>> + ? ? ? return err;
>> +
>> +fix_extent_len:
>> + ? ? ? ex->ee_block = orig_ex.ee_block;
>> + ? ? ? ex->ee_len ? = orig_ex.ee_len;
>> + ? ? ? ext4_ext_store_pblock(ex, ext4_ext_pblock(&orig_ex));
>> + ? ? ? ext4_ext_mark_uninitialized(ex);
>> + ? ? ? ext4_ext_dirty(handle, inode, path + depth);
>> +
>> + ? ? ? return err;
>> +}
>> +
>> ?static int
>> ?ext4_ext_rm_leaf(handle_t *handle, struct inode *inode,
>> ? ? ? ? ? ? ? ?struct ext4_ext_path *path, ext4_lblk_t start)
>> @@ -3598,6 +3697,92 @@ out_stop:
>> ? ? ? ?ext4_journal_stop(handle);
>> ?}
>>
>> +/*
>> + * ext4_ext_convert_blocks_uninit()
>> + * Converts a range of blocks to uninitialized
>> + *
>> + * @inode: ?The files inode
>> + * @handle: The journal handle
>> + * @lblock: The logical block where the conversion will start
>> + * @len: ? ?The max number of blocks to convert
>> + *
>> + * Returns the number of blocks successfully converted
>> + */
>> +static int ext4_ext_convert_blocks_uninit(struct inode *inode, handle_t *handle, ext4_lblk_t lblock, ext4_lblk_t len){
>> +
>> + ? ? ? ext4_lblk_t ee_block, iblock;
>> + ? ? ? struct ext4_ext_path *path;
>> + ? ? ? struct ext4_extent *ex;
>> + ? ? ? unsigned int ee_len;
>> + ? ? ? int ret, err = 0;
>> +
>> + ? ? ? iblock = lblock;
>> + ? ? ? while(iblock < lblock+len){
>> +
>> + ? ? ? ? ? ? ? path = ext4_ext_find_extent(inode, lblock, NULL);
>> +
>> + ? ? ? ? ? ? ? if(path == NULL)
>> + ? ? ? ? ? ? ? ? ? ? ? goto next;
>> +
>> + ? ? ? ? ? ? ? err = ext4_ext_get_access(handle, inode, path);
>> + ? ? ? ? ? ? ? if(err < 0)
>> + ? ? ? ? ? ? ? ? ? ? ? goto next;
>> +
>> + ? ? ? ? ? ? ? ex = path->p_ext;
>> + ? ? ? ? ? ? ? if(ex == NULL)
>> + ? ? ? ? ? ? ? ? ? ? ? goto next;
>> +
>> + ? ? ? ? ? ? ? ee_block = le32_to_cpu(ex->ee_block);
>> + ? ? ? ? ? ? ? ee_len = ext4_ext_get_actual_len(ex);
>> +
>> + ? ? ? ? ? ? ? /*
>> + ? ? ? ? ? ? ? ?* If the block is in the middle of the extent,
>> + ? ? ? ? ? ? ? ?* split the extent to remove the head.
>> + ? ? ? ? ? ? ? ?* Then find the new extent that contains the block
>> + ? ? ? ? ? ? ? ?*/
>> + ? ? ? ? ? ? ? if(ee_block < iblock){
>> + ? ? ? ? ? ? ? ? ? ? ? err = ext4_split_extents(handle, inode, iblock, path, 0);
>> + ? ? ? ? ? ? ? ? ? ? ? if(err)
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? goto next;
>> +
>> + ? ? ? ? ? ? ? ? ? ? ? path = ext4_ext_find_extent(inode, iblock, NULL);
>> +
>> + ? ? ? ? ? ? ? ? ? ? ? if(path == NULL)
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? goto next;
>> +
>> + ? ? ? ? ? ? ? ? ? ? ? ex = path->p_ext;
>> + ? ? ? ? ? ? ? ? ? ? ? if(ex == NULL)
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? goto next;
>> +
>> + ? ? ? ? ? ? ? ? ? ? ? ee_block = le32_to_cpu(ex->ee_block);
>> + ? ? ? ? ? ? ? ? ? ? ? ee_len = ext4_ext_get_actual_len(ex);
>> +
>> + ? ? ? ? ? ? ? }
>> +
>> + ? ? ? ? ? ? ? /* If the extent exceeds len, split the extent to remove the tail */
>> + ? ? ? ? ? ? ? if(ee_len > len){
>> + ? ? ? ? ? ? ? ? ? ? ? err = ext4_split_extents(handle, inode, lblock+len, path, 0);
>> + ? ? ? ? ? ? ? ? ? ? ? if(err)
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? goto next;
>> +
>> + ? ? ? ? ? ? ? ? ? ? ? ee_len = ext4_ext_get_actual_len(ex);
>> + ? ? ? ? ? ? ? }
>> +
>> + ? ? ? ? ? ? ? /* Mark the extent uninitialized */
>> + ? ? ? ? ? ? ? ext4_ext_mark_uninitialized(ex);
>> +
>> + ? ? ? ? ? ? ? iblock+=ex->ee_len;
>> + ? ? ? ? ? ? ? ret+=ex->ee_len;
>> + ? ? ? ? ? ? ? continue;
>> +next:
>> + ? ? ? ? ? ? ? /* If the extent for this block cannot be found, skip it */
>> + ? ? ? ? ? ? ? iblock++;
>> + ? ? ? }
>> +
>> + ? ? ? return ret;
>> +}
>> +
>> +
>> ?static void ext4_falloc_update_inode(struct inode *inode,
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?int mode, loff_t new_size, int update_ctime)
>> ?{
>> --
>> 1.7.1
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
>> the body of a message to [email protected]
>> More majordomo info at ?http://vger.kernel.org/majordomo-info.html
>>
>
>
>
> --
> Best Wishes
> Yongqiang Yang
>



--
Best Wishes
Yongqiang Yang

2011-03-02 07:54:27

by Amir Goldstein

[permalink] [raw]
Subject: Re: [Ext4 punch hole 1/5] Ext4 Punch Hole Support: Convert Blocks to Uninit Exts

On Wed, Mar 2, 2011 at 3:32 AM, Mingming Cao <[email protected]> wrote:
> On Tue, 2011-03-01 at 11:30 +0200, Amir Goldstein wrote:
>> On Tue, Mar 1, 2011 at 5:08 AM, Allison Henderson
>> <[email protected]> wrote:
>> > Hi All!
>> >
>> > This is the patch series for ext4 punch hole that I have been working on.
>> > There is still a little more debugging that I would like to do with it,
>> > but I've had some requests to see the patch, so I'm sending it out a bit
>> > early. ?Any feed back is appreciated. ?Thx!
>>
>> Hi Allison,
>>
>> Very nice work!
>>
>> One meta comment is related to locking considerations with direct I/O.
>> I am not all that familiar with all direct I/O paths (i.e. dioread_nolock),
>> but I smell a possible race with holes initiation via direct I/O.
>>
>> The thing with direct I/O is that you have no flags to test the state
>> of the block (did Eric just add a hash table to cope with another issue?).
>> Direct I/O writes outside i_size are synchronous (i.e. i_mutex is held until
>> I/O completion), but inside i_size, they may be a-synchronous.
>> The case of initiating holes is currently handled with DIO_SKIP_HOLES
>> by falling back to buffered I/O, although it could be handled by allocating
>> an uninitialized extent.
>
> This has been changed. We now have real direct IO filling holes. Blocks
> allocated for holes are marked as uninitialized extents first, after DIO
> is completed, those uninitialized extents are converted to initialized.
>

Right. Sorry, I missed that. Was confused by the fact that DIO_SKIP_HOLES
is still being used, but it didn't make sense to me either.

>> The case of writing to falloced space in handled by converting extents to
>> initialized at async I/O completion.
>>
>> I am not sure there a race with hole punching here and I am not even sure
>> that the description above is totally accurate (?), but it's a dark corner,
>> which should to be reviewed carefully.
>>
>
> The exsiting fallocate call which does allocation takes care of truncate
> &regular allocation (DIO or buffered IO) by holding i_mutex lock. The
> get_blocks() routines should takes care of concurrent modification of
> the allocation tree with i_data_sem. I haven't check carefully yet, but
> we should make sure allocation tree modification via punch hole also
> hold the i_data_sem as well, to prevent race with truncate/writepages.
>

I admit that all locking seems to be in order.
However, considering the fact that to this day, throughout the years of
ext2/3/4, developers could assume that inside i_size, new block mapping
could appear, but not disappear, it's hard to believe that no one has used
that assumption to make a "shortcut".

But perhaps I am wrong and the ordinary case of truncate/extend is
exactly the same as hole punching in that regard.

Amir.


>> >
>> > This first patch adds a function to convert a range of blocks
>> > to an uninitialized extent. ?This function will
>> > be used to first convert the blocks to extents before
>> > punching them out.
>> >
>> > This function also adds a routine to split an extent at
>> > a given block. ?This routine is used when a range of
>> > blocks to be converted is only partially contained in
>> > an extent.
>> >
>> > Signed-off-by: Allison Henderson <[email protected]>
>> > ---
>> > :100644 100644 7516fb9... ab2e42e... M ?fs/ext4/extents.c
>> > ?fs/ext4/extents.c | ?185 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>> > ?1 files changed, 185 insertions(+), 0 deletions(-)
>> >
>> > diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
>> > index 7516fb9..ab2e42e 100644
>> > --- a/fs/ext4/extents.c
>> > +++ b/fs/ext4/extents.c
>> > @@ -2169,6 +2169,105 @@ static int ext4_remove_blocks(handle_t *handle, struct inode *inode,
>> > ? ? ? ?return 0;
>> > ?}
>> >
>> > +/*
>> > + * ext4_split_extents() Splits a given extent at the block "split"
>> > + * @handle: The journal handle
>> > + * @inode: The file inode
>> > + * @split: The block where the extent will be split
>> > + * @path: The path to the extent
>> > + * @flags: flags used to insert the new extent
>> > + */
>> > +static int ext4_split_extents(handle_t *handle,
>> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? struct inode *inode,
>> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ext4_lblk_t split,
>> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? struct ext4_ext_path *path,
>> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? int flags)
>> > +{
>> > + ? ? ? struct ext4_extent *ex, newex, orig_ex, *i_ex;
>> > + ? ? ? struct ext4_extent *ex1 = NULL;
>> > + ? ? ? struct ext4_extent *ex2 = NULL;
>> > + ? ? ? struct ext4_extent_header *eh;
>> > + ? ? ? ext4_lblk_t ee_block;
>> > + ? ? ? unsigned int ee_len, depth;
>> > + ? ? ? ext4_fsblk_t newblock, origblock;
>> > + ? ? ? int err = 0;
>> > +
>> > + ? ? ? ext_debug("ext4_split_extent: inode %lu, split %u\n", inode->i_ino, split);
>> > + ? ? ? ext4_ext_show_leaf(inode, path);
>> > +
>> > + ? ? ? depth = ext_depth(inode);
>> > + ? ? ? eh = path[depth].p_hdr;
>> > + ? ? ? ex = path[depth].p_ext;
>> > + ? ? ? ee_block = le32_to_cpu(ex->ee_block);
>> > + ? ? ? ee_len = ext4_ext_get_actual_len(ex);
>> > +
>> > + ? ? ? /* if the block is not actually in the extent, go to out */
>> > + ? ? ? if(split > ee_block+ee_len || split < ee_block)
>> > + ? ? ? ? ? ? ? goto out;
>> > +
>> > + ? ? ? origblock = ee_block + ext4_ext_pblock(ex);
>> > + ? ? ? newblock = split - ee_block + ext4_ext_pblock(ex);
>> > + ? ? ? ext_debug("The new block is %llu, the orig block is: %llu\n", newblock, origblock);
>> > +
>> > + ? ? ? /* save original block in case split fails */
>> > + ? ? ? orig_ex.ee_block = ex->ee_block;
>> > + ? ? ? orig_ex.ee_len ? = cpu_to_le16(ee_len);
>> > + ? ? ? ext4_ext_store_pblock(&orig_ex, ext4_ext_pblock(ex));
>> > +
>> > + ? ? ? err = ext4_ext_get_access(handle, inode, path + depth);
>> > + ? ? ? if (err)
>> > + ? ? ? ? ? ? ? goto out;
>> > +
>> > + ? ? ? ex1 = ex;
>> > + ? ? ? ex1->ee_len = cpu_to_le16(split-ee_block);
>> > +
>> > + ? ? ? ex2 = &newex;
>> > + ? ? ? ex2->ee_len = cpu_to_le16(ee_len-(split-ee_block));
>> > + ? ? ? ex2->ee_block = split;
>> > + ? ? ? ext4_ext_store_pblock(ex2, newblock);
>> > +
>> > + ? ? ? if (ex1->ee_block != ex2->ee_block)
>> > + ? ? ? ? ? ? ? goto insert;
>> > +
>> > + ? ? ? /* Mark modified extent as dirty */
>> > + ? ? ? err = ext4_ext_dirty(handle, inode, path + depth);
>> > + ? ? ? ext_debug("out here\n");
>> > + ? ? ? goto out;
>> > +insert:
>> > +
>> > + ? ? ? /*
>> > + ? ? ? ?* If this leaf cannot fit in any more extents
>> > + ? ? ? ?* insert it into another leaf
>> > + ? ? ? ?*/
>> > + ? ? ? if(EXT_LAST_EXTENT(eh) >= ?EXT_MAX_EXTENT(eh)){
>> > + ? ? ? ? ? ? ? err = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
>> > + ? ? ? ? ? ? ? if (err)
>> > + ? ? ? ? ? ? ? ? ? ? ? goto fix_extent_len;
>> > + ? ? ? }
>> > +
>> > + ? ? ? /* otherwise just scoot all ther other extents down ?*/
>> > + ? ? ? else{
>> > + ? ? ? ? ? ? ? for(i_ex = EXT_LAST_EXTENT(eh); i_ex > ex1; i_ex-- ){
>> > + ? ? ? ? ? ? ? ? ? ? ? memcpy(i_ex+1, i_ex, sizeof(struct ext4_extent));
>> > + ? ? ? ? ? ? ? }
>> > + ? ? ? ? ? ? ? memcpy(ex1+1, ex2, sizeof(struct ext4_extent));
>> > + ? ? ? ? ? ? ? eh->eh_entries++;
>> > + ? ? ? }
>> > +
>> > +out:
>> > + ? ? ? ext4_ext_show_leaf(inode, path);
>> > + ? ? ? return err;
>> > +
>> > +fix_extent_len:
>> > + ? ? ? ex->ee_block = orig_ex.ee_block;
>> > + ? ? ? ex->ee_len ? = orig_ex.ee_len;
>> > + ? ? ? ext4_ext_store_pblock(ex, ext4_ext_pblock(&orig_ex));
>> > + ? ? ? ext4_ext_mark_uninitialized(ex);
>> > + ? ? ? ext4_ext_dirty(handle, inode, path + depth);
>> > +
>> > + ? ? ? return err;
>> > +}
>> > +
>> > ?static int
>> > ?ext4_ext_rm_leaf(handle_t *handle, struct inode *inode,
>> > ? ? ? ? ? ? ? ?struct ext4_ext_path *path, ext4_lblk_t start)
>> > @@ -3598,6 +3697,92 @@ out_stop:
>> > ? ? ? ?ext4_journal_stop(handle);
>> > ?}
>> >
>> > +/*
>> > + * ext4_ext_convert_blocks_uninit()
>> > + * Converts a range of blocks to uninitialized
>> > + *
>> > + * @inode: ?The files inode
>> > + * @handle: The journal handle
>>
>>
>> (style) in all other functions, @handle comes before @inode
>> I'm sorry, but this hurts my eyes :-)
>>
>> > + * @lblock: The logical block where the conversion will start
>> > + * @len: ? ?The max number of blocks to convert
>> > + *
>> > + * Returns the number of blocks successfully converted
>> > + */
>> > +static int ext4_ext_convert_blocks_uninit(struct inode *inode, handle_t *handle, ext4_lblk_t lblock, ext4_lblk_t len){
>> > +
>> > + ? ? ? ext4_lblk_t ee_block, iblock;
>> > + ? ? ? struct ext4_ext_path *path;
>> > + ? ? ? struct ext4_extent *ex;
>> > + ? ? ? unsigned int ee_len;
>> > + ? ? ? int ret, err = 0;
>> > +
>> > + ? ? ? iblock = lblock;
>> > + ? ? ? while(iblock < lblock+len){
>> > +
>> > + ? ? ? ? ? ? ? path = ext4_ext_find_extent(inode, lblock, NULL);
>> > +
>> > + ? ? ? ? ? ? ? if(path == NULL)
>> > + ? ? ? ? ? ? ? ? ? ? ? goto next;
>> > +
>> > + ? ? ? ? ? ? ? err = ext4_ext_get_access(handle, inode, path);
>> > + ? ? ? ? ? ? ? if(err < 0)
>> > + ? ? ? ? ? ? ? ? ? ? ? goto next;
>> > +
>> > + ? ? ? ? ? ? ? ex = path->p_ext;
>> > + ? ? ? ? ? ? ? if(ex == NULL)
>> > + ? ? ? ? ? ? ? ? ? ? ? goto next;
>> > +
>> > + ? ? ? ? ? ? ? ee_block = le32_to_cpu(ex->ee_block);
>> > + ? ? ? ? ? ? ? ee_len = ext4_ext_get_actual_len(ex);
>> > +
>> > + ? ? ? ? ? ? ? /*
>> > + ? ? ? ? ? ? ? ?* If the block is in the middle of the extent,
>> > + ? ? ? ? ? ? ? ?* split the extent to remove the head.
>> > + ? ? ? ? ? ? ? ?* Then find the new extent that contains the block
>> > + ? ? ? ? ? ? ? ?*/
>> > + ? ? ? ? ? ? ? if(ee_block < iblock){
>> > + ? ? ? ? ? ? ? ? ? ? ? err = ext4_split_extents(handle, inode, iblock, path, 0);
>> > + ? ? ? ? ? ? ? ? ? ? ? if(err)
>> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? goto next;
>> > +
>> > + ? ? ? ? ? ? ? ? ? ? ? path = ext4_ext_find_extent(inode, iblock, NULL);
>> > +
>> > + ? ? ? ? ? ? ? ? ? ? ? if(path == NULL)
>> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? goto next;
>> > +
>> > + ? ? ? ? ? ? ? ? ? ? ? ex = path->p_ext;
>> > + ? ? ? ? ? ? ? ? ? ? ? if(ex == NULL)
>> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? goto next;
>> > +
>> > + ? ? ? ? ? ? ? ? ? ? ? ee_block = le32_to_cpu(ex->ee_block);
>> > + ? ? ? ? ? ? ? ? ? ? ? ee_len = ext4_ext_get_actual_len(ex);
>> > +
>> > + ? ? ? ? ? ? ? }
>> > +
>> > + ? ? ? ? ? ? ? /* If the extent exceeds len, split the extent to remove the tail */
>> > + ? ? ? ? ? ? ? if(ee_len > len){
>> > + ? ? ? ? ? ? ? ? ? ? ? err = ext4_split_extents(handle, inode, lblock+len, path, 0);
>> > + ? ? ? ? ? ? ? ? ? ? ? if(err)
>> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? goto next;
>> > +
>> > + ? ? ? ? ? ? ? ? ? ? ? ee_len = ext4_ext_get_actual_len(ex);
>> > + ? ? ? ? ? ? ? }
>> > +
>> > + ? ? ? ? ? ? ? /* Mark the extent uninitialized */
>> > + ? ? ? ? ? ? ? ext4_ext_mark_uninitialized(ex);
>> > +
>> > + ? ? ? ? ? ? ? iblock+=ex->ee_len;
>> > + ? ? ? ? ? ? ? ret+=ex->ee_len;
>> > + ? ? ? ? ? ? ? continue;
>> > +next:
>> > + ? ? ? ? ? ? ? /* If the extent for this block cannot be found, skip it */
>> > + ? ? ? ? ? ? ? iblock++;
>> > + ? ? ? }
>> > +
>> > + ? ? ? return ret;
>> > +}
>> > +
>> > +
>> > ?static void ext4_falloc_update_inode(struct inode *inode,
>> > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?int mode, loff_t new_size, int update_ctime)
>> > ?{
>> > --
>> > 1.7.1
>> >
>> >
>> > --
>> > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
>> > the body of a message to [email protected]
>> > More majordomo info at ?http://vger.kernel.org/majordomo-info.html
>> >
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
>> the body of a message to [email protected]
>> More majordomo info at ?http://vger.kernel.org/majordomo-info.html
>
>
>

2011-03-02 20:23:16

by Dave Chinner

[permalink] [raw]
Subject: Re: [Ext4 punch hole 1/5] Ext4 Punch Hole Support: Convert Blocks to Uninit Exts

On Mon, Feb 28, 2011 at 11:42:48PM -0700, Andreas Dilger wrote:
> On 2011-02-28, at 8:08 PM, Allison Henderson wrote:
> > This first patch adds a function to convert a range of blocks
> > to an uninitialized extent. This function will
> > be used to first convert the blocks to extents before
> > punching them out.
>
> This was proposed as a separate function for FALLOCATE by Dave
> Chinner (based on XFS_IOC_ZERO_RANGE), so this is useful as a
> standalone function.

XFS_IOC_ZERO_RANGE converts a range to unwritten extents, not
uninitialised extents. An uninitialised extent is one that is
allocated but had not data written to it (i.e. contains stale data),
while an unwritten/preallocated extent is guaranteed to contain
zeros. This may be just a terminology issue, but we should try to
use the same jargon across all filesystems...

Cheers,

Dave.
--
Dave Chinner
[email protected]

2011-03-03 00:56:57

by Andreas Dilger

[permalink] [raw]
Subject: Re: [Ext4 punch hole 1/5] Ext4 Punch Hole Support: Convert Blocks to Uninit Exts

On 2011-03-02, at 13:23, Dave Chinner <[email protected]> wrote:
> On Mon, Feb 28, 2011 at 11:42:48PM -0700, Andreas Dilger wrote:
>> On 2011-02-28, at 8:08 PM, Allison Henderson wrote:
>>> This first patch adds a function to convert a range of blocks
>>> to an uninitialized extent. This function will
>>> be used to first convert the blocks to extents before
>>> punching them out.
>>
>> This was proposed as a separate function for FALLOCATE by Dave
>> Chinner (based on XFS_IOC_ZERO_RANGE), so this is useful as a
>> standalone function.
>
> XFS_IOC_ZERO_RANGE converts a range to unwritten extents, not
> uninitialised extents. An uninitialised extent is one that is
> allocated but had not data written to it (i.e. contains stale data),
> while an unwritten/preallocated extent is guaranteed to contain
> zeros.

For ext4 at least there is no distinction at least. Still, it is good to use the right terms so that we are all on the same page.

> This may be just a terminology issue, but we should try to
> use the same jargon across all filesystems...

Cheers, Andreas

2011-03-03 14:36:28

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [Ext4 punch hole 1/5] Ext4 Punch Hole Support: Convert Blocks to Uninit Exts


On Mar 2, 2011, at 3:23 PM, Dave Chinner wrote:

>
> XFS_IOC_ZERO_RANGE converts a range to unwritten extents, not
> uninitialised extents. An uninitialised extent is one that is
> allocated but had not data written to it (i.e. contains stale data),
> while an unwritten/preallocated extent is guaranteed to contain
> zeros. This may be just a terminology issue, but we should try to
> use the same jargon across all filesystems...

What is the difference from the user's perspective? If the read from
from an uninitialized extent, what do they get back? And if the read
from an unwritten extent, what do they get back? In ext4, in both
cases we would return all zero's.

For XFS, why is it important to maintain the distinction between these
two concept?

Just trying to understand,

-- Ted



2011-03-10 05:40:14

by Dave Young

[permalink] [raw]
Subject: Re: [Ext4 punch hole 1/5] Ext4 Punch Hole Support: Convert Blocks to Uninit Exts

On Tue, Mar 1, 2011 at 11:08 AM, Allison Henderson
<[email protected]> wrote:
> Hi All!
>
> This is the patch series for ext4 punch hole that I have been working on.
> There is still a little more debugging that I would like to do with it,
> but I've had some requests to see the patch, so I'm sending it out a bit
> early.  Any feed back is appreciated.  Thx!
>

Hi, aliison

Do you have new version of the patchset which address the comments here?
If they are not ready to send to list, you can send to me, I would like to test.

I tested this patchset for kvm guest image cleanup.
Host ext4 + guest ext4
guest side use lukas's fstrim tool, host side use a modified version
of Christoph's qemu block discard patch

there's many error like following:
[62834.936374] EXT4-fs error (device sda7): ext4_ext_lookup_hole:2907:
inode #1310744: comm qemu-system-x86: bad header/extent: invalid magic
- magic 0, entries 0, max 256(0), depth 256(0)
[62834.936616] EXT4-fs error (device sda7): ext4_ext_lookup_hole:2907:
inode #1310744: comm qemu-system-x86: bad header/extent: invalid magic
- magic 0, entries 0, max 256(0), depth 256(0)
bad header/extent: invalid magic - magic 0, entries 0, max 256(0), depth 256(0)
Then host pc hangs there.

BTW, host xfs + guest ext4 works well for me.

--
Thanks
Dave