2022-06-21 14:41:59

by Eric Whitney

[permalink] [raw]
Subject: [PATCH] ext4: minor defrag code improvements

Modify two error paths returning EBUSY for bad argument file types to
return EOPNOTSUPP instead. Move an extent tree search whose results are
only occasionally required to the site always requiring them for
improved efficiency. Address a few typos.

Signed-off-by: Eric Whitney <[email protected]>
---
fs/ext4/move_extent.c | 16 +++++++---------
1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c
index 701f1d6a217f..4e4b0452106e 100644
--- a/fs/ext4/move_extent.c
+++ b/fs/ext4/move_extent.c
@@ -472,19 +472,17 @@ mext_check_arguments(struct inode *orig_inode,
if (IS_IMMUTABLE(donor_inode) || IS_APPEND(donor_inode))
return -EPERM;

- /* Ext4 move extent does not support swapfile */
+ /* Ext4 move extent does not support swap files */
if (IS_SWAPFILE(orig_inode) || IS_SWAPFILE(donor_inode)) {
- ext4_debug("ext4 move extent: The argument files should "
- "not be swapfile [ino:orig %lu, donor %lu]\n",
+ ext4_debug("ext4 move extent: The argument files should not be swap files [ino:orig %lu, donor %lu]\n",
orig_inode->i_ino, donor_inode->i_ino);
- return -EBUSY;
+ return -EOPNOTSUPP;
}

if (ext4_is_quota_file(orig_inode) && ext4_is_quota_file(donor_inode)) {
- ext4_debug("ext4 move extent: The argument files should "
- "not be quota files [ino:orig %lu, donor %lu]\n",
+ ext4_debug("ext4 move extent: The argument files should not be quota files [ino:orig %lu, donor %lu]\n",
orig_inode->i_ino, donor_inode->i_ino);
- return -EBUSY;
+ return -EOPNOTSUPP;
}

/* Ext4 move extent supports only extent based file */
@@ -631,11 +629,11 @@ ext4_move_extents(struct file *o_filp, struct file *d_filp, __u64 orig_blk,
if (ret)
goto out;
ex = path[path->p_depth].p_ext;
- next_blk = ext4_ext_next_allocated_block(path);
cur_blk = le32_to_cpu(ex->ee_block);
cur_len = ext4_ext_get_actual_len(ex);
/* Check hole before the start pos */
if (cur_blk + cur_len - 1 < o_start) {
+ next_blk = ext4_ext_next_allocated_block(path);
if (next_blk == EXT_MAX_BLOCKS) {
ret = -ENODATA;
goto out;
@@ -663,7 +661,7 @@ ext4_move_extents(struct file *o_filp, struct file *d_filp, __u64 orig_blk,
donor_page_index = d_start >> (PAGE_SHIFT -
donor_inode->i_blkbits);
offset_in_page = o_start % blocks_per_page;
- if (cur_len > blocks_per_page- offset_in_page)
+ if (cur_len > blocks_per_page - offset_in_page)
cur_len = blocks_per_page - offset_in_page;
/*
* Up semaphore to avoid following problems:
--
2.30.2


2022-07-14 11:58:20

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH] ext4: minor defrag code improvements

On Tue 21-06-22 10:33:40, Eric Whitney wrote:
> Modify two error paths returning EBUSY for bad argument file types to
> return EOPNOTSUPP instead. Move an extent tree search whose results are
> only occasionally required to the site always requiring them for
> improved efficiency. Address a few typos.
>
> Signed-off-by: Eric Whitney <[email protected]>

So why is EOPNOTSUPP better than EBUSY? Honestly we are rather inconsistent
with errors returned for various operations on swapfile -
read/write/fallocate/truncate return ETXTBSY, unlink returns EPERM, some
ext4 ioctls return EINVAL... I guess ETXTBSY is the most common return
value?

Otherwise the patch looks good.

Honza

> ---
> fs/ext4/move_extent.c | 16 +++++++---------
> 1 file changed, 7 insertions(+), 9 deletions(-)
>
> diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c
> index 701f1d6a217f..4e4b0452106e 100644
> --- a/fs/ext4/move_extent.c
> +++ b/fs/ext4/move_extent.c
> @@ -472,19 +472,17 @@ mext_check_arguments(struct inode *orig_inode,
> if (IS_IMMUTABLE(donor_inode) || IS_APPEND(donor_inode))
> return -EPERM;
>
> - /* Ext4 move extent does not support swapfile */
> + /* Ext4 move extent does not support swap files */
> if (IS_SWAPFILE(orig_inode) || IS_SWAPFILE(donor_inode)) {
> - ext4_debug("ext4 move extent: The argument files should "
> - "not be swapfile [ino:orig %lu, donor %lu]\n",
> + ext4_debug("ext4 move extent: The argument files should not be swap files [ino:orig %lu, donor %lu]\n",
> orig_inode->i_ino, donor_inode->i_ino);
> - return -EBUSY;
> + return -EOPNOTSUPP;
> }
>
> if (ext4_is_quota_file(orig_inode) && ext4_is_quota_file(donor_inode)) {
> - ext4_debug("ext4 move extent: The argument files should "
> - "not be quota files [ino:orig %lu, donor %lu]\n",
> + ext4_debug("ext4 move extent: The argument files should not be quota files [ino:orig %lu, donor %lu]\n",
> orig_inode->i_ino, donor_inode->i_ino);
> - return -EBUSY;
> + return -EOPNOTSUPP;
> }
>
> /* Ext4 move extent supports only extent based file */
> @@ -631,11 +629,11 @@ ext4_move_extents(struct file *o_filp, struct file *d_filp, __u64 orig_blk,
> if (ret)
> goto out;
> ex = path[path->p_depth].p_ext;
> - next_blk = ext4_ext_next_allocated_block(path);
> cur_blk = le32_to_cpu(ex->ee_block);
> cur_len = ext4_ext_get_actual_len(ex);
> /* Check hole before the start pos */
> if (cur_blk + cur_len - 1 < o_start) {
> + next_blk = ext4_ext_next_allocated_block(path);
> if (next_blk == EXT_MAX_BLOCKS) {
> ret = -ENODATA;
> goto out;
> @@ -663,7 +661,7 @@ ext4_move_extents(struct file *o_filp, struct file *d_filp, __u64 orig_blk,
> donor_page_index = d_start >> (PAGE_SHIFT -
> donor_inode->i_blkbits);
> offset_in_page = o_start % blocks_per_page;
> - if (cur_len > blocks_per_page- offset_in_page)
> + if (cur_len > blocks_per_page - offset_in_page)
> cur_len = blocks_per_page - offset_in_page;
> /*
> * Up semaphore to avoid following problems:
> --
> 2.30.2
>
--
Jan Kara <[email protected]>
SUSE Labs, CR

2022-07-19 21:38:44

by Eric Whitney

[permalink] [raw]
Subject: Re: [PATCH] ext4: minor defrag code improvements

* Jan Kara <[email protected]>:
> On Tue 21-06-22 10:33:40, Eric Whitney wrote:
> > Modify two error paths returning EBUSY for bad argument file types to
> > return EOPNOTSUPP instead. Move an extent tree search whose results are
> > only occasionally required to the site always requiring them for
> > improved efficiency. Address a few typos.
> >
> > Signed-off-by: Eric Whitney <[email protected]>
>
> So why is EOPNOTSUPP better than EBUSY? Honestly we are rather inconsistent
> with errors returned for various operations on swapfile -
> read/write/fallocate/truncate return ETXTBSY, unlink returns EPERM, some
> ext4 ioctls return EINVAL... I guess ETXTBSY is the most common return
> value?
>
> Otherwise the patch looks good.
>
> Honza

Hi Jan - thanks for your review.

I think EOPNOTSUPP is better than EBUSY when EXT4_IOC_MOVE_EXT is applied to
a swap file because it's a much more direct message indicating that ext4
doesn't support swap file defragmentation. With the exception of the two
sites modified by this patch, all the other sites in the defrag code where
checks are made for unsupported file types or file system configurations,
EOPNOTSUPP is returned. Because EBUSY really doesn't seem to convey what's
happening here very well - this isn't so much a case of a potentially
temporarily busy resource as an attempt to perform an operation that will
never succeed - another choice seemed more appropriate. I picked EOPNOTSUPP
simply because it's used in those other sites in the defrag code, and was
trying to be consistent. (The defrag code reports EBUSYs when it can't
release pages with references on them.)

EINVAL could be an alternative. It's not quite as direct but it does
convey the idea that an argument to the ioctl is wrong. The defrag code
uses it (a little inconsistently) when argument values are out of range or
otherwise invalid.

Is ETXTBUSY still reported by the kernel? I couldn't find it in a search after
reading this: lwn.net/Articles/866493/
I didn't consider that because an executable wasn't involved - interesting that
it was used for some operations applied to swap files.

At any rate, I'm open to suggestions here - just trying to clean up a few
things after a little code review.

Thanks,
Eric

>
> > ---
> > fs/ext4/move_extent.c | 16 +++++++---------
> > 1 file changed, 7 insertions(+), 9 deletions(-)
> >
> > diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c
> > index 701f1d6a217f..4e4b0452106e 100644
> > --- a/fs/ext4/move_extent.c
> > +++ b/fs/ext4/move_extent.c
> > @@ -472,19 +472,17 @@ mext_check_arguments(struct inode *orig_inode,
> > if (IS_IMMUTABLE(donor_inode) || IS_APPEND(donor_inode))
> > return -EPERM;
> >
> > - /* Ext4 move extent does not support swapfile */
> > + /* Ext4 move extent does not support swap files */
> > if (IS_SWAPFILE(orig_inode) || IS_SWAPFILE(donor_inode)) {
> > - ext4_debug("ext4 move extent: The argument files should "
> > - "not be swapfile [ino:orig %lu, donor %lu]\n",
> > + ext4_debug("ext4 move extent: The argument files should not be swap files [ino:orig %lu, donor %lu]\n",
> > orig_inode->i_ino, donor_inode->i_ino);
> > - return -EBUSY;
> > + return -EOPNOTSUPP;
> > }
> >
> > if (ext4_is_quota_file(orig_inode) && ext4_is_quota_file(donor_inode)) {
> > - ext4_debug("ext4 move extent: The argument files should "
> > - "not be quota files [ino:orig %lu, donor %lu]\n",
> > + ext4_debug("ext4 move extent: The argument files should not be quota files [ino:orig %lu, donor %lu]\n",
> > orig_inode->i_ino, donor_inode->i_ino);
> > - return -EBUSY;
> > + return -EOPNOTSUPP;
> > }
> >
> > /* Ext4 move extent supports only extent based file */
> > @@ -631,11 +629,11 @@ ext4_move_extents(struct file *o_filp, struct file *d_filp, __u64 orig_blk,
> > if (ret)
> > goto out;
> > ex = path[path->p_depth].p_ext;
> > - next_blk = ext4_ext_next_allocated_block(path);
> > cur_blk = le32_to_cpu(ex->ee_block);
> > cur_len = ext4_ext_get_actual_len(ex);
> > /* Check hole before the start pos */
> > if (cur_blk + cur_len - 1 < o_start) {
> > + next_blk = ext4_ext_next_allocated_block(path);
> > if (next_blk == EXT_MAX_BLOCKS) {
> > ret = -ENODATA;
> > goto out;
> > @@ -663,7 +661,7 @@ ext4_move_extents(struct file *o_filp, struct file *d_filp, __u64 orig_blk,
> > donor_page_index = d_start >> (PAGE_SHIFT -
> > donor_inode->i_blkbits);
> > offset_in_page = o_start % blocks_per_page;
> > - if (cur_len > blocks_per_page- offset_in_page)
> > + if (cur_len > blocks_per_page - offset_in_page)
> > cur_len = blocks_per_page - offset_in_page;
> > /*
> > * Up semaphore to avoid following problems:
> > --
> > 2.30.2
> >
> --
> Jan Kara <[email protected]>
> SUSE Labs, CR

2022-07-20 15:12:44

by Eric Whitney

[permalink] [raw]
Subject: Re: [PATCH] ext4: minor defrag code improvements

* Eric Whitney <[email protected]>:
> * Jan Kara <[email protected]>:
> > On Tue 21-06-22 10:33:40, Eric Whitney wrote:
> > > Modify two error paths returning EBUSY for bad argument file types to
> > > return EOPNOTSUPP instead. Move an extent tree search whose results are
> > > only occasionally required to the site always requiring them for
> > > improved efficiency. Address a few typos.
> > >
> > > Signed-off-by: Eric Whitney <[email protected]>
> >
> > So why is EOPNOTSUPP better than EBUSY? Honestly we are rather inconsistent
> > with errors returned for various operations on swapfile -
> > read/write/fallocate/truncate return ETXTBSY, unlink returns EPERM, some
> > ext4 ioctls return EINVAL... I guess ETXTBSY is the most common return
> > value?
> >
> > Otherwise the patch looks good.
> >
> > Honza
>
> Hi Jan - thanks for your review.
>
> I think EOPNOTSUPP is better than EBUSY when EXT4_IOC_MOVE_EXT is applied to
> a swap file because it's a much more direct message indicating that ext4
> doesn't support swap file defragmentation. With the exception of the two
> sites modified by this patch, all the other sites in the defrag code where
> checks are made for unsupported file types or file system configurations,
> EOPNOTSUPP is returned. Because EBUSY really doesn't seem to convey what's
> happening here very well - this isn't so much a case of a potentially
> temporarily busy resource as an attempt to perform an operation that will
> never succeed - another choice seemed more appropriate. I picked EOPNOTSUPP
> simply because it's used in those other sites in the defrag code, and was
> trying to be consistent. (The defrag code reports EBUSYs when it can't
> release pages with references on them.)
>
> EINVAL could be an alternative. It's not quite as direct but it does
> convey the idea that an argument to the ioctl is wrong. The defrag code
> uses it (a little inconsistently) when argument values are out of range or
> otherwise invalid.
>
> Is ETXTBUSY still reported by the kernel? I couldn't find it in a search after
> reading this: lwn.net/Articles/866493/
> I didn't consider that because an executable wasn't involved - interesting that
> it was used for some operations applied to swap files.

And of course I botched my search - ETXTBSY, not ETXTBUSY. 50+ instances
remain in the kernel.

So, EOPNOTSUPP or EINVAL would be a clearer indication that the move extents
ioctl is being applied to a file type it can't defrag. I can go with ETXTBSY
if you really prefer - any of these is better than EBUSY in this case, IMHO.

Eric

>
> At any rate, I'm open to suggestions here - just trying to clean up a few
> things after a little code review.
>
> Thanks,
> Eric
>
> >
> > > ---
> > > fs/ext4/move_extent.c | 16 +++++++---------
> > > 1 file changed, 7 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c
> > > index 701f1d6a217f..4e4b0452106e 100644
> > > --- a/fs/ext4/move_extent.c
> > > +++ b/fs/ext4/move_extent.c
> > > @@ -472,19 +472,17 @@ mext_check_arguments(struct inode *orig_inode,
> > > if (IS_IMMUTABLE(donor_inode) || IS_APPEND(donor_inode))
> > > return -EPERM;
> > >
> > > - /* Ext4 move extent does not support swapfile */
> > > + /* Ext4 move extent does not support swap files */
> > > if (IS_SWAPFILE(orig_inode) || IS_SWAPFILE(donor_inode)) {
> > > - ext4_debug("ext4 move extent: The argument files should "
> > > - "not be swapfile [ino:orig %lu, donor %lu]\n",
> > > + ext4_debug("ext4 move extent: The argument files should not be swap files [ino:orig %lu, donor %lu]\n",
> > > orig_inode->i_ino, donor_inode->i_ino);
> > > - return -EBUSY;
> > > + return -EOPNOTSUPP;
> > > }
> > >
> > > if (ext4_is_quota_file(orig_inode) && ext4_is_quota_file(donor_inode)) {
> > > - ext4_debug("ext4 move extent: The argument files should "
> > > - "not be quota files [ino:orig %lu, donor %lu]\n",
> > > + ext4_debug("ext4 move extent: The argument files should not be quota files [ino:orig %lu, donor %lu]\n",
> > > orig_inode->i_ino, donor_inode->i_ino);
> > > - return -EBUSY;
> > > + return -EOPNOTSUPP;
> > > }
> > >
> > > /* Ext4 move extent supports only extent based file */
> > > @@ -631,11 +629,11 @@ ext4_move_extents(struct file *o_filp, struct file *d_filp, __u64 orig_blk,
> > > if (ret)
> > > goto out;
> > > ex = path[path->p_depth].p_ext;
> > > - next_blk = ext4_ext_next_allocated_block(path);
> > > cur_blk = le32_to_cpu(ex->ee_block);
> > > cur_len = ext4_ext_get_actual_len(ex);
> > > /* Check hole before the start pos */
> > > if (cur_blk + cur_len - 1 < o_start) {
> > > + next_blk = ext4_ext_next_allocated_block(path);
> > > if (next_blk == EXT_MAX_BLOCKS) {
> > > ret = -ENODATA;
> > > goto out;
> > > @@ -663,7 +661,7 @@ ext4_move_extents(struct file *o_filp, struct file *d_filp, __u64 orig_blk,
> > > donor_page_index = d_start >> (PAGE_SHIFT -
> > > donor_inode->i_blkbits);
> > > offset_in_page = o_start % blocks_per_page;
> > > - if (cur_len > blocks_per_page- offset_in_page)
> > > + if (cur_len > blocks_per_page - offset_in_page)
> > > cur_len = blocks_per_page - offset_in_page;
> > > /*
> > > * Up semaphore to avoid following problems:
> > > --
> > > 2.30.2
> > >
> > --
> > Jan Kara <[email protected]>
> > SUSE Labs, CR

2022-07-22 03:07:43

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] ext4: minor defrag code improvements

On Wed, Jul 20, 2022 at 11:11:38AM -0400, Eric Whitney wrote:
> > Is ETXTBUSY still reported by the kernel? I couldn't find it in a search after
> > reading this: lwn.net/Articles/866493/
> > I didn't consider that because an executable wasn't involved - interesting that
> > it was used for some operations applied to swap files.

The LWN article is specifically about whether it's worth it to block
writes to executable files.

However, if you look at some places where ETXTBSY is returned, such as
in fs/open.c and fs/read_write.c, it's being returned when there is
attempt to operate on a swap file using fallocate(2), write(2) or
copy_file(2). So I agree with Jan that it's better for the defrag
code to be consistent those uses of ETXTBSY.

I'll also add that, "busy" does make some sense as a concept, since if
you run "swapoff", you can now defrag the file, since it's no longer
being used as a swap file --- hence, it's no longer busy. So I don't
have as visceral reaction to using EBUSY, but given the other ways
defrag might return EBUSY where it *would* make sense to retry the
defrag, I agree that changing the error return in the case of an
attempted defrag of a swap file to ETXTBSY makes sense.

- Ted

2022-07-22 14:47:17

by Eric Whitney

[permalink] [raw]
Subject: Re: [PATCH] ext4: minor defrag code improvements

* Theodore Ts'o <[email protected]>:
> On Wed, Jul 20, 2022 at 11:11:38AM -0400, Eric Whitney wrote:
> > > Is ETXTBUSY still reported by the kernel? I couldn't find it in a search after
> > > reading this: lwn.net/Articles/866493/
> > > I didn't consider that because an executable wasn't involved - interesting that
> > > it was used for some operations applied to swap files.
>
> The LWN article is specifically about whether it's worth it to block
> writes to executable files.
>
> However, if you look at some places where ETXTBSY is returned, such as
> in fs/open.c and fs/read_write.c, it's being returned when there is
> attempt to operate on a swap file using fallocate(2), write(2) or
> copy_file(2). So I agree with Jan that it's better for the defrag
> code to be consistent those uses of ETXTBSY.
>
> I'll also add that, "busy" does make some sense as a concept, since if
> you run "swapoff", you can now defrag the file, since it's no longer
> being used as a swap file --- hence, it's no longer busy. So I don't
> have as visceral reaction to using EBUSY, but given the other ways
> defrag might return EBUSY where it *would* make sense to retry the
> defrag, I agree that changing the error return in the case of an
> attempted defrag of a swap file to ETXTBSY makes sense.
>
> - Ted

Thanks for your review. I'll modify the patch to return ETXTBSY and repost.

Eric