2022-07-27 14:52:05

by Jeff Layton

Subject: [PATCH v2] ext4: unconditionally enable the i_version counter

The original i_version implementation was pretty expensive, requiring a
log flush on every change. Because of this, it was gated behind a mount
option (implemented via the MS_I_VERSION mount option flag).

Commit ae5e165d855d ("fs: new API for handling inode->i_version") made
the i_version counter much less expensive, so there is no longer a
performance penalty from enabling it. xfs and btrfs already enable it
unconditionally when the on-disk format can support it.

Have ext4 ignore the SB_I_VERSION flag, and just enable it
unconditionally. While we're in here, remove the handling of
Opt_i_version as well, since we're almost to 5.20 anyway.

Ideally, we'd couple this change with a way to disable the i_version
counter (just in case), but the way the iversion mount option was
implemented makes that difficult to do. We'd need to add a new mount
option altogether or do something with tune2fs. That's probably best
left to later patches if it turns out to be needed.

Cc: Dave Chinner <[email protected]>
Cc: Lukas Czerner <[email protected]>
Cc: Benjamin Coddington <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Darrick J. Wong <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
---
fs/ext4/inode.c | 5 ++---
fs/ext4/super.c | 13 ++++---------
2 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 84c0eb55071d..c785c0b72116 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5411,7 +5411,7 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
return -EINVAL;
}

- if (IS_I_VERSION(inode) && attr->ia_size != inode->i_size)
+ if (attr->ia_size != inode->i_size)
inode_inc_iversion(inode);

if (shrink) {
@@ -5717,8 +5717,7 @@ int ext4_mark_iloc_dirty(handle_t *handle,
}
ext4_fc_track_inode(handle, inode);

- if (IS_I_VERSION(inode))
- inode_inc_iversion(inode);
+ inode_inc_iversion(inode);

/* the do_update_inode consumes one bh->b_count */
get_bh(iloc->bh);
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 845f2f8aee5f..4b06f394d7d1 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1585,7 +1585,7 @@ enum {
Opt_inlinecrypt,
Opt_usrjquota, Opt_grpjquota, Opt_quota,
Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err,
- Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version,
+ Opt_usrquota, Opt_grpquota, Opt_prjquota,
Opt_dax, Opt_dax_always, Opt_dax_inode, Opt_dax_never,
Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error,
Opt_nowarn_on_error, Opt_mblk_io_submit, Opt_debug_want_extra_isize,
@@ -1694,7 +1694,6 @@ static const struct fs_parameter_spec ext4_param_specs[] = {
fsparam_flag ("barrier", Opt_barrier),
fsparam_u32 ("barrier", Opt_barrier),
fsparam_flag ("nobarrier", Opt_nobarrier),
- fsparam_flag ("i_version", Opt_i_version),
fsparam_flag ("dax", Opt_dax),
fsparam_enum ("dax", Opt_dax_type, ext4_param_dax),
fsparam_u32 ("stripe", Opt_stripe),
@@ -2140,11 +2139,6 @@ static int ext4_parse_param(struct fs_context *fc, struct fs_parameter *param)
case Opt_abort:
ctx_set_mount_flag(ctx, EXT4_MF_FS_ABORTED);
return 0;
- case Opt_i_version:
- ext4_msg(NULL, KERN_WARNING, deprecated_msg, param->key, "5.20");
- ext4_msg(NULL, KERN_WARNING, "Use iversion instead\n");
- ctx_set_flags(ctx, SB_I_VERSION);
- return 0;
case Opt_inlinecrypt:
#ifdef CONFIG_FS_ENCRYPTION_INLINE_CRYPT
ctx_set_flags(ctx, SB_INLINECRYPT);
@@ -2970,8 +2964,6 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
SEQ_OPTS_PRINT("min_batch_time=%u", sbi->s_min_batch_time);
if (nodefs || sbi->s_max_batch_time != EXT4_DEF_MAX_BATCH_TIME)
SEQ_OPTS_PRINT("max_batch_time=%u", sbi->s_max_batch_time);
- if (sb->s_flags & SB_I_VERSION)
- SEQ_OPTS_PUTS("i_version");
if (nodefs || sbi->s_stripe)
SEQ_OPTS_PRINT("stripe=%lu", sbi->s_stripe);
if (nodefs || EXT4_MOUNT_DATA_FLAGS &
@@ -4630,6 +4622,9 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
sb->s_flags = (sb->s_flags & ~SB_POSIXACL) |
(test_opt(sb, POSIX_ACL) ? SB_POSIXACL : 0);

+ /* i_version is always enabled now */
+ sb->s_flags |= SB_I_VERSION;
+
if (le32_to_cpu(es->s_rev_level) == EXT4_GOOD_OLD_REV &&
(ext4_has_compat_features(sb) ||
ext4_has_ro_compat_features(sb) ||
--
2.37.1
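
The commit message above says the newer i_version API made the counter
much cheaper. As a rough userspace model of why that can be so (an
assumption-laden sketch, not the kernel code): the low bit of the stored
value records whether anyone has read the counter, and an ordinary change
only bumps the counter when that bit is set. The names model_inode,
model_maybe_inc, model_query, QUERIED_FLAG and VERSION_STEP are
hypothetical and exist only for this illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model only; every name here is hypothetical. */
#define QUERIED_FLAG 1ULL	/* low bit: the counter has been read */
#define VERSION_STEP 2ULL	/* the counter proper lives in bits 1..63 */

struct model_inode {
	_Atomic uint64_t version;
};

/*
 * Bump the counter only if forced, or if it was queried since the last
 * bump. Returns true when the stored value changed.
 */
static bool model_maybe_inc(struct model_inode *ino, bool force)
{
	uint64_t cur = atomic_load(&ino->version);

	for (;;) {
		if (!force && !(cur & QUERIED_FLAG))
			return false;	/* nobody looked: skip the update */

		/* Advance the counter and clear the queried flag. */
		uint64_t next = (cur & ~QUERIED_FLAG) + VERSION_STEP;

		if (atomic_compare_exchange_weak(&ino->version, &cur, next))
			return true;
		/* A failed exchange reloaded cur; retry with the new value. */
	}
}

/* Read the counter and mark it queried so the next change is recorded. */
static uint64_t model_query(struct model_inode *ino)
{
	uint64_t cur = atomic_load(&ino->version);

	while (!(cur & QUERIED_FLAG)) {
		if (atomic_compare_exchange_weak(&ino->version, &cur,
						 cur | QUERIED_FLAG)) {
			cur |= QUERIED_FLAG;
			break;
		}
	}
	return cur >> 1;
}
```

In this model, a run of changes with no reader in between costs nothing
after the first skipped update, which is the property that makes enabling
the counter unconditionally plausible.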


2022-07-27 16:01:29

by Jeff Layton

Subject: Re: [PATCH v2] ext4: unconditionally enable the i_version counter

On Wed, 2022-07-27 at 11:48 -0400, Benjamin Coddington wrote:
> On 27 Jul 2022, at 10:37, Jeff Layton wrote:
>
> > The original i_version implementation was pretty expensive, requiring a
> > log flush on every change. Because of this, it was gated behind a mount
> > option (implemented via the MS_I_VERSION mount option flag).
> >
> > Commit ae5e165d855d ("fs: new API for handling inode->i_version") made
> > the i_version counter much less expensive, so there is no longer a
> > performance penalty from enabling it. xfs and btrfs already enable it
> > unconditionally when the on-disk format can support it.
> >
> > Have ext4 ignore the SB_I_VERSION flag, and just enable it
> > unconditionally. While we're in here, remove the handling of
> > Opt_i_version as well, since we're almost to 5.20 anyway.
> >
> > Ideally, we'd couple this change with a way to disable the i_version
> > counter (just in case), but the way the iversion mount option was
> > implemented makes that difficult to do. We'd need to add a new mount
> > option altogether or do something with tune2fs. That's probably best
> > left to later patches if it turns out to be needed.
> >
> > Cc: Dave Chinner <[email protected]>
> > Cc: Lukas Czerner <[email protected]>
> > Cc: Benjamin Coddington <[email protected]>
> > Cc: Christoph Hellwig <[email protected]>
> > Cc: Darrick J. Wong <[email protected]>
> > Signed-off-by: Jeff Layton <[email protected]>
> > ---
> > fs/ext4/inode.c | 5 ++---
> > fs/ext4/super.c | 13 ++++---------
> > 2 files changed, 6 insertions(+), 12 deletions(-)
> >
> > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > index 84c0eb55071d..c785c0b72116 100644
> > --- a/fs/ext4/inode.c
> > +++ b/fs/ext4/inode.c
> > @@ -5411,7 +5411,7 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
> > return -EINVAL;
> > }
> >
> > - if (IS_I_VERSION(inode) && attr->ia_size != inode->i_size)
> > + if (attr->ia_size != inode->i_size)
> > inode_inc_iversion(inode);
> >
> > if (shrink) {
> > @@ -5717,8 +5717,7 @@ int ext4_mark_iloc_dirty(handle_t *handle,
> > }
> > ext4_fc_track_inode(handle, inode);
> >
> > - if (IS_I_VERSION(inode))
> > - inode_inc_iversion(inode);
> > + inode_inc_iversion(inode);
> >
> > /* the do_update_inode consumes one bh->b_count */
> > get_bh(iloc->bh);
> > diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> > index 845f2f8aee5f..4b06f394d7d1 100644
> > --- a/fs/ext4/super.c
> > +++ b/fs/ext4/super.c
> > @@ -1585,7 +1585,7 @@ enum {
> > Opt_inlinecrypt,
> > Opt_usrjquota, Opt_grpjquota, Opt_quota,
> > Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err,
> > - Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version,
> > + Opt_usrquota, Opt_grpquota, Opt_prjquota,
> > Opt_dax, Opt_dax_always, Opt_dax_inode, Opt_dax_never,
> > Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error,
> > Opt_nowarn_on_error, Opt_mblk_io_submit, Opt_debug_want_extra_isize,
> > @@ -1694,7 +1694,6 @@ static const struct fs_parameter_spec ext4_param_specs[] = {
> > fsparam_flag ("barrier", Opt_barrier),
> > fsparam_u32 ("barrier", Opt_barrier),
> > fsparam_flag ("nobarrier", Opt_nobarrier),
> > - fsparam_flag ("i_version", Opt_i_version),
>
> We've got to keep the parameter, I think, else we'll break existing
> setups with the i_version mount option.
>

It had already been announced that the above mount option would be
removed by v5.20 (which Darrick pointed out). We might as well drop it
here since this likely wouldn't be merged before then anyway.

The "iversion" mount option is parsed in the userland mount program, and
gets turned into the MS_I_VERSION flag for the mount syscall. That will
still be done, though with this change, the kernel should now just
ignore it.
--
Jeff Layton <[email protected]>
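
To make the userland side of the reply above concrete, here is a minimal
sketch of how a mount program might fold "iversion" and "noiversion"
from a comma-separated option string into the flags word passed to
mount(2). The helper model_parse_iversion is hypothetical and is not
taken from util-linux; the fallback definition of MS_I_VERSION is only
there in case the libc headers do not provide it.

```c
#include <string.h>
#include <sys/mount.h>

#ifndef MS_I_VERSION
#define MS_I_VERSION (1UL << 23)	/* fallback if libc headers lack it */
#endif

/*
 * Hypothetical helper: scan a comma-separated option string and set or
 * clear MS_I_VERSION in flags. All other options are left untouched.
 */
static unsigned long model_parse_iversion(const char *opts,
					  unsigned long flags)
{
	const char *p = opts;

	while (*p) {
		size_t len = strcspn(p, ",");	/* length of this option */

		if (len == 8 && strncmp(p, "iversion", 8) == 0)
			flags |= MS_I_VERSION;
		else if (len == 10 && strncmp(p, "noiversion", 10) == 0)
			flags &= ~MS_I_VERSION;

		p += len;
		if (*p == ',')
			p++;			/* step over the separator */
	}
	return flags;
}
```

The resulting word would be handed to mount(2) as its mountflags
argument. With the patch applied, ext4 sets SB_I_VERSION itself in
__ext4_fill_super(), so whether this bit arrives set or clear should no
longer matter for ext4.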

2022-07-27 16:07:22

by Benjamin Coddington

Subject: Re: [PATCH v2] ext4: unconditionally enable the i_version counter

On 27 Jul 2022, at 10:37, Jeff Layton wrote:

> The original i_version implementation was pretty expensive, requiring a
> log flush on every change. Because of this, it was gated behind a mount
> option (implemented via the MS_I_VERSION mount option flag).
>
> Commit ae5e165d855d ("fs: new API for handling inode->i_version") made
> the i_version counter much less expensive, so there is no longer a
> performance penalty from enabling it. xfs and btrfs already enable it
> unconditionally when the on-disk format can support it.
>
> Have ext4 ignore the SB_I_VERSION flag, and just enable it
> unconditionally. While we're in here, remove the handling of
> Opt_i_version as well, since we're almost to 5.20 anyway.
>
> Ideally, we'd couple this change with a way to disable the i_version
> counter (just in case), but the way the iversion mount option was
> implemented makes that difficult to do. We'd need to add a new mount
> option altogether or do something with tune2fs. That's probably best
> left to later patches if it turns out to be needed.
>
> Cc: Dave Chinner <[email protected]>
> Cc: Lukas Czerner <[email protected]>
> Cc: Benjamin Coddington <[email protected]>
> Cc: Christoph Hellwig <[email protected]>
> Cc: Darrick J. Wong <[email protected]>
> Signed-off-by: Jeff Layton <[email protected]>
> ---
> fs/ext4/inode.c | 5 ++---
> fs/ext4/super.c | 13 ++++---------
> 2 files changed, 6 insertions(+), 12 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 84c0eb55071d..c785c0b72116 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -5411,7 +5411,7 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
> return -EINVAL;
> }
>
> - if (IS_I_VERSION(inode) && attr->ia_size != inode->i_size)
> + if (attr->ia_size != inode->i_size)
> inode_inc_iversion(inode);
>
> if (shrink) {
> @@ -5717,8 +5717,7 @@ int ext4_mark_iloc_dirty(handle_t *handle,
> }
> ext4_fc_track_inode(handle, inode);
>
> - if (IS_I_VERSION(inode))
> - inode_inc_iversion(inode);
> + inode_inc_iversion(inode);
>
> /* the do_update_inode consumes one bh->b_count */
> get_bh(iloc->bh);
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 845f2f8aee5f..4b06f394d7d1 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -1585,7 +1585,7 @@ enum {
> Opt_inlinecrypt,
> Opt_usrjquota, Opt_grpjquota, Opt_quota,
> Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err,
> - Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version,
> + Opt_usrquota, Opt_grpquota, Opt_prjquota,
> Opt_dax, Opt_dax_always, Opt_dax_inode, Opt_dax_never,
> Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error,
> Opt_nowarn_on_error, Opt_mblk_io_submit, Opt_debug_want_extra_isize,
> @@ -1694,7 +1694,6 @@ static const struct fs_parameter_spec ext4_param_specs[] = {
> fsparam_flag ("barrier", Opt_barrier),
> fsparam_u32 ("barrier", Opt_barrier),
> fsparam_flag ("nobarrier", Opt_nobarrier),
> - fsparam_flag ("i_version", Opt_i_version),

We've got to keep the parameter, I think, else we'll break existing
setups with the i_version mount option.

Ben
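
The concern above can be pictured with a small table-driven lookup. This
is only a sketch under stated assumptions: model_param, model_specs and
model_lookup are hypothetical stand-ins for a filesystem's parameter
table, and returning -EINVAL for an unknown key is an assumption about
how a rejected option would surface, since the thread does not show the
kernel's actual error path.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one entry of a parameter table. */
struct model_param {
	const char *name;
	int opt;
};

static const struct model_param model_specs[] = {
	{ "barrier",   1 },
	{ "nobarrier", 2 },
	/* { "i_version", 3 },  entry dropped, mirroring the patch */
	{ "dax",       4 },
};

/* Return the option id for key, or -EINVAL if the table has no entry. */
static int model_lookup(const char *key)
{
	size_t i;

	for (i = 0; i < sizeof(model_specs) / sizeof(model_specs[0]); i++) {
		if (strcmp(model_specs[i].name, key) == 0)
			return model_specs[i].opt;
	}
	return -EINVAL;
}
```

Under that assumption, a setup that still passes "i_version" would see
its mount rejected once the entry is gone, whereas "iversion" (no
underscore) never reaches such a table because the mount program
converts it to a flag first.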