2022-03-08 23:12:54

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH 3/5] ext4: for committing inode, make ext4_fc_track_inode wait

On Tue 08-03-22 02:51:10, Harshad Shirwadkar wrote:
> From: Harshad Shirwadkar <[email protected]>
>
> If the inode that's being requested to track using ext4_fc_track_inode
> is being committed, then wait until the inode finishes the commit.
>
> Signed-off-by: Harshad Shirwadkar <[email protected]>

Looks mostly good. Just some notes below.

> diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
> index 3477a16d08ae..7fa301b0a35a 100644
> --- a/fs/ext4/ext4_jbd2.c
> +++ b/fs/ext4/ext4_jbd2.c
> @@ -106,6 +106,18 @@ handle_t *__ext4_journal_start_sb(struct super_block *sb, unsigned int line,
> GFP_NOFS, type, line);
> }
>
> +handle_t *__ext4_journal_start(struct inode *inode, unsigned int line,
> + int type, int blocks, int rsv_blocks,
> + int revoke_creds)
> +{
> + handle_t *handle = __ext4_journal_start_sb(inode->i_sb, line,
> + type, blocks, rsv_blocks,
> + revoke_creds);
> + if (ext4_handle_valid(handle) && !IS_ERR(handle))
> + ext4_fc_track_inode(handle, inode);

Why do you need to call ext4_fc_track_inode() here? Calls in
ext4_map_blocks() and ext4_mark_iloc_dirty() should be enough, shouldn't
they?

> + return handle;
> +}
> +
> int __ext4_journal_stop(const char *where, unsigned int line, handle_t *handle)
> {
> struct super_block *sb;

...

> @@ -519,6 +525,33 @@ void ext4_fc_track_inode(handle_t *handle, struct inode *inode)
> return;
> }
>
> + if (!test_opt2(inode->i_sb, JOURNAL_FAST_COMMIT) ||
> + (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY))
> + return;
> +
> + spin_lock(&ei->i_fc_lock);
> + while (ext4_test_inode_state(inode, EXT4_STATE_FC_COMMITTING)) {
> +#if (BITS_PER_LONG < 64)
> + DEFINE_WAIT_BIT(wait, &ei->i_state_flags,
> + EXT4_STATE_FC_COMMITTING);
> + wq = bit_waitqueue(&ei->i_state_flags,
> + EXT4_STATE_FC_COMMITTING);
> +#else
> + DEFINE_WAIT_BIT(wait, &ei->i_flags,
> + EXT4_STATE_FC_COMMITTING);
> + wq = bit_waitqueue(&ei->i_flags,
> + EXT4_STATE_FC_COMMITTING);
> +#endif
> +
> + prepare_to_wait(wq, &wait.wq_entry, TASK_UNINTERRUPTIBLE);
> + spin_unlock(&ei->i_fc_lock);
> +
> + schedule();
> + finish_wait(wq, &wait.wq_entry);
> + spin_lock(&ei->i_fc_lock);
> + }
> + spin_unlock(&ei->i_fc_lock);

Hum, we operate inode state with atomic bitops. So I think there's no real
need for ei->i_fc_lock here. You just need to be careful and check inode
state again after prepare_to_wait() call.

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR


2022-03-09 00:03:31

by harshad shirwadkar

[permalink] [raw]
Subject: Re: [PATCH 3/5] ext4: for committing inode, make ext4_fc_track_inode wait

On Tue, 8 Mar 2022 at 04:30, Jan Kara <[email protected]> wrote:
>
> On Tue 08-03-22 02:51:10, Harshad Shirwadkar wrote:
> > From: Harshad Shirwadkar <[email protected]>
> >
> > If the inode that's being requested to track using ext4_fc_track_inode
> > is being committed, then wait until the inode finishes the commit.
> >
> > Signed-off-by: Harshad Shirwadkar <[email protected]>
>
> Looks mostly good. Just some notes below.
>
> > diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
> > index 3477a16d08ae..7fa301b0a35a 100644
> > --- a/fs/ext4/ext4_jbd2.c
> > +++ b/fs/ext4/ext4_jbd2.c
> > @@ -106,6 +106,18 @@ handle_t *__ext4_journal_start_sb(struct super_block *sb, unsigned int line,
> > GFP_NOFS, type, line);
> > }
> >
> > +handle_t *__ext4_journal_start(struct inode *inode, unsigned int line,
> > + int type, int blocks, int rsv_blocks,
> > + int revoke_creds)
> > +{
> > + handle_t *handle = __ext4_journal_start_sb(inode->i_sb, line,
> > + type, blocks, rsv_blocks,
> > + revoke_creds);
> > + if (ext4_handle_valid(handle) && !IS_ERR(handle))
> > + ext4_fc_track_inode(handle, inode);
>
> Why do you need to call ext4_fc_track_inode() here? Calls in
> ext4_map_blocks() and ext4_mark_iloc_dirty() should be enough, shouldn't
> they?
This is just a precautionary call. ext4_fc_track_inode is an
idempotent function, so it doesn't matter if it gets called multiple
times. This check just covers cases (such as the ones in inline.c)
where ext4_reserve_inode_write() doesn't get called. I saw a few
failures in the log group when I remove this call. The right way to
fix this though is to ensure that ext4_reserve_inode_write() gets
called before every inode update.

>
> > + return handle;
> > +}
> > +
> > int __ext4_journal_stop(const char *where, unsigned int line, handle_t *handle)
> > {
> > struct super_block *sb;
>
> ...
>
> > @@ -519,6 +525,33 @@ void ext4_fc_track_inode(handle_t *handle, struct inode *inode)
> > return;
> > }
> >
> > + if (!test_opt2(inode->i_sb, JOURNAL_FAST_COMMIT) ||
> > + (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY))
> > + return;
> > +
> > + spin_lock(&ei->i_fc_lock);
> > + while (ext4_test_inode_state(inode, EXT4_STATE_FC_COMMITTING)) {
> > +#if (BITS_PER_LONG < 64)
> > + DEFINE_WAIT_BIT(wait, &ei->i_state_flags,
> > + EXT4_STATE_FC_COMMITTING);
> > + wq = bit_waitqueue(&ei->i_state_flags,
> > + EXT4_STATE_FC_COMMITTING);
> > +#else
> > + DEFINE_WAIT_BIT(wait, &ei->i_flags,
> > + EXT4_STATE_FC_COMMITTING);
> > + wq = bit_waitqueue(&ei->i_flags,
> > + EXT4_STATE_FC_COMMITTING);
> > +#endif
> > +
> > + prepare_to_wait(wq, &wait.wq_entry, TASK_UNINTERRUPTIBLE);
> > + spin_unlock(&ei->i_fc_lock);
> > +
> > + schedule();
> > + finish_wait(wq, &wait.wq_entry);
> > + spin_lock(&ei->i_fc_lock);
> > + }
> > + spin_unlock(&ei->i_fc_lock);
>
> Hum, we operate inode state with atomic bitops. So I think there's no real
> need for ei->i_fc_lock here. You just need to be careful and check inode
> state again after prepare_to_wait() call.
Okay that makes sense, I'll do this in V2.

- Harshad
>
> Honza
> --
> Jan Kara <[email protected]>
> SUSE Labs, CR

2022-03-09 10:56:56

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH 3/5] ext4: for committing inode, make ext4_fc_track_inode wait

On Tue 08-03-22 05:06:51, harshad shirwadkar wrote:
> On Tue, 8 Mar 2022 at 04:30, Jan Kara <[email protected]> wrote:
> > > diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
> > > index 3477a16d08ae..7fa301b0a35a 100644
> > > --- a/fs/ext4/ext4_jbd2.c
> > > +++ b/fs/ext4/ext4_jbd2.c
> > > @@ -106,6 +106,18 @@ handle_t *__ext4_journal_start_sb(struct super_block *sb, unsigned int line,
> > > GFP_NOFS, type, line);
> > > }
> > >
> > > +handle_t *__ext4_journal_start(struct inode *inode, unsigned int line,
> > > + int type, int blocks, int rsv_blocks,
> > > + int revoke_creds)
> > > +{
> > > + handle_t *handle = __ext4_journal_start_sb(inode->i_sb, line,
> > > + type, blocks, rsv_blocks,
> > > + revoke_creds);
> > > + if (ext4_handle_valid(handle) && !IS_ERR(handle))
> > > + ext4_fc_track_inode(handle, inode);
> >
> > Why do you need to call ext4_fc_track_inode() here? Calls in
> > ext4_map_blocks() and ext4_mark_iloc_dirty() should be enough, shouldn't
> > they?
> This is just a precautionary call. ext4_fc_track_inode is an
> idempotent function, so it doesn't matter if it gets called multiple
> times. This check just covers cases (such as the ones in inline.c)
> where ext4_reserve_inode_write() doesn't get called. I saw a few
> failures in the log group when I remove this call. The right way to
> fix this though is to ensure that ext4_reserve_inode_write() gets
> called before every inode update.

Oh, you mean like when updating inline data? I'm not sure
ext4_reserve_inode_write() is usable for all the places in inline.c but
yes, you probably need some sprinkling of ext4_fc_track_inode() there. That
would be definitely better than hacking it around in
__ext4_journal_start().

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR