Hi!
Here's v3 of the fix for the fast commit enqueueing bug triggered by fstest
generic/047. This version simplifies the previous one by re-using the
i_sync_tid field in struct ext4_inode_info instead of adding a new one.
The second patch adds a few fixes to the tid_t type handling. Jan pointed
out that this sequence number may wrap, and I quickly found a few places in
the code where the tid_geq() and tid_gt() helpers had to be used.
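To illustrate why a plain comparison is not enough once the sequence number
wraps, here is a small userspace sketch modeled on the jbd2 tid_gt() and
tid_geq() helpers. The typedef and the fixed-width casts are assumptions made
so the example stands alone; the kernel versions live in
include/linux/jbd2.h and may differ in detail.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel's tid_t (assumed here to be a 32-bit counter). */
typedef uint32_t tid_t;

/*
 * True if x is logically newer than y, even if the counter wrapped in
 * between. The unsigned difference is reinterpreted as signed, so the
 * result is meaningful as long as the two values are less than 2^31
 * apart. The cast assumes two's complement, as on mainstream compilers.
 */
static bool tid_gt(tid_t x, tid_t y)
{
	int32_t difference = (int32_t)(uint32_t)(x - y);

	return difference > 0;
}

/* Same idea, but also true when x and y are equal. */
static bool tid_geq(tid_t x, tid_t y)
{
	int32_t difference = (int32_t)(uint32_t)(x - y);

	return difference >= 0;
}
```

With these, tid_gt(2, 0xfffffffe) is true, because 2 is four steps past
0xfffffffe after the wrap, whereas the plain expression 2 > 0xfffffffe is
false.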
Again, please note that this fix requires [1] to be applied too.
[1] https://lore.kernel.org/all/[email protected]
Luis Henriques (SUSE) (2):
ext4: fix fast commit inode enqueueing during a full journal commit
ext4: fix possible tid_t sequence overflows
fs/ext4/fast_commit.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)
When a full journal commit is on-going, any fast commit has to be enqueued
into a different queue: FC_Q_STAGING instead of FC_Q_MAIN. This enqueueing
is done only once, i.e. if an inode is already queued in a previous fast
commit entry it won't be enqueued again. However, if a full commit starts
_after_ the inode is enqueued into FC_Q_MAIN, the next fast commit needs to
be done into FC_Q_STAGING. And this is not being done in function
ext4_fc_track_template().
This patch fixes the issue by re-enqueuing an inode into the STAGING queue
during the fast commit clean-up callback if it has a tid (i_sync_tid)
greater than the one being handled. The STAGING queue will then be spliced
back into MAIN.
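To make the queue handling concrete, below is a simplified userspace model of
the clean-up step described above. It is only a sketch: the toy_* names and
the singly linked queue are hypothetical stand-ins for ext4's list_head based
s_fc_q[] queues, and the dirty flag stands in for the state that
ext4_fc_reset_inode() clears. The tid comparison is the plain one used in
this patch and ignores wrap-around.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { FC_Q_MAIN, FC_Q_STAGING, FC_Q_COUNT };

struct toy_inode {
	uint32_t sync_tid;	/* models i_sync_tid */
	bool dirty;		/* cleared where the kernel would reset the inode */
	struct toy_inode *next;
};

struct toy_queue {
	struct toy_inode *head, *tail;
};

/* Append an inode at the tail of a queue. */
static void toy_add_tail(struct toy_queue *q, struct toy_inode *in)
{
	in->next = NULL;
	if (q->tail)
		q->tail->next = in;
	else
		q->head = in;
	q->tail = in;
}

/* Detach and return the first inode, or NULL if the queue is empty. */
static struct toy_inode *toy_pop(struct toy_queue *q)
{
	struct toy_inode *in = q->head;

	if (!in)
		return NULL;
	q->head = in->next;
	if (!q->head)
		q->tail = NULL;
	in->next = NULL;
	return in;
}

/*
 * Model of the clean-up callback for a commit with the given tid:
 * inodes fully covered by that commit are reset, newer ones are parked
 * on STAGING, and STAGING is finally spliced back into MAIN.
 */
static void toy_cleanup(struct toy_queue q[FC_Q_COUNT], uint32_t tid)
{
	struct toy_inode *in;

	while ((in = toy_pop(&q[FC_Q_MAIN]))) {
		if (in->sync_tid <= tid)
			in->dirty = false;
		else
			toy_add_tail(&q[FC_Q_STAGING], in);
	}

	/* Splice STAGING back into MAIN, leaving STAGING empty. */
	while ((in = toy_pop(&q[FC_Q_STAGING])))
		toy_add_tail(&q[FC_Q_MAIN], in);
}
```

In this model, an inode with sync_tid 12 that sits on MAIN while commit 10 is
cleaned up stays dirty and ends up back on MAIN, instead of silently dropping
off both queues.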
This bug was found using fstest generic/047. This test creates several
32k-byte files, sync'ing each of them after its creation, and then shuts
down the filesystem. Some data may be lost in this operation; for example, a
file may have its size truncated to zero.
Signed-off-by: Luis Henriques (SUSE) <[email protected]>
---
fs/ext4/fast_commit.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index 87c009e0c59a..088bd509b116 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -1282,8 +1282,17 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
 		list_del_init(&iter->i_fc_list);
 		ext4_clear_inode_state(&iter->vfs_inode,
 				       EXT4_STATE_FC_COMMITTING);
-		if (iter->i_sync_tid <= tid)
+		if (iter->i_sync_tid <= tid) {
 			ext4_fc_reset_inode(&iter->vfs_inode);
+		} else {
+			/*
+			 * re-enqueue inode into STAGING, which later will be
+			 * splice back into MAIN
+			 */
+			list_add_tail(&EXT4_I(&iter->vfs_inode)->i_fc_list,
+				      &sbi->s_fc_q[FC_Q_STAGING]);
+		}
+
 		/* Make sure EXT4_STATE_FC_COMMITTING bit is clear */
 		smp_mb();
 #if (BITS_PER_LONG < 64)
On Wed 29-05-24 10:20:29, Luis Henriques (SUSE) wrote:
> When a full journal commit is on-going, any fast commit has to be enqueued
> into a different queue: FC_Q_STAGING instead of FC_Q_MAIN. This enqueueing
> is done only once, i.e. if an inode is already queued in a previous fast
> commit entry it won't be enqueued again. However, if a full commit starts
> _after_ the inode is enqueued into FC_Q_MAIN, the next fast commit needs to
> be done into FC_Q_STAGING. And this is not being done in function
> ext4_fc_track_template().
>
> This patch fixes the issue by re-enqueuing an inode into the STAGING queue
> during the fast commit clean-up callback if it has a tid (i_sync_tid)
> greater than the one being handled. The STAGING queue will then be spliced
> back into MAIN.
>
> This bug was found using fstest generic/047. This test creates several
> 32k-byte files, sync'ing each of them after its creation, and then shuts
> down the filesystem. Some data may be lost in this operation; for example, a
> file may have its size truncated to zero.
>
> Signed-off-by: Luis Henriques (SUSE) <[email protected]>
Looks good to me! Feel free to add:
Reviewed-by: Jan Kara <[email protected]>
Just a typo correction below.
> diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
> index 87c009e0c59a..088bd509b116 100644
> --- a/fs/ext4/fast_commit.c
> +++ b/fs/ext4/fast_commit.c
> @@ -1282,8 +1282,17 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
>  		list_del_init(&iter->i_fc_list);
>  		ext4_clear_inode_state(&iter->vfs_inode,
>  				       EXT4_STATE_FC_COMMITTING);
> -		if (iter->i_sync_tid <= tid)
> +		if (iter->i_sync_tid <= tid) {
>  			ext4_fc_reset_inode(&iter->vfs_inode);
> +		} else {
> +			/*
> +			 * re-enqueue inode into STAGING, which later will be
> +			 * splice back into MAIN
			   ^^^ spliced
> +			 */
> +			list_add_tail(&EXT4_I(&iter->vfs_inode)->i_fc_list,
> +				      &sbi->s_fc_q[FC_Q_STAGING]);
> +		}
> +
Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
Looks good!
Reviewed-by: Harshad Shirwadkar <[email protected]>
On Wed, May 29, 2024 at 2:50 AM Jan Kara <[email protected]> wrote:
>
> On Wed 29-05-24 10:20:29, Luis Henriques (SUSE) wrote:
> > When a full journal commit is on-going, any fast commit has to be enqueued
> > into a different queue: FC_Q_STAGING instead of FC_Q_MAIN. This enqueueing
> > is done only once, i.e. if an inode is already queued in a previous fast
> > commit entry it won't be enqueued again. However, if a full commit starts
> > _after_ the inode is enqueued into FC_Q_MAIN, the next fast commit needs to
> > be done into FC_Q_STAGING. And this is not being done in function
> > ext4_fc_track_template().
> >
> > This patch fixes the issue by re-enqueuing an inode into the STAGING queue
> > during the fast commit clean-up callback if it has a tid (i_sync_tid)
> > greater than the one being handled. The STAGING queue will then be spliced
> > back into MAIN.
> >
> > This bug was found using fstest generic/047. This test creates several
> > 32k-byte files, sync'ing each of them after its creation, and then shuts
> > down the filesystem. Some data may be lost in this operation; for example, a
> > file may have its size truncated to zero.
> >
> > Signed-off-by: Luis Henriques (SUSE) <[email protected]>
>
> Looks good to me! Feel free to add:
>
> Reviewed-by: Jan Kara <[email protected]>
>
> Just a typo correction below.
>
> > diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
> > index 87c009e0c59a..088bd509b116 100644
> > --- a/fs/ext4/fast_commit.c
> > +++ b/fs/ext4/fast_commit.c
> > @@ -1282,8 +1282,17 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
> >  		list_del_init(&iter->i_fc_list);
> >  		ext4_clear_inode_state(&iter->vfs_inode,
> >  				       EXT4_STATE_FC_COMMITTING);
> > -		if (iter->i_sync_tid <= tid)
> > +		if (iter->i_sync_tid <= tid) {
> >  			ext4_fc_reset_inode(&iter->vfs_inode);
> > +		} else {
> > +			/*
> > +			 * re-enqueue inode into STAGING, which later will be
> > +			 * splice back into MAIN
> 			   ^^^ spliced
>
> > +			 */
> > +			list_add_tail(&EXT4_I(&iter->vfs_inode)->i_fc_list,
> > +				      &sbi->s_fc_q[FC_Q_STAGING]);
> > +		}
> > +
>
> Honza
> --
> Jan Kara <[email protected]>
> SUSE Labs, CR