2021-12-23 20:21:54

by harshad shirwadkar

[permalink] [raw]
Subject: [PATCH v2 0/4] ext4 fast commit API cleanup

This patch series fixes up fast commit APIs. There are NO on-disk
format changes introduced in this series. The main contribution of the
series is that it drops fast commit specific transaction APIs and
makes fast commits work with journal transaction APIs of JBD2
journalling system. With these changes, a fast commit eligible
transaction is simply enclosed in calls to "jbd2_journal_start()" and
"jbd2_journal_stop()". If the update that is being performed is fast
commit ineligible, one must simply call ext4_fc_mark_ineligible()
after starting a transaction using "jbd2_journal_start()". The last
patch in the series simplifies fast commit stats recording by moving
it to a different function.

I verified that the patch series introduces no regressions in "quick"
and "log" groups when "fast_commit" feature is enabled.

Changes from V1:
---------------

- In the V1 of the patch series, there's performance regression. With
this patch series, we lock the entire file system from starting any
new handles during (which ensures consistency at the cost of
performance). What we ideally want to do is to lock individual
inodes from starting new updates during a commit. To do so, the V2
of this patch series retains the infrastructure of inode level
transactions (ext4_fc_start/stop_update()). In future (not in this
series), we would build on this infrastructure to lock individual
inodes and drop the file system level locking during the commit path.

Signed-off-by: Harshad Shirwadkar <[email protected]>

Harshad Shirwadkar (4):
ext4: use ext4_journal_start/stop for fast commit transactions
ext4: drop ineligible txn start stop APIs
ext4: simplify updating of fast commit stats
ext4: update fast commit TODOs

fs/ext4/acl.c | 2 -
fs/ext4/ext4.h | 7 +-
fs/ext4/extents.c | 9 +-
fs/ext4/fast_commit.c | 188 ++++++++++++++++--------------------------
fs/ext4/fast_commit.h | 27 +++---
fs/ext4/file.c | 4 -
fs/ext4/inode.c | 7 +-
fs/ext4/ioctl.c | 13 +--
fs/ext4/super.c | 1 -
fs/jbd2/journal.c | 2 +
10 files changed, 96 insertions(+), 164 deletions(-)

--
2.34.1.307.g9b7440fafd-goog



2021-12-23 20:21:55

by harshad shirwadkar

[permalink] [raw]
Subject: [PATCH v2 1/4] ext4: use ext4_journal_start/stop for fast commit transactions

From: Harshad Shirwadkar <[email protected]>

This patch drops all calls to ext4_fc_start_update() and
ext4_fc_stop_update(). To ensure that there are no ongoing journal
updates during fast commit, we also make jbd2_fc_begin_commit() lock
journal for updates. This way we don't have to maintain two different
transaction start stop APIs for fast commit and full commit. This
patch doesn't remove the functions altogether since in future we want
to have inode level locking for fast commits.

Signed-off-by: Harshad Shirwadkar <[email protected]>
---
fs/ext4/acl.c | 2 --
fs/ext4/extents.c | 3 ---
fs/ext4/file.c | 4 ----
fs/ext4/inode.c | 7 +------
fs/ext4/ioctl.c | 10 +---------
fs/jbd2/journal.c | 2 ++
6 files changed, 4 insertions(+), 24 deletions(-)

diff --git a/fs/ext4/acl.c b/fs/ext4/acl.c
index 0613dfcbfd4a..5a35768d6149 100644
--- a/fs/ext4/acl.c
+++ b/fs/ext4/acl.c
@@ -246,7 +246,6 @@ ext4_set_acl(struct user_namespace *mnt_userns, struct inode *inode,
handle = ext4_journal_start(inode, EXT4_HT_XATTR, credits);
if (IS_ERR(handle))
return PTR_ERR(handle);
- ext4_fc_start_update(inode);

if ((type == ACL_TYPE_ACCESS) && acl) {
error = posix_acl_update_mode(mnt_userns, inode, &mode, &acl);
@@ -264,7 +263,6 @@ ext4_set_acl(struct user_namespace *mnt_userns, struct inode *inode,
}
out_stop:
ext4_journal_stop(handle);
- ext4_fc_stop_update(inode);
if (error == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))
goto retry;
return error;
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 0ecf819bf189..703feff8cb8c 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4697,8 +4697,6 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
FALLOC_FL_INSERT_RANGE))
return -EOPNOTSUPP;

- ext4_fc_start_update(inode);
-
if (mode & FALLOC_FL_PUNCH_HOLE) {
ret = ext4_punch_hole(inode, offset, len);
goto exit;
@@ -4762,7 +4760,6 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
inode_unlock(inode);
trace_ext4_fallocate_exit(inode, offset, max_blocks, ret);
exit:
- ext4_fc_stop_update(inode);
return ret;
}

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 4c5f41052351..8cc11715518a 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -259,7 +259,6 @@ static ssize_t ext4_buffered_write_iter(struct kiocb *iocb,
if (iocb->ki_flags & IOCB_NOWAIT)
return -EOPNOTSUPP;

- ext4_fc_start_update(inode);
inode_lock(inode);
ret = ext4_write_checks(iocb, from);
if (ret <= 0)
@@ -271,7 +270,6 @@ static ssize_t ext4_buffered_write_iter(struct kiocb *iocb,

out:
inode_unlock(inode);
- ext4_fc_stop_update(inode);
if (likely(ret > 0)) {
iocb->ki_pos += ret;
ret = generic_write_sync(iocb, ret);
@@ -552,9 +550,7 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from)
goto out;
}

- ext4_fc_start_update(inode);
ret = ext4_orphan_add(handle, inode);
- ext4_fc_stop_update(inode);
if (ret) {
ext4_journal_stop(handle);
goto out;
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index bfd3545f1e5d..82f555d26980 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5320,7 +5320,7 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
if (error)
return error;
}
- ext4_fc_start_update(inode);
+
if ((ia_valid & ATTR_UID && !uid_eq(attr->ia_uid, inode->i_uid)) ||
(ia_valid & ATTR_GID && !gid_eq(attr->ia_gid, inode->i_gid))) {
handle_t *handle;
@@ -5344,7 +5344,6 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,

if (error) {
ext4_journal_stop(handle);
- ext4_fc_stop_update(inode);
return error;
}
/* Update corresponding info in inode so that everything is in
@@ -5356,7 +5355,6 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
error = ext4_mark_inode_dirty(handle, inode);
ext4_journal_stop(handle);
if (unlikely(error)) {
- ext4_fc_stop_update(inode);
return error;
}
}
@@ -5370,12 +5368,10 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);

if (attr->ia_size > sbi->s_bitmap_maxbytes) {
- ext4_fc_stop_update(inode);
return -EFBIG;
}
}
if (!S_ISREG(inode->i_mode)) {
- ext4_fc_stop_update(inode);
return -EINVAL;
}

@@ -5499,7 +5495,6 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
ext4_std_error(inode->i_sb, error);
if (!error)
error = rc;
- ext4_fc_stop_update(inode);
return error;
}

diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 606dee9e08a3..e64a12e1218a 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -743,7 +743,6 @@ int ext4_fileattr_set(struct user_namespace *mnt_userns,
u32 flags = fa->flags;
int err = -EOPNOTSUPP;

- ext4_fc_start_update(inode);
if (flags & ~EXT4_FL_USER_VISIBLE)
goto out;

@@ -764,7 +763,6 @@ int ext4_fileattr_set(struct user_namespace *mnt_userns,
goto out;
err = ext4_ioctl_setproject(inode, fa->fsx_projid);
out:
- ext4_fc_stop_update(inode);
return err;
}

@@ -1273,13 +1271,7 @@ static long __ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)

long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
{
- long ret;
-
- ext4_fc_start_update(file_inode(filp));
- ret = __ext4_ioctl(filp, cmd, arg);
- ext4_fc_stop_update(file_inode(filp));
-
- return ret;
+ return __ext4_ioctl(filp, cmd, arg);
}

#ifdef CONFIG_COMPAT
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 35302bc192eb..0b86a4365b66 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -757,6 +757,7 @@ int jbd2_fc_begin_commit(journal_t *journal, tid_t tid)
}
journal->j_flags |= JBD2_FAST_COMMIT_ONGOING;
write_unlock(&journal->j_state_lock);
+ jbd2_journal_lock_updates(journal);

return 0;
}
@@ -768,6 +769,7 @@ EXPORT_SYMBOL(jbd2_fc_begin_commit);
*/
static int __jbd2_fc_end_commit(journal_t *journal, tid_t tid, bool fallback)
{
+ jbd2_journal_unlock_updates(journal);
if (journal->j_fc_cleanup_callback)
journal->j_fc_cleanup_callback(journal, 0);
write_lock(&journal->j_state_lock);
--
2.34.1.307.g9b7440fafd-goog


2021-12-23 20:21:57

by harshad shirwadkar

[permalink] [raw]
Subject: [PATCH v2 2/4] ext4: drop ineligible txn start stop APIs

From: Harshad Shirwadkar <[email protected]>

This patch drops ext4_fc_start_ineligible() and
ext4_fc_stop_ineligible() APIs. Fast commit ineligible transactions
should simply call ext4_fc_mark_ineligible() after starting the
trasaction.

Signed-off-by: Harshad Shirwadkar <[email protected]>
---
fs/ext4/ext4.h | 6 ++--
fs/ext4/extents.c | 6 ++--
fs/ext4/fast_commit.c | 79 ++++++++-----------------------------------
fs/ext4/ioctl.c | 3 +-
fs/ext4/super.c | 1 -
5 files changed, 20 insertions(+), 75 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 404dd50856e5..d71485d53050 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1725,9 +1725,9 @@ struct ext4_sb_info {
*/
struct work_struct s_error_work;

- /* Ext4 fast commit stuff */
+ /* Ext4 fast commit sub transaction ID */
atomic_t s_fc_subtid;
- atomic_t s_fc_ineligible_updates;
+
/*
* After commit starts, the main queue gets locked, and the further
* updates get added in the staging queue.
@@ -2926,8 +2926,6 @@ void __ext4_fc_track_create(handle_t *handle, struct inode *inode,
void ext4_fc_track_create(handle_t *handle, struct dentry *dentry);
void ext4_fc_track_inode(handle_t *handle, struct inode *inode);
void ext4_fc_mark_ineligible(struct super_block *sb, int reason);
-void ext4_fc_start_ineligible(struct super_block *sb, int reason);
-void ext4_fc_stop_ineligible(struct super_block *sb);
void ext4_fc_start_update(struct inode *inode);
void ext4_fc_stop_update(struct inode *inode);
void ext4_fc_del(struct inode *inode);
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 703feff8cb8c..38111ea18ae1 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -5341,7 +5341,7 @@ static int ext4_collapse_range(struct inode *inode, loff_t offset, loff_t len)
ret = PTR_ERR(handle);
goto out_mmap;
}
- ext4_fc_start_ineligible(sb, EXT4_FC_REASON_FALLOC_RANGE);
+ ext4_fc_mark_ineligible(sb, EXT4_FC_REASON_FALLOC_RANGE);

down_write(&EXT4_I(inode)->i_data_sem);
ext4_discard_preallocations(inode, 0);
@@ -5380,7 +5380,6 @@ static int ext4_collapse_range(struct inode *inode, loff_t offset, loff_t len)

out_stop:
ext4_journal_stop(handle);
- ext4_fc_stop_ineligible(sb);
out_mmap:
filemap_invalidate_unlock(mapping);
out_mutex:
@@ -5482,7 +5481,7 @@ static int ext4_insert_range(struct inode *inode, loff_t offset, loff_t len)
ret = PTR_ERR(handle);
goto out_mmap;
}
- ext4_fc_start_ineligible(sb, EXT4_FC_REASON_FALLOC_RANGE);
+ ext4_fc_mark_ineligible(sb, EXT4_FC_REASON_FALLOC_RANGE);

/* Expand file to avoid data loss if there is error while shifting */
inode->i_size += len;
@@ -5557,7 +5556,6 @@ static int ext4_insert_range(struct inode *inode, loff_t offset, loff_t len)

out_stop:
ext4_journal_stop(handle);
- ext4_fc_stop_ineligible(sb);
out_mmap:
filemap_invalidate_unlock(mapping);
out_mutex:
diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index 0f32b445582a..2771adefdba0 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -65,21 +65,11 @@
*
* Fast Commit Ineligibility
* -------------------------
- * Not all operations are supported by fast commits today (e.g extended
- * attributes). Fast commit ineligibility is marked by calling one of the
- * two following functions:
- *
- * - ext4_fc_mark_ineligible(): This makes next fast commit operation to fall
- * back to full commit. This is useful in case of transient errors.
*
- * - ext4_fc_start_ineligible() and ext4_fc_stop_ineligible() - This makes all
- * the fast commits happening between ext4_fc_start_ineligible() and
- * ext4_fc_stop_ineligible() and one fast commit after the call to
- * ext4_fc_stop_ineligible() to fall back to full commits. It is important to
- * make one more fast commit to fall back to full commit after stop call so
- * that it guaranteed that the fast commit ineligible operation contained
- * within ext4_fc_start_ineligible() and ext4_fc_stop_ineligible() is
- * followed by at least 1 full commit.
+ * Not all operations are supported by fast commits today (e.g extended
+ * attributes). Fast commit ineligibility is marked by calling
+ * ext4_fc_mark_ineligible(): This makes next fast commit operation to fall back
+ * to full commit.
*
* Atomicity of commits
* --------------------
@@ -328,44 +318,6 @@ void ext4_fc_mark_ineligible(struct super_block *sb, int reason)
sbi->s_fc_stats.fc_ineligible_reason_count[reason]++;
}

-/*
- * Start a fast commit ineligible update. Any commits that happen while
- * such an operation is in progress fall back to full commits.
- */
-void ext4_fc_start_ineligible(struct super_block *sb, int reason)
-{
- struct ext4_sb_info *sbi = EXT4_SB(sb);
-
- if (!test_opt2(sb, JOURNAL_FAST_COMMIT) ||
- (EXT4_SB(sb)->s_mount_state & EXT4_FC_REPLAY))
- return;
-
- WARN_ON(reason >= EXT4_FC_REASON_MAX);
- sbi->s_fc_stats.fc_ineligible_reason_count[reason]++;
- atomic_inc(&sbi->s_fc_ineligible_updates);
-}
-
-/*
- * Stop a fast commit ineligible update. We set EXT4_MF_FC_INELIGIBLE flag here
- * to ensure that after stopping the ineligible update, at least one full
- * commit takes place.
- */
-void ext4_fc_stop_ineligible(struct super_block *sb)
-{
- if (!test_opt2(sb, JOURNAL_FAST_COMMIT) ||
- (EXT4_SB(sb)->s_mount_state & EXT4_FC_REPLAY))
- return;
-
- ext4_set_mount_flag(sb, EXT4_MF_FC_INELIGIBLE);
- atomic_dec(&EXT4_SB(sb)->s_fc_ineligible_updates);
-}
-
-static inline int ext4_fc_is_ineligible(struct super_block *sb)
-{
- return (ext4_test_mount_flag(sb, EXT4_MF_FC_INELIGIBLE) ||
- atomic_read(&EXT4_SB(sb)->s_fc_ineligible_updates));
-}
-
/*
* Generic fast commit tracking function. If this is the first time this we are
* called after a full commit, we initialize fast commit fields and then call
@@ -391,7 +343,7 @@ static int ext4_fc_track_template(
(sbi->s_mount_state & EXT4_FC_REPLAY))
return -EOPNOTSUPP;

- if (ext4_fc_is_ineligible(inode->i_sb))
+ if (ext4_test_mount_flag(inode->i_sb, EXT4_MF_FC_INELIGIBLE))
return -EINVAL;

tid = handle->h_transaction->t_tid;
@@ -1142,11 +1094,8 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid)

start_time = ktime_get();

- if (!test_opt2(sb, JOURNAL_FAST_COMMIT) ||
- (ext4_fc_is_ineligible(sb))) {
- reason = EXT4_FC_REASON_INELIGIBLE;
- goto out;
- }
+ if (!test_opt2(sb, JOURNAL_FAST_COMMIT))
+ return jbd2_complete_transaction(journal, commit_tid);

restart_fc:
ret = jbd2_fc_begin_commit(journal, commit_tid);
@@ -1162,6 +1111,14 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid)
reason = EXT4_FC_REASON_FC_START_FAILED;
goto out;
}
+ /*
+ * After establishing journal barrier via jbd2_fc_begin_commit(), check
+ * if we are fast commit ineligible.
+ */
+ if (ext4_test_mount_flag(sb, EXT4_MF_FC_INELIGIBLE)) {
+ reason = EXT4_FC_REASON_INELIGIBLE;
+ goto out;
+ }

fc_bufs_before = (sbi->s_fc_bytes + bsize - 1) / bsize;
ret = ext4_fc_perform_commit(journal);
@@ -1180,12 +1137,6 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid)
atomic_inc(&sbi->s_fc_subtid);
jbd2_fc_end_commit(journal);
out:
- /* Has any ineligible update happened since we started? */
- if (reason == EXT4_FC_REASON_OK && ext4_fc_is_ineligible(sb)) {
- sbi->s_fc_stats.fc_ineligible_reason_count[EXT4_FC_COMMIT_FAILED]++;
- reason = EXT4_FC_REASON_INELIGIBLE;
- }
-
spin_lock(&sbi->s_fc_lock);
if (reason != EXT4_FC_REASON_OK &&
reason != EXT4_FC_REASON_ALREADY_COMMITTED) {
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index e64a12e1218a..1366afb59fba 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -169,7 +169,7 @@ static long swap_inode_boot_loader(struct super_block *sb,
err = -EINVAL;
goto err_out;
}
- ext4_fc_start_ineligible(sb, EXT4_FC_REASON_SWAP_BOOT);
+ ext4_fc_mark_ineligible(sb, EXT4_FC_REASON_SWAP_BOOT);

/* Protect extent tree against block allocations via delalloc */
ext4_double_down_write_data_sem(inode, inode_bl);
@@ -252,7 +252,6 @@ static long swap_inode_boot_loader(struct super_block *sb,

err_out1:
ext4_journal_stop(handle);
- ext4_fc_stop_ineligible(sb);
ext4_double_up_write_data_sem(inode, inode_bl);

err_out:
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 6998c07c209a..0e547c6ec73f 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5069,7 +5069,6 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb,

/* Initialize fast commit stuff */
atomic_set(&sbi->s_fc_subtid, 0);
- atomic_set(&sbi->s_fc_ineligible_updates, 0);
INIT_LIST_HEAD(&sbi->s_fc_q[FC_Q_MAIN]);
INIT_LIST_HEAD(&sbi->s_fc_q[FC_Q_STAGING]);
INIT_LIST_HEAD(&sbi->s_fc_dentry_q[FC_Q_MAIN]);
--
2.34.1.307.g9b7440fafd-goog


2021-12-23 20:22:00

by harshad shirwadkar

[permalink] [raw]
Subject: [PATCH v2 3/4] ext4: simplify updating of fast commit stats

From: Harshad Shirwadkar <[email protected]>

Move fast commit stats updating logic to a separate function from
ext4_fc_commit(). This significantly improves readability of
ext4_fc_commit().

Signed-off-by: Harshad Shirwadkar <[email protected]>
---
fs/ext4/ext4.h | 1 -
fs/ext4/fast_commit.c | 99 +++++++++++++++++++++++--------------------
fs/ext4/fast_commit.h | 27 ++++++------
3 files changed, 68 insertions(+), 59 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index d71485d53050..82fa51d6f145 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1747,7 +1747,6 @@ struct ext4_sb_info {
spinlock_t s_fc_lock;
struct buffer_head *s_fc_bh;
struct ext4_fc_stats s_fc_stats;
- u64 s_fc_avg_commit_time;
#ifdef CONFIG_EXT4_DEBUG
int s_fc_debug_max_replay;
#endif
diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index 2771adefdba0..a37384054c9e 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -1075,6 +1075,32 @@ static int ext4_fc_perform_commit(journal_t *journal)
return ret;
}

+static void ext4_fc_update_stats(struct super_block *sb, int status,
+ u64 commit_time, int nblks)
+{
+ struct ext4_fc_stats *stats = &EXT4_SB(sb)->s_fc_stats;
+
+ jbd_debug(1, "Fast commit ended with status = %d", status);
+ if (status == EXT4_FC_STATUS_OK) {
+ stats->fc_num_commits++;
+ stats->fc_numblks += nblks;
+ if (likely(stats->s_fc_avg_commit_time))
+ stats->s_fc_avg_commit_time =
+ (commit_time +
+ stats->s_fc_avg_commit_time * 3) / 4;
+ else
+ stats->s_fc_avg_commit_time = commit_time;
+ } else if (status == EXT4_FC_STATUS_FAILED ||
+ status == EXT4_FC_STATUS_INELIGIBLE) {
+ if (status == EXT4_FC_STATUS_FAILED)
+ stats->fc_failed_commits++;
+ stats->fc_ineligible_commits++;
+ } else {
+ stats->fc_skipped_commits++;
+ }
+ trace_ext4_fc_commit_stop(sb, nblks, status);
+}
+
/*
* The main commit entry point. Performs a fast commit for transaction
* commit_tid if needed. If it's not possible to perform a fast commit
@@ -1087,7 +1113,7 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid)
struct ext4_sb_info *sbi = EXT4_SB(sb);
int nblks = 0, ret, bsize = journal->j_blocksize;
int subtid = atomic_read(&sbi->s_fc_subtid);
- int reason = EXT4_FC_REASON_OK, fc_bufs_before = 0;
+ int status = EXT4_FC_STATUS_OK, fc_bufs_before = 0;
ktime_t start_time, commit_time;

trace_ext4_fc_commit_start(sb);
@@ -1104,69 +1130,52 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid)
if (atomic_read(&sbi->s_fc_subtid) <= subtid &&
commit_tid > journal->j_commit_sequence)
goto restart_fc;
- reason = EXT4_FC_REASON_ALREADY_COMMITTED;
- goto out;
+ ext4_fc_update_stats(sb, EXT4_FC_STATUS_SKIPPED, 0, 0);
+ return 0;
} else if (ret) {
- sbi->s_fc_stats.fc_ineligible_reason_count[EXT4_FC_COMMIT_FAILED]++;
- reason = EXT4_FC_REASON_FC_START_FAILED;
- goto out;
+ /*
+ * Commit couldn't start. Just update stats and perform a
+ * full commit.
+ */
+ ext4_fc_update_stats(sb, EXT4_FC_STATUS_FAILED, 0, 0);
+ return jbd2_complete_transaction(journal, commit_tid);
}
+
/*
* After establishing journal barrier via jbd2_fc_begin_commit(), check
* if we are fast commit ineligible.
*/
if (ext4_test_mount_flag(sb, EXT4_MF_FC_INELIGIBLE)) {
- reason = EXT4_FC_REASON_INELIGIBLE;
- goto out;
+ status = EXT4_FC_STATUS_INELIGIBLE;
+ goto fallback;
}

fc_bufs_before = (sbi->s_fc_bytes + bsize - 1) / bsize;
ret = ext4_fc_perform_commit(journal);
if (ret < 0) {
- sbi->s_fc_stats.fc_ineligible_reason_count[EXT4_FC_COMMIT_FAILED]++;
- reason = EXT4_FC_REASON_FC_FAILED;
- goto out;
+ status = EXT4_FC_STATUS_FAILED;
+ goto fallback;
}
nblks = (sbi->s_fc_bytes + bsize - 1) / bsize - fc_bufs_before;
ret = jbd2_fc_wait_bufs(journal, nblks);
if (ret < 0) {
- sbi->s_fc_stats.fc_ineligible_reason_count[EXT4_FC_COMMIT_FAILED]++;
- reason = EXT4_FC_REASON_FC_FAILED;
- goto out;
+ status = EXT4_FC_STATUS_FAILED;
+ goto fallback;
}
atomic_inc(&sbi->s_fc_subtid);
- jbd2_fc_end_commit(journal);
-out:
- spin_lock(&sbi->s_fc_lock);
- if (reason != EXT4_FC_REASON_OK &&
- reason != EXT4_FC_REASON_ALREADY_COMMITTED) {
- sbi->s_fc_stats.fc_ineligible_commits++;
- } else {
- sbi->s_fc_stats.fc_num_commits++;
- sbi->s_fc_stats.fc_numblks += nblks;
- }
- spin_unlock(&sbi->s_fc_lock);
- nblks = (reason == EXT4_FC_REASON_OK) ? nblks : 0;
- trace_ext4_fc_commit_stop(sb, nblks, reason);
- commit_time = ktime_to_ns(ktime_sub(ktime_get(), start_time));
+ ret = jbd2_fc_end_commit(journal);
/*
- * weight the commit time higher than the average time so we don't
- * react too strongly to vast changes in the commit time
+ * weight the commit time higher than the average time so we
+ * don't react too strongly to vast changes in the commit time
*/
- if (likely(sbi->s_fc_avg_commit_time))
- sbi->s_fc_avg_commit_time = (commit_time +
- sbi->s_fc_avg_commit_time * 3) / 4;
- else
- sbi->s_fc_avg_commit_time = commit_time;
- jbd_debug(1,
- "Fast commit ended with blks = %d, reason = %d, subtid - %d",
- nblks, reason, subtid);
- if (reason == EXT4_FC_REASON_FC_FAILED)
- return jbd2_fc_end_commit_fallback(journal);
- if (reason == EXT4_FC_REASON_FC_START_FAILED ||
- reason == EXT4_FC_REASON_INELIGIBLE)
- return jbd2_complete_transaction(journal, commit_tid);
- return 0;
+ commit_time = ktime_to_ns(ktime_sub(ktime_get(), start_time));
+ ext4_fc_update_stats(sb, status, commit_time, nblks);
+ return ret;
+
+fallback:
+ ret = jbd2_fc_end_commit_fallback(journal);
+ ext4_fc_update_stats(sb, status, 0, 0);
+ return ret;
}

/*
@@ -2124,7 +2133,7 @@ int ext4_fc_info_show(struct seq_file *seq, void *v)
"fc stats:\n%ld commits\n%ld ineligible\n%ld numblks\n%lluus avg_commit_time\n",
stats->fc_num_commits, stats->fc_ineligible_commits,
stats->fc_numblks,
- div_u64(sbi->s_fc_avg_commit_time, 1000));
+ div_u64(stats->s_fc_avg_commit_time, 1000));
seq_puts(seq, "Ineligible reasons:\n");
for (i = 0; i < EXT4_FC_REASON_MAX; i++)
seq_printf(seq, "\"%s\":\t%d\n", fc_ineligible_reasons[i],
diff --git a/fs/ext4/fast_commit.h b/fs/ext4/fast_commit.h
index 937c381b4c85..083ad1cb705a 100644
--- a/fs/ext4/fast_commit.h
+++ b/fs/ext4/fast_commit.h
@@ -71,21 +71,19 @@ struct ext4_fc_tail {
};

/*
- * Fast commit reason codes
+ * Fast commit status codes
+ */
+enum {
+ EXT4_FC_STATUS_OK = 0,
+ EXT4_FC_STATUS_INELIGIBLE,
+ EXT4_FC_STATUS_SKIPPED,
+ EXT4_FC_STATUS_FAILED,
+};
+
+/*
+ * Fast commit ineligiblity reasons:
*/
enum {
- /*
- * Commit status codes:
- */
- EXT4_FC_REASON_OK = 0,
- EXT4_FC_REASON_INELIGIBLE,
- EXT4_FC_REASON_ALREADY_COMMITTED,
- EXT4_FC_REASON_FC_START_FAILED,
- EXT4_FC_REASON_FC_FAILED,
-
- /*
- * Fast commit ineligiblity reasons:
- */
EXT4_FC_REASON_XATTR = 0,
EXT4_FC_REASON_CROSS_RENAME,
EXT4_FC_REASON_JOURNAL_FLAG_CHANGE,
@@ -117,7 +115,10 @@ struct ext4_fc_stats {
unsigned int fc_ineligible_reason_count[EXT4_FC_REASON_MAX];
unsigned long fc_num_commits;
unsigned long fc_ineligible_commits;
+ unsigned long fc_failed_commits;
+ unsigned long fc_skipped_commits;
unsigned long fc_numblks;
+ u64 s_fc_avg_commit_time;
};

#define EXT4_FC_REPLAY_REALLOC_INCREMENT 4
--
2.34.1.307.g9b7440fafd-goog


2021-12-23 20:22:00

by harshad shirwadkar

[permalink] [raw]
Subject: [PATCH v2 4/4] ext4: update fast commit TODOs

From: Harshad Shirwadkar <[email protected]>

This series takes care of a couple of TODOs and adds new ones. Update
the TODOs section to reflect current state and future work that needs
to happen.

Signed-off-by: Harshad Shirwadkar <[email protected]>
---
fs/ext4/fast_commit.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index a37384054c9e..dd002facf6c9 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -156,15 +156,13 @@
* fast commit recovery even if that area is invalidated by later full
* commits.
*
- * 1) Make fast commit atomic updates more fine grained. Today, a fast commit
- * eligible update must be protected within ext4_fc_start_update() and
- * ext4_fc_stop_update(). These routines are called at much higher
- * routines. This can be made more fine grained by combining with
- * ext4_journal_start().
+ * 1) Fast commit's commit path locks the entire file system during fast
+ * commit. This has significant performance penalty. Instead of that, we
+ * should use ext4_fc_start/stop_update functions to start inode level
+ * updates from ext4_journal_start/stop. Once we do that we can drop file
+ * system locking during commit path.
*
- * 2) Same above for ext4_fc_start_ineligible() and ext4_fc_stop_ineligible()
- *
- * 3) Handle more ineligible cases.
+ * 2) Handle more ineligible cases.
*/

#include <trace/events/ext4.h>
--
2.34.1.307.g9b7440fafd-goog


2021-12-23 23:16:13

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] ext4 fast commit API cleanup

On Thu, 23 Dec 2021 12:21:36 -0800, Harshad Shirwadkar wrote:
> This patch series fixes up fast commit APIs. There are NO on-disk
> format changes introduced in this series. The main contribution of the
> series is that it drops fast commit specific transaction APIs and
> makes fast commits work with journal transaction APIs of JBD2
> journalling system. With these changes, a fast commit eligible
> transaction is simply enclosed in calls to "jbd2_journal_start()" and
> "jbd2_journal_stop()". If the update that is being performed is fast
> commit ineligible, one must simply call ext4_fc_mark_ineligible()
> after starting a transaction using "jbd2_journal_start()". The last
> patch in the series simplifies fast commit stats recording by moving
> it to a different function.
>
> [...]

Applied, thanks!

[1/4] ext4: use ext4_journal_start/stop for fast commit transactions
commit: 2729cfdcfa1cc49bef5a90d046fa4a187fdfcc69
[2/4] ext4: drop ineligible txn start stop APIs
commit: 7bbbe241ec7ce0def9f71464c878fdbd2b0dcf37
[3/4] ext4: simplify updating of fast commit stats
commit: 0915e464cb274648e1ef1663e1356e53ff400983
[4/4] ext4: update fast commit TODOs
commit: d1199b94474ac4513b8491a4b751a8a466e1886b

Best regards,
--
Theodore Ts'o <[email protected]>

2022-01-11 12:53:08

by Ritesh Harjani

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] ext4 fast commit API cleanup

On 21/12/23 12:21PM, Harshad Shirwadkar wrote:
> This patch series fixes up fast commit APIs. There are NO on-disk
> format changes introduced in this series. The main contribution of the
> series is that it drops fast commit specific transaction APIs and
> makes fast commits work with journal transaction APIs of JBD2
> journalling system. With these changes, a fast commit eligible
> transaction is simply enclosed in calls to "jbd2_journal_start()" and
> "jbd2_journal_stop()". If the update that is being performed is fast
> commit ineligible, one must simply call ext4_fc_mark_ineligible()
> after starting a transaction using "jbd2_journal_start()". The last
> patch in the series simplifies fast commit stats recording by moving
> it to a different function.
>
> I verified that the patch series introduces no regressions in "quick"
> and "log" groups when "fast_commit" feature is enabled.
>
> Changes from V1:
> ---------------
>
> - In the V1 of the patch series, there's performance regression. With
> this patch series, we lock the entire file system from starting any
> new handles during (which ensures consistency at the cost of
> performance). What we ideally want to do is to lock individual
> inodes from starting new updates during a commit. To do so, the V2
> of this patch series retains the infrastructure of inode level
> transactions (ext4_fc_start/stop_update()). In future (not in this
> series), we would build on this infrastructure to lock individual
> inodes and drop the file system level locking during the commit path.

Hello Harshad,

Sorry about being so late in the game :(

So from what I understood from your above commit msg is that even the current
v2 patch series suffers from the same performance regression which is:-
we lock the filesystem from any starting transaction updates
(by taking j_barrier or say by calling jbd2_journal_lock_updates()) while
fast_commit's commit operation is in progress (which happens during a file fsync()).
This means when fast_commit's commit operation is in progress, then we can't even
start a new transaction for recording any metadata updates to any inodes of my FS.

Is above understanding correct w.r.t this v2 patch series?
If yes, then why do we need to lock the full filesystem from starting any
journal txns? Why can't we let any process starts a new transaction while
the previous fast_commit's commit operation is in progress?

JBD2 does allow us to do that right? i.e. while a jbd2 commit is in progress,
a new running transaction can be allocated and all the new metadata updates will
now be tracked in the new running txn, right?

-ritesh


>
> Signed-off-by: Harshad Shirwadkar <[email protected]>
>
> Harshad Shirwadkar (4):
> ext4: use ext4_journal_start/stop for fast commit transactions
> ext4: drop ineligible txn start stop APIs
> ext4: simplify updating of fast commit stats
> ext4: update fast commit TODOs
>
> fs/ext4/acl.c | 2 -
> fs/ext4/ext4.h | 7 +-
> fs/ext4/extents.c | 9 +-
> fs/ext4/fast_commit.c | 188 ++++++++++++++++--------------------------
> fs/ext4/fast_commit.h | 27 +++---
> fs/ext4/file.c | 4 -
> fs/ext4/inode.c | 7 +-
> fs/ext4/ioctl.c | 13 +--
> fs/ext4/super.c | 1 -
> fs/jbd2/journal.c | 2 +
> 10 files changed, 96 insertions(+), 164 deletions(-)
>
> --
> 2.34.1.307.g9b7440fafd-goog
>

2022-01-11 16:20:15

by harshad shirwadkar

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] ext4 fast commit API cleanup

Hey Ritesh,

Yes, your understanding is correct, this patch series does have a side
effect that the entire file system gets locked before starting a fast
commit. However, this regression is meant to be temporary (mainly to
prevent merging of unnecessary correctness patches). I am working on
another series which fixes this by only locking the inode that is
being committed. That patch should be out shortly.

The reason I hurried this patch series in without the inode locking
patches first was that we had some consistency and file system hanging
issues which needed to be fixed in the right way before the code
becomes more cluttered with temporary correctness fixes which would
eventually get dropped out. The hope is that with these patches, such
fixes wouldn't need to be merged in.

But yeah, I am fully aware of the performance degradation that this
series introduces and you will soon see another patch series that
fixes that issue.

Thanks,
Harshad

On Tue, Jan 11, 2022 at 4:53 AM riteshh <[email protected]> wrote:
>
> On 21/12/23 12:21PM, Harshad Shirwadkar wrote:
> > This patch series fixes up fast commit APIs. There are NO on-disk
> > format changes introduced in this series. The main contribution of the
> > series is that it drops fast commit specific transaction APIs and
> > makes fast commits work with journal transaction APIs of JBD2
> > journalling system. With these changes, a fast commit eligible
> > transaction is simply enclosed in calls to "jbd2_journal_start()" and
> > "jbd2_journal_stop()". If the update that is being performed is fast
> > commit ineligible, one must simply call ext4_fc_mark_ineligible()
> > after starting a transaction using "jbd2_journal_start()". The last
> > patch in the series simplifies fast commit stats recording by moving
> > it to a different function.
> >
> > I verified that the patch series introduces no regressions in "quick"
> > and "log" groups when "fast_commit" feature is enabled.
> >
> > Changes from V1:
> > ---------------
> >
> > - In the V1 of the patch series, there's performance regression. With
> > this patch series, we lock the entire file system from starting any
> > new handles during (which ensures consistency at the cost of
> > performance). What we ideally want to do is to lock individual
> > inodes from starting new updates during a commit. To do so, the V2
> > of this patch series retains the infrastructure of inode level
> > transactions (ext4_fc_start/stop_update()). In future (not in this
> > series), we would build on this infrastructure to lock individual
> > inodes and drop the file system level locking during the commit path.
>
> Hello Harshad,
>
> Sorry about being so late in the game :(
>
> So from what I understood from your above commit msg is that even the current
> v2 patch series suffers from the same performance regression which is:-
> we lock the filesystem from any starting transaction updates
> (by taking j_barrier or say by calling jbd2_journal_lock_updates()) while
> fast_commit's commit operation is in progress (which happens during a file fsync()).
> This means when fast_commit's commit operation is in progress, then we can't even
> start a new transaction for recording any metadata updates to any inodes of my FS.
>
> Is above understanding correct w.r.t this v2 patch series?
> If yes, then why do we need to lock the full filesystem from starting any
> journal txns? Why can't we let any process starts a new transaction while
> the previous fast_commit's commit operation is in progress?
>
> JBD2 does allow us to do that right? i.e. while a jbd2 commit is in progress,
> a new running transaction can be allocated and all the new metadata updates will
> now be tracked in the new running txn, right?
>
> -ritesh
>
>
> >
> > Signed-off-by: Harshad Shirwadkar <[email protected]>
> >
> > Harshad Shirwadkar (4):
> > ext4: use ext4_journal_start/stop for fast commit transactions
> > ext4: drop ineligible txn start stop APIs
> > ext4: simplify updating of fast commit stats
> > ext4: update fast commit TODOs
> >
> > fs/ext4/acl.c | 2 -
> > fs/ext4/ext4.h | 7 +-
> > fs/ext4/extents.c | 9 +-
> > fs/ext4/fast_commit.c | 188 ++++++++++++++++--------------------------
> > fs/ext4/fast_commit.h | 27 +++---
> > fs/ext4/file.c | 4 -
> > fs/ext4/inode.c | 7 +-
> > fs/ext4/ioctl.c | 13 +--
> > fs/ext4/super.c | 1 -
> > fs/jbd2/journal.c | 2 +
> > 10 files changed, 96 insertions(+), 164 deletions(-)
> >
> > --
> > 2.34.1.307.g9b7440fafd-goog
> >

2022-01-11 16:57:20

by Ritesh Harjani

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] ext4 fast commit API cleanup

On 22/01/11 08:19AM, harshad shirwadkar wrote:
> Hey Ritesh,
>
> Yes, your understanding is correct, this patch series does have a side
> effect that the entire file system gets locked before starting a fast
> commit. However, this regression is meant to be temporary (mainly to
> prevent merging of unnecessary correctness patches). I am working on
> another series which fixes this by only locking the inode that is
> being committed. That patch should be out shortly.
>
> The reason I hurried this patch series in without the inode locking
> patches first was that we had some consistency and file system hanging
> issues which needed to be fixed in the right way before the code
> becomes more cluttered with temporary correctness fixes which would
> eventually get dropped out. The hope is that with these patches, such
> fixes wouldn't need to be merged in.
>
> But yeah, I am fully aware of the performance degradation that this
> series introduces and you will soon see another patch series that
> fixes that issue.

Thanks Harshad, for taking time and helping me understand the reasoning behind.

-ritesh

>
> Thanks,
> Harshad
>
> On Tue, Jan 11, 2022 at 4:53 AM riteshh <[email protected]> wrote:
> >
> > On 21/12/23 12:21PM, Harshad Shirwadkar wrote:
> > > This patch series fixes up fast commit APIs. There are NO on-disk
> > > format changes introduced in this series. The main contribution of the
> > > series is that it drops fast commit specific transaction APIs and
> > > makes fast commits work with journal transaction APIs of JBD2
> > > journalling system. With these changes, a fast commit eligible
> > > transaction is simply enclosed in calls to "jbd2_journal_start()" and
> > > "jbd2_journal_stop()". If the update that is being performed is fast
> > > commit ineligible, one must simply call ext4_fc_mark_ineligible()
> > > after starting a transaction using "jbd2_journal_start()". The last
> > > patch in the series simplifies fast commit stats recording by moving
> > > it to a different function.
> > >
> > > I verified that the patch series introduces no regressions in "quick"
> > > and "log" groups when "fast_commit" feature is enabled.
> > >
> > > Changes from V1:
> > > ---------------
> > >
> > > - In the V1 of the patch series, there's performance regression. With
> > > this patch series, we lock the entire file system from starting any
> > > new handles during (which ensures consistency at the cost of
> > > performance). What we ideally want to do is to lock individual
> > > inodes from starting new updates during a commit. To do so, the V2
> > > of this patch series retains the infrastructure of inode level
> > > transactions (ext4_fc_start/stop_update()). In future (not in this
> > > series), we would build on this infrastructure to lock individual
> > > inodes and drop the file system level locking during the commit path.
> >
> > Hello Harshad,
> >
> > Sorry about being so late in the game :(
> >
> > So from what I understood from your above commit msg is that even the current
> > v2 patch series suffers from the same performance regression which is:-
> > we lock the filesystem from any starting transaction updates
> > (by taking j_barrier or say by calling jbd2_journal_lock_updates()) while
> > fast_commit's commit operation is in progress (which happens during a file fsync()).
> > This means when fast_commit's commit operation is in progress, then we can't even
> > start a new transaction for recording any metadata updates to any inodes of my FS.
> >
> > Is above understanding correct w.r.t this v2 patch series?
> > If yes, then why do we need to lock the full filesystem from starting any
> > journal txns? Why can't we let any process starts a new transaction while
> > the previous fast_commit's commit operation is in progress?
> >
> > JBD2 does allow us to do that right? i.e. while a jbd2 commit is in progress,
> > a new running transaction can be allocated and all the new metadata updates will
> > now be tracked in the new running txn, right?
> >
> > -ritesh
> >
> >
> > >
> > > Signed-off-by: Harshad Shirwadkar <[email protected]>
> > >
> > > Harshad Shirwadkar (4):
> > > ext4: use ext4_journal_start/stop for fast commit transactions
> > > ext4: drop ineligible txn start stop APIs
> > > ext4: simplify updating of fast commit stats
> > > ext4: update fast commit TODOs
> > >
> > > fs/ext4/acl.c | 2 -
> > > fs/ext4/ext4.h | 7 +-
> > > fs/ext4/extents.c | 9 +-
> > > fs/ext4/fast_commit.c | 188 ++++++++++++++++--------------------------
> > > fs/ext4/fast_commit.h | 27 +++---
> > > fs/ext4/file.c | 4 -
> > > fs/ext4/inode.c | 7 +-
> > > fs/ext4/ioctl.c | 13 +--
> > > fs/ext4/super.c | 1 -
> > > fs/jbd2/journal.c | 2 +
> > > 10 files changed, 96 insertions(+), 164 deletions(-)
> > >
> > > --
> > > 2.34.1.307.g9b7440fafd-goog
> > >