2022-01-13 03:27:06

by Ritesh Harjani

[permalink] [raw]
Subject: [PATCH 0/6] ext4/jbd2: inline_data fixes and some cleanups

Hellos,

Patch[1]: fixes BUG_ON with inline_data which was reported [1] with generic/475.

Patch[2]: is mostly cleanup found during code review of inline_data code.

Patch[3]: is a possible memory corruption fix in case of krealloc failure.

Patch[4-5]: Cleanups.

Patch[6]: Needs careful review. As it gets rid of t_handle_lock spinlock
in jbd2_journal_wait_updates(). From the code review I found it to be not
required. But let me know if I missed anything here.

[1]: https://lore.kernel.org/linux-ext4/[email protected]/

Ritesh Harjani (6):
ext4: Fix error handling in ext4_restore_inline_data()
ext4: Remove redundant max inline_size check in ext4_da_write_inline_data_begin()
ext4: Fix error handling in ext4_fc_record_modified_inode()
jbd2: Cleanup unused functions declarations from jbd2.h
jbd2: Refactor wait logic for transaction updates into a common function
jbd2: No need to use t_handle_lock in jbd2_journal_wait_updates

fs/ext4/fast_commit.c | 64 ++++++++++++++++++++-----------------------
fs/ext4/inline.c | 23 +++++++++-------
fs/jbd2/commit.c | 19 ++-----------
fs/jbd2/transaction.c | 24 ++--------------
include/linux/jbd2.h | 34 +++++++++++++++++------
5 files changed, 74 insertions(+), 90 deletions(-)

--
2.31.1



2022-01-13 03:27:09

by Ritesh Harjani

[permalink] [raw]
Subject: [PATCH 2/6] ext4: Remove redundant max inline_size check in ext4_da_write_inline_data_begin()

ext4_prepare_inline_data() already checks for ext4_get_max_inline_size()
and returns -ENOSPC. So there is no need to check it twice within
ext4_da_write_inline_data_begin(). This patch removes the extra check.

It also makes it more clean.

No functionality change in this patch.

Signed-off-by: Ritesh Harjani <[email protected]>
---
fs/ext4/inline.c | 13 ++++---------
1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index 31741e8a462e..c52b0037983d 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -913,7 +913,7 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping,
struct page **pagep,
void **fsdata)
{
- int ret, inline_size;
+ int ret;
handle_t *handle;
struct page *page;
struct ext4_iloc iloc;
@@ -930,14 +930,9 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping,
goto out;
}

- inline_size = ext4_get_max_inline_size(inode);
-
- ret = -ENOSPC;
- if (inline_size >= pos + len) {
- ret = ext4_prepare_inline_data(handle, inode, pos + len);
- if (ret && ret != -ENOSPC)
- goto out_journal;
- }
+ ret = ext4_prepare_inline_data(handle, inode, pos + len);
+ if (ret && ret != -ENOSPC)
+ goto out_journal;

/*
* We cannot recurse into the filesystem as the transaction
--
2.31.1


2022-01-13 03:27:13

by Ritesh Harjani

[permalink] [raw]
Subject: [PATCH 4/6] jbd2: Cleanup unused functions declarations from jbd2.h

During code review found no references of few of these below function
declarations. This patch cleans those up from jbd2.h

Signed-off-by: Ritesh Harjani <[email protected]>
---
include/linux/jbd2.h | 7 -------
1 file changed, 7 deletions(-)

diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index fd933c45281a..f76598265896 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -1419,9 +1419,7 @@ extern void jbd2_journal_unfile_buffer(journal_t *, struct journal_head *);
extern bool __jbd2_journal_refile_buffer(struct journal_head *);
extern void jbd2_journal_refile_buffer(journal_t *, struct journal_head *);
extern void __jbd2_journal_file_buffer(struct journal_head *, transaction_t *, int);
-extern void __journal_free_buffer(struct journal_head *bh);
extern void jbd2_journal_file_buffer(struct journal_head *, transaction_t *, int);
-extern void __journal_clean_data_list(transaction_t *transaction);
static inline void jbd2_file_log_bh(struct list_head *head, struct buffer_head *bh)
{
list_add_tail(&bh->b_assoc_buffers, head);
@@ -1486,9 +1484,6 @@ extern int jbd2_journal_write_metadata_buffer(transaction_t *transaction,
struct buffer_head **bh_out,
sector_t blocknr);

-/* Transaction locking */
-extern void __wait_on_journal (journal_t *);
-
/* Transaction cache support */
extern void jbd2_journal_destroy_transaction_cache(void);
extern int __init jbd2_journal_init_transaction_cache(void);
@@ -1774,8 +1769,6 @@ static inline unsigned long jbd2_log_space_left(journal_t *journal)
#define BJ_Reserved 4 /* Buffer is reserved for access by journal */
#define BJ_Types 5

-extern int jbd_blocks_per_page(struct inode *inode);
-
/* JBD uses a CRC32 checksum */
#define JBD_MAX_CHECKSUM_SIZE 4

--
2.31.1


2022-01-13 03:27:15

by Ritesh Harjani

[permalink] [raw]
Subject: [PATCH 3/6] ext4: Fix error handling in ext4_fc_record_modified_inode()

Current code does not fully takes care of krealloc() error case,
which could lead to silent memory corruption or a kernel bug.
This patch fixes that.

Also it cleans up some duplicated error handling logic from various functions
in fast_commit.c file.

Reported-by: luo penghao <[email protected]>
Suggested-by: Lukas Czerner <[email protected]>
Signed-off-by: Ritesh Harjani <[email protected]>
---
fs/ext4/fast_commit.c | 64 ++++++++++++++++++++-----------------------
1 file changed, 29 insertions(+), 35 deletions(-)

diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index 5ae8026a0c56..4541c8468c01 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -1392,14 +1392,15 @@ static int ext4_fc_record_modified_inode(struct super_block *sb, int ino)
if (state->fc_modified_inodes[i] == ino)
return 0;
if (state->fc_modified_inodes_used == state->fc_modified_inodes_size) {
- state->fc_modified_inodes_size +=
- EXT4_FC_REPLAY_REALLOC_INCREMENT;
state->fc_modified_inodes = krealloc(
- state->fc_modified_inodes, sizeof(int) *
- state->fc_modified_inodes_size,
- GFP_KERNEL);
+ state->fc_modified_inodes,
+ sizeof(int) * (state->fc_modified_inodes_size +
+ EXT4_FC_REPLAY_REALLOC_INCREMENT),
+ GFP_KERNEL);
if (!state->fc_modified_inodes)
return -ENOMEM;
+ state->fc_modified_inodes_size +=
+ EXT4_FC_REPLAY_REALLOC_INCREMENT;
}
state->fc_modified_inodes[state->fc_modified_inodes_used++] = ino;
return 0;
@@ -1431,7 +1432,9 @@ static int ext4_fc_replay_inode(struct super_block *sb, struct ext4_fc_tl *tl,
}
inode = NULL;

- ext4_fc_record_modified_inode(sb, ino);
+ ret = ext4_fc_record_modified_inode(sb, ino);
+ if (ret)
+ goto out;

raw_fc_inode = (struct ext4_inode *)
(val + offsetof(struct ext4_fc_inode, fc_raw_inode));
@@ -1621,6 +1624,8 @@ static int ext4_fc_replay_add_range(struct super_block *sb,
}

ret = ext4_fc_record_modified_inode(sb, inode->i_ino);
+ if (ret)
+ goto out;

start = le32_to_cpu(ex->ee_block);
start_pblk = ext4_ext_pblock(ex);
@@ -1638,18 +1643,14 @@ static int ext4_fc_replay_add_range(struct super_block *sb,
map.m_pblk = 0;
ret = ext4_map_blocks(NULL, inode, &map, 0);

- if (ret < 0) {
- iput(inode);
- return 0;
- }
+ if (ret < 0)
+ goto out;

if (ret == 0) {
/* Range is not mapped */
path = ext4_find_extent(inode, cur, NULL, 0);
- if (IS_ERR(path)) {
- iput(inode);
- return 0;
- }
+ if (IS_ERR(path))
+ goto out;
memset(&newex, 0, sizeof(newex));
newex.ee_block = cpu_to_le32(cur);
ext4_ext_store_pblock(
@@ -1663,10 +1664,8 @@ static int ext4_fc_replay_add_range(struct super_block *sb,
up_write((&EXT4_I(inode)->i_data_sem));
ext4_ext_drop_refs(path);
kfree(path);
- if (ret) {
- iput(inode);
- return 0;
- }
+ if (ret)
+ goto out;
goto next;
}

@@ -1679,10 +1678,8 @@ static int ext4_fc_replay_add_range(struct super_block *sb,
ret = ext4_ext_replay_update_ex(inode, cur, map.m_len,
ext4_ext_is_unwritten(ex),
start_pblk + cur - start);
- if (ret) {
- iput(inode);
- return 0;
- }
+ if (ret)
+ goto out;
/*
* Mark the old blocks as free since they aren't used
* anymore. We maintain an array of all the modified
@@ -1702,10 +1699,8 @@ static int ext4_fc_replay_add_range(struct super_block *sb,
ext4_ext_is_unwritten(ex), map.m_pblk);
ret = ext4_ext_replay_update_ex(inode, cur, map.m_len,
ext4_ext_is_unwritten(ex), map.m_pblk);
- if (ret) {
- iput(inode);
- return 0;
- }
+ if (ret)
+ goto out;
/*
* We may have split the extent tree while toggling the state.
* Try to shrink the extent tree now.
@@ -1717,6 +1712,7 @@ static int ext4_fc_replay_add_range(struct super_block *sb,
}
ext4_ext_replay_shrink_inode(inode, i_size_read(inode) >>
sb->s_blocksize_bits);
+out:
iput(inode);
return 0;
}
@@ -1746,6 +1742,8 @@ ext4_fc_replay_del_range(struct super_block *sb, struct ext4_fc_tl *tl,
}

ret = ext4_fc_record_modified_inode(sb, inode->i_ino);
+ if (ret)
+ goto out;

jbd_debug(1, "DEL_RANGE, inode %ld, lblk %d, len %d\n",
inode->i_ino, le32_to_cpu(lrange.fc_lblk),
@@ -1755,10 +1753,8 @@ ext4_fc_replay_del_range(struct super_block *sb, struct ext4_fc_tl *tl,
map.m_len = remaining;

ret = ext4_map_blocks(NULL, inode, &map, 0);
- if (ret < 0) {
- iput(inode);
- return 0;
- }
+ if (ret < 0)
+ goto out;
if (ret > 0) {
remaining -= ret;
cur += ret;
@@ -1773,15 +1769,13 @@ ext4_fc_replay_del_range(struct super_block *sb, struct ext4_fc_tl *tl,
ret = ext4_ext_remove_space(inode, lrange.fc_lblk,
lrange.fc_lblk + lrange.fc_len - 1);
up_write(&EXT4_I(inode)->i_data_sem);
- if (ret) {
- iput(inode);
- return 0;
- }
+ if (ret)
+ goto out;
ext4_ext_replay_shrink_inode(inode,
i_size_read(inode) >> sb->s_blocksize_bits);
ext4_mark_inode_dirty(NULL, inode);
+out:
iput(inode);
-
return 0;
}

--
2.31.1


2022-01-13 03:27:19

by Ritesh Harjani

[permalink] [raw]
Subject: [PATCH 5/6] jbd2: Refactor wait logic for transaction updates into a common function

No functionality change as such in this patch. This only refactors the
common piece of code which waits for t_updates to finish into a common
function named as jbd2_journal_wait_updates(journal_t *)

Signed-off-by: Ritesh Harjani <[email protected]>
---
fs/jbd2/commit.c | 19 +++----------------
fs/jbd2/transaction.c | 24 +++---------------------
include/linux/jbd2.h | 31 ++++++++++++++++++++++++++++++-
3 files changed, 36 insertions(+), 38 deletions(-)

diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index 3cc4ab2ba7f4..428364f107be 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -484,22 +484,9 @@ void jbd2_journal_commit_transaction(journal_t *journal)
stats.run.rs_running = jbd2_time_diff(commit_transaction->t_start,
stats.run.rs_locked);

- spin_lock(&commit_transaction->t_handle_lock);
- while (atomic_read(&commit_transaction->t_updates)) {
- DEFINE_WAIT(wait);
+ // waits for any t_updates to finish
+ jbd2_journal_wait_updates(journal);

- prepare_to_wait(&journal->j_wait_updates, &wait,
- TASK_UNINTERRUPTIBLE);
- if (atomic_read(&commit_transaction->t_updates)) {
- spin_unlock(&commit_transaction->t_handle_lock);
- write_unlock(&journal->j_state_lock);
- schedule();
- write_lock(&journal->j_state_lock);
- spin_lock(&commit_transaction->t_handle_lock);
- }
- finish_wait(&journal->j_wait_updates, &wait);
- }
- spin_unlock(&commit_transaction->t_handle_lock);
commit_transaction->t_state = T_SWITCH;
write_unlock(&journal->j_state_lock);

@@ -817,7 +804,7 @@ void jbd2_journal_commit_transaction(journal_t *journal)
commit_transaction->t_state = T_COMMIT_DFLUSH;
write_unlock(&journal->j_state_lock);

- /*
+ /*
* If the journal is not located on the file system device,
* then we must flush the file system device before we issue
* the commit record
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index 6a3caedd2285..89a955ab1557 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -449,7 +449,7 @@ static int start_this_handle(journal_t *journal, handle_t *handle,
}

/* OK, account for the buffers that this operation expects to
- * use and add the handle to the running transaction.
+ * use and add the handle to the running transaction.
*/
update_t_max_wait(transaction, ts);
handle->h_transaction = transaction;
@@ -863,27 +863,9 @@ void jbd2_journal_lock_updates(journal_t *journal)
write_lock(&journal->j_state_lock);
}

- /* Wait until there are no running updates */
- while (1) {
- transaction_t *transaction = journal->j_running_transaction;
+ /* Wait until there are no running t_updates */
+ jbd2_journal_wait_updates(journal);

- if (!transaction)
- break;
-
- spin_lock(&transaction->t_handle_lock);
- prepare_to_wait(&journal->j_wait_updates, &wait,
- TASK_UNINTERRUPTIBLE);
- if (!atomic_read(&transaction->t_updates)) {
- spin_unlock(&transaction->t_handle_lock);
- finish_wait(&journal->j_wait_updates, &wait);
- break;
- }
- spin_unlock(&transaction->t_handle_lock);
- write_unlock(&journal->j_state_lock);
- schedule();
- finish_wait(&journal->j_wait_updates, &wait);
- write_lock(&journal->j_state_lock);
- }
write_unlock(&journal->j_state_lock);

/*
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index f76598265896..34b051aa9009 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -594,7 +594,7 @@ struct transaction_s
*/
unsigned long t_log_start;

- /*
+ /*
* Number of buffers on the t_buffers list [j_list_lock, no locks
* needed for jbd2 thread]
*/
@@ -1757,6 +1757,35 @@ static inline unsigned long jbd2_log_space_left(journal_t *journal)
return max_t(long, free, 0);
}

+/*
+ * Waits for any outstanding t_updates to finish.
+ * This is called with write j_state_lock held.
+ */
+static inline void jbd2_journal_wait_updates(journal_t *journal)
+{
+ transaction_t *commit_transaction = journal->j_running_transaction;
+
+ if (!commit_transaction)
+ return;
+
+ spin_lock(&commit_transaction->t_handle_lock);
+ while (atomic_read(&commit_transaction->t_updates)) {
+ DEFINE_WAIT(wait);
+
+ prepare_to_wait(&journal->j_wait_updates, &wait,
+ TASK_UNINTERRUPTIBLE);
+ if (atomic_read(&commit_transaction->t_updates)) {
+ spin_unlock(&commit_transaction->t_handle_lock);
+ write_unlock(&journal->j_state_lock);
+ schedule();
+ write_lock(&journal->j_state_lock);
+ spin_lock(&commit_transaction->t_handle_lock);
+ }
+ finish_wait(&journal->j_wait_updates, &wait);
+ }
+ spin_unlock(&commit_transaction->t_handle_lock);
+}
+
/*
* Definitions which augment the buffer_head layer
*/
--
2.31.1


2022-01-13 03:27:22

by Ritesh Harjani

[permalink] [raw]
Subject: [PATCH 6/6] jbd2: No need to use t_handle_lock in jbd2_journal_wait_updates

Since jbd2_journal_wait_updates() uses waitq based on t_updates atomic_t
variable. So from code review it looks like we don't need to use
t_handle_lock spinlock for checking t_updates value.
Hence this patch gets rid of the spinlock protection in
jbd2_journal_wait_updates()

Signed-off-by: Ritesh Harjani <[email protected]>
---
include/linux/jbd2.h | 4 ----
1 file changed, 4 deletions(-)

diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 34b051aa9009..9bef47622b9d 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -1768,22 +1768,18 @@ static inline void jbd2_journal_wait_updates(journal_t *journal)
if (!commit_transaction)
return;

- spin_lock(&commit_transaction->t_handle_lock);
while (atomic_read(&commit_transaction->t_updates)) {
DEFINE_WAIT(wait);

prepare_to_wait(&journal->j_wait_updates, &wait,
TASK_UNINTERRUPTIBLE);
if (atomic_read(&commit_transaction->t_updates)) {
- spin_unlock(&commit_transaction->t_handle_lock);
write_unlock(&journal->j_state_lock);
schedule();
write_lock(&journal->j_state_lock);
- spin_lock(&commit_transaction->t_handle_lock);
}
finish_wait(&journal->j_wait_updates, &wait);
}
- spin_unlock(&commit_transaction->t_handle_lock);
}

/*
--
2.31.1


2022-01-13 10:58:06

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH 2/6] ext4: Remove redundant max inline_size check in ext4_da_write_inline_data_begin()

On Thu 13-01-22 08:56:25, Ritesh Harjani wrote:
> ext4_prepare_inline_data() already checks for ext4_get_max_inline_size()
> and returns -ENOSPC. So there is no need to check it twice within
> ext4_da_write_inline_data_begin(). This patch removes the extra check.
>
> It also makes it more clean.
>
> No functionality change in this patch.
>
> Signed-off-by: Ritesh Harjani <[email protected]>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <[email protected]>

Honza

> ---
> fs/ext4/inline.c | 13 ++++---------
> 1 file changed, 4 insertions(+), 9 deletions(-)
>
> diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
> index 31741e8a462e..c52b0037983d 100644
> --- a/fs/ext4/inline.c
> +++ b/fs/ext4/inline.c
> @@ -913,7 +913,7 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping,
> struct page **pagep,
> void **fsdata)
> {
> - int ret, inline_size;
> + int ret;
> handle_t *handle;
> struct page *page;
> struct ext4_iloc iloc;
> @@ -930,14 +930,9 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping,
> goto out;
> }
>
> - inline_size = ext4_get_max_inline_size(inode);
> -
> - ret = -ENOSPC;
> - if (inline_size >= pos + len) {
> - ret = ext4_prepare_inline_data(handle, inode, pos + len);
> - if (ret && ret != -ENOSPC)
> - goto out_journal;
> - }
> + ret = ext4_prepare_inline_data(handle, inode, pos + len);
> + if (ret && ret != -ENOSPC)
> + goto out_journal;
>
> /*
> * We cannot recurse into the filesystem as the transaction
> --
> 2.31.1
>
--
Jan Kara <[email protected]>
SUSE Labs, CR

2022-01-13 11:00:48

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH 3/6] ext4: Fix error handling in ext4_fc_record_modified_inode()

On Thu 13-01-22 08:56:26, Ritesh Harjani wrote:
> Current code does not fully takes care of krealloc() error case,
> which could lead to silent memory corruption or a kernel bug.
> This patch fixes that.
>
> Also it cleans up some duplicated error handling logic from various functions
> in fast_commit.c file.
>
> Reported-by: luo penghao <[email protected]>
> Suggested-by: Lukas Czerner <[email protected]>
> Signed-off-by: Ritesh Harjani <[email protected]>

Looks good to me. Feel free to add:

Reviewed-by: Jan Kara <[email protected]>

Honza

> ---
> fs/ext4/fast_commit.c | 64 ++++++++++++++++++++-----------------------
> 1 file changed, 29 insertions(+), 35 deletions(-)
>
> diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
> index 5ae8026a0c56..4541c8468c01 100644
> --- a/fs/ext4/fast_commit.c
> +++ b/fs/ext4/fast_commit.c
> @@ -1392,14 +1392,15 @@ static int ext4_fc_record_modified_inode(struct super_block *sb, int ino)
> if (state->fc_modified_inodes[i] == ino)
> return 0;
> if (state->fc_modified_inodes_used == state->fc_modified_inodes_size) {
> - state->fc_modified_inodes_size +=
> - EXT4_FC_REPLAY_REALLOC_INCREMENT;
> state->fc_modified_inodes = krealloc(
> - state->fc_modified_inodes, sizeof(int) *
> - state->fc_modified_inodes_size,
> - GFP_KERNEL);
> + state->fc_modified_inodes,
> + sizeof(int) * (state->fc_modified_inodes_size +
> + EXT4_FC_REPLAY_REALLOC_INCREMENT),
> + GFP_KERNEL);
> if (!state->fc_modified_inodes)
> return -ENOMEM;
> + state->fc_modified_inodes_size +=
> + EXT4_FC_REPLAY_REALLOC_INCREMENT;
> }
> state->fc_modified_inodes[state->fc_modified_inodes_used++] = ino;
> return 0;
> @@ -1431,7 +1432,9 @@ static int ext4_fc_replay_inode(struct super_block *sb, struct ext4_fc_tl *tl,
> }
> inode = NULL;
>
> - ext4_fc_record_modified_inode(sb, ino);
> + ret = ext4_fc_record_modified_inode(sb, ino);
> + if (ret)
> + goto out;
>
> raw_fc_inode = (struct ext4_inode *)
> (val + offsetof(struct ext4_fc_inode, fc_raw_inode));
> @@ -1621,6 +1624,8 @@ static int ext4_fc_replay_add_range(struct super_block *sb,
> }
>
> ret = ext4_fc_record_modified_inode(sb, inode->i_ino);
> + if (ret)
> + goto out;
>
> start = le32_to_cpu(ex->ee_block);
> start_pblk = ext4_ext_pblock(ex);
> @@ -1638,18 +1643,14 @@ static int ext4_fc_replay_add_range(struct super_block *sb,
> map.m_pblk = 0;
> ret = ext4_map_blocks(NULL, inode, &map, 0);
>
> - if (ret < 0) {
> - iput(inode);
> - return 0;
> - }
> + if (ret < 0)
> + goto out;
>
> if (ret == 0) {
> /* Range is not mapped */
> path = ext4_find_extent(inode, cur, NULL, 0);
> - if (IS_ERR(path)) {
> - iput(inode);
> - return 0;
> - }
> + if (IS_ERR(path))
> + goto out;
> memset(&newex, 0, sizeof(newex));
> newex.ee_block = cpu_to_le32(cur);
> ext4_ext_store_pblock(
> @@ -1663,10 +1664,8 @@ static int ext4_fc_replay_add_range(struct super_block *sb,
> up_write((&EXT4_I(inode)->i_data_sem));
> ext4_ext_drop_refs(path);
> kfree(path);
> - if (ret) {
> - iput(inode);
> - return 0;
> - }
> + if (ret)
> + goto out;
> goto next;
> }
>
> @@ -1679,10 +1678,8 @@ static int ext4_fc_replay_add_range(struct super_block *sb,
> ret = ext4_ext_replay_update_ex(inode, cur, map.m_len,
> ext4_ext_is_unwritten(ex),
> start_pblk + cur - start);
> - if (ret) {
> - iput(inode);
> - return 0;
> - }
> + if (ret)
> + goto out;
> /*
> * Mark the old blocks as free since they aren't used
> * anymore. We maintain an array of all the modified
> @@ -1702,10 +1699,8 @@ static int ext4_fc_replay_add_range(struct super_block *sb,
> ext4_ext_is_unwritten(ex), map.m_pblk);
> ret = ext4_ext_replay_update_ex(inode, cur, map.m_len,
> ext4_ext_is_unwritten(ex), map.m_pblk);
> - if (ret) {
> - iput(inode);
> - return 0;
> - }
> + if (ret)
> + goto out;
> /*
> * We may have split the extent tree while toggling the state.
> * Try to shrink the extent tree now.
> @@ -1717,6 +1712,7 @@ static int ext4_fc_replay_add_range(struct super_block *sb,
> }
> ext4_ext_replay_shrink_inode(inode, i_size_read(inode) >>
> sb->s_blocksize_bits);
> +out:
> iput(inode);
> return 0;
> }
> @@ -1746,6 +1742,8 @@ ext4_fc_replay_del_range(struct super_block *sb, struct ext4_fc_tl *tl,
> }
>
> ret = ext4_fc_record_modified_inode(sb, inode->i_ino);
> + if (ret)
> + goto out;
>
> jbd_debug(1, "DEL_RANGE, inode %ld, lblk %d, len %d\n",
> inode->i_ino, le32_to_cpu(lrange.fc_lblk),
> @@ -1755,10 +1753,8 @@ ext4_fc_replay_del_range(struct super_block *sb, struct ext4_fc_tl *tl,
> map.m_len = remaining;
>
> ret = ext4_map_blocks(NULL, inode, &map, 0);
> - if (ret < 0) {
> - iput(inode);
> - return 0;
> - }
> + if (ret < 0)
> + goto out;
> if (ret > 0) {
> remaining -= ret;
> cur += ret;
> @@ -1773,15 +1769,13 @@ ext4_fc_replay_del_range(struct super_block *sb, struct ext4_fc_tl *tl,
> ret = ext4_ext_remove_space(inode, lrange.fc_lblk,
> lrange.fc_lblk + lrange.fc_len - 1);
> up_write(&EXT4_I(inode)->i_data_sem);
> - if (ret) {
> - iput(inode);
> - return 0;
> - }
> + if (ret)
> + goto out;
> ext4_ext_replay_shrink_inode(inode,
> i_size_read(inode) >> sb->s_blocksize_bits);
> ext4_mark_inode_dirty(NULL, inode);
> +out:
> iput(inode);
> -
> return 0;
> }
>
> --
> 2.31.1
>
--
Jan Kara <[email protected]>
SUSE Labs, CR

2022-01-13 11:01:43

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH 4/6] jbd2: Cleanup unused functions declarations from jbd2.h

On Thu 13-01-22 08:56:27, Ritesh Harjani wrote:
> During code review found no references of few of these below function
> declarations. This patch cleans those up from jbd2.h
>
> Signed-off-by: Ritesh Harjani <[email protected]>

Spring cleaning :). Feel free to add:

Reviewed-by: Jan Kara <[email protected]>

Honza

> ---
> include/linux/jbd2.h | 7 -------
> 1 file changed, 7 deletions(-)
>
> diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
> index fd933c45281a..f76598265896 100644
> --- a/include/linux/jbd2.h
> +++ b/include/linux/jbd2.h
> @@ -1419,9 +1419,7 @@ extern void jbd2_journal_unfile_buffer(journal_t *, struct journal_head *);
> extern bool __jbd2_journal_refile_buffer(struct journal_head *);
> extern void jbd2_journal_refile_buffer(journal_t *, struct journal_head *);
> extern void __jbd2_journal_file_buffer(struct journal_head *, transaction_t *, int);
> -extern void __journal_free_buffer(struct journal_head *bh);
> extern void jbd2_journal_file_buffer(struct journal_head *, transaction_t *, int);
> -extern void __journal_clean_data_list(transaction_t *transaction);
> static inline void jbd2_file_log_bh(struct list_head *head, struct buffer_head *bh)
> {
> list_add_tail(&bh->b_assoc_buffers, head);
> @@ -1486,9 +1484,6 @@ extern int jbd2_journal_write_metadata_buffer(transaction_t *transaction,
> struct buffer_head **bh_out,
> sector_t blocknr);
>
> -/* Transaction locking */
> -extern void __wait_on_journal (journal_t *);
> -
> /* Transaction cache support */
> extern void jbd2_journal_destroy_transaction_cache(void);
> extern int __init jbd2_journal_init_transaction_cache(void);
> @@ -1774,8 +1769,6 @@ static inline unsigned long jbd2_log_space_left(journal_t *journal)
> #define BJ_Reserved 4 /* Buffer is reserved for access by journal */
> #define BJ_Types 5
>
> -extern int jbd_blocks_per_page(struct inode *inode);
> -
> /* JBD uses a CRC32 checksum */
> #define JBD_MAX_CHECKSUM_SIZE 4
>
> --
> 2.31.1
>
--
Jan Kara <[email protected]>
SUSE Labs, CR

2022-01-13 11:27:55

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH 6/6] jbd2: No need to use t_handle_lock in jbd2_journal_wait_updates

On Thu 13-01-22 08:56:29, Ritesh Harjani wrote:
> Since jbd2_journal_wait_updates() uses waitq based on t_updates atomic_t
> variable. So from code review it looks like we don't need to use
> t_handle_lock spinlock for checking t_updates value.
> Hence this patch gets rid of the spinlock protection in
> jbd2_journal_wait_updates()
>
> Signed-off-by: Ritesh Harjani <[email protected]>

This patch looks good. Feel free to add:

Reviewed-by: Jan Kara <[email protected]>

Actually looking at it, t_handle_lock seems to be very much unused. I agree
we don't need it when waiting for outstanding handles but the only
remaining uses are:

1) jbd2_journal_extend() where it is not needed either - we use
atomic_add_return() to manipulate t_outstanding_credits and hold
j_state_lock for reading which provides us enough exclusion.

2) update_t_max_wait() - this is the only valid use of t_handle_lock but we
can just switch it to cmpxchg loop with a bit of care. Something like:

unsigned long old;

ts = jbd2_time_diff(ts, transaction->t_start);
old = transaction->t_max_wait;
while (old < ts)
old = cmpxchg(&transaction->t_max_wait, old, ts);

So perhaps you can add two more patches to remove other t_handle_lock uses
and drop it completely.

Honza

> ---
> include/linux/jbd2.h | 4 ----
> 1 file changed, 4 deletions(-)
>
> diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
> index 34b051aa9009..9bef47622b9d 100644
> --- a/include/linux/jbd2.h
> +++ b/include/linux/jbd2.h
> @@ -1768,22 +1768,18 @@ static inline void jbd2_journal_wait_updates(journal_t *journal)
> if (!commit_transaction)
> return;
>
> - spin_lock(&commit_transaction->t_handle_lock);
> while (atomic_read(&commit_transaction->t_updates)) {
> DEFINE_WAIT(wait);
>
> prepare_to_wait(&journal->j_wait_updates, &wait,
> TASK_UNINTERRUPTIBLE);
> if (atomic_read(&commit_transaction->t_updates)) {
> - spin_unlock(&commit_transaction->t_handle_lock);
> write_unlock(&journal->j_state_lock);
> schedule();
> write_lock(&journal->j_state_lock);
> - spin_lock(&commit_transaction->t_handle_lock);
> }
> finish_wait(&journal->j_wait_updates, &wait);
> }
> - spin_unlock(&commit_transaction->t_handle_lock);
> }
>
> /*
> --
> 2.31.1
>
--
Jan Kara <[email protected]>
SUSE Labs, CR

2022-01-13 11:30:57

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH 5/6] jbd2: Refactor wait logic for transaction updates into a common function

On Thu 13-01-22 08:56:28, Ritesh Harjani wrote:
> No functionality change as such in this patch. This only refactors the
> common piece of code which waits for t_updates to finish into a common
> function named as jbd2_journal_wait_updates(journal_t *)
>
> Signed-off-by: Ritesh Harjani <[email protected]>

Just one nit, otherwise. Feel free to add:

Reviewed-by: Jan Kara <[email protected]>

> @@ -1757,6 +1757,35 @@ static inline unsigned long jbd2_log_space_left(journal_t *journal)
> return max_t(long, free, 0);
> }
>
> +/*
> + * Waits for any outstanding t_updates to finish.
> + * This is called with write j_state_lock held.
> + */
> +static inline void jbd2_journal_wait_updates(journal_t *journal)
> +{
> + transaction_t *commit_transaction = journal->j_running_transaction;
> +
> + if (!commit_transaction)
> + return;
> +
> + spin_lock(&commit_transaction->t_handle_lock);
> + while (atomic_read(&commit_transaction->t_updates)) {
> + DEFINE_WAIT(wait);
> +
> + prepare_to_wait(&journal->j_wait_updates, &wait,
> + TASK_UNINTERRUPTIBLE);
> + if (atomic_read(&commit_transaction->t_updates)) {
> + spin_unlock(&commit_transaction->t_handle_lock);
> + write_unlock(&journal->j_state_lock);
> + schedule();
> + write_lock(&journal->j_state_lock);
> + spin_lock(&commit_transaction->t_handle_lock);
> + }
> + finish_wait(&journal->j_wait_updates, &wait);
> + }
> + spin_unlock(&commit_transaction->t_handle_lock);
> +}
> +

I don't think making this inline makes sence. Neither the commit code nor
jbd2_journal_lock_updates() are so hot that it would warrant this large
inline function...

Honza

--
Jan Kara <[email protected]>
SUSE Labs, CR

2022-01-13 12:17:36

by Ritesh Harjani

[permalink] [raw]
Subject: Re: [PATCH 5/6] jbd2: Refactor wait logic for transaction updates into a common function

On 22/01/13 12:30PM, Jan Kara wrote:
> On Thu 13-01-22 08:56:28, Ritesh Harjani wrote:
> > No functionality change as such in this patch. This only refactors the
> > common piece of code which waits for t_updates to finish into a common
> > function named as jbd2_journal_wait_updates(journal_t *)
> >
> > Signed-off-by: Ritesh Harjani <[email protected]>
>
> Just one nit, otherwise. Feel free to add:
>
> Reviewed-by: Jan Kara <[email protected]>
>
> > @@ -1757,6 +1757,35 @@ static inline unsigned long jbd2_log_space_left(journal_t *journal)
> > return max_t(long, free, 0);
> > }
> >
> > +/*
> > + * Waits for any outstanding t_updates to finish.
> > + * This is called with write j_state_lock held.
> > + */
> > +static inline void jbd2_journal_wait_updates(journal_t *journal)
> > +{
> > + transaction_t *commit_transaction = journal->j_running_transaction;
> > +
> > + if (!commit_transaction)
> > + return;
> > +
> > + spin_lock(&commit_transaction->t_handle_lock);
> > + while (atomic_read(&commit_transaction->t_updates)) {
> > + DEFINE_WAIT(wait);
> > +
> > + prepare_to_wait(&journal->j_wait_updates, &wait,
> > + TASK_UNINTERRUPTIBLE);
> > + if (atomic_read(&commit_transaction->t_updates)) {
> > + spin_unlock(&commit_transaction->t_handle_lock);
> > + write_unlock(&journal->j_state_lock);
> > + schedule();
> > + write_lock(&journal->j_state_lock);
> > + spin_lock(&commit_transaction->t_handle_lock);
> > + }
> > + finish_wait(&journal->j_wait_updates, &wait);
> > + }
> > + spin_unlock(&commit_transaction->t_handle_lock);
> > +}
> > +
>
> I don't think making this inline makes sence. Neither the commit code nor
> jbd2_journal_lock_updates() are so hot that it would warrant this large
> inline function...

Yes, make sense. Thanks for the review.
Will do the needful in v2.

-ritesh

>
> Honza
>
> --
> Jan Kara <[email protected]>
> SUSE Labs, CR

2022-01-13 12:38:59

by Ritesh Harjani

[permalink] [raw]
Subject: Re: [PATCH 6/6] jbd2: No need to use t_handle_lock in jbd2_journal_wait_updates

On 22/01/13 12:27PM, Jan Kara wrote:
> On Thu 13-01-22 08:56:29, Ritesh Harjani wrote:
> > Since jbd2_journal_wait_updates() uses waitq based on t_updates atomic_t
> > variable. So from code review it looks like we don't need to use
> > t_handle_lock spinlock for checking t_updates value.
> > Hence this patch gets rid of the spinlock protection in
> > jbd2_journal_wait_updates()
> >
> > Signed-off-by: Ritesh Harjani <[email protected]>
>
> This patch looks good. Feel free to add:
>
> Reviewed-by: Jan Kara <[email protected]>
>
> Actually looking at it, t_handle_lock seems to be very much unused. I agree

I too had this thought in mind. Thanks for taking a deeper look into it :)

>
> we don't need it when waiting for outstanding handles but the only
> remaining uses are:
>
> 1) jbd2_journal_extend() where it is not needed either - we use
> atomic_add_return() to manipulate t_outstanding_credits and hold
> j_state_lock for reading which provides us enough exclusion.
>
> 2) update_t_max_wait() - this is the only valid use of t_handle_lock but we
> can just switch it to cmpxchg loop with a bit of care. Something like:
>
> unsigned long old;
>
> ts = jbd2_time_diff(ts, transaction->t_start);
> old = transaction->t_max_wait;
> while (old < ts)
> old = cmpxchg(&transaction->t_max_wait, old, ts);
>
> So perhaps you can add two more patches to remove other t_handle_lock uses
> and drop it completely.

Thanks for providing the details Jan :)
Agree with jbd2_journal_extend(). I did looked a bit around t_max_wait and
I agree that something like above could work. I will spend some more time around
that code and will submit those changes together in v2.

-ritesh

>
> Honza
>
> > ---
> > include/linux/jbd2.h | 4 ----
> > 1 file changed, 4 deletions(-)
> >
> > diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
> > index 34b051aa9009..9bef47622b9d 100644
> > --- a/include/linux/jbd2.h
> > +++ b/include/linux/jbd2.h
> > @@ -1768,22 +1768,18 @@ static inline void jbd2_journal_wait_updates(journal_t *journal)
> > if (!commit_transaction)
> > return;
> >
> > - spin_lock(&commit_transaction->t_handle_lock);
> > while (atomic_read(&commit_transaction->t_updates)) {
> > DEFINE_WAIT(wait);
> >
> > prepare_to_wait(&journal->j_wait_updates, &wait,
> > TASK_UNINTERRUPTIBLE);
> > if (atomic_read(&commit_transaction->t_updates)) {
> > - spin_unlock(&commit_transaction->t_handle_lock);
> > write_unlock(&journal->j_state_lock);
> > schedule();
> > write_lock(&journal->j_state_lock);
> > - spin_lock(&commit_transaction->t_handle_lock);
> > }
> > finish_wait(&journal->j_wait_updates, &wait);
> > }
> > - spin_unlock(&commit_transaction->t_handle_lock);
> > }
> >
> > /*
> > --
> > 2.31.1
> >
> --
> Jan Kara <[email protected]>
> SUSE Labs, CR