2023-02-09 11:56:55

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH 00/21] Some bugfix and cleanup to mballoc

Hi, this series contain some random cleanup patches and some bugfix
patches to make EXT4_MB_HINT_GOAL_ONLY work properly, protect pa->pa_free
from race and so on. More details can be found in git log.
Thanks!

Kemeng Shi (21):
ext4: set goal start correctly in ext4_mb_normalize_request
ext4: allow to find by goal if EXT4_MB_HINT_GOAL_ONLY is set
ext4: avoid to use preallocated blocks if EXT4_MB_HINT_GOAL_ONLY is
set
ext4: get correct ext4_group_info in ext4_mb_prefetch_fini
ext4: correct calculation of s_mb_preallocated
ext4: correct start of used group pa for debug in ext4_mb_use_group_pa
ext4: protect pa->pa_free in ext4_discard_allocated_blocks
ext4: add missed brelse in ext4_free_blocks_simple
ext4: remove unused return value of ext4_mb_try_best_found and
ext4_mb_free_metadata
ext4: Remove unnecessary release when memory allocation failed in
ext4_mb_init_cache
ext4: remove unnecessary e4b->bd_buddy_page check in
ext4_mb_load_buddy_gfp
ext4: remove unnecessary check in ext4_mb_new_blocks
ext4: remove dead check in mb_buddy_mark_free
ext4: remove ac->ac_found > sbi->s_mb_min_to_scan dead check in
ext4_mb_check_limits
ext4: use best found when complex scan of group finishs
ext4: remove unnecessary exit_meta_group_info tag
ext4: remove unnecessary count2 in ext4_free_data_in_buddy
ext4: remove unnecessary goto in ext4_mb_mark_diskspace_used
ext4: remove repeat assignment to ac_f_ex
ext4: remove comment code ext4_discard_preallocations
ext4: simplify calculation of blkoff in ext4_mb_new_blocks_simple

fs/ext4/mballoc.c | 105 ++++++++++++++++++----------------------------
1 file changed, 40 insertions(+), 65 deletions(-)

--
2.30.0



2023-02-09 11:57:01

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH 01/21] ext4: set goal start correctly in ext4_mb_normalize_request

We need to set ac_g_ex to notify the goal start used in
ext4_mb_find_by_goal. Set ac_g_ex instead of ac_f_ex in
ext4_mb_normalize_request.

Signed-off-by: Kemeng Shi <[email protected]>
---
fs/ext4/mballoc.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 5b2ae37a8b80..0650a1dc870e 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4191,15 +4191,15 @@ ext4_mb_normalize_request(struct ext4_allocation_context *ac,
if (ar->pright && (ar->lright == (start + size))) {
/* merge to the right */
ext4_get_group_no_and_offset(ac->ac_sb, ar->pright - size,
- &ac->ac_f_ex.fe_group,
- &ac->ac_f_ex.fe_start);
+ &ac->ac_g_ex.fe_group,
+ &ac->ac_g_ex.fe_start);
ac->ac_flags |= EXT4_MB_HINT_TRY_GOAL;
}
if (ar->pleft && (ar->lleft + 1 == start)) {
/* merge to the left */
ext4_get_group_no_and_offset(ac->ac_sb, ar->pleft + 1,
- &ac->ac_f_ex.fe_group,
- &ac->ac_f_ex.fe_start);
+ &ac->ac_g_ex.fe_group,
+ &ac->ac_g_ex.fe_start);
ac->ac_flags |= EXT4_MB_HINT_TRY_GOAL;
}

--
2.30.0


2023-02-09 11:57:06

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH 02/21] ext4: allow to find by goal if EXT4_MB_HINT_GOAL_ONLY is set

If EXT4_MB_HINT_GOAL_ONLY is set, ext4_mb_regular_allocator will only
allocate blocks from ext4_mb_find_by_goal. Allow to find by goal in
ext4_mb_find_by_goal if EXT4_MB_HINT_GOAL_ONLY is set or allocation
with EXT4_MB_HINT_GOAL_ONLY set will always fail.

EXT4_MB_HINT_GOAL_ONLY is not used at all, so the problem is not
found for now.

Signed-off-by: Kemeng Shi <[email protected]>
---
fs/ext4/mballoc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 0650a1dc870e..375d9655b525 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2162,7 +2162,7 @@ int ext4_mb_find_by_goal(struct ext4_allocation_context *ac,
struct ext4_group_info *grp = ext4_get_group_info(ac->ac_sb, group);
struct ext4_free_extent ex;

- if (!(ac->ac_flags & EXT4_MB_HINT_TRY_GOAL))
+ if (!(ac->ac_flags & (EXT4_MB_HINT_TRY_GOAL | EXT4_MB_HINT_GOAL_ONLY)))
return 0;
if (grp->bb_free == 0)
return 0;
--
2.30.0


2023-02-09 11:57:09

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH 03/21] ext4: avoid to use preallocated blocks if EXT4_MB_HINT_GOAL_ONLY is set

ext4_mb_use_preallocated will ignore the demand to alloc at goal block
only. Return false if EXT4_MB_HINT_GOAL_ONLY is set before use
preallocated blocks in ext4_mb_use_preallocated.

Signed-off-by: Kemeng Shi <[email protected]>
---
fs/ext4/mballoc.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 375d9655b525..352ac9139fee 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4368,6 +4368,9 @@ ext4_mb_use_preallocated(struct ext4_allocation_context *ac)
if (!(ac->ac_flags & EXT4_MB_HINT_DATA))
return false;

+ if (unlikely(ac->ac_flags & EXT4_MB_HINT_GOAL_ONLY))
+ return false;
+
/* first, try per-file preallocation */
rcu_read_lock();
list_for_each_entry_rcu(pa, &ei->i_prealloc_list, pa_inode_list) {
--
2.30.0


2023-02-09 11:57:13

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH 04/21] ext4: get correct ext4_group_info in ext4_mb_prefetch_fini

We always get ext4_group_desc with group + 1 and ext4_group_info with
group to check if we need do initialize ext4_group_info for the group.
Just get ext4_group_desc with group for ext4_group_info initialization
check.

Signed-off-by: Kemeng Shi <[email protected]>
---
fs/ext4/mballoc.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 352ac9139fee..f24f80ecf318 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2570,13 +2570,13 @@ void ext4_mb_prefetch_fini(struct super_block *sb, ext4_group_t group,
unsigned int nr)
{
while (nr-- > 0) {
- struct ext4_group_desc *gdp = ext4_get_group_desc(sb, group,
- NULL);
- struct ext4_group_info *grp = ext4_get_group_info(sb, group);
+ struct ext4_group_desc *gdp;
+ struct ext4_group_info *grp;

if (!group)
group = ext4_get_groups_count(sb);
group--;
+ gdp = ext4_get_group_desc(sb, group, NULL);
grp = ext4_get_group_info(sb, group);

if (EXT4_MB_GRP_NEED_INIT(grp) &&
--
2.30.0


2023-02-09 11:57:19

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH 05/21] ext4: correct calculation of s_mb_preallocated

We will add pa_free to s_mb_preallocated when new ext4_prealloc_space is
created. In ext4_mb_new_inode_pa, we will call ext4_mb_use_inode_pa
before adding pa_free to s_mb_preallocated. However, ext4_mb_use_inode_pa
will consume pa_free for block allocation which triggerred the creation
of ext4_prealloc_space. Add pa_free to s_mb_preallocated before
ext4_mb_use_inode_pa to correct calculation of s_mb_preallocated.
There is no such problem in ext4_mb_new_group_pa as pa_free of group pa
is consumed in ext4_mb_release_context instead of ext4_mb_use_group_pa.

Signed-off-by: Kemeng Shi <[email protected]>
---
fs/ext4/mballoc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index f24f80ecf318..2bffc93778d5 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4670,8 +4670,8 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac)
pa->pa_len, pa->pa_lstart);
trace_ext4_mb_new_inode_pa(ac, pa);

- ext4_mb_use_inode_pa(ac, pa);
atomic_add(pa->pa_free, &sbi->s_mb_preallocated);
+ ext4_mb_use_inode_pa(ac, pa);

ei = EXT4_I(ac->ac_inode);
grp = ext4_get_group_info(sb, ac->ac_b_ex.fe_group);
--
2.30.0


2023-02-09 11:57:24

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH 07/21] ext4: protect pa->pa_free in ext4_discard_allocated_blocks

If ext4_mb_mark_diskspace_used fails in ext4_mb_new_blocks, we may
discard pa already in list. Protect pa with pa_lock to avoid race.

Signed-off-by: Kemeng Shi <[email protected]>
---
fs/ext4/mballoc.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 433337ac8da2..4e1caac74b44 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4263,8 +4263,11 @@ static void ext4_discard_allocated_blocks(struct ext4_allocation_context *ac)
ext4_mb_unload_buddy(&e4b);
return;
}
- if (pa->pa_type == MB_INODE_PA)
+ if (pa->pa_type == MB_INODE_PA) {
+ spin_lock(&pa->pa_lock);
pa->pa_free += ac->ac_b_ex.fe_len;
+ spin_unlock(&pa->pa_lock);
+ }
}

/*
--
2.30.0


2023-02-09 11:57:29

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH 06/21] ext4: correct start of used group pa for debug in ext4_mb_use_group_pa

As we don't correct pa_lstart here, so there is no need to subtract
pa_lstart with consumed len.

Signed-off-by: Kemeng Shi <[email protected]>
---
fs/ext4/mballoc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 2bffc93778d5..433337ac8da2 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4319,7 +4319,7 @@ static void ext4_mb_use_group_pa(struct ext4_allocation_context *ac,
* Other CPUs are prevented from allocating from this pa by lg_mutex
*/
mb_debug(ac->ac_sb, "use %u/%u from group pa %p\n",
- pa->pa_lstart-len, len, pa);
+ pa->pa_lstart, len, pa);
}

/*
--
2.30.0


2023-02-09 11:57:33

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH 08/21] ext4: add missed brelse in ext4_free_blocks_simple

Release bitmap buffer_head we got if error occurs.
Besides, this patch remove unused assignment to err.

Signed-off-by: Kemeng Shi <[email protected]>
---
fs/ext4/mballoc.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 4e1caac74b44..17ac98c501ed 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -5848,13 +5848,12 @@ static void ext4_free_blocks_simple(struct inode *inode, ext4_fsblk_t block,
ext4_get_group_no_and_offset(sb, block, &group, &blkoff);
bitmap_bh = ext4_read_block_bitmap(sb, group);
if (IS_ERR(bitmap_bh)) {
- err = PTR_ERR(bitmap_bh);
pr_warn("Failed to read block bitmap\n");
return;
}
gdp = ext4_get_group_desc(sb, group, &gdp_bh);
if (!gdp)
- return;
+ goto err_out;

for (i = 0; i < count; i++) {
if (!mb_test_bit(blkoff + i, bitmap_bh->b_data))
@@ -5863,7 +5862,7 @@ static void ext4_free_blocks_simple(struct inode *inode, ext4_fsblk_t block,
mb_clear_bits(bitmap_bh->b_data, blkoff, count);
err = ext4_handle_dirty_metadata(NULL, NULL, bitmap_bh);
if (err)
- return;
+ goto err_out;
ext4_free_group_clusters_set(
sb, gdp, ext4_free_group_clusters(sb, gdp) +
count - already_freed);
@@ -5872,6 +5871,8 @@ static void ext4_free_blocks_simple(struct inode *inode, ext4_fsblk_t block,
ext4_handle_dirty_metadata(NULL, NULL, gdp_bh);
sync_dirty_buffer(bitmap_bh);
sync_dirty_buffer(gdp_bh);
+
+err_out:
brelse(bitmap_bh);
}

--
2.30.0


2023-02-09 11:57:40

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH 09/21] ext4: remove unused return value of ext4_mb_try_best_found and ext4_mb_free_metadata

Return value static function ext4_mb_try_best_found and
ext4_mb_free_metadata is not used. Just remove unused return value.

Signed-off-by: Kemeng Shi <[email protected]>
---
fs/ext4/mballoc.c | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 17ac98c501ed..ad9e3b7d3198 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2124,7 +2124,7 @@ static void ext4_mb_measure_extent(struct ext4_allocation_context *ac,
}

static noinline_for_stack
-int ext4_mb_try_best_found(struct ext4_allocation_context *ac,
+void ext4_mb_try_best_found(struct ext4_allocation_context *ac,
struct ext4_buddy *e4b)
{
struct ext4_free_extent ex = ac->ac_b_ex;
@@ -2135,7 +2135,7 @@ int ext4_mb_try_best_found(struct ext4_allocation_context *ac,
BUG_ON(ex.fe_len <= 0);
err = ext4_mb_load_buddy(ac->ac_sb, group, e4b);
if (err)
- return err;
+ return;

ext4_lock_group(ac->ac_sb, group);
max = mb_find_extent(e4b, ex.fe_start, ex.fe_len, &ex);
@@ -2147,8 +2147,6 @@ int ext4_mb_try_best_found(struct ext4_allocation_context *ac,

ext4_unlock_group(ac->ac_sb, group);
ext4_mb_unload_buddy(e4b);
-
- return 0;
}

static noinline_for_stack
@@ -5699,7 +5697,7 @@ static void ext4_try_merge_freed_extent(struct ext4_sb_info *sbi,
kmem_cache_free(ext4_free_data_cachep, entry);
}

-static noinline_for_stack int
+static noinline_for_stack void
ext4_mb_free_metadata(handle_t *handle, struct ext4_buddy *e4b,
struct ext4_free_data *new_entry)
{
@@ -5742,7 +5740,7 @@ ext4_mb_free_metadata(handle_t *handle, struct ext4_buddy *e4b,
EXT4_C2B(sbi, cluster),
"Block already on to-be-freed list");
kmem_cache_free(ext4_free_data_cachep, new_entry);
- return 0;
+ return;
}
}

@@ -5768,7 +5766,6 @@ ext4_mb_free_metadata(handle_t *handle, struct ext4_buddy *e4b,
list_add_tail(&new_entry->efd_list, &sbi->s_freed_data_list);
sbi->s_mb_free_pending += clusters;
spin_unlock(&sbi->s_md_lock);
- return 0;
}

/*
--
2.30.0


2023-02-09 11:57:45

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH 10/21] ext4: Remove unnecessary release when memory allocation failed in ext4_mb_init_cache

If we alloc array of buffer_head failed, there is no resource need to be
freed and we can simpily return error.

Signed-off-by: Kemeng Shi <[email protected]>
---
fs/ext4/mballoc.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index ad9e3b7d3198..15fc7105becc 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -1168,10 +1168,8 @@ static int ext4_mb_init_cache(struct page *page, char *incore, gfp_t gfp)
if (groups_per_page > 1) {
i = sizeof(struct buffer_head *) * groups_per_page;
bh = kzalloc(i, gfp);
- if (bh == NULL) {
- err = -ENOMEM;
- goto out;
- }
+ if (bh == NULL)
+ return -ENOMEM;
} else
bh = &bhs;

--
2.30.0


2023-02-09 11:57:50

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH 11/21] ext4: remove unnecessary e4b->bd_buddy_page check in ext4_mb_load_buddy_gfp

e4b->bd_buddy_page is only set if we initialize ext4_buddy successfully. So
e4b->bd_buddy_page is always NULL in error handle branch. Just remove the
dead check.

Signed-off-by: Kemeng Shi <[email protected]>
---
fs/ext4/mballoc.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 15fc7105becc..74da24c9bf14 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -1555,8 +1555,7 @@ ext4_mb_load_buddy_gfp(struct super_block *sb, ext4_group_t group,
put_page(page);
if (e4b->bd_bitmap_page)
put_page(e4b->bd_bitmap_page);
- if (e4b->bd_buddy_page)
- put_page(e4b->bd_buddy_page);
+
e4b->bd_buddy = NULL;
e4b->bd_bitmap = NULL;
return ret;
--
2.30.0


2023-02-09 11:57:53

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH 13/21] ext4: remove dead check in mb_buddy_mark_free

We always adjust first to even number and adjust last to odd number, so
first == last will never happen. Remove this dead check.

Signed-off-by: Kemeng Shi <[email protected]>
---
fs/ext4/mballoc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index bdabe0d81d4a..0fdbf16ac180 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -1718,7 +1718,8 @@ static void mb_buddy_mark_free(struct ext4_buddy *e4b, int first, int last)
break;
order++;

- if (first == last || !(buddy2 = mb_find_buddy(e4b, order, &max))) {
+ buddy2 = mb_find_buddy(e4b, order, &max);
+ if (!buddy2) {
mb_clear_bits(buddy, first, last - first + 1);
e4b->bd_info->bb_counters[order - 1] += last - first + 1;
break;
--
2.30.0


2023-02-09 11:57:57

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH 14/21] ext4: remove ac->ac_found > sbi->s_mb_min_to_scan dead check in ext4_mb_check_limits

Only call trace of ext4_mb_check_limits is as following:
ext4_mb_complex_scan_group
ext4_mb_measure_extent
ext4_mb_check_limits(ac, e4b, 0);
ext4_mb_check_limits(ac, e4b, 1);

If the first ac->ac_found > sbi->s_mb_max_to_scan check in
ext4_mb_check_limits is met, we will set ac_status to
AC_STATUS_BREAK and call ext4_mb_try_best_found to try to use
ac->ac_b_ex.
If ext4_mb_try_best_found successes, then block allocation finishs,
the removed ac->ac_found > sbi->s_mb_min_to_scan check is not reachable.
If ext4_mb_try_best_found fails, then we set EXT4_MB_HINT_FIRST and
reset ac->ac_b_ex to retry block allocation. We will use any found
free extent in ext4_mb_measure_extent before reach the removed
ac->ac_found > sbi->s_mb_min_to_scan check.
In summary, the removed ac->ac_found > sbi->s_mb_min_to_scan check is
not reachable and we can remove that dead check.

Signed-off-by: Kemeng Shi <[email protected]>
---
fs/ext4/mballoc.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 0fdbf16ac180..e53f84de5018 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2039,8 +2039,7 @@ static void ext4_mb_check_limits(struct ext4_allocation_context *ac,
if (bex->fe_len < gex->fe_len)
return;

- if ((finish_group || ac->ac_found > sbi->s_mb_min_to_scan)
- && bex->fe_group == e4b->bd_group) {
+ if (finish_group && bex->fe_group == e4b->bd_group) {
/* recheck chunk's availability - we don't know
* when it was found (within this lock-unlock
* period or not) */
--
2.30.0


2023-02-09 11:58:00

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH 12/21] ext4: remove unnecessary check in ext4_mb_new_blocks

1. remove unnecessary ac check:
We always go to out tag before ac is successfully allocated, then we can
move out tag after free of ac and remove NULL check of ac.

2. remove unnecessary *errp check:
We always go to errout tag if *errp is non-zero, then we can move errout
tag into error handle if *errp is non-zero.

Signed-off-by: Kemeng Shi <[email protected]>
---
fs/ext4/mballoc.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 74da24c9bf14..bdabe0d81d4a 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -5641,16 +5641,15 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle,
*errp = -ENOSPC;
}

-errout:
if (*errp) {
+errout:
ac->ac_b_ex.fe_len = 0;
ar->len = 0;
ext4_mb_show_ac(ac);
}
ext4_mb_release_context(ac);
+ kmem_cache_free(ext4_ac_cachep, ac);
out:
- if (ac)
- kmem_cache_free(ext4_ac_cachep, ac);
if (inquota && ar->len < inquota)
dquot_free_block(ar->inode, EXT4_C2B(sbi, inquota - ar->len));
if (!ar->len) {
--
2.30.0


2023-02-09 11:58:03

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH 15/21] ext4: use best found when complex scan of group finishs

If any bex which meets bex->fe_len >= gex->fe_len is found, then it will
always be used when complex scan of group that bex belongs to finishs.
So there will not be any lock-unlock period.

Signed-off-by: Kemeng Shi <[email protected]>
---
fs/ext4/mballoc.c | 14 ++------------
1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index e53f84de5018..c684758d6dbb 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2019,8 +2019,6 @@ static void ext4_mb_check_limits(struct ext4_allocation_context *ac,
struct ext4_sb_info *sbi = EXT4_SB(ac->ac_sb);
struct ext4_free_extent *bex = &ac->ac_b_ex;
struct ext4_free_extent *gex = &ac->ac_g_ex;
- struct ext4_free_extent ex;
- int max;

if (ac->ac_status == AC_STATUS_FOUND)
return;
@@ -2039,16 +2037,8 @@ static void ext4_mb_check_limits(struct ext4_allocation_context *ac,
if (bex->fe_len < gex->fe_len)
return;

- if (finish_group && bex->fe_group == e4b->bd_group) {
- /* recheck chunk's availability - we don't know
- * when it was found (within this lock-unlock
- * period or not) */
- max = mb_find_extent(e4b, bex->fe_start, gex->fe_len, &ex);
- if (max >= gex->fe_len) {
- ext4_mb_use_best_found(ac, e4b);
- return;
- }
- }
+ if (finish_group)
+ ext4_mb_use_best_found(ac, e4b);
}

/*
--
2.30.0


2023-02-09 11:58:05

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH 16/21] ext4: remove unnecessary exit_meta_group_info tag

We goto exit_meta_group_info only to return -ENOMEM. Return -ENOMEM
directly instead of goto to remove this unnecessary tag.

Signed-off-by: Kemeng Shi <[email protected]>
---
fs/ext4/mballoc.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index c684758d6dbb..289dcd81dd5a 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -3069,7 +3069,7 @@ int ext4_mb_add_groupinfo(struct super_block *sb, ext4_group_t group,
if (meta_group_info == NULL) {
ext4_msg(sb, KERN_ERR, "can't allocate mem "
"for a buddy group");
- goto exit_meta_group_info;
+ return -ENOMEM;
}
rcu_read_lock();
rcu_dereference(sbi->s_group_info)[idx] = meta_group_info;
@@ -3123,7 +3123,6 @@ int ext4_mb_add_groupinfo(struct super_block *sb, ext4_group_t group,
group_info[idx] = NULL;
rcu_read_unlock();
}
-exit_meta_group_info:
return -ENOMEM;
} /* ext4_mb_add_groupinfo */

--
2.30.0


2023-02-09 11:58:07

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH 17/21] ext4: remove unnecessary count2 in ext4_free_data_in_buddy

count2 is always 1 in mb_debug, just remove unnecessary count2.

Signed-off-by: Kemeng Shi <[email protected]>
---
fs/ext4/mballoc.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 289dcd81dd5a..f9fc461b633f 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -3590,7 +3590,7 @@ static void ext4_free_data_in_buddy(struct super_block *sb,
{
struct ext4_buddy e4b;
struct ext4_group_info *db;
- int err, count = 0, count2 = 0;
+ int err, count = 0;

mb_debug(sb, "gonna free %u blocks in group %u (0x%p):",
entry->efd_count, entry->efd_group, entry);
@@ -3606,7 +3606,6 @@ static void ext4_free_data_in_buddy(struct super_block *sb,
db = e4b.bd_info;
/* there are blocks to put in buddy to make them really free */
count += entry->efd_count;
- count2++;
ext4_lock_group(sb, entry->efd_group);
/* Take it out of per group rb tree */
rb_erase(&entry->efd_node, &(db->bb_free_root));
@@ -3631,8 +3630,7 @@ static void ext4_free_data_in_buddy(struct super_block *sb,
ext4_unlock_group(sb, entry->efd_group);
ext4_mb_unload_buddy(&e4b);

- mb_debug(sb, "freed %d blocks in %d structures\n", count,
- count2);
+ mb_debug(sb, "freed %d blocks in 1 structures\n", count);
}

/*
--
2.30.0


2023-02-09 11:58:11

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH 18/21] ext4: remove unnecessary goto in ext4_mb_mark_diskspace_used

When ext4_read_block_bitmap fails, we can return PTR_ERR(bitmap_bh) to
remove unnecessary NULL check of bitmap_bh.

Signed-off-by: Kemeng Shi <[email protected]>
---
fs/ext4/mballoc.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index f9fc461b633f..7d6991af50d8 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -3739,9 +3739,7 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,

bitmap_bh = ext4_read_block_bitmap(sb, ac->ac_b_ex.fe_group);
if (IS_ERR(bitmap_bh)) {
- err = PTR_ERR(bitmap_bh);
- bitmap_bh = NULL;
- goto out_err;
+ return PTR_ERR(bitmap_bh);
}

BUFFER_TRACE(bitmap_bh, "getting write access");
--
2.30.0


2023-02-09 11:58:15

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH 19/21] ext4: remove repeat assignment to ac_f_ex

Call trace to assign ac_f_ex:
ext4_mb_use_best_found
ac->ac_f_ex = ac->ac_b_ex;
ext4_mb_new_preallocation
ext4_mb_new_group_pa
ac->ac_f_ex = ac->ac_b_ex;
ext4_mb_new_inode_pa
ac->ac_f_ex = ac->ac_b_ex;

Actually allocated blocks is already stored in ac_f_ex in
ext4_mb_use_best_found, so there is no need to assign ac_f_ex
in ext4_mb_new_group_pa and ext4_mb_new_inode_pa.
Just remove repeat assignment to ac_f_ex in ext4_mb_new_group_pa
and ext4_mb_new_inode_pa.

Signed-off-by: Kemeng Shi <[email protected]>
---
fs/ext4/mballoc.c | 8 --------
1 file changed, 8 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 7d6991af50d8..dec716f331ac 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4635,10 +4635,6 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac)
BUG_ON(ac->ac_o_ex.fe_len > ac->ac_b_ex.fe_len);
}

- /* preallocation can change ac_b_ex, thus we store actually
- * allocated blocks for history */
- ac->ac_f_ex = ac->ac_b_ex;
-
pa->pa_lstart = ac->ac_b_ex.fe_logical;
pa->pa_pstart = ext4_grp_offs_to_block(sb, &ac->ac_b_ex);
pa->pa_len = ac->ac_b_ex.fe_len;
@@ -4689,10 +4685,6 @@ ext4_mb_new_group_pa(struct ext4_allocation_context *ac)

pa = ac->ac_pa;

- /* preallocation can change ac_b_ex, thus we store actually
- * allocated blocks for history */
- ac->ac_f_ex = ac->ac_b_ex;
-
pa->pa_pstart = ext4_grp_offs_to_block(sb, &ac->ac_b_ex);
pa->pa_lstart = pa->pa_pstart;
pa->pa_len = ac->ac_b_ex.fe_len;
--
2.30.0


2023-02-09 11:58:24

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH 20/21] ext4: remove comment code ext4_discard_preallocations

Just remove comment code in ext4_discard_preallocations.

Signed-off-by: Kemeng Shi <[email protected]>
---
fs/ext4/mballoc.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index dec716f331ac..5bc30cd5debf 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4925,7 +4925,6 @@ void ext4_discard_preallocations(struct inode *inode, unsigned int needed)
int err;

if (!S_ISREG(inode->i_mode)) {
- /*BUG_ON(!list_empty(&ei->i_prealloc_list));*/
return;
}

--
2.30.0


2023-02-09 11:58:28

by Kemeng Shi

[permalink] [raw]
Subject: [PATCH 21/21] ext4: simplify calculation of blkoff in ext4_mb_new_blocks_simple

We try to allocate a block from goal in ext4_mb_new_blocks_simple. We
only need get blkoff in first group with goal and set blkoff to 0 for
the rest groups.

Signed-off-by: Kemeng Shi <[email protected]>
---
fs/ext4/mballoc.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 5bc30cd5debf..bd5e4db677c7 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -5772,9 +5772,6 @@ static ext4_fsblk_t ext4_mb_new_blocks_simple(handle_t *handle,
return 0;
}

- ext4_get_group_no_and_offset(sb,
- max(ext4_group_first_block_no(sb, group), goal),
- NULL, &blkoff);
while (1) {
i = mb_find_next_zero_bit(bitmap_bh->b_data, max,
blkoff);
@@ -5789,6 +5786,8 @@ static ext4_fsblk_t ext4_mb_new_blocks_simple(handle_t *handle,
brelse(bitmap_bh);
if (i < max)
break;
+
+ blkoff = 0;
}

if (group >= ext4_get_groups_count(sb) || i >= max) {
--
2.30.0


2023-02-13 06:57:12

by Ojaswin Mujoo

[permalink] [raw]
Subject: Re: [PATCH 01/21] ext4: set goal start correctly in ext4_mb_normalize_request

On Fri, Feb 10, 2023 at 03:48:05AM +0800, Kemeng Shi wrote:
> We need to set ac_g_ex to notify the goal start used in
> ext4_mb_find_by_goal. Set ac_g_ex instead of ac_f_ex in
> ext4_mb_normalize_request.
>
> Signed-off-by: Kemeng Shi <[email protected]>
> ---
> fs/ext4/mballoc.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 5b2ae37a8b80..0650a1dc870e 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -4191,15 +4191,15 @@ ext4_mb_normalize_request(struct ext4_allocation_context *ac,
> if (ar->pright && (ar->lright == (start + size))) {
> /* merge to the right */
> ext4_get_group_no_and_offset(ac->ac_sb, ar->pright - size,
> - &ac->ac_f_ex.fe_group,
> - &ac->ac_f_ex.fe_start);
> + &ac->ac_g_ex.fe_group,
> + &ac->ac_g_ex.fe_start);
> ac->ac_flags |= EXT4_MB_HINT_TRY_GOAL;
> }
> if (ar->pleft && (ar->lleft + 1 == start)) {
> /* merge to the left */
> ext4_get_group_no_and_offset(ac->ac_sb, ar->pleft + 1,
> - &ac->ac_f_ex.fe_group,
> - &ac->ac_f_ex.fe_start);
> + &ac->ac_g_ex.fe_group,
> + &ac->ac_g_ex.fe_start);
> ac->ac_flags |= EXT4_MB_HINT_TRY_GOAL;
> }

Hi Kemeng,

Looks good, feel free to add:

Reviewed-by: Ojaswin Mujoo <[email protected]>


2023-02-13 06:57:40

by Ojaswin Mujoo

[permalink] [raw]
Subject: Re: [PATCH 02/21] ext4: allow to find by goal if EXT4_MB_HINT_GOAL_ONLY is set

On Fri, Feb 10, 2023 at 03:48:06AM +0800, Kemeng Shi wrote:
> If EXT4_MB_HINT_GOAL_ONLY is set, ext4_mb_regular_allocator will only
> allocate blocks from ext4_mb_find_by_goal. Allow to find by goal in
> ext4_mb_find_by_goal if EXT4_MB_HINT_GOAL_ONLY is set or allocation
> with EXT4_MB_HINT_GOAL_ONLY set will always fail.
>
> EXT4_MB_HINT_GOAL_ONLY is not used at all, so the problem is not
> found for now.
>
> Signed-off-by: Kemeng Shi <[email protected]>
> ---
> fs/ext4/mballoc.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 0650a1dc870e..375d9655b525 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -2162,7 +2162,7 @@ int ext4_mb_find_by_goal(struct ext4_allocation_context *ac,
> struct ext4_group_info *grp = ext4_get_group_info(ac->ac_sb, group);
> struct ext4_free_extent ex;
>
> - if (!(ac->ac_flags & EXT4_MB_HINT_TRY_GOAL))
> + if (!(ac->ac_flags & (EXT4_MB_HINT_TRY_GOAL | EXT4_MB_HINT_GOAL_ONLY)))
> return 0;
> if (grp->bb_free == 0)
> return 0;
Looks good to me, feel free to add:

Reviewed-by: Ojaswin Mujoo <[email protected]>

2023-02-13 07:03:59

by Ojaswin Mujoo

[permalink] [raw]
Subject: Re: [PATCH 04/21] ext4: get correct ext4_group_info in ext4_mb_prefetch_fini

On Fri, Feb 10, 2023 at 03:48:08AM +0800, Kemeng Shi wrote:
> We always get ext4_group_desc with group + 1 and ext4_group_info with
> group to check if we need do initialize ext4_group_info for the group.
> Just get ext4_group_desc with group for ext4_group_info initialization
> check.
>
> Signed-off-by: Kemeng Shi <[email protected]>
> ---
> fs/ext4/mballoc.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 352ac9139fee..f24f80ecf318 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -2570,13 +2570,13 @@ void ext4_mb_prefetch_fini(struct super_block *sb, ext4_group_t group,
> unsigned int nr)
> {
> while (nr-- > 0) {
> - struct ext4_group_desc *gdp = ext4_get_group_desc(sb, group,
> - NULL);
> - struct ext4_group_info *grp = ext4_get_group_info(sb, group);
> + struct ext4_group_desc *gdp;
> + struct ext4_group_info *grp;
>
> if (!group)
> group = ext4_get_groups_count(sb);
> group--;
> + gdp = ext4_get_group_desc(sb, group, NULL);
> grp = ext4_get_group_info(sb, group);
>
> if (EXT4_MB_GRP_NEED_INIT(grp) &&
> --
> 2.30.0
>

This is a duplicate of [1] :)

But I'm okay with this going in as that patchseries might take some time
to get merged. Feel free to add:

Reviewed-by: Ojaswin Mujoo <[email protected]>

[1] https://lore.kernel.org/r/85bbcb3774e38de65b737ef0000241ddbdda73aa.1674822311.git.ojaswin@linux.ibm.com

2023-02-13 07:05:04

by Ojaswin Mujoo

[permalink] [raw]
Subject: Re: [PATCH 05/21] ext4: correct calculation of s_mb_preallocated

On Fri, Feb 10, 2023 at 03:48:09AM +0800, Kemeng Shi wrote:
> We will add pa_free to s_mb_preallocated when new ext4_prealloc_space is
> created. In ext4_mb_new_inode_pa, we will call ext4_mb_use_inode_pa
> before adding pa_free to s_mb_preallocated. However, ext4_mb_use_inode_pa
> will consume pa_free for block allocation which triggerred the creation
> of ext4_prealloc_space. Add pa_free to s_mb_preallocated before
> ext4_mb_use_inode_pa to correct calculation of s_mb_preallocated.
> There is no such problem in ext4_mb_new_group_pa as pa_free of group pa
> is consumed in ext4_mb_release_context instead of ext4_mb_use_group_pa.
>
> Signed-off-by: Kemeng Shi <[email protected]>
> ---
> fs/ext4/mballoc.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index f24f80ecf318..2bffc93778d5 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -4670,8 +4670,8 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac)
> pa->pa_len, pa->pa_lstart);
> trace_ext4_mb_new_inode_pa(ac, pa);
>
> - ext4_mb_use_inode_pa(ac, pa);
> atomic_add(pa->pa_free, &sbi->s_mb_preallocated);
> + ext4_mb_use_inode_pa(ac, pa);
>
> ei = EXT4_I(ac->ac_inode);
> grp = ext4_get_group_info(sb, ac->ac_b_ex.fe_group);
> --
> 2.30.0
>
My understanding is that s_mb_preallocated is for all the blocks we ever preallocated and
so we need to add 'original' pa_len instead of (pa_len - best_len).

Hence, the patch looks correct to me. Feel free to add:

Reviewed-by: Ojaswin Mujoo <[email protected]>

2023-02-13 07:09:57

by Ojaswin Mujoo

[permalink] [raw]
Subject: Re: [PATCH 06/21] ext4: correct start of used group pa for debug in ext4_mb_use_group_pa

On Fri, Feb 10, 2023 at 03:48:10AM +0800, Kemeng Shi wrote:
> As we don't correct pa_lstart here, so there is no need to subtract
> pa_lstart with consumed len.
>
> Signed-off-by: Kemeng Shi <[email protected]>
> ---
> fs/ext4/mballoc.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 2bffc93778d5..433337ac8da2 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -4319,7 +4319,7 @@ static void ext4_mb_use_group_pa(struct ext4_allocation_context *ac,
> * Other CPUs are prevented from allocating from this pa by lg_mutex
> */
> mb_debug(ac->ac_sb, "use %u/%u from group pa %p\n",
> - pa->pa_lstart-len, len, pa);
> + pa->pa_lstart, len, pa);
> }
>
> /*
> --
> 2.30.0
>
Looks good, feel free to add:

Reviewed-by: Ojaswin Mujoo <[email protected]>


2023-02-13 08:38:49

by Ojaswin Mujoo

[permalink] [raw]
Subject: Re: [PATCH 03/21] ext4: avoid to use preallocated blocks if EXT4_MB_HINT_GOAL_ONLY is set

On Fri, Feb 10, 2023 at 03:48:07AM +0800, Kemeng Shi wrote:
> ext4_mb_use_preallocated will ignore the demand to alloc at goal block
> only. Return false if EXT4_MB_HINT_GOAL_ONLY is set before use
> preallocated blocks in ext4_mb_use_preallocated.
>
> Signed-off-by: Kemeng Shi <[email protected]>
> ---
> fs/ext4/mballoc.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 375d9655b525..352ac9139fee 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -4368,6 +4368,9 @@ ext4_mb_use_preallocated(struct ext4_allocation_context *ac)
> if (!(ac->ac_flags & EXT4_MB_HINT_DATA))
> return false;
>
> + if (unlikely(ac->ac_flags & EXT4_MB_HINT_GOAL_ONLY))
> + return false;
> +
> /* first, try per-file preallocation */
> rcu_read_lock();
> list_for_each_entry_rcu(pa, &ei->i_prealloc_list, pa_inode_list) {
> --
> 2.30.0
>
So with the flag, even when we request for a logical/goal block
combination that can be satisfied by one of the inode PAs in the list,
we would still exit early and the allocation would eventually fail since
the goal block would be marked "in-use" in the buddy. This doesn't seem
to be desirable.

Maybe instead of exiting early we should try to find a PA that satisfies
the logical block we are asking for and then incase of
EXT4_MB_HINT_GOAL_ONLY, we can add a check to see if the physical block
of the PA corresponds to the goal block. Would like to hear your
thoughts on this.

Regards,
ojaswin

2023-02-13 08:43:09

by Ojaswin Mujoo

[permalink] [raw]
Subject: Re: [PATCH 07/21] ext4: protect pa->pa_free in ext4_discard_allocated_blocks

On Fri, Feb 10, 2023 at 03:48:11AM +0800, Kemeng Shi wrote:
> If ext4_mb_mark_diskspace_used fails in ext4_mb_new_blocks, we may
> discard pa already in list. Protect pa with pa_lock to avoid race.
>
> Signed-off-by: Kemeng Shi <[email protected]>
> ---
> fs/ext4/mballoc.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 433337ac8da2..4e1caac74b44 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -4263,8 +4263,11 @@ static void ext4_discard_allocated_blocks(struct ext4_allocation_context *ac)
> ext4_mb_unload_buddy(&e4b);
> return;
> }
> - if (pa->pa_type == MB_INODE_PA)
> + if (pa->pa_type == MB_INODE_PA) {
> + spin_lock(&pa->pa_lock);
> pa->pa_free += ac->ac_b_ex.fe_len;
> + spin_unlock(&pa->pa_lock);
> + }
> }
>
> /*
> --
> 2.30.0
>
Looks correct. Feel free to add:

Reviewed-by: Ojaswin Mujoo <[email protected]>


2023-02-13 12:27:44

by Kemeng Shi

[permalink] [raw]
Subject: Re: [PATCH 04/21] ext4: get correct ext4_group_info in ext4_mb_prefetch_fini



on 2/13/2023 3:03 PM, Ojaswin Mujoo wrote:
> On Fri, Feb 10, 2023 at 03:48:08AM +0800, Kemeng Shi wrote:
>> We always get ext4_group_desc with group + 1 and ext4_group_info with
>> group to check if we need do initialize ext4_group_info for the group.
>> Just get ext4_group_desc with group for ext4_group_info initialization
>> check.
>>
>> Signed-off-by: Kemeng Shi <[email protected]>
>> ---
>> fs/ext4/mballoc.c | 6 +++---
>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
>> index 352ac9139fee..f24f80ecf318 100644
>> --- a/fs/ext4/mballoc.c
>> +++ b/fs/ext4/mballoc.c
>> @@ -2570,13 +2570,13 @@ void ext4_mb_prefetch_fini(struct super_block *sb, ext4_group_t group,
>> unsigned int nr)
>> {
>> while (nr-- > 0) {
>> - struct ext4_group_desc *gdp = ext4_get_group_desc(sb, group,
>> - NULL);
>> - struct ext4_group_info *grp = ext4_get_group_info(sb, group);
>> + struct ext4_group_desc *gdp;
>> + struct ext4_group_info *grp;
>>
>> if (!group)
>> group = ext4_get_groups_count(sb);
>> group--;
>> + gdp = ext4_get_group_desc(sb, group, NULL);
>> grp = ext4_get_group_info(sb, group);
>>
>> if (EXT4_MB_GRP_NEED_INIT(grp) &&
>> --
>> 2.30.0
>>
>
> This is a duplicate of [1] :)
>
> But I'm okay with this going in as that patchseries might take some time
> to get merged. Feel free to add:
>
> Reviewed-by: Ojaswin Mujoo <[email protected]>
>
> [1] https://lore.kernel.org/r/85bbcb3774e38de65b737ef0000241ddbdda73aa.1674822311.git.ojaswin@linux.ibm.com
>
Hi Ojaswin, Thank you so much to review my code. I 'm sorry that I didn't
notice this patch is duplicated and I really appreciate that you allow this
one to get merged first. I will add your sign to this patch in next version.
Thanks!

--
Best wishes
Kemeng Shi


2023-02-13 13:27:17

by Kemeng Shi

[permalink] [raw]
Subject: Re: [PATCH 03/21] ext4: avoid to use preallocated blocks if EXT4_MB_HINT_GOAL_ONLY is set



on 2/13/2023 4:38 PM, Ojaswin Mujoo wrote:
> On Fri, Feb 10, 2023 at 03:48:07AM +0800, Kemeng Shi wrote:
>> ext4_mb_use_preallocated will ignore the demand to alloc at goal block
>> only. Return false if EXT4_MB_HINT_GOAL_ONLY is set before use
>> preallocated blocks in ext4_mb_use_preallocated.
>>
>> Signed-off-by: Kemeng Shi <[email protected]>
>> ---
>> fs/ext4/mballoc.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
>> index 375d9655b525..352ac9139fee 100644
>> --- a/fs/ext4/mballoc.c
>> +++ b/fs/ext4/mballoc.c
>> @@ -4368,6 +4368,9 @@ ext4_mb_use_preallocated(struct ext4_allocation_context *ac)
>> if (!(ac->ac_flags & EXT4_MB_HINT_DATA))
>> return false;
>>
>> + if (unlikely(ac->ac_flags & EXT4_MB_HINT_GOAL_ONLY))
>> + return false;
>> +
>> /* first, try per-file preallocation */
>> rcu_read_lock();
>> list_for_each_entry_rcu(pa, &ei->i_prealloc_list, pa_inode_list) {
>> --
>> 2.30.0
>>
> So with the flag, even when we request for a logical/goal block
> combination that can be satisfied by one of the inode PAs in the list,
> we would still exit early and the allocation would eventually fail since
> the goal block would be marked "in-use" in the buddy. This doesn't seem
> to be desirable.
>
> Maybe instead of exiting early we should try to find a PA that satisfies
> the logical block we are asking for and then incase of
> EXT4_MB_HINT_GOAL_ONLY, we can add a check to see if the physical block
> of the PA corresponds to the goal block. Would like to hear your
> thoughts on this.
Yes, this is a more thoughtful, I will improve this in next version.
Besides, maybe we should also test length if EXT4_MB_HINT_MERGE is set
as ext4_mb_find_by_goal do.

> Regards,
> ojaswin
>

--
Best wishes
Kemeng Shi


2023-02-13 19:46:49

by Ojaswin Mujoo

[permalink] [raw]
Subject: Re: [PATCH 08/21] ext4: add missed brelse in ext4_free_blocks_simple

On Fri, Feb 10, 2023 at 03:48:12AM +0800, Kemeng Shi wrote:
> Release bitmap buffer_head we got if error occurs.
> Besides, this patch remove unused assignment to err.
>
> Signed-off-by: Kemeng Shi <[email protected]>
> ---
> fs/ext4/mballoc.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 4e1caac74b44..17ac98c501ed 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -5848,13 +5848,12 @@ static void ext4_free_blocks_simple(struct inode *inode, ext4_fsblk_t block,
> ext4_get_group_no_and_offset(sb, block, &group, &blkoff);
> bitmap_bh = ext4_read_block_bitmap(sb, group);
> if (IS_ERR(bitmap_bh)) {
> - err = PTR_ERR(bitmap_bh);
> pr_warn("Failed to read block bitmap\n");
> return;
> }
> gdp = ext4_get_group_desc(sb, group, &gdp_bh);
> if (!gdp)
> - return;
> + goto err_out;
>
> for (i = 0; i < count; i++) {
> if (!mb_test_bit(blkoff + i, bitmap_bh->b_data))
> @@ -5863,7 +5862,7 @@ static void ext4_free_blocks_simple(struct inode *inode, ext4_fsblk_t block,
> mb_clear_bits(bitmap_bh->b_data, blkoff, count);
> err = ext4_handle_dirty_metadata(NULL, NULL, bitmap_bh);
> if (err)
> - return;
> + goto err_out;
> ext4_free_group_clusters_set(
> sb, gdp, ext4_free_group_clusters(sb, gdp) +
> count - already_freed);
> @@ -5872,6 +5871,8 @@ static void ext4_free_blocks_simple(struct inode *inode, ext4_fsblk_t block,
> ext4_handle_dirty_metadata(NULL, NULL, gdp_bh);
> sync_dirty_buffer(bitmap_bh);
> sync_dirty_buffer(gdp_bh);
> +
> +err_out:
> brelse(bitmap_bh);
> }
>
> --
> 2.30.0
>
Looks good, I'm also okay with removing the err variable out completely.
Either ways, feel free to add:

Reviewed-by: Ojaswin Mujoo <[email protected]>



2023-02-13 19:47:39

by Ojaswin Mujoo

[permalink] [raw]
Subject: Re: [PATCH 09/21] ext4: remove unused return value of ext4_mb_try_best_found and ext4_mb_free_metadata

On Fri, Feb 10, 2023 at 03:48:13AM +0800, Kemeng Shi wrote:
> Return value static function ext4_mb_try_best_found and
> ext4_mb_free_metadata is not used. Just remove unused return value.
>
> Signed-off-by: Kemeng Shi <[email protected]>
> ---
> fs/ext4/mballoc.c | 11 ++++-------
> 1 file changed, 4 insertions(+), 7 deletions(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 17ac98c501ed..ad9e3b7d3198 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -2124,7 +2124,7 @@ static void ext4_mb_measure_extent(struct ext4_allocation_context *ac,
> }
>
> static noinline_for_stack
> -int ext4_mb_try_best_found(struct ext4_allocation_context *ac,
> +void ext4_mb_try_best_found(struct ext4_allocation_context *ac,
> struct ext4_buddy *e4b)
> {
> struct ext4_free_extent ex = ac->ac_b_ex;
> @@ -2135,7 +2135,7 @@ int ext4_mb_try_best_found(struct ext4_allocation_context *ac,
> BUG_ON(ex.fe_len <= 0);
> err = ext4_mb_load_buddy(ac->ac_sb, group, e4b);
> if (err)
> - return err;
> + return;
>
> ext4_lock_group(ac->ac_sb, group);
> max = mb_find_extent(e4b, ex.fe_start, ex.fe_len, &ex);
> @@ -2147,8 +2147,6 @@ int ext4_mb_try_best_found(struct ext4_allocation_context *ac,
>
> ext4_unlock_group(ac->ac_sb, group);
> ext4_mb_unload_buddy(e4b);
> -
> - return 0;
> }
>
> static noinline_for_stack
> @@ -5699,7 +5697,7 @@ static void ext4_try_merge_freed_extent(struct ext4_sb_info *sbi,
> kmem_cache_free(ext4_free_data_cachep, entry);
> }
>
> -static noinline_for_stack int
> +static noinline_for_stack void
> ext4_mb_free_metadata(handle_t *handle, struct ext4_buddy *e4b,
> struct ext4_free_data *new_entry)
> {
> @@ -5742,7 +5740,7 @@ ext4_mb_free_metadata(handle_t *handle, struct ext4_buddy *e4b,
> EXT4_C2B(sbi, cluster),
> "Block already on to-be-freed list");
> kmem_cache_free(ext4_free_data_cachep, new_entry);
> - return 0;
> + return;
> }
> }
>
> @@ -5768,7 +5766,6 @@ ext4_mb_free_metadata(handle_t *handle, struct ext4_buddy *e4b,
> list_add_tail(&new_entry->efd_list, &sbi->s_freed_data_list);
> sbi->s_mb_free_pending += clusters;
> spin_unlock(&sbi->s_md_lock);
> - return 0;
> }
>
> /*
> --
> 2.30.0
>
Feel free to add:

Reviewed-by: Ojaswin Mujoo <[email protected]>

2023-02-13 19:48:28

by Ojaswin Mujoo

[permalink] [raw]
Subject: Re: [PATCH 10/21] ext4: Remove unnecessary release when memory allocation failed in ext4_mb_init_cache

On Fri, Feb 10, 2023 at 03:48:14AM +0800, Kemeng Shi wrote:
> If we alloc array of buffer_head failed, there is no resource need to be
> freed and we can simpily return error.
>
> Signed-off-by: Kemeng Shi <[email protected]>
> ---
> fs/ext4/mballoc.c | 6 ++----
> 1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index ad9e3b7d3198..15fc7105becc 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -1168,10 +1168,8 @@ static int ext4_mb_init_cache(struct page *page, char *incore, gfp_t gfp)
> if (groups_per_page > 1) {
> i = sizeof(struct buffer_head *) * groups_per_page;
> bh = kzalloc(i, gfp);
> - if (bh == NULL) {
> - err = -ENOMEM;
> - goto out;
> - }
> + if (bh == NULL)
> + return -ENOMEM;
> } else
> bh = &bhs;
>
> --
> 2.30.0
>
Feel free to add:

Reviewed-by: Ojaswin Mujoo <[email protected]>

2023-02-13 19:48:59

by Ojaswin Mujoo

[permalink] [raw]
Subject: Re: [PATCH 11/21] ext4: remove unnecessary e4b->bd_buddy_page check in ext4_mb_load_buddy_gfp

On Fri, Feb 10, 2023 at 03:48:15AM +0800, Kemeng Shi wrote:
> e4b->bd_buddy_page is only set if we initialize ext4_buddy successfully. So
> e4b->bd_buddy_page is always NULL in error handle branch. Just remove the
> dead check.
>
> Signed-off-by: Kemeng Shi <[email protected]>
> ---
> fs/ext4/mballoc.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 15fc7105becc..74da24c9bf14 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -1555,8 +1555,7 @@ ext4_mb_load_buddy_gfp(struct super_block *sb, ext4_group_t group,
> put_page(page);
> if (e4b->bd_bitmap_page)
> put_page(e4b->bd_bitmap_page);
> - if (e4b->bd_buddy_page)
> - put_page(e4b->bd_buddy_page);
> +
> e4b->bd_buddy = NULL;
> e4b->bd_bitmap = NULL;
> return ret;
> --
> 2.30.0
>
Feel free to add:

Reviewed-by: Ojaswin Mujoo <[email protected]>

2023-02-13 19:49:27

by Ojaswin Mujoo

[permalink] [raw]
Subject: Re: [PATCH 12/21] ext4: remove unnecessary check in ext4_mb_new_blocks

On Fri, Feb 10, 2023 at 03:48:16AM +0800, Kemeng Shi wrote:
> 1. remove unnecessary ac check:
> We always go to out tag before ac is successfully allocated, then we can
> move out tag after free of ac and remove NULL check of ac.
>
> 2. remove unnecessary *errp check:
> We always go to errout tag if *errp is non-zero, then we can move errout
> tag into error handle if *errp is non-zero.
>
> Signed-off-by: Kemeng Shi <[email protected]>
> ---
> fs/ext4/mballoc.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 74da24c9bf14..bdabe0d81d4a 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -5641,16 +5641,15 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle,
> *errp = -ENOSPC;
> }
>
> -errout:
> if (*errp) {
> +errout:
> ac->ac_b_ex.fe_len = 0;
> ar->len = 0;
> ext4_mb_show_ac(ac);
> }
> ext4_mb_release_context(ac);
> + kmem_cache_free(ext4_ac_cachep, ac);
> out:
> - if (ac)
> - kmem_cache_free(ext4_ac_cachep, ac);
> if (inquota && ar->len < inquota)
> dquot_free_block(ar->inode, EXT4_C2B(sbi, inquota - ar->len));
> if (!ar->len) {
> --
> 2.30.0
>
Feel free to add:

Reviewed-by: Ojaswin Mujoo <[email protected]>

2023-02-13 19:50:54

by Ojaswin Mujoo

[permalink] [raw]
Subject: Re: [PATCH 13/21] ext4: remove dead check in mb_buddy_mark_free

On Fri, Feb 10, 2023 at 03:48:17AM +0800, Kemeng Shi wrote:
> We always adjust first to even number and adjust last to odd number, so
> first == last will never happen. Remove this dead check.
>
> Signed-off-by: Kemeng Shi <[email protected]>
> ---
> fs/ext4/mballoc.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index bdabe0d81d4a..0fdbf16ac180 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -1718,7 +1718,8 @@ static void mb_buddy_mark_free(struct ext4_buddy *e4b, int first, int last)
> break;
> order++;
>
> - if (first == last || !(buddy2 = mb_find_buddy(e4b, order, &max))) {
> + buddy2 = mb_find_buddy(e4b, order, &max);
> + if (!buddy2) {
> mb_clear_bits(buddy, first, last - first + 1);
> e4b->bd_info->bb_counters[order - 1] += last - first + 1;
> break;
> --
> 2.30.0
>
Okay, so I checked the code and seems like you are right. There is can't be any
scenario where (first == last) after the calls to mb_buddy_adjust_border().

However, I'm a bit hesitant to give my Reviewed by since buddy algo is still a
bit confusing to me and I might be missing some weird edge case. Maybe someone
can help double check this.

2023-02-13 19:51:39

by Ojaswin Mujoo

[permalink] [raw]
Subject: Re: [PATCH 14/21] ext4: remove ac->ac_found > sbi->s_mb_min_to_scan dead check in ext4_mb_check_limits

On Fri, Feb 10, 2023 at 03:48:18AM +0800, Kemeng Shi wrote:
> Only call trace of ext4_mb_check_limits is as following:
> ext4_mb_complex_scan_group
> ext4_mb_measure_extent
> ext4_mb_check_limits(ac, e4b, 0);
> ext4_mb_check_limits(ac, e4b, 1);
>
> If the first ac->ac_found > sbi->s_mb_max_to_scan check in
> ext4_mb_check_limits is met, we will set ac_status to
> AC_STATUS_BREAK and call ext4_mb_try_best_found to try to use
> ac->ac_b_ex.
> If ext4_mb_try_best_found successes, then block allocation finishs,
> the removed ac->ac_found > sbi->s_mb_min_to_scan check is not reachable.
> If ext4_mb_try_best_found fails, then we set EXT4_MB_HINT_FIRST and
> reset ac->ac_b_ex to retry block allocation. We will use any found
> free extent in ext4_mb_measure_extent before reach the removed
> ac->ac_found > sbi->s_mb_min_to_scan check.
> In summary, the removed ac->ac_found > sbi->s_mb_min_to_scan check is
> not reachable and we can remove that dead check.
>
> Signed-off-by: Kemeng Shi <[email protected]>
> ---
> fs/ext4/mballoc.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 0fdbf16ac180..e53f84de5018 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -2039,8 +2039,7 @@ static void ext4_mb_check_limits(struct ext4_allocation_context *ac,
> if (bex->fe_len < gex->fe_len)
> return;
>
> - if ((finish_group || ac->ac_found > sbi->s_mb_min_to_scan)
> - && bex->fe_group == e4b->bd_group) {
> + if (finish_group && bex->fe_group == e4b->bd_group) {
> /* recheck chunk's availability - we don't know
> * when it was found (within this lock-unlock
> * period or not) */
> --
> 2.30.0
>
Feel free to add:

Reviewed-by: Ojaswin Mujoo <[email protected]>

2023-02-13 19:54:06

by Ojaswin Mujoo

[permalink] [raw]
Subject: Re: [PATCH 15/21] ext4: use best found when complex scan of group finishs

On Fri, Feb 10, 2023 at 03:48:19AM +0800, Kemeng Shi wrote:
> If any bex which meets bex->fe_len >= gex->fe_len is found, then it will
> always be used when complex scan of group that bex belongs to finishs.
> So there will not be any lock-unlock period.
>
> Signed-off-by: Kemeng Shi <[email protected]>
> ---
> fs/ext4/mballoc.c | 14 ++------------
> 1 file changed, 2 insertions(+), 12 deletions(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index e53f84de5018..c684758d6dbb 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -2019,8 +2019,6 @@ static void ext4_mb_check_limits(struct ext4_allocation_context *ac,
> struct ext4_sb_info *sbi = EXT4_SB(ac->ac_sb);
> struct ext4_free_extent *bex = &ac->ac_b_ex;
> struct ext4_free_extent *gex = &ac->ac_g_ex;
> - struct ext4_free_extent ex;
> - int max;
>
> if (ac->ac_status == AC_STATUS_FOUND)
> return;
> @@ -2039,16 +2037,8 @@ static void ext4_mb_check_limits(struct ext4_allocation_context *ac,
> if (bex->fe_len < gex->fe_len)
> return;
>
> - if (finish_group && bex->fe_group == e4b->bd_group) {
> - /* recheck chunk's availability - we don't know
> - * when it was found (within this lock-unlock
> - * period or not) */
> - max = mb_find_extent(e4b, bex->fe_start, gex->fe_len, &ex);
> - if (max >= gex->fe_len) {
> - ext4_mb_use_best_found(ac, e4b);
> - return;
> - }
> - }
> + if (finish_group)
> + ext4_mb_use_best_found(ac, e4b);
> }
>
> /*
> --
> 2.30.0
>
Looks good. So when we have found bex > gex, then we wont have a lock
unlock period since we always allocate the bex when we reach the end of
group.

Just a small typo in the commit message (finshs -> finishes), but other
than that feel free to add:

Reviewed-by: Ojaswin Mujoo <[email protected]>

2023-02-13 19:55:11

by Ojaswin Mujoo

[permalink] [raw]
Subject: Re: [PATCH 16/21] ext4: remove unnecessary exit_meta_group_info tag

On Fri, Feb 10, 2023 at 03:48:20AM +0800, Kemeng Shi wrote:
> We goto exit_meta_group_info only to return -ENOMEM. Return -ENOMEM
> directly instead of goto to remove this unnecessary tag.
>
> Signed-off-by: Kemeng Shi <[email protected]>
> ---
> fs/ext4/mballoc.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index c684758d6dbb..289dcd81dd5a 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -3069,7 +3069,7 @@ int ext4_mb_add_groupinfo(struct super_block *sb, ext4_group_t group,
> if (meta_group_info == NULL) {
> ext4_msg(sb, KERN_ERR, "can't allocate mem "
> "for a buddy group");
> - goto exit_meta_group_info;
> + return -ENOMEM;
> }
> rcu_read_lock();
> rcu_dereference(sbi->s_group_info)[idx] = meta_group_info;
> @@ -3123,7 +3123,6 @@ int ext4_mb_add_groupinfo(struct super_block *sb, ext4_group_t group,
> group_info[idx] = NULL;
> rcu_read_unlock();
> }
> -exit_meta_group_info:
> return -ENOMEM;
> } /* ext4_mb_add_groupinfo */
>
> --
> 2.30.0
>
Looks good. Feel free to add:

Reviewed-by: Ojaswin Mujoo <[email protected]>

2023-02-13 19:57:08

by Ojaswin Mujoo

[permalink] [raw]
Subject: Re: [PATCH 17/21] ext4: remove unnecessary count2 in ext4_free_data_in_buddy

On Fri, Feb 10, 2023 at 03:48:21AM +0800, Kemeng Shi wrote:
> count2 is always 1 in mb_debug, just remove unnecessary count2.
>
> Signed-off-by: Kemeng Shi <[email protected]>
> ---
> fs/ext4/mballoc.c | 6 ++----
> 1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 289dcd81dd5a..f9fc461b633f 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -3590,7 +3590,7 @@ static void ext4_free_data_in_buddy(struct super_block *sb,
> {
> struct ext4_buddy e4b;
> struct ext4_group_info *db;
> - int err, count = 0, count2 = 0;
> + int err, count = 0;
>
> mb_debug(sb, "gonna free %u blocks in group %u (0x%p):",
> entry->efd_count, entry->efd_group, entry);
> @@ -3606,7 +3606,6 @@ static void ext4_free_data_in_buddy(struct super_block *sb,
> db = e4b.bd_info;
> /* there are blocks to put in buddy to make them really free */
> count += entry->efd_count;
> - count2++;
> ext4_lock_group(sb, entry->efd_group);
> /* Take it out of per group rb tree */
> rb_erase(&entry->efd_node, &(db->bb_free_root));
> @@ -3631,8 +3630,7 @@ static void ext4_free_data_in_buddy(struct super_block *sb,
> ext4_unlock_group(sb, entry->efd_group);
> ext4_mb_unload_buddy(&e4b);
>
> - mb_debug(sb, "freed %d blocks in %d structures\n", count,
> - count2);
> + mb_debug(sb, "freed %d blocks in 1 structures\n", count);
> }
>
> /*
> --
> 2.30.0
>
If we always have 1 in the debug message, maybe we can change "structures"
to "structure". Other than that, feel free to add:

Reviewed-by: Ojaswin Mujoo <[email protected]>

2023-02-13 19:59:20

by Ojaswin Mujoo

[permalink] [raw]
Subject: Re: [PATCH 18/21] ext4: remove unnecessary goto in ext4_mb_mark_diskspace_used

On Fri, Feb 10, 2023 at 03:48:22AM +0800, Kemeng Shi wrote:
> When ext4_read_block_bitmap fails, we can return PTR_ERR(bitmap_bh) to
> remove unnecessary NULL check of bitmap_bh.
>
> Signed-off-by: Kemeng Shi <[email protected]>
> ---
> fs/ext4/mballoc.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index f9fc461b633f..7d6991af50d8 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -3739,9 +3739,7 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,
>
> bitmap_bh = ext4_read_block_bitmap(sb, ac->ac_b_ex.fe_group);
> if (IS_ERR(bitmap_bh)) {
> - err = PTR_ERR(bitmap_bh);
> - bitmap_bh = NULL;

It's probably trivial but the fact that we no longer have `bitmap_bh =
NULL` is making me a bit paranoid. Although I think it should be
okay but maybe someone else can help double check this :)

> - goto out_err;
> + return PTR_ERR(bitmap_bh);
> }
>
> BUFFER_TRACE(bitmap_bh, "getting write access");
> --
> 2.30.0
>


2023-02-13 20:11:15

by Ojaswin Mujoo

[permalink] [raw]
Subject: Re: [PATCH 19/21] ext4: remove repeat assignment to ac_f_ex

On Fri, Feb 10, 2023 at 03:48:23AM +0800, Kemeng Shi wrote:
> Call trace to assign ac_f_ex:
> ext4_mb_use_best_found
> ac->ac_f_ex = ac->ac_b_ex;
> ext4_mb_new_preallocation
> ext4_mb_new_group_pa
> ac->ac_f_ex = ac->ac_b_ex;
> ext4_mb_new_inode_pa
> ac->ac_f_ex = ac->ac_b_ex;
>
> Actually allocated blocks is already stored in ac_f_ex in
> ext4_mb_use_best_found, so there is no need to assign ac_f_ex
> in ext4_mb_new_group_pa and ext4_mb_new_inode_pa.
> Just remove repeat assignment to ac_f_ex in ext4_mb_new_group_pa
> and ext4_mb_new_inode_pa.
>
> Signed-off-by: Kemeng Shi <[email protected]>
> ---
> fs/ext4/mballoc.c | 8 --------
> 1 file changed, 8 deletions(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 7d6991af50d8..dec716f331ac 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -4635,10 +4635,6 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac)
> BUG_ON(ac->ac_o_ex.fe_len > ac->ac_b_ex.fe_len);
> }
>
> - /* preallocation can change ac_b_ex, thus we store actually
> - * allocated blocks for history */
> - ac->ac_f_ex = ac->ac_b_ex;
> -
Alright, so we should note that the ac_b_ex being assigned to ac_f_ex
here can differ from the last allocation of ac_f_ex in
ext4_mb_use_best_found() in the case that bex_len < gex_len and we enter
the PA window adjustment if condition that is just above this line.

In such a case, we should maybe see which of the bex do we actually want
to store in fex. IMO, this patch does the right thing as the allocation
in ext4_mb_use_best_found() gives us an idea of how much the allocator
was exactly able to allocate before other PA related logics start making
changes to the best ex. Further, following the discussion here [1], we
might end up rewriting the PA window adjustment logic in future, and
that could also change the length of ac_b_ex, which we would not want to
reflect in ac_f_ex (again since i see ac_f_ex as the record of actual ex
that allocator was able to find).

So this patch looks like the right thing to do for me. Feel free to add:

Reviewed-by: Ojaswin Mujoo <[email protected]>

[1]
https://lore.kernel.org/linux-ext4/[email protected]/
> pa->pa_lstart = ac->ac_b_ex.fe_logical;
> pa->pa_pstart = ext4_grp_offs_to_block(sb, &ac->ac_b_ex);
> pa->pa_len = ac->ac_b_ex.fe_len;
> @@ -4689,10 +4685,6 @@ ext4_mb_new_group_pa(struct ext4_allocation_context *ac)
>
> pa = ac->ac_pa;
>
> - /* preallocation can change ac_b_ex, thus we store actually
> - * allocated blocks for history */
> - ac->ac_f_ex = ac->ac_b_ex;
> -
> pa->pa_pstart = ext4_grp_offs_to_block(sb, &ac->ac_b_ex);
> pa->pa_lstart = pa->pa_pstart;
> pa->pa_len = ac->ac_b_ex.fe_len;
> --
> 2.30.0
>

2023-02-13 20:12:21

by Ojaswin Mujoo

[permalink] [raw]
Subject: Re: [PATCH 20/21] ext4: remove comment code ext4_discard_preallocations

On Fri, Feb 10, 2023 at 03:48:24AM +0800, Kemeng Shi wrote:
> Just remove comment code in ext4_discard_preallocations.
>
> Signed-off-by: Kemeng Shi <[email protected]>
> ---
> fs/ext4/mballoc.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index dec716f331ac..5bc30cd5debf 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -4925,7 +4925,6 @@ void ext4_discard_preallocations(struct inode *inode, unsigned int needed)
> int err;
>
> if (!S_ISREG(inode->i_mode)) {
> - /*BUG_ON(!list_empty(&ei->i_prealloc_list));*/
> return;
> }
>
> --
> 2.30.0
>
Feel free to add:

Reviewed-by: Ojaswin Mujoo <[email protected]>


2023-02-13 20:15:09

by Ojaswin Mujoo

[permalink] [raw]
Subject: Re: [PATCH 04/21] ext4: get correct ext4_group_info in ext4_mb_prefetch_fini

On Mon, Feb 13, 2023 at 08:27:32PM +0800, Kemeng Shi wrote:
>
>
> on 2/13/2023 3:03 PM, Ojaswin Mujoo wrote:
> > On Fri, Feb 10, 2023 at 03:48:08AM +0800, Kemeng Shi wrote:
> >> We always get ext4_group_desc with group + 1 and ext4_group_info with
> >> group to check if we need do initialize ext4_group_info for the group.
> >> Just get ext4_group_desc with group for ext4_group_info initialization
> >> check.
> >>
> >> Signed-off-by: Kemeng Shi <[email protected]>
> >> ---
> >> fs/ext4/mballoc.c | 6 +++---
> >> 1 file changed, 3 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> >> index 352ac9139fee..f24f80ecf318 100644
> >> --- a/fs/ext4/mballoc.c
> >> +++ b/fs/ext4/mballoc.c
> >> @@ -2570,13 +2570,13 @@ void ext4_mb_prefetch_fini(struct super_block *sb, ext4_group_t group,
> >> unsigned int nr)
> >> {
> >> while (nr-- > 0) {
> >> - struct ext4_group_desc *gdp = ext4_get_group_desc(sb, group,
> >> - NULL);
> >> - struct ext4_group_info *grp = ext4_get_group_info(sb, group);
> >> + struct ext4_group_desc *gdp;
> >> + struct ext4_group_info *grp;
> >>
> >> if (!group)
> >> group = ext4_get_groups_count(sb);
> >> group--;
> >> + gdp = ext4_get_group_desc(sb, group, NULL);
> >> grp = ext4_get_group_info(sb, group);
> >>
> >> if (EXT4_MB_GRP_NEED_INIT(grp) &&
> >> --
> >> 2.30.0
> >>
> >
> > This is a duplicate of [1] :)
> >
> > But I'm okay with this going in as that patchseries might take some time
> > to get merged. Feel free to add:
> >
> > Reviewed-by: Ojaswin Mujoo <[email protected]>
> >
> > [1] https://lore.kernel.org/r/85bbcb3774e38de65b737ef0000241ddbdda73aa.1674822311.git.ojaswin@linux.ibm.com
> >
> Hi Ojaswin, Thank you so much to review my code. I 'm sorry that I didn't
> notice this patch is duplicated and I really appreciate that you allow this
> one to get merged first. I will add your sign to this patch in next version.
> Thanks!
Hi Kemeng,

So I'm not sure what the norm is when dealing with such duplicate
patches, but if you do plan to add the Signed-off-by then I'd just like
to point out that the patch I linked is mainly from Ritesh Harjani, so
you can pick his Signed-off-by rather than mine.

Thanks,
ojaswin
>
> --
> Best wishes
> Kemeng Shi
>

2023-02-14 01:12:32

by Kemeng Shi

[permalink] [raw]
Subject: Re: [PATCH 04/21] ext4: get correct ext4_group_info in ext4_mb_prefetch_fini



on 2/14/2023 4:14 AM, Ojaswin Mujoo wrote:
> On Mon, Feb 13, 2023 at 08:27:32PM +0800, Kemeng Shi wrote:
>>
>>
>> on 2/13/2023 3:03 PM, Ojaswin Mujoo wrote:
>>> On Fri, Feb 10, 2023 at 03:48:08AM +0800, Kemeng Shi wrote:
>>>> We always get ext4_group_desc with group + 1 and ext4_group_info with
>>>> group to check if we need do initialize ext4_group_info for the group.
>>>> Just get ext4_group_desc with group for ext4_group_info initialization
>>>> check.
>>>>
>>>> Signed-off-by: Kemeng Shi <[email protected]>
>>>> ---
>>>> fs/ext4/mballoc.c | 6 +++---
>>>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
>>>> index 352ac9139fee..f24f80ecf318 100644
>>>> --- a/fs/ext4/mballoc.c
>>>> +++ b/fs/ext4/mballoc.c
>>>> @@ -2570,13 +2570,13 @@ void ext4_mb_prefetch_fini(struct super_block *sb, ext4_group_t group,
>>>> unsigned int nr)
>>>> {
>>>> while (nr-- > 0) {
>>>> - struct ext4_group_desc *gdp = ext4_get_group_desc(sb, group,
>>>> - NULL);
>>>> - struct ext4_group_info *grp = ext4_get_group_info(sb, group);
>>>> + struct ext4_group_desc *gdp;
>>>> + struct ext4_group_info *grp;
>>>>
>>>> if (!group)
>>>> group = ext4_get_groups_count(sb);
>>>> group--;
>>>> + gdp = ext4_get_group_desc(sb, group, NULL);
>>>> grp = ext4_get_group_info(sb, group);
>>>>
>>>> if (EXT4_MB_GRP_NEED_INIT(grp) &&
>>>> --
>>>> 2.30.0
>>>>
>>>
>>> This is a duplicate of [1] :)
>>>
>>> But I'm okay with this going in as that patchseries might take some time
>>> to get merged. Feel free to add:
>>>
>>> Reviewed-by: Ojaswin Mujoo <[email protected]>
>>>
>>> [1] https://lore.kernel.org/r/85bbcb3774e38de65b737ef0000241ddbdda73aa.1674822311.git.ojaswin@linux.ibm.com
>>>
>> Hi Ojaswin, Thank you so much to review my code. I 'm sorry that I didn't
>> notice this patch is duplicated and I really appreciate that you allow this
>> one to get merged first. I will add your sign to this patch in next version.
>> Thanks!
> Hi Kemeng,
>
> So I'm not sure what the norm is when dealing with such duplicate
> patches, but if you do plan to add the Signed-off-by then I'd just like
> to point out that the patch I linked is mainly from Ritesh Harjani, so
> you can pick his Signed-off-by rather than mine.
>
Sorry that I miss that there are two Signed-off-bys in patch you have already
sent. I will collect both signs to this patch. Thanks!

--
Best wishes
Kemeng Shi


2023-02-17 01:24:52

by Kemeng Shi

[permalink] [raw]
Subject: Re: [PATCH 13/21] ext4: remove dead check in mb_buddy_mark_free



on 2/14/2023 3:50 AM, Ojaswin Mujoo wrote:
> On Fri, Feb 10, 2023 at 03:48:17AM +0800, Kemeng Shi wrote:
>> We always adjust first to even number and adjust last to odd number, so
>> first == last will never happen. Remove this dead check.
>>
>> Signed-off-by: Kemeng Shi <[email protected]>
>> ---
>> fs/ext4/mballoc.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
>> index bdabe0d81d4a..0fdbf16ac180 100644
>> --- a/fs/ext4/mballoc.c
>> +++ b/fs/ext4/mballoc.c
>> @@ -1718,7 +1718,8 @@ static void mb_buddy_mark_free(struct ext4_buddy *e4b, int first, int last)
>> break;
>> order++;
>>
>> - if (first == last || !(buddy2 = mb_find_buddy(e4b, order, &max))) {
>> + buddy2 = mb_find_buddy(e4b, order, &max);
>> + if (!buddy2) {
>> mb_clear_bits(buddy, first, last - first + 1);
>> e4b->bd_info->bb_counters[order - 1] += last - first + 1;
>> break;
>> --
>> 2.30.0
>>
> Okay, so I checked the code and seems like you are right. There is can't be any
> scenario where (first == last) after the calls to mb_buddy_adjust_border().
>
> However, I'm a bit hesitant to give my Reviewed by since buddy algo is still a
> bit confusing to me and I might be missing some weird edge case. Maybe someone
> can help double check this.
Hi, could anyone help double check this patch and patch 18/21 "ext4: remove
unnecessary goto in ext4_mb_mark_diskspace_used" in the same patchset. Thanks.

--
Best wishes
Kemeng Shi


2023-02-17 06:37:25

by Ritesh Harjani

[permalink] [raw]
Subject: Re: [PATCH 18/21] ext4: remove unnecessary goto in ext4_mb_mark_diskspace_used

Kemeng Shi <[email protected]> writes:

> When ext4_read_block_bitmap fails, we can return PTR_ERR(bitmap_bh) to
> remove unnecessary NULL check of bitmap_bh.

bitmap_bh is a local pointer variable. So not setting it to NULL is not
a problem. I guess for consistency in return error code paths the author
would have kept it this way, but since this is the first return from the
function in case of an error, hence it looks ok if we simply call
return PTR_ERR(bitmap_bh), rather than a goto out_err.

Hence this looks good to me. Feel free to add -

Reviewed-by: Ritesh Harjani (IBM) <[email protected]>


>
> Signed-off-by: Kemeng Shi <[email protected]>
> ---
> fs/ext4/mballoc.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index f9fc461b633f..7d6991af50d8 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -3739,9 +3739,7 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,
>
> bitmap_bh = ext4_read_block_bitmap(sb, ac->ac_b_ex.fe_group);
> if (IS_ERR(bitmap_bh)) {
> - err = PTR_ERR(bitmap_bh);
> - bitmap_bh = NULL;
> - goto out_err;
> + return PTR_ERR(bitmap_bh);
> }
>
> BUFFER_TRACE(bitmap_bh, "getting write access");
> --
> 2.30.0

2023-02-17 06:46:38

by Ritesh Harjani

[permalink] [raw]
Subject: Re: [PATCH 04/21] ext4: get correct ext4_group_info in ext4_mb_prefetch_fini

Kemeng Shi <[email protected]> writes:

> We always get ext4_group_desc with group + 1 and ext4_group_info with
> group to check if we need do initialize ext4_group_info for the group.
> Just get ext4_group_desc with group for ext4_group_info initialization
> check.
>
> Signed-off-by: Kemeng Shi <[email protected]>
> ---
> fs/ext4/mballoc.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 352ac9139fee..f24f80ecf318 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -2570,13 +2570,13 @@ void ext4_mb_prefetch_fini(struct super_block *sb, ext4_group_t group,
> unsigned int nr)
> {
> while (nr-- > 0) {
> - struct ext4_group_desc *gdp = ext4_get_group_desc(sb, group,
> - NULL);
> - struct ext4_group_info *grp = ext4_get_group_info(sb, group);
> + struct ext4_group_desc *gdp;
> + struct ext4_group_info *grp;

We can even declare these variables at the begining of the function like
in [1]. Also I would advise to rearrange any "fixes" patches at the
begining of the patch series and "cleanup" patches at the end.
e.g. this looks like a fix to me.

That way it is sometimes easier for people to cherry-pick any fixes if
required in their older kernel trees. ;)

[1]: https://lore.kernel.org/all/85bbcb3774e38de65b737ef0000241ddbdda73aa.1674822311.git.ojaswin@linux.ibm.com/

-ritesh

>
> if (!group)
> group = ext4_get_groups_count(sb);
> group--;
> + gdp = ext4_get_group_desc(sb, group, NULL);
> grp = ext4_get_group_info(sb, group);
>
> if (EXT4_MB_GRP_NEED_INIT(grp) &&
> --
> 2.30.0

2023-02-17 07:24:22

by Kemeng Shi

[permalink] [raw]
Subject: Re: [PATCH 04/21] ext4: get correct ext4_group_info in ext4_mb_prefetch_fini



on 2/17/2023 2:46 PM, Ritesh Harjani (IBM) wrote:
> Kemeng Shi <[email protected]> writes:
>
>> We always get ext4_group_desc with group + 1 and ext4_group_info with
>> group to check if we need do initialize ext4_group_info for the group.
>> Just get ext4_group_desc with group for ext4_group_info initialization
>> check.
>>
>> Signed-off-by: Kemeng Shi <[email protected]>
>> ---
>> fs/ext4/mballoc.c | 6 +++---
>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
>> index 352ac9139fee..f24f80ecf318 100644
>> --- a/fs/ext4/mballoc.c
>> +++ b/fs/ext4/mballoc.c
>> @@ -2570,13 +2570,13 @@ void ext4_mb_prefetch_fini(struct super_block *sb, ext4_group_t group,
>> unsigned int nr)
>> {
>> while (nr-- > 0) {
>> - struct ext4_group_desc *gdp = ext4_get_group_desc(sb, group,
>> - NULL);
>> - struct ext4_group_info *grp = ext4_get_group_info(sb, group);
>> + struct ext4_group_desc *gdp;
>> + struct ext4_group_info *grp;
>
> We can even declare these variables at the begining of the function like
> in [1]. Also I would advise to rearrange any "fixes" patches at the
> begining of the patch series and "cleanup" patches at the end.
> e.g. this looks like a fix to me.
>
> That way it is sometimes easier for people to cherry-pick any fixes if
> required in their older kernel trees. ;)
>
Hi Ritesh, Thanks for feedback. I declare these variables at the begining
of the function in next version.
I agree that we should keep bugfix patches at the beginning. Actually,
patch 1-8 are all fix patches from my view. So I think current patch sort
is fine.
Thanks.

--
Best wishes
Kemeng Shi