2023-07-25 10:56:44

by Kemeng Shi

Subject: [PATCH v2 00/10] A few fixes and cleanups to mballoc

v1->v2:
Collect Reviewed-by tags from Ritesh and make the improvements Ritesh suggested:
-Keep checks inside unlikely() in patch 1
-Add missing Fixes tags in patches 1, 2 and 10
-Fix a typo, fix a conflict and remove one more return in patch 5

Hi all, this series contains assorted fixes and cleanups to mballoc,
including correcting grp validation and avoiding a potential data overflow.
More details can be found in the respective patches.
In addition, 'kvm-xfstest smoke' runs successfully without errors.

Thanks!

Kemeng Shi (10):
ext4: correct grp validation in ext4_mb_good_group
ext4: avoid potential data overflow in next_linear_group
ext4: return found group directly in
ext4_mb_choose_next_group_p2_aligned
ext4: use is_power_of_2 helper in ext4_mb_regular_allocator
ext4: remove unnecessary return for void function
ext4: replace the traditional ternary conditional operator with
max()/min()
ext4: remove unused ext4_{set}/{clear}_bit_atomic
ext4: return found group directly in
ext4_mb_choose_next_group_goal_fast
ext4: return found group directly in
ext4_mb_choose_next_group_best_avail
ext4: correct some stale comment of criteria

fs/ext4/ext4.h | 2 --
fs/ext4/mballoc.c | 89 ++++++++++++++++++-----------------------------
2 files changed, 33 insertions(+), 58 deletions(-)

--
2.30.0
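
For readers unfamiliar with two of the cleanups named in the list above, here is a minimal userspace sketch of the general idea. It is not taken from the patches: is_pow2(), clamp_ternary() and min_uint() are hypothetical names, and the kernel's own is_power_of_2(), min() and max() helpers are only approximated here.

```c
#include <stdbool.h>

/* Exactly one bit set: the same test a power-of-two helper performs. */
static bool is_pow2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Open-coded ternary that clamps len to limit... */
static unsigned int clamp_ternary(unsigned int len, unsigned int limit)
{
	return len > limit ? limit : len;
}

/* ...and a hypothetical min-style helper that expresses the same intent. */
static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}
```

Calling min_uint(len, limit) yields the same value as clamp_ternary(len, limit) while stating the intent by name, which is presumably the motivation for such a cleanup.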



2023-07-25 10:56:44

by Kemeng Shi

Subject: [PATCH v2 08/10] ext4: return found group directly in ext4_mb_choose_next_group_goal_fast

Return the good group as soon as it is found in the loop, which removes the
further check after the loop for whether a good group was found.

Signed-off-by: Kemeng Shi <[email protected]>
Reviewed-by: Ritesh Harjani (IBM) <[email protected]>
---
fs/ext4/mballoc.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 73f8ecdf4d23..88a3c00e484f 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -959,16 +959,14 @@ static void ext4_mb_choose_next_group_goal_fast(struct ext4_allocation_context *
for (i = mb_avg_fragment_size_order(ac->ac_sb, ac->ac_g_ex.fe_len);
i < MB_NUM_ORDERS(ac->ac_sb); i++) {
grp = ext4_mb_find_good_group_avg_frag_lists(ac, i);
- if (grp)
- break;
+ if (grp) {
+ *group = grp->bb_group;
+ ac->ac_flags |= EXT4_MB_CR_GOAL_LEN_FAST_OPTIMIZED;
+ return;
+ }
}

- if (grp) {
- *group = grp->bb_group;
- ac->ac_flags |= EXT4_MB_CR_GOAL_LEN_FAST_OPTIMIZED;
- } else {
- *new_cr = CR_BEST_AVAIL_LEN;
- }
+ *new_cr = CR_BEST_AVAIL_LEN;
}

/*
--
2.30.0
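
The control-flow change in this patch can be sketched outside the kernel roughly as follows. This is a simplified illustration, not the mballoc code: NUM_ORDERS, find_group() and choose_next_group() are hypothetical stand-ins, the enum is a local two-value copy, and groups are modelled as plain integers with -1 meaning "not found".

```c
#include <stddef.h>

#define NUM_ORDERS 4	/* hypothetical; stands in for MB_NUM_ORDERS() */

enum criteria { CR_GOAL_LEN_FAST, CR_BEST_AVAIL_LEN };

/* Hypothetical lookup: returns a group number for the order, or -1. */
static int find_group(const int *lists, int order)
{
	return lists[order];
}

/*
 * Return as soon as a group is found. Reaching the end of the function
 * therefore means the search failed, so no second "if (grp)" test is
 * needed before selecting the next criterion.
 */
static void choose_next_group(const int *lists, int start,
			      int *group, enum criteria *new_cr)
{
	int i;

	for (i = start; i < NUM_ORDERS; i++) {
		int grp = find_group(lists, i);

		if (grp >= 0) {
			*group = grp;
			return;
		}
	}

	*new_cr = CR_BEST_AVAIL_LEN;
}
```

With the early return, the code after the loop has a single meaning (nothing was found), which is what allows the if/else after the loop to be dropped.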


2023-07-25 10:56:46

by Kemeng Shi

Subject: [PATCH v2 10/10] ext4: correct some stale comment of criteria

Criteria are now named CR_XXX; correct the stale comments that still refer
to criteria by raw number.

Fixes: f52f3d2b9fba ("ext4: Give symbolic names to mballoc criterias")
Signed-off-by: Kemeng Shi <[email protected]>
Reviewed-by: Ritesh Harjani (IBM) <[email protected]>
---
fs/ext4/mballoc.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 36eea63eaace..de5da76e6748 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2777,8 +2777,8 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac)

/*
* ac->ac_2order is set only if the fe_len is a power of 2
- * if ac->ac_2order is set we also set criteria to 0 so that we
- * try exact allocation using buddy.
+ * if ac->ac_2order is set we also set criteria to CR_POWER2_ALIGNED
+ * so that we try exact allocation using buddy.
*/
i = fls(ac->ac_g_ex.fe_len);
ac->ac_2order = 0;
@@ -2835,8 +2835,8 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac)
/*
* Batch reads of the block allocation bitmaps
* to get multiple READs in flight; limit
- * prefetching at cr=0/1, otherwise mballoc can
- * spend a lot of time loading imperfect groups
+ * prefetching at cr below CR_FAST, otherwise mballoc
+ * can spend a lot of time loading imperfect groups
*/
if ((prefetch_grp == group) &&
(cr >= CR_FAST ||
--
2.30.0


2023-07-25 10:57:02

by Kemeng Shi

Subject: [PATCH v2 07/10] ext4: remove unused ext4_{set}/{clear}_bit_atomic

Remove ext4_set_bit_atomic and ext4_clear_bit_atomic which are defined but not
used.

Signed-off-by: Kemeng Shi <[email protected]>
Reviewed-by: Ritesh Harjani (IBM) <[email protected]>
---
fs/ext4/ext4.h | 2 --
1 file changed, 2 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 0a2d55faa095..7166edb2e4a7 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1252,10 +1252,8 @@ struct ext4_inode_info {

#define ext4_test_and_set_bit __test_and_set_bit_le
#define ext4_set_bit __set_bit_le
-#define ext4_set_bit_atomic ext2_set_bit_atomic
#define ext4_test_and_clear_bit __test_and_clear_bit_le
#define ext4_clear_bit __clear_bit_le
-#define ext4_clear_bit_atomic ext2_clear_bit_atomic
#define ext4_test_bit test_bit_le
#define ext4_find_next_zero_bit find_next_zero_bit_le
#define ext4_find_next_bit find_next_bit_le
--
2.30.0


2023-07-25 10:57:08

by Kemeng Shi

Subject: [PATCH v2 01/10] ext4: correct grp validation in ext4_mb_good_group

The group corruption check dereferences grp and will trigger a kernel
crash if grp is NULL, so do the NULL check before the corruption check.

Fixes: 5354b2af3406 ("ext4: allow ext4_get_group_info() to fail")
Signed-off-by: Kemeng Shi <[email protected]>
---
fs/ext4/mballoc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 456150ef6111..62e7a045ad79 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2553,7 +2553,7 @@ static bool ext4_mb_good_group(struct ext4_allocation_context *ac,

BUG_ON(cr < CR_POWER2_ALIGNED || cr >= EXT4_MB_NUM_CRS);

- if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(grp) || !grp))
+ if (unlikely(!grp || EXT4_MB_GRP_BBITMAP_CORRUPT(grp)))
return false;

free = grp->bb_free;
--
2.30.0
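
The fix relies on C evaluating the operands of || left to right and stopping at the first true one. A minimal userspace sketch of that ordering, assuming a hypothetical struct group_info and GRP_CORRUPT() macro that stand in for struct ext4_group_info and EXT4_MB_GRP_BBITMAP_CORRUPT() (they are not the kernel definitions):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ext4_group_info. */
struct group_info {
	unsigned long state;
	unsigned int free;
};

/* Hypothetical stand-in for the corruption check; it dereferences grp. */
#define GRP_CORRUPT(grp) (((grp)->state & 1UL) != 0)

/*
 * !grp is evaluated first. When grp is NULL, || short-circuits and
 * GRP_CORRUPT() is never evaluated, so grp is not dereferenced.
 * With the operands swapped, GRP_CORRUPT(NULL) would run first.
 */
static bool good_group(const struct group_info *grp)
{
	if (!grp || GRP_CORRUPT(grp))
		return false;

	return grp->free > 0;
}
```

Passing NULL to good_group() returns false without touching the pointer, which mirrors the behaviour the one-line change above aims for.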


2023-07-25 10:57:24

by Kemeng Shi

Subject: [PATCH v2 09/10] ext4: return found group directly in ext4_mb_choose_next_group_best_avail

Return the good group as soon as it is found in the loop, which removes the
further check after the loop for whether a good group was found.

Signed-off-by: Kemeng Shi <[email protected]>
Reviewed-by: Ritesh Harjani (IBM) <[email protected]>
---
fs/ext4/mballoc.c | 18 ++++++++----------
1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 88a3c00e484f..36eea63eaace 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -1042,18 +1042,16 @@ static void ext4_mb_choose_next_group_best_avail(struct ext4_allocation_context
ac->ac_g_ex.fe_len);

grp = ext4_mb_find_good_group_avg_frag_lists(ac, frag_order);
- if (grp)
- break;
+ if (grp) {
+ *group = grp->bb_group;
+ ac->ac_flags |= EXT4_MB_CR_BEST_AVAIL_LEN_OPTIMIZED;
+ return;
+ }
}

- if (grp) {
- *group = grp->bb_group;
- ac->ac_flags |= EXT4_MB_CR_BEST_AVAIL_LEN_OPTIMIZED;
- } else {
- /* Reset goal length to original goal length before falling into CR_GOAL_LEN_SLOW */
- ac->ac_g_ex.fe_len = ac->ac_orig_goal_len;
- *new_cr = CR_GOAL_LEN_SLOW;
- }
+ /* Reset goal length to original goal length before falling into CR_GOAL_LEN_SLOW */
+ ac->ac_g_ex.fe_len = ac->ac_orig_goal_len;
+ *new_cr = CR_GOAL_LEN_SLOW;
}

static inline int should_optimize_scan(struct ext4_allocation_context *ac)
--
2.30.0


2023-07-25 11:08:54

by Ritesh Harjani

Subject: Re: [PATCH v2 01/10] ext4: correct grp validation in ext4_mb_good_group

Kemeng Shi <[email protected]> writes:

> Group corruption check will access memory of grp and will trigger kernel
> crash if grp is NULL. So do NULL check before corruption check.
>
> Fixes: 5354b2af3406 ("ext4: allow ext4_get_group_info() to fail")
> Signed-off-by: Kemeng Shi <[email protected]>
> ---
> fs/ext4/mballoc.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)

Looks good to me. Feel free to add:
Reviewed-by: Ritesh Harjani (IBM) <[email protected]>

>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 456150ef6111..62e7a045ad79 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -2553,7 +2553,7 @@ static bool ext4_mb_good_group(struct ext4_allocation_context *ac,
>
> BUG_ON(cr < CR_POWER2_ALIGNED || cr >= EXT4_MB_NUM_CRS);
>
> - if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(grp) || !grp))
> + if (unlikely(!grp || EXT4_MB_GRP_BBITMAP_CORRUPT(grp)))
> return false;
>
> free = grp->bb_free;
> --
> 2.30.0

2023-07-26 15:03:08

by Ojaswin Mujoo

Subject: Re: [PATCH v2 10/10] ext4: correct some stale comment of criteria

On Wed, Jul 26, 2023 at 02:51:06AM +0800, Kemeng Shi wrote:
> We named criteria with CR_XXX, correct stale comment to criteria with
> raw number.

Hi Kemeng,

Thanks for the cleanups.

>
> Fixes: f52f3d2b9fba ("ext4: Give symbolic names to mballoc criterias")
> Signed-off-by: Kemeng Shi <[email protected]>
> Reviewed-by: Ritesh Harjani (IBM) <[email protected]>
> ---
> fs/ext4/mballoc.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 36eea63eaace..de5da76e6748 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -2777,8 +2777,8 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac)
>
> /*
> * ac->ac_2order is set only if the fe_len is a power of 2
> - * if ac->ac_2order is set we also set criteria to 0 so that we
> - * try exact allocation using buddy.
> + * if ac->ac_2order is set we also set criteria to CR_POWER2_ALIGNED
> + * so that we try exact allocation using buddy.
> */
> i = fls(ac->ac_g_ex.fe_len);
> ac->ac_2order = 0;
> @@ -2835,8 +2835,8 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac)
> /*
> * Batch reads of the block allocation bitmaps
> * to get multiple READs in flight; limit
> - * prefetching at cr=0/1, otherwise mballoc can
> - * spend a lot of time loading imperfect groups
> + * prefetching at cr below CR_FAST, otherwise mballoc

One of my earlier patchsets replaced the CR_FAST macro with
ext4_mb_cr_expensive(), so maybe we can account for that here:

https://lore.kernel.org/linux-ext4/[email protected]/

Regards,
ojaswin

> + * can spend a lot of time loading imperfect groups

> */
> if ((prefetch_grp == group) &&
> (cr >= CR_FAST ||
> --
> 2.30.0
>

2023-07-27 06:42:50

by Ojaswin Mujoo

Subject: Re: [PATCH v2 10/10] ext4: correct some stale comment of criteria

On Thu, Jul 27, 2023 at 09:29:11AM +0800, Kemeng Shi wrote:
>
>
> on 7/26/2023 10:50 PM, Ojaswin Mujoo wrote:
> > On Wed, Jul 26, 2023 at 02:51:06AM +0800, Kemeng Shi wrote:
> >> We named criteria with CR_XXX, correct stale comment to criteria with
> >> raw number.
> >
> > Hi Kemeng,
> >
> > Thanks for the cleanups.
> >
> >>
> >> Fixes: f52f3d2b9fba ("ext4: Give symbolic names to mballoc criterias")
> >> Signed-off-by: Kemeng Shi <[email protected]>
> >> Reviewed-by: Ritesh Harjani (IBM) <[email protected]>
> >> ---
> >> fs/ext4/mballoc.c | 8 ++++----
> >> 1 file changed, 4 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> >> index 36eea63eaace..de5da76e6748 100644
> >> --- a/fs/ext4/mballoc.c
> >> +++ b/fs/ext4/mballoc.c
> >> @@ -2777,8 +2777,8 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac)
> >>
> >> /*
> >> * ac->ac_2order is set only if the fe_len is a power of 2
> >> - * if ac->ac_2order is set we also set criteria to 0 so that we
> >> - * try exact allocation using buddy.
> >> + * if ac->ac_2order is set we also set criteria to CR_POWER2_ALIGNED
> >> + * so that we try exact allocation using buddy.
> >> */
> >> i = fls(ac->ac_g_ex.fe_len);
> >> ac->ac_2order = 0;
> >> @@ -2835,8 +2835,8 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac)
> >> /*
> >> * Batch reads of the block allocation bitmaps
> >> * to get multiple READs in flight; limit
> >> - * prefetching at cr=0/1, otherwise mballoc can
> >> - * spend a lot of time loading imperfect groups
> >> + * prefetching at cr below CR_FAST, otherwise mballoc
> >
> > One of my earlier patchset has replaced the CR_FAST macro with
> > ext4_mb_cr_expensive() so maybe we can account for that here:
> >
> > https://lore.kernel.org/linux-ext4/[email protected]/
> >
> Hi Ojaswin, sorry for missing this. I still could not find the comment update
> of stale comment "limit prefetching at cr=0/1" in that patch. Maybe we could
> update comment to "prefetching at inexpensive CR, otherwise ...". What do
> you think. Or did I miss anything.

Hey Kemeng,

That's right, I missed the update, but I just wanted to let you know that
CR_FAST would be removed. "prefetching at inexpensive CRs, ..." sounds
good to me.

Regards,
ojaswin
>
> --
> Best wishes
> Kemeng Shi
> > Regards,
> > ojaswin
> >
> >> + * can spend a lot of time loading imperfect groups
> >
> >> */
> >> if ((prefetch_grp == group) &&
> >> (cr >= CR_FAST ||
> >> --
> >> 2.30.0
> >>
> >
>
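
To make the "inexpensive CR" wording agreed on above concrete, here is a rough userspace sketch of a predicate-based prefetch check. Everything in it is an assumption for illustration: the enum is a local copy whose order and CR_ANY_FREE member are not confirmed by this thread, cr_is_expensive() is a hypothetical stand-in for ext4_mb_cr_expensive() with a guessed threshold, and the ios/limit parameters model the part of the condition that the quoted diff context cuts off.

```c
#include <stdbool.h>

/* Local copy of the criteria; the order is assumed for this sketch. */
enum criteria {
	CR_POWER2_ALIGNED,
	CR_GOAL_LEN_FAST,
	CR_BEST_AVAIL_LEN,
	CR_GOAL_LEN_SLOW,
	CR_ANY_FREE,
	EXT4_MB_NUM_CRS
};

/* Hypothetical predicate; the threshold is an assumption. */
static bool cr_is_expensive(enum criteria cr)
{
	return cr >= CR_GOAL_LEN_SLOW;
}

/*
 * Prefetching is limited only at inexpensive criteria: at an expensive
 * criterion it always proceeds, otherwise only while the number of
 * prefetch I/Os issued so far is under the limit.
 */
static bool should_prefetch(enum criteria cr, unsigned int ios,
			    unsigned int limit)
{
	return cr_is_expensive(cr) || ios < limit;
}
```

Naming the predicate keeps the comment and the condition in step: the comment can say "inexpensive CRs" without referring to a specific enum value that may later be renamed or removed.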