by Theodore Ts'o

Subject: Re: [PATCH v2 00/12] multiblock allocator improvements

On Tue, 30 May 2023 18:03:38 +0530, Ojaswin Mujoo wrote:
> ** Changed since v1 [2] **
>
> 1. Rebase over Kemeng's recent mballoc patchset [3]
> 2. Picked up Kemeng's RVB on patch 1/12
>
> [2] https://lore.kernel.org/all/[email protected]/
> [3] https://lore.kernel.org/all/[email protected]/
>
> [...]

Applied, thanks!

[01/12] Revert "ext4: remove ac->ac_found > sbi->s_mb_min_to_scan dead check in ext4_mb_check_limits"
commit: 3582e74599d376bc18cae123045cd295360d885b
[02/12] ext4: mballoc: Remove useless setting of ac_criteria
commit: fb665804fd62e600b5c2350ea69295261ce8374d
[03/12] ext4: Remove unused extern variables declaration
commit: 3086ed54c0e65c60b0fb142e181e7dd4e3b7b1e0
[04/12] ext4: Convert mballoc cr (criteria) to enum
commit: eb7d4a8b9510887fb690a6b912d80cb0bce21387
[05/12] ext4: Add per CR extent scanned counter
commit: 9e97d81a1fa105b80583b5152e4b9cb794734585
[06/12] ext4: Add counter to track successful allocation of goal length
commit: af97bca67ff63191d44023f895b6033eb7d3423a
[07/12] ext4: Avoid scanning smaller extents in BG during CR1
commit: caf886aecd608a8ef05ab10957cf4b9fd9564712
[08/12] ext4: Don't skip prefetching BLOCK_UNINIT groups
commit: bf912c937ed41c4581d77806b003f22625eee0b5
[09/12] ext4: Ensure ext4_mb_prefetch_fini() is called for all prefetched BGs
commit: 64f6fb876cedc30fc1430b96eb442bd84bc61459
[10/12] ext4: Abstract out logic to search average fragment list
commit: 1918cdc99d125c275dcdd4527520c78bb1a3c1ef
[11/12] ext4: Add allocation criteria 1.5 (CR1_5)
commit: 7b748ea2a6ad2bda304553b5cf8745f542af6b34
[12/12] ext4: Give symbolic names to mballoc criterias
commit: c9f19daa1824a73218526650a9aade17536527c8

Best regards,
--
Theodore Ts'o <[email protected]>

2023-06-27 06:54:33

by Ojaswin Mujoo

Subject: Re: [PATCH v2 09/12] ext4: Ensure ext4_mb_prefetch_fini() is called for all prefetched BGs

On Tue, Jun 06, 2023 at 10:00:57PM +0800, Guoqing Jiang wrote:
> Hello,
>
> On 5/30/23 20:33, Ojaswin Mujoo wrote:
> > Before this patch, the call stack in ext4_run_li_request is as follows:
> >
> > /*
> > * nr = no. of BGs we want to fetch (=s_mb_prefetch)
> > * prefetch_ios = no. of BGs not uptodate after
> > * ext4_read_block_bitmap_nowait()
> > */
> > next_group = ext4_mb_prefetch(sb, group, nr, prefetch_ios);
> > ext4_mb_prefetch_fini(sb, next_group, prefetch_ios);
> >
> > ext4_mb_prefetch_fini() will only try to initialize buddies for BGs in
> > range [next_group - prefetch_ios, next_group). This is incorrect since
> > sometimes (prefetch_ios < nr), which causes ext4_mb_prefetch_fini() to
> > incorrectly ignore some of the BGs that might need initialization. This
> > issue is more notable now with the previous patch enabling "fetching" of
> > BLOCK_UNINIT BGs which are marked buffer_uptodate by default.
> >
> > Fix this by passing nr to ext4_mb_prefetch_fini() instead of
> > prefetch_ios so that it considers the right range of groups.
>
> Thanks for the series.
>
> > Similarly, make sure we don't pass nr=0 to ext4_mb_prefetch_fini() in
> > ext4_mb_regular_allocator() since we might have prefetched BLOCK_UNINIT
> > groups that would need buddy initialization.
>
> Seems ext4_mb_prefetch_fini can't be called by ext4_mb_regular_allocator
> if nr is 0.

Hi Guoqing,

Sorry, I was on vacation and didn't get a chance to reply sooner.
Let me explain what I meant by that statement in the commit message.

So basically, the prefetch_ios output argument is incremented whenever
ext4_mb_prefetch() reads a block group with !buffer_uptodate(bh).
However, for BLOCK_UNINIT BGs the buffer is marked uptodate after
initialization and hence prefetch_ios is not incremented when such BGs
are prefetched.
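
To make the accounting concrete, here is a small userspace sketch. It is
only a toy model under my own assumptions, not the kernel code: struct
toy_group and toy_prefetch() are hypothetical names, and the two flags
merely stand in for EXT4_BG_BLOCK_UNINIT and buffer_uptodate(bh).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a block group; not the kernel's structure. */
struct toy_group {
	bool block_uninit;	/* models the BLOCK_UNINIT flag */
	bool bitmap_cached;	/* models a bitmap buffer that is already uptodate */
};

/*
 * Toy model of the counting done while prefetching: walk nr groups
 * starting at 'start' and bump *ios only for groups whose bitmap would
 * need real I/O. Returns the next group to look at, wrapping at ngroups.
 */
static unsigned int toy_prefetch(const struct toy_group *g,
				 unsigned int ngroups, unsigned int start,
				 unsigned int nr, unsigned int *ios)
{
	unsigned int group = start;

	assert(ngroups > 0 && start < ngroups);
	while (nr-- > 0) {
		/* A BLOCK_UNINIT bitmap is built in memory, so it is uptodate. */
		bool uptodate = g[group].block_uninit || g[group].bitmap_cached;

		if (!uptodate)
			(*ios)++;
		if (++group >= ngroups)
			group = 0;
	}
	return group;
}
```

In this model, prefetching four BLOCK_UNINIT groups advances the group
cursor by four but leaves the I/O counter unchanged.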

This leads to nr becoming 0 due to the following line (removed in this patch):

	if (prefetch_ios == curr_ios)
		nr = 0;

Hence ext4_mb_prefetch_fini() would never pre-initialize the
corresponding buddy structures. Instead, these structures would probably
get initialized later, during the slower allocation criteria. The
motivation for making sure the BLOCK_UNINIT BGs' buddies are
pre-initialized is so that the faster allocation criteria can use the
data to make better decisions.
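
The effect of the count argument can be sketched in userspace as follows.
This is a simplified model loosely based on how the fini step walks
backwards from next_group; toy_prefetch_fini(), buddy_ready and
TOY_NGROUPS are hypothetical names, and the real function applies extra
checks before initializing a group.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NGROUPS 8u	/* assumed group count for the toy model */

/*
 * Toy model of the fini step: walk backwards from next_group over
 * 'count' groups and mark each one as having its buddy initialized.
 * Returns the number of groups visited.
 */
static unsigned int toy_prefetch_fini(bool *buddy_ready,
				      unsigned int next_group,
				      unsigned int count)
{
	unsigned int group = next_group, visited = 0;

	assert(next_group < TOY_NGROUPS);
	while (count-- > 0) {
		if (group == 0)
			group = TOY_NGROUPS;	/* wrap around to the last group */
		group--;
		buddy_ready[group] = true;
		visited++;
	}
	return visited;
}
```

With next_group = 4, a count of 0 (the old nr = 0 case) visits nothing,
while a count of 4 covers groups 0 to 3, which is the range that was
actually prefetched.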

Regards,
ojaswin

>
> https://elixir.bootlin.com/linux/v6.4-rc5/source/fs/ext4/mballoc.c#L2816
>
> Am I missing something?
>
> Thanks,
> Guoqing
>