On Wed 13-12-23 16:16:35, Suraj Jitindar Singh wrote:
> The ext4 filesystem tracks the trim status of blocks at the group level.
> When an entire group has been trimmed then it is marked as such and subsequent
> trim invocations with the same minimum trim size will not be attempted on that
> group unless it is marked as able to be trimmed again such as when a block is
> freed.
>
> Currently the last group can't be marked as trimmed due to incorrect logic
> in ext4_last_grp_cluster(). ext4_last_grp_cluster() is supposed to return the
> zero based index of the last cluster in a group. This is then used by
> ext4_try_to_trim_range() to determine if the trim operation spans the entire
> group and as such if the trim status of the group should be recorded.
>
> ext4_last_grp_cluster() takes a 0 based group index, thus the valid values
> for grp are 0..(ext4_get_groups_count - 1). Any group index less than
> (ext4_get_groups_count - 1) is not the last group and must have
> EXT4_CLUSTERS_PER_GROUP(sb) clusters. For the last group we need to calculate
> the number of clusters based on the number of blocks in the group. Finally
> subtract 1 from the number of clusters as zero based indexing is expected.
> Rearrange the function slightly to make it clear what we are calculating
> and returning.
>
> Reproducer:
> // Create file system where the last group has fewer blocks than blocks per group
> $ mkfs.ext4 -b 4096 -g 8192 /dev/nvme0n1 8191
> $ mount /dev/nvme0n1 /mnt
>
> Before Patch:
> $ fstrim -v /mnt
> /mnt: 25.9 MiB (27156480 bytes) trimmed
> // Group not marked as trimmed so second invocation still discards blocks
> $ fstrim -v /mnt
> /mnt: 25.9 MiB (27156480 bytes) trimmed
>
> After Patch:
> fstrim -v /mnt
> /mnt: 25.9 MiB (27156480 bytes) trimmed
> // Group marked as trimmed so second invocation DOESN'T discard any blocks
> fstrim -v /mnt
> /mnt: 0 B (0 bytes) trimmed
>
> Fixes: 45e4ab320c9b ("ext4: move setting of trimmed bit into ext4_try_to_trim_range()")
> Cc: [email protected] # 4.19+
> Signed-off-by: Suraj Jitindar Singh <[email protected]>
Indeed. The fix looks good. Feel free to add:
Reviewed-by: Jan Kara <[email protected]>
Honza
> ---
> fs/ext4/mballoc.c | 15 ++++++++++-----
> 1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 454d5612641ee..c15d8b6f887dd 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -6731,11 +6731,16 @@ __acquires(bitlock)
> static ext4_grpblk_t ext4_last_grp_cluster(struct super_block *sb,
> ext4_group_t grp)
> {
> - if (grp < ext4_get_groups_count(sb))
> - return EXT4_CLUSTERS_PER_GROUP(sb) - 1;
> - return (ext4_blocks_count(EXT4_SB(sb)->s_es) -
> - ext4_group_first_block_no(sb, grp) - 1) >>
> - EXT4_CLUSTER_BITS(sb);
> + unsigned long nr_clusters_in_group;
> +
> + if (grp < (ext4_get_groups_count(sb) - 1))
> + nr_clusters_in_group = EXT4_CLUSTERS_PER_GROUP(sb);
> + else
> + nr_clusters_in_group = (ext4_blocks_count(EXT4_SB(sb)->s_es) -
> + ext4_group_first_block_no(sb, grp))
> + >> EXT4_CLUSTER_BITS(sb);
> +
> + return nr_clusters_in_group - 1;
> }
>
> static bool ext4_trim_interrupted(void)
> --
> 2.34.1
>
--
Jan Kara <[email protected]>
SUSE Labs, CR