2023-02-20 08:53:55

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] btrfs: replace btrfs_io_context::raid_map[] with a fixed u64 value

Hi Qu,

On Tue, 7 Feb 2023, Qu Wenruo wrote:
> In btrfs_io_context structure, we have a pointer raid_map, which is to
> indicate the logical bytenr for each stripe.
>
> But considering we always call sort_parity_stripes(), the result
> raid_map[] is always sorted, thus raid_map[0] is always the logical
> bytenr of the full stripe.
>
> So why we waste the space and time (for sorting) for raid_map[]?
>
> This patch will replace btrfs_io_context::raid_map with a single u64
> number, full_stripe_start, by:
>
> - Replace btrfs_io_context::raid_map with full_stripe_start
>
> - Replace call sites using raid_map[0] to use full_stripe_start
>
> - Replace call sites using raid_map[i] to compare with nr_data_stripes.
>
> The benefits are:
>
> - Less memory wasted on raid_map
> It's sizeof(u64) * num_stripes vs size(u64).
> It's always a save for at least one u64, and the benefit grows larger
> with num_stripes.
>
> - No more weird alloc_btrfs_io_context() behavior
> As there is only one fixed size + one variable length array.
>
> Signed-off-by: Qu Wenruo <[email protected]>

Thanks for your patch, which is now commit 4a8c6e8a6dc8ae4c ("btrfs:
replace btrfs_io_context::raid_map with a fixed u64 value") in
next-20230220.

[email protected] reported several build failures when
building for 32-bit platforms:

ERROR: modpost: "__umoddi3" [fs/btrfs/btrfs.ko] undefined!

> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -6556,35 +6532,44 @@ int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op,
> }
> bioc->map_type = map->type;
>
> - for (i = 0; i < num_stripes; i++) {
> - set_io_stripe(&bioc->stripes[i], map, stripe_index, stripe_offset,
> - stripe_nr);
> - stripe_index++;
> - }
> -
> - /* Build raid_map */
> + /*
> + * For RAID56 full map, we need to make sure the stripes[] follows
> + * the rule that data stripes are all ordered, then followed with P
> + * and Q (if we have).
> + *
> + * It's still mostly the same as other profiles, just with extra
> + * rotataion.
> + */
> if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK && need_raid_map &&
> (need_full_stripe(op) || mirror_num > 1)) {
> - u64 tmp;
> - unsigned rot;
> -
> - /* Work out the disk rotation on this stripe-set */
> - rot = stripe_nr % num_stripes;
> -
> - /* Fill in the logical address of each stripe */
> - tmp = stripe_nr * data_stripes;
> - for (i = 0; i < data_stripes; i++)
> - bioc->raid_map[(i + rot) % num_stripes] =
> - em->start + ((tmp + i) << BTRFS_STRIPE_LEN_SHIFT);
> -
> - bioc->raid_map[(i + rot) % map->num_stripes] = RAID5_P_STRIPE;
> - if (map->type & BTRFS_BLOCK_GROUP_RAID6)
> - bioc->raid_map[(i + rot + 1) % num_stripes] =
> - RAID6_Q_STRIPE;
> -
> - sort_parity_stripes(bioc, num_stripes);
> + /*
> + * For RAID56 @stripe_nr is already the number of full stripes
> + * before us, which is also the rotation value (needs to modulo
> + * with num_stripes).
> + *
> + * In this case, we just add @stripe_nr with @i, then do the
> + * modulo, to reduce one modulo call.
> + */
> + bioc->full_stripe_logical = em->start +
> + ((stripe_nr * data_stripes) << BTRFS_STRIPE_LEN_SHIFT);
> + for (i = 0; i < num_stripes; i++) {
> + set_io_stripe(&bioc->stripes[i], map,
> + (i + stripe_nr) % num_stripes,

As stripe_nr is u64, this is a 64-by-32 modulo operation, which
should be implemented using a helper from include/linux/math64.h
instead.

> + stripe_offset, stripe_nr);
> + }
> + } else {
> + /*
> + * For all other non-RAID56 profiles, just copy the target
> + * stripe into the bioc.
> + */
> + for (i = 0; i < num_stripes; i++) {
> + set_io_stripe(&bioc->stripes[i], map, stripe_index,
> + stripe_offset, stripe_nr);
> + stripe_index++;
> + }
> }
>
> +
> if (need_full_stripe(op))
> max_errors = btrfs_chunk_max_errors(map);

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


2023-02-20 11:50:56

by Qu Wenruo

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] btrfs: replace btrfs_io_context::raid_map[] with a fixed u64 value



On 2023/2/20 16:53, Geert Uytterhoeven wrote:
>     Hi Qu,
>
> On Tue, 7 Feb 2023, Qu Wenruo wrote:
>> In btrfs_io_context structure, we have a pointer raid_map, which is to
>> indicate the logical bytenr for each stripe.
>>
>> But considering we always call sort_parity_stripes(), the result
>> raid_map[] is always sorted, thus raid_map[0] is always the logical
>> bytenr of the full stripe.
>>
>> So why we waste the space and time (for sorting) for raid_map[]?
>>
>> This patch will replace btrfs_io_context::raid_map with a single u64
>> number, full_stripe_start, by:
>>
>> - Replace btrfs_io_context::raid_map with full_stripe_start
>>
>> - Replace call sites using raid_map[0] to use full_stripe_start
>>
>> - Replace call sites using raid_map[i] to compare with nr_data_stripes.
>>
>> The benefits are:
>>
>> - Less memory wasted on raid_map
>>  It's sizeof(u64) * num_stripes vs size(u64).
>>  It's always a save for at least one u64, and the benefit grows larger
>>  with num_stripes.
>>
>> - No more weird alloc_btrfs_io_context() behavior
>>  As there is only one fixed size + one variable length array.
>>
>> Signed-off-by: Qu Wenruo <[email protected]>
>
> Thanks for your patch, which is now commit 4a8c6e8a6dc8ae4c ("btrfs:
> replace btrfs_io_context::raid_map with a fixed u64 value") in
> next-20230220.
>
> [email protected] reported several build failures when
> building for 32-bit platforms:
>
>     ERROR: modpost: "__umoddi3" [fs/btrfs/btrfs.ko] undefined!
>
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -6556,35 +6532,44 @@ int __btrfs_map_block(struct btrfs_fs_info
>> *fs_info, enum btrfs_map_op op,
>>     }
>>     bioc->map_type = map->type;
>>
>> -    for (i = 0; i < num_stripes; i++) {
>> -        set_io_stripe(&bioc->stripes[i], map, stripe_index,
>> stripe_offset,
>> -                  stripe_nr);
>> -        stripe_index++;
>> -    }
>> -
>> -    /* Build raid_map */
>> +    /*
>> +     * For RAID56 full map, we need to make sure the stripes[] follows
>> +     * the rule that data stripes are all ordered, then followed with P
>> +     * and Q (if we have).
>> +     *
>> +     * It's still mostly the same as other profiles, just with extra
>> +     * rotataion.
>> +     */
>>     if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK && need_raid_map &&
>>         (need_full_stripe(op) || mirror_num > 1)) {
>> -        u64 tmp;
>> -        unsigned rot;
>> -
>> -        /* Work out the disk rotation on this stripe-set */
>> -        rot = stripe_nr % num_stripes;
>> -
>> -        /* Fill in the logical address of each stripe */
>> -        tmp = stripe_nr * data_stripes;
>> -        for (i = 0; i < data_stripes; i++)
>> -            bioc->raid_map[(i + rot) % num_stripes] =
>> -                em->start + ((tmp + i) << BTRFS_STRIPE_LEN_SHIFT);
>> -
>> -        bioc->raid_map[(i + rot) % map->num_stripes] = RAID5_P_STRIPE;
>> -        if (map->type & BTRFS_BLOCK_GROUP_RAID6)
>> -            bioc->raid_map[(i + rot + 1) % num_stripes] =
>> -                RAID6_Q_STRIPE;
>> -
>> -        sort_parity_stripes(bioc, num_stripes);
>> +        /*
>> +         * For RAID56 @stripe_nr is already the number of full stripes
>> +         * before us, which is also the rotation value (needs to modulo
>> +         * with num_stripes).
>> +         *
>> +         * In this case, we just add @stripe_nr with @i, then do the
>> +         * modulo, to reduce one modulo call.
>> +         */
>> +        bioc->full_stripe_logical = em->start +
>> +            ((stripe_nr * data_stripes) << BTRFS_STRIPE_LEN_SHIFT);
>> +        for (i = 0; i < num_stripes; i++) {
>> +            set_io_stripe(&bioc->stripes[i], map,
>> +                      (i + stripe_nr) % num_stripes,
>
> As stripe_nr is u64, this is a 64-by-32 modulo operation, which
> should be implemented using a helper from include/linux/math64.h
> instead.

This is an older version, in the latest version, the @stripe_nr variable
is also u32, and I tried compiling the latest branch with i686, it
doesn't cause any u64 division problems anymore.

You can find the latest branch in either github or from the mailling list:

https://github.com/adam900710/linux/tree/map_block_refactor

https://lore.kernel.org/linux-btrfs/[email protected]/

Thanks,
Qu

>
>> +                      stripe_offset, stripe_nr);
>> +        }
>> +    } else {
>> +        /*
>> +         * For all other non-RAID56 profiles, just copy the target
>> +         * stripe into the bioc.
>> +         */
>> +        for (i = 0; i < num_stripes; i++) {
>> +            set_io_stripe(&bioc->stripes[i], map, stripe_index,
>> +                      stripe_offset, stripe_nr);
>> +            stripe_index++;
>> +        }
>>     }
>>
>> +
>>     if (need_full_stripe(op))
>>         max_errors = btrfs_chunk_max_errors(map);
>
> Gr{oetje,eeting}s,
>
>                         Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 --
> [email protected]
>
> In personal conversations with technical people, I call myself a hacker.
> But
> when I'm talking to journalists I just say "programmer" or something
> like that.
>                                 -- Linus Torvalds

2023-02-20 12:15:06

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] btrfs: replace btrfs_io_context::raid_map[] with a fixed u64 value

Hi Qu,

On Mon, Feb 20, 2023 at 12:50 PM Qu Wenruo <[email protected]> wrote:
> On 2023/2/20 16:53, Geert Uytterhoeven wrote:
> > On Tue, 7 Feb 2023, Qu Wenruo wrote:
> >> In btrfs_io_context structure, we have a pointer raid_map, which is to
> >> indicate the logical bytenr for each stripe.
> >>
> >> But considering we always call sort_parity_stripes(), the result
> >> raid_map[] is always sorted, thus raid_map[0] is always the logical
> >> bytenr of the full stripe.
> >>
> >> So why we waste the space and time (for sorting) for raid_map[]?
> >>
> >> This patch will replace btrfs_io_context::raid_map with a single u64
> >> number, full_stripe_start, by:
> >>
> >> - Replace btrfs_io_context::raid_map with full_stripe_start
> >>
> >> - Replace call sites using raid_map[0] to use full_stripe_start
> >>
> >> - Replace call sites using raid_map[i] to compare with nr_data_stripes.
> >>
> >> The benefits are:
> >>
> >> - Less memory wasted on raid_map
> >> It's sizeof(u64) * num_stripes vs size(u64).
> >> It's always a save for at least one u64, and the benefit grows larger
> >> with num_stripes.
> >>
> >> - No more weird alloc_btrfs_io_context() behavior
> >> As there is only one fixed size + one variable length array.
> >>
> >> Signed-off-by: Qu Wenruo <[email protected]>
> >
> > Thanks for your patch, which is now commit 4a8c6e8a6dc8ae4c ("btrfs:
> > replace btrfs_io_context::raid_map with a fixed u64 value") in
> > next-20230220.
> >
> > [email protected] reported several build failures when
> > building for 32-bit platforms:
> >
> > ERROR: modpost: "__umoddi3" [fs/btrfs/btrfs.ko] undefined!
> >
> >> --- a/fs/btrfs/volumes.c
> >> +++ b/fs/btrfs/volumes.c
> >> @@ -6556,35 +6532,44 @@ int __btrfs_map_block(struct btrfs_fs_info
> >> *fs_info, enum btrfs_map_op op,
> >> }
> >> bioc->map_type = map->type;
> >>
> >> - for (i = 0; i < num_stripes; i++) {
> >> - set_io_stripe(&bioc->stripes[i], map, stripe_index,
> >> stripe_offset,
> >> - stripe_nr);
> >> - stripe_index++;
> >> - }
> >> -
> >> - /* Build raid_map */
> >> + /*
> >> + * For RAID56 full map, we need to make sure the stripes[] follows
> >> + * the rule that data stripes are all ordered, then followed with P
> >> + * and Q (if we have).
> >> + *
> >> + * It's still mostly the same as other profiles, just with extra
> >> + * rotataion.
> >> + */
> >> if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK && need_raid_map &&
> >> (need_full_stripe(op) || mirror_num > 1)) {
> >> - u64 tmp;
> >> - unsigned rot;
> >> -
> >> - /* Work out the disk rotation on this stripe-set */
> >> - rot = stripe_nr % num_stripes;
> >> -
> >> - /* Fill in the logical address of each stripe */
> >> - tmp = stripe_nr * data_stripes;
> >> - for (i = 0; i < data_stripes; i++)
> >> - bioc->raid_map[(i + rot) % num_stripes] =
> >> - em->start + ((tmp + i) << BTRFS_STRIPE_LEN_SHIFT);
> >> -
> >> - bioc->raid_map[(i + rot) % map->num_stripes] = RAID5_P_STRIPE;
> >> - if (map->type & BTRFS_BLOCK_GROUP_RAID6)
> >> - bioc->raid_map[(i + rot + 1) % num_stripes] =
> >> - RAID6_Q_STRIPE;
> >> -
> >> - sort_parity_stripes(bioc, num_stripes);
> >> + /*
> >> + * For RAID56 @stripe_nr is already the number of full stripes
> >> + * before us, which is also the rotation value (needs to modulo
> >> + * with num_stripes).
> >> + *
> >> + * In this case, we just add @stripe_nr with @i, then do the
> >> + * modulo, to reduce one modulo call.
> >> + */
> >> + bioc->full_stripe_logical = em->start +
> >> + ((stripe_nr * data_stripes) << BTRFS_STRIPE_LEN_SHIFT);
> >> + for (i = 0; i < num_stripes; i++) {
> >> + set_io_stripe(&bioc->stripes[i], map,
> >> + (i + stripe_nr) % num_stripes,
> >
> > As stripe_nr is u64, this is a 64-by-32 modulo operation, which
> > should be implemented using a helper from include/linux/math64.h
> > instead.
>
> This is an older version, in the latest version, the @stripe_nr variable
> is also u32, and I tried compiling the latest branch with i686, it
> doesn't cause any u64 division problems anymore.
>
> You can find the latest branch in either github or from the mailling list:
>
> https://github.com/adam900710/linux/tree/map_block_refactor
>
> https://lore.kernel.org/linux-btrfs/[email protected]/

So the older version was "v2", and the latest version had no
version indicator, nor changelog, thus assuming v1?
No surprise people end up applying the wrong version...

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2023-02-21 00:10:07

by Qu Wenruo

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] btrfs: replace btrfs_io_context::raid_map[] with a fixed u64 value



On 2023/2/20 20:14, Geert Uytterhoeven wrote:
> Hi Qu,
>
> On Mon, Feb 20, 2023 at 12:50 PM Qu Wenruo <[email protected]> wrote:
>> On 2023/2/20 16:53, Geert Uytterhoeven wrote:
>>> On Tue, 7 Feb 2023, Qu Wenruo wrote:
>>>> In btrfs_io_context structure, we have a pointer raid_map, which is to
>>>> indicate the logical bytenr for each stripe.
>>>>
>>>> But considering we always call sort_parity_stripes(), the result
>>>> raid_map[] is always sorted, thus raid_map[0] is always the logical
>>>> bytenr of the full stripe.
>>>>
>>>> So why we waste the space and time (for sorting) for raid_map[]?
>>>>
>>>> This patch will replace btrfs_io_context::raid_map with a single u64
>>>> number, full_stripe_start, by:
>>>>
>>>> - Replace btrfs_io_context::raid_map with full_stripe_start
>>>>
>>>> - Replace call sites using raid_map[0] to use full_stripe_start
>>>>
>>>> - Replace call sites using raid_map[i] to compare with nr_data_stripes.
>>>>
>>>> The benefits are:
>>>>
>>>> - Less memory wasted on raid_map
>>>> It's sizeof(u64) * num_stripes vs size(u64).
>>>> It's always a save for at least one u64, and the benefit grows larger
>>>> with num_stripes.
>>>>
>>>> - No more weird alloc_btrfs_io_context() behavior
>>>> As there is only one fixed size + one variable length array.
>>>>
>>>> Signed-off-by: Qu Wenruo <[email protected]>
>>>
>>> Thanks for your patch, which is now commit 4a8c6e8a6dc8ae4c ("btrfs:
>>> replace btrfs_io_context::raid_map with a fixed u64 value") in
>>> next-20230220.
>>>
>>> [email protected] reported several build failures when
>>> building for 32-bit platforms:
>>>
>>> ERROR: modpost: "__umoddi3" [fs/btrfs/btrfs.ko] undefined!
>>>
>>>> --- a/fs/btrfs/volumes.c
>>>> +++ b/fs/btrfs/volumes.c
>>>> @@ -6556,35 +6532,44 @@ int __btrfs_map_block(struct btrfs_fs_info
>>>> *fs_info, enum btrfs_map_op op,
>>>> }
>>>> bioc->map_type = map->type;
>>>>
>>>> - for (i = 0; i < num_stripes; i++) {
>>>> - set_io_stripe(&bioc->stripes[i], map, stripe_index,
>>>> stripe_offset,
>>>> - stripe_nr);
>>>> - stripe_index++;
>>>> - }
>>>> -
>>>> - /* Build raid_map */
>>>> + /*
>>>> + * For RAID56 full map, we need to make sure the stripes[] follows
>>>> + * the rule that data stripes are all ordered, then followed with P
>>>> + * and Q (if we have).
>>>> + *
>>>> + * It's still mostly the same as other profiles, just with extra
>>>> + * rotataion.
>>>> + */
>>>> if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK && need_raid_map &&
>>>> (need_full_stripe(op) || mirror_num > 1)) {
>>>> - u64 tmp;
>>>> - unsigned rot;
>>>> -
>>>> - /* Work out the disk rotation on this stripe-set */
>>>> - rot = stripe_nr % num_stripes;
>>>> -
>>>> - /* Fill in the logical address of each stripe */
>>>> - tmp = stripe_nr * data_stripes;
>>>> - for (i = 0; i < data_stripes; i++)
>>>> - bioc->raid_map[(i + rot) % num_stripes] =
>>>> - em->start + ((tmp + i) << BTRFS_STRIPE_LEN_SHIFT);
>>>> -
>>>> - bioc->raid_map[(i + rot) % map->num_stripes] = RAID5_P_STRIPE;
>>>> - if (map->type & BTRFS_BLOCK_GROUP_RAID6)
>>>> - bioc->raid_map[(i + rot + 1) % num_stripes] =
>>>> - RAID6_Q_STRIPE;
>>>> -
>>>> - sort_parity_stripes(bioc, num_stripes);
>>>> + /*
>>>> + * For RAID56 @stripe_nr is already the number of full stripes
>>>> + * before us, which is also the rotation value (needs to modulo
>>>> + * with num_stripes).
>>>> + *
>>>> + * In this case, we just add @stripe_nr with @i, then do the
>>>> + * modulo, to reduce one modulo call.
>>>> + */
>>>> + bioc->full_stripe_logical = em->start +
>>>> + ((stripe_nr * data_stripes) << BTRFS_STRIPE_LEN_SHIFT);
>>>> + for (i = 0; i < num_stripes; i++) {
>>>> + set_io_stripe(&bioc->stripes[i], map,
>>>> + (i + stripe_nr) % num_stripes,
>>>
>>> As stripe_nr is u64, this is a 64-by-32 modulo operation, which
>>> should be implemented using a helper from include/linux/math64.h
>>> instead.
>>
>> This is an older version, in the latest version, the @stripe_nr variable
>> is also u32, and I tried compiling the latest branch with i686, it
>> doesn't cause any u64 division problems anymore.
>>
>> You can find the latest branch in either github or from the mailling list:
>>
>> https://github.com/adam900710/linux/tree/map_block_refactor
>>
>> https://lore.kernel.org/linux-btrfs/[email protected]/
>
> So the older version was "v2", and the latest version had no
> version indicator, nor changelog, thus assuming v1?
> No surprise people end up applying the wrong version...

The previous version is two separate patchsets, the new one is the
merged one.

And I sent the merged version because the dependency problem and
conflicts, and since it's the merged version, no changelog based on
previous version.

Thanks,
Qu

>
> Gr{oetje,eeting}s,
>
> Geert
>