2020-12-02 18:28:23

by Yun Levi

Subject:

On Thu, Dec 3, 2020 at 2:26 AM Yury Norov <[email protected]> wrote:

> Also look at lib/find_bit_benchmark.c
Thanks. I'll see.

> We need find_next_*_bit() because find_first_*_bit() can start searching only at word-aligned
> bits. In the case of find_last_*_bit(), we can start at any bit. So, if my understanding is correct,
> for the purpose of reverse traversing we can go with already existing find_last_bit(),

Thank you. I hadn't thought of it that way.
But I think that if we implement reverse traversal using find_last_bit(),
we have a problem.
Suppose bits 0, 1, and 2 are set.
If we start
find_last_bit(bitmap, 3) ==> return 2;
find_last_bit(bitmap, 2) ==> return 1;
find_last_bit(bitmap, 1) ==> return 0;
find_last_bit(bitmap, 0) ==> return 0? // here we couldn't
distinguish a size-0 input from bit 0 being set

and the for_each traversal routines prevent the above case by
returning the size (nbits) via find_next_bit.
So, for compatibility and the same expected return value as forward
traversal, I think we need the find_prev_*_bit routines, if my
understanding is correct.
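To make the proposed semantics concrete, here is a naive, hedged model of the find_prev_bit() being discussed (not the posted implementation; the exact name and argument order are assumptions): it searches backward from `offset` inclusive and returns `size` as the not-found sentinel, matching find_next_bit()'s contract.

```c
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Naive reference model of the proposed find_prev_bit(): scan backward
 * from bit 'offset' (inclusive) and return the position of the first
 * set bit found, or 'size' when nothing is set in that range. This
 * mirrors find_next_bit()'s "return nbits on failure" contract. */
static unsigned long find_prev_bit(const unsigned long *addr,
				   unsigned long size,
				   unsigned long offset)
{
	unsigned long bit;

	if (offset >= size)
		return size;

	for (bit = offset + 1; bit-- > 0; )
		if (addr[bit / BITS_PER_LONG] & (1UL << (bit % BITS_PER_LONG)))
			return bit;

	return size;	/* sentinel: no set bit at or before offset */
}
```

With this contract, the size-0-versus-bit-0 ambiguity above disappears, since the sentinel is always the separately passed size.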


> I think this patch has some good catches. We definitely need to implement
> find_last_zero_bit(), as it is used by fs/ufs, and their local implementation is not optimal.
>
> We also should consider adding reverse traversing macros based on find_last_*_bit(),
> if there are proposed users.

Not only that: I think 'steal_from_bitmap_to_front' can be improved
using find_prev_zero_bit,
like

diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index af0013d3df63..9debb9707390 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -2372,7 +2372,6 @@ static bool steal_from_bitmap_to_front(struct btrfs_free_space_ctl *ctl,
 	u64 bitmap_offset;
 	unsigned long i;
 	unsigned long j;
-	unsigned long prev_j;
 	u64 bytes;
 
 	bitmap_offset = offset_to_bitmap(ctl, info->offset);
@@ -2388,20 +2387,15 @@ static bool steal_from_bitmap_to_front(struct btrfs_free_space_ctl *ctl,
 		return false;
 
 	i = offset_to_bit(bitmap->offset, ctl->unit, info->offset) - 1;
-	j = 0;
-	prev_j = (unsigned long)-1;
-	for_each_clear_bit_from(j, bitmap->bitmap, BITS_PER_BITMAP) {
-		if (j > i)
-			break;
-		prev_j = j;
-	}
-	if (prev_j == i)
+	j = find_prev_zero_bit(bitmap->bitmap, BITS_PER_BITMAP, i);
+
+	if (j == i)
 		return false;
 
-	if (prev_j == (unsigned long)-1)
+	if (j == BITS_PER_BITMAP)
 		bytes = (i + 1) * ctl->unit;
 	else
-		bytes = (i - prev_j) * ctl->unit;
+		bytes = (i - j) * ctl->unit;
 
 	info->offset -= bytes;
 	info->bytes += bytes;

Thanks.

HTH
Levi.


2020-12-02 21:29:43

by Yury Norov

Subject: Re:

On Wed, Dec 2, 2020 at 10:22 AM Yun Levi <[email protected]> wrote:
>
> On Thu, Dec 3, 2020 at 2:26 AM Yury Norov <[email protected]> wrote:
>
> > Also look at lib/find_bit_benchmark.c
> Thanks. I'll see.
>
> > We need find_next_*_bit() because find_first_*_bit() can start searching only at word-aligned
> > bits. In the case of find_last_*_bit(), we can start at any bit. So, if my understanding is correct,
> > for the purpose of reverse traversing we can go with already existing find_last_bit(),
>
> Thank you. I haven't thought that way.
> But I think if we implement reverse traversing using find_last_bit(),
> we have a problem.
> Suppose the last bit 0, 1, 2, is set.
> If we start
> find_last_bit(bitmap, 3) ==> return 2;
> find_last_bit(bitmap, 2) ==> return 1;
> find_last_bit(bitmap, 1) ==> return 0;
> find_last_bit(bitmap, 0) ===> return 0? // here we couldn't
> distinguish size 0 input or 0 is set

If you traverse backward and reach bit #0, you're done. No need to continue.

>
> and the for_each traverse routine prevent above case by returning size
> (nbits) using find_next_bit.
> So, for compatibility and the same expected return value like next traversing,
> I think we need to find_prev_*_bit routine. if my understanding is correct.
>
>
> > I think this patch has some good catches. We definitely need to implement
> > find_last_zero_bit(), as it is used by fs/ufs, and their local implementation is not optimal.
> >
> > We also should consider adding reverse traversing macros based on find_last_*_bit(),
> > if there are proposed users.
>
> Not only this, I think 'steal_from_bitmap_to_front' can be improved
> using ffind_prev_zero_bit
> like
>
> diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
> index af0013d3df63..9debb9707390 100644
> --- a/fs/btrfs/free-space-cache.c
> +++ b/fs/btrfs/free-space-cache.c
> @@ -2372,7 +2372,6 @@ static bool steal_from_bitmap_to_front(struct
> btrfs_free_space_ctl *ctl,
> u64 bitmap_offset;
> unsigned long i;
> unsigned long j;
> - unsigned long prev_j;
> u64 bytes;
>
> bitmap_offset = offset_to_bitmap(ctl, info->offset);
> @@ -2388,20 +2387,15 @@ static bool steal_from_bitmap_to_front(struct
> btrfs_free_space_ctl *ctl,
> return false;
>
> i = offset_to_bit(bitmap->offset, ctl->unit, info->offset) - 1;
> - j = 0;
> - prev_j = (unsigned long)-1;
> - for_each_clear_bit_from(j, bitmap->bitmap, BITS_PER_BITMAP) {
> - if (j > i)
> - break;
> - prev_j = j;
> - }
> - if (prev_j == i)
> + j = find_prev_zero_bit(bitmap->bitmap, BITS_PER_BITMAP, i);

This one may be implemented with find_last_zero_bit() as well:

unsigned long j = find_last_zero_bit(bitmap, BITS_PER_BITMAP);
if (j <= i || j >= BITS_PER_BITMAP)
return false;

I believe the latter version is better because find_last_*_bit() is simpler in
implementation (and partially exists), has fewer parameters, and is therefore
simpler for users, and doesn't introduce functionality duplication.

The only consideration I can imagine to advocate find_prev_*() is the
performance advantage in the scenario where we know for sure that the
first N bits of the bitmap are all set/clear, and we can bypass
traversing that area. But again, in this case we can pass the bitmap
address with the appropriate offset and stay with find_last_*().
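The suggestion above can be sketched as follows; find_last_bit() is modeled naively here (it is a kernel function, so this is just an illustrative stand-in), and the loop shrinks the search range each step so an exhausted range terminates it:

```c
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Naive model of find_last_bit(): index of the highest set bit below
 * 'size', or 'size' itself when the range is empty or all-clear. */
static unsigned long find_last_bit(const unsigned long *addr,
				   unsigned long size)
{
	unsigned long bit;

	for (bit = size; bit-- > 0; )
		if (addr[bit / BITS_PER_LONG] & (1UL << (bit % BITS_PER_LONG)))
			return bit;
	return size;
}

/* Reverse traversal with find_last_bit() alone: shrink the range to
 * 'bit' after each hit, so find_last_bit(addr, n) == n ends the loop
 * once bit 0 has been visited or the remaining range is clear. */
static unsigned long count_set_bits_reverse(const unsigned long *addr,
					    unsigned long size)
{
	unsigned long bit, count = 0;

	for (bit = find_last_bit(addr, size); bit < size;
	     size = bit, bit = find_last_bit(addr, bit))
		count++;

	return count;
}
```

Note that `size` here is a local copy, so shrinking it costs the caller nothing.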

> +
> + if (j == i)
> return false;
>
> - if (prev_j == (unsigned long)-1)
> + if (j == BITS_PER_BITMAP)
> bytes = (i + 1) * ctl->unit;
> else
> - bytes = (i - prev_j) * ctl->unit;
> + bytes = (i - j) * ctl->unit;
>
> info->offset -= bytes;
> info->bytes += bytes;
>
> Thanks.
>
> HTH
> Levi.

2020-12-02 22:53:48

by Yun Levi

Subject:

On Thu, Dec 3, 2020 at 6:26 AM Yury Norov <[email protected]> wrote:
>
> On Wed, Dec 2, 2020 at 10:22 AM Yun Levi <[email protected]> wrote:
> >
> > On Thu, Dec 3, 2020 at 2:26 AM Yury Norov <[email protected]> wrote:
> >
> > > Also look at lib/find_bit_benchmark.c
> > Thanks. I'll see.
> >
> > > We need find_next_*_bit() because find_first_*_bit() can start searching only at word-aligned
> > > bits. In the case of find_last_*_bit(), we can start at any bit. So, if my understanding is correct,
> > > for the purpose of reverse traversing we can go with already existing find_last_bit(),
> >
> > Thank you. I haven't thought that way.
> > But I think if we implement reverse traversing using find_last_bit(),
> > we have a problem.
> > Suppose the last bit 0, 1, 2, is set.
> > If we start
> > find_last_bit(bitmap, 3) ==> return 2;
> > find_last_bit(bitmap, 2) ==> return 1;
> > find_last_bit(bitmap, 1) ==> return 0;
> > find_last_bit(bitmap, 0) ===> return 0? // here we couldn't
> > distinguish size 0 input or 0 is set
>
> If you traverse backward and reach bit #0, you're done. No need to continue.
I was considering this macro:

#define for_each_clear_bit_reverse(bit, addr, size) \
	for ((bit) = find_last_zero_bit((addr), (size)); \
	     (bit) < (size); \
	     (bit) = find_prev_zero_bit((addr), (size), (bit) - 1))

If we implement the above macro with only find_last_zero_bit,
I think there is no way to finish the loop without adding an extra variable.
But I don't want to add an extra variable, so that it keeps the same
format as for_each_clear_bit.
That's why I decided to implement the find_prev_*_bit series.

I don't know whether this reasoning is correct or biased. Am I wrong?
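Under the assumption that find_prev_zero_bit() treats an out-of-range offset as "nothing found" (returning `size`), the proposed iterator can be modeled and exercised like this; both search functions are naive stand-ins, not the posted patch:

```c
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Naive models of the zero-bit searches under discussion. */
static unsigned long find_last_zero_bit(const unsigned long *addr,
					unsigned long size)
{
	unsigned long bit;

	for (bit = size; bit-- > 0; )
		if (!(addr[bit / BITS_PER_LONG] & (1UL << (bit % BITS_PER_LONG))))
			return bit;
	return size;
}

static unsigned long find_prev_zero_bit(const unsigned long *addr,
					unsigned long size,
					unsigned long offset)
{
	unsigned long bit;

	if (offset >= size)
		return size;	/* out of range: report "not found" */

	for (bit = offset + 1; bit-- > 0; )
		if (!(addr[bit / BITS_PER_LONG] & (1UL << (bit % BITS_PER_LONG))))
			return bit;
	return size;
}

/* Proposed reverse iterator: same "(bit) < (size)" exit test as
 * for_each_clear_bit. At bit 0, (bit) - 1 underflows to ULONG_MAX,
 * which the out-of-range check above turns into the sentinel. */
#define for_each_clear_bit_reverse(bit, addr, size) \
	for ((bit) = find_last_zero_bit((addr), (size)); \
	     (bit) < (size); \
	     (bit) = find_prev_zero_bit((addr), (size), (bit) - 1))

static unsigned long count_clear_bits_reverse(const unsigned long *addr,
					      unsigned long size)
{
	unsigned long bit, count = 0;

	for_each_clear_bit_reverse(bit, addr, size)
		count++;
	return count;
}
```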

>
> >
> > and the for_each traverse routine prevent above case by returning size
> > (nbits) using find_next_bit.
> > So, for compatibility and the same expected return value like next traversing,
> > I think we need to find_prev_*_bit routine. if my understanding is correct.
> >
> >
> > > I think this patch has some good catches. We definitely need to implement
> > > find_last_zero_bit(), as it is used by fs/ufs, and their local implementation is not optimal.
> > >
> > > We also should consider adding reverse traversing macros based on find_last_*_bit(),
> > > if there are proposed users.
> >
> > Not only this, I think 'steal_from_bitmap_to_front' can be improved
> > using ffind_prev_zero_bit
> > like
> >
> > diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
> > index af0013d3df63..9debb9707390 100644
> > --- a/fs/btrfs/free-space-cache.c
> > +++ b/fs/btrfs/free-space-cache.c
> > @@ -2372,7 +2372,6 @@ static bool steal_from_bitmap_to_front(struct
> > btrfs_free_space_ctl *ctl,
> > u64 bitmap_offset;
> > unsigned long i;
> > unsigned long j;
> > - unsigned long prev_j;
> > u64 bytes;
> >
> > bitmap_offset = offset_to_bitmap(ctl, info->offset);
> > @@ -2388,20 +2387,15 @@ static bool steal_from_bitmap_to_front(struct
> > btrfs_free_space_ctl *ctl,
> > return false;
> >
> > i = offset_to_bit(bitmap->offset, ctl->unit, info->offset) - 1;
> > - j = 0;
> > - prev_j = (unsigned long)-1;
> > - for_each_clear_bit_from(j, bitmap->bitmap, BITS_PER_BITMAP) {
> > - if (j > i)
> > - break;
> > - prev_j = j;
> > - }
> > - if (prev_j == i)
> > + j = find_prev_zero_bit(bitmap->bitmap, BITS_PER_BITMAP, i);
>
> This one may be implemented with find_last_zero_bit() as well:
>
> unsigned log j = find_last_zero_bit(bitmap, BITS_PER_BITMAP);
> if (j <= i || j >= BITS_PER_BITMAP)
> return false;
>
Actually, in that code, we don't need to check the bits after i.
If my understanding is correct, the former code tries to find the last
clear bit at or before i,
and if all the bits before i are set, it uses the next one, i + 1.

That's why I think the condition should be
if (j >= i)

But the above condition can't discern the case where all the bits
before i are set.
Also, I think we don't need to check the bits after i, and in this
case find_prev_zero_bit,
which specifies the start point, shows the meaning of the code more clearly.


> I believe the latter version is better because find_last_*_bit() is simpler in
> implementation (and partially exists), has less parameters, and therefore
> simpler for users, and doesn't introduce functionality duplication.
>
> The only consideration I can imagine to advocate find_prev*() is the performance
> advantage in the scenario when we know for sure that first N bits of
> bitmap are all
> set/clear, and we can bypass traversing that area. But again, in this
> case we can pass the
> bitmap address with the appropriate offset, and stay with find_last_*()
>
> > +
> > + if (j == i)
> > return false;
> >
> > - if (prev_j == (unsigned long)-1)
> > + if (j == BITS_PER_BITMAP)
> > bytes = (i + 1) * ctl->unit;
> > else
> > - bytes = (i - prev_j) * ctl->unit;
> > + bytes = (i - j) * ctl->unit;
> >
> > info->offset -= bytes;
> > info->bytes += bytes;
> >
> > Thanks.
> >
> > HTH
> > Levi.

Thanks but

2020-12-03 01:28:07

by Yun Levi

Subject:

On Thu, Dec 3, 2020 at 7:51 AM Yun Levi <[email protected]> wrote:
>
> On Thu, Dec 3, 2020 at 6:26 AM Yury Norov <[email protected]> wrote:
> >
> > On Wed, Dec 2, 2020 at 10:22 AM Yun Levi <[email protected]> wrote:
> > >
> > > On Thu, Dec 3, 2020 at 2:26 AM Yury Norov <[email protected]> wrote:
> > >
> > > > Also look at lib/find_bit_benchmark.c
> > > Thanks. I'll see.
> > >
> > > > We need find_next_*_bit() because find_first_*_bit() can start searching only at word-aligned
> > > > bits. In the case of find_last_*_bit(), we can start at any bit. So, if my understanding is correct,
> > > > for the purpose of reverse traversing we can go with already existing find_last_bit(),
> > >
> > > Thank you. I haven't thought that way.
> > > But I think if we implement reverse traversing using find_last_bit(),
> > > we have a problem.
> > > Suppose the last bit 0, 1, 2, is set.
> > > If we start
> > > find_last_bit(bitmap, 3) ==> return 2;
> > > find_last_bit(bitmap, 2) ==> return 1;
> > > find_last_bit(bitmap, 1) ==> return 0;
> > > find_last_bit(bitmap, 0) ===> return 0? // here we couldn't
> > > distinguish size 0 input or 0 is set
> >
> > If you traverse backward and reach bit #0, you're done. No need to continue.
> I think the case when I consider the this macro like
>
> #define for_each_clear_bit_reverse(bit, addr, size)
> for ((bit) = find_last_zero_bit((addr), (size))
> (bit) < (size);
> (bit) = find_prev_zero_bit((addr), (size), (bit)))
>
> If we implement the above macro only with find_last_zero_bit,
> I think there is no way without adding any additional variable to finish loop.
> But I don't want to add additional variable to sustain same format
> with for_each_clear_bit,
> That's why i decide to implement find_prev_*_bit series.
>
> I don't know it's correct thinking or biased. Am I wrong?
>
> >
> > >
> > > and the for_each traverse routine prevent above case by returning size
> > > (nbits) using find_next_bit.
> > > So, for compatibility and the same expected return value like next traversing,
> > > I think we need to find_prev_*_bit routine. if my understanding is correct.
> > >
> > >
> > > > I think this patch has some good catches. We definitely need to implement
> > > > find_last_zero_bit(), as it is used by fs/ufs, and their local implementation is not optimal.
> > > >
> > > > We also should consider adding reverse traversing macros based on find_last_*_bit(),
> > > > if there are proposed users.
> > >
> > > Not only this, I think 'steal_from_bitmap_to_front' can be improved
> > > using ffind_prev_zero_bit
> > > like
> > >
> > > diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
> > > index af0013d3df63..9debb9707390 100644
> > > --- a/fs/btrfs/free-space-cache.c
> > > +++ b/fs/btrfs/free-space-cache.c
> > > @@ -2372,7 +2372,6 @@ static bool steal_from_bitmap_to_front(struct
> > > btrfs_free_space_ctl *ctl,
> > > u64 bitmap_offset;
> > > unsigned long i;
> > > unsigned long j;
> > > - unsigned long prev_j;
> > > u64 bytes;
> > >
> > > bitmap_offset = offset_to_bitmap(ctl, info->offset);
> > > @@ -2388,20 +2387,15 @@ static bool steal_from_bitmap_to_front(struct
> > > btrfs_free_space_ctl *ctl,
> > > return false;
> > >
> > > i = offset_to_bit(bitmap->offset, ctl->unit, info->offset) - 1;
> > > - j = 0;
> > > - prev_j = (unsigned long)-1;
> > > - for_each_clear_bit_from(j, bitmap->bitmap, BITS_PER_BITMAP) {
> > > - if (j > i)
> > > - break;
> > > - prev_j = j;
> > > - }
> > > - if (prev_j == i)
> > > + j = find_prev_zero_bit(bitmap->bitmap, BITS_PER_BITMAP, i);
> >
> > This one may be implemented with find_last_zero_bit() as well:
> >
> > unsigned log j = find_last_zero_bit(bitmap, BITS_PER_BITMAP);
> > if (j <= i || j >= BITS_PER_BITMAP)
> > return false;
> >
> Actually, in that code, we don't need to check the bit after i.
> Originally, if my understanding is correct, former code tries to find
> the last 0 bit before i.
> and if all bits are fully set before i, it use next one as i + 1
>
> that's why i think the if condition should be
> if (j >= i)
>
> But above condition couldn't the discern the case when all bits are
> fully set before i.
> Also, I think we don't need to check the bit after i and In this case,
> find_prev_zero_bit which
> specifies the start point is clear to show the meaning of the code.
>
>
> > I believe the latter version is better because find_last_*_bit() is simpler in
> > implementation (and partially exists), has less parameters, and therefore
> > simpler for users, and doesn't introduce functionality duplication.

I don't think it's duplication.
Earlier, you taught me that find_first_*_bit can start searching only
at a word-aligned bit.
But find_first_*_bit declares its argument as the "size of bitmap",
not a start offset.
Even though the bitmap size is word-aligned, that doesn't prevent
finding the first bit within the specified size of the bitmap (if
there is none, it just returns the size of the bitmap).

Likewise, find_last_*_bit is similar in this regard:
fundamentally, its argument is not a start offset into the bitmap but
simply the size of the bitmap.

That's the reason we need find_next_*_bit, to start at a specified
offset.
In the same way, I think it's better to have find_prev_*_bit.

So, I think both of these functions can be used to achieve the goal,
but each function actually has a different concept, which is why I
don't think it's duplication.

If my understanding is wrong, forgive me and let me know.

Thanks.



> >
> > The only consideration I can imagine to advocate find_prev*() is the performance
> > advantage in the scenario when we know for sure that first N bits of
> > bitmap are all
> > set/clear, and we can bypass traversing that area. But again, in this
> > case we can pass the
> > bitmap address with the appropriate offset, and stay with find_last_*()
> >
> > > +
> > > + if (j == i)
> > > return false;
> > >
> > > - if (prev_j == (unsigned long)-1)
> > > + if (j == BITS_PER_BITMAP)
> > > bytes = (i + 1) * ctl->unit;
> > > else
> > > - bytes = (i - prev_j) * ctl->unit;
> > > + bytes = (i - j) * ctl->unit;
> > >
> > > info->offset -= bytes;
> > > info->bytes += bytes;
> > >
> > > Thanks.
> > >
> > > HTH
> > > Levi.
>
> Thanks but

2020-12-03 08:36:16

by Rasmus Villemoes

Subject: Re:

On 03/12/2020 02.23, Yun Levi wrote:
> On Thu, Dec 3, 2020 at 7:51 AM Yun Levi <[email protected]> wrote:
>>
>> On Thu, Dec 3, 2020 at 6:26 AM Yury Norov <[email protected]> wrote:
>>>
>>> On Wed, Dec 2, 2020 at 10:22 AM Yun Levi <[email protected]> wrote:
>>>>
>>>> On Thu, Dec 3, 2020 at 2:26 AM Yury Norov <[email protected]> wrote:
>>>>
>>>>> Also look at lib/find_bit_benchmark.c
>>>> Thanks. I'll see.
>>>>
>>>>> We need find_next_*_bit() because find_first_*_bit() can start searching only at word-aligned
>>>>> bits. In the case of find_last_*_bit(), we can start at any bit. So, if my understanding is correct,
>>>>> for the purpose of reverse traversing we can go with already existing find_last_bit(),
>>>>
>>>> Thank you. I haven't thought that way.
>>>> But I think if we implement reverse traversing using find_last_bit(),
>>>> we have a problem.
>>>> Suppose the last bit 0, 1, 2, is set.
>>>> If we start
>>>> find_last_bit(bitmap, 3) ==> return 2;
>>>> find_last_bit(bitmap, 2) ==> return 1;
>>>> find_last_bit(bitmap, 1) ==> return 0;
>>>> find_last_bit(bitmap, 0) ===> return 0? // here we couldn't

Either just make the return type of all find_prev/find_last be signed
int and use -1 as the sentinel to indicate "no such position exists", so
the loop condition would be foo >= 0. Or, change the condition from
"stop if we get the size returned" to "only continue if we get something
strictly less than the size we passed in" (i.e., something which can
possibly be a valid bit index). In the latter case, both (unsigned)-1
aka UINT_MAX and the actual size value passed work equally well as a
sentinel.

If one uses UINT_MAX, a for_each_bit_reverse() macro would just be
something like

for (i = find_last_bit(bitmap, size); i < size; i = find_last_bit(bitmap, i))

if one wants to use the size argument as the sentinel, the caller would
have to supply a scratch variable to keep track of the last i value:

for (j = size, i = find_last_bit(bitmap, j); i < j; j = i, i = find_last_bit(bitmap, j))

which is probably a little less ergonomic.
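Both loop shapes above do terminate as described; here is a hedged, self-contained sketch of the scratch-variable variant, with a naive find_last_bit() stand-in, summing the indices it visits:

```c
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Naive stand-in for the kernel's find_last_bit(). */
static unsigned long find_last_bit(const unsigned long *addr,
				   unsigned long size)
{
	unsigned long bit;

	for (bit = size; bit-- > 0; )
		if (addr[bit / BITS_PER_LONG] & (1UL << (bit % BITS_PER_LONG)))
			return bit;
	return size;
}

/* Scratch-variable variant: j tracks the previous range, so "i < j"
 * fails both when nothing is found (i == j) and after visiting bit 0. */
static unsigned long sum_set_bit_indices_reverse(const unsigned long *addr,
						 unsigned long size)
{
	unsigned long i, j, sum = 0;

	for (j = size, i = find_last_bit(addr, j); i < j;
	     j = i, i = find_last_bit(addr, j))
		sum += i;

	return sum;
}
```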

Rasmus

2020-12-03 09:50:17

by Yun Levi

Subject: Re:

> If one uses UINT_MAX, a for_each_bit_reverse() macro would just be
> something like
>
> for (i = find_last_bit(bitmap, size); i < size; i =
> find_last_bit(bitmap, i))
>
> if one wants to use the size argument as the sentinel, the caller would
> have to supply a scratch variable to keep track of the last i value:
>
> for (j = size, i = find_last_bit(bitmap, j); i < j; j = i, i =
> find_last_bit(bitmap, j))
>
> which is probably a little less ergonomic.

Actually, because I want to avoid modifying the return type of
find_last_*_bit for a new sentinel,
I added find_prev_*_bit.
The big difference between find_last_bit and find_prev_bit is that
find_last_bit doesn't examine the bit at 'size' and uses that size as
its sentinel,
while find_prev_bit examines the 'offset' bit and uses as its sentinel
the size passed as a separate argument.
So if we use find_prev_bit, we can have a clean iteration, like:

#define for_each_set_bit_reverse(bit, addr, size) \
	for ((bit) = find_last_bit((addr), (size)); \
	     (bit) < (size); \
	     (bit) = find_prev_bit((addr), (size), (bit) - 1))

#define for_each_set_bit_from_reverse(bit, addr, size) \
	for ((bit) = find_prev_bit((addr), (size), (bit)); \
	     (bit) < (size); \
	     (bit) = find_prev_bit((addr), (size), (bit) - 1))

find_prev_*_bit and find_last_*_bit have the same functionality,
but they also have a small difference.
I think this small difference doesn't confuse users, and it helps to
solve the problem in a simple way (just like the iteration above).

So I think I need the find_prev_*_bit series.

Am I missing anything?

Thanks.

Levi.

On Thu, Dec 3, 2020 at 5:33 PM Rasmus Villemoes
<[email protected]> wrote:
>
> On 03/12/2020 02.23, Yun Levi wrote:
> > On Thu, Dec 3, 2020 at 7:51 AM Yun Levi <[email protected]> wrote:
> >>
> >> On Thu, Dec 3, 2020 at 6:26 AM Yury Norov <[email protected]> wrote:
> >>>
> >>> On Wed, Dec 2, 2020 at 10:22 AM Yun Levi <[email protected]> wrote:
> >>>>
> >>>> On Thu, Dec 3, 2020 at 2:26 AM Yury Norov <[email protected]> wrote:
> >>>>
> >>>>> Also look at lib/find_bit_benchmark.c
> >>>> Thanks. I'll see.
> >>>>
> >>>>> We need find_next_*_bit() because find_first_*_bit() can start searching only at word-aligned
> >>>>> bits. In the case of find_last_*_bit(), we can start at any bit. So, if my understanding is correct,
> >>>>> for the purpose of reverse traversing we can go with already existing find_last_bit(),
> >>>>
> >>>> Thank you. I haven't thought that way.
> >>>> But I think if we implement reverse traversing using find_last_bit(),
> >>>> we have a problem.
> >>>> Suppose the last bit 0, 1, 2, is set.
> >>>> If we start
> >>>> find_last_bit(bitmap, 3) ==> return 2;
> >>>> find_last_bit(bitmap, 2) ==> return 1;
> >>>> find_last_bit(bitmap, 1) ==> return 0;
> >>>> find_last_bit(bitmap, 0) ===> return 0? // here we couldn't
>
> Either just make the return type of all find_prev/find_last be signed
> int and use -1 as the sentinel to indicate "no such position exists", so
> the loop condition would be foo >= 0. Or, change the condition from
> "stop if we get the size returned" to "only continue if we get something
> strictly less than the size we passed in (i.e., something which can
> possibly be a valid bit index). In the latter case, both (unsigned)-1
> aka UINT_MAX and the actual size value passed work equally well as a
> sentinel.
>
> If one uses UINT_MAX, a for_each_bit_reverse() macro would just be
> something like
>
> for (i = find_last_bit(bitmap, size); i < size; i =
> find_last_bit(bitmap, i))
>
> if one wants to use the size argument as the sentinel, the caller would
> have to supply a scratch variable to keep track of the last i value:
>
> for (j = size, i = find_last_bit(bitmap, j); i < j; j = i, i =
> find_last_bit(bitmap, j))
>
> which is probably a little less ergonomic.
>
> Rasmus

2020-12-03 18:51:00

by Yury Norov

Subject: Re:

Yun, could you please stop top-posting and excessive trimming in the thread?

On Thu, Dec 3, 2020 at 1:47 AM Yun Levi <[email protected]> wrote:
> > Either just make the return type of all find_prev/find_last be signed
> > int and use -1 as the sentinel to indicate "no such position exists", so
> > the loop condition would be foo >= 0. Or, change the condition from
> > "stop if we get the size returned" to "only continue if we get something
> > strictly less than the size we passed in (i.e., something which can
> > possibly be a valid bit index). In the latter case, both (unsigned)-1
> > aka UINT_MAX and the actual size value passed work equally well as a
> > sentinel.
> >
> > If one uses UINT_MAX, a for_each_bit_reverse() macro would just be
> > something like
> >
> > for (i = find_last_bit(bitmap, size); i < size; i =
> > find_last_bit(bitmap, i))
> >
> > if one wants to use the size argument as the sentinel, the caller would
> > have to supply a scratch variable to keep track of the last i value:
> >
> > for (j = size, i = find_last_bit(bitmap, j); i < j; j = i, i =
> > find_last_bit(bitmap, j))
> >
> > which is probably a little less ergonomic.
> >
> > Rasmus

I would prefer to avoid changing the find*bit() semantics. As for now,
if any of find_*_bit()
finds nothing, it returns the size of the bitmap it was passed.
Changing this for
a single function would break the consistency, and may cause problems
for those who
rely on existing behaviour.

Passing a non-positive size to find_*_bit() should produce undefined
behaviour, because we cannot dereference a pointer to the bitmap in
this case; this is most probably a sign of a problem on the caller
side anyway.

Let's keep this logic unchanged?

> Actually Because I want to avoid the modification of return type of
> find_last_*_bit for new sentinel,
> I add find_prev_*_bit.
> the big difference between find_last_bit and find_prev_bit is
> find_last_bit doesn't check the size bit and use sentinel with size.
> but find_prev_bit check the offset bit and use sentinel with size
> which passed by another argument.
> So if we use find_prev_bit, we could have a clear iteration if
> using find_prev_bit like.
>
> #define for_each_set_bit_reverse(bit, addr, size) \
> for ((bit) = find_last_bit((addr), (size)); \
> (bit) < (size); \
> (bit) = find_prev_bit((addr), (size), (bit - 1)))
>
> #define for_each_set_bit_from_reverse(bit, addr, size) \
> for ((bit) = find_prev_bit((addr), (size), (bit)); \
> (bit) < (size); \
> (bit) = find_prev_bit((addr), (size), (bit - 1)))
>
> Though find_prev_*_bit / find_last_*_bit have the same functionality.
> But they also have a small difference.
> I think this small this small difference doesn't make some of
> confusion to user but it help to solve problem
> with a simple way (just like the iteration above).
>
> So I think I need, find_prev_*_bit series.
>
> Am I missing anything?
>
> Thanks.
>
> Levi.

As you said, find_last_bit() and the proposed find_prev_*_bit() have
the same functionality.
If you really want to have find_prev_*_bit(), could you please at
least write it using find_last_bit()? Otherwise it would just be
clutter.

Regarding reverse search, we can probably do something like this (not
tested, just an idea):

#define for_each_set_bit_reverse(bit, addr, size) \
	for ((bit) = find_last_bit((addr), (size)); \
	     (bit) < (size); \
	     (size) = (bit), (bit) = find_last_bit((addr), (bit)))
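As a compilable sketch (with find_last_bit() modeled naively here as a stand-in for the kernel function), the macro above works; one caveat worth noting is that it reuses (size) as its scratch variable, so the caller's size variable is clobbered after the loop:

```c
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Naive stand-in for the kernel's find_last_bit(). */
static unsigned long find_last_bit(const unsigned long *addr,
				   unsigned long size)
{
	unsigned long bit;

	for (bit = size; bit-- > 0; )
		if (addr[bit / BITS_PER_LONG] & (1UL << (bit % BITS_PER_LONG)))
			return bit;
	return size;
}

/* The proposed macro: (size) doubles as the scratch variable, so the
 * caller's copy is modified as the loop runs. */
#define for_each_set_bit_reverse(bit, addr, size) \
	for ((bit) = find_last_bit((addr), (size)); \
	     (bit) < (size); \
	     (size) = (bit), (bit) = find_last_bit((addr), (bit)))

/* Record visited bits in descending order; returns how many. */
static unsigned long visit_reverse(const unsigned long *addr,
				   unsigned long size,
				   unsigned long *out)
{
	unsigned long bit, n = 0;

	for_each_set_bit_reverse(bit, addr, size)
		out[n++] = bit;
	return n;
}
```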

Thanks,
Yury

2020-12-03 18:58:23

by Willy Tarreau

Subject: Re:

On Thu, Dec 03, 2020 at 10:46:25AM -0800, Yury Norov wrote:
> Yun, could you please stop top-posting and excessive trimming in the thread?

And re-configure the mail agent to make the "Subject" field appear and
fill it.

Willy

2020-12-04 01:41:31

by Yun Levi

Subject: Re:

>On Fri, Dec 4, 2020 at 3:53 AM Willy Tarreau <[email protected]> wrote:
>
> On Thu, Dec 03, 2020 at 10:46:25AM -0800, Yury Norov wrote:
> > Yun, could you please stop top-posting and excessive trimming in the thread?
>
> And re-configure the mail agent to make the "Subject" field appear and
> fill it.

>On Thu, Dec 03, 2020 at 10:46:25AM -0800, Yury Norov wrote:
> Yun, could you please stop top-posting and excessive trimming in the thread?
Sorry to make you uncomfortable... Thanks for the advice.

>On Thu, Dec 03, 2020 at 10:46:25AM -0800, Yury Norov wrote:
> As you said, find_last_bit() and proposed find_prev_*_bit() have the
> same functionality.
> If you really want to have find_prev_*_bit(), could you please at
> least write it using find_last_bit(), otherwise it would be just a
> blottering.

Actually, find_prev_*_bit calls _find_prev_bit, which is a common
helper function like _find_next_bit.
As you know, this helper is required to support the little-endian
search on __BIG_ENDIAN.
find_prev_bit is actually a wrapper around _find_prev_bit that has the
same feature as find_last_bit.

That makes the semantic difference between find_last_bit and
find_prev_bit: you specify where the search starts.
In a loop, find_last_bit can't sustain the original size as its
sentinel return value
(we have to change the size argument for the next search,
which means the "NOT SET or NOT CLEAR" sentinel return value changes
on every call).

Because we have to have _find_prev_bit anyway,
I think it's a matter of choosing which is better to use in
find_prev_bit (find_last_bit, or _find_prev_bit?)
while keeping the find_prev_bit semantics (return size as the
sentinel, start from where I specify),
if my understanding is correct.

In my view, I prefer to use _find_prev_bit, like find_next_bit, for a
uniform format.
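For illustration, a word-at-a-time backward scan in the spirit of _find_prev_bit might look like the sketch below. This is an assumption-laden model, not the posted patch: it ignores the big-endian variant and uses the GCC/Clang __builtin_clzl() builtin in place of the kernel's __fls():

```c
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Sketch of a word-wise backward search: mask off the bits above
 * 'offset' in its word, then walk whole words toward index 0.
 * Returns 'size' when no set bit exists at or before 'offset'. */
static unsigned long find_prev_bit_wordwise(const unsigned long *addr,
					    unsigned long size,
					    unsigned long offset)
{
	unsigned long idx, word;

	if (offset >= size)
		return size;

	idx = offset / BITS_PER_LONG;
	/* Keep only bits [0, offset % BITS_PER_LONG] of the start word. */
	word = addr[idx] & (~0UL >> (BITS_PER_LONG - 1 - offset % BITS_PER_LONG));

	while (!word) {
		if (idx == 0)
			return size;	/* nothing set at or before offset */
		word = addr[--idx];
	}

	/* __fls() equivalent: highest set bit of the nonzero word. */
	return idx * BITS_PER_LONG + (BITS_PER_LONG - 1) -
	       (unsigned long)__builtin_clzl(word);
}
```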

But in some benchmarks find_last_bit is better than _find_prev_bit;
here is what I tested (the runs look similar, but sometimes show some
difference).

Start testing find_bit() with random-filled bitmap
[ +0.001850] find_next_bit: 842792 ns, 163788 iterations
[ +0.000873] find_prev_bit: 870914 ns, 163788 iterations
[ +0.000824] find_next_zero_bit: 821959 ns, 163894 iterations
[ +0.000677] find_prev_zero_bit: 676240 ns, 163894 iterations
[ +0.000777] find_last_bit: 659103 ns, 163788 iterations
[ +0.001822] find_first_bit: 1708041 ns, 16250 iterations
[ +0.000539] find_next_and_bit: 492182 ns, 73871 iterations
[ +0.000001]
Start testing find_bit() with sparse bitmap
[ +0.000222] find_next_bit: 13227 ns, 654 iterations
[ +0.000013] find_prev_bit: 11652 ns, 654 iterations
[ +0.001845] find_next_zero_bit: 1723869 ns, 327028 iterations
[ +0.001538] find_prev_zero_bit: 1355808 ns, 327028 iterations
[ +0.000010] find_last_bit: 8114 ns, 654 iterations
[ +0.000867] find_first_bit: 710639 ns, 654 iterations
[ +0.000006] find_next_and_bit: 4273 ns, 1 iterations
[ +0.000004] find_next_and_bit: 3278 ns, 1 iterations

Start testing find_bit() with random-filled bitmap
[ +0.001784] find_next_bit: 805553 ns, 164240 iterations
[ +0.000643] find_prev_bit: 632474 ns, 164240 iterations
[ +0.000950] find_next_zero_bit: 877215 ns, 163442 iterations
[ +0.000664] find_prev_zero_bit: 662339 ns, 163442 iterations
[ +0.000680] find_last_bit: 602204 ns, 164240 iterations
[ +0.001912] find_first_bit: 1758208 ns, 16408 iterations
[ +0.000760] find_next_and_bit: 531033 ns, 73798 iterations
[ +0.000002]
Start testing find_bit() with sparse bitmap
[ +0.000203] find_next_bit: 12468 ns, 656 iterations
[ +0.000205] find_prev_bit: 10948 ns, 656 iterations
[ +0.001759] find_next_zero_bit: 1579447 ns, 327026 iterations
[ +0.001935] find_prev_zero_bit: 1931961 ns, 327026 iterations
[ +0.000013] find_last_bit: 9543 ns, 656 iterations
[ +0.000732] find_first_bit: 562009 ns, 656 iterations
[ +0.000217] find_next_and_bit: 6804 ns, 1 iterations
[ +0.000007] find_next_and_bit: 4367 ns, 1 iterations

Is it better to write find_prev_bit using find_last_bit?
I ask again.

Thanks for your great advice, but please forgive my faults and lack of
knowledge.

HTH.
Levi.

2020-12-04 18:17:58

by Yury Norov

Subject: Re:

On Thu, Dec 3, 2020 at 5:36 PM Yun Levi <[email protected]> wrote:
>
> >On Fri, Dec 4, 2020 at 3:53 AM Willy Tarreau <[email protected]> wrote:
> >
> > On Thu, Dec 03, 2020 at 10:46:25AM -0800, Yury Norov wrote:
> > > Yun, could you please stop top-posting and excessive trimming in the thread?
> >
> > And re-configure the mail agent to make the "Subject" field appear and
> > fill it.
>
> >On Thu, Dec 03, 2020 at 10:46:25AM -0800, Yury Norov wrote:
> > Yun, could you please stop top-posting and excessive trimming in the thread?
> Sorry to make you uncomfortable... Thanks for the advice.
>
> >On Thu, Dec 03, 2020 at 10:46:25AM -0800, Yury Norov wrote:
> > As you said, find_last_bit() and proposed find_prev_*_bit() have the
> > same functionality.
> > If you really want to have find_prev_*_bit(), could you please at
> > least write it using find_last_bit(), otherwise it would be just a
> > blottering.
>
> Actually, find_prev_*_bit() calls _find_prev_bit(), which is a common
> helper function like _find_next_bit().
> As you know, this function is required to support little-endian search
> on __BIG_ENDIAN systems.
> find_prev_bit() is actually a wrapper of _find_prev_bit() that has the
> same functionality as find_last_bit().
>
> That makes the semantic difference between find_last_bit() and
> find_prev_bit(): you specify where the search starts from, and
> in a loop, find_last_bit() cannot keep the original size as its
> sentinel return value
> (we would have to change the size argument for the next search,
> which means that on every call the "not set" / "not clear" sentinel
> return value changes).
>
> Because we would need _find_prev_bit() anyway,
> I think the question is which is better to use inside
> find_prev_bit() (find_last_bit() or _find_prev_bit()?)
> while keeping the find_prev_bit() semantics (return size as the
> sentinel, search from a given starting point),
> if my understanding is correct.
>
> In my view, I prefer to use _find_prev_bit(), like find_next_bit(), for
> a consistent interface.
>
> But in some of the benchmarks, find_last_bit() is better than _find_prev_bit();
> here is what I tested (the results look similar but sometimes differ).
>
> Start testing find_bit() with random-filled bitmap
> [ +0.001850] find_next_bit: 842792 ns, 163788 iterations
> [ +0.000873] find_prev_bit: 870914 ns, 163788 iterations
> [ +0.000824] find_next_zero_bit: 821959 ns, 163894 iterations
> [ +0.000677] find_prev_zero_bit: 676240 ns, 163894 iterations
> [ +0.000777] find_last_bit: 659103 ns, 163788 iterations
> [ +0.001822] find_first_bit: 1708041 ns, 16250 iterations
> [ +0.000539] find_next_and_bit: 492182 ns, 73871 iterations
> [ +0.000001]
> Start testing find_bit() with sparse bitmap
> [ +0.000222] find_next_bit: 13227 ns, 654 iterations
> [ +0.000013] find_prev_bit: 11652 ns, 654 iterations
> [ +0.001845] find_next_zero_bit: 1723869 ns, 327028 iterations
> [ +0.001538] find_prev_zero_bit: 1355808 ns, 327028 iterations
> [ +0.000010] find_last_bit: 8114 ns, 654 iterations
> [ +0.000867] find_first_bit: 710639 ns, 654 iterations
> [ +0.000006] find_next_and_bit: 4273 ns, 1 iterations
> [ +0.000004] find_next_and_bit: 3278 ns, 1 iterations
>
> Start testing find_bit() with random-filled bitmap
> [ +0.001784] find_next_bit: 805553 ns, 164240 iterations
> [ +0.000643] find_prev_bit: 632474 ns, 164240 iterations
> [ +0.000950] find_next_zero_bit: 877215 ns, 163442 iterations
> [ +0.000664] find_prev_zero_bit: 662339 ns, 163442 iterations
> [ +0.000680] find_last_bit: 602204 ns, 164240 iterations
> [ +0.001912] find_first_bit: 1758208 ns, 16408 iterations
> [ +0.000760] find_next_and_bit: 531033 ns, 73798 iterations
> [ +0.000002]
> Start testing find_bit() with sparse bitmap
> [ +0.000203] find_next_bit: 12468 ns, 656 iterations
> [ +0.000205] find_prev_bit: 10948 ns, 656 iterations
> [ +0.001759] find_next_zero_bit: 1579447 ns, 327026 iterations
> [ +0.001935] find_prev_zero_bit: 1931961 ns, 327026 iterations
> [ +0.000013] find_last_bit: 9543 ns, 656 iterations
> [ +0.000732] find_first_bit: 562009 ns, 656 iterations
> [ +0.000217] find_next_and_bit: 6804 ns, 1 iterations
> [ +0.000007] find_next_and_bit: 4367 ns, 1 iterations
>
> Would it be better to write find_prev_bit() using find_last_bit()?
> I ask the question again.

I answer again. It's better not to write find_prev_bit at all and
learn how to use existing functionality.

Yury

> Thanks for your great advice, but please forgive my mistakes and shortcomings.
>
> HTH.
> Levi.

2020-12-05 00:50:26

by Yun Levi

[permalink] [raw]
Subject: Re:

> I answer again. It's better not to write find_prev_bit at all and
> learn how to use existing functionality.

Thanks for the answer. I'll fix it and send the patch again :)


2020-12-05 12:03:43

by Rasmus Villemoes

[permalink] [raw]
Subject: Re:

On 03/12/2020 19.46, Yury Norov wrote:

> I would prefer to avoid changing the find*bit() semantics. As for now,
> if any of find_*_bit()
> finds nothing, it returns the size of the bitmap it was passed.

Yeah, we should actually try to fix that, it causes bad code generation.
It's hard, because callers of course do that "if ret == size" check. But
it's really silly that something like find_first_bit needs to do that
"min(i*BPL + __ffs(word), size)" - the caller does a comparison anyway,
that comparison might as well be "ret >= size" rather than "ret ==
size", and then we could get rid of that branch (which min() necessarily
becomes) at the end of find_next_bit.
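To make that concrete, here is a hedged user-space sketch (hypothetical names and a naive __ffs() stand-in; not the kernel code) of the clamped contract next to the relaxed one:

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Naive __ffs() stand-in: index of the lowest set bit (w must be != 0). */
static size_t ffs_ulong(unsigned long w)
{
    size_t n = 0;

    while (!(w & 1UL)) {
        w >>= 1;
        n++;
    }
    return n;
}

/* Current contract: never return more than size, at the cost of a
 * min() (i.e. a compare-and-select branch) on the hot return path. */
static size_t find_first_bit_clamped(const unsigned long *map, size_t size)
{
    size_t i;

    for (i = 0; i * BITS_PER_LONG < size; i++)
        if (map[i]) {
            size_t ret = i * BITS_PER_LONG + ffs_ulong(map[i]);
            return ret < size ? ret : size;
        }
    return size;
}

/* Relaxed contract: "no such bit" is any value >= size, so the word-
 * aligned result can be returned as-is; the caller checks ret >= size. */
static size_t find_first_bit_relaxed(const unsigned long *map, size_t size)
{
    size_t i;

    for (i = 0; i * BITS_PER_LONG < size; i++)
        if (map[i])
            return i * BITS_PER_LONG + ffs_ulong(map[i]);
    return size;
}
```

Either way the caller's check can be written as `if (ret >= size)`, which is exactly the point: once all callers compare with >=, the min() (and the branch it compiles to) can go away.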

I haven't dug very deep into this, but I could also imagine the
arch-specific parts of this might become a little easier to do if the
semantics were just "if no such bit, return an indeterminate value >=
the size".

> Changing this for
> a single function would break the consistency, and may cause problems
> for those who
> rely on existing behaviour.

True. But I think it should be possible - I suppose most users are via
the iterator macros, which could all be updated at once. Changing ret ==
size to ret >= size will still work even if the implementations have not
been switched over, so it should be doable.
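As a sketch of what the updated iterator contract looks like (hypothetical names; the stand-in search may return any value >= size when nothing is found):

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Minimal find_next_bit() stand-in; may return any value >= size when
 * nothing is found. */
static size_t find_next_bit_sketch(const unsigned long *map, size_t size,
                                   size_t start)
{
    size_t i;

    for (i = start; i < size; i++)
        if (map[i / BITS_PER_LONG] & (1UL << (i % BITS_PER_LONG)))
            return i;
    return size;
}

/* The iterator terminates on "bit >= size" (written as bit < size in the
 * loop condition), so it keeps working whether or not the underlying
 * implementation clamps its return value to exactly size. */
#define for_each_set_bit_sketch(bit, addr, size)              \
    for ((bit) = find_next_bit_sketch((addr), (size), 0);     \
         (bit) < (size);                                      \
         (bit) = find_next_bit_sketch((addr), (size), (bit) + 1))
```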

>
> Passing non-positive size to find_*_bit() should produce undefined
> behaviour, because we cannot dereference a pointer to the bitmap in
> this case; this is most probably a sign of a problem on a caller side
> anyways.

No, the out-of-line bitmap functions should all handle the case of a
zero-size bitmap sensibly.

Is bitmap full? Yes (all the 0 bits are set).
Is bitmap empty? Yes, (none of the 0 bits are set).
Find the first bit set (returns 0, there's no such bit)

Etc. The static inlines for small_const_nbits do assume that the pointer
can be dereferenced, which is why small_const_nbits was updated to mean
1<=bits<=BITS_PER_LONG rather than just bits<=BITS_PER_LONG.
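A minimal sketch of that contract for the out-of-line helpers (illustrative stand-ins, not the kernel code): nbits == 0 is answered before any dereference, so both predicates are vacuously true:

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Zero-size bitmap is "empty": none of its 0 bits are set.  The map
 * pointer is never dereferenced when nbits == 0. */
static bool bitmap_empty_sketch(const unsigned long *map, size_t nbits)
{
    size_t i;

    for (i = 0; i < nbits; i++)
        if (map[i / BITS_PER_LONG] & (1UL << (i % BITS_PER_LONG)))
            return false;
    return true;
}

/* Zero-size bitmap is also "full": all of its 0 bits are set. */
static bool bitmap_full_sketch(const unsigned long *map, size_t nbits)
{
    size_t i;

    for (i = 0; i < nbits; i++)
        if (!(map[i / BITS_PER_LONG] & (1UL << (i % BITS_PER_LONG))))
            return false;
    return true;
}
```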

Rasmus

2020-12-05 18:35:52

by Yury Norov

[permalink] [raw]
Subject: Re:

On Sat, Dec 5, 2020 at 3:10 AM Rasmus Villemoes
<[email protected]> wrote:
>
> On 03/12/2020 19.46, Yury Norov wrote:
>
> > I would prefer to avoid changing the find*bit() semantics. As for now,
> > if any of find_*_bit()
> > finds nothing, it returns the size of the bitmap it was passed.
>
> Yeah, we should actually try to fix that, it causes bad code generation.
> It's hard, because callers of course do that "if ret == size" check. But
> it's really silly that something like find_first_bit needs to do that
> "min(i*BPL + __ffs(word), size)" - the caller does a comparison anyway,
> that comparison might as well be "ret >= size" rather than "ret ==
> size", and then we could get rid of that branch (which min() necessarily
> becomes) at the end of find_next_bit.

We didn't do that 5 years ago because it's too invasive and the
improvement is barely measurable; the difference is 2 instructions
(on arm64). Has something changed since then?

0000000000000000 <find_first_bit_better>:
0: aa0003e3 mov x3, x0
4: aa0103e0 mov x0, x1
8: b4000181 cbz x1, 38 <find_first_bit_better+0x38>
c: f9400064 ldr x4, [x3]
10: d2800802 mov x2, #0x40 // #64
14: 91002063 add x3, x3, #0x8
18: b40000c4 cbz x4, 30 <find_first_bit_better+0x30>
1c: 14000008 b 3c <find_first_bit_better+0x3c>
20: f8408464 ldr x4, [x3], #8
24: 91010045 add x5, x2, #0x40
28: b50000c4 cbnz x4, 40 <find_first_bit_better+0x40>
2c: aa0503e2 mov x2, x5
30: eb00005f cmp x2, x0
34: 54ffff63 b.cc 20 <find_first_bit_better+0x20> // b.lo, b.ul, b.last
38: d65f03c0 ret
3c: d2800002 mov x2, #0x0 // #0
40: dac00084 rbit x4, x4
44: dac01084 clz x4, x4
48: 8b020080 add x0, x4, x2
4c: d65f03c0 ret

0000000000000050 <find_first_bit_worse>:
50: aa0003e4 mov x4, x0
54: aa0103e0 mov x0, x1
58: b4000181 cbz x1, 88 <find_first_bit_worse+0x38>
5c: f9400083 ldr x3, [x4]
60: d2800802 mov x2, #0x40 // #64
64: 91002084 add x4, x4, #0x8
68: b40000c3 cbz x3, 80 <find_first_bit_worse+0x30>
6c: 14000008 b 8c <find_first_bit_worse+0x3c>
70: f8408483 ldr x3, [x4], #8
74: 91010045 add x5, x2, #0x40
78: b50000c3 cbnz x3, 90 <find_first_bit_worse+0x40>
7c: aa0503e2 mov x2, x5
80: eb02001f cmp x0, x2
84: 54ffff68 b.hi 70 <find_first_bit_worse+0x20> // b.pmore
88: d65f03c0 ret
8c: d2800002 mov x2, #0x0 // #0
90: dac00063 rbit x3, x3
94: dac01063 clz x3, x3
98: 8b020062 add x2, x3, x2
9c: eb02001f cmp x0, x2
a0: 9a829000 csel x0, x0, x2, ls // ls = plast
a4: d65f03c0 ret

> I haven't dug very deep into this, but I could also imagine the
> arch-specific parts of this might become a little easier to do if the
> semantics were just "if no such bit, return an indeterminate value >=
> the size".
>
> > Changing this for
> > a single function would break the consistency, and may cause problems
> > for those who
> > rely on existing behaviour.
>
> True. But I think it should be possible - I suppose most users are via
> the iterator macros, which could all be updated at once. Changing ret ==
> size to ret >= size will still work even if the implementations have not
> been switched over, so it should be doable.

Since there are no assembler users for it, we can do just:

#define find_first_bit(bitmap, size) \
        min(better_find_first_bit((bitmap), (size)), (size))

... and deprecate find_first_bit().
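Sketched out (with better_find_first_bit() modeled by a naive stand-in, and an inline function in place of the macro for type safety), the compatibility shim would look like:

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Stand-in for the relaxed search: may return any value >= size when
 * nothing is found. */
static size_t better_find_first_bit(const unsigned long *map, size_t size)
{
    size_t i, b;

    for (i = 0; i * BITS_PER_LONG < size; i++)
        if (map[i]) {
            for (b = 0; !(map[i] & (1UL << b)); b++)
                ;
            return i * BITS_PER_LONG + b;
        }
    return size;
}

/* Compatibility wrapper: clamps the result so existing "ret == size"
 * callers keep working while new callers migrate to ">= size". */
static size_t find_first_bit_compat(const unsigned long *map, size_t size)
{
    size_t ret = better_find_first_bit(map, size);

    return ret < size ? ret : size;   /* min(ret, size) */
}
```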

> > Passing non-positive size to find_*_bit() should produce undefined
> > behaviour, because we cannot dereference a pointer to the bitmap in
> > this case; this is most probably a sign of a problem on a caller side
> > anyways.
>
> No, the out-of-line bitmap functions should all handle the case of a
> zero-size bitmap sensibly.

I can be more specific: the behaviour is defined - don't dereference
the address, and return an unspecified value (which now is always 0).

> Is bitmap full? Yes (all the 0 bits are set).
> Is bitmap empty? Yes, (none of the 0 bits are set).
> Find the first bit set (returns 0, there's no such bit)

I can't answer because this object is not a map of bits - there's no room for
bits inside.

> Etc. The static inlines for small_const_nbits do assume that the pointer
> can be dereferenced, which is why small_const_nbits was updated to mean
> 1<=bits<=BITS_PER_LONG rather than just bits<=BITS_PER_LONG.

I don't want to do something like

if (size == 0)
        return -1;

... because it legitimizes this kind of usage and hides problems on
callers' side.
Instead, I'd add WARN_ON(size == 0), but I don't think it's so
critical to bother with it.

Yury

> Rasmus