2013-05-07 08:16:05

by Chanho Min

[permalink] [raw]
Subject: [PATCH RESEND v3] bitmap: speed up bitmap_find_free_region

In bitmap_find_free_region, If we skip the all-ones words and find bits
in a not-all-ones word, we can improve performance of it.

For example, If bitmap_find_free_region() is called with order=0, First,
It scans bitmap array by the increment of long type, then find 1 free bit
within 1 long type value. In 32 bits system and 1024 bits size, in the worst
case, We need 1024 for-loops to find 1 free bit. But, If this is applied,
it takes 64 for-loops. Instead, It will be needed additional if-comparison
of every word and It can take time slightly as 'Test case 3'. But, In many
cases, It will speed up significantly.

Test cases bellows show the time taken to execute bitmap_find_free_region()
before and after patch.

Test condition : ARMv7 1GHZ 32 bits system, 1024 bits size, gcc 4.6.2

Test case 1: order is 0. all bits are one except that last one bit is zero.
before patch: 29727 ns
after patch: 2349 ns

Test case 2: order is 1. all bits are one except that last 2 contiguous bits
are zero.
before patch: 15475 ns
after patch: 2225 ns

Test case 3: order is 1. all words are not-all-ones and don't have 2 contiguous
bits except that last 2 contiguous are zero.
before patch: 15475 ns
after patch: 16131 ns

Changes compared to v1:
- Modified unnecessarily complicated code.
- Fixed the buggy code if `bits' is not an multiple of BITS_PER_LONG.

Changes compared to v2:
- Removed bitmap_find_free_one and merged it to bitmap_find_free_region.

Signed-off-by: Chanho Min <[email protected]>
---
lib/bitmap.c | 23 +++++++++++++++--------
1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/lib/bitmap.c b/lib/bitmap.c
index 06f7e4f..95e8efc 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -1114,14 +1114,21 @@ done:
*/
int bitmap_find_free_region(unsigned long *bitmap, int bits, int order)
{
- int pos, end; /* scans bitmap by regions of size order */
-
- for (pos = 0 ; (end = pos + (1 << order)) <= bits; pos = end) {
- if (!__reg_op(bitmap, pos, order, REG_OP_ISFREE))
- continue;
- __reg_op(bitmap, pos, order, REG_OP_ALLOC);
- return pos;
- }
+ int pos, end, nbit, i; /* scans bitmap by regions of size order */
+ int nlongs = BITS_TO_LONGS(bits);
+
+ for (i = 0; i < nlongs; i++)
+ if (bitmap[i] != ~0UL) {
+ pos = i * BITS_PER_LONG;
+ nbit = min(bits, pos + BITS_PER_LONG);
+ for (; (end = pos + (1 << order)) <= nbit; pos = end) {
+ if (!__reg_op(bitmap, pos, order,
+ REG_OP_ISFREE))
+ continue;
+ __reg_op(bitmap, pos, order, REG_OP_ALLOC);
+ return pos;
+ }
+ }
return -ENOMEM;
}
EXPORT_SYMBOL(bitmap_find_free_region);
--
1.7.9.5


2013-05-07 11:42:55

by anish singh

[permalink] [raw]
Subject: Re: [PATCH RESEND v3] bitmap: speed up bitmap_find_free_region

On Tue, May 7, 2013 at 1:45 PM, Chanho Min <[email protected]> wrote:
> In bitmap_find_free_region, If we skip the all-ones words and find bits
> in a not-all-ones word, we can improve performance of it.
>
> For example, If bitmap_find_free_region() is called with order=0, First,
> It scans bitmap array by the increment of long type, then find 1 free bit
> within 1 long type value. In 32 bits system and 1024 bits size, in the worst
> case, We need 1024 for-loops to find 1 free bit. But, If this is applied,
> it takes 64 for-loops. Instead, It will be needed additional if-comparison
> of every word and It can take time slightly as 'Test case 3'. But, In many
> cases, It will speed up significantly.
>
> Test cases bellows show the time taken to execute bitmap_find_free_region()
> before and after patch.
>
> Test condition : ARMv7 1GHZ 32 bits system, 1024 bits size, gcc 4.6.2
>
> Test case 1: order is 0. all bits are one except that last one bit is zero.
> before patch: 29727 ns
> after patch: 2349 ns
How did you find out this time?Can you please post your code
for that?
>
> Test case 2: order is 1. all bits are one except that last 2 contiguous bits
> are zero.
> before patch: 15475 ns
> after patch: 2225 ns
>
> Test case 3: order is 1. all words are not-all-ones and don't have 2 contiguous
> bits except that last 2 contiguous are zero.
> before patch: 15475 ns
> after patch: 16131 ns
>
> Changes compared to v1:
> - Modified unnecessarily complicated code.
> - Fixed the buggy code if `bits' is not an multiple of BITS_PER_LONG.
>
> Changes compared to v2:
> - Removed bitmap_find_free_one and merged it to bitmap_find_free_region.
>
> Signed-off-by: Chanho Min <[email protected]>
> ---
> lib/bitmap.c | 23 +++++++++++++++--------
> 1 file changed, 15 insertions(+), 8 deletions(-)
>
> diff --git a/lib/bitmap.c b/lib/bitmap.c
> index 06f7e4f..95e8efc 100644
> --- a/lib/bitmap.c
> +++ b/lib/bitmap.c
> @@ -1114,14 +1114,21 @@ done:
> */
> int bitmap_find_free_region(unsigned long *bitmap, int bits, int order)
> {
> - int pos, end; /* scans bitmap by regions of size order */
> -
> - for (pos = 0 ; (end = pos + (1 << order)) <= bits; pos = end) {
> - if (!__reg_op(bitmap, pos, order, REG_OP_ISFREE))
> - continue;
> - __reg_op(bitmap, pos, order, REG_OP_ALLOC);
> - return pos;
> - }
> + int pos, end, nbit, i; /* scans bitmap by regions of size order */
> + int nlongs = BITS_TO_LONGS(bits);
> +
> + for (i = 0; i < nlongs; i++)
> + if (bitmap[i] != ~0UL) {
> + pos = i * BITS_PER_LONG;
> + nbit = min(bits, pos + BITS_PER_LONG);
> + for (; (end = pos + (1 << order)) <= nbit; pos = end) {
> + if (!__reg_op(bitmap, pos, order,
> + REG_OP_ISFREE))
> + continue;
> + __reg_op(bitmap, pos, order, REG_OP_ALLOC);
> + return pos;
> + }
> + }
> return -ENOMEM;
> }
> EXPORT_SYMBOL(bitmap_find_free_region);
> --
> 1.7.9.5
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2013-05-08 00:59:23

by Chanho Min

[permalink] [raw]
Subject: Re : Re: [PATCH RESEND v3] bitmap: speed up bitmap_find_free_region


>> Test case 1: order is 0. all bits are one except that last one bit is zero.
>> before patch: 29727 ns
>> after patch: 2349 ns
>How did you find out this time?Can you please post your code
>for that?
I used local_clock(), But, It may be more proper way for each system.

before = local_clock();
bitmap_find_free_region(bitmap, 1024,0);
after = local_clock();

Chanho Min,

2013-05-08 20:24:57

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH RESEND v3] bitmap: speed up bitmap_find_free_region

On Tue, 7 May 2013 17:15:53 +0900 Chanho Min <[email protected]> wrote:

> In bitmap_find_free_region, If we skip the all-ones words and find bits
> in a not-all-ones word, we can improve performance of it.
>
> For example, If bitmap_find_free_region() is called with order=0, First,
> It scans bitmap array by the increment of long type, then find 1 free bit
> within 1 long type value. In 32 bits system and 1024 bits size, in the worst
> case, We need 1024 for-loops to find 1 free bit. But, If this is applied,
> it takes 64 for-loops. Instead, It will be needed additional if-comparison
> of every word and It can take time slightly as 'Test case 3'. But, In many
> cases, It will speed up significantly.
>
> ...
>
> --- a/lib/bitmap.c
> +++ b/lib/bitmap.c
> @@ -1114,14 +1114,21 @@ done:
> */
> int bitmap_find_free_region(unsigned long *bitmap, int bits, int order)
> {
> - int pos, end; /* scans bitmap by regions of size order */
> -
> - for (pos = 0 ; (end = pos + (1 << order)) <= bits; pos = end) {
> - if (!__reg_op(bitmap, pos, order, REG_OP_ISFREE))
> - continue;
> - __reg_op(bitmap, pos, order, REG_OP_ALLOC);
> - return pos;
> - }
> + int pos, end, nbit, i; /* scans bitmap by regions of size order */
> + int nlongs = BITS_TO_LONGS(bits);
> +
> + for (i = 0; i < nlongs; i++)
> + if (bitmap[i] != ~0UL) {
> + pos = i * BITS_PER_LONG;
> + nbit = min(bits, pos + BITS_PER_LONG);
> + for (; (end = pos + (1 << order)) <= nbit; pos = end) {
> + if (!__reg_op(bitmap, pos, order,
> + REG_OP_ISFREE))
> + continue;
> + __reg_op(bitmap, pos, order, REG_OP_ALLOC);
> + return pos;
> + }
> + }
> return -ENOMEM;

It could be improved further by returning to the long-at-a-time search
if we failed to find a suffificiently large free region at `i'. Oh
well, I guess we can do that later if someone cares enough.

We now have a definition of four(!) locals on a single line followed by
a head-scratching comment which doesn't really pertain to any of those
locals anyway. Let's do this:

From: Andrew Morton <[email protected]>
Subject: lib-bitmapc-speed-up-bitmap_find_free_region-fix

reduce scope of locals, remove barely comprehensible comment

Cc: Chanho Min <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Joe Perches <[email protected]>
Cc: Nadia Yvette Chambers <[email protected]>
Cc: anish singh <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
---

lib/bitmap.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff -puN lib/bitmap.c~lib-bitmapc-speed-up-bitmap_find_free_region-fix lib/bitmap.c
--- a/lib/bitmap.c~lib-bitmapc-speed-up-bitmap_find_free_region-fix
+++ a/lib/bitmap.c
@@ -1114,13 +1114,15 @@ done:
*/
int bitmap_find_free_region(unsigned long *bitmap, int bits, int order)
{
- int pos, end, nbit, i; /* scans bitmap by regions of size order */
int nlongs = BITS_TO_LONGS(bits);
+ int i;

for (i = 0; i < nlongs; i++)
if (bitmap[i] != ~0UL) {
- pos = i * BITS_PER_LONG;
- nbit = min(bits, pos + BITS_PER_LONG);
+ int pos = i * BITS_PER_LONG;
+ int nbit = min(bits, pos + BITS_PER_LONG);
+ int end;
+
for (; (end = pos + (1 << order)) <= nbit; pos = end) {
if (!__reg_op(bitmap, pos, order,
REG_OP_ISFREE))
_